On a Data-Driven Approach for Detecting Disturbance in the Brazilian Savannas Using Time Series of Vegetation Indices

: Remote sensing of disturbance in the savannas from Brazil is challenging, especially due to confounding effects of the vegetation phenology and natural soil exposure on the detection of clearing and ﬁre events. In this study, we investigated the detection of disturbance over this global hotspot of biodiversity using seven vegetation indices (VIs) calculated from the Landsat time series (2017–2019) and the Continuous Change Detection and Classiﬁcation (CCDC) algorithm. The selected VIs represented distinct biophysical characteristics of the savannas. We evaluated the effects of disturbance on these VIs and assessed the accuracy of CCDC-detection in 2019, considering individual VIs, ensemble VIs, and the type of disturbance (savanna clearing and ﬁre). Finally, we analyzed the possible existence of seasonal patterns of disturbance in a study area located at the new agricultural frontier of the Cerrado biome. The results showed that the overall accuracy of CCDC detection of total disturbance ranged from 51.2% for the Green-Red Normalized Difference (GRND) to 65.9% for the Normalized Burn Ratio (NBR2). It increased to 71.2% for ensemble VIs, whose multivariate approach reduced the omission errors in the analysis when compared to the use of single VIs. For detecting events of savanna clearing and ﬁre, the most important VIs used near-infrared and shortwave infrared reﬂectance bands on their formulations (NBR2, NBR, and Moisture Stress Index—MSI). The CCDC accuracy was generally higher for detecting clearing than for mapping burned areas. In contrast, the recorded date of disturbance occurrence was less precise for detecting clearing than for recording events caused by ﬁre, especially due to the existence of some gradual processes of vegetation degradation until complete clearing. Our ﬁndings showed also the existence of a seasonal pattern of disturbance occurrence. Savanna clearing predominated in the transition from the rainy to the dry season (April to July) to open new areas for agriculture. It preceded most events of ﬁre disturbance between August and October that occurred near the consolidated areas of agriculture and extended into the native vegetation areas. Results reinforce the importance of data-driven approaches for generating early warning alerts of disturbance in the Cerrado to be further checked in the ﬁeld.


Introduction
Savannas in Brazil comprise a global hotspot of biodiversity that has already lost half of the native vegetation cover to crops and pastures [1,2]. Locally known as the Cerrado, the savannas are the second largest biome in the country and the second largest terrestrial source of carbon emissions after the Amazon biome [3,4]. Savannas show gradients ranging from grasslands to woodlands, having species well adapted to natural and human-induced fires as well as to the strong seasonality of precipitation [5][6][7].
In addition to its biodiversity, the Cerrado is also very important for agriculture, and has half of the croplands in Brazil [8][9][10]. The recent expansion of soybean over native vegetation of the northern portion of the Cerrado (new agricultural frontier) has increased the number of disturbance events in this region [11][12][13][14][15]. As in other regions of the Cerrado, clearing and fire are the main drivers of disturbance. However, the seasonal pattern of occurrence of these two types of disturbance is not entirely understood. The comprehension of the relationships between fire and clearing over each year requires further studies. For land management, fire is a common practice that generally invades the native vegetation areas. Although part of the savanna vegetation is adapted to fire, an excessive number of events may cause biodiversity loss and land degradation [16,17]. Fire is also used in recently cleared areas to burn and clean the non-photosynthetic vegetation left on the ground after disturbance (e.g., rest of leaves, branches, and trunks).
To monitor vegetation disturbance caused by clearing and fire in the Cerrado, remote sensing programs have been developed in Brazil. The use of satellite imagery is perhaps the only way for large-scale monitoring of this biome [18]. An example of these initiatives is the project for monitoring the Cerrado developed by the Brazilian National Institute of Space Research (INPE), which is mostly based on the visual interpretation of several satellite data. The objective of the INPE's project is to provide annual rates and warning alerts of savanna clearings [19]. Monitoring disturbance in the Cerrado is part of the strategies used to mitigate greenhouse gas emissions and to conserve biodiversity. For instance, if a similar agreement to the Amazonian Soy Moratorium is adopted to the Cerrado, such types of remote sensing programs can adequately support the major exporting companies to not buy and finance soybean production from new savanna-cleared areas. This would probably reduce savanna-clearing rates, as was observed in the last decade for tropical deforestation in the Amazon region.
However, satellite monitoring of human-induced disturbance in the Cerrado is not a trivial task. It still requires the development and testing of new approaches for the detection of disturbance and the reduction of uncertainties in the data analysis. Some factors that affect detection are: the distinct resilience of savanna vegetation types to disturbance events; the significant changes in satellite spectral response due to vegetation phenology; the natural soil exposure due to the presence of more open canopies in the savanna formations; and the frequency of satellite observations affected by cloud cover. To reduce the influence of vegetation phenology on clearing detection, vegetation indices (VIs) calculated in a fixed period of each year (e.g., dry season) can be used as input data for algorithms. When compared to band reflectance, VIs can reduce data variability associated with the geometry of data acquisition and with terrain illumination or topographic effects. This is especially valid for VIs with a band reflectance normalization on their formulations [20][21][22]. Algorithms like the Landsat-based detection of Trends in Disturbance and Recovery (LandTrendr) retrieve the annual rates of disturbance from Landsat time series [16,23]. In this case, early warning alerts of disturbance cannot be generated. Alternatively, algorithms that consider the seasonal response of vegetation can be used for near real-time detection of disturbance. This is the case of the Continuous Change Detection and Classification (CCDC), which can detect land cover changes continuously using all available Landsat data or every time a new image is collected [24].
When compared to other methods, CCDC has the advantage of working with multiple input variables for providing robust detection of land covers, especially when working with high quantities of missing data in the time series, as demonstrated in a simulation study by Awty-Carroll et al. [25]. CCDC uses ordinary least squares in a time series model that has components of seasonality, trend, and overall values. The CCDC algorithm has been successfully used in forest environments [26,27]. As far as we know, there are no studies using CCDC over the savannas in Brazil. When VIs are considered for change detection in other environments, most studies generally focus on the Normalized Difference Vegetation Index (NDVI). However, in the Cerrado, it is particularly important to consider adding other VIs that capture the information of different savanna properties associated with canopy structure, biochemistry, and plant physiology [28]. There are also specific VIs proposed to be sensitive to disturbance by fire, such as the Normalized Burn Ratio (NBR) [29]. For detecting disturbance in the Brazilian savannas, the performance of each VI, or all VIs in a multivariate approach with CCDC (henceforth named "ensemble VIs"), has not been evaluated yet.
In this study, we investigated the detection of disturbance over the savannas in Brazil using seven VIs calculated from a Landsat time series (2017-2019) and the CCDC algorithm. The specific objectives were to evaluate: (1) the effects of disturbance on savanna reflectance spectra and VIs time series; (2) the accuracy of CCDC-detection in 2019, considering single VIs, ensemble VIs, and the type of disturbance (savanna clearing and fire) in the analysis; and (3) the possible existence of seasonal patterns of occurrence of fire and clearing for a better understanding of the dynamics of disturbance in the savannas.

Data and Methodology
The main methodological steps for the detection and analysis of disturbance in the savannas are presented in Figure 1. In the accuracy assessment of the CCDC-derived maps of disturbance, the detections were compared with reference data (validation points). Such high-quality data were collected through a sample-based approach (sampling design), allowing for a careful interpretation of specific areas of the map (response design). Response design was the step to determine whether the map and reference data would agree. The reference data were then used to compare the results and finally estimate the accuracy and area of the changes, and their related confidence intervals (data analysis). The steps are detailed in the next sub-sections.

Selection of the Study Area
The study area is located in MATOPIBA, an acronym for the names of four Brazilian states that compose this region (Maranhão, Tocantins, Piauí, and Bahia). Intensive savanna clearing for single cropping (soybean, maize, and cotton plantations) or double cropping (soybean-corn or soybean-cotton rotations) activities in this region is a consequence of several factors. These factors include the spillover effect of the Amazon Soy Moratorium to the Cerrado biome and several local actions for promoting recent agriculture expansion [11][12][13]. Savanna clearing rates vary with: market oscillations in commodities and land prices; local policies of credit incentives to agriculture; infrastructure improvement; and technological advances in soil/crop management practices [14,15].
We selected a study area of 215 km × 215 km located close to the municipalities of Balsas and Uruçuí in the Brazilian states of Maranhão (MA) and Piauí (PI), respectively ( Figure 2). Intensive savanna clearing, especially for soybean expansion, has already removed 22% of the native vegetation cover. Native vegetation is mainly composed of

Pre-Processing of Landsat Time Series and Selection of VIs for CCDC
We used surface reflectance data obtained from 1 January 2017 to 31 December 2019 by the Enhanced Thematic Mapper Plus (ETM+)/Landsat-7 and Operational Land Imager (OLI)/Landsat-8. Despite the data gaps associated with the Scan Line Corrector (SLC) failure in ETM+ imagery, the remaining pixel observations are useful because they maintain the same radiometric and geometric corrections as data collected prior to the failure. All the available images from both sensors in the period were considered in the data analysis, The study area has well-defined rainy and dry seasons, with an average annual precipitation of 1100 mm [31]. In 2019, the year of our analysis of disturbance, the total precipitation was 1065 mm. From May to September (dry season), the monthly accumulated precipitation was generally lower than 35 mm. During the rainy season, rainfall reached values higher than 170 mm per month between January and March ( Figure 3).  [32], the reflectance of the ETM+ bands was radiometrically adjusted to that of the OLI bands using slopes and intercepts reported by Roy et al. [33] This procedure is recommended to face the spectral differences in spectral response functions of the bands from both sensors. Such differences tend to produce, for instance, greater NDVI values for OLI than for ETM+ [33]. Therefore, statistical functions to adjust the band reflectance between the sensors ensure temporal continuity of the time series generated from ETM+ and OLI. Moreover, we excluded pixels under clouds and cloud shadows using the CF-mask algorithm [34,35]. From the surface reflectance data, we calculated seven VIs for detecting disturbance (savanna clearing and burned areas). Equations and references of the VIs are listed in Table 1 [29,[36][37][38][39][40]. They were selected because of their sensitivity to different biophysical parameters of the savannas. For instance, the Enhanced Vegetation Index (EVI) and the Soil-Adjusted Vegetation Index (SAVI) are the VIs more sensitive to changes in vegetation From a geomorphological point of view, there are plateaus and depressions in the study area. The elevation ranges from 160 m to 663 m with plateaus having more than 400 m of elevation. In contrast to depressions, plateaus allow mechanized agriculture because of the more favorable slope conditions for the displacement of tractors [16]. Crops are mostly planted over oxisols that are managed to improve soil fertility.

Pre-Processing of Landsat Time Series and Selection of VIs for CCDC
We used surface reflectance data obtained from 1 January 2017 to 31 December 2019 by the Enhanced Thematic Mapper Plus (ETM+)/Landsat-7 and Operational Land Imager (OLI)/Landsat-8. Despite the data gaps associated with the Scan Line Corrector (SLC) failure in ETM+ imagery, the remaining pixel observations are useful because they maintain the same radiometric and geometric corrections as data collected prior to the failure. All the available images from both sensors in the period were considered in the data analysis, including 469 images from four tiles: 215 images from Landsat-7 and 254 images from Landsat-8.
On the Google Earth Engine (GEE) cloud-computing platform [32], the reflectance of the ETM+ bands was radiometrically adjusted to that of the OLI bands using slopes and intercepts reported by Roy et al. [33] This procedure is recommended to face the spectral differences in spectral response functions of the bands from both sensors. Such differences tend to produce, for instance, greater NDVI values for OLI than for ETM+ [33]. Therefore, statistical functions to adjust the band reflectance between the sensors ensure temporal continuity of the time series generated from ETM+ and OLI. Moreover, we excluded pixels under clouds and cloud shadows using the CF-mask algorithm [34,35].
From the surface reflectance data, we calculated seven VIs for detecting disturbance (savanna clearing and burned areas). Equations and references of the VIs are listed in Table 1 [29,[36][37][38][39][40]. They were selected because of their sensitivity to different biophysical parameters of the savannas. For instance, the Enhanced Vegetation Index (EVI) and the Soil-Adjusted Vegetation Index (SAVI) are the VIs more sensitive to changes in vegetation structure produced by disturbance. They were proposed to reduce constraints of the NDVI that are mainly associated with signal saturation over dense vegetation, or with soil background influence over sparse vegetation. On the other hand, the Green-Red Normalized Difference (GRND) and the Moisture Stress Index (MSI) may detect changes in old/new foliage and leaf/canopy water stress resulting from gradual processes of savanna clearing or degradation, respectively. Finally, we also considered in the data analysis the Normalized Burn Ratio (NBR) and its variant (NBR2), which are VIs sensitive to fire disturbance. They use near-infrared (NIR) and shortwave infrared (SWIR) bands on their formulations (Table 1). Therefore, besides their sensitivity to different biophysical parameters of the savannas, such VIs explore different band combinations of OLI and ETM+ in the visible, NIR, and SWIR spectral regions.

Continuous Change Detection and Classification (CCDC)
The selected VIs of Table 1 were used individually as input data (univariate approach) and in an ensemble approach (multivariate) for the detection of disturbance with the CCDC algorithm on the GEE platform. Thus, the algorithm was run eight times: one run for each of the seven VIs and another run for the ensemble approach (VIs together as seven input variables). The ensemble approach explored indirectly the distinct spectralbiophysical information provided by the seven VIs for disturbance detection. The CCDC is an online algorithm that uses a data-driven statistic threshold (adjusted for each pixel) to detect changes in the time series [24]. An online approach means that a model is fitted to observations from the beginning to the end of the time series, being updated as new observations become available. Thus, if new observations over the time series do not diverge widely from the fitted model, they are added to the time series before updating the model. The time series modeled by CCDC have components of overall value (mean VI throughout the time series), seasonality, and trend. In our study, Least Absolute Shrinkage and Selection Operator (LASSO) regression fitting estimated the coefficients of the model. After testing different parameters in pixels with available information of disturbance, the model based on the overall value was then selected for analysis. When we considered single VIs as input data for CCDC, a disturbance caused by clearing or fire was flagged when the difference between observed and predicted values by the model exceeded three times the normalized root mean square error (RMSE) for three consecutive observations. In the ensemble approach, the algorithm averages and normalizes such differences (observed minus predicted) by three times the RMSE for all VIs. A disturbance was marked in the time series when the result was larger than one for three consecutive observations. This number of observations was set to avoid the pseudo-detection of disturbance caused by noise in the time series [34]. While noise factors tend to be ephemeral, land cover change is more persistent through time. The results of the disturbance detection that occurred in 2019 were evaluated.

Data Analysis from Sampling and Response Designs
The data analysis was divided into three parts. In the first one, the CCDC's ability to correctly detect disturbance in 2019 was evaluated per VI and ensemble VIs, considering also the performance of the metrics as a function of the type of disturbance (savanna clearing or fire). In the second part, the ability of the algorithm for precise detection of the time of disturbance occurrence was analyzed. Finally, the last part looked for the possible existence of temporal patterns of clearing and fire in 2019. The strategy used for accuracy assessment is described below.

Accuracy Assessment per VI, Ensemble VIs, and Type of Disturbance
The sampling and response design for the accuracy assessment of the CCDC disturbance detection in 2019 was based on the Area Two [41] and Collect Earth Online (https://collect.earth/; accessed on 30 June 2021) tools, respectively. These applications provide the necessary tools to comply with the recommended practices and international guidelines for assessing map accuracy. Using these approaches, we estimated the optimal number of samples and randomly located them in the study area (sampling design). In addition, we built a reference dataset for these sample locations (response design) to support the validation of the results. The sampling design considered only the native vegetation areas in 2018. Land cover changes observed until 2018 and reported by the INPE's program of Cerrado monitoring were masked.
A stratified random sample design with equal sample allocation on each class (disturbed and non-disturbed) was adopted in the work. The stratification was based on the classes (disturbed by fire, clearing, and non-disturbed) and on the inspection of ancillary maps. There are two main purposes for stratification. First, the strata are important to report accuracy per class. Second, the stratification ensures a sufficient representation of each class, even that occupying a small proportion of the area (e.g., disturbance class). The sample unit is pixel. We choose an equal number of samples per strata to make possible the comparison of accuracy per type of disturbance. We selected 588 samples for the spatial accuracy assessment and area estimation of disturbance in 2019. In addition to the visual inspection of the Landsat time series, we used other satellite data with more detailed spatial resolution than the Landsat instruments for the identification of disturbed and non-disturbed areas. Examples of these satellites include the MultiSpectral Instrument (MSI)/Sentinel-2 images having some bands with 10-m spatial resolution, and the PlanetScope satellite constellation with 3-m spatial resolution data.
From the 588 allocated samples, we selected 294 non-disturbed sites and 294 disturbed sites. We divided also the set of 294 disturbed areas into two groups having an equal number of samples: one group affected by fire and another one affected by clearing. The cross-tabulation of the class label (disturbed and non-disturbed) assigned from the CCDC results against the reference dataset generated the error matrix. The error matrix was the basis for quantification of the overall accuracy, user's accuracy (and its complementary measure, commission error), and producer's accuracy (and its complementary measure, omission error) estimates. Based on these estimates, we identified the performance of each VI and the ensemble VIs for overall detection of disturbance and specific detection of savanna-cleared and fire-affected areas.

Accuracy of the Detected Time of Disturbance
For evaluating the accuracy of time detection of disturbance by CCDC, we used the set of 294 disturbed samples by clearing and fire. We tracked them over time in the Landsat images and VI profiles using the Collect Earth Online. This tool was customized here to record the type and date of each disturbance in the attributes of the validation points (shapefile). Thus, the date of change was obtained visually for the reference data and was compared with the CCDC output to obtain the difference between them. The evaluation of the date of disturbance occurrence was affected by the number of clear observations in the images. In other words, the reference date of disturbance occurrence can be different from the exact time of the event due to the limitations imposed by cloud cover, cloud shadows, and temporal resolution of the instruments.
The date of change for clearing events considered the time when the vegetation was completely removed (clear-cutting). The date of change for fire events considered the time when active fire or burned areas were first recorded in the reference samples. The accuracy of time detection of disturbance by the CCDC algorithm was then calculated by comparing this reference dataset with the CCDC outputs of the time of occurrence of each disturbance type in 2019.

Seasonal Patterns of Savanna Clearing and Fire Events
The relationships between savanna clearing and fire are not completely understood in the Cerrado environment. In order to look for temporal patterns of occurrence of these events in the study area, we first inspected our reference dataset by plotting the type of disturbance on a per-month basis in 2019. We compared these results with the CCDCderived disturbance maps.
Because of the difficulties of CCDC to separate both types of disturbance in the time series, we complemented our analysis using ancillary data from savanna clearing in 2019 provided by the INPE's program entitled "Sistema de Detecção de Desmatamento em Tempo Real" (DETER Cerrado; http://terrabrasilis.dpi.inpe.br/; accessed on 21 July 2021). In addition, we used the 250-m Moderate Resolution Imaging Spectroradiometer (MODIS) FireCCI v.5.1 product (European Space Agency-ESA) to compare the CCDC results with the burned areas detected by MODIS in 2019. Such comparisons used the validation points of disturbance that were also recorded on each product.

Time Series of Vegetation Indices and the Detection of Disturbance
Savanna clearing and fire produced a great variety of changes in surface reflectance that depended on the intensity of disturbance, the disturbed savanna physiognomy, and the soil background composition before and after disturbance. For instance, when a woodland savanna area was cleared in the study area, we observed a reflectance increase Remote Sens. 2021, 13, 4959 8 of 20 in the visible and SWIR spectral intervals and a reflectance decrease in the NIR region ( Figure 4a). This response was a combination of the soil reflectance with the reflectance of non-photosynthetic vegetation left on the ground after clearing (e.g., litter, branches, and twigs). In contrast, the effect of fire on shrub savannas was to decrease the reflectance of the NIR and SWIR-1 OLI/Landsat-8 bands (Figure 4b). Recently burned areas over savanna grasslands had low reflectance for the sensor, which gradually increased from the visible and NIR (VNIR) to the SWIR.  For detecting disturbance by fire, VIs like NBR were especially important when used with CCDC because they tracked the reflectance changes over time in the NIR and SWIR spectral intervals (Table 1; Figure 4b). In our study area, disturbance by fire over the savannas generally transformed positive NBR values into negative ones. Therefore, "breaks" in the NBR time series were created because of the NIR reflectance decrease and the SWIR reflectance increase caused by biomass burning (Figure 5b).  Both events of disturbance reduced the VIs response. Because the reflectance after savanna clearing increased in the red band (640-670 nm) of OLI/Landsat-8 and decreased in its NIR band (850-880 nm) (Figure 4a), this type of disturbance produced a significant change in the temporal profile of NDVI (Figure 5a). To be labeled as a disturbance ("break" in the time series) by the CCDC algorithm, such NDVI modifications for a given pixel should be distinct from the seasonal NDVI variations produced by vegetation phenology. NDVI decreased from the rainy to the dry season of each year, reaching the lowest values toward September under maximum water stress for the plants. As demonstrated in Figure 5a, clearing can be also followed by a quick vegetation regrowth that may produce an increase in NDVI and a new pattern of seasonal profile (see NDVI after October 2019 in Figure 5a).
For detecting disturbance by fire, VIs like NBR were especially important when used with CCDC because they tracked the reflectance changes over time in the NIR and SWIR spectral intervals (Table 1; Figure 4b). In our study area, disturbance by fire over the savannas generally transformed positive NBR values into negative ones. Therefore, "breaks" in the NBR time series were created because of the NIR reflectance decrease and the SWIR reflectance increase caused by biomass burning (Figure 5b).
For detecting disturbance by fire, VIs like NBR were especially important when used with CCDC because they tracked the reflectance changes over time in the NIR and SWIR spectral intervals (Table 1; Figure 4b). In our study area, disturbance by fire over the savannas generally transformed positive NBR values into negative ones. Therefore, "breaks" in the NBR time series were created because of the NIR reflectance decrease and the SWIR reflectance increase caused by biomass burning (Figure 5b).

Variations in Accuracy Metrics per VI, Ensemble VIs, and Type of Disturbance
The CCDC algorithm detected 1.24 ± 0.12 Mha of total disturbance (clearing plus fire) in 2019. This result is based on the ensemble approach. The other VI-area estimates presented approximately similar results, since the estimates were based on a stratified estimator. This number represents an area equivalent to 41.8% of the native vegetation of our site. The overall accuracy of detection ranged from 51.2 ± 4.2% for GRND to 65.9 ± 4.2% for NBR2, and increased to 71.2 ± 4.0% for ensemble VIs ( Table 2). The best single VIs for detecting total disturbance were therefore metrics that used NIR, SWIR-1, and SWIR-2 reflectance bands of the Landsat instruments on their equations (NBR2, NBR, and MSI in Table 1).

Variations in Accuracy Metrics per VI, Ensemble VIs, and Type of Disturbance
The CCDC algorithm detected 1.24 ± 0.12 Mha of total disturbance (clearing plus fire) in 2019. This result is based on the ensemble approach. The other VI-area estimates presented approximately similar results, since the estimates were based on a stratified estimator. This number represents an area equivalent to 41.8% of the native vegetation of our site. The overall accuracy of detection ranged from 51.2 ± 4.2% for GRND to 65.9 ± 4.2% for NBR2, and increased to 71.2 ± 4.0% for ensemble VIs ( Table 2). The best single VIs for detecting total disturbance were therefore metrics that used NIR, SWIR-1, and SWIR-2 reflectance bands of the Landsat instruments on their equations (NBR2, NBR, and MSI in Table 1). For ensemble VIs, the producer's accuracy varied from 35.0% (high omission error) to 97.3% (low omission error) for the disturbance and non-disturbance classes, respectively ( Table 2). On the other hand, user's accuracy ranged from 90.2% to 67.5%, indicating a low commission error for detecting disturbance. In Table 2, the contrast between producer's and user's accuracies occurs because of the differences in the proportion of the disturbed and non-disturbed classes. The disturbance class had higher user's accuracy (small commission error) and lower producer's accuracy (large omission error) than the predominant nondisturbance class in the scene. The use of an unbiased estimator accommodates the effects of map classification errors in the area estimates. Compared to the use of single VIs, a practical effect of the ensemble CCDC approach was to reduce the omission error and slightly increase the commission error ( Figure 6).
Remote Sens. 2021, 13, x FOR PEER REVIEW 11 of 21 a low commission error for detecting disturbance. In Table 2, the contrast between producer's and user's accuracies occurs because of the differences in the proportion of the disturbed and non-disturbed classes. The disturbance class had higher user's accuracy (small commission error) and lower producer's accuracy (large omission error) than the predominant non-disturbance class in the scene. The use of an unbiased estimator accommodates the effects of map classification errors in the area estimates. Compared to the use of single VIs, a practical effect of the ensemble CCDC approach was to reduce the omission error and slightly increase the commission error ( Figure 6). When we considered the performance of detection of each VI or ensemble VIs per type of disturbance, we observed higher accuracies over cleared areas than over burned areas (Figure 7). NBR2 had the highest accuracy values for detecting fire (34.0%) and clearing (55.2%) disturbance. Such values increased to 36.7% and 71.4%, respectively, when ensemble VIs were used as input data to CCDC (Figure 7).  When we considered the performance of detection of each VI or ensemble VIs per type of disturbance, we observed higher accuracies over cleared areas than over burned areas (Figure 7). NBR2 had the highest accuracy values for detecting fire (34.0%) and clearing (55.2%) disturbance. Such values increased to 36.7% and 71.4%, respectively, when ensemble VIs were used as input data to CCDC (Figure 7).

Assessment of the Correct Time of Disturbance
The evaluation of the correct time of CCDC-detection of disturbance occurrence from the ensemble VI approach showed larger uncertainties in days for events of savanna clearing than for events caused by fire. This fact was indicated in Figure 8 by the lower data dispersion around the zero-day line recorded for fire disturbance detection (Figure 8b) when compared to clearing detection (Figure 8a). Disturbance by fire comprised more abrupt events than savanna clearing, which usually includes gradual processes of degradation before the complete removal of vegetation cover. The degradation process produces uncertainties in establishing the exact date of vegetation suppression. On the other hand, clearing events are more detectable than disturbance by fire because of the predominant period of their occurrence in the beginning and middle of the dry season. The great availability of cloud-free images in this period favors the detection of three consecutive observations of clearing change by the CCDC algorithm, which does not occur for fire. Using single VIs in the data analysis, results were approximately similar to those observed with the ensemble approach.

Assessment of the Correct Time of Disturbance
The evaluation of the correct time of CCDC-detection of disturbance occurrence from the ensemble VI approach showed larger uncertainties in days for events of savanna clearing than for events caused by fire. This fact was indicated in Figure 8 by the lower data dispersion around the zero-day line recorded for fire disturbance detection (Figure 8b) when compared to clearing detection (Figure 8a). Disturbance by fire comprised more abrupt events than savanna clearing, which usually includes gradual processes of degradation before the complete removal of vegetation cover. The degradation process produces uncertainties in establishing the exact date of vegetation suppression. On the other hand, clearing events are more detectable than disturbance by fire because of the predominant period of their occurrence in the beginning and middle of the dry season. The great availability of cloud-free images in this period favors the detection of three consecutive observations of clearing change by the CCDC algorithm, which does not occur for fire. Using single VIs in the data analysis, results were approximately similar to those observed with the ensemble approach.
The availability of clear observations in the images of the study area was variable over space and time. The greatest difficulties of tracking over time such disturbance events were noted during the rainy season, when the frequency of satellite observations decreased due to cloud cover and related shadows. For instance, the average number of Landsat clear observations per pixel reached its maximum value in the dry season between July and August ( Figure 9). Therefore, the assessment of the correct time of disturbance occurrence detected by the CCDC was biased to some extent by the dry season period. Moreover, disturbances that occurred in areas with a low number of clear observations can be detected with greater delay when compared to the real date of its occurrence. Remote Sens. 2021, 13 The availability of clear observations in the images of the study area was variable over space and time. The greatest difficulties of tracking over time such disturbance events were noted during the rainy season, when the frequency of satellite observations de creased due to cloud cover and related shadows. For instance, the average number o Landsat clear observations per pixel reached its maximum value in the dry season be tween July and August (Figure 9). Therefore, the assessment of the correct time of disturb ance occurrence detected by the CCDC was biased to some extent by the dry season pe riod. Moreover, disturbances that occurred in areas with a low number of clear observa tions can be detected with greater delay when compared to the real date of its occurrence From the set of reference samples detected as disturbance by CCDC, 74% of them matched ±5 days of the occurrence of fire. Considering the most favorable period of im agery (dry season), when the combination of sensors can generate up to four monthly observations (Figure 9), this difference in days is likely non-statistically significant. It is lower than the 8-day spaced interval of data acquisition in the dry season. For clearing From the set of reference samples detected as disturbance by CCDC, 74% of them matched ±5 days of the occurrence of fire. Considering the most favorable period of imagery (dry season), when the combination of sensors can generate up to four monthly observations (Figure 9), this difference in days is likely non-statistically significant. It is lower than the 8-day spaced interval of data acquisition in the dry season. For clearing, 70% of the reference samples matched ±30 days of the occurrence of the events. As mentioned before for savanna clearing, early detection by the CCDC was generally associated with processes of vegetation degradation preceding the complete removal of vegetation. 70% of the reference samples matched ±30 days of the occurrence of the events. As mentioned before for savanna clearing, early detection by the CCDC was generally associated with processes of vegetation degradation preceding the complete removal of vegetation.

Seasonal Patterns of Occurrence between Savanna Clearing and Fire
Inspection in the reference dataset of the date of occurrence of clearing and fire events revealed the existence of a seasonal pattern of disturbance in the study area ( Figure 10). Savanna clearing predominated between April and July, while fire disturbance mainly occurred between August and October. This result is important because the discrimination between these two types of disturbance is not an easy task using data-driven procedures. Therefore, there is a period more favorable to detect cleared areas (April to July) that precedes the largest occurrence of biomass burning (August to October) ( Figure 10).
We plotted the CCDC-detected disturbed areas for the two periods of analysis: April to July (Figure 11a), and from August to October (Figure 11b). Assuming the existence of the seasonal pattern of disturbance observed in Figure 10, we anticipate that the extension of cleared areas is much smaller than that of burned areas. Furthermore, most of the disturbed areas between August and October occurred predominantly near the consolidated areas of agriculture (areas in white in Figure 11), indicating possible land conversion for agriculture expansion (ignition sources). Low air humidity, high temperatures, water stress for the plants, and great amounts of non-photosynthetic vegetation (fuel load) occur at the end of the dry season. These factors favor fire expansion.

Seasonal Patterns of Occurrence between Savanna Clearing and Fire
Inspection in the reference dataset of the date of occurrence of clearing and fire events revealed the existence of a seasonal pattern of disturbance in the study area ( Figure 10). Savanna clearing predominated between April and July, while fire disturbance mainly occurred between August and October. This result is important because the discrimination between these two types of disturbance is not an easy task using data-driven procedures. Therefore, there is a period more favorable to detect cleared areas (April to July) that precedes the largest occurrence of biomass burning (August to October) ( Figure 10).
We plotted the CCDC-detected disturbed areas for the two periods of analysis: April to July (Figure 11a), and from August to October (Figure 11b). Assuming the existence of the seasonal pattern of disturbance observed in Figure 10, we anticipate that the extension of cleared areas is much smaller than that of burned areas. Furthermore, most of the disturbed areas between August and October occurred predominantly near the consolidated areas of agriculture (areas in white in Figure 11), indicating possible land conversion for agriculture expansion (ignition sources). Low air humidity, high temperatures, water stress for the plants, and great amounts of non-photosynthetic vegetation (fuel load) occur at the end of the dry season. These factors favor fire expansion. Remote Sens. 2021, 13, x FOR PEER REVIEW 1 Figure 10. Temporal local patterns of disturbance over areas affected by clearing and fire in 2019 for the reference samples Savanna clearing events predominate between April and July, while fire disturbance mainly occurs between August and October.
(a) (b) Figure 11. CCDC-detected disturbance over two periods of predominant occurrence of (a) savanna clearing (April-July and (b) fire (August-October). White represents consolidated areas of agriculture that have been masked before data anal ysis.
The consistency of the results of Figures 10 and 11 was confirmed after the inspe of cleared areas mapped in 2019 by the DETER Cerrado program ( Figure A1 in App A), and of the burned areas mapped in both periods by the 250-m MODIS FireCCI product ( Figure A2). Figure A1 confirmed the predominance of savanna clearing e between April and July, while Figure A2 showed the highest frequency of fire distur events recorded in the second period of observation (August to October) in the vici of the agricultural areas (areas in white). (a) (b) Figure 11. CCDC-detected disturbance over two periods of predominant occurrence of (a) savanna clearing (April-July) and (b) fire (August-October). White represents consolidated areas of agriculture that have been masked before data analysis.
The consistency of the results of Figures 10 and 11 was confirmed after the inspection of cleared areas mapped in 2019 by the DETER Cerrado program ( Figure A1 in Appendix A), and of the burned areas mapped in both periods by the 250-m MODIS FireCCI v.5.1 product ( Figure A2). Figure A1 confirmed the predominance of savanna clearing events between April and July, while Figure A2 showed the highest frequency of fire disturbance events recorded in the second period of observation (August to October) in the vicinities of the agricultural areas (areas in white). Figure 11. CCDC-detected disturbance over two periods of predominant occurrence of (a) savanna clearing (April-July) and (b) fire (August-October). White represents consolidated areas of agriculture that have been masked before data analysis.
The consistency of the results of Figures 10 and 11 was confirmed after the inspection of cleared areas mapped in 2019 by the DETER Cerrado program ( Figure A1 in Appendix A), and of the burned areas mapped in both periods by the 250-m MODIS FireCCI v.5.1 product ( Figure A2). Figure A1 confirmed the predominance of savanna clearing events between April and July, while Figure A2 showed the highest frequency of fire disturbance events recorded in the second period of observation (August to October) in the vicinities of the agricultural areas (areas in white).

Discussion
Our study has important contributions for the remote sensing of disturbance in the savanna environment. As far as we know, this is the first investigation using univariate and multivariate approaches that evaluates the detection of disturbance in the Brazilian savannas from a set of VIs that respond differently to biophysical attributes of vegetation. In addition, our investigation demonstrates for the first time the performance of each VI and ensemble VIs to detect two types of disturbance (savanna clearing and fire). Finally, it shows the existence of a seasonal pattern of occurrence between these two drivers of disturbance in the MATOPIBA region.
When compared to the available projects of Cerrado monitoring that are mostly based on visual interpretation of satellite data, the current approach is automated and does not require training samples to detect disturbance. This is very important because we have observed a large reflectance variability over the savanna physiognomies in response to different types and intensities of disturbance. In this context, our approach using CCDC and ensemble VIs can optimize the generation of early warning alerts of disturbance to be further inspected in the field. Our data-driven approach can be extended into other savanna areas from Brazil, since the selected site is representative of the different patterns of land use, vegetation physiognomy, and geomorphology observed in a great part of the biome.
Our findings showed that the most important VIs to detect savanna clearing and fire used NIR and SWIR Landsat bands on their formulations (e.g., NBR, NBR2, and MSI). This result is not a surprise, since the first two VIs have been designed to detect burned areas [29]. Moreover, MSI is sensitive to changes in canopy water that may result from gradual processes of vegetation degradation [5,28]. It also expresses modifications in non-photosynthetic vegetation left over in the soil surface after events of savanna clearing (e.g., dry leaves, branches, and twigs). Our findings also agree with results obtained by Bueno et al. [42], who showed the importance of the SWIR-derived VIs (e.g., NBR and NBR2) to detect clearing in highly seasonal areas of Brazilian savannas. When using the Breaks For Additive Season and Trend (BFAST) algorithm over a savanna site located in southeastern Brazil, Bueno et al. [43] obtained an overall accuracy ranging from 54.7% to 59.6% for different VIs. This accuracy was approximately consistent with results obtained in our work with the use of single VIs. On the other hand, it was comparatively lower than the accuracy reported for our ensemble approach (71.2%).
From a visual comparison between the VI disturbance maps, we observed that their similarity depended on the intensity of disturbance and on the spectral range of VI operation. For instance, most VIs detected intense disturbance, while VIs operating at a similar spectral interval (e.g., NDVI and SAVI) produced maps that were more similar. The ensemble approach using CCDC took advantage of these aspects. Using the CCDC algorithm, the ensemble VI approach had a better performance for detecting overall disturbance than the use of single VIs. When compared to NBR2, the ensemble VIs increased the overall accuracy of detection from 65.9% to 71.2%. This result indicates an advantage of the CCDC over other time-series algorithms that do not run with multiple input data [23,[44][45][46]. However, even for these algorithms, our results are useful when indicating single VIs that deserve attention and testing over other savanna areas of the world.
In the literature, it is very difficult to compare the results of different products generated from distinct approaches, satellite data, and sensor specifications. For instance, our approach uses a data-driven procedure with Landsat data, while the DETER Cerrado program involves visual interpretation of medium spatial resolution CBERS/WFI and endmember-fraction images. Even considering the limitations of such a comparison, we observed that 73.14% of our reference samples detected as disturbed sites by CCDC were also assigned as savanna-cleared areas by the DETER Cerrado. On the other hand, when we compared our CCDC-detected reference samples of fire disturbance with the burned areas mapped by the 250-m MODIS FireCCI v.5.1 product, the agreement was only 40.7% on a per-pixel basis.
The differences in agreement between the clearing and burning products are explained by different factors. For instance, there is a greater duration of the signals of clearing disturbance over the OLI/Landsat-8 images when compared to burning events. Overall, our results showed that the CCDC accuracy was generally higher for detecting clearing than for mapping burned areas, but the contrary was observed for the accuracy of the actual date of disturbance occurrence. As mentioned before, the great availability of cloud-free images in the predominant period of savanna clearing favors the record of three consecutive observations in the CCDC algorithm for the assignment of change. On the other hand, fire comprises more abrupt events than savanna clearing, which may include gradual processes of degradation. Such processes add uncertainties in establishing the exact date of vegetation suppression.
Another interesting result obtained here is the existence of a seasonal pattern of disturbance occurrence in MATOPIBA. Our results showed the predominance of savanna clearing between April and July, and of disturbance caused by fire between August and October. They were consistent with reports published by Schmidt and Eloy [46] on the fire regime in the Brazilian savannas. According to them, the lack of fire management causes wildfires over native vegetated areas, especially during the late dry season (August-October) and near the agricultural fields. From 2000 to 2018, the new agricultural frontier of MATOPIBA was responsible for approximately 40% of the total cleared areas of the Cerrado biome and for 57% of the burned areas, as indicated by data of the MapBiomas and INPE's projects of savanna monitoring [47].
Finally, our study has some constraints to consider in future investigations of savannas. For instance, we did not consider other algorithms in the analysis of disturbance. Recent studies have proposed also the use of an ensemble of time series algorithms to monitor land cover changes [48]. By comparing four dense time series algorithms using simulated data, Awty-Carroll et al. [25] discussed the advantages and limitations of each approach. They demonstrated the advantage of CCDC in working with multiple input variables for providing robust detection of land covers, especially when high quantities of missing data occur in the time series. This is the case with our datasets, which have distinct frequencies of satellite observations in the rainy and dry seasons of MATOPIBA due to cloud cover. In this context, we did not consider the inclusion of other satellites than Landsat (e.g., MSI/Sentinel-2) to produce denser time series. This would require strategies of data harmonization that may add other uncertainties in the data analysis. The number of images was still a limiting factor for disturbance detection in MATOPIBA, adding commission errors during the rainy season and omission errors at the end of the dry season. This limiting factor also added uncertainties in the definition of the date of disturbance occurrence. In spite of these constraints, our results reinforce the importance of data-driven procedures for the fast generation of near real-time warning alerts of disturbance to be carefully inspected in the images and further checked in the field. Therefore, future studies should consider hybrid satellite datasets to produce denser time series. By increasing the number of consecutive observations, the accuracy of disturbance detection by CCDC will probably increase. Future studies should also consider the use of high spatial and temporal resolution data from a constellation of satellites, especially if reflectance data from the NIR, SWIR-1 and SWIR-2 spectral regions are available for disturbance analysis. For instance, daily observations from the PlanetScope satellite constellation with spatial resolution of 3 m should be tested over the savannas.

Conclusions
The use of a time series of VIs with different sensitivities to biophysical attributes of savannas in an ensemble approach improved the CCDC-detection of disturbance when compared to the detection using single VIs. Considering the specific objectives of the work, we concluded that: (a) The disturbance caused by clearing and fire produced significant reflectance modifications in the NIR and SWIR-1,2 Landsat bands that explained the better performance of NBR2, NBR, and MSI to detect such events.
(b) The ensemble VI approach with CCDC had an accuracy of 71.2 ± 4.0% for detecting total disturbance (savanna clearing plus fire), reducing the omission errors when compared to the use of single VIs in the analysis. The accuracy was generally higher for detecting clearing rather than for mapping burned areas, but the opposite was observed with respect to the recorded date of the occurrence of disturbance.
(c) Savanna clearing predominated in the transition from the rainy to the dry season (April to July), while most events of fire disturbance occurred between August and October toward the end of the dry season.
Finally, the results highlight the importance of data-driven approaches for generating early warning alerts of disturbance in the Cerrado to support programs of monitoring of savannas.  (a) The disturbance caused by clearing and fire produced significant reflectance modifications in the NIR and SWIR-1,2 Landsat bands that explained the better performance of NBR2, NBR, and MSI to detect such events.
(b) The ensemble VI approach with CCDC had an accuracy of 71.2 ± 4.0% for detecting total disturbance (savanna clearing plus fire), reducing the omission errors when compared to the use of single VIs in the analysis. The accuracy was generally higher for detecting clearing rather than for mapping burned areas, but the opposite was observed with respect to the recorded date of the occurrence of disturbance.
(c) Savanna clearing predominated in the transition from the rainy to the dry season (April to July), while most events of fire disturbance occurred between August and October toward the end of the dry season.
Finally, the results highlight the importance of data-driven approaches for generating early warning alerts of disturbance in the Cerrado to support programs of monitoring of savannas.    Results confirmed the higher frequency of fire disturbance events observed in the second period, as indicated in Figure 10. White represents consolidated areas of agriculture that have been masked before the data analysis.