Reconstructing Three Decades of Land Use and Land Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine

: Brazil has a monitoring system to track annual forest conversion in the Amazon and most recently to monitor the Cerrado biome. However, there is still a gap of annual land use and land cover (LULC) information in all Brazilian biomes in the country. Existing countrywide e ﬀ orts to map land use and land cover lack regularly updates and high spatial resolution time-series data to better understand historical land use and land cover dynamics, and the subsequent impacts in the country biomes. In this study, we described a novel approach and the results achieved by a multi-disciplinary network called MapBiomas to reconstruct annual land use and land cover information between 1985 and 2017 for Brazil, based on random forest applied to Landsat archive using Google Earth Engine. We mapped ﬁve major classes: forest, non-forest natural formation, farming, non-vegetated areas, and water. These classes were broken into two sub-classiﬁcation levels leading to the most comprehensive and detailed mapping for the country at a 30 m pixel resolution. The average overall accuracy of the land use and land cover time-series, based on a stratiﬁed random sample of 75,000 pixel locations, was 89% ranging from 73 to 95% in the biomes. The 33 years of LULC change data series revealed that Brazil lost 71 Mha of natural vegetation, mostly to cattle ranching and agriculture activities. Pasture expanded by 46% from 1985 to 2017, and agriculture by 172%, mostly replacing old pasture ﬁelds. We also identiﬁed that 86 Mha of the converted native vegetation was undergoing some level of regrowth. Several applications of the MapBiomas dataset are underway, suggesting that reconstructing historical land use and land cover change maps is useful for advancing the science and to guide social, economic and environmental policy decision-making processes in Brazil. Amazon biome, the expansion of agriculture in the Cerrado, and in the South of the Atlantic Forest biomes. T.A.;


Introduction
Our society is highly dependent on a functional and stable land system for food production, and to access natural resources including water, timber, fiber, ore and fuel, among other ecosystem services and goods [1]. However, human-induced land use and land cover (LULC) changes over the past 50 years have been altering the composition, structure and services of land ecosystems at an unprecedented rate [2][3][4] As consequences, the equilibrium between land, atmospheric and ocean systems, human welfare and wellbeing, as well as global biodiversity are at high risk [5,6].
Brazil is one of the richest biodiversity countries in the world [7] with six unique biomes: Amazon, Atlantic Forest, Caatinga, Cerrado, Pampa and Pantanal. These biomes possess large carbon stocks in their forest [8] and soils [9], and additionally possess the largest global reserves of freshwater [10]. On the other hand, this country is one of the world's producers of agricultural commodities and it has been a major contributor to LULC changes from greenhouse gas (GHG) emissions at the global scale [11,12].
Deforestation for pasture and agriculture expansion, infrastructure development, cities, and political and financial incentives to land occupation are the main drivers of LULC changes in the Brazilian biomes, affecting biodiversity, water resources, carbon emissions, regional and local climate [13]. Currently, the biomes undergoing more pressure on the original land cover are the largest ones, the Amazon (419 Mha, i.e., 49% of the country) and Cerrado (203 Mha, i.e., 23% of the country) [14,15]. However, the Atlantic Forest (111 Mha, i.e., 13% of the country) is the Brazilian biome that suffered the most extensive LULC change in the past, dating back to colonial history in the sixteenth century [16,17]. This biome is highly fragmented by roads and urban centers [18], and immersed in a large agriculture matrix, resulting in 11.7% of old secondary forest cover (i.e., >30 years) [19]. The semi-arid Caatinga biome (84 Mha, i.e., 10% of the country), located in the northeast region of Brazil, is considered the Brazilian biome that has been most altered by LULC change [20], being mainly covered nowadays by secondary growth forests [21,22]. Similarly to the previous ones, the Pantanal biome (15 Mha, i.e., 1.7% of the country) is likewise under high conversion pressure. Cattle ranching  [41]. The Coastal Zone, which was mapped as a cross-cutting theme, is distributed along the Atlantic coast and involves all five biomes except the Pantanal. Evergreen forest, with enclaves of savanna, natural grassland, and extensive wetlands and surface water, with almost 20% of the forested areas biome cleared.
Mosaic of savannas, grasslands and Agriculture, cattle ranching, Figure 1. Map of the six Brazilian biomes, as defined by the Brazilian Institute of Geography and Statistics (IBGE) [41]. The Coastal Zone, which was mapped as a cross-cutting theme, is distributed along the Atlantic coast and involves all five biomes except the Pantanal.
We produced the annual historical maps of LULC between 1985 and 2017 separately for each biome and cross-cutting themes (i.e., pasture, agriculture, coastal zone, mining and urban infrastructure). Then, the maps of biomes and cross-cutting themes were integrated annually. The building blocks of the image-processing protocol to reconstruct annual LULC maps is presented in detail in the sub-sections (Figure 2), and the methodological details for each biome and themes can be found in the Supplementary Material (S1 and S2 Appendices). This is a key element of the MapBiomas LULC protocol which enables the mapping of complex and heterogeneous ecosystems. Evergreen forest, with enclaves of savanna, natural grassland, and extensive wetlands and surface water, with almost 20% of the forested areas biome cleared.
Cattle ranching, agriculture, mining, logging and non-timber forestry production.
Cerrado 203 (23.92%) Mosaic of savannas, grasslands and forests, 50% of the native vegetation cover has already been converted.
Agriculture, cattle ranching, artificial water reservoir and timber exploitation for coal production.
Agriculture, livestock production (in natural grasslands), forest plantation, and urbanization. We produced the annual historical maps of LULC between 1985 and 2017 separately for each biome and cross-cutting themes (i.e., pasture, agriculture, coastal zone, mining and urban infrastructure). Then, the maps of biomes and cross-cutting themes were integrated annually. The building blocks of the image-processing protocol to reconstruct annual LULC maps is presented in detail in the sub-sections (Figure 2), and the methodological details for each biome and themes can be found in the Supplementary Material (S1 and S2 Appendices). This is a key element of the MapBiomas LULC protocol which enables the mapping of complex and heterogeneous ecosystems.

Land Use and Land Cover (LULC) Classification
Defining a classification system remains a challenge for remote sensing and land ecosystem studies, especially for harmonizing different map products [42]. Land cover refers to the Earth's surface characteristics, while land use is linked to human interactions with land surfaces [43]. The MapBiomas classification scheme is a hierarchical system with a combination of LULC classes

Land Use and Land Cover (LULC) Classification
Defining a classification system remains a challenge for remote sensing and land ecosystem studies, especially for harmonizing different map products [42]. Land cover refers to the Earth's surface characteristics, while land use is linked to human interactions with land surfaces [43]. The MapBiomas classification scheme is a hierarchical system with a combination of LULC classes compatible with the Food and Agriculture Organization (FAO) [44] and IBGE [45] classification systems (Table S2). At Level 1, there are six classes: forest (1), non-forest formation (2), farming (3), non-vegetated area (4), water (5) and not observed (6). The forest class includes old growth mature forest (i.e., >30 year-old), early stage (i.e., 5-15 year-old) and advanced secondary growth (i.e., [15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30] year old) forests, pristine forests that have not undergone anthropogenic conversion, savanna woodlands, mangroves and forest plantation. Farming constitutes a land use class for areas dedicated to growing crops and raising livestock. We also attempted to separate farming from non-forest natural formation and non-vegetated area at this first level, but made no attempt to separate natural forest cover from forest plantation, which is a land use activity ( Table 2 and Table S2). Table 2. Land cover and land use classification system for MapBiomas in Brazil.

Level 1 Level 2 Level 3 Description
Forest Natural Forest Formation

Forest Formation
Vegetation types with a predominance of tree species with high-density continuous canopy, areas that were disturbed by fires and/or logging, and forest resulting from natural regrowth.

Savanna Formation
Vegetation types with a tree layer varying in density, distributed over a continuous shrub-herbaceous layer.

Mangrove
Dense and evergreen forest formation often flooded by tide and associated with the mangrove coastal ecosystem.

Forest Plantation
Planted tree species for commercial use.

Non-Forest Formation in Wetland
Floodplain with fluvial and lake influence, subject to periodic or permanent flooding, located along watercourses and in lowlands areas that accumulate water, with herbaceous shrub vegetation and/or arboreal and pioneer formations, and marshes (marine influence).

Grassland Formation
Vegetation type with a predominance of herbaceous stratum, including patches with a well developed shrub-herbaceous stratum.
Salt flat "Apicuns" or salt flats are formations often without tree vegetation, associated to saline and a less flooded area in the mangrove, generally in the transition between this area and the continent.
Other Non-Forest Formation Natural grasslands, Savanna, Park Savanna, Steppe Savanna, Woody-Grassland Savanna, "Campinarana" in the Amazon biome. Rocky Outcrop Naturally exposed rocks in the terrestrial surface without soil cover, often with partial presence of rock vegetation and high slope.

Mining
Areas related to large mineral extraction, with clear soil exposure due to heavy machinery. Only areas belonging to National Department of Mineral Production's (DNPM) chart (SIGMINE) were considered.
Other Non-Vegetated Area Non-vegetated surface areas (infrastructure, urban areas or mining) not mapped into their classes, and exposed soil areas (mainly sandy soil) not classified as grassland formation or pasture.
Water River, Lake and Ocean Rivers, lakes, dams, reservoir and other water bodies Aquaculture Artificial lakes, where aquaculture and/or salt production activities predominate Not Observed Areas blocked by clouds or atmospheric noise, or with absence of ground observation masked out from analysis. Level 2 has 12 classes that also have a combination of LULC classes ( Table 2). The Forest Level 1 class is broken down into two sub-classes: natural forest formation (1.1) and forest plantation (1.2). Then, non-forest natural formation is divided into wetland (2. The MapBiomas LULC Classification System can be linked to other international and national classification systems [44,45]. A detailed description of the MapBiomas LULC Classification and the correspondence with other systems is available in the Table S2. The classification method for implementing the MapBiomas LULC Classification System is also hierarchical, combining the different methods that are described below. Because of that, we defined the prevalence rules to combine the classification results obtained with different methods in order to obtain the final LULC classes (Table 2).

Remote Sensing Dataset
The satellite imagery dataset used in the MapBiomas project was composed by the Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+) and the Operational Land Imager (OLI) Landsat sensors, on board Landsat 5, Landsat 7 and Landsat 8, respectively. MapBiomas used Collection 1 Tier 1 from [46] with the original digital numbers (DN) converted to top of the atmosphere (TOA) reflectance. The Landsat Tier 1 imagery data were already orthorectified based on ground control points and digital elevation model. The Landsat Collection 1 Tier 1 was orthorectified accounting for pixel co-registration and the correction of displacement errors with the pixel spatial resolution of 30 m, and is normalized to TOA reflectance, making it suitable for LULC change analysis [47]. The Landsat Collection used in this study was accessible and processed through Google Earth Engine [38].

Pre-Processing
The mapping unit adopted in the MapBiomas project was defined based on the subdivision of the International Chart of the World to the Millionth on the 1:250,000 scale. Each tile covers an area of 1 • 30 of longitude by 1 • of latitude, totaling 558 tiles covering all the Brazilian biomes ( Figure S1). Mapping tiles intercepting more than one biome were processed separately, with a classification legend, input and parameterization, specifically set for each biome and subsequently merged to the tile area in the post-classification step (explained later on).
The first pre-processing step was to build cloud-free annual tile Landsat mosaics yearly. Cloud and cloud shadow masks were applied to all Landsat scenes. For that, we used the temporal dark outlier mask (TDOM) [48] algorithm and the band quality assessment (BQA) information available in the Landsat Collection. The cloud-masked Landsat scenes, which are image data for specific dates and geographical locations, were selected in Google Earth Engine from the available Landsat archive and combined to produce annual temporal mosaics targeting specific periods of the year for each biome and sub-regions. This procedure warranted that optimal spectral contrast and separability amongst the LULC classes were obtained across the biomes. The Landsat mosaics were generated with statistical reducers (i.e., mathematical functions in Earth Engine), including median, standard deviation, minimum and maximum, among others. We used both Landsat mosaics and Landsat cloud-free scenes in our LULC random forest classification approaches, as described below and in the Appendix S1. These Landsat datasets are the basis to build input features for the classifiers, as described in the next section.

Landsat Feature Space
The annual Landsat scenes produced in the pre-processing step were used to generate the feature space (i.e., variables) used as input for the random forest classifier. All cloud-free Landsat scenes acquired over the biomes in a given year were used to produce temporal annual mosaics with spectral bands and index, fractions and index from spectral mixture analysis (SMA), temporal index (based on median, min, amplitude and standard deviation reducers) and textural index. The images from the best period of each biome (Table S3) were used to produce median images. Each biome had the flexibility to define the optimal period of the year to build the annual mosaics, since the cloud condition and phenological behavior of the LULC varies across the biomes. Therefore, the annual image mosaics have a different reference period for each biome. We also tested the map accuracy with calibration training data to assess if the period selected for building the annual mosaics was optimum to increase the accuracy of LULC classes. Building the annual mosaics with a different time period in the year did Remote Sens. 2020, 12, 2735 9 of 27 not affect LULC annual change estimates because the reference period of each biome is the same over time, allowing to build a single annual product at that level (Table S3).
After building the annual mosaics, the next step was to build the feature space for the random forest classifier. For that, we used the compositional, spectral, temporal and textural information extracted from the image annual mosaics and from all cloud-free pixels within the year. Fractional information was obtained with spectral mixture analysis (SMA; Tables S4 and S5) which was also used to calculate additional SMA indices such as the normalized difference fraction index (NDFI) [49].
The NDVI values of all cloud-free pixels of all images in each year were divided into quartiles, and then the median values of the highest quartile were considered the wet season image and the lowest one the dry season image. Reducers of minimum, maximum, difference and standard deviation were used in cloud-free pixels of all images in each year to produce the temporal index. Finally, an entropy function was applied in a 5 × 5 window around each pixel to produce the texture index in the Green band. A total of 104 features were available for each biome to select the best features. The feature dataset used by each biome is presented in the Supplementary Material (Appendix S1 and Table S6).
The temporal mosaics procedure described above is not optimal to separate all LULC, especially agriculture and pasture and non-forest formations due to seasonal changes. To overcome this limitation, some classes were classified separately, as cross-cutting themes. These classes included: pasture, agriculture, forest plantation, urban infrastructure, and mining. Each cross-cutting theme (used a specific classification approach with all cloud-free Landsat scenes available yearly, or a temporal mosaic for specific intra-annual period, to highlight the seasonal change and better distinguish these LULC classes. Additionally, the feasibility to derive the 104 variables varied temporally and spatially due to data availability and cloud conditions. The areas affected by these factors, with less Landsat data, generally produced poorer classification results which were corrected using the temporal filtering approach. More information about the cross-cutting theme input images and their classification methods are provided in the Appendix S2.

Random Forest Classification
We used the random forest classifier available in Google Earth Engine for LULC classification [38]. The number of trees in the random forest classifier varied from 50 to 100 iterations (Appendix S1 Table B; Appendix S2). The number of features selected was the default value for this parameter (i.e., mtree which is given by the square root of the number of features as defined for each biome).
For training the random forest classifier, we applied two approaches. For the Amazon biome, we combined existing land cover maps to randomly select and automatically assign the land cover classes to the training samples. We extrapolated the land cover class assigned to the training sample for the years that did not have land cover maps from other sources. Image analysts inspected Landsat color composites of these years and reassigned the correct land cover to the training samples. For the other biomes, we built training samples for the random forest classifier using a previous MapBiomas Collection 2.3 as a reference to identify stable land cover classes; i.e., pixels with no change in land cover over the timespan of this map collection (2000-2016). The stable land cover classes are the ones that did not change the pixel LULC class between 2000 and 2016. This map with stable land cover classes was used to randomly select samples and assign the LULC class to training samples. For the years 1985-1999 and 2017, which completes the LULC Collection 3.1 of this study, we extrapolated the stable land cover classes, assessed possible land cover change, and kept the LULC class if no change was observed. The cross-cutting themes also used a combination of stable samples, reference maps and image interpretation to build their training samples. More details about the procedure to build training samples are presented in the Appendix S1 and Table S6.

Post-Classification Filters and Map Integration
The final classification result for each map tile consisted of three products: (a) classification (no post-processing), (b) classification after applying spatial and temporal filters and (c) post-filter, integrated classification resulting from combining with cross-cutting themes following empirically defined prevalence orders. The first post-classification action was the application of spatial and temporal filters to the maps generated in the LULC classification step. The application of these filters removed classification noise and disallowable LULC class transitions. The temporal filter was also used to fill the information gap due to the cloud. These post-classification procedures were implemented in the Google Earth Engine platform and are described in more detail below.

Spatial Filter
The spatial filter segmented and indexed the classes of each collection into contiguous regions [50], which were subsequently identified and reclassified based on the following criteria: areas less than or equal to half a hectare (i.e., approximately 5 pixels) were reclassified based on the majority of the neighboring classes. For instance, a patch belonging to a given class of up to 5 pixels was first identified along with its neighboring pixels; this patch was then reclassified as the predominant class value of the neighboring pixels. This process was applied to all segments of the LULC classes selected for filtering, which corresponds to the minimum mapping unit of MapBiomas Collection 3.1. Spatial filters applied in each biome and cross-cutting theme are described in the Appendices S1 and S2.

Temporal Filter
The temporal filter seeks to identify and correct the class transitions that are expected along a series of consecutive years (i.e., 3 to 5), as well as to fill in pixels with no data caused by cloud cover [50]. For example, a pixel classified as non-forest in a given year ti (where i = 2008, 2009, . . . , 2015), and forest in year ti1 and ti+1, was reclassified as forest for the year ti. Several transition rules were defined and applied to be used in the temporal filter for each biome to deal with specific phenological and land use transitions. Temporal filters applied to each biome and cross-cutting theme are presented in the Appendices S1 and S2.

Map Integration
The biome products of digital classification after temporal filter application, for each of the 33 years in the period 1985-2017, were then integrated with the cross-cutting themes, by applying a set of specific hierarchical prevalence rules (Table S6). The integration process was made on a per pixel basis. A spatial filter similar to the one described above was applied in the integrated maps to remove the isolated classes with less than half a hectare, as well as the noise resulting from any Landsat data misregistration.

LULC Transitions
Transition classes represent LULC change measured by the annual pixel-to-pixel class difference between 1985 and 2017. A similar spatial filter described in the Section 2.3.5 was applied in the transition maps to remove spurious isolated class transitions. The aim of the LULC transition filter is to eliminate single pixels or streams of pixels on the border of different classes derived from the created transition maps. The general rule applied in this filter was to remove from the transition classes' single isolated pixels and streams of up to five pixels along the border of transition classes.

Accuracy Assessment and Area Estimation
Accuracy assessment analysis and area estimation were performed based on~75,000 independent samples (named reference dataset) at the Landsat pixel level for each one of the years from 1985 to 2017 in all of Brazil (Appendix S3). These samples were generated by a stratified random sampling, which considered 127 regular strata (resulting from the spatial aggregation of neighbors tiles from International Charts of the World to the Millionth on the 1:250,000 scale), a confidence interval of 95%, and a maximum standard error of 5% to establish the sample size for each stratum, following the good practices proposed by [51,52]. In order to increase the number of samples in heterogeneous landscapes, we stratified the samples proportionally to six slope classes (see the Appendix S3 for details).
Each sample (i.e., Landsat pixel) was inspected by three independent interpreters, and in case of disagreement among interpreters, a senior interpreter assigns the final LULC class of the pixel. This evaluation was performed using the Temporal Visual Inspection web application (TVI-tvi.lapig.iesa.ufg.br), developed by the Laboratório de Processamento de Imagens e Geoprocessamento (Lapig).The TVI application allowed the evaluation of all LULC classes using at least two Landsat images per year, the Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation index, precipitation time-series, and high-resolution imagery available in Google Earth. Accessing all these datasets for each sample through a graphical user interface of TVI with satellite image color composites to provide texture and contextual information, spectrum-temporal graphs to assess phenology of LULC class, and higher spatial resolution imagery, allowed the interpreters to evaluate and differentiate the LULC classes (Table 2). Additionally, all TVI interpreters were trained by experts from each of the Brazilian biomes, in order to establish the interpretation criteria for all the LULC classes mapped by MapBiomas.
Subsequently, we used a majority agreement rule, which considers the LULC class assigned by the majority of the interpreters as the final one to the reference sample. Finally, a classification error matrix was created and several metrics (i.e., global user and producer accuracies, quantity and allocation disagreement) were calculated for each year, class and biome, following a standard good practice protocol [51]. Then, we calculated the unbiased area estimation (using the sample weight obtained with our reference sample dataset), classification error matrix, and estimated user, producer's and global accuracies, and quantity disagreement and allocation disagreement for each year, class and biome following a standard good practice protocol [51] (see Appendix S3 for details).

LULC Map Accuracy
We generated 33 annual LULC maps for Brazil with Landsat data from 1985 to 2017 using Google Earth Engine at 30 m pixel resolution. First, we presented the temporal trends of the five main LULC classes including forest, non-forest formation, farming, non-vegetated area and water (Table 3, Figure 3). At Level 1, the LULC mapping product presented 89.13% of global accuracy with 9.21% of allocation disagreement and 1.66% of area disagreement ( Table 4). The Amazon biome had 95.13% average global accuracy, which was the highest accuracy at class Level 1, followed by the Atlantic Forest biome with 87.3%. The Cerrado, Pampa and Caatinga had an average global accuracy ranging from 81.4% to 80.03%. The Pantanal biome had the lowest average global accuracy at class Level 1 (73.17%). For the class Level 2, the average global accuracy for Brazil had a marginal reduction (87.91%), with the Amazon biome showing almost the same value (95.03%) obtained at class Level 1. The Pantanal and the Atlantic Forest biomes showed the largest decrease in average global accuracy at class Level 2, of 7% and 4%, respectively. The Caatinga, Cerrado and Pampa had an average decrease of 1.7% ( Table 4). The overall accuracy had a standard error < 1%, implying that it did not vary over the annual time series. The classification accuracy for each year between 1985 and 2017 is available in the Figure  S2 and Dataset S2.      We summarized the user's and producer's accuracy estimates by averaging them over the 33 years timespan of this study for the entire country (Table 5). These estimates allowed to break down the overall accuracy reported for the country level (Table 4) into class levels. For class level 1, the user's accuracy ranged between 61.76% and 92.54%, for non-forest formation and non-vegetated, respectively. The lower user's and producer's accuracy of non-forest formation LULC classes, such as wetlands and grasslands at class Level 2, can be explained by the spectral confusion with other LULC classes such pasture, agriculture and water with Landsat data. Understanding the spectral confusion amongst the LULC classes is crucial for the proper user of the MapBiomas dataset. Because of that, all the classification error matrices are available in the Datasets S1 and S2 and we built an interactive visual tool to explore this information in the MapBiomas dashboard (https://mapbiomas.org/en/accuracy-analysis).

LULC Spatial and Temporal Trends in Brazil
In 1985, the forest class, which also includes forest plantation, secondary forest and old growth forest with or without signs of degradation by fire, forest fragmentation or selective logging, covered 70.5% of the Brazil's territory, with 600 Mha. The total extent of the farming class reached 172 Mha in that year, with the second largest extent of land cover (20.3% of the territory). The minority land cover classes in terms of extent were non-forest formation (63.4 Mha, 7.48%), followed by surface water with 12.5 Mha (1.47%) and non-vegetated area (2.17 Mha, 0.25%) ( Table 3).
We estimated the annual area change in the land cover classes at Level 1 and captured their temporal trends between 1985 and 2017 ( Figure 3, Table 3). The forest class was the one with the greatest reduction in area, with a conversion of 61 Mha or 10% of forest loss between 1985 and 2017. Non-forest formation, comprised mostly of grasslands, also had almost 10% of its areas reduced in this period, losing 5.7 Mha. On the other hand, farming expanded by 38% with an absolute area of 66 Mha, but not all farming advanced through forested regions, as we will explain in more detail below. The other classes, water and non-vegetated area also expanded their area relative to 1985 by 8% (1 Mha) and 44% (0.96 Mha), respectively ( Table 3). The fastest rate of annual land change happened between 1985 and 2005, for forest loss and farming and water expansion (Figure 3). The non-forest formation and non-vegetated areas showed faster annual changes after the year 2000.
At the country level, the water class showed the lowest annual average percent change (0.17% per year; s ± 7.12%) with a more pronounced trend towards reducing surface water between 2010 and 2017 in the Caatinga and Cerrado biomes ( Figure 4B). The Caatinga biome showed the highest rate of surface water shrinkage in the 2010s, with an annual average rate of −5.1% per year, followed by the Cerrado biome which showed five years in this decade with surface water reduction. Disregarding the outlier in 2011 (8.89% increase relative to 2010), the Cerrado biome showed a decreasing change in surface water. In the Amazon biome, we also detected a slight signal of surface water reduction in the 2010s. The Atlantic Forest and Pampa biomes also showed a minor trend towards an increase in surface water and higher variation over the time-series period. However, surface water had the lowest extent in the Pampa biome between 1985 and 1995. The Pantanal biome presented the highest variation of surface water exhibiting a twenty-year harmonic cycle   (Figure 4; Table S7). Since 2005, surface water has been increasing in the Pantanal biome (5% average per year).
Non-vegetated area was the LULC class with the smallest extent, but showed a rapid increase in area from 1985 to 2017 with an average expansion rate of 1.72% per year (s ± 10.2%) in the whole country ( Figure 4B). We expected this temporal change because urban and infrastructure development increased in Brazil in this period. We were not able to estimate the statistical factors to adjust the area estimate of this class for the Pantanal biome because of the lack of reference data. The Amazon biome had the lowest average annual change for this class in the entire period (0.4% per year). Moreover, the Cerrado biome had a 2% increase per year in surface area for non-vegetated areas, and the Caatinga and Pampa near 2.5%. The Atlantic Forest showed an overall annual increase in non-vegetated areas with an annual average rate of 1.3% per year. The overall rate for the Amazon biome was a 0.4% increase, with a higher variation of change along the time-series ( Figure 4B; Table S7).
The farming class also expanded between 1985 and 2017 at rate of 1.7% per year (s ± 3.8%) ( Figure 4). The highest annual average rate of farming expansion was detected in the Amazon biome (4.6% per year), followed by the Pantanal (3.5% per year). For the Cerrado, Pampa and Caatinga, the expansion of Farming was at a lower pace, with an annual rate of 0.9%, 0.8% and 0.6% percent increase per year, respectively ( Figure 4B). However, we observed higher rates of farming expansion between 1985 and 2005 in the Amazon, Pantanal and Cerrado biomes, at 6.7%, 5.1% and 1.3% yearly expansion, respectively. After 2005, the annual expansion of Farming in these biomes fell to 1.2% in the Pantanal, 1% in the Amazon, and 0.4% in the Cerrado ( Figure 4A, Table S7).
The non-forested formation class showed an annual change of −0.34% per year (s ± 2.8%) between 1985 and 2017. The Atlantic Forest and the Pampa biomes showed the highest rate of shrinking with −0.83% and −0.68% per year on average, respectively. The remaining biomes had a lower annual average reduction of non-forested areas ranging between −0.06% per year in the Amazon and −0.2% in the Cerrado ( Figure 4B). decreasing change in surface water. In the Amazon biome, we also detected a slight signal of surface water reduction in the 2010s. The Atlantic Forest and Pampa biomes also showed a minor trend towards an increase in surface water and higher variation over the time-series period. However, surface water had the lowest extent in the Pampa biome between 1985 and 1995. The Pantanal biome presented the highest variation of surface water exhibiting a twenty-year harmonic cycle (1985-2005) ( Figure 4; Table S7). Since 2005, surface water has been increasing in the Pantanal biome (5% average per year).  However, in absolute terms, the Amazon region has lost much more forest area than the other biomes in Brazil ( Figure 4A,B).

The Main LULC Change in Brazil
The spatial distribution of these LULC classes in 1985 and 2017 are shown in Figure 5, which was used for quantifying the main LULC change in this period. Breaking down these LULC classes to the class Level 2 revealed more unique temporal trends of these classes by biome. In Figure 6, we have the area estimate for 10 of 16 LULC classes at Level 2 ( Table 2) per biome. The wetland class was mapped only in the Pampa and Pantanal biomes and the other non-forest formation (ONFF) in the Amazon and Atlantic Forest (Figure 6), totaling 2.9 Mha and 15.6 Mha in 1985, respectively (Table 6). These classes combined covered 2.1% of Brazil in 1985 and 2017, and both lost 0.7 Mha in this period ( Table 6). The majority of wetland mapped lays in the Pantanal biome, and the other non-forest formation was almost evenly split in the Amazon and the Atlantic Forest.  In our classification scheme, the Forest Level 1 class is split into natural forest formation (NFF) and forest plantation at the classification Level 2. The NFF is sub-classified into forest and savanna formations ( Table 2). The NFF class covered 69% (598.9 Mha) of the Brazilian territory in 1985. Between 1985 and 2017, 11% of the country's the NFF were converted into other land cover types, resulting in a forest loss of 65.9 Mha. Despite the relative 277% growth of forest plantation in this period, we estimated only 6 Mha was in plantations in 2017 (Table 6). It was not possible to estimate the area of forest plantation over the entire timespan of this study (i.e., 33 years) because of the limited reference samples for applying the area adjustment protocol over many years, in the Amazon, Cerrado and Pampa biome. Therefore, the area of this LULC class remains highly uncertain.
The grassland class was mapped in all biomes except in the Amazon ( Figure 6). The total area estimated of grassland in 1985 reached 45.2 Mha (5.3% of the Brazilian territory). We detected a loss of 4.4 Mha of this class between 1985 and 2017 (9.8% loss). The annual mapping uncertainty over the time-series was high in Cerrado, and much higher in the Pampa and Pantanal biomes, and no area estimate was obtained for the Amazon biome grasslands ( Figure 6, Table 6).  The grassland class was mapped in all biomes except in the Amazon (Figure 6). The total area estimated of grassland in 1985 reached 45.2 Mha (5.3% of the Brazilian territory). We detected a loss of 4.4 Mha of this class between 1985 and 2017 (9.8% loss). The annual mapping uncertainty over the time-series was high in Cerrado, and much higher in the Pampa and Pantanal biomes, and no area estimate was obtained for the Amazon biome grasslands ( Figure 6, Table 6). The area of agriculture increased in all biomes, except in the Pantanal, where it was not possible to characterize its temporal trend because we were not able to estimate the annual area for the entire time-series ( Figure 6). The total area of agriculture increased from 18.9 Mha to 51.5 Mha between 1985 and 2017, representing a 172.5% expansion of 32.6 Mha ( Table 6). Most of this expansion happened in the Atlantic Forest, Cerrado and Pampa biomes, with the highest area uncertainty in the later one ( Figure 6).
The pasture class showed different annual trends in the biomes with periods of continuing expansion (Amazon and Pantanal), stabilization after 1995 (Caatinga, Cerrado), and shrinking (Atlantic Forest). This class was not mapped in the Pampa biome because natural grasslands area is used for raising animals in this biome. Overall, the pasture area increased by 45.4 Mha in Brazil from 1985 to 2017 (Table 6, Figure 6). In four biomes (Atlantic Forest, Caatinga, Cerrado, and Pampa), 143.5 Mha were mapped as either pasture or agriculture, which were assigned in the mosaicking class (Table 6, Figure 6). This mosaic class was reduced from 55.3 Mha in 1985 to 42.7 Mha in 2017 (i.e., a loss of −22.9%). This result may indicate that it has become more feasible to reliably distinguish pasture from agriculture in more recent years, reducing the mapping uncertainty between them.
River, lakes and ocean (RLO), as well as urban infrastructure LULC classes had an increase in area between 1985 and 2017 of 1.0 Mha each ( Table 6). The temporal variation is better depicted for the river, lakes and ocean in the Figure 4B and the results for each biome were presented in the previous section. The remaining LULC classes that were not discussed here are rare classes (i.e., have a small proportion of area and reference data, such as mangrove, salt flat, beach and dune, rocky outcrop). Considering this, we were not able to provide a reliable estimate of their area with Landsat imagery.

LULC Biome Transitions
We estimated that 63.5% of the Brazilian territory did not undergo change in its original land cover class from 1985 to 2017, totaling 540.7 Mha (Table 7, Figure 7A). These persistent LULC classes do not imply that there were no land cover degradation processes during the 33-year period of this study or no land cover regeneration. Changes in their structure and/or composition, usually associated with wood, non-timber product forest harvesting, and fires may have happened, but no effort to detect degradation processes has yet been applied at the country level. The majority of the areas with persistent LULC classes occurred in the Amazon biome, encompassing 40% (341.6 Mha) of the total area in Brazil that did not show land cover change in the timeframe of this study. Our analysis showed that 323.4 Mha in the Amazon biome, or 94.6% of the total LULC that did not change, was represented by natural forest formation ( Table 7). The Cerrado biome had the second largest area with no LULC change (110.2 Mha; 11.8% of the entire country), of which 67 Mha were natural forest formation. In the Atlantic Forest, the persistent land cover made up of 47.6 Mha (9.1% of the country), with nearly 50% associated with agriculture and pasture. The Caatinga biome had 39.2 Mha of LULC classes mapped in 1985 continuing in the same classes in 2017, with 80% (i.e., 31.6 Mha) related to natural forest formation. The Pampa and Pantanal biomes showed 0.7% for the total area relative to Brazil with no change in LULC classes (Table 7, Figure 7A). The remaining 36.5% (310.8 Mha) of the country area showed changes in LULC classes from 1985 to 2017. We defined five types of changes amongst the LULC mapped in this study: vegetation loss, vegetation gain, water-land transitions, shifting land use and vegetation dynamics (Table 8, Figure 7B).
Shifting land-use was responsible for 31.6% of the total LULC change between 1985 and 2017 (Table 8), encompassing an area of 98.3 Mha. The Atlantic Forest and the Cerrado biomes contributed 40.9% and 36.1% to this change, respectively. The Caatinga (13.7%) and Amazon (6.6%) biomes contributed less to the shift amongst land use classes. In most cases, these changes were associated with farming, with agriculture replacing pasture. Finally, a shift in land use was much less pronounced in the Pampa (2.3%) and Pantanal (0.3%).   The vegetation loss totaled 102.4 Mha, representing 33% of the total area that underwent LULC change between 1985 and 2017 (i.e., 310.8 Mha; Table 8). We discovered that most vegetation loss, which includes forest and non-forest formations, occurred in the Amazon (41.8%) and Cerrado (33.8%) biomes. To a lesser extent, in this category we had the following biomes: Caatinga (10.7%), Atlantic Forest (9.2%), Pampa (2.9%) and Pantanal (1.6%) (Table 8, Figure 7).
Shifting land-use was responsible for 31.6% of the total LULC change between 1985 and 2017 (Table 8), encompassing an area of 98.3 Mha. The Atlantic Forest and the Cerrado biomes contributed 40.9% and 36.1% to this change, respectively. The Caatinga (13.7%) and Amazon (6.6%) biomes contributed less to the shift amongst land use classes. In most cases, these changes were associated   Figure 7B).
Since we did not generate reference data to assess the accuracy of the LULC transitions presented above, all the area estimates were based on pixel counting. Therefore, these results must be considered as a first attempt to characterize and measure LULC change in long time series in Brazil. We have not attempted to evaluate intermediary LULC change, which is the subject of another ongoing study.

Discussion
This is the first time that LULC change has been quantified in all Brazilian biomes with this degree of spatial detail (i.e., at 30 m pixel size) using +30-year time-series Landsat data. Until now, this LULC change information in Brazil was either restricted in space and time, covering a few biomes and short periods of time (e.g., [53][54][55]), or long time-series, but focusing on deforestation in the portion of one of the biomes [56]. Coarser spatial resolution remote sensing images have also been used to map LULC using Google Earth Engine covering all biomes in a single year [57], and global LULC products [31] are available with limited inputs from local experts. We did not attempt to investigate the level of spatial and temporal agreement between the MapBiomas LULC maps with the existing regional and global ones. This task is an ongoing effort of our research group, which requires the harmonization of LULC classification schemes, and spatial and temporal coherence amongst the LULC maps for undertaking the agreement analysis [58]. A recent study conducted by another research group compared their LULC maps, produced with PROBA-V imagery at 100 m pixel size, with our MapBiomas LULC map for 2015, resulting in a 69% agreement among the most representative LULC classes (i.e., forestland, shrubland, grassland, pastureland, cropland, water body used in the PROBA-V study) [57]. However, this study did not investigate which of these LULC products had the highest accuracy.
The LULC annual dataset presented in this study allowed numerous applications, such as the estimation of vegetation gain and loss, and the understanding of land cover drivers. Between 1985 and 2017, 38% of the Brazilian territory was modified by cattle ranching and agriculture activities, as well as infrastructure development, changing native forest and non-forest formations, indistinctly in all six biomes. Pasture expanded by 46% in the country, mainly in the Amazon and Pantanal biomes, while agriculture increased by 172%, mostly in the Atlantic Forest replacing old pastures and in the Cerrado biomes converting savanna and grasslands formations. Our LULC dataset revealed that 86 Mha of the converted native vegetation is undergoing some level of regrowth. The MapBiomas time-series also generated that, in the Amazon biome, secondary vegetation increased 12 Mha in 2017 [59], exceeding 45.5% to the area of primary deforestation mapped by the Brazilian monitoring system (PRODES) [60]. Thus, our LULC annual dataset goes beyond the existing LULC studies and monitoring systems and helps to fill the information and knowledge gaps in monitoring LULC dynamics in the country in the past three decades.
We built the LULC maps of this study iterating over map collections, such as that applied to MODIS global land cover products [33]. In the MapBiomas Collection 3.1, we had substantial improvement in the random forest classifier and built a robust reference dataset for accuracy assessment. The first Collection 1.0 was mainly developed to allow our research team, engineers and data scientist to port our existing classification algorithm and optimize Google Earth Engine LULC mapping in a short time-series (i.e., between 2000 and 2016). In the Collection 2 (which evolved until Collection 2.3), we were able to move from empirical decision trees based on hierarchical rules defined by analysts to a random forest machine learning algorithm. Empirical decision rules have an advantage of better understanding the variables and rules to map LULC classes, working well with a set of small classes [61]. However, as we increased the levels and numbers of LULC classes, empirical rules became complex, making human decision for partitioning the data into hierarchical binary classes unfeasible. To overcome this task, we adopted random forest in Collection 2.3 which evolved with unbiased training and accuracy assessment into Collection 3.1, using existing LULC maps from different sources to randomly select the training samples. Besides the iterative mapping collections, we implemented a flexible LULC mapping protocol which allows each biome to define the feature space and samples for training the random forest classifier (Appendix S1), as long as the biome maps can follow the map integration protocol to guarantee the spatial and temporal coherence along their transitional ecotone zones.
Yet, besides the gain in information brought by MapBiomas LULC Collection 3.1, there are still challenges and limitations to be overcome. First, overall accuracy was lower than 80% in highly seasonal and heterogeneous biomes (e.g., Cerrado, Caatinga, Pantanal and Pampa). The Amazon biome, with most of the land cover comprised by forest, had the highest overall mapping accuracy of 95%. However, less predominant LULC classes had lower accuracy in all biomes. For example, the spatial variability of native vegetation types and spectral similarity among LULC classes, such as grassland and pasture, are challenging to separate [62], even using hyperspectral images [63]. Second, the reference dataset used to assess the mapping accuracy was built based on the visual interpretation of Landsat color composites, and ancillary spectra-temporal time-series and higher resolution imagery data (when available). We were not able to estimate the classification uncertainty of our reference dataset, which is a task in course. Third, we still need to advance in the analysis of LULC transitions and estimate its uncertainties. In this study, we limited the LULC change analysis to a +30-year period for the LULC classes that had lower classification errors (Tables 6-8). Further investigation is necessary to understand the impact of LUCC classification error to estimate yearly change. Our research group is also exploring methods to understand the frequency of a pixel change to its LULC class and the number of times it happens [53,64]. Fourth, we recognize that the MapBiomas LULC mapping approach is complex because it involves different data inputs and algorithm parametrization for each biome, and some classes are mapped separately as a cross-cutting theme requiring post-classification integration from multiple classification results. As a single random forest classification failed to include all LULC (Table 2), it is likely that we will continue to use cross-cutting themes and post-classification integration of several maps and class prevalence rules to compose the final LULC. Rare classes in our classification schemes (e.g., beach and dunes, aquaculture, mining, salt flat, rocky outcrops) were penalized by the random forest classifier and tended to be under mapped, as pointed out in global mapping studies [33]. These rare LULC classes were impacted in our accuracy assessment analysis as well, showing high classification errors. As an alternative to overcome this issue, we balanced the training classes in our random forest algorithm by using published and accessible reference LULC maps for the Amazon biome, and by adding manually sampled areas that were under mapped in the other five biomes. However, the impact of rare classes persisted, leading us to analyze in this study the LULC dynamics of only the eight most predominant classes (Table 6). Finally, we will also attempt to separate agriculture from pasture in the mosaic class and improve the spatial and temporal consistency in some periods of time series in future MapBiomas Collections.
Several studies have already been published by the MapBiomas network to better understand the spatial-temporal LULC dynamics in Brazilian biomes focusing on specific LULC classes. For example, surface water dynamics in the Amazon region [65], Cerrado native vegetation change [62], mangrove [66] and pastureland dynamics and characterization over the whole country [67]. New research methods to explore and analyze LULC trajectories were applied to the Caatinga biome with the MapBiomas LULC time-series [68]. The Greenhouse Gas Emission and Removal Estimating System-SEEG [69] uses the LULC annual data to estimate the GHG emissions and removals for the land use sector in Brazil, which represents 44% of GHG emissions in 2018 (SEEG 7 available at https://seeg.eco.br/). Indeed, reducing the LULC uncertainty of the SEEG GHG estimates was the main reason for our research group to launch the MapBiomas Project.
All products, methods and tools of the MapBiomas Project are open access, transparent and publicly available in the internet (https://mapbiomas.org/) for non-commercial use. With open access data, it was possible to perfect the LULC maps with end-user feedback, which reached one hundred thousand users in 2019. Additionally, more than one hundred peer-reviewed research articles were published between 2017 and 2019 using the LULC maps of this project. Since the first MapBiomas Collection, the applications of this dataset keep growing in science including, for example, the assessment of conservation and biodiversity policies [70][71][72][73], climate change impact [74,75], and the mapping of human disease risks, including hantavirus [76], yellow fever [77], and Leishmania [78]. Therefore, these LULC maps presented in this study are already contributing to better inform the scientific community, policy makers and civil society organizations. In this study, we present an in-depth methodology and the processes used to build the MapBiomas collections, and a rigorous assessment of the map accuracy, which is required to support the existing and emerging scientific and societal applications of our LULC map collections. In addition, we are also advancing in the understanding of historical LULC dynamics in the Brazilian biomes and of the main drivers of change.

Conclusions
We reconstructed LULC time-series information over three decades in Brazil, based on Google Earth Engine cloud-computing, freely available Landsat data and a collaborative network of experts willing to share knowledge. Our LULC mapping protocol required breaking up the image classification per biome and cross-cutting themes, followed by the post-classification map integration rules. This process was required to account for the unique conditions of the biomes, including the phenological changes of LULC classes, the availability of Landsat data due to cloud conditions, and the history and intensity of land use. The accuracy assessment was used to define the optimum period of each biome using calibration data training embedded in the random forest classifier. Classifying separately cross-cutting classes (e.g., pasture, agriculture and urban infrastructure) was necessary to reduce spectral confusion in the random forest. The key element of our LULC approach is the post-classification integration protocol, which requires spatial coherence of the integrated maps along the biome boundaries. Finally, we decided to put the LULC maps openly available prior to the scientific publications, with a detailed description of the methodology. This allowed several scientific publications using our LULC dataset in Brazil and abroad, and getting feedback from the data users to improve our maps. Policy-makers are also using the LULC dataset to make, plan and assess public policies in the country. The MapBiomas collaborative initiative is also expanding to generate new LULC products in other countries, such as in the Pan-Amazonian countries (https://amazonia.mapbiomas.org/), Chaco region (https://chaco.mapbiomas.org/) and most recently in Indonesia. Based on MapBiomas experience in Brazil, involving local institutions and experts, and international partners, our LULC mapping protocol will likely expand to other countries contributing to support science and societal applications and better policy decisions.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/12/17/2735/s1, Figure S1: Mapping tiles (1 × 1.5 degrees) covering the Brazilian biomes mapped to reconstruct LULC change information, Figure S2: Classification accuracy in Levels 1 and 2 of each year from 1985 to 2017, Table S1: Land use and land cover classes mapped by biomes in Brazil, Table S2: Description of MapBiomas LULC Classification System linked to other international and national classification systems, Table S3: Period applied in metrics using median values and maximum cloud cover percentage in image metadata used in the annual Landsat image mosaics by biomes, Table S4: List and description of bands, fractions and indices available in the feature space, Table S5: Bands, indices and fractions of the feature space dataset applied for each biome, Table S6: Prevalence rules for combining the output of digital classification with the cross-cutting themes in the biomes in MapBiomas, Table S7: Land use and land cover percent change statistics (mean and standard deviation) by circa decade biomes, Appendix S1: Classification specificities by biomes, Appendix S2: Cross-cutting theme classification specificities, Appendix S3: Accuracy method and area estimation, Dataset S1: Classification error matrix of Level 2 land use and land cover classes, Dataset S2: Classification accuracy in Levels 1 and 2 of each year during 1985 to 2017.