Tropical Forest and Wetland Losses and the Role of Protected Areas in Northwestern Belize, Revealed from Landsat and Machine Learning

Changes in land-use and land-cover, including both agricultural expansion and the establishment of protected areas, have altered the landscape pattern and extent of forest and wetland cover in the tropics. In Central America, land-use and land-cover change is also threatening the cultural resources of the region’s ancient Maya heritage since many ancient sites have been degraded by burning, deforestation, and plowing. In this study of Orange Walk District of northern Belize, from the 1980s to the present, we used multitemporal Landsat data with a random forest classifier to reveal trends in land-use and land-cover change and the increasing loss of forest and wetlands. We develop a random forest classifier that is time-generalized to map land-use and land-cover across the entire Landsat record, including Landsat 4, 5, 7, and 8, with a single algorithm. Including multiyear and seasonal composites was important for obtaining cloud-free coverage and distinguishing between different land-use and land-cover types. Early deforestation (1984–1987) was in small patches scattered across the landscape and likely driven by small scale agriculture such as milpa and smaller area tractor and horse-drawn plowing. The establishment of protected areas in the late 1980s and early 1990s allowed for forest regrowth in these areas, while wetland losses were high at 15%. The transition to industrial agriculture in the 2000s, however, drove a 43.6% expansion of agriculture and a 7.5% loss of forest and a 28.2% loss of wetlands during the ~15 years. Protected areas initiated in the 1980s led to a nearly 100 km2 decrease in agriculture from 1984–1987 to 1999–2001, and they became essential refugia for habitat and maintaining ecosystem services.


Introduction
Deforestation in tropical forests and wetlands has proliferated across Central America for decades and accelerated across the tropics [1,2]. Declines in forests and wetlands are often the outcome of anthropogenic land management, but the drivers and timescales of these changes vary broadly [3][4][5]. These land changes are local processes that accumulate to global effects [6][7][8][9]. Like in many Central American countries, the forest and wetland ecosystems of Belize are under increasing threat due to land-use change [10]. Particularly, the increase in industrial agriculture in recent decades has driven the loss of tropical forests and wetlands. These diverse ecosystems not only provide ecosystem services and bring income to this tourism-dependent nation, but they are also essential for maintaining atrisk species such as jaguars, tapirs, and white-lipped peccaries [10,11]. In addition, these forested areas contain expanses of ancient Maya structures, roads, water management infrastructure, and agricultural fields [12][13][14].
While agriculture expands in Belize, the country also has established many public and private protected areas for ecosystems and cultural heritage. Yet studies in Belize assessing land-use and land-cover change through time have simply estimated total forest loss without relating changes to potential drivers [15]. While forests are important ecosystems Belize has a long history of human land-uses, with widespread agriculture dating back at least 5000 years [12]. The Maya once occupied the entire region from about 4000-1000 years BP, and Maya land-uses peaked from 2000 to 1000 BP, when intensive farming systems such as terracing and wetland cultivation flourished. Around 1000 BP population declined, and many urban areas were abandoned [18]. For the millennium after Maya abandonment, the area has had much smaller populations that practiced subsistence farming through milpa agriculture and colonial logging for logwood and mahogany [19]. In milpa agriculture, farmers rotate plots every few years, using fire to clear a new plot and letting the previous plot lie fallow, and rotating plots before returning to the original [20][21][22]. In the last half-century, however, industrial, agricultural production has spread rapidly and increased the country's food security and exports but has also driven deforestation and destruction of wetlands.
A handful of studies have quantified the land change in all of Belize, focused on particular regions or protected areas, or only quantified the loss of forest [15][16][17]23]. The only study to explicitly map deforestation across Belize is a NASA SERVIR technical report, and it only quantified changes in forest and non-forest classes [15]. The report notes that from 1980-2012, forest cover in Belize declined from 1,648,783 hectares to 1,366,300 hectares, or from 74.38% to 61.64% of Belize's land area [15]. Other studies of deforestation in Belize have focused on smallholder lands in local areas or specific reserves [16,17,23,24]. Belize remains an understudied example of rapid agricultural expansion driving deforestation and wetland destruction. The Orange Walk District provides a particularly interesting example to study both deforestation and conservation because it has such rapid and intensive agricultural development up to the borders of large tracts of protected land.
There are many classification methods for mapping land-use and land-cover (LULC) with remotely sensed data. Machine-learning algorithms are growing in scientific research because they are more accurate than commonly used parametric algorithms like maximum-likelihood [25,26]. In addition, machine-learning algorithms can handle more complex data spaces and are more efficient because they do not rely on data distribution assumptions [25,27]. Machine-learning techniques that use ensembles of classification-such as neural network ensembles, random forests, bagging and boosting-have multiplied substantially in recent years. These algorithms obtain high accuracy by using the same classifier to perform many classifications of the same data or different classifiers to perform multiple classifications. The classification ensembles are combined using a weighting, error minimization, or a rule-based approach. By applying many classification iterations, these approaches generally perform better than other machine-learning algorithms, particularly in land cover classification [25].
The random forest (RF) algorithm has several advantages over other algorithms for mapping land cover change with remotely sensed data and generally performs best for this application [25]. Compared to most other algorithms, including advanced algorithms like support vector machines, RF is more computationally efficient, is less affected by outliers or nonparametric data, and has few parameters that need to be optimized to achieve high accuracy [25][26][27][28][29]. In addition, new advances in computing allow researchers to employ machine-learning algorithms with larger datasets at faster speeds [30]. For this study, therefore, we chose to use the RF algorithm implemented on the Google Earth Engine platform to map land-use and land-cover change in Orange Walk District with NASA's Landsat data archive [30][31][32]. Differences in spectral resolution make it difficult to apply an algorithm trained on one Landsat sensor to other Landsat sensors, so most studies use only a single sensor (e.g., Landsat 8 only), although some have made efforts to harmonize the data for longer study periods [31][32][33]. Here, we develop a single RF algorithm trained on Landsat 8 data that we also apply to Landsat 7, 5, and 4 data to evaluate how well a single algorithm performs through time and across the Landsat sensors.
In this study, we quantify land-use and land-cover change from 1984-2016 in Orange Walk, Belize, and assess how agricultural expansion and establishing protected areas have influenced changes in forest and wetland cover. First, we developed a single RF classifier to map land-use and land-cover in Orange Walk, Belize using Landsat 4, 5, 7, and 8 data. Second, we used the algorithm to quantify the change in land-use and land-cover from 1984-2016, with a focus on the change in the forest, wetland, and agriculture extent. We expected the results of this study to allow us to estimate the loss of tropical forests and wetlands over the past 30 years and how protected areas affected the distribution of landuse. Ultimately, this assessment should augment methods for LULC mapping in the tropics, as well as provide results that could be considered in conservation policymaking in Belize and beyond.

Environment
Northwest Belize lies primarily on Cretaceous through Tertiary limestone and is part of the Yucatan Platform. Trans-tensional deformation along the North American-Caribbean transform boundary has created southwest to northeast oriented normal faulting, leaving a horst and graben structured landscape [34]. The limestone bedrock has resulted in the formation of a complex fluviokarst system throughout this area. These tectonic, karst and fluvial processes have formed a series of escarpments and depressions that descend from the west, where the elevation ranges from 100 to 300 m above sea level, to the Belizean coastal plain in the east, which ranges from 0 to 10 m above sea level (masl) [35]. The western part of the Orange Walk District is elevated on the La Lucha Escarpment above 200 masl. To the east, the District descends down two more escarpments (the Rio Bravo and the Booths River Escarpments) to the coastal plain ( Figure 1). Agricultural deforestation and deforestation from runaway fires have spread to nearly all elevations, particularly with cattle ranching and scattered coconut plantations that have expanded across the landscape.
The region's climate is considered subtropical, with distinct wet and dry seasons driven by the annual migration of the intertropical convergence zone (ITCZ). The wet season runs from June to December and the dry season from January through May. The average annual precipitation in this region is 1500 mm [18]. Annual daytime temperatures vary between 26 • C and 32 • C, with the highest temperatures in April and May [18]. This moist environment in the landscape has produced many wetlands that occur in karst depressions, fault lines, and floodplains [36,37]. Some perennially inundated wetlands exist, such as around the New River Lagoon on the east side of Orange Walk, but many of the forested wetlands are seasonally inundated. Pine forest and savannas dominate edaphically dry sand plains across the coastal plain, while most of the upland areas are covered by tropical broad-leaf forest [37]. These forests have a high species diversity of both flora and fauna, which are now mostly restricted to the RBCMA and connected protected areas [36,38,39].

Land Cover Classes
To assess agricultural changes in this region, we chose 10 land-use and land-cover classes to map change over time across the study area (Table 1). These land cover classes were based on previous definitions of local ecosystems from Meerman and Sabido [37] and on expert knowledge of this area. After an initial assessment using 10 classes (see 4. Results), we combined two pairs of similar classes for a final set of 8 land-use and land-cover categories to achieve higher accuracy for assessing changes through time (Table 1). Since the input variables combine multi-season data from three years, a forest degradation class was included initially to capture areas that may have been deforested during the composite period. Here, we use the term "forest degradation" as a broad class that encompasses forested areas that transitioned into non-forest, or deforested area that has been fallow, during the time the period of the Landsat composite. For example, areas that were cleared for pasture between 2014 and 2016, and so their spectral signature during the composite period is mixed between forest and pasture, would be considered forest degradation. The agriculture class was broad and included pasture, sugar cane, corn, and other rain-fed crops. Rice in the study area is placed in its own class because it is irrigated from rivers during the dry season and therefore has a different timing in phenology from the other agricultural types (Figure 2). Agriculture in the region is dominated by cattle and sugarcane production. There are few commercial orchards (e.g., orange groves) [40], although there has been land-clearing and planting of coconut palms and short-lived papaya plantations in northwestern Belize. The agriculture and forest degradation classes were combined in the final 8 category scheme, as most of the forest degradation transitions to agriculture. The two broad-leaved forest types were also combined for the final 8 classes as they are both spectrally and ecologically similar. Lastly, the Urban class refers to the built environment of villages, houses, barns, and roads, as cities are absent in the region. While some roads in the district have been paved in recent years, historically, most roads were covered with crushed limestone, and many still are today.  The distinction between some of these classes was not clear because human-environment interactions in the study area may result in "agroecosystems." For example, ranchers graze their cattle on a "natural" savanna, making it both pastureland and savanna. In other areas, a farmer may clear some forest but then leave it fallow for a period of time. Coconut plantations can mimic forests spectrally. While industrial agriculture is widespread across Orange Walk, traditional agriculture, including milpa agriculture and horse-drawn plowing, still occurs today, but much less so than in the past. Using fire and letting fields lie fallow in the milpa technique maintains soil fertility through time [41]. In other areas, the land is cultivated for longer periods of time with horse-drawn plowing and organic or inorganic fertilizer inputs. Sometimes a field lies fallow due to crop disease and lack of markets. These practices create a mosaic of rapid (year-to-year) changes in agriculture versus forest succession. Much of the modern agriculture in Orange Walk, however, is mechanized and involves permanently clearing large tracts of forest. This industrialized agriculture relies on mechanized plowing, planting, and harvesting, large amounts of fertilizer inputs, and pesticide use.

Data
Previous studies mapping land cover in Belize have relied mainly on single-date mosaics of Landsat data (14,33). Cloud-free Landsat scenes are rare for this tropical region, which limits the ability to get full area coverage at multiple periods. Even with a cloud-free image, a single date during any season will not provide enough data to distinguish between different LULC types because they may appear similar at a certain time of year but transition through various phenological stages over the course of the year. Including multitemporal data increases classification accuracy by helping to characterize plant phenology [42][43][44]. For example, using a cloud-free mosaic from a few images in the dry season may produce results that incorrectly show seasonally inundated wetlands or certain crops as savanna. Including data from both the dry and wet seasons, however, clearly shows the differences in seasonal phenology and inundation. In this study, therefore, we composited multitemporal seasonal data to create cloud-free image mosaics that capture the phenological differences between the wet and dry seasons of the region. We mapped the land-use and land-cover change using composites for 3 periods: 1984-1987, 1999-2001, and 2014-2016. While consistent, high-quality data are available from Landsat 8 since 2013, data availability and quality have varied more in earlier Landsat missions. For the most consistent dataset through time, we used the Level 1 Tier 1 Surface Reflectance data for Landsat 4, Landsat 5, Landsat 7, and Landsat 8 [45]. Landsat 4 and 5 data in the earliest years (1984)(1985)(1986)(1987) are much more limited compared to later years of the same sensors ( Table 2). Because of data availability and persistent cloud coverage, the 1984-1987 period in this study required a longer time frame of data to create a cloud-free composite image. In addition, Landsat 7 data starting in 2003 have many no-data stripes due to a scan line correction error on the sensor. To overcome these data issues, we used three years of data to composite for each period in the study [32]. In addition, using data from multiple years can help smooth temporal variations in images (e.g., moving clouds and shadows and climatic and atmospheric differences) and improve the ability to apply a single algorithm trained on one period to a different period [46]. For the earliest period, only two tiles from the same day were available during 1984, but those images were nearly cloud-free. For the earliest composite, therefore, we included mostly data from 1985-1987, but also included the two tiles from 1984 to acquire nearly complete coverage of the area ( Table 2). The Landsat path 19 and row 48 covers the entire study area, but the surrounding path/row numbers 19/47, 20/47, and 20/48 also intersect with the district. The different Landsat sensors also have different spectral resolutions for each band, requiring a correction to make Landsat 8 comparable to Landsat 4, 5, and 7 data. Thus, we harmonized the Landsat 4, 5, and 7 data to the Landsat 8 data using the relationship and coefficients defined by Roy et al. [33]. The relationships were developed from correlations in the spectral signatures of corresponding bands between Landsat 8 OLI and Landsat 7 ETM+. Within each period, the data were also aggregated into the dry season and wet season composites. As mentioned, the wet season in Belize generally lasts June through mid-November, with a dry season from December through May. The changes in plant phenology, however, are slightly delayed from the start of the season and vary depending on precipitation events. Rather than focus on the rainfall trends, the wet and dry seasons were considered on the basis of the phenological response. For each period of the study, we calculated the median value composite for all data from August through January of the following year to represent the wet conditions, and data from February through July to represent dry season conditions ( Figure 2). For the 2014-2016 period, for example, the wet season composite was calculated by taking every Landsat scene acquired during the months of August through January during the years 2014, 2015, and 2016 and calculating the median value at each pixel for all spectral bands. The dry season composite is then the median value of each pixel calculated from all Landsat scenes acquired from February through July during the years 2014, 2015, and 2016. Taking the median value with the clouds removed from each image results in a cloud-free composite of values that should be representative of the season [46,47]. In the earlier two periods, however, there were some pixels that did not have a cloud-free observation, resulting in~0.57 km 2 in the 1999-2001 composite and~14.93 km 2 in the 1984-1987 composite with no data. All of the spectral composites were smoothed by applying a convolution filter with an octagonal kernel of two pixels radius. This filter helped reduce noise in the temporal composites to produce results with less speckling of individual pixels that can occur with pixel-based classifications.
For each season, we included the median values for each of the six spectral bands common across Landsat sensors as well as spectral indices and topographic variables commonly used in LULC classification (Table 3). For each seasonal composite, we calculated the median normalized difference vegetation index (NDVI; Equation (1)), which is a widely used proxy for plant "greenness" and is a useful representation of plant phenology [48]. We also included the normalized difference water index (NDWI; Equation (2)), which represents wetness, to reveal seasonal inundation in wetlands and which generally increases classification accuracy [49]. Vegetation and ecosystem types in this region are correlated with topography [36,38,50,51]. To account for the topographic correlation with vegetation, we included elevation from the global Advanced Land Observing Satellite (ALOS-2) digital surface model (DSM) with 30 m resolution [52], as well as slope and the topographic position index (TPI) [53] calculated from the DSM (Table 3).

Training and Validation Data
One of the goals of this study was to develop a single algorithm that can be applied across the Landsat record. Therefore, we trained the algorithm on Landsat 8 data for the most recent period (2014-2016) of this study and then applied the trained algorithm to the other Landsat sensors in the earlier periods. Separate validation datasets were used for each period to assess the transferability of the algorithm across sensors and time.
First, we created polygons to select training data for the 2014-2016 period based on GPS data collected on the ground and expert knowledge of the study area (the authors have over 60 years of combined experience in the study area beginning in 1992). The polygons were created in Google Earth Engine based on the ground information and visual interpretation of the Landsat seasonal composites for 2014-2016, individual Landsat images from each year of the composite period, and Google Earth satellite data. While the expert knowledge of the area allowed us to select training and validation data confidently for 1999-2001 and 2014-2016, the 1984-1987 polygons relied on visual interpretation of the Landsat data and historical context. The seasonal composites were particularly useful for determining wetlands, and the individual images from each year were used to find areas of deforestation that occurred during the composite period. We randomly sampled 450 pixels per class contained within the training polygons using the stratified random sampling method in Google Earth Engine.
Next, we repeated the process to create polygons to select validation pixels for each class that did not overlap with the training polygons. A randomly sampled set of 450 pixels per class were selected from these validation polygons to create an independent validation dataset. We use an independent set of an equal number of samples for training and validation data to fairly assess the performance of the algorithm across space as well as time. For the 1984-1987 and 1999-2001 periods, we repeated the process of creating polygons for each class and randomly sampled 450 pixels per class to use for validation.

Optimizing Random Forest
Random forest is considered to be a robust algorithm with few parameters to optimize, and the effects of parameters are insignificant at high levels of trees [25,27]. The main parameters that affect the algorithm's accuracy are the number of regression trees created (t) and the number of variables used at each node in the tree (m). To achieve the highest accuracy algorithm to apply across periods, we optimized t and m by testing a range of values for each variable and calculating the out-of-bag error (OOB), the training accuracy, and the overall validation accuracy for each combination of parameters for the training period of 2014-2016. The number of training and validation samples were held at 450 pixels per class for all algorithm combinations, and 30% of the training data in each run were withheld for calculation of the OOB error and training accuracy. Previous studies have shown that at high values of t, there is little to no increase in accuracy and that the highest accuracies are achieved with low values of m (Rodríguez-Galiano et al. 2012). Therefore, we tested every combination of t = 1, 25, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600 and m = 1, 2, 3, 4, 5, resulting in 70 total RF algorithms to compare.
The optimal parameter combination was determined using a sensitivity analysis based on the OOB error, training, and validation accuracies calculated from all combinations of t and m. The optimal algorithm was determined as the one that achieved the highest training and validation accuracies while maintaining the lowest OOB error. We applied the optimized algorithm to the earlier periods, assessing accuracy with the validation data sets for each period to evaluate how well the single algorithm performs through time. In addition to overall accuracy, confusion matrices revealed which classes were most misclassified by the algorithm. We then combined classes that were highly confused in the results and reassessed the accuracy, resulting in eight final LULC classes. We combined the lowland broad-leaved moist forest and lowland broad-leaved moist scrub forest classes, as they are similar forest types, and the species differences are not important for assessing forest loss in this study. Additionally, the agriculture and forest degradation classes were combined, as the deforestation in the study area is almost entirely for agriculture, and so can be considered agriculture in this study. These final eight classes were used to calculate the area and percent of each LULC in each time period and the percent of change between each of the three maps.

Algorithm Selection and Variable Importance
Increasing the number of trees in the RF increased both OOB and validation accuracies sharply up to 25 trees, while gains in accuracy diminished above 100 trees. At all values of m, both the validation accuracy and OOB accuracy reached a maximum at around 300 trees (Figure 3). At t = 25 and 50, the highest accuracy was achieved with m = 2, but at all values of t ≥ 100, the accuracy was highest at m = 3 ( Figure 3). Therefore, the final algorithm selected for land cover mapping had t = 300 and m = 3, with 450 pixels per class used for training with 10 land cover classes. The RF algorithm used the Gini index to assess the relative amount that each variable contributed to the difference between classes in the classification [54]. Higher values of the Gini index represent higher importance in the final classification accuracy and a larger contribution to maximizing the difference between classes. Since ecosystems are strongly correlated with topography in this area, it is not surprising that the elevation from the ALOS-2 DSM was the most important variable in the classification (Figure 4). In addition, the majority of the agriculture occurs in the flat areas of the coastal plain. The next most important variables were the wet season NIR and the dry season NIR composites. The NIR wavelength is highly sensitive to water content as well as chlorophyll and is used in the calculation of both NDVI and NDWI, and the seasonal differences reflected plant phenology and seasonal inundation. The least important variable was slope gradient, calculated from the ALOS-2 DSM. While slope likely does influence LULC distributions, most of the study area occupies the relatively flat coastal plain, so there was little slope variation in the analysis. Moreover, where there is a high gradient in steeply faulted and karst collapse escarpments, and Landsat and ALOS-2 resolutions may have been too low to capture some of these changes.  Table 3.

2014-2016 Land-Use Land Cover Classification
The final model with t = 300 and m = 3 achieved an overall accuracy of 90.2% across the 10 LULC classes for the 2014-2016 period ( Table 4). The classes with the lowest accuracies were forest degradation and agriculture, which were most confused with each other. The confusion of these two classes makes sense because the deforestation in this area is almost entirely driven by different types of agriculture in this period, and so the forest degradation ultimately becomes agriculture. The algorithm mapped rice well (producer accuracy = 84.67%, user accuracy = 98.96%), but understandably sometimes confused rice for wetland or vice versa. The three key distinctions between the rice and wetland classes were that the rice fields are flooded during the dry season with irrigation, harvested and drained, and often burned before the wet season. Alternatively, many wetlands flood with the wet season (though some are perennial) and dry with the dry season. Using the seasonal data allows the algorithm to not only identify seasonal wetlands but distinguish seasonal wetlands from irrigated rice agriculture. The urban class was also not always distinguishable from the savanna and agriculture classes, likely because there is little area that is actually densely built environment, and buildings are often interspersed between large yards, fields, and gardens. These low-density towns and narrow limestone roads were difficult to identify in the 30 m spatial resolution of Landsat pixels.  Algorithm performance decreased when applied to the earlier years with previous Landsat sensors (Table 4). Even after normalizing the sensor data to Landsat 8 (31), there likely remained differences in the spectral data from each sensor. It is important to note that the previous periods have far fewer images available (Table 1), which affected the median composite values and left some pixels without any available data. With Landsat 8 data from 2014-2016, the RF was able to distinguish between the two different types of broad-leaved forests in the study area, but when applied to past sensors, these forest types were indistinguishable. While species compositions may differ, it was appropriate to combine these two forest types into a single tropical forest class to meet the goal of quantifying forest loss. Most of the forest degradation class that could be mapped reliably with the recent Landsat 8 data (PA = 76%; UA = 90.48%) was fairly indistinguishable from agriculture in the earlier periods (PA = 24.22%, UA = 73.15% for 1999-2001; PA = 37.11%, UA = 67.61% for 1984-1987). In Orange Walk, most of the deforestation is for conversion to agriculture, usually where agriculture meets the forest edge (Figure 7). Hence, combining these two classes was again appropriate for this study. In the earliest period, 1984-1987, some of the agriculture (some originally classified as forest degradation) was likely other types of rotational agriculture (milpa), illicit growing plots, and selective logging, none of which are nearly as prevalent in the two more recent periods. Therefore, for assessing land-use and land-cover change across time, we combined these two pairs of classes to create the eight final classes (Table 1).
With the classification results combined into eight classes, the algorithm performed well in the earlier periods, and the overall accuracy increased for every period (Table 4). When applied to the 1999-2001 data, the overall accuracy increased by 1%, to 93.6%, and when applied to 1984-1987 data, the overall accuracy was lower, but still useful at 85.7%. The earliest period had less than 33% of the number of Landsat images available as the later periods did. Thus, it was likely that the lower frequency of observations in the earliest classification affected its composite values and contributed to its higher error rate.
The land-cover and land-use maps revealed changes in land-use patterns and the losses of tropical forest and wetlands over recent decades ( Figure 5). The area classified as agriculture decreased from 1984-1987 to 1999-2001 by nearly 100 km 2 , but wetlands (including wetland and swamp forest) decreased by 115.2 km 2 ( Table 7). From 1999-2001 to 2014-2016, the agriculture class increased by nearly 350 km 2 . The early decrease in agriculture can be attributed to multiple factors. First, the 1984-1987 period was before the establishment of the protected areas that now occupy much of the district and when there were active logging and milpa agriculture in these now protected forests [55]. After the establishment of public and private protected areas during the late 1980s and 1990s, these areas showed reforestation, but zones outside of the protected areas experienced a drastic increase in intensive agriculture ( Figure 5). The largest increase in the area of any class between the 1999-2001 and 2014-2016 periods was agriculture, which gained 347.68 km 2 , representing an increase of 43.6% (Tables 5-7). The classes that lost the most area were lowland broad-leaved forest and wetlands, losing 210.47 and 145.6 km 2 , respectively. The urban/road class only increased by 15.22 km 2 , but given the low amount of that class in the study area, this increase represents 57.3% growth for that class during this time period. We also know from yearly field research in this area that road paving and accessibility has increased significantly over this time. Another factor was the doubling of the population in Orange Walk District. We used a simple mean between 1980 and 1990 censuses for the 1985 estimate, the census figure for 2000, and the census estimate for 2016 for these baseline population growth numbers [56]. We lacked rural numbers where land-use changes were taking place, but the District population provided a general trend, and it grew by 33.6% from 1985 to 2000 and by 19.7% from 2001 to 2016. Population growth was greatest in the first period from 1985-2000, but agriculture increased by a much higher percent from 2000 to 2015 (Table 8).

Discussion
By using data harmonization, multitemporal data, and machine learning, we were able to achieve our goals of developing a time-generalized land-cover classifier and quantifying land-use and land-cover change in Orange Walk, Belize with the Landsat archive. The approach used multi-season data to distinguish land-use and land-cover types and multiyear data to acquire nearly complete coverage and derive median values that were more comparable through time after applying sensor harmonization coefficients [33]. The results of the classifications revealed a rapid loss of tropical forest and two types of wetlands due to agricultural expansion since 2000. These losses occurred during a time of population growth in Orange Walk, Belize, but forest and wetland loss was far greater than the increase in population since 2000 relative to the higher population growth from 1984-2000. Moreover, the role of conservation in landscape patterns and the protection of natural and cultural resources in Orange Walk was reflected in the distributions of land-use land-cover through time. In 1984-1987, before the establishment of protected areas, small clearings were prolific throughout the forests. By 1999-2001, the now protected forests grew back, but continuous clear-cutting for mechanized agriculture had begun to increase outside of the protected areas, continuing to expand rapidly through 2016.
We achieved our first goal by optimizing an RF algorithm trained with Landsat 8 data and successfully classifying LULC with harmonized Landsat 4, 5, and 7 data. After initially running a sensitivity analysis of the RF parameters to achieve high accuracy (>90%) with 10 classes in the Landsat 8 data, we found that combining classes to 8 more general classes resulted in much higher accuracy when applying the algorithm to the earlier data. The combination of the two pairs of classes, however, did not affect the interpretation of our results because the distinction between these classes was subtle and not important for the application of this study. Multiple factors likely contributed to the decrease in accuracy of the algorithm applied to the past time periods. First, even after harmonizing the Landsat 4, 5, and 7 data, the spectral resolutions and consistency across sensors did not perfectly match [33]. In the experiments training with and classifying the Landsat 8 data, the algorithm was able to distinguish between the two forest types in the original 10-category classification scheme. The algorithm could not distinguish these forest types from each other when applied to the earlier datasets, however, likely because the NDVI phenological signatures of the two forest types were similar ( Figure 2). Additionally, fewer Landsat scenes were available for 1984-1987 and 1999-2001 than for 2014-2016 ( Table 2). As a result, the seasonal median composite values in the earliest data were comprised of far fewer samples and will likely produce median values that differ from the Landsat 8 training samples due to sample size. Previous studies demonstrated that using monthly median composites across multiple years improved the application of an RF algorithm across time [46]. Here, we demonstrated that even just two seasonal median composites created from multiple years of data could produce a time-generalized classifier for general LULC categories.
After using the optimized algorithm to map LULC during three time periods, we were able to quantify these changes in our study area. While Orange Walk had forest regrowth between 1984 and 1987 and 1999-2001 (~15 years) due to the establishment of protected areas, there was a significant loss of forest and wetlands, 210.47 km 2 and 145.6 km 2 , respectively, from 1999-2001 to 2014-2016 (~15 years). During this more recent period, the agricultural area increased by 347.68 km 2 . The total loss of forest and wetland classes closely matches the gain in the agriculture class during this most recent period. Interestingly, the population grew much more in Orange Walk during the 15 years between the 1984-1987 and 1999-2001 periods (13,525 people) than it did between the 1999-2001 and 2014-2016 periods (9,907 people), while the deforestation was much greater during this later period. We do not have enough data to directly infer a relationship between population growth and deforestation, but studies in the global tropics have observed that shifts to large commercial agriculture from small landowners change the relationship between population and forest cover [5,57].
The deforestation that occurred in 1984-1987 in the RBCMA was classified as agriculture in this study, but there were other contributions to the observed deforestation that differed from the industrial agricultural, driving most of the long-term deforestation. In the earliest period, the agriculture area within what is now the RBCMA also consisted of milpa agriculture and possibly illicit agriculture and logging. In the Toledo district in southern Belize, Emch et al. [58] estimated that~360 km 2 of forest were lost between 1975 and 1999 and that the primary driver was increased milpa agriculture due to a 259% increase in the population [58]. The rapid expansion of industrial agriculture in recent decades has occurred in the Mennonite communities of Northwest Belize. In Campeche, Mexico, Mennonite industrial farming lands had~4 times higher rate of deforestation than on other private property [59]. Land tenure and expansion of the agro-industrial production model were considered to be the major drivers of forest loss in the central Yucatan; communal property experienced significantly less forest loss. In other parts of Central America, including in similar forests in adjacent Guatemala, narcotics production and trafficking can drive deforestation as well [60].
There are no current studies that estimate LULC change for the entire Landsat record for Orange Walk, but other studies have estimated deforestation rates for other parts of Belize or Belize as a whole. A study of tropical forest loss using Landsat data estimated that all of Belize lost 80 km 2 per year of the forest, defined as >30% tree cover, between 1990 and 2000 and 70 km 2 per year from 2000-2010 [1]. Voight et al. [24] estimated that in the Maya Golden Landscape in southern Belize, 20.9 km 2 of mature forest were cleared from 2014-2016.
Another key finding is the role of conservation areas in reforestation and forest maintenance. The Rio Bravo Conservation and Management Area (RBCMA) (Figure 1) showed both reforestation from its founding in 1988 as well as forest maintenance in this area. Figure 5 shows that areas that were to become the RBCMA had~56 km 2 of deforestation before its establishment (Figure 5a), but~42 km 2 were reforested by~2000 ( Figure 5b) and another~20 km 2 increase in the forest through~2015 (Figure 5c). For example, the area of the RBCMA around the ancient Maya site of Gran Cacao, even though it is difficult to access today, had a few square kilometers of deforested area, but these became reforested by 2000 and are currently closed-canopy forest ( Figure 5). The southern part of Orange Walk in what is now the privately managed Yalbac Ranch experienced similar deforestation in the earliest years with forest regrowth and maintenance since ( Figure 5).
The protected Archaeological Preserve of the important ancient Maya city of Lamanai remained forested, but deforestation on its west side has occurred up the small park's boundary. Additionally, the wetlands to the west have decreased in an area with drainage and conversion to agriculture. Deforestation there has also taken place over numerous ancient Maya mounds on private property in well-drained uplands, but not so in wetlands and lower-lying savannas to Lamanai's east. Once surrounded by tropical forest, the Lamanai Archaeological Reserve now stands as an island of forest in an expanse of agricultural fields ( Figure 6). The changes in land-use and land-cover patterns revealed by these maps demonstrate the importance of protected land for the conservation of both natural and cultural resources. In the most recent map, most of the forest that is not in protected areas has already been deforested (Figure 7). Our results estimate that about 627 km 2 of forest remain unprotected while about 1978 km 2 (76%) of forest in Orange Walk is in protected areas today. With little area left for agriculture to expand into, the protected areas remain essential for conserving the biodiversity and habitat of these forests. The protected area of the RBCMA is particularly important for jaguar and peccary conservation, and efforts should aim to increase the habitat connectivity and corridors between the protected areas of Belize [11]. While previous studies in the region and international conservation efforts mainly focus on forest ecosystems, this study suggests that wetland ecosystems should be equally considered in conservation efforts. While the establishment of protected areas helped forest regrowth between 1984 and 1987 and 1999-2001, wetlands and swamp forests still declined by 115 km 2 . While both forests and wetlands continue to be rapidly lost, conservation efforts should include wetlands, and not just forests, in their efforts to preserve natural resources. It is important to note, however, that the increase in agricultural production is important for developing food security and economic commodities, as well as local economies. Further, land-use and conservation planning should aim to balance the urgency of protecting cultural and natural resources with the economic and social needs of the population.

Conclusions
Land-use land-cover change in the tropics can be accurately mapped across the Landsat archive with a random forest classifier by including multi-season and multiyear composites. While many studies rely on a single sensor or single date imagery, the multitemporal machine learning approach in this study allowed for the distinction of more land-use and land-cover types and the ability to use the entire Landsat archive. This novel approach revealed that tropical forests and wetlands are under threat from rapidly expanding industrial agriculture in Northwest Belize and that the remaining tracts of these valuable ecosystems are largely within protected areas. Early deforestation (1984)(1985)(1986)(1987) was likely driven by a combination of milpa agriculture, illicit agriculture, and selective logging, as well as some industrial agriculture. The establishment of the RBCMA in 1988 and the subsequent protected areas allowed for forest regrowth in the 1990s, though wetland losses were high at 15%. Between 1999 and 2001 and 2014-2016, however, industrial agriculture rapidly spread across all areas that were not protected with heavy losses of tropical forest (7.5%) and the continued loss of wetlands (28.2%). As a result, the protected areas in Orange Walk are critical to preserving the remaining forests and wetlands in the district. These threatened ecosystems provide essential ecosystem services and habitat for maintaining biodiversity, connectivity of wildlife corridors, and threatened species. The protected areas also protect the cultural heritage that is threatened with land-use and land-cover change, such as in our case study of the ancient Maya site of Lamanai, which is now an island of preserved forest. Agriculture provides an important means of food security and economic development for Belize, but its rapid increase has occurred across biodiverse ecosystems and millennia of ancient Maya structures. Combining new advances in computing and machine learning with the growing length of the satellite record allows us to provide robust estimates of these threatened natural and cultural resources and inform conservation and land management policies and decision-making.
Author Contributions: C.D. conceptualized the study, designed the methods, implemented the analyses, and wrote the main text initial draft. T.B. and S.L.-B. contributed to and edited the main text, provided additional expertise of the study area for selecting training and validation sites, and interpreted the results. All authors have read and agreed to the published version of the manuscript. Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: The code and data used to produce the results of this study are publicly available at https://github.com/astro-hippie/owbz_lulcc.