1. Introduction
The dramatic decline in biodiversity has become so great that it is considered an important facet of global change in its own right [
1]. Biodiversity loss has multiple causes, with habitat destruction via land-cover and land-use change as the predominant driver [
2,
3,
4]. The majority of future land-use change will likely occur in tropical regions [
5], which are home to almost half of all described species [
6], including humans’ closest living relatives, the chimpanzee (
Pan troglodytes). It has been estimated that more than 70% of chimpanzee tropical forest habitats are now threatened by infrastructure development and land-use change [
7]. All four chimpanzee sub-species (Western (
P.t. verus), Nigeria-Cameroon (
P.t. ellioti), Central (
P.t. troglodytes), and Eastern (
P.t. schweinfurthii)) have been classified as endangered on the International Union for Conservation of Nature (IUCN) Red List. Their cumulative population is estimated to have declined by more than 66% over the past 40 years [
8] and there is a decreasing trend in population for all four sub-species [
9,
10,
11,
12]. Major threats to current populations include habitat loss from resource extraction activities and land conversion, hunting, disease and the illegal pet trade. Chimpanzees occupy a wide variety of forest and woodland habitats and serve as an umbrella species [
13]; thus, protecting chimpanzee habitat could help conserve habitat for many other species.
The Open Standards for the Practice of Conservation (OSPC) provides a framework to guide the development and implementation of conservation projects [
14]. OSPC is a collaborative and adaptive management process that strategically focuses conservation decisions on clearly defined objectives and prioritized threats and measures success in a manner that enables adaptation and learning over time. A key component outlined in the OSPC is an overall assessment of the viability or health of the conservation target and the establishment of a monitoring plan that requires indicators that are measurable, precise, consistent and sensitive [
14]. Habitat suitability models (HSMs), also referred to as species distribution or ecological niche models, are useful tools that can fit easily within this framework. HSMs use statistical and machine learning techniques to find empirical relationships between observed species' occurrences and environmental descriptors, such as resource, biotic and climatic factors, to estimate conditions that are suitable for population viability [
15], or to predict the likelihood of occurrence [
16,
17] in geographic space. Results from HSMs are most convincing when they are fit with both presence and absence data. However, due to the difficulty of acquiring ecologically meaningful absence data, HSMs are often constructed with so-called “presence-only” data [
18]. When models are fit with presence-only data, the output can be interpreted as a relative measure of environmental suitability rather than the probability of presence [
19,
20], typically ranging between 0 (unsuitable) and 1 (highly suitable).
HSMs have only recently been used to model and map chimpanzee habitat suitability, where they have addressed questions relevant to chimpanzee conservation and ecology. Vegetation layers derived from Landsat ETM+ data were used in [
21] to model potential habitat suitability for chimpanzees in Western Tanzania at 90 m resolution. Landsat TM derived variables were used to generate habitat distribution maps for three time periods between 1986 and 2003 for a small area in Guinea-Bissau, within the range of
P.t. verus, where a marked decrease in habitat area and connectivity was found [
22]. Habitat suitability was mapped for the
P.t. schweinfurthii range at 10 km resolution using climate, human impact and vegetation variables to aid in the delineation of areas for new population survey efforts [
23]. HSMs were used for
P.t. ellioti in Nigeria and Cameroon and
P.t. troglodytes in Cameroon using climate, topographic, human population density and canopy cover variables at 1-km resolution to investigate the role of environmental variation in the genetic dissimilarity between the two sub-species and to assess potential future impacts of climate change on their habitat suitability [
24]. The first range-wide study was performed by [
25], who used climate, human impact and vegetation variables as model inputs to map habitat suitability for all four chimpanzee sub-species, and all African Great Apes, at 5-km resolution for both the 1990s and 2000s, and found a decrease in suitable habitat over time. Field researchers from several conservation organizations contributed data and were involved in the publication; thus, their results represent the most current accepted and published state of knowledge on chimpanzee habitat suitability. We therefore rely on the results from Reference [
25] for this study.
The objectives of this study are to map chimpanzee habitat suitability at 30-m resolution and to provide information on changes in suitability through time, thereby demonstrating that remotely sensed datasets that are updated on a continual basis can be used in near real-time conservation monitoring plans. Previous studies have attempted to determine the environmental variables that govern the observed distribution of chimpanzees at local, regional and range-wide scales. In this study, we determine which regularly updated remotely sensed variables are the most useful indicators for effective and continuous monitoring of chimpanzee habitats. We used a modified version of habitat suitability modeling in which, instead of using presence data to calibrate our model, we rescaled the coarse resolution habitat suitability map cited in Reference [
25] to 30-m resolution using variables derived from remotely sensed datasets as input into a Random Forests regression model. Our results inform field monitoring efforts that support conservation planning and measure success as they portray finer scale habitat variation over several time periods.
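The rescaling idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this study: the predictor values are synthetic, the variable names and hyperparameters are hypothetical, and scikit-learn's `RandomForestRegressor` stands in for whichever Random Forests implementation was actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical calibration sample: one row per sampled location.
# Columns stand in for remotely sensed predictors (assumed for this sketch):
# elevation (m), shortwave infrared reflectance, canopy cover (%).
n_samples = 500
elevation = rng.uniform(0, 2000, n_samples)
swir = rng.uniform(0.0, 0.7, n_samples)
canopy_cover = rng.integers(0, 101, n_samples).astype(float)
X_train = np.column_stack([elevation, swir, canopy_cover])

# Hypothetical target: suitability (0 to 1) read from the coarse map at each
# sampled location. The functional form below is invented for the example.
y_train = np.clip(
    0.6 * canopy_cover / 100 + 0.4 * (1 - swir / 0.7)
    + rng.normal(0, 0.05, n_samples),
    0, 1,
)

# Fit a regression forest that maps predictors to coarse-map suitability.
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
model.fit(X_train, y_train)

# Predict suitability for every fine-resolution pixel (here a 50 x 50 tile
# of synthetic predictor values).
tile_shape = (50, 50)
n_pixels = tile_shape[0] * tile_shape[1]
X_fine = np.column_stack([
    rng.uniform(0, 2000, n_pixels),
    rng.uniform(0.0, 0.7, n_pixels),
    rng.integers(0, 101, n_pixels).astype(float),
])
suitability_fine = model.predict(X_fine).reshape(tile_shape)

print("out-of-bag R^2:", round(model.oob_score_, 3))
print("variable importances:", model.feature_importances_)
```

Because a regression forest averages training targets within its leaves, the fine-resolution predictions stay within the 0 to 1 range of the calibration map.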
3. Results
We found similar spatial patterns of habitat suitability between the calibration data (
Figure 4A) and the predictions from our model (
Figure 4B). Moreover, we were able to model chimpanzee habitat suitability reasonably well compared to the calibration data, explaining 82% (±0.2%) of the variance for the entire chimpanzee range (
Table 2). However, there was considerable variation in the model’s predictive capability among the four sub-species. Our model was able to explain 35% (±1.95%), 89% (±0.18%), 66% (±0.47%), and 73% (±0.44%) of the variance for
P.t. ellioti,
P.t. schweinfurthii,
P.t. troglodytes, and
P.t. verus, respectively (
Table 2). Compared to the calibration data, our model tended to underestimate habitat suitability (
Figure 5A,B). In general, areas where suitability was underestimated occurred at low elevations with sparse canopy cover, while the opposite was true for overestimated areas. Linear error patterns could be seen within the range of
P.t. troglodytes (
Figure 5A), which resulted from a spatial mismatch of rivers between our study and Reference [
25] due to differing river network datasets.
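For readers less familiar with the "variance explained" statistic reported above, the sketch below shows one common definition, one minus the ratio of residual to total sum of squares. The exact estimator used for Table 2 is not stated here, so this formula and the toy values are assumptions for illustration only.

```python
def variance_explained(observed, predicted):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical suitability values from a calibration map and a model.
observed = [0.1, 0.4, 0.6, 0.9]
predicted = [0.2, 0.4, 0.5, 0.8]
print(round(variance_explained(observed, predicted), 3))
```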
Elevation, Landsat ETM+ band 5 and Landsat-derived canopy cover were the strongest predictor variables in our model (
Figure 6). A significant association between habitat suitability and bioclimatic variables was found in the suitability models for all four sub-species in Reference [
25]. The bioclimatic variables used in Reference [
25] were taken from the WorldClim set of global climate layers, which were created by interpolating weather station data using latitude, longitude and elevation as independent variables [
50]. It is therefore probable that elevation acted as a proxy for suitable climate conditions in our model. Shortwave infrared reflectance (Landsat ETM+ band 5) is strongly associated with forest/non-forest cover: forests have lower reflectance due to the absorption properties of live vegetation and the light-extinguishing effects of tall forest canopies. The value of band 5 in discriminating Congo Basin forest from non-forest was illustrated in Reference [
51]. Trees in the CART family have been shown to exhibit a bias toward variables that have many cut-points (
i.e., variables with many unique values) [
48]. Band 5 is tightly correlated with canopy cover (
r² = 0.83) and canopy height (
r² = 0.82) in the calibration data. Canopy cover values were all integers between 0 and 100, while canopy height values were all integers between 0 and 23. On the other hand, band 5 values were continuous on the interval 0 to 0.7, which provided the algorithm with many more possible cut-points relative to canopy cover and canopy height. We therefore believe that band 5 is likely driving the model as a surrogate for forest structure.
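The difference in available cut-points can be made concrete with a small count. The sketch below assumes, for illustration only, that a tree may place one candidate split between each pair of adjacent unique values; the simulated values are hypothetical and only mimic the ranges described above.

```python
import random


def candidate_cut_points(values):
    """Count the thresholds that separate adjacent unique values."""
    return max(len(set(values)) - 1, 0)


random.seed(0)
n = 10_000
canopy_cover = [random.randint(0, 100) for _ in range(n)]   # integers 0 to 100
canopy_height = [random.randint(0, 23) for _ in range(n)]   # integers 0 to 23
band5 = [random.uniform(0.0, 0.7) for _ in range(n)]        # continuous 0 to 0.7

for name, values in [("canopy cover", canopy_cover),
                     ("canopy height", canopy_height),
                     ("band 5", band5)]:
    print(name, candidate_cut_points(values))
```

However large the sample, the integer variables are capped at 100 and 23 cut-points, whereas the continuous variable gains new candidate thresholds with nearly every additional observation.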
4. Discussion
This study represents the first attempt to map temporal dynamics of range-wide chimpanzee habitat suitability at a fine (30-m) spatial scale. Our model demonstrated sensitivity to environmental change, which is a key feature for monitoring applications. The area depicted in
Figure 7 is situated in the Republic of Congo, which historically has not experienced any substantial human disturbance but recently has seen the most rapid increase in the establishment of logging roads among all Central African nations [
51]. Our model was able to detect the fine-scale changes in habitat suitability due to human activities in the area (
Figure 7B–E), whereas the coarse scale model was unable to resolve any fine-scale variation (
Figure 7F). These results highlight the value of integrating continuously updated variables derived from satellite remote sensing into habitat suitability models for near real-time monitoring and decision support. The Landsat suite of satellites has provided the longest continuing record of standardized data of Earth’s land cover at 30-m resolution; metrics derived from these data will be crucial for the development of standardized habitat suitability models that can be repeated and integrated into long-term monitoring plans.
Previous studies have reported successful efforts to model chimpanzee habitat suitability, including the two sources from which our model was calibrated. Direct comparison between studies is difficult because, so far, there are only a few published studies, and they generally either do not have overlapping study areas or represent habitat at substantially different spatial scales. All studies except for Reference [
30] utilized the MaxEnt software to generate their models, which reports model accuracy based on the Area Under the Curve of the Receiver Operating Characteristic (AUC). AUC characterizes the performance of a model at all possible thresholds by a single number that represents the probability that a known presence location will be ranked higher than a random background location [
29]. AUC values typically range from 0.5 to 1.0, where a value of 1.0 represents a perfect fit and 0.5 indicates that the model is no better than random expectation [
29]. Reference [
22], the first study to use Landsat data to model chimpanzee habitat suitability, reported an average AUC value greater than 0.9, indicating a good model fit. The authors were also able to detect changes in habitat suitability between two time periods. While their study area, a 2700 km² region in Guinea-Bissau, was much smaller than ours, their work lends support to the approach of using Landsat data to detect changes in chimpanzee habitat suitability. Similar AUC values, ranging from 0.86 to 0.93, were reported in Reference [
25], the primary calibration data source for our model. The authors of [
25] were able to compare their habitat suitability model for
P.t. schweinfurthii to the results from Reference [
23] and found large differences in the spatial distribution of habitat suitability for this sub-species. However, the maximum AUC reported in Reference [
23] was 0.67, which suggests the results from Reference [
25] are a better representation of habitat for this sub-species. Due to a lack of calibration data, important miombo woodland habitat in Western Tanzania was predicted as unsuitable in Reference [
25], which is the reason we included model output from [
30] in our calibration data. While the authors of [
30] did not provide AUC values for their model, they did assess model predictions against two datasets: a withheld 25% of the nest locations that was not used to create the Mahalanobis distance model, and an independent dataset from historical surveys. They found that 92% of the withheld nests and 84% of nests from historical surveys fell into the moderate to high suitability range. This indicates that the model was able to predict suitable nesting sites, but it provides no information about the model’s commission error (suitable habitat predicted where chimpanzees are absent).
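The rank-based interpretation of AUC given in this paragraph can be computed directly by comparing every presence score with every background score. The sketch below is a minimal illustration with made-up suitability scores, not the MaxEnt implementation; counting ties as half a win is a common convention that is assumed here.

```python
def auc_presence_vs_background(presence_scores, background_scores):
    """Probability that a presence score outranks a background score."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # assumed convention: a tie counts as half
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical model scores at known presence and random background locations.
presence = [0.9, 0.8, 0.6, 0.4]
background = [0.7, 0.5, 0.3, 0.2, 0.1]
print(auc_presence_vs_background(presence, background))
```

A model that scores every location identically yields 0.5 under this definition, which matches the "no better than random" reading of that value.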
While our model performed well across the chimpanzee range, it did show variation in its predictive capability. Compared to the calibration data, our model underestimated regions of high suitability in the
P.t. ellioti and
P.t. troglodytes ranges, which is likely due to missing predictor variables. We found dense canopy cover to be an important predictor of habitat suitability in our model; however, the authors of [
25] did not find canopy cover to be a significant predictor variable in their models for these two sub-species. Both of these sub-species are found in central Africa, which is thought to be the last bastion of contiguous high quality chimpanzee habitat. While these large tracts of undisturbed forest are likely suitable places for chimpanzees to live, recent research has found a catastrophic decline in populations due to intense bush-meat hunting and disease [
52]. High value tree species are selectively logged in the region, which leaves forests largely intact; however, the creation of logging roads opens up previously isolated forests, increasing the frequency of human contact and allowing easier access for hunters [
52,
53]. An absence of chimpanzees in these intact forests is likely the reason why habitat suitability is low in these areas, and incorporating information on the proximity to logging roads and camps would likely improve model accuracy.
It is important to note that inaccuracies in our model could also arise from errors in its input datasets as well as in the datasets used for calibration. Tree canopy cover was an important predictor variable in our model and was derived from Landsat ETM+ spectral data [
28]. Canopy cover estimates from satellite data can be influenced by the composition of understory vegetation, satellite signal saturation and phenological noise [
54,
55]. Moreover, tree canopy cover from Reference [
28] does not discriminate between forest types, which could lead to predicting suitable habitat in intensely managed forests. Habitat suitability models in Reference [
25] were generated using bioclimatic variables from the WorldClim dataset, and considerable uncertainty surrounds the estimated climate conditions in Africa due to its extremely sparse network of weather stations [
53]. Since elevation likely served as a proxy for climate in our model, errors in Reference [
25] associated with poorly estimated climate conditions will be present in our model.
An additional source of uncertainty in our model could come from the assumption that habitat suitability is insensitive to changes in spatial resolution. However, our model used only continuous variables, and the scaling properties of continuous data have been known for a long time [
56]. Reflectance and canopy cover, which figure prominently in our estimates of habitat suitability, have been demonstrated to be insensitive to changes in resolution [
54,
57]. In addition, the authors of [
58] showed that coarsening the grain size of predictive variables in a species distribution model did not significantly impact model performance: “Change in grain size does not have a substantial effect on species distribution models. The trend is overall weakly significant towards degradation of model performance, but improvement can also be observed for some species”. We further evaluated model sensitivity to changes in spatial resolution by averaging all pixels of the 30-m suitability model output within the extent of the 5-km grid-cells from the map used to calibrate the model.
Figure 5A portrays the difference between averaged predictions from our model at 5-km resolution and the map used for calibration (observed), where areas of underestimation are seen in blue while overestimates are seen in red.
Figure 5B depicts the relationship of suitability values from the calibration data (x-axis) and average predicted suitability from the Random Forests model in 5-km grid-cells for each sub-species. The correlation coefficients between the estimated habitat suitability and suitability from the calibration map were nearly identical to the relationships found from the cross-validation error assessment (
Table 1), which suggests that error scales with spatial resolution.
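The aggregation check described above (averaging fine pixels within each coarse grid-cell, then differencing against the calibration map) can be sketched as follows. This is a toy illustration with hypothetical values and an aggregation factor of 2. Since 5 km is not an integer multiple of 30 m, a real workflow would need area-weighted or resampled aggregation, which is not shown here.

```python
def block_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            block = [grid[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


# Hypothetical fine-resolution predictions (4 x 4 pixels).
fine = [
    [0.2, 0.4, 0.9, 0.7],
    [0.6, 0.8, 0.5, 0.3],
    [0.1, 0.1, 1.0, 1.0],
    [0.1, 0.1, 0.0, 0.0],
]
# Hypothetical coarse calibration map (2 x 2 cells) covering the same extent.
observed = [[0.6, 0.5], [0.2, 0.4]]

predicted = block_mean(fine, 2)
# Negative values mark underestimation, positive values overestimation.
difference = [[p - o for p, o in zip(p_row, o_row)]
              for p_row, o_row in zip(predicted, observed)]
print(predicted)
print(difference)
```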