1. Introduction
Tropical ecosystem services are severely impacted by deforestation and forest degradation [1,2,3]. Not only does tropical forest Land-Use and Land-Cover Change (LULCC) constitute 10% to 15% of the total global carbon emissions [4], it also changes forest fragmentation and influences forest structure and function [5,6,7]. Strong fragmentation effects decrease the number of large trees along forest edges [8,9], while species composition and biodiversity are equally negatively affected [10,11,12]. Estimates show that 31% of carbon emissions are caused by edge effects alone [6].
Accurate estimates of LULCC and forest canopy structure are therefore imperative to estimate carbon emissions and other ecosystem services [1,2]. Remote sensing products have been key inputs in LULCC assessments as they provide accurate spatial information to help estimate carbon emissions [1,13]. Moreover, high-resolution aerial images provide scientists with tools to monitor forest extent, structure, and carbon emissions as canopy texture is linked to aboveground biomass [14,15,16]. However, most of these estimates are limited in time to recent decades [1,2,17,18].
Historical estimates of Land-Use and Land-Cover (LULC) in the pre-satellite era (<1972) exist but generally rely on non-spatially explicit data (i.e., socioeconomic data) [2,17,19,20]. Efforts have been made to use other geospatial data sources such as historical maps [21], declassified CORONA satellite surveillance data across the US and central Brazil [22], as well as aerial surveys in post-World War II Germany [23]. Survey data across the African continent is less common, inaccessible, or both. Some studies do exist: Buitenwerf et al. [24] and Hudak and Wessman [25] used aerial survey images to map vegetation changes in South African savannas, whilst Frankl et al. [26] and Nyssen et al. [27] mapped the Ethiopian highlands of the 1930s.
Across the central Congo Basin, most of these historical images were collected within the context of national cartographic efforts by the “Institut Géographique du Congo Belge” in Kinshasa (then Léopoldville), DR Congo. Despite the existence of large archives of aerial survey imagery of African rainforest (Figure 1, Appendix A Figure A3), as of yet, no studies have valorized these data. The lack of a consistent valorization effort is unfortunate as the African rainforest is the second largest on Earth and covers ~630 million ha, representing up to 66 Pg of carbon storage [28], and currently loses forest at an increasing pace [29]. Given the impact of LULCC on the structure and functioning of central African tropical forests and their influence on both carbon dynamics [30] and biodiversity [12], accurate long-term reporting of historical forest cover warrants more attention [21].
Here, we use a combination of historical aerial photography (1958) and contemporary remote sensing data (2000–2018) to map long-term changes in the extent and structure of the tropical forest surrounding Yangambi (DR Congo) in the central Congo Basin, effectively linking the start of the Anthropocene [31] with current assessments. Yangambi was, and remains, a focal center of forest and agricultural research and development in the central Congo Basin. Past research in the region allows for thorough assessment of LULCC from a multidisciplinary point of view, confronting us with complex deforestation and land-use patterns.
We leverage structure-from-motion to generate a large orthomosaic of historical imagery and develop a convolutional neural network (CNN)-based forest cover mapping approach trained on a semi-supervised generated dataset that relies extensively on data augmentation. Our methodology aims to provide a historical insight into important LULCC spatial patterns in Yangambi, such as fragmentation and edge complexity. We further contextualize the influence of changes in the forest’s life history on past and current research into Aboveground Carbon (AGC) storage [30] and biodiversity [12] in the central Congo Basin. Our fast, scalable mapping approach for historical aerial survey data, using limited supervised input, would further support long-term land-use and land-cover change analysis across the central Congo Basin.
4. Discussion
Finely grained spatial data sources, such as remote sensing imagery, are rare before the satellite era (<1972). This lack of data limits our understanding of how forest structure has varied over longer time periods in remote areas. Long-term assessment can be extended by using large inventories of historical aerial survey data [22,23,49]. Despite the difficulties in recovering this data and its limitations, such as invisible disturbances [50], remote sensing generally remains the best way to map and quantify LULCC [2]. In our study, we used novel numerical remote sensing techniques to valorize, for the first time, historical remote sensing data in order to quantify long-term land-use and land-cover change and canopy structural properties in the central Congo Basin. Despite these successes, some methodological and research considerations remain.
4.1. Methodological Considerations
4.1.1. Data Recovery Challenges
In our study, the archive data recovered was limited to contact prints and therefore did not represent the true resolution of the original negative. In addition, analogue photography clearly produces a distinct softness compared to digital imagery (Figure 8). Despite favourable nadir image acquisitions [51], image softness combined with illumination effects between flight paths and the self-similar nature of vast canopy expanses [52,53,54] limited our ability to provide wall-to-wall coverage of the entire dataset containing 334 images. The scarcity of man-made features in the scenes also made georeferencing challenging. Although the village of Yangambi provided a range of buildings as (hard-edge) references, other areas within the central Congo Basin might have fewer permanent structures and would require the use of soft-edged landscape features (e.g., trees and river outflows). Research has shown that soft-edged features can help georeference scenes even when containing few man-made features [55]. Our two-step georeferencing approach resulted in a referencing accuracy of ~4.7 ± 4.3 m across reference points. However, it should be noted that referencing accuracy of the final scene is less constrained toward the edges of the scene.
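A referencing accuracy of this kind is typically summarized as the mean and standard deviation of positional offsets between mapped and reference points. The following is a minimal sketch of that summary, not the procedure used in this study: the function name `referencing_error`, the control-point coordinates, and the choice of the sample standard deviation are all assumptions made for illustration.

```python
import math
import statistics

def referencing_error(mapped, reference):
    """Return (mean, sample standard deviation) of Euclidean offsets in metres."""
    offsets = [
        math.hypot(mx - rx, my - ry)
        for (mx, my), (rx, ry) in zip(mapped, reference)
    ]
    return statistics.mean(offsets), statistics.stdev(offsets)

# Hypothetical control points in projected coordinates (metres), not study data.
mapped = [(100.0, 200.0), (350.0, 420.0), (800.0, 150.0)]
reference = [(103.0, 204.0), (350.0, 426.0), (792.0, 156.0)]

mean_err, sd_err = referencing_error(mapped, reference)
print(f"{mean_err:.1f} ± {sd_err:.1f} m")
```

Reporting the spread alongside the mean matters here because a standard deviation close to the mean, as in the reported value, indicates that individual reference points can deviate considerably from the average offset.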
4.1.2. LULC Classification and Validation
When classifying the orthomosaic into forest and non-forest states, we favoured a supervised deep-learning classification using a CNN over manual segmentation to guarantee an “apples-to-apples” comparison between the historical and the contemporary GFC forest cover maps used in our analysis. We acknowledge that the CNN and GFC land-use and land-cover maps use different underlying features, i.e., spatial or spectral data, yet attain a similarly high accuracy of up to 99% [1]. Moreover, when validating our CNN classifier against GFC data (Figure 7) for a contemporary high-resolution Geo-Eye panchromatic image, we reach an accuracy of ~87%, despite a time difference of almost 60 years between the historical training imagery and the validation scene. Visual inspection of the classification data in Figure 7 suggests that the GFC map more often than not classifies non-forest areas as forest. Actual classification accuracy of our algorithm might therefore be higher than our reported value.
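To make the validation logic concrete, the sketch below computes overall agreement between two binary forest maps and counts each type of disagreement. It is an illustration under stated assumptions: the function name `agreement`, the tiny example grids, and the encoding (1 = forest, 0 = non-forest) are hypothetical and do not reproduce the study's data or code.

```python
def agreement(predicted, reference):
    """Count agreement classes between two binary maps (1 = forest, 0 = non-forest)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            if p and r:
                counts["tp"] += 1      # both maps say forest
            elif p and not r:
                counts["fp"] += 1      # classifier forest, reference non-forest
            elif r:
                counts["fn"] += 1      # classifier non-forest, reference forest
            else:
                counts["tn"] += 1      # both maps say non-forest
    total = sum(counts.values())
    accuracy = (counts["tp"] + counts["tn"]) / total
    return counts, accuracy

# Hypothetical 3 x 4 maps; the reference labels two extra pixels as forest.
predicted = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
reference = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0]]

counts, accuracy = agreement(predicted, reference)
print(counts, round(accuracy, 3))
```

In this toy case every disagreement is a pixel that only the reference labels as forest. If the reference map tends to overestimate forest in this way, such pixels lower the measured accuracy even where the classifier is correct, which is the reasoning behind the caveat above.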
4.1.3. Scaling Opportunities
Our approach uses broadly defined homogeneous polygons to construct a balanced dataset of synthetic landscapes. The methodology is analogous to the sparse labelling used by Buscombe and Ritchie [56] and contrasts with the standard methodologies which generally extract pixels (or pixel windows) [22] or delineate land cover classes [23] to drive a classifier or analysis. Moreover, the use of heavy image augmentation during model training sidesteps texture representation issues which affect classification of image scenes with inconsistent illumination or sharpness [25] and the need for ad hoc feature engineering [22]. The use of synthetic landscapes allowed us to account for most, but not all, of the variability within our orthomosaic. Our analysis has shown that, despite being trained on historical data, our model could map contemporary forest cover in remote sensing data with similar spatial and spectral characteristics (Figure 7), suggesting that the classifier consistently works across both space and time. We acknowledge that the use of synthetic landscapes is limited by the available homogeneous areas to sample from and the number of classes. However, the latter should not be a constraint as previous research efforts have focussed on simple forest cover maps [1].
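The idea of composing synthetic landscapes from homogeneous samples can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: the function name `synthetic_tile`, the straight vertical class boundary, the brightness range, and the random grey-level patches are all assumptions chosen to keep the example short.

```python
import random

def synthetic_tile(forest_patch, nonforest_patch, rng):
    """Compose one training tile and its label mask from two homogeneous patches."""
    size = len(forest_patch)
    split = rng.randint(1, size - 1)   # column where the class boundary falls
    gain = rng.uniform(0.7, 1.3)       # brightness augmentation (assumed range)
    flip = rng.random() < 0.5          # horizontal flip augmentation
    image, mask = [], []
    for r in range(size):
        row = [forest_patch[r][c] if c < split else nonforest_patch[r][c]
               for c in range(size)]
        labels = [1 if c < split else 0 for c in range(size)]
        row = [min(255, round(v * gain)) for v in row]
        if flip:
            row.reverse()
            labels.reverse()
        image.append(row)
        mask.append(labels)
    return image, mask

rng = random.Random(42)
size = 8
# Hypothetical grey-level patches standing in for samples from homogeneous polygons.
forest_patch = [[rng.randint(40, 90) for _ in range(size)] for _ in range(size)]
nonforest_patch = [[rng.randint(150, 220) for _ in range(size)] for _ in range(size)]

tiles = [synthetic_tile(forest_patch, nonforest_patch, rng) for _ in range(4)]
image, mask = tiles[0]
```

Because each tile is generated together with its mask, only the broad homogeneous polygons need manual delineation, and every tile contains both classes by construction, which keeps labelling effort low.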
4.2. Research Context
4.2.1. Long Term Changes in LULC and Aboveground Carbon
Our analysis shows that the majority of deforestation around Yangambi had occurred by the late 1950s (~16,200 ha). Considerable reforestation has occurred since the aerial survey was executed (~9918 ha), and socioeconomic instability prevented further large-scale forest exploitation. In particular, many plantations have reached maturity, and forest has reestablished in previously cleared or disturbed areas. The majority of this reforestation takes the form of isolated patches of forest but is offset by further deforestation of previously untouched forest. Generally, the function and structure of forests can be influenced by forest edges that are located up to 1 km away; however, most effects are pronounced within the first 300 m from the edge [57]. Our analysis of edge effects on AGC has shown that the influence is negligible 200 m away from the edge. Phillips et al. [58] have shown similar weak responses to edge effects in the Amazon forest. Due to a lack of data on the extent (depth) of edge effects and their influence on AGC beyond 200 m, we did not include any estimates of carbon loss or gain within these zones. However, it must be stated that edges throughout the landscape make up a substantial area and account for 13,151 ha. Thus, edges could have a substantial negative [6] or positive [59] influence on AGC. Similarly, uncertainties in how to explicitly correct for plantations in the landscape present a further challenge. Variability across mixed forest plots used in scaling aboveground carbon estimates due to deforestation also introduced uncertainty (see Appendix A Table A3). Thus, although our estimates are only indicative, they do underscore the important influence of landscape structure in carbon accounting. However, our findings do not indicate that deforestation in the Congo Basin is declining; on the contrary.
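The area of an edge zone can be derived from a binary forest map by measuring each forest pixel's distance to the nearest non-forest pixel. The sketch below is one simple way to do so and is not the study's implementation: the function name `edge_zone_area_ha`, the toy grid, the 100 m pixel size, and the use of city-block (4-connected) distance as a stand-in for Euclidean distance are all assumptions.

```python
from collections import deque

def edge_zone_area_ha(forest, pixel_m, depth_m):
    """Area (ha) of forest pixels within depth_m of non-forest, city-block distance."""
    rows, cols = len(forest), len(forest[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every non-forest pixel at distance zero.
    for r in range(rows):
        for c in range(cols):
            if not forest[r][c]:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search gives steps to the nearest non-forest pixel.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    max_steps = depth_m / pixel_m
    edge_pixels = sum(
        1
        for r in range(rows)
        for c in range(cols)
        if forest[r][c] and dist[r][c] is not None and dist[r][c] <= max_steps
    )
    return edge_pixels * pixel_m ** 2 / 10_000  # square metres to hectares

# Hypothetical 5 x 7 map with 100 m pixels: one cleared column, forest elsewhere.
forest = [[0, 1, 1, 1, 1, 1, 1] for _ in range(5)]
edge_ha = edge_zone_area_ha(forest, pixel_m=100, depth_m=200)
print(edge_ha)
```

City-block distance overestimates distance along diagonals, so a Euclidean distance transform would be preferable for ragged edges; the breadth-first search is used here only to keep the example dependency-free.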
Over the past half century, there has been a clear shift in land use in Yangambi (Figure 3). Land use has shifted away from a regular (fishbone) deforestation pattern that emerges when (large-scale) agricultural interests dominate the landscape [60] to a more fragmented landscape (Figure 3D). The former is consistent with historical land management at the time of the aerial survey [46]. These regular patterns reversed due to a decrease in large-scale intensive agriculture and an increase in ad hoc small-scale subsistence farming with large perimeter-to-area ratios (i.e., ragged edges). Consequently, edge effects in the current landscape are far more pronounced than in the historical scene.
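The perimeter-to-area argument can be illustrated with a small calculation on binary clearing masks. This is a sketch with hypothetical shapes, not an analysis of the Yangambi data: the function name `perimeter_area_ratio` and both example masks are assumptions, and the ratio is expressed per pixel side length.

```python
def perimeter_area_ratio(mask):
    """Ratio of exposed pixel sides to pixel count for the cells set to 1."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not mask[nr][nc]:
                    perimeter += 1   # side borders a different class or the map edge
    return perimeter / area

# Two hypothetical clearings of equal area (16 pixels each).
compact_block = [[1] * 4 for _ in range(4)]   # regular, plantation-like block
narrow_strip = [[1] * 16]                     # elongated, smallholder-like strip

compact_ratio = perimeter_area_ratio(compact_block)
strip_ratio = perimeter_area_ratio(narrow_strip)
print(compact_ratio, strip_ratio)
```

For the same cleared area, the elongated shape exposes roughly twice as much edge as the compact block, which is why a shift toward small irregular clearings increases edge effects even when total cleared area is unchanged.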
Visual inspection of the images also suggests that reforestation within the historically cleared areas and experimental plots is not necessarily limited to areas far removed from more densely populated areas. For example, large reforested areas exist close to the Congo River and Yangambi village itself (Figure 3). Here, regional political components, such as land leases and large-scale ownership, could have played a role in safeguarding some of these areas for rewilding or sustainable management [61,62]. Despite widespread anthropogenic influences throughout the tropics [31], the retention of these forested areas shows the potential of explicit or implicit protective policy measures (e.g., INERA concessions, Bustillo et al. [46]) on a multi-decadal time scale. Reforestation in noncontinuous areas within Yangambi could increase landscape connectivity and could help increase biodiversity [12].
Our analysis therefore provides an opportunity to highlight and study those regions that have previously suffered confirmed long-term disturbances and those that have been restored since. Assessments of old plantations and recovering clear-cut forests can serve as a guide to help estimate carbon storage capacity and forest recovery rates in managed and unmanaged conditions [18,20,63] over the mid to long term in support of rewilding and general forest restoration [12,61,62]. In addition, mapping long-term edge effects can further support research into issues such as receding forest edges [57].
4.2.2. Canopy Structure and FOTO Texture Analysis
Finally, the FOTO technique used to quantify relationships between canopy structure and forest characteristics rendered no valuable insights into either the historical orthomosaic or the recent Geo-Eye scene. Similarly weak correlations were found previously by Solórzano et al. [64]. In contrast, site-based texture metric statistics did show some correspondence between historical and contemporary satellite imagery, but none of these relationships were either consistent or significant. Although visual interpretation shows distinctly different canopy structures (Figure 3), the differences in how resolution is defined and issues related to image quality prevented us from quantifying these further. Unlike large-scale studies by Ploton et al. [14], we could not scale this technique to historical data. The successful use of our CNN classification model on contemporary remote sensing data does suggest that texture can be used to consistently capture canopy properties 60 years apart. Differences in PC between forest types (e.g., mono-dominant vs. mixed, Figure 9) corroborate that texture can serve as a basis for LULC classification. However, inflexibility on the part of the FOTO technique in dealing with non-standardized (historical) data or scaling these results to AGC values limits its use case. We advise that future valorization efforts should preferentially work from photographic negatives (if available) to ensure optimal data quality in resolution, contrast, and sharpness.
5. Conclusions
Given the impact of tropical forest disturbances on atmospheric carbon emissions, biodiversity, and ecosystem productivity, accurate long-term reporting of LULCC is imperative. Our analysis of historical aerial survey images (1958) of the central Congo Basin provides a window into the state of the forest at the start of the Anthropocene. A CNN-based LULC classifier trained with synthetic-landscape-based image augmentation provides a robust semi-supervised solution which scales across space and time, even for image scenes with inconsistent illumination or sharpness. We have shown that historical aerial survey data, combined with contemporary remote sensing data, can be used to quantify long-term changes in LULC and AGC. We showed a shift from previously highly structured industrial deforestation of large areas for plantation purposes to discrete smallholder clearing for farming, increasing landscape fragmentation and opportunities for substantial regrowth. Efforts to quantify canopy texture features and their link to AGC had limited to no success. Our analysis provides insights into the rate at which deforestation and reforestation have taken place over a multi-decadal scale in the central Congo Basin. As such, it provides a useful historical context while interpreting past and ongoing forest research in the area.