Historical Aerial Surveys Map Long-Term Changes of Forest Cover and Structure in the Central Congo Basin

: Given the impact of tropical forest disturbances on atmospheric carbon emissions, biodiversity, and ecosystem productivity, accurate long-term reporting of Land-Use and Land-Cover (LULC) change in the pre-satellite era (<1972) is an imperative. Here, we used a combination of historical (1958) aerial photography and contemporary remote sensing data to map long-term changes in the extent and structure of the tropical forest surrounding Yangambi (DR Congo) in the central Congo Basin. Our study leveraged structure-from-motion and a convolutional neural network-based LULC classiﬁer, using synthetic landscape-based image augmentation to map historical forest cover across a large orthomosaic (~93,431 ha) geo-referenced to ~4.7 ± 4.3 m at submeter resolution. A comparison with contemporary LULC data showed a shift from previously highly regular industrial deforestation of large areas to discrete smallholder farming clearing, increasing landscape fragmentation and providing opportunties for substantial forest regrowth. We estimated aboveground carbon gains through reforestation to range from 811 to 1592 Gg C, partially offsetting historical deforestation (2416 Gg C), in our study area. Efforts to quantify long-term canopy texture changes and their link to aboveground carbon had limited to no success. Our analysis provides methods and insights into key spatial and temporal patterns of deforestation and reforestation at a multi-decadal scale, providing a historical context for past and ongoing forest research in the area.


Introduction
Tropical ecosystem services are severely impacted by deforestation and forest degradation [1][2][3]. Not only does tropical forest Land-Use and Land-Cover Change (LULCC) constitute 10% to 15% of the total global carbon emissions [4] it also changes forest fragmentation and influences forest structure and function [5][6][7]. Strong fragmentation effects decrease the number of large trees along forest edges [8,9], while species composition and biodiversity are equally negatively affected [10][11][12]. Estimates show that 31% of carbon emissions are caused by edge effects alone [6].
with an Aviogon lens assembly (114.83 mm/f 5.6, with a 90 • view angle) resulting in square photo negatives of 180 by 180 mm. Flights were flown at an average absolute altitude of~5200 m above sea level, covering roughly 18,530 km 2 at an approximate scale of 1/40,000. The use of the integrated autograph system ensured timely acquisition of pictures with a precise overlap (~1/3) between images. This large overlap between images together with flight parameters would allow post-processing using stereographs to create accurate topographic maps. Original data from this campaign are stored in the Royal Museum for Central Africa in Tervuren, Belgium (Figure 1, Appendix A Figure A3).

Site Selection
We prioritised flight paths and images that contained current-day permanent sampling plots, larger protected areas, and past agricultural and forest research facilities ( Figure 2). This selection provides a comprehensive mapping of the Yangambi area and the life history of the forest surrounding it. Thereafter, we selected flight paths 1 through 11 for digitization. From this larger dataset of 334 images, we selected 74 survey images for orthomosaic compositing and further analysis. All the selected images stem from the flight campaign made during January and February of 1958. The area includes the Yangambi village, 20 contemporary permanent sampling plots [30], past and present agricultural experimental plots [32], and large sections of the Yangambi UNESCO Man and Biosphere reserve surrounding the west and east sides of the village. Although not formally mosaicked, we provided a full dataset of preprocessed images using the cropping and normalization routines described below. The latter data was not used in subsequent LULCC analysis but has been archived and made available to the public separately (see code and data availability statement below).

Digitization and Data Processing
All selected images covering the Yangambi area were contact prints as original negatives of the prints were not available. Images were scanned at a resolution exceeding their original resolution (or grain) at the maximal physical resolution of an Epson A3 flatbed scanner (i.e., 2400 dpi or 160 MP per image) and saved as lossless tiff images. Data were normalized using contrast limited histogram equalization [33] with a window size of 32 and a clip limit of 1.5. Fiduciary marks were used to rectify and downsample the images into square 7700 × 7700 pixel images (~1200 dpi, 81 MP). This resulted in a dataset with digital images at a resolution that remained above the visible grain of the photographs, whilst the reduced image size facilitated easier file handling and processing speed. Overview of the historical flight paths during aerial photo acquisition and ancillary data used in this study: The bounding box of the orthomosaic data presented in this study is shown as a rectangle (23 × 36 km). The outline of a recent high-resolution Geo-Eye panchromatic image is shown as a dashed dark grey rectangle (10 × 10 km). The location of various permanent sampling plots are shown as x, +, and open squares and triangles for the mixed, mono-dominant, and edge plots, respectively. The grey polygon delineates the current-day Yangambi Man and Biosphere reserve. The inset, top right, situates the greater Yangabmi region (white rectangle) within the DR Congo. The full flight plan and details are shown in Appendix A Figures A1 and A2.
Data was processed into a georeferenced orthomosaic using a structure-from-motion (SfM, Ullman [34]) approach implemented in Agisoft Metashape version 1.5.2 (Agisoft LLC, St. Petersburg, Russia). An orthomosaic corrects remote sensing data to represent a perfectly downward-looking image, free from perspective distortions due to topography and camera tilt. Using the SfM technique features, areas in images with a large degree of similarity are matched across various images to reconstruct a three-dimensional scene (topography) from two-dimensional image sequences. During the SfM analysis, we masked most clouds, glare, or large water bodies such as the Congo river.
We calculated the orthomosaic using a low-resolution point cloud and digital elevation map (DEM). Additional ground control points were provided to assist in the referencing of image and to constrain the optimization routine used in the SfM algorithm. Ground control points consisted of permanent structures which could be verified in both old and new aerial imagery (i.e., ESRI World Imagery) and consisted of corner points of build structures (e.g., a building, bridge, or swimming pool etc.). Although most clouds were removed during the SfM routine, some were retained to provide sufficient SfM tie points to maximize continuous forest coverage in the final orthomosaic. The final scene was cropped to provide consistent wall-to-wall coverage of the reconstructed scene. The orthomosaic was exported as a geotiff for further georeferencing in QGIS [35] using the georeferencer plugin (version 3.1.9) and additional ESRI World Imagery high-resolution reference data. We used 3rd degree polynomial and 16 ground control points to correct the final image. Ground control points, raw image data, and final processed image are provided in addition to measures of uncertainty such as mean and median error across all ground control points. All subsequent analyses are executed on the final geo-referenced orthomosaic or subsets of it.

Model Training
We automatically delineated all natural forest in the historical data, thus excluding tree plantations, thinned or deteriorated forest stands which showed visible canopy cover loss, fields, and buildings. We used the Unet Convolutional Neural Net (CNN, Ronneberger et al. [36]) architecture implemented in Keras [37] with an efficientnetb3 preprocessing backbone [38] running on TensorFlow [39] to train a binary classifier (i.e., forest or non-forested). Training data were collected from the orthomosaic by randomly selecting 513-pixel square tiles from locations within homogeneous forested or non-forested polygons in the historical orthomosaic ( Figure 3). Separate polygons were selected for training, testing, and validation purposes. Validation polygons were sampled 300 times, while both testing and validation polygons were sampled at 100 random locations. Tiles extracted from locations close to the polygon border at times contained mixed cover types. Tiles with mixed cover types were removed from the list of source tiles (Table 1). Homogeneous source tiles were combined in synthetic landscapes using a random Gaussian field-based binary mask ( Figure 4). We generated 5000 synthetic landscapes (balancing forest and non-forest classes) for training, while 500 landscapes were generated for both the validation and the testing datasets for a total of 6000 synthetic landscapes. In order to limit stitch line misclassifications, along the seams of mosaicked images, we created synthetic landscapes with different forest tiles to mimick forest texture transitions. We applied this technique to 10% of the generated synthetic landscapes (across training, validation, and testing data).   Figure 5): Homogeneous polygons used in the generation of the synthetic landscape for convolutional neural network training, testing, and validation are marked as dashed and full lines for forest and non-forest regions, respectively. Training, testing, and validation regions are denoted in panel B in green, red, and orange, respectivelly. Black polygon outlines denote cloud and image stitch line regions which were manually excluded from analysis but retained in calculation of validation statistics (see Table 2).
The CNN model was trained for 100 epochs with a batch size of 30 using Adam optimization [40], maximizing the Intersect-over-Union (IoU) using Sørensen-Dice [41] and categorical cross-entropy loss functions. Data augmentation included random cropping to 320-pixel squares, random orientation, scaling, perspective, contrast and brightness shifts, and image blurring. The optimized model was used to classify the complete orthomosaic using a moving window approach with a step size of 110 pixels and a majority vote (>50% agreement) across overlapping areas to limit segmentation edge effects. In addition, we provide raw pixel level classification agreement data for quality control purposes (see data availability below). We refer to Figure 6 for a synoptic overview of the full deep learning workflow.    Figure 6. A diagram of the deep learning workflow followed in training a binary forest/non-forest cover convolutional neural net U-Net model to generate our forest cover map.

Model Validation
We report the CNN accuracy based upon the IoU of our out-of-sample validation dataset of synthetic landscapes. In addition, we report confusion matrices for all pixels across the homogeneous validation polygons as well as the training and testing polygons (see Figure 3). Furthermore, we used the first acquisition of a recent panchromatic Geo-Eye 1 stereo pair (Geo-Eye, Thornton, Colorado, U.S.A., order 737537, 2011-11-11 08:55 GMT or 09:55 local time) to classify and assess the robustness of the CNN algorithm on contemporary remote sensing imagery with similar spectral and spatial characteristics. We used the Global Forest Change version 1.6 (GFC, tile 10N-020E) [1] map data, capturing deforestation up to 2011, to quantify accuracy on downsampled CNN Geo-Eye classification results. Once more, we report the confusion matrix between the GFC-and CNN-derived forest cover maps, masking clouds, and cloud shadows. To summarize confusion matrices, we report accuracy as follows: where TP, TN, FP, and FN are true positive, true negative, false positive, and false negative, respectively.

Characterizing Long-Term Change
To map long-term LULCC in the Yangambi region, we used the contemporary Global Forest Change version 1.6 (GFC, tile 10N-020E) (lossyear) map data [1]. Using the GFC data, we calculated the latest state of the forest with respect to the conditions at the start of 1958, 60 years earlier. In our analysis, we only included GFC pixels which recorded no forest loss throughout the whole 2000-2018 period. Forest loss in the context of GFC is defined as "a stand-replacement disturbance or a change from a forest to non-forest state". As such, locations which would see reforestation or deforestation between 2000 and 2018 would be marked as non-forest (i.e., disturbed). As the resolution of the historical forest classification exceeds that of the GFC map, we downsampled our historical forest cover data to 30 m GFC resolution using a nearest neighbour approach. We masked out all water bodies using the Global Forest Change survey data mask layer and limited the analysis to the right bank of the Congo river. We provide summary statistics of historical and contemporary deforestation and reforestation. We map permanent deforestation after 1958, reforestation after 1958, recent deforestation, and long-term (stable) forest cover. All references to changes over time in the context of our analysis explicitly compare the historical and contemporary periods from hereon forth.

Landscape Fragmentation and Aboveground Carbon Estimates
To quantify changes in the structure of forest cover and its disturbances, we used spatial landscape pattern analysis (i.e., fragmentation) metrics [42]. Landscape metrics provide a mathematical framework for the analysis of discrete land-cover classes and allows us to capture their composition and configuration. These metrics are therefore commonly used to compare how landscapes change over time [43]. In particular, fractals provide a way to quantify complex natural landscapes, including their self-similarity across scales [44,45]. We report the ratio of edge to area and the fractal dimension to quantify landscape complexity of forest disturbances. A fractal dimension closer to 2 suggests a more complex (fragmented) landscape.
Statistics were calculated for all forest disturbance patches larger than 1 ha and smaller than the 95th percentile of the patch size distribution using the R package "landscapemetrics" [43]. We provide mean and standard deviation on edge, area, their ratio, and fractal dimension for both the historical and contemporary Hansen et al. [1] forest cover maps.
We estimated aboveground carbon (AGC) losses and gains due to deforestation and reforestation using plot-based averages of recent inventory data at permanent sampling plots in the area ( Figure 2). We refer to Kearsley et al. [30] for the survey method and allometric relations used to scale the survey data. Unlike standard square 1 ha plots, edge plots (163.03 ± 19.39 Mg C ha −1 , N = 5) were set back 200 m from forest edges and were 50 × 200 m, with the 50 m side of the plot along the forest edge and continuing 200 m into the forest (Appendix A Table A2). We further confirmed that forest edge plots, as compared to mixed forest plots (160.48 ± 23.84 Mg C ha −1 , N = 8, see Appendix A Table A3 for all forest types) did not show a significantly different AGC (Mann Whitney U test, p < 0.05). Thus, it was not necessary to explicitly quantify changes in AGC caused by edge effects. Moreover, we used the mean value and its uncertainty (i.e., standard deviation) of the mixed forest as representative for potential AGC losses. Despite the challenges inherent in quantifying AGC for forest edges, we mapped the total extent of the edges in the contemporary landscape. To align our landscape analysis with exploratory analysis of the survey data, we used a buffer of 200 m to estimate the extent of forest edges and patches up to the location of forest edge plots.
Surveys of old plantations show a large variation in AGC, depending on age and the crop type. For example, the AGC values varied from 86.55 to 168.67 Mg C ha −1 , for Elaeis guineensis (oil palm) and Hevea brasiliensis (rubber tree) plots, respectively (Bustillo et al. [46], personal communications). These higher values are in line with the mixed plot AGC estimates (160.48 ± 23.84 Mg C ha −1 , N = 8) in the area, while the palm plantations resemble old-regrowth values (81.87 Mg C ha −1 , N = 1). To quantify AGC in reforested areas, we therefore use both AGC estimates of old-regrowth and mixed forest as lower and upper bounds. We did not have sufficient data to account for individual changes in AGC across plantations.

Canopy Structure and FOTO Texture Analysis
We compared the structure of the canopy both visually and using Fourier Transform Textural Ordination (FOTO, Couteron [47]). FOTO uses a principal component analysis (PCA) on radially averaged 2D Fourier spectra to characterize canopy (image) texture. The FOTO technique was first described by Couteron [47] to quantify canopy stucture in relation to biomass and biodiversity and can be used across multiple scenes using normalization [16].
We used the standard FOTO methodology with fixed zones instead of the moving window approach. The window size was set to the same size (187 pixels or~150 m) as used in the moving window analysis above. To account for illumination differences between the two scenes, we applied histogram matching. No global normalization was applied, as the scene was processed as a whole. PC values from this analysis for all permanent sampling plots in both image scences were extracted using a buffer with a radius of 50 m.
For both site-based and scene analysis, we correlated PC values with permanent sample plot inventory data such as stem density, aboveground biomass, and tree species richness (Appendix A  Tables A2 and A3). Additional comparisons are made between contemporary Geo-Eye data and the historical orthomosaic-derived PC values. Due to the few available permanent sampling plots in both scenes, we used a nonparametric paired signed rank (Wilcoxon) test [48] to determine differences between the PC values of the Geo-Eye and historical orthomosaic image scenes across mono-dominant and mixed forest types. In all analysis, mono-dominant site 4 was removed from the analysis due to cloud contamination.

Orthomosaic Construction
Our analysis provides a first spatially explicit historical composite of aerial survey images in support of mapping land-use and land-cover within the Congo Basin. The use of high-resolution historical images combined with SfM image processing techniques allowed us to mosaic old imagery across a large extent. The final orthomosaic composition of the Yangambi region provided an image scene covering approximately 733 million pixels across~93,431 ha with a resolution of 0.88 m/pixel (or~23 × 36 km, Figure 2). The overall spatial accuracy of the SfM orthomosaic composition using the sparse cloud DEM (with a resolution of 45.8 m/pixel) was limited to approximately 23 m. Further georeferencing outside the SfM workflow reduced the mean error at the ground control points to 5.3 ± 4.9 px (~4.7 ± 4.3 m), with a median error of 2.9 px (2.6 m). The orthomosaic served as input for all subsequent LULCC analysis with all derived maps provided with the manuscript repository (see data and code availability statements below).

CNN Model Validation
The CNN deep learning classifier reached an intersection-over-union accuracy of 97% on the detection of disturbed forest in the out-of-sample (validation) synthetic landscape data. Using all pixels within the validation polygons ( Figure 3) showed a similar accuracy value of~98%. Using all polygons across the scene, including those used in the generation of testing and training synthetic landscapes, increased the accuracy to~99% (Table 2). A comparison with recent panchromatic Geo-Eye data shows good agreement, with an accuracy of~87% across all pixels between the landsat-based GFC data and downscaled CNN results (Table 2 and Figure 7).

Long-Term Changes in LULC and Aboveground Carbon
Scaling our classifier to the whole historical orthomosaic, we detected~16,200 ha (or~20% of the scene) of disturbed forests. A large fraction of the disturbed area was restored in the period between the two acquisition periods. In total, 9918 ha, or little over half of the affected forest, was restored ( Figure 3C,D, dark blue). Recent deforested areas, as registered through satellite remote sensing (>2000), approximate 8776 ha (Table 3, Figure 5, light green). Recent deforestation follows a distinctly different pattern compared to historical patterns. Historical deforestation showed a classical fishbone pattern for forest clearing with very sharp edges, while current patterns are patchy and ad hoc ( Figure 5C, dark blue and green colours, respectively). These differences are reflected in the analysis of landscape metrics of deforestation. Between the historical and contemporary LULCC maps, we see an increase in small disturbances, as indicated by the decreasing area of the mean patch size, down to~1.86 ± 0.75 ha from~5.25 ± 5.02 ha historically. Perimeter lengths were longer historically, at 1451 ± 943 m, compared to contemporary landscapes 921 ± 362 m ( Table 4). This shift in perimeter area ratio led to a similar change in the fractal index, slightly increasing in value to 1.1 ± 0.05 from 1.09 ± 0.04 over time. Values closer to a fractal index of 2 suggest a more complex (fragmented) landscape.
A comparison of forest edge plots with mixed forest plots showed no signficant difference in AGC or other reported values such as species richness, basal area, or stem density (Mann Whitney U test, p < 0.05). Edge influence did not extend beyond 200 m from a forest edge but still represented an area of 13,151 ha (Table 3).
Changes in both land use and land cover led to concomitant changes in AGC stocks. Recovery throughout the region was characterized for patches of forest and plantations. Assuming high density stands, based on previous work, this could amount to a carbon gains of up to 1592 Gg C across our study area, offsetting more recent losses of approximately 1408 ± 209 Gg C. On the other hand, at the low end, if we assume a lower carbon density of 81.8 Mg C ha −1 , this would result in a total carbon gain of 811 Gg C. Using our approach, results indicate that overall deforestation around Yangambi has resulted in a loss of~2416 ± 358 Gg C in AGC stocks. Table 3. Land-use land-cover change statistics of forest cover around Yangambi in the central Congo Basin: The data evaluates a difference between a historical (1958) aerial photography-based survey and the Hansen et al. 2013-based satellite remote sensing data. Spatial coverage statistics are provided hectares (ha), rounded to the nearest integer as well as Above Ground Carbon (AGC) scaled using recent survey measurements. Both non-forest cover in 1958 and the forest edge area are subsets of the original data (see Figure 5) and do not count toward the total scene area.

Canopy Structure and FOTO Texture Analysis
Visual interpretation of the scenes provide evidence that most locations do not change dramatically with respect to canopy composition, except for the large areas of disturbances in contemporary fallow or young-regrowth plots. One marked difference is noted in the mono-dominant plot 6 (Appendix A  Table A2). Here, the current mono-dominant Brachystegia laurentii is a recent development, changing the canopy structure visibly during the last half century (Figure 8). The previous varied canopy structure gave way to a more dense and uniform canopy. This is reflected in a change of the FOTO PC value from 0.19 historically to its current value of 0.54 ( Figure 9). This historical value is similar to the mean of contemporary mono-dominant stands of Gilbertiodendron dewevrei with PC averaging 0.34 ± 0.1 and is only slightly higher than historical values for a mixed forest (0.18 ± 0.08, Figure 9). The reverse pattern is seen in the contemporary PC values. Here, the value of 0.54 exceeds those of most mono-dominant stands (0.35 ± 0.08) and is even further removed from the values noted for mixed forests (0.12 ± 0.03, Figure 9).
Using only small subsets around existing permanent sampling plots, we show distinct differences between forest types, with PC values in both historical and contemporary imagery markedly higher for the mono-dominant forest types compared to all others (Appendix A Figure A4). Provided that the young-regrowth and fallow permanent sampling plots have seen recent disturbance, the Wilcoxon signed rank test on the mixed and mono-dominant plots between the historical and contemporary PC values did not show a significant difference (p > 0.05). Similarly, no significant different using PC values extracted from the whole scene analysis was noted (p > 0.05). Any relationships between contemporary Geo-Eye data and permanent sampling plot measurements of aboveground carbon, stem density, and species richness were nonsignificant (p > 0.05, Appendix A Figures A4-A6).  Furthermore, visual inspection of the scene-wide analysis suggests historical scences do not show landscape-wide canopy features ( Figure 10A,B), unlike the contemporary scene ( Figure 10C,D). In the contemporary scene, the FOTO algorithm picks up landscape features such as changes in texture along the river valley (the diagonal line in Figure 10D). However, no corresponding landscape patterns are found by the FOTO algorithm in the historical orthomosaic.

Discussion
Finely grained spatial data sources, such as remote sensing imagery, are rare before the satellite era (<1972). This lack of data limits our understanding of how forest structure has varied over longer time periods in remote areas. Long-term assessment can be extended by using large inventories of historical aerial survey data [22,23,49]. Despite the difficulties in recovering this data and its limitations, such as invisible disturbances [50], remote sensing generally remains the best way to map and quantify LULCC [2]. In our study, we used novel numerical remote sensing techniques to valorize, for the first time, historical remote sensing data in order to quantify (long term) land-use and land-cover change and canopy structural properties in the central Congo Basin. Despite these successes, some methodological and research considerations remain.

Data Recovery Challenges
In our study, the archive data recovered was limited to contact prints and therefore did not represent the true resolution of the original negative. In addition, analogue photography clearly produces a distinct softness compared to digital imagery (Figure 8). Despite favourable nadir image acquistions [51], image softness combined with illumination effects between flight paths and the self-similar nature of vast canopy expanses [52][53][54] limited our ability to provide wall-to-wall coverage of the entire dataset containing 334 images. A few man-made features in the scenes also made georeferencing challenging. Although the village of Yangambi provided a range of buildings as (hard-edge) references, other areas within the central Congo Basin might have fewer permanent structures and would require the use of soft-edged landscape features (e.g., trees and river outflows). Research has shown that soft-edged features can help georeference scenes even when containing few man-made features [55]. Our two-step georeferencing approach resulted in a referencing accuracy of 4.7 ± 4.3 m across reference points. However, it shoud be noted that referencing accuracy of the final scene is less constrained toward the edges of the scene.

LULC Classification and Validation
When classifying the orthomosaic into forest and non-forest states, we favoured a deep-learning-supervised classification using a CNN over manual segmentation to guarantee an "apples-to-apples" comparison between the historical and the contemporary GFC forest cover map used in our analysis. We acknowledge that both the CNN and GFC land-use and land-cover maps use different underlying features, i.e., spatial or spectral data, yet attain a similarly high accuracy of up to 99% [1]. More so, when validating our CNN classifier against GFC data ( Figure 7) for a contemporary high-resolution Geo-Eye panchromatic image, we reach an accuracy of~87%, despite a time difference of almost 60 years. Visual inspection of the classification data in Figure 7 suggests that the GFC map more often than not classifies non-forest areas as forest. Actual classification accuracy of our algorithm might therefore be higher than our reported value.

Scaling Opportunities
Our approach uses broadly defined homogeneous polygons to construct a balanced dataset of synthethic landscapes. The methodology is analoguous to the use of sparse labelling as used by Buscombe and Ritchie [56] and contrasts with the standard methodologies which generally extract pixel (windows) [22] or delineate land cover classes [23] to drive a classifier or analysis. More so, the use of heavy image augmentation during model training sidesteps texture representation issues which affect classification of image scenes with inconsistent illumination or sharpness [25] or ad hoc feature engineering [22]. The use of synthetic landscapes allowed us to account for most, but not all, of the variability within our orthomosaic. Our analysis has shown that, despite being trained on historical data, our model could map contemporary forest cover in remote sensing data with similar spatial and spectral characteristics (Figure 7), suggesting that the classifier consistently works across both space and time. We acknowledge that the use of synthetic landscapes is limited by the available homogeneous areas to sample from and the number of classes. However, the latter should not be a constrained as previous research efforts have focussed on simple forest cover maps [1].

Long Term Changes in LULC and Aboveground Carbon
Our analysis shows that the majority of deforestation around Yangambi happened toward the late 1950s (~16,200 ha). Considerable reforestation has occured since the aerial survey was executed (~9918 ha), and socioeconomic instability prevented further large scale forest exploitation. In particular, many plantations have reached maturity, and forest has reestablished in previously cleared or disturbed areas. The majority of this reforestation takes the form of isolated patches of forest but is offset by further deforestation of previously untouched forest. Generally, the function and structure of forests can be influenced by forest edges that are located up to 1 km away; however, most effects are pronounced within the first 300 m from the edge [57]. Our analysis of edge effects on AGC has shown that the influence is negligible 200 m away from the edge. Phillips et al. [58] have shown similar weak responses to edge effects in the Amazon forest. Due to a lack of data on the extent (depth) of edge effects and their influence on AGC beyond 200 m, we did not include any estimates of carbon loss or gain within these zones. However, it must be stated that edges throughout the landscape make up a substantial area and account for 13,151 ha. Thus, edges could have a substantial negative [6] or positive [59] influence on AGC. Similarly, uncertainties in how to explicitly correct for plantations in the landscape present a further challenge. Similarly, variability across mixed forest plots used in scaling aboveground carbon estimates due to deforestation introduced additional uncertainty (see Appendix A Table A3). Thus, although our estimates are only indicative, they do underscore the important influence of landscape structure in carbon accounting. However, our findings do not indicate that deforestation in Congo Basin is declining, on the contrary.
Over the past half century, there has been a clear shift in land use in Yangambi (Figure 3). Land use has shifted away from a regular (fishbone) deforestation pattern that emerges when (large scale) agricultural interests dominate the landscape [60] to a more fragmented landscape ( Figure 3D). The former is consistent with historical land management at the time of the aerial survey [46]. These regular patterns reversed due to a decrease in large-scale intensive agriculture and an increase in ad hoc small-scale subsistence farming with large perimeter-to-area relationships (i.e., ragged edges). Consequently, edge effects in the current landscape are far more pronounced than in the historical scene.
Visual inspection of the images also suggests that reforestation within the historically cleared areas and experimental plots is not necessarily limited to areas far removed from more densely populated areas. For example, large reforested areas exist close to the Congo stream and Yangambi village itself ( Figure 3). Here, regional political components, such as land leases and large-scale ownership could have played a role in safeguarding some of these areas for rewilding or sustainable management [61,62]. Despite widespread anthropogenic influences throughout the tropics [31], the retention of these forested areas show the potential of explicit or implicit protective policy measures (e.g., INERA concessions, Bustillo et al. [46]) on a multi-decadal time scale. Reforestation in noncontinuous areas within Yangambi could increase landscape connectivity and could help increase biodiversity [12].
Our analysis therefore provides an opportunity to highlight and study those regions that have previously suffered confirmed long-term disturbances and those that have been restored since. Assessments of old plantations and recovering clear-cut forests can serve as a guide to help estimate carbon storage capacity and forest recovery rates in managed and unmanaged conditions [18,20,63] over the mid to long term in support of rewilding and general forest restoration [12,61,62]. In addition, mapping long-term edge effects can further support research into issues such as receding forest edges [57].

Canopy Structure and FOTO Texture Analysis
Finally, the FOTO technique used to quantify relationships between canopy structure and forest characteristics rendered no valuable insights of either the historical orthomosaic or recent Geo-Eye scene. Similarly weak correlations were found previously by Solórzano et al. [64]. In contrast, site-based texture metric statistics did show correspondence between historical and contemporary satellite imagery. None of them were either consistent or significant. Although visual interpretation shows distinctly different canopy structures (Figure 3), the differences in how resolution is defined and issues related to image quality prevented us from quantifying these further. Unlike large-scale studies by Ploton et al. [14], we could not scale this technique to historical data. The successful use of our CNN classification model on a contemporary remote sensing data does suggest that texture can be used to consistently capture canopy properties 60 years apart. Differences in PC between forest types (e.g., mono-dominant vs. mixed, Figure 9) corroborate that texture can serve as a basis for LULC classification. However, inflexibility on part of the FOTO technique in dealing with non-standardized (historical) data or scaling these results to AGC values limits its use case. We advise that future valorisation efforts should preferentially work from FOTO negatives (if available) to ensure optimal data quality in resolution, contrast, and sharpness.

Conclusions
Given the impact of tropical forest disturbances on atmospheric carbon emissions, biodiversity, and ecosystem productivity, accurate long-term reporting of LULCC is an imperative. Our analysis of historical aerial survey images (1958) of the Central Congo Basin provides a window into the state of the forest at the start of the anthropocene. The use of a CNN-based LULC classifier using synthetic landscapes-based image augmentation provides a robust semi-supervised solution which scales across space and time, even for image scenes with inconsistent illumination or sharpness. Combined with contemporary remote sensing data, we have shown that historical aerial survey data can be used to quantify long-term changes in LULC and AGC. We showed a shift from previously highly structured industrial deforestation of large areas for plantation purposes to discrete smallholder clearing for farming, increasing landscape fragmentation and opportunties for substantial regrowth. Efforts to quantify canopy texture features and their link to AGC had limited to no success. Our analysis provides insights into the rate at which deforestation and reforestation has taken place over a multi-decadal scale in the central Congo Basin. As such, it provides a useful historical context while interpreting past and ongoing forest research in the area. Zenodo: https://doi.org/10.5281/zenodo.3547767. All data not included in the latter repository can be found bundled with the analysis code as listed below. Proprietary datasets (i.e., Geo-Eye data) are not shared, but purchase order numbers allow for acquisition of these datasets to ensure reproducibility.

Code Availability
All analysis code is available as R/python [65] projects (https://khufkens.github.io/orthodrc and https://khufkens.github.io/orthodrc_cnn/). The analysis relied heavily on the "raster" [66], "RStoolbox" [67], and "landscapemetrics" [43] packages, while post-processing and plotting was facilitated by the "tidyverse" ecosystem [68], "ggthemes" [69], "scales" [70], and "cowplot" [71]. Additional plotting elements were formatted or provided by "sf" [72] and "rnaturalearth" [73] packages. I am grateful for the contributions to the scientific community by the developers of these packages.  Table A1. Flight path meta-data for the Isangi-Stanleyville aerial survey. Data provided consists of the flight path number, cardinal direction of the flight, the image numbers, and the duration of the flight provided by the start and end time of the acquisition. Data is sourced from Appendix A Figure A2 and the sensor logs recorded in the margin of acquired images (see Figure 1 main manuscript).