A Remote Sensing Approach to Understanding Patterns of Secondary Succession in Tropical Forest

: Monitoring biodiversity on a global scale is a major challenge for biodiversity conservation. Field assessments commonly used to assess patterns of biodiversity and habitat condition are costly, challenging, and restricted to small spatial scales. As ecosystems face increasing anthropogenic pressures, it is important that we ﬁnd ways to assess patterns of biodiversity more efﬁciently. Remote sensing has the potential to support understanding of landscape-level ecological processes. In this study, we considered cacao agroforests at different stages of secondary succession, and primary forest in the Northern Range of Trinidad, West Indies. We assessed changes in tree biodiversity over succession using both ﬁeld data, and data derived from remote sensing. We then evaluated the strengths and limitations of each method, exploring the potential for expanding ﬁeld data by using remote sensing techniques to investigate landscape-level patterns of forest condition and regeneration. Remote sensing and ﬁeld data provided different insights into tree species compositional changes, and patterns of alpha- and beta-diversity. The results highlight the potential of remote sensing for detecting patterns of compositional change in forests, and for expanding on ﬁeld data in order to better understand landscape-level patterns of forest diversity.


Introduction
Biodiversity is under unprecedented threat [1][2][3][4], driven largely by habitat loss [5][6][7]. Rapid changes in biodiversity [8][9][10] put ecosystem functioning at risk and jeopardize the essential services that we rely on [11]. While the Aichi global biodiversity targets [12] succeeded in bringing global attention to concerns about biodiversity, most of the targets set for 2020 were not met, highlighting the need for better communication, an improved knowledge of operational monitoring techniques, and action for biodiversity conservation.
Here we refer to biodiversity as the variety of life distributed heterogeneously across the Earth. We mainly focus on the variability of biodiversity in tropical forests [13]; with a focus on tree species richness and evenness, while considering compositional change over succession.
Although tropical forests are important biodiversity hotspots which support many endemic and threatened species [14][15][16][17][18][19], these habitats continue to experience record losses across the globe. Worldwide, the reduction in the extent of primary forest in 2019 was 2.8% higher than in previous years, the third highest loss since the turn of the century according to the latest Global Forest Watch report [20]. As the proportion of secondary and We substantiated our results in the context of Trinidad's Northern Range forests in order to understand the variation in the beta-diversity estimated from RS data, and its potential to distinguish between forests at different stages of succession. We then used this information to produce a map of forest age for the region. Our objectives are twofold: (i) to assess patterns of tree species diversity change over succession, and (ii) to compare the results generated from field and RS data. We address the strengths and limitations of using field and RS data, and identify complementarities between the two methods, in order to develop operational models of biodiversity monitoring.

Study Area and Field Plot Network
This study was conducted in Trinidad and Tobago, a twin-island nation in the Caribbean off the coast of Venezuela. Trinidad has a long history of cacao farming; it was the third largest cacao producer in the world in 1830, and the cacao industry dominated its economy from approximately 1866 to 1920 [61]. Though the cacao industry has declined and many old cacao estates have been abandoned, much of Trinidad's land has been transformed by cacao production and there are still many cacao farmers today (an estimated 3500 farmers in 2004 [61,62]).
Twenty-nine forest patches were surveyed on foot in the Northern Range of Trinidad (Table 1). Only 19 of these sites are used in this study because of persistent cloud cover, which precluded remote sensing analysis (Figure 1; see Section 2.3). Together, the sites represent around 100 years of secondary forest succession, including active cacao agroforests, secondary forests regenerating following cacao agroforest abandonment, and primary forest ( Figure 1). We used a chronosequence approach whereby the number of years since each cacao agroforest site was abandoned is used as a proxy for successional age. This approach assumes that secondary forests are transitioning through succession at a similar rate. The field sites were chosen based on expert opinion to account for variability in environmental variables (including matrix habitat type, altitude, land gradient, forest type, and land-use history in Supplementary Materials) and based on our ability to obtain reliable ages and land-use history information. The successional age and land-use history of each site was gathered from historical records, including those held at the National Archives of Trinidad and Tobago, and from local knowledge. Sites were all 1-5 ha, with over 50% canopy cover and trees over 5 m tall. The sites were spread throughout the southern slopes of the Northern Range (Figure 1), which are dominated by seasonal evergreen forest [63]. The boundaries of each site were determined from landmarks which are commonly used as boundary markers such as the crotons Cordyline fruticosa, and from local knowledge [64]. The field sites were chosen based on expert opinion to account for variability in environmental variables (including matrix habitat type, altitude, land gradient, forest type, and land-use history in Supplementary material) and based on our ability to obtain reliable ages and land-use history information. The successional age and land-use history of each site was gathered from historical records, including those held at the National Archives of Trinidad and Tobago, and from local knowledge. Sites were all 1-5 ha, with over 50% canopy cover and trees over 5 m tall. The sites were spread throughout the southern slopes of the Northern Range (Figure 1), which are dominated by seasonal evergreen forest [63]. The boundaries of each site were determined from landmarks which are commonly used as boundary markers such as the crotons Cordyline fruticosa, and from local knowledge [64]. All sites were over 0.8 km away from each other, and within 0.14 km of a larger tract of forest [65].
Active cacao agroforestry systems were planted primarily with cacao trees, with some shade trees such as mountain immortelle (Erythrina poeppigiana) and other crop trees such as mango (Mangifera indica) and citrus (Citrus sp.). The cacao trees were pruned so that they remained short and easy to harvest, and the ground vegetation and epiphytes on the cacao trees were often removed. The secondary forests were regenerating from abandoned cacao agroforests and were in the range of c.20-100 years post-abandonment. The primary forests had no record or indication of disturbance, and had not been disturbed in local memory. Primary forest sites were given a conservative age estimate of 200 years old for the regression analyses.

Survey Methods
The 29 forest sites were surveyed between November 2018 and August 2019. All of the trees (woody vegetation > 3 m tall and >6 cm DBH) within five metres of a 50 m transect line were identified to the species level [64]. Transects were randomly placed and Active cacao agroforestry systems were planted primarily with cacao trees, with some shade trees such as mountain immortelle (Erythrina poeppigiana) and other crop trees such as mango (Mangifera indica) and citrus (Citrus sp.). The cacao trees were pruned so that they remained short and easy to harvest, and the ground vegetation and epiphytes on the cacao trees were often removed. The secondary forests were regenerating from abandoned cacao agroforests and were in the range of c.20-100 years post-abandonment. The primary forests had no record or indication of disturbance, and had not been disturbed in local memory. Primary forest sites were given a conservative age estimate of 200 years old for the regression analyses.

Survey Methods
The 29 forest sites were surveyed between November 2018 and August 2019. All of the trees (woody vegetation > 3 m tall and >6 cm DBH) within five metres of a 50 m transect line were identified to the species level [64]. Transects were randomly placed and oriented within each site using a random number table. If a transect line reached an impassable obstacle, the path was diverted by 90 degrees. Samples and photographs were taken of the trees, and the species were re-identified by the National Herbarium of Trinidad and Tobago where necessary. Topographical and environmental data were collected, including altitude using a GPS, land gradient using an inclinometer, and canopy cover using a densiometer.

Satellite Imaging Surveys
Sentinel-2 (S2) is a constellation of two satellites launched by ESA in 2015 as part of the Copernicus Earth observation program. S2 acquires multispectral images in the optical domain (visible, near infrared, and short wave infrared), and the combination of spectral bands acquired by the multispectral sensors are especially designed for sensitivity Remote Sens. 2021, 13, 2148 6 of 20 to biophysical parameters of vegetation [66]. S2 ensures global coverage, and the repetition time of five days between each acquisition increases the odds of observing areas that are often obscured by cloud cover such as tropical forests.
Satellite images were accessed through the Copernicus open access hub (Copernicus Open Access Hub, 2020) in Level-1C products, which correspond to Top-Of-Atmosphere reflectance images in cartographic geometry and UTM/WGS84 projection. These Level-1C images were atmospherically corrected and converted into Level-2A images with the Overland software developed by Airbus [67]. All the plots surveyed in the field are located on a single S2 110 × 110 km 2 tile (T20PPS of the S2 tiling grid), allowing the use of a single image to observe all the plots at once. We used the image acquired on 13 February 2019 (wet season), which had <50% cloud cover and covered the largest number of plots.

Alpha-Diversity
To quantify tree species' alpha-diversity, we used the exponential form of the Shannon diversity measure, also known as Hill number of order 1 [68], hereafter named Shannon's D. Shannon's D was quantified using the following formula [69]: Here, S is the total number of species in the assemblage, and p i is the proportion of individuals of the ith species. Shannon's D (equivalent to expH) incorporates information on both species richness and evenness. Shannon's D reports the number of species that would be present at a site if all the species there were equally abundant. As such, it provides a readily interpretable measure of the evenness of species abundances at a site, in relation to the richness of that site. We made fair comparisons of estimates of Shannon's D for the field data using extrapolated rarefaction for sample size of 100 trees, to account for differences in sampling effort between sites (iNEXT package in r statistical software; [70,71]).

Beta-Diversity
We used Bray Curtis dissimilarity (BC) [72] as a measure of compositional dissimilarity between plots. BC is a commonly used index that quantifies the dissimilarity in species composition between two assemblages weighted by species abundances: where U i and V i are the abundances of the ith species in the first and second assemblage, respectively, and min (U, V) is the lowest abundance of the ith species (depending on whether the abundance is smaller in assemblage U or V), and including only species which are present in both assemblages [72,73]. BC is bound between 0 and 1, where higher values indicate greater compositional dissimilarity and lower values indicate greater similarity between sites.

Remote Sensing Analyses
An increasing number of approaches are being proposed to monitor plant biodiversity from remotely sensed images [36,74]. Supervised approaches are based on the training of models with remote sensing data and field data. However, most of these methods are not easily reproducible, as their reliability depends on the quality and abundance of the training data. Aiming for a more general and reproducible approach, we opted for a method requiring minimum supervision and no ground information for calibration. We used the R package biodivMapR [75], which includes an adaptation of the method initially developed by Féret and Asner [52] for the analysis of airborne imaging spectroscopy. This method aims to extract information corresponding to the spatial heterogeneity of spectral information, and to express it as diversity indices commonly used in ecology. It is based Remote Sens. 2021, 13, 2148 7 of 20 on a series of image pre-processing steps, followed by clustering, aimed at assigning a "spectral species" to each pixel. This population of spectral species intends to automatically discriminate among optically distinguishable functional types, defined as 'optical types' by Ustin and Gamon [76], and can then be inventoried for a given spatial unit in order to compute various diversity indices, including Shannon's D and BC. Finally, these diversity indices are mapped based on a fixed grid.
As the method uses spatial heterogeneity of spectral information to produce diversity indices, it is potentially sensitive to multiple factors extrinsic to vegetation properties that can impact the radiometric properties measured by a satellite. These factors include atmospheric perturbations such as cloud and haze, geometric and topographic effects leading to changes in illumination, and sensor artefacts such as Spectral Response Nonuniformity (SRNU) described in the Sentinel-2 Data Quality Report [77], among others. Therefore, proper image data pre-processing is crucial for relating spectral heterogeneity to vegetation diversity.
In our study, all available images showed significant cloud cover, which makes preprocessing and data filtering even more important. We first applied a series of radiometric filters in order to discard non-vegetated pixels (NDVI thresholding), as well as pixels suspected to be in shaded areas with an insufficient signal (NIR thresholding), or perturbed by atmospheric effects (blue thresholding). Then, we performed a Principal Component Analysis (PCA) from a correlation matrix on the remaining pixels, and selected components based on visual interpretation in order to discard components showing unwanted artefacts. Finally, we performed K-Means clustering, with 50 clusters, on the selected components.
The elementary surface units used to produce diversity maps depend on the image spatial resolution and on the ecological processes at work. Here, we produced alphaand beta-diversity maps over the S2 tile using 1ha elementary surface units. The map corresponding to alpha-diversity corresponds to the computation of Shannon's D for each of these elementary surface units. The map corresponding to beta-diversity corresponds to the computation of BC for a random selection of elementary surface units over the tile, followed by a Principal Coordinate Analysis (PCoA) [78] applied to the BC matrix in order to transpose the dissimilarity space into a 3-dimensional space. The generalization of the PCoA to all elementary surface units was then obtained by applying a nearest-neighbour procedure, enabling visualization (see Féret and Boissieu [75] for additional details). These diversity indices can also be computed from the pre-defined field plot network in order to use the exact spatial extent of the sites, and to enable proper comparison between ground observations and remotely sensed information for validation, as illustrated in the next section.

Spatial Sampling of Satellite Images
The computation of spectral diversity using the original field plots may result in difficulties when comparing diversity among plots due to heterogeneity in size and shape. Moreover, diversity indices derived from S2 data require a minimum number of pixels in order to compute reliable estimates of the diversity and spectral dissimilarity between forest plots. In order to overcome these difficulties, we applied a procedure designed to harmonize size and shape among plots.
First, we identified the pole of inaccessibility for each field plot, corresponding to the point maximizing the distance from all borders ( Figure 2). The average distance between the pole of inaccessibility and the closest plot boundary was 47 m. Then we defined a buffer around each pole of inaccessibility, leading to circular plots of a 60 m radius, corresponding to 1.13 ha and 113 pixels in the raster data. This buffer size offered the best trade-off between the number of pixels and the spatial match with original plot boundaries. This procedure ensured consistency in the spatial units used, which is especially important when using indices that are sensitive to sample size such as Shannon's D. point maximizing the distance from all borders ( Figure 2). The average distance betwe the pole of inaccessibility and the closest plot boundary was 47 m. Then we defined buffer around each pole of inaccessibility, leading to circular plots of a 60 m radius, corr sponding to 1.13 ha and 113 pixels in the raster data. This buffer size offered the best trad off between the number of pixels and the spatial match with original plot boundaries. Th procedure ensured consistency in the spatial units used, which is especially importa when using indices that are sensitive to sample size such as Shannon's D. Figure 2. Illustration of the field plot LHC defined by its original contour (gray boundaries), the corresponding inaccessibility pole and 60 m radius circular buffer, and the 1 ha sampling unit de fined in biodivMapR when mapping diversity indices over the whole image. The background co responds to the RGB color composite of the original S2 image with 10 m spatial resolution.

Alpha and Beta-Diversity Analyses
The field and RS Shannon's D results were compared using a Spearman correlatio coefficient, which computes the correlation of two variables based on their rank [79].
A 0.5 quantile regression [80] was used to further test whether BC changes with i creasing age difference between sites. Quantile regressions were used because they a more robust to unequal variance and outliers than standard linear regressions [81,82]. W used a tanglegram dendextend package [83] to compare the extent of concordance b tween the field and RS-derived BC results. To do this, we created a dendrogram for t field BC results and for the RS-derived BC results using agglomerative nesting, hiera chical clustering, agnes function in cluster package R, [84] and the Ward method. Differe methods within the untangle function in dendextend were used to optimise the alignme between the two dendrograms. The degree of agreement between the two dendrogram was expressed as the entanglement metric, where zero indicates no entanglement and o is full entanglement. A cophenetic correlation coefficient was also used to assess the alig ment between the two dendrograms (cor_cophenetic function R; Spearman). Cophene correlation coefficient is the correlation between the cophenetic distance matrices of t

Alpha and Beta-Diversity Analyses
The field and RS Shannon's D results were compared using a Spearman correlation coefficient, which computes the correlation of two variables based on their rank [79].
A 0.5 quantile regression [80] was used to further test whether BC changes with increasing age difference between sites. Quantile regressions were used because they are more robust to unequal variance and outliers than standard linear regressions [81,82]. We used a tanglegram dendextend package [83] to compare the extent of concordance between the field and RS-derived BC results. To do this, we created a dendrogram for the field BC results and for the RS-derived BC results using agglomerative nesting, hierarchical clustering, agnes function in cluster package R, [84] and the Ward method. Different methods within the untangle function in dendextend were used to optimise the alignment between the two dendrograms. The degree of agreement between the two dendrograms was expressed as the entanglement metric, where zero indicates no entanglement and one is full entanglement. A cophenetic correlation coefficient was also used to assess the alignment between the two dendrograms (cor_cophenetic function R; Spearman). Cophenetic correlation coefficient is the correlation between the cophenetic distance matrices of the two trees. Cophenetic correlation is bound between −1 and 1, where values closer to 0 indicate that there is no significant correlation between the two dendrograms.

Mapping Forest Age
While land cover maps that include different forest types have already been produced for this region [85], in this particular study we also explored the possibility of using field data in conjunction with satellite data to produce a map of forest successional age. We hypothesized that part of the variation in the spectral species produced from S2 data is linked to the successional age of each forest site. Therefore, we evaluated the relationship between PCoAs from the S2-based BC and forest age in the 19 surveyed forest sites, and used the most relevant axis to generate a map of forest ages in the region of interest.

Alpha Diversity
There was a significant correlation between Shannon's D and forest age using the field data (r s = 0.50, p = 0.03; Spearman; Figure 3A), but no correlation between remotely sensed Shannon's D and forest age (r s = 0.05, p = 0.85; Spearman; Figure 3B). There was Remote Sens. 2021, 13, 2148 9 of 20 also no correlation between remotely sensed and field-derived Shannon's D (r s = −0.37, p = 0.21; Spearman). and used the most relevant axis to generate a map of forest ages in the region of interest.

Alpha Diversity
There was a significant correlation between Shannon's D and forest age using the field data (rs = 0.50, p = 0.03; Spearman; Figure 3A), but no correlation between remotely sensed Shannon's D and forest age (rs = 0.05, p = 0.85; Spearman; Figure 3B). There was also no correlation between remotely sensed and field-derived Shannon's D (rs = −0.37, p = 0.21; Spearman).

Bray-Curtis Dissimilarity
The BC values quantified from field inventories and from the S2 image are presented as pairwise dissimilarity matrices (Figure 4). In the field-based data ( Figure 4A), each primary site was compositionally unique (BC > 0.98 between primary sites and all other sites; BC = 0.84 between primary sites). The actively managed sites were all more compositionally similar to each other than to the secondary and primary forest sites ( ̅̅̅̅ = 0.58 between all active plots, ̅̅̅̅ = 0.76 across all sites).
The pairwise BC derived from the S2 analysis ( Figure 4B) showed different patterns from those derived from field inventories. Active sites were strongly dissimilar to all other age categories, particularly with >80 yo sites ( ̅̅̅̅ > 0.95). Dissimilarity between active sites and other sites tended to increase with increasing age difference. Secondary forest sites

Bray-Curtis Dissimilarity
The BC values quantified from field data and from the S2 image are presented as pairwise dissimilarity matrices (Figure 4). In the field-based data ( Figure 4A), each primary site was compositionally unique (BC > 0.98 between primary sites and all other sites; BC = 0.84 between primary sites). The actively managed sites were all more compositionally similar to each other than to the secondary and primary forest sites (BC = 0.58 between all active plots, BC = 0.76 across all sites).
The pairwise BC derived from the S2 analysis ( Figure 4B) showed different patterns from those derived from field data. Active sites were strongly dissimilar to all other age categories, particularly with >80 yo sites (BC > 0.95). Dissimilarity between active sites and other sites tended to increase with increasing age difference. Secondary forest sites exhibited lower dissimilarities (BC = 0.48 for all plots over 80 yo), with the lowest values found between plots of 100 yo and older (BC = 0.41).
A 0.5 quantile regression showed there was a significant increase in compositional change (BC) ( Figure 5) with increasing age difference between sites for both the remote sensing data (t (169, 171) = 5.18, p < 0.001) and the field data (t (169, 171) = 6.70, p < 0.001).  A 0.5 quantile regression showed there was a significant increase in compositional change (BC) ( Figure 5) with increasing age difference between sites for both the remote sensing data (t(169, 171) = 5.18, p < 0.001) and the field data (t(169, 171) = 6.70, p < 0.001). A tanglegram was used to further explore the concordance in patterns of BC between the field and RS data. The tanglegram ( Figure 6) shows the dendrograms for the field data (left) and RS data (right). Corresponding sites are connected by gray lines, and sites are labeled according to their age category: (A) active cacao agroforests, (B) 25-50 yo, (C) 60-80 yo, (D) > 100 yo, and (E) primary forest. Although the two dendrograms align well (entanglement = 0.15 using the step2side method), there is no significant correlation between the topologies of the field and RS dendrograms (cophenetic correlation = 0.08).   A tanglegram was used to further explore the concordance in patterns of BC between the field and RS data. The tanglegram ( Figure 6) shows the dendrograms for the field data (left) and RS data (right). Corresponding sites are connected by gray lines, and sites are labeled according to their age category: (A) active cacao agroforests, (B) 25-50 yo, (C) 60-80 yo, (D) > 100 yo, and (E) primary forest. Although the two dendrograms align well (entanglement = 0.15 using the step2side method), there is no significant correlation between the topologies of the field and RS dendrograms (cophenetic correlation = 0.08). A tanglegram was used to further explore the concordance in patterns of BC between the field and RS data. The tanglegram ( Figure 6) shows the dendrograms for the field data (left) and RS data (right). Corresponding sites are connected by gray lines, and sites are labeled according to their age category: (A) active cacao agroforests, (B) 25-50 yo, (C) 60-80 yo, (D) > 100 yo, and (E) primary forest. Although the two dendrograms align well (entanglement = 0.15 using the step2side method), there is no significant correlation between the topologies of the field and RS dendrograms (cophenetic correlation = 0.08).

Ordination of Bray Curtis Dissimilarity Computed from Sentinel-2
We used PCoA ordination to compute the position of each site in three-dimensional space for the RS data. In this representation, the distance between two sites is based on BC. Figure 7 shows the values for the first two dimensions of the PCoA when the BC dissimilarity matrix was computed using only the pixels inside the original plot boundaries ( Figure 7A). The first component of the PCoA consistently appeared as a relevant indicator of age category among plots, which is consistent with the age gradient observed on the dissimilarity matrix ( Figure 4B). The combination of the first two PCoA dimensions revealed three main groups of points: (1) active with low values of pco1, (2) secondary forest plots with age values ranging from 25 to 80 yo, (3) a mix of some recovering plots and all older plots with high values of pco1. As the PCoA is a data-driven process, a second analysis was performed following a scenario designed to test the consistency of our results when using dissimilarity maps produced with biodivMapR over the full image; we selected the 1 ha elementary surface units, including the inaccessibility pole corresponding to each field plot, in order to observe their distribution along the two first components of the PCoA (Figures 2 and 7B). The distribution of the field plots along these two components was complemented with 300 random 1 ha elementary surface units. This distribution showed the same trend with an age gradient following pco1, confirming that the observations obtained from the analysis of field plots with only maximum consistency in spatial footprint are still valid when analyzing a full S2 tile.

Ordination of Bray Curtis Dissimilarity Computed from Sentinel-2
We used PCoA ordination to compute the position of each site in three-dimensiona space for the RS data. In this representation, the distance between two sites is based on BC. Figure 7 shows the values for the first two dimensions of the PCoA when the BC dis similarity matrix was computed using only the pixels inside the original plot boundarie ( Figure 7A). The first component of the PCoA consistently appeared as a relevant indica tor of age category among plots, which is consistent with the age gradient observed on the dissimilarity matrix ( Figure 4B). The combination of the first two PCoA dimension revealed three main groups of points: (1) active with low values of pco1, (2) secondary forest plots with age values ranging from 25 to 80 yo, (3) a mix of some recovering plot

Mapping Forest Age
The first component of the PCoA, applied to the BC indices computed from the satellite image, showed a strong correlation with forest age (R 2 = 0.89, Figure 7B). In order to take advantage of this relationship, we used a simple linear model to estimate forest age over the whole image ( Figure 8). However, because the high values of the first component (pco1) for 100-year-old secondary forests and 200-year-old primary forests overlap at approximately 0.4 (the highest values found over the image) the maximum forest age estimated was Remote Sens. 2021, 13, 2148 12 of 20 120 years old. More than 20% of the region was estimated to consist of forests in early successional stages (less than 10 years old). The young secondary forests were generally estimated to be located on the south and west of the Northern Range, whereas most of the northern region of the island was classified as older forest (Figure 9).

Mapping Forest Age
The first component of the PCoA, applied to the BC indices computed from the satellite image, showed a strong correlation with forest age (R² = 0.89, Figure 7B). In order to take advantage of this relationship, we used a simple linear model to estimate forest age over the whole image ( Figure 8). However, because the high values of the first component (pco1) for 100-year-old secondary forests and 200-year-old primary forests overlap at approximately 0.4 (the highest values found over the image) the maximum forest age estimated was 120 years old. More than 20% of the region was estimated to consist of forests in early successional stages (less than 10 years old). The young secondary forests were generally estimated to be located on the south and west of the Northern Range, whereas most of the northern region of the island was classified as older forest (Figure 9).

Mapping Forest Age
The first component of the PCoA, applied to the BC indices computed from the sat ellite image, showed a strong correlation with forest age (R² = 0.89, Figure 7B). In order t take advantage of this relationship, we used a simple linear model to estimate forest ag over the whole image ( Figure 8). However, because the high values of the first componen (pco1) for 100-year-old secondary forests and 200-year-old primary forests overlap at ap proximately 0.4 (the highest values found over the image) the maximum forest age esti mated was 120 years old. More than 20% of the region was estimated to consist of forest in early successional stages (less than 10 years old). The young secondary forests wer generally estimated to be located on the south and west of the Northern Range, wherea most of the northern region of the island was classified as older forest (Figure 9).

Discussion
There was a significant increase in the field-based Shannon's D with forest age [64], but no evidence of a significant relationship between RS-based Shannon's D and forest age. Overall, the field and RS data showed similar patterns of beta-diversity, with significantly greater turnover in species composition as the age difference between sites in-

Discussion
There was a significant increase in the field-based Shannon's D with forest age [64], but no evidence of a significant relationship between RS-based Shannon's D and forest age. Overall, the field and RS data showed similar patterns of beta-diversity, with significantly greater turnover in species composition as the age difference between sites increased. Finer-scale patterns of beta-diversity differed between the field and RS results though, and pairwise comparisons between sites yielded different degrees of compositional dissimilarity. While the field results showed primary forests to be compositionally unique and active sites as being compositionally similar to other active sites, the RS results better captured the turnover in composition with age difference between sites. The RS did not distinguish primary sites from 100 yo sites, while the field results did. There was no overall correlation between the alpha-or beta-diversity results of the two datasets. However, the field plots' historical records made it possible to interpret RS-derived BC to produce a forest age estimate over the whole region.

Alpha and Beta-Diversity Comparison
In agreement with the field results, other studies indicate that tree diversity tends to increase with forest age as species naturally colonize abandoned areas [24,[86][87][88]. The RS data may not be able to capture these changes in alpha-diversity because of local spectral diversity being more strongly driven by the structure and density of vegetation than by the diversity of tree species. While other studies have had success applying hyperspectral, high-resolution data to alpha-diversity estimates through remote sensing [54], here the spatial and spectral resolution may be too coarse to capture the fine variation of spectral signatures of the species variability at a local scale (Figure 3) [89]. RS analyses at this scale may be more closely related to functional diversity of photosynthetic traits at the community scale, [43,90,91] or to forest structure, [57,92] rather than to tree species diversity. Furthermore, in this case, it may also be possible that the species present are not spectrally different enough to be accurately detected by Sentinel-2 [93]. The RS results may also be improved by an increased number of sites.
Both the field and RS results showed considerable turnover in species composition with forest age (Figure 4). This aligns with other studies which have found that species composition changes over succession, and that it can take centuries for tree species composition in secondary forests to converge with that of primary forests [22,59,94,95]. Primary sites are identified as particularly compositionally unique in the field data, indicating that species composition in secondary forests has not recovered old-growth characteristics even after c. 100 years of succession [64].
The disparity in the patterns of compositional change over time detected by the field and RS data implies that these methods prioritise different aspects of the vegetation assemblages ( Figure 6). The field data may fail to distinguish some patterns of species turnover over succession due to younger saplings being recorded alongside mature trees, remnant cacao and agricultural trees persisting in the lower canopy layers of older forests, stochasticity in successional trajectories, and canopy gap dynamics in older forests [57,96,97]. The height and DBH thresholds for the field study were 3 m and 6 cm, respectively, in order to account for all the cacao trees, which often have many thin stems and are pruned to be short for easy cultivation. These thresholds are small compared to most tree studies [54,98] and many saplings were recorded in the field. As a result, data in the field presented differences when compared to RS data, which mostly detected the canopy trees. Differences in the canopy tree composition and the sapling composition could be diminishing the distinction between younger and older secondary forest sites for the field data, driving the differences in the field and RS results.
Overall, successional processes, including changes in species composition, are highly variable and can be affected by many different factors, including: the composition of nearby forests, the condition of the soil seed bank, the amount and type of seed rain, local environmental conditions, and other abiotic factors [22,57,60,[99][100][101][102][103][104]. Cacao farming itself can alter successional processes. Abandoned cacao agroforests already have canopy cover from the remnant cacao and shade trees which can affect local abiotic conditions, the behavior of animal seed dispersers, and the tree species that are able to colonize [22,92,105]. Cacao trees can live for over 100 years, and still occur in the understory of the oldest secondary forests in this study.
Another subject for consideration is that these analyses were based on relatively few sites. The number of field sites was restricted to those which we had permission to access, were able to access without natural barriers or impassable terrain, and where land-use history information could be obtained. The number of sites was further limited by cloud cover, which obscured some of the field sites in the RS image. It could be that comparing more sites would reveal a stronger trend in biodiversity change over secondary succession, and stronger correlations between the field and RS results.

Mapping Forest Age
The forest age map generated from the remote sensing BC (Figure 9) shows that old-growth forest is primarily found in the north-east of the Northern Range mountains, while the southern slopes have a higher proportion of agroforest and young secondary forest. Even though the data used to inform this map were limited, and we did not have additional data with which to validate our map, these results are consistent with local knowledge. The south-eastern stretches of the Northern Range are more easily accessible and densely populated, [85] and it is likely that these areas have experienced greater human modification and disturbance.
Land cover maps produced using remote sensing and field data can deepen our understanding of both local and landscape-level ecological processes and patterns of biodiversity. For example, the matrix surrounding an agroforest can mediate its value as habitat, and the surrounding landscape can influence the speed and trajectory of secondary forest succession [100]. The ability to produce maps such as the one presented here demonstrates the potential of using remote sensing in conjunction with field data for ecological and conservation purposes, in order to assess the biological significance of certain locations and predict how ecosystems may change over time.

Conclusions
This study highlights the potential benefits of using RS to measure tropical forest biodiversity, as well as its limitations. We did not find direct correlations between diversity indices based on field and RS data. However, RS-derived BC shows potential for the identification of a gradient of regeneration in abandoned cacao agroforests, as there is clear compositional dissimilarity between active cacao agroforests and secondary forests at different stages of succession. Combining accurate but laborious field data with remote sensing methods allows us to scale up our results. With RS we can extrapolate field data to assess landscape-level patterns of biodiversity and forest regeneration. RS-based indices may allow us to explore aspects of biodiversity and forest cover change that are complementary to field studies in areas that would be otherwise inaccessible. Finally, spectral diversity indicators derived from Sentinel-2 multispectral images show strong potential for monitoring biodiversity while assessing landscape level patterns. RS provides an important complementarity for improving biodiversity monitoring at the landscape level. This approach represents an opportunity to build upon these operational methods, supporting forest management practices and environmental conservation policies.