Mapping Natural Forest Remnants with Multi-Source and Multi-Temporal Remote Sensing Data for More Informed Management of Global Biodiversity Hotspots

: Global terrestrial biodiversity hotspots (GBH) represent areas featuring exceptional concentrations of endemism and habitat loss in the world. Unfortunately, geospatial data of natural habitats of the GBHs are often outdated, imprecise, and coarse, and need updating for improved management and protection actions. Recent developments in satellite image availability, combined with enhanced machine learning algorithms and computing capacity, enable cost-efficient updating of geospatial information of these already severely fragmented habitats. This study aimed to develop a more accurate method for mapping closed canopy evergreen natural forest (CCEF) of the Eastern Arc Mountains (EAM) ecoregion in Tanzania and Kenya, and to update the knowledge on its spatial extent, level of fragmentation, and conservation status. We tested 1023 model possibilities stemming from a combination of Sentinel-1 (S1) and Sentinel-2 (S2) satellite imagery, spatial texture of S1 and S2, seasonality derived from Landsat-8 time series, and topographic information, using random forest modelling approach. We compared the best CCEF model with existing spatial forest products from the EAM through independent accuracy assessment. Finally, the CCEF model was used to estimate the fragmentation and conservation coverage of the EAM. The CCEF model has moderate accuracy measured in True Skill Statistic (0.57), and it clearly outperforms other similar products from the region. Based on this model, there are about 296,000 ha of Eastern Arc Forests (EAF) left. Furthermore, acknowledging small forest fragments (1–10 ha) implies that the EAFs are more fragmented than previously considered. Currently, the official protection of EAFs is disproportionally targeting well-studied mountain blocks, while less known areas and small fragments are underrepresented in the protected area network. Thus, the generated CCEF model should be used to design updates and more informed and detailed conservation allocation plans to balance this situation. The results highlight that spatial texture of S2, seasonality, and topography are the most important variables describing the EAFs, while spatial texture of S1 increases the model performance slightly. All in all, our work demonstrates that recent developments in Earth observation allows significant enhancements in mapping, which should be utilized in areas with outstanding biodiversity values for better forest and conservation planning.


Introduction
Global terrestrial biodiversity hotspots (GBH) represent areas featuring exceptional concentrations of endemism and habitat loss in the world [1]. The 36 GBHs recognized today harbor (EAFs) [37,[43][44][45]. More recently, Newmark and McNeally [39] evaluated the amount and distribution of natural forest fragments, reporting a higher number of smaller fragments across the EAM. However, the spatial distribution of small remnants, as well as open gaps within large intact forest patches, remains unknown in the EAM and their role in promoting connectivity and significance as nodal points for habitat regeneration and rehabilitation has been unaccounted [19,39,46].
In this study, we developed a high-resolution (10 m) closed canopy evergreen natural forest (CCEF) model (generated from 10-30m data) across the EAM through testing optimal combinations of multi-temporal and multi-sensor satellite imagery and ancillary geospatial data. The CCEF model was then compared to previous spatial estimates of the EAF and used to evaluate the conservation status of the natural forests in the EAM. Results were discussed in the light of conservation and management actions. Furthermore, we discussed the need and potential of similar mapping approaches in updating the spatial information of other invaluable tropical forest ecosystems located in generally data-scarce Global South.

Study Area
The Eastern Arc Mountains consist of 13 separate mountain blocks that stretch geographically from southwest Tanzania to south Kenya ( Figure 1). The mountains share similar geology and rainfall patterns, governed by the Indian Ocean climatic regime, supporting humid montane forest habitats [38,47]. The EAM area holds outstanding biodiversity value, globally ranked among the most important areas for the conservation of endemic birds and plants [11,42,48], and hosting at least 96 endemic vertebrate species [38] and four endemic or near-endemic primate species [49,50]. At the same time, millions of people depend for their livelihood on these tropical mountain forest ecosystems that provide a source of water, wood for fuel and construction, charcoal, handicrafts, and medicines [51,52]. Deforestation and forest degradation are the biggest threats to EAFs. Fortunately, a majority of the remaining EAFs are protected via government and private protection areas, and recently many of the nature and forest reserves have been coded into the International Union for Conservation of Nature (IUCN) system [53]. Nevertheless, approximately 25% of the EAFs are located outside of the protected areas [43,47,52].

Methodological Design
The CCEF model was based on freely available satellite images and geospatial data sets. An overview of the methodological design is illustrated in Figure 2. The approach was divided into six phases: (1) Data sets' acquisition and preprocessing, (2) training data collection, (3) value extraction and variable grouping, (4) random forest model development, (5) validation and map comparison, and (6) fragmentation and protection status. The sections below describe the phases in more detail. Figure 2. Illustration of methodological design and its six phases. In phase 1, GRD (ground range detected), GLCM (gray-level co-occurrence matrix), and NDVI (normalized difference vegetation index). In phase 4, ntree (number of trees) and mtry (number of variables per node). In phase 5, CCEF (closed canopy evergreen natural forest). In phase 6, ENN (Euclidean nearest neighbor) distance and WDPA (World Database of Protected Areas).

Data Sets and Preprocessing
As optical Earth observation data, we used Sentinel-2 Multi-Spectral Instrument (MSI) imagery with spatial resolution of 10-20 m [54]. We screened visually the entire catalogue of Sentinel-2 imagery available for the study area to extract only images with low cloud coverage on the EAFs, as image pre-selection has shown to improve the quality of pixel-based composites [55]. Due to frequent cloud cover in the study area, we had to expand our image acquisition for multiple years (2016-2018). Altogether, 45 Sentinel-2 top-of-atmosphere (TOA) Level-1C images (Supplementary Materials, Table S1) were downloaded from the Copernicus Open Access Hub of European Space Agency (ESA) and atmospherically corrected with Sen2Cor processor. Sen2Cor algorithm associates image-derived atmospheric properties with the precomputed look-up tables [56]. Default settings of Sen2Cor were used. The corrected images were imported to GEE and masked for clouds and shadows using the available quality band of Sentinel-2 and blue band thresholding. To avoid remaining cloud and shadow effects, a median pixel composite was calculated from the images and all the visible (B2-B4), near-infrared (NIR), red edge (B8, B8a, B5-B7), and shortwave infrared (SWIR) bands (B11 and B12) were extracted for further analysis as these bands have shown to distinguish effectively spectral variance in forest canopies [57,58].
As radar data, we used the available Sentinel-1 SAR-C Level-1 Interferometric Wide (IW) ground range detected (GRD) dual-polarized VV (vertical transmit-vertical receive), VH (vertical transmithorizontal receive) imagery with spatial resolution of 10 m, which were acquired from the Copernicus Open Access Hub [59]. Sentinel-1 bands are sensitive to surface structure and moisture content and have shown useful in forest coverage and structure mapping [60]. Altogether, 33 images were retrieved from 28 September 2008 -30 September 2017 (Supplementary Materials, TableS1) since September represents the dry period throughout the study area with similar conditions considering vegetation phenology and land use practices. The number of images covering each block varied from three to five depending on the availability of images. The preprocessing of the SAR images was conducted in Sentinel Application Platform (SNAP) [61] and included thermal noise removal, image border noise removal, image calibration, radiometric and geometric corrections, and speckle filtering in order to derive gamma-naught backscatter values (dB) [62]. We used the default settings in each processing step. The images were imported to GEE and masked for layover and shadow. A mean pixel mosaic was calculated for the whole study area and VV and VH bands were extracted for further analysis.
Time series of Earth observation products have shown to capture seasonal trends of land cover features and improve land cover classifications [63]. Our aim was to derive annual phenological response of vegetation to discriminate between the closed canopy evergreen forest and other type of vegetation. We used Landsat-8 Operational Land Imager (OLI) Level 1A surface reflectance products to calculate Normalized Difference Vegetation Index (NDVI) for the entire Landsat-8 epoch in GEE. The NDVI time series was then fitted with a harmonic model using Fourier transform coefficients of first order constituting in simple phenological cycles [64]. Four seasonal variables (magnitude, phase, mean, and standard deviation) were derived from the fitted NDVI model.
Texture measures of satellite images are sensitive to forest structure (variation in crown size, crown shadow, spacing, and cover) [33]. We generated texture indices for Sentinel-1 and Sentinel-2 bands to assess the ability of the spatial texture to identify closed canopy evergreen forest. We chose three metrics from a descriptive group (mean, variance, and correlation), three metrics from a contrast group (contrast, dissimilarity, and homogeneity), and two metrics from an orderliness group (angular second moment and entropy), calculated based on gray-level co-occurrence matrix (GLCM) [65] as these texture metrics have shown to capture forest structure efficiently [33,66]. We used the default parameters in GEE with 3X3 pixel window size, resulting in four GLCM with offsets (−1,−1), (0,−1), (1,−1), and (−1, 0). The directional bands were averaged resulting in one average GLCM from which the statistics were derived.
Topographic variables have shown to improve models of forest cover [67]. We used global shuttle radar topographic mission (SRTM) digital elevation model (DEM) to derive altitude, slope, and aspect at 30-m resolution.

Training Data Collection
The 3000 training samples were randomly distributed to the landscape constrained by the area of EAM blocks [47] and stratified based on tree coverage percentage   [68] and aspect [69] (north, east, south, west) to cover the variance within the natural forests in the samples (Supplementary Material, Table S2). Since our aim was to map closed canopy natural forests, areas with less than 50% of tree cover were neglected. We used Collect Earth tool of the Open Foris suite to manage the collection of the samples [70]. The Collect Earth tool allows automatic investigation of Google Earth, Bing Maps, and GEE from the sample location. The collection was done by interpreting VHR imagery in Google Earth and Bing Maps. The interpretation was conducted using 30-m grid cells to account for the mislocation of different data sets. We defined the natural forest as evergreen forest with canopy closure at least 75%, as we wanted to focus on the montane forests and rule out the semideciduous woodlands. Due to cloud cover, low image quality, and lack of recent image availability, we neglected 298 samples, resulting in a total of 2702 samples, of which 1081 represented natural forests. These samples were used as dependent variables in the modelling. ( Figure S1)

Value Extraction
Altogether, 115 variables were extracted from the training sample locations (Table 1) using the original resolution of each satellite data set.

Variable Grouping
Our aim was to test the best variable combination out of all possible combinations. However, testing all the possible variable combinations of the 115 independent variables would have led to billions of different combinations. To avoid this, the independent variables were grouped into 10 groups (Table 1) describing different properties of the vegetation. Grouping was considered a suitable approach to test all the possible multi-sensor combinations. Although within group multicollinearity complicates the estimates of variable importance with this approach, it does not influence the model predictions.
The Sentinel-2 median mosaic bands formed optical variables group containing 10 variables. The two bands from Sentinel-1 mean mosaic formed a microwave variables group. For Sentinel-2 bands, the calculated texture metrics were grouped based on the texture metrics groups described in Section 2.2.2, leading to three groups: Descriptive texture of optical, contrast texture of optical, and orderliness texture of optical. Similarly, the three texture groups were formed from Sentinel-1 bands. The four seasonal variables of NDVI time series formed seasonal group and altitude, slope, and aspect formed topography group.

Random Forest Model Development
Modelling was conducted with a random forest (RF) machine learning algorithm. RF has been widely used in forest modelling based on Earth observation data [33,34,71,72] for its nonparametric nature and capability to handle dimensionality, multicollinearity, and overfitting [73,74]. RF is an ensemble learning algorithm developed to improve the classification and regression (CART) method by building multiple decision trees simultaneously [75]. The final model prediction is made based on averaging the outcomes of all produced trees [74,75]. The main limitation of RF is its black box nature as split rules are unknown [73]. RF models have three main parameters: Number of regression trees (ntree), number of independent variables tested at each node (mtry), and minimum size of terminal nodes (node size) [73,75]. RF models were executed in two different platforms: Model development and cross-validation were conducted in R (version 3.2.2) with randomForest package, due to R's statistical capabilities, while the final model was built in GEE.
The training samples (natural forest/other vegetation) were used as dependent variables in the modelling and all the 115 variables, organized in 10 variable groups, were used as independent variables. Model development included two main steps: Evaluation of best group combinations and parameter optimization. In the first step, the model performance was tested with each possible 1023 variable group combinations using default settings (ntree 500, mtry n/3). In the second step, ntree and mtry were optimized for the best variable group combination from step one [76]. We tested all mtry settings between 1 and 40, and the following numbers of trees: 100, 200, 500, 1000, 1500, 2000, 3000, and 5000. Both assessments were based on cross-validation with repeated leave-p-out crossvalidation method where 75 % of the sample observations was randomly assigned to training data and the remaining 25% was used as test data. To reduce the effects of permutation, this process was repeated 30 times.
The results were evaluated with the True Skill Statistic (TSS) as it is independent of prevalence, and thus a more reliable estimator compared to kappa [77]. For presence-absence model, TSS is calculated as: where a is true positives, b is false positives, c is false negatives, and d is true negatives. For calculating the TSS, model predictions were divided into binary presence-absence data, which was done by selecting a threshold that would maximize the sum of sensitivity and specificity. In general, TSS values of 0.2-0.4 can be considered low, 0.4-0.6 moderate, 0.6-0.8 substantial, and 0.8-1.0 perfect [78].
As multicollinearity was not removed from the model, the relative importance of each variable or variable group could not be evaluated with traditional means. Instead, we used two approaches to evaluate the importance of variable groups in the models. Firstly, we calculated the frequency of each variable group in all the 1023 models and ranked based on TSS to find out the variable groups used most frequently in the best performing models. Secondly, we calculated the alone and drop contributions of each variable group used in the best model [79,80]. The alone contribution refers to TSS result of each variable group used alone, while drop contribution indicates the decrease in TSS when the variable group was removed from the final model.
The final model, developed based on the best variable group combination and model parameters, was built in GEE to utilize its cloud computing capabilities [35]. The classification target resolution was set to 10 m, meaning that the data being classified gets automatically resampled to 10m resolution with nearest-neighbor method [72].

Validation and Map Comparison
Additional, independent assessment was conducted to estimate the accuracy of the CCEF model. Before the assessment, we removed the isolated pixels by running a majority filter using four neighboring cells. Furthermore, we removed manually all forest patches smaller than 1 hectare in size due to their lower potential biological value [39,81,82] and higher potential inaccuracies [22]. To evaluate the new information on EAFs gained by the CCEF model we also used the accuracy assessment samples to compare the CCEF model to two other products estimating the coverage of Eastern Arc Forests: Eastern Arc Forest Baseline [45] and Forest Fragments of Eastern Arc [39] data sets. The three forest cover layers were stacked as one forest layer and the samples were distributed according to CCEF model coverage within the stacked forest layer. In total, 2076 samples were randomly distributed to each block according to the stacked forest layer coverage per block (Supplementary Material, TableS3). The samples were interpreted through VHR imagery in Google Earth with a similar setup as in the training sample collection. (Figure S3)

Fragmentation and Protection Status
To evaluate the fragmentation of the forest area we calculated number of patches (NP), patch area, mean, and standard deviation of Euclidean nearest neighbor distance between patches using Fragstats version 4 [83]. Furthermore, we estimated the protection status of the EAFs by overlaying the Geographic Information System (GIS) data of protected areas of World Database of Protected Areas (WDPA) with the CCEF model [53].

Variable Selection and Parameter Adjustment
The most frequent variable groups in the best performing models were seasonality, descriptive texture of optical, and topography ( Figure 3). The best model (TSS 0.641) used six variable groups: Optical, descriptive texture of optical, contrast texture of microwave, orderliness texture of microwave, seasonality of optical, and topography. These variable groups showed different contributions in the best model. Descriptive texture of optical (90.5%), optical (80.6%), and seasonality (71.6%) showed the highest explanatory power based on alone contributions ( Figure 4A), while seasonality (5.5%), descriptive texture of optical (5.3%), and topography (3.0%) had the highest importance in the drop contribution test ( Figure 4B). The high and low contributions of optical group in alone and drop contributions test, respectively, indicated that the explanatory power of optical group overlapped with other variable group(s), and thus provided relatively minor added value to the final model. On the other hand, the low and relatively high contributions of topography group in alone and drop contributions test, respectively, indicated that the topography-related variables had poor explanatory power alone, but they bring significant value to the final model when combined with the other groups. Furthermore, the used microwave variable groups, descriptive texture of microwave and orderliness texture of microwave, had relatively modest alone and drop contributions. . Each variable group was used in 512 models and, thus, the frequency of each variable group converged to 50% when all models were considered. Seasonality, descriptive texture of optical, and topography were most frequently used in the best models (highest TSS) while microwave variables, orderliness, and contrast texture of optical groups were less frequent in the best models. All the models were built with default parameters (ntree = 500, mtry n/3).

Coverage and Distribution of the Eastern Arc Forest
Based on CCEF model, approximately 296,000 ha of closed canopy natural forest remain in the EAM ( Figure 5A and Table 2). The model area estimate for the whole EAM did not fit within erroradjusted area estimate calculated based on validation data indicating that the model was a conservative estimate of the remaining natural forests. Also, compared to the previous maps of EAF cover, CCEF model estimated less forest cover ( Table 2). The overall accuracy (OA) of the CCEF model was 0.8 and TSS 0.57 tested with the independent validation data set. The accuracy of the CCEF model varied among the mountain blocks, being highest in Ukaguru and lowest in North Pare. In 10 out of 13 blocks, the CCEF model was the most accurate model compared to the previous forest maps from the EAM. Complete error matrices are reported in Supplementary Materials (Tables S4-S16).
The CCEF model captured the locations of small fragments and revealed the pattern of the remaining remnants effectively ( Figure 5B-D). Moreover, the CCEF model identified the gaps of nonforested areas within the larger forest patches ( Figure 5E-G). However, despite our multitemporal and multi-sensor approach, some parts of CCEF model were still contaminated due to persistent cloud cover ( Figure 5H-J).  The remaining EAFs were highly fragmented and isolated (Table 3). Approximately 23,500 ha (8%) of the forest area is in forest fragments smaller than 10 ha. East Usambara is spatially the most concise block with relatively small mean distances among forest patches and small standard deviation of the Euclidean distances. In Nguu, Ukaguru, and Rubeho, there exists multiple separate forest patch concentrations shown by relatively large mean patch size and high variation of distance between the patches. Mahenge is the most fragmented block with a high number of small size forest fragments.
The majority of the remaining EAF is protected (Table 3). However, the protection status varies among the blocks. In Malundwe, all of the forest areas are protected with IUCN, coded Mikumi National Park. Furthermore, in west and east Usambara over 70% of the forest is protected with IUCN-coded areas. On the other hand, in Uluguru over 80% of the forest is protected but only 1.3% of these areas are IUCN-coded and the rest of the forests are protected under national and village forest reserves. On average, the small remnants (<10 ha) are less protected. The most vulnerable blocks are Nguru and Mahenge where a vast majority of the forest is located outside the protected areas.

Extent, Fragmentation, and Protection of Eastern Arc Forests
The produced CCEF model showed that the EAF extent was lower than previous studies estimated, even though our study included the previously excluded small fragments (1-10 ha) [39,45]. The difference in the EAF coverage, compared to the previous estimations, is explained by the fact that the CCEF model identified the gaps of nonforested areas within the larger forest patches. Furthermore, the CCEF models improved the separation of closed canopy evergreen natural forests from other woody vegetation within the EAM including semideciduous woodlands, degraded forests, agroforests, and plantations. Furthermore, acknowledging the small fragments showed that the ecosystem is more fragmented, as previously considered. The number of fragments is over 30 times higher and the fragment size is one-third of the values reported previously [39]. However, on a positive side, the average distance from a fragment to the nearest fragment is less than previously reported.
Comparison of the CCEF model with the existing protected area network showed that some 24% is unprotected, confirming the previous estimates [43,47,52]. However, there are significant geographical differences in protected area coverage. Many previous studies on biodiversity values of the EAM have concentrated on the east and west Usambara, Uluguru, and Udzungwa, as these mountain blocks are considered most important for biodiversity [38]. In these mountain blocks, a vast majority of the forests are protected. However, in less-studied mountains of Nguru, Rubeho, or Mahenge [84] a vast majority of the forest remains unprotected.
In total, some 88% of forest patches less than 10 ha in size are unprotected. Although it is well known that significant conservation measures should be and have been targeted to the large forest fragments [39,82], there is a global tendency to underestimate the value of small fragments [19,46]. It is highly likely that many of the small fragments are holding significant extinction debt and their rehabilitation and connection to the larger fragments should be a priority. Therefore, we suggest that CCEF model should be used to evaluate and update the representativeness of the protected area network and target new inventories while acknowledging the limitations in available conservation resources.

Accuracy of the CCEF Model
Overall, the produced CCEF model had a moderate accuracy (TSS 0.57). The assessment was done with independent and statistically robust accuracy assessment following the state-of-the-art guidelines [85]. Since the confidence interval for the natural closed canopy forest did not include the value for the map area of CCEF model, the model can be considered a conservative estimate on the remaining EAFs. However, the discrepancy between the estimated area and map area was largely caused by the discrepancy of natural forest cover in Udzungwa. As the accuracy assessment of the previous forest estimates were based on expert opinion, our approach provided also a more robust assessment for previously mapped forest areas [39,45].
However, there were clear variations in model performance among EAM blocks, as shown by the TSS value range (0.36-0.87) ( Table 2). The availability of cloud-free and temporally consistent data is a key prerequisite for developing a consistent model. In cloud-prone areas, the pixel-based median mosaic may be temporally inconsistent, as several images are needed to construct a cloud-free mosaic. This may have caused lower performance of the CCEF model in some blocks and can explain the discrepancy of area between the model and error-adjusted area estimate calculated based on validation samples. Furthermore, some clouds remained in the median mosaics, especially in the northern parts of the study area, leaving gaps in the CCEF model. As the Sentinel-2 image archives update, the availability of cloud-free images from the EAM will increase. Thus, updating the model every 2-3 years enables identification of the dynamics of EAFs and increases the reliability of the coverage.
The number of small fragments in the CCEF model was large and to some extent caused by the methodological approach. At some areas, natural closed canopy forests were found in patches surrounded by forests of sparser canopy cover. Such areas were found, e.g., in Mahenge and in Udzungwa scarp, where the forest cover is variating in terms of canopy cover and structure. In such areas, the CCEF model was only identifying the closed canopy patches, leading to a highly fragmented structure. These forest blocks showed also the highest discrepancies in forest area compared to Newmark and McNeally's [39] work as they used human interpretation to define the forest borders. However, we argue that the spatial pattern of closed canopy forest patches is valuable information for conservation and management purposes. Future studies should concentrate on the forest disturbance dynamics within these areas and study the processes behind the sparser canopy cover areas.

Optimal Modelling Setup for Detecting Closed Canopy Natural Forests
The results of our study demonstrated that combining data from multiple sensors (optical, microwave) and of multiple natures (texture, topography, seasonality) improves the detection of closed canopy natural forests in a tropical environment. However, certain variable groups are clearly more valuable in increasing the model performance.
We found that texture metrics of Sentinel-2 imagery are highly meaningful predictors of closed canopy natural forest. This suggests that the heterogeneous texture of the natural forest canopy, caused by different age and height structures and different phenological phases of different species, can be identified with the texture metrics accounting for the neighborhood reflectance relationships. The importance of texture metrics has also been reported by other recent studies mapping forest types and species [33,70,86,87]. In addition to the texture, we found adding variables describing seasonality and topographical variance increases model performance, also reported by previous land cover classification studies [71,88,89]. In EAMs, many of the blocks are surrounded by deciduous woodlands with different temporal spectral signature compared to evergreen forests [38], which explains the importance of seasonality group in the model. Furthermore, as the EAFs locate at variating and rugged terrain, the hydrological conditions and solar radiation may vary substantially in small geographical areas [90], which explains the importance of topographical variables in the CCEF model.
Recent studies have shown that combining radar with optical data improves detection of tropical rainforests [88], mangrove forests [91], forest plantations [33], and upland vegetation [92], as well as forest attributes such as above-ground biomass or volume-growing stock [30,34,93]. Results of this study are in concordance with this knowledge as the best performing model included two microwave texture variable groups. However, their influence on the model's performance was modest in both alone and drop contributions test. Furthermore, radar variables were not frequently present in the best models. Considering the tedious preprocessing of radar data, its use in similar studies should be carefully considered.
Bundling independent variables to groups caused certain limitations for assessing variables' importance, though it was a necessity for computational reasons. Few variables, such as mean of texture and mean of seasonality, correlated with the band reflectance values [66] as they are simple descriptive statistical measures. Thus, these variables could have been grouped to multiple groups. The texture and seasonality groups, where these variables were eventually placed, had high average TSS and alone and drop contribution test results. However, it can be questioned if these groups would have had similar contribution on the model's performance if these correlating variables had placed to Optical group instead. The grouping of the variables also limited the possibility to run standard statistical tests for variable importance due to the collinearity between the groups. Therefore, this study was only able to provide indicative guidance on the importance of the variable groups instead of full statistical evidence on independent variables.
On the modelling side, it is uncertain if 30 repetitions were enough to remove randomness caused by permutation from the averaged results. Alternative options could be repeated by random subsampling cross-validation with only a fraction of the observations used at each repetition or out-of-bag error estimates integrated to RF, though the reliability and independence of the latter is still debated [73,80]. Furthermore, in our case, modifying the RF model parameters ntree and mtry had negligible impact on model performance. The TSS variations were so minor that they could fit into the stochasticity caused by permutation. However, optimizing these parameters is an important step in the model development as their importance is context dependent [76].

Global Considerations
As is shown by this study, combining the advantages of multiple Earth observation sensors, machine learning approaches, and cloud computing has a significant potential to improve mapping of highly valuable fragmented evergreen forests. Previous studies have shown this potential in other key habitats, such as wetlands and heathlands [91,92,94,95].
Even though spatial information on most of the tropical forest ecosystems and GBHs exists, this information is often outdated and coarse [20,[96][97][98]. Meanwhile, highly accurate baseline data are becoming a prerequisite for advanced Earth observation-based forest monitoring, such as detection of canopy disturbance and forest degradation [25] or in detailed ecological studies on the movements of those endangered species inhabiting these habitats, such as the primates of EAM [49]. Thus, there is a continuous need to improve the mapping of GBHs. We urge researchers and practitioners to consider the need to update the spatial information of other areas with outstanding biodiversity values.

Conclusions
This study demonstrated that recent developments in Earth observation regarding image availability, quality, and processing capacity, allow significant enhancements in mapping accuracies of fragmented tropical forests over large geographic areas. The results show that texture metrics of Sentinel-2 imagery, seasonality metrics derived from Landsat-8 time series, and topography metrics derived from SRTM DEM are highly meaningful predictors of closed canopy natural forest in the EAM of Tanzania and Kenya. Furthermore, texture metrics of Sentinel-1 imagery were found to improve the model performance slightly. The developed CCEF model shows that the coverage of the EAF is smaller than previously estimated, the remaining forests are more fragmented with 88% of the remnants being less than 10 ha in size, and that 24% of the natural forests are outside of a current conservation network. Therefore, the CCEF model facilitates evaluation and updates on the representativeness of the protected area network in this valuable GBH. The methodological setup was based on open source data and tools, providing a feasible approach for mapping and monitoring tropical forest remnants, especially relevant in the countries representing Global South where access to data and computing power has previously prohibited developing efficient and cost-effective approaches.
Supplementary Materials: The following are available online at www.mdpi.com/2072-4292/12/9/1429/s1, Table  S1: Satellite imagery and ancillary geospatial data set used. Table S2: Training sample distribution based on tree cover and aspect.  Figure S1: Spatial distribution of training samples. Figure S2: Spatial distribution of validation samples. Figure S3: TSS values of all the tested mtry and ntree combinations using the best variable group selection.