Does Sentinel-1A Backscatter Capture the Spatial Variability in Canopy Gaps of Tropical Agroforests? A Proof-of-Concept in Cocoa Landscapes in Cameroon

: A reliable estimation and monitoring of tree canopy cover or shade distribution is essential for a sustainable cocoa production via agroforestry systems. Remote sensing (RS) data offer great potential in retrieving and monitoring vegetation status at landscape scales. However, parallel advancements in image processing and analysis are required to appropriately use such data for different targeted applications. This study assessed the potential of Sentinel-1A (S-1A) C-band synthetic aperture radar (SAR) backscatter in estimating canopy cover variability in cocoa agroforestry landscapes. We investigated two landscapes, in Center and South Cameroon, which differ in predominant vegetation: forest-savannah transition and forest landscape, respectively. We estimated canopy cover using in-situ digital hemispherical photographs (DHPs) measures of gap fraction, veriﬁed the relationship with SAR backscatter intensity and assessed predictions based on three machine learning approaches: multivariate bootstrap regression, neural networks regression, and random forest regression. Our results showed that about 30% of the variance in canopy gap fraction in the cocoa production landscapes was shared by the used SAR backscatter parameters: a combination of S-1A backscatter intensity, backscatter coefﬁcients, difference, cross ratios, and normalized ratios. Based on the model predictions, the VV (co-polarization) backscatter showed high importance in estimating canopy gap fraction; the VH (cross-polarized) backscatter was less sensitive to the estimated canopy gap. We observed that a combination of different backscatter variables was more reliable at predicting the canopy gap variability in the considered type of vegetation in this study—agroforests. Semi-variogram analysis of canopy gap fraction at the landscape scale revealed higher spatial clustering of canopy gap, based on spatial correlation, at a distance range of 18.95 m in the vegetation transition landscape, compared to a 51.12 m spatial correlation range in the forest landscape. We provide new insight on the spatial variability of canopy gaps in the cocoa landscapes which may be essential for predicting impacts of changing and extreme (drought) weather conditions on farm management and productivity. Our results contribute a proof-of-concept in using current and future SAR images to support management tools or strategies on tree inventorying and decisions regarding incentives for shade tree retention and planting in cocoa landscapes.


Introduction
The spatial variability in canopy cover over forest land is a vital information for monitoring and assessing the influence of forest management practices on ecosystem services (ESs). Ecosystem services are produced by different components or functions of forests and tree-based land uses as cocoa agroforests [1][2][3]: habitat, tree diversity, canopy cover, etc. Canopy cover refers to the proportion of the forest floor that is covered by the vertical (or map) projection of the tree crowns; it is different from canopy closure-the proportion of sky hemisphere obscured by vegetation when viewed from a single ground point [4]. Such distinction may be a necessary guide to field sampling and management of related ESs in different vegetation types and land uses. Cocoa agroforestry systems (CAFS) feature canopy cover and tree height that are similar to that in transition forest types [5]. The widely practiced canopy cover (or shade) typologies in CAFS include: (1) cocoa with production shade tree species, (2) cocoa with mixed shade species, (3) rustic cocoa or 'cabrucas', and (4) successional agroforests. These agroforests constitute mostly canopy shade trees (CST) above 5 m tall, which provide a stratification of above 10% canopy cover [6]. The associated shade trees provide the different canopy strata in CAFS. The CSTs provide both shade and non-shade uses. The known range of ecosystem provision and supporting services of associated shade trees qualify CAFS for climate change mitigation mechanisms, such as REDD + [3]. Cocoa agroforestry land use is, yet, excluded from the international definition of forests [7]. However, there are associated disservices to upper canopy tree species (UCT), as well: host to disease vectors, competition for soil water and nutrient, etc. Thus, shade management is still an issue of contention between farmers, and the shade densities often vary with site characteristics, such as landscape structure, land use history, vegetation type and distribution, etc. Besides the technical difficulties related to tree planting in CAFSs, farmers are unable to measure and maintain the, essentially theoretical, recommended shade proportion on farms [8,9].
Beside the challenges of shade management, climate change impact projections have not been readily established for agroforestry systems due to their complex structure [10]. Fluctuations in weather pattern often result in changes in micro-climate. For CAFS, the micro-climate is influenced by the density and distribution of associated shade trees [11]. However, current weather fluctuations and the projected extreme dry seasons herald modifications in hitherto standard farm management, for example, the type and density of shade tree species, spatial configurations for cocoa trees in CAFS, as well, changes in cocoa production sites [12,13]. Moreover, obstructions in forest cover due to deforestation and corresponding fluctuations in leaf area drive changes in micro-climates [14]. Thus, estimates of canopy cover, at landscape level, may provide supporting information to monitor spatial changes in micro-climate; such estimates may also be useful for predicting drought-or disease-related risks on cocoa farm productivity.
The assessment of canopy cover in forests has been widely done using spherical densiometers and digital hemispherical photos [15]. However, such methods are beyond the skills of the predominantly rural small-scale cocoa farmers; though they are meant to be used by farmers. Some scholars use tree stock density and parameters of upper canopy trees (UCTs) (such as crown length) as a measure of shade level [16,17]. However, such an approach requires large-scale field inventory, which is demanding in time and human resources, as well as difficult under inaccessible field conditions. Moreover, such measurements are impractical in certain farm conditions, for instance, in farms with overlapping canopy of the shade trees. Thus, point estimation methods are little reliable for application at both farm and landscape scales. Considering the contiguous effects of changes in micro-climate across forest or farms units, cocoa-producing farmers and countries need support for managing shade conditions beyond the farm scale. For instance, it may be logical to assess shade conditions at landscape scales since the amount and distribution of shade often vary within and between CAFS. Moreover, it is impractical to device and implement a sustainable shade management plan for each farm. Thus, a management plan for shade trees may be suggested for each cocoa production landscapes.
Several tools are used for monitoring and managing canopy cover in forest and tree-based land cover: Light Detection and Ranging (LiDAR) approach [18,19], information from aerial photographs [20], Unmanned Aerial Vehicle (UAV) images [21], and Digital Hemispherical Photographs (DHPs). DHPs have known potentials, amongst other applications, in providing timely, up-to-date, and large-scale monitoring of changes in vegetation structure. For instance, the leaf area index (LAI) for subtropical forests has been reliably estimated using DHP [22]. Different remote sensing methods are applicable to estimate LAI [23][24][25]. The effectiveness of DHP in estimating LAI has also been reported for deciduous forests [26] and for rubber plantations by comparing data from Worldview-2 imagery [27]. Although synthetic aperture radar (SAR) imagery has been widely used for canopy cover estimation and biomass monitoring, such remarkable applications have been extensive for temperate forests [28][29][30][31][32][33], and few studies also assess canopy cover in tropical forests from optical satellite data [34][35][36]. Such information, however, is currently non-existent for cocoa agroforestry systems.
Unlike remote sensing images that are acquired using optical sensors, the backscatter intensity of SAR images only provides indirect information about canopy structure. In the case of dual polarization SAR imagery, cross-polarization backscatter (Vertical transmitted (VH) and Horizontal received (HV)) can distinguish the structural parameters of different crops types [31]. Moreover, the biophysical attributes of forests, such as canopy height, can be reliably inferred from SAR imagery [37,38]. However, the SAR backscatter intensity over most forest types is complex due to the often non-distinctive structural attributes and complex composition of trees. The attribute of vegetation that relates with SAR backscatter intensity also depends on sensor properties, such as the wavelength, polarization, and incidence angle of the transmitted (and received) radar signal [39][40][41]. Besides sensor properties, backscatter intensity depends on structural properties of vegetation: moisture content of leaves and branches, the size and orientation of branches and leaves, the spatial configuration of trees in a resolution cells, and the shape and size of tree canopies (leaf-on versus leaf-off). These structural properties, assessed across time and season, can be used to map forest clear cut and differentiate crop fields [31,33]. However, the sensitivity and potentials of SAR backscatter have not been assessed in relation to tree-crop production systems with multi-strata canopy, such as cocoa agroforestry. In the context of measuring the structural parameters in high vegetation canopy, Reference [4] distinguished measures of canopy cover/shade. According to Reference [4], DHP are random point measures that integrate information from different heights of the canopy. Thus, it provides a good resolution and an efficient bias estimator of the canopy cover, actually canopy closure. In this experiment, we estimated canopy gap fraction from field sampling by DHP of overstorey canopy.
In the framework of large-scale shade management for cocoa tree-crops, and the prediction weather-induced impacts on farm productivity, this study evaluates the potential of S-1A SAR data for detecting canopy gaps in CAFS and cocoa production landscapes. We consider two cocoa production landscape in the Center and South regions of Cameroon. The landscapes differ in the prevailing weather pattern, predominant vegetation type, and cocoa agroforests structure and practices [42,43]. In both landscapes, we collected field reference data in CAFSs of a varying range of tree density, visually assessed canopy cover, farm ages, and observed management of canopy trees. In a first analysis, we assessed the relationship between canopy gap fraction and S-1A backscatter intensity from both VH (cross-polarization) and VV (co-polarization) polarization images; canopy gap fraction was estimated from in-situ DHPs. We assessed machine learning (ML) approaches for their potential to delineate different canopy gaps fractions for the sampled cocoa agroforests parcels. We hypothesized that C-band SAR backscatter intensity may reflect variability in the intensity and distribution of canopy gaps in cocoa agroforests, and this may be informative for making predictions at the landscape scale. We aimed to evaluate the spatial distribution in canopy gaps as a leverage for monitoring changes in shade cover and climate change impacts in cocoa production landscapes.

Study Sites
The first study site is found in the Bokito district of the Mbam & Inoubou Division, which is located in the northern part of the Center Region of Cameroon. This division covers a total surface area of about 6821 km 2 , and it is located between latitudes 4 • 39 -4 • 49 N and longitudes 11 • 4 -11 • 19 E, within the Universal Transverse Mercator zone (UTM) Zone 32N. The Bokito district has a topography of mostly lowlands with altitudes ranging from 338 to 1109 m a.s.l (Figure 1c). The district is within the agro-ecological zone with humid forest and bi-modal rainfall pattern, and the climate is characterized by four seasons: with a small rainy season from March to June and a large rainy season from August to November. The vegetation is mainly a forest-savannah transition zone that is currently covered by both savannah, secondary forests, and gallery forest patches. The Bokito locality is situated about 150 km from Yaoundé. In the landscape, establishments of cocoa-based agroforests are widespread, prolonged, and still frequent [42,44,45]. Among the major cocoa production zones in Cameroon, this administrative division is characterized by relatively low annual precipitation levels (annual total rainfall of about 1400 mm in Bokito [46]), but it is recognized for producing good quality cocoa. In the center region of Cameroon, the natural vegetation is, however, heavily degraded by pressure on land tenure due to local high population density that can be greater than 100 inhab. km −2 . Bokito is located in the zone with a relatively low population density (29 inhab. km −2 ) compared to other parts of the region. The second site is found in Efoulan district that is located between 2 • 55 to 3 • 12 N and 10 • 35 to 11 • 2 E, within the UTM Zone 32N, in the Mvila Division of the South Region of Cameroon. The topography belongs to the continental plateau of southern Cameroon, with undulating and rolling hills that alternate with some small rivers and widely distributed swamps [47]. The district has an altitude range between 215 to 900 m a.s.l (Figure 1d). Located also within the aforementioned agro-ecological zone, the climate is characterized by two rainy seasons (March-June and September-November) and two dry seasons between these. The average annual rainfall is around 2000 mm. The vegetation cover is predominantly secondary forests, remnant primary forest areas, and agricultural lands. Farming practices are mostly extensive via shifting cultivation--slash and burn agriculture--and cocoa agroforestry expansion into forest areas [9,48]. The Efoulan locality, the headquarter of the district, is situated 32 km from Ebolowa and 43 km to Lolodolf (main towns) in the tropical humid forest zone. The administrative boundaries of Efoulan council cover about 830 km 2 [43]. According to the census data in 2005, the total population of about 28,000 inhabitants distributed in about 37 villages. Settlements are mainly distributed along the main roads within the district. The population is rural and depends on agricultural and forest exploitation activities.

Assessment of Canopy Cover
To estimate canopy cover (or conversely, gap fraction-a measure of the proportion of gaps within tree canopies) in cocoa agroforests, a total of 134 farm plots with size approximating 20 × 20 m 2 were sampled in 2016 and 2017 for collection of DHPs of the canopy (Table 1). We used a hand-held Garmin 64s Global Positioning System (GPS) tool to locate plot boundaries and DHP collection points. It is equipped with a space-based global navigation satellite system (GLONASS) receiver, an alternative to a Global Navigation Satellite System (GNSS) receiver, which provides reliable navigation and location data. Thus, based on the manufacture specifications, location accuracy is within a range of approximately 3 m. During sampling, we placed the GPS in a stable position for a necessary duration to get stable GPS coordinate readings. And, considering potential distortions in the accuracy of ground control points (GCPs), while extracting baskscatter data, we used a image masks (50 × 50 m 2 ) that were large enough to include the boundaries of representative sample plots (20 × 20 m 2 ) and the hemispherical camera viewpoints. Upward-looking DHP were taken at a height of 1 m using a Nikon D7000 DSLR camera with a fisheye objective lens (Figure 2a). The photos were taken in a sequence of thirteen points in each farm, and this was later reduced to five sampling locations as latter had no significant RMSE of gap fraction compared to the thirteen points. The adapted DHP sampling scheme ( Figure S1) and camera settings [49][50][51] are detailed in the supplementary materials. DHPs were processed using the free and open software, ImageJ, in three steps [22]. First, pre-processing entailed a batch process for cropping the circular canopy portion of each image and extracting the different RGB image channels: red, green, and blue. The latter were used for further analysis as these offer the highest contrast between sky and vegetation [51]. Secondly, the cropped blue channel images were further subject to binarization into black/white image of two pixel classes: sky and non-sky pixels. The binary image was created using a threshold level determined by analyzing the histogram of the blue channel image. We applied an automated segmentation procedure: the vegetation and sky pixels were separated by applying the "Minimum" thresholding algorithm to the cropped circular images [52]. Lastly, canopy gap fraction (a proxy for canopy cover) was estimated as the proportion of sky or white pixel in the circular cropped image [4,52] (Figure 2b and Table 2).

Relating SAR Backscatter to Canopy Gap Fraction
To match SAR backscatter with the in-situ field measurements of canopy cover, we identified multi-seasonal S-1A SAR images in the range of ±0 to 6 days from the date of DHPs inventory. We used images that were captured using the Interferometric Wide-swath (IW) mode, which have been geo-referenced and provided by the European Space Agency (ESA) as Level-1 Ground Range Detected (GRD) image products with a spatial resolution of 10 m. The different SAR data, DHP collection dates, and corresponding number of sampled plots are shown in Table 1. We use the term "sample plots" to refer to different mapped areas of about 20 m × 20 m; in each plot, DHPs of the over-storey canopy were taken and processed for estimating in-situ canopy gap fraction ( Figure 2). Thus, the number of sampled plots for each row constituted a total of plots that were sampled in a particular farm and also across several different locations during each of the DHPs inventory dates (Figure 1).
We conducted a time-series analysis of available S-1A SAR images over the sampled locations in the study landscapes. Time-series analysis was conducted in Google Earth Engine (GEE), and the backscatter intensity (sigma nought) were aggregated by the sampled locations (merged pixel masks) in each landscape and year. The distribution of mean VV and VH backscatter did not show a strong temporal variability over the sampled locations, especially during the period of field sampling and canopy gap measurements ( Figures S2 and S3). For available SAR images during the window period of field sampling, we compared the mean backscatter between image date based on a non-parametric Kruskall-Wallis sum rank test at alpha 0.05; the Bonferroni correction was applied to adjust alpha value against type I error rate (Table S1). We observed significant differences in mean backscatter between a few image dates though the differences between dates are not consistent for VV and VH backscatter. This mild temporal differences in backscatter intensity over the sampled field locations informed our selection of SAR images, matching with in-situ DHP sampling dates, and extraction of backscatter intensity for subsequent analyses.
Each of the used SAR images subset of the study areas (Table 1) were pre-processed using a workflow described in Reference [53]. In this study, though, we applied a Radiometric Terrain Correction (RTC) of the amplitude bands to Gamma0 intensity [32,54] with a spatial resolution of 10 m. Image processing was conducted in the the Sentinel Application Platform (SNAP) toolbox [55,56]. According to Reference [54], the radiometric terrain flattening to Gamma0 makes use of the local illumination area of the sensor and reduces image acquisition bias due to incidence angle. Figure 3 illustrates SAR backscatter intensity, by RGB image composites, for the sample landscapes in the wet and dry seasons.
Thus, compared to geometric terrain correction (GTC) based on incidence angle, the RTC corrects for local topography and provides easier interpretation of backscatter statistics [32,54]. The Gamma0 VV and VH polarization intensity bands of each image were calibrated to logarithmic scale (decibel, dB). The logarithm is applied [57] to normalize the backscatter intensity and improve contrast between different ground features.
Several studies exhibit the relationship between SAR backscatter and vegetation canopy parameters. To complement the assessment of vegetation structure, such as the normalized vegetation index (NDVI), radar vegetation index (RVI) and other backscatter ratios have been applied, with different objectives, to classify L-and C-band SAR images [31,[57][58][59][60]. RVI from L-band SAR backscatter has a coupled correlation with both vegetation water content and LAI (plant structure) [60]. According to Reference [31], the cross ratio (VH/VV) of Sentinel-1 (S-1) backscatter is more sensitive to vegetation water content. In addition, for crop fields, the S-1 backscatter strength decreases with increasing NDVI of cereals, and, unlike the VV polarization, vegetation scattering (stronger in VH polarization) increases with increasing LAI [57]. Ref. [57] used different co-and cross-polarization bands to assess canopy attributes of cereals from S-1 SAR. Ref. [56] also observed that the sensitivity of S-1 backscatter parameters to crop height and canopy cover depend on crop type. The authors used several backscatter ratios, arithmetic differences of backscatter band polarizations, and normalized difference. However, these cited studies focused on assessing vegetation structure and canopy cover of temperate and low canopy crops, such as cereals. Seasonal fluctuations in S-1A SAR backscatter were observed for tropical forests and agroforests [53]. Thus, though not as remarkable as in temperate forest conditions, one may expect backscatter contributions from fluctuations in canopy gaps sizes.  In this study, we used ten (10) Gamma0 backscatter features to assess backscatter sensitivity to canopy cover in cocoa agroforests and transition forest stands ( Table 3). The backscatter parameters comprized the following intensity coefficients, backscatter ratios, and backscatter differences and normalized differences. To analyze the relationship between canopy gap fraction and SAR backscatter intensity, and the backscatter behavior of the different canopy structures of cocoa agroforests and transition forests, we created masks of either the corresponding pixel location of DHP sample points or outline polygons of the sampled agroforest areas or forest stands. Two scenarios were evaluated for the relationship between SAR backscatter coefficient and the proxy for canopy cover, canopy gap fraction: Model (1) pixel-based extraction of SAR backscatter intensity: representative 5 pixels from the overlaid five DHP collection points in each sample plot; Model (2) polygon-based extraction of SAR backscatter intensity: representative 25 pixels covering each sample plot area ( Figure 4). The extracted backscatter intensity by both scenarios are, respectively, summarized in Tables S2 and S3 in the supplementary materials. a b

Predicting Canopy Gap from SAR Backscatter
Machine learning (ML) algorithms have been applied for retrieving and modeling vegetation canopy parameters from SAR backscatter. Reference [31] used the Random Forest (RF) algorithm to model the sensitivity of S-1A backscatter to vegetation dynamics. In addition, the use of neural networks (NN) is not new; its usage encompass retrieval of different vegetation canopy parameters from SAR backscatter [61][62][63]. We explored, iteratively, three different ML methods for the prediction of canopy gap fraction from SAR backscatter intensity: Stepwise Multivariate Regression, the back-propagation Multi-layer Perceptron (MLP) Neural Network Regression, and the Random Forest (RF) regression algorithm. For data exploration and model assessment, a combination of Python and R programming software was used.

Bootstrap Regression
First, the stepwise regression analysis was used to test for significant backscatter predictors of canopy gap fraction in sampled cocoa agroforets and secondary forests. As model validation criteria, a 10-fold cross validation and the akaike information criterion (AIC) was used to select the significant regressors. Following an evaluation of the regression model performance and conditions for a parametric regression analysis, the assumptions for normality of model residuals, homogeneity of variance, and independence of variables were not met. Hence, we ran a stepwise regression with bootstrap resampling; we used 1000 bootstrap samples (or replications) for the non-parameteric stepwise regression. Separate regression models were run for the ten (10) Gamma0 SAR backscatter predictor variables (or features): backscatter intensity extracted using 5 pixels masks for each reference plot (model A) and for 25 pixels and those extracted using polygon masks of 25 pixels (model B). The pixels used for model A were, thus, a subset of those used in model B.

Neural Network (NN) Prediction
Secondly, Multi-Layer Perceptron (MLP) Neural Networks were tested for predicting canopy gap fraction in cocoa agroforests from C-band SAR backscatter. The development of a NN that learns from data is a very empirical procedure involving the following cycle: an idea for a NN architecture based on available data (number of neurons, number of layers in each neuron, and activation function), implementing the idea in code, experiment how the NN work for your application, and tuning the hyper-parameter to improve learning or updating the NN architecture. Considering that no NN architecture has been developed for CAFS as such, we tested and iterated different NN architectures and parameters. The backscatter features, extracted using polygon masks of 25 pixels for each reference plot, were used as predictor variables in all the models. The following three NN provided the best estimates of canopy gap fraction based on the prediction error (RMSE).
(1) NN1 comprised 10 input features (backscatter parameters), two hidden layers of 100 and 5 nodes, respectively, and the output layer. No regularization parameter was used, and the rectified linear unit (ReLU) activation function was used for the hidden and output layers ( Figure S6a in Supplementary Materials). The asymptote regions in the Sigmoid and TanH function provide almost zero gradients or slope ( Figure S6b), and the values in this region slow learning. The ReLU function is a non-linear function, and the term "rectified" implies taking a maximum of zero: the derivative or slope of the function is 1 for positive z (activation) values and zero for negative z. Thus, with the ReLU function, learning can by very fast since z > 0 in most cases. For this NN, a dropout ratio of 0.2% was used for both hidden layers, and the learning rate set at 0.001. For all the NN, we used the RMSprop (root mean squared prop) optimization algorithm to speed up gradient descent or learning. It updates the model parameters, with the square root of the squared derivatives, at each iteration to enhance oscillations in the direction that speeds-up the NN learning algorithm.
(2) NN2 had 3 successive hidden layers of 128, 64, and 10 nodes, respectively, and 1 output layer. The "Tanh"(hyperbolic tangent) activation function ( Figure S6b) was set for the hidden and output layers, L2 regularization was set to 0.01. The Tanh function, like the ReLU, is non-linear and is a modification of the sigmoid function. The sigmoid function has values that range between 0 and 1 (with highest slope at 0.5), and it centers the derivatives or slopes of the activation values (z) around zero. This is handy for learning data with binary output. The tanh function centers the activation values (z) at zero (with highest derivative a zero ) and a range between −1 and +1, which, compared to the sigmoid function, makes learning easier. To optimize learning for the NN2, dropout ratios of 0.4%, 0.3%, and 0.2% were set for the successive hidden layers, as well as a learning rate of 0.01.
(3) NN3 was different from Tuned1NN in the activation function (ReLU), L2 regularization value (0.05), and dropout rate of 0.8%, 0.4%, 0.2%, and a learning rate of 0.001. For all NN models, a random initial weighting and bias was used for the hidden layer1, and a back-propagation algorithm, based on a RMSE optimizer, was used to train the NNs and estimate the loss or gradient descent. The NN assessment was carried out using the Tensorflow library (and Keras distribution) in R.
The geo-referenced SAR images and sampled field data were used for extracting SAR backscatter data ( Figure 4). The extracted SAR backscatter and estimated in-situ canopy gap fraction constituted the data set [X i ,Y i ] for training the NNs, where Update the weights.

5.
Iterating back to step 2 as needed, and depending on set learning rate, or stopping the training when a set error tolerance level is reached-early stopping.
For the input data X i , the squared error of canopy cover prediction by the NN is: whereŶ i is the NN output for the input X i . Thus, RMSE, for N samples:

Random Forest Ensemble Regression
Thirdly, a Random Forest (RF) decision tree algorithm was applied to predict gap fraction from the 10 SAR backscatter predictors. As in the NN, we used the backscatter variables that were extracted using polygon masks of 25 pixels, per sample plot, as features for the RF model. The extracted (3350) data points for both study sites were separated into a ratio of 70%:30% training and test subsets. The decision tree hyper-parameter was tuned to an upper bound of 500 decision trees (ntree)-the default sufficient number of trees for running RF within the mlr package of R statistical software. Moreover, the RF important variable analysis, based on the Gini statistics [64,65], was used to identify the most influential backscatter variables in the decision trees. The RF model was applied to predict canopy gap maps for the study sites.

Semi-Variogram Analysis of Spatial Correlation in Canopy Gap Fraction
In final analysis, we evaluated the spatial continuity in canopy gap, at landscape scale, using an experimental semi-variogram model. A semi-variogram model is calculated using several different approaches in literature, and we applied the a widely used function (Equation (3)). where: 1. γ(h): the semi-variogram function.

2.
N(h): the set of all pairwise Euclidean distance i − j = h.

4.
z i and z j : data values at spatial location i and j, respectively.
In goestatistics and semi-variogram theory, the experimental semivariogram is defined as the half of the average squared difference between objects or values separated by a given lag distance (h). The experimental semivariogram is computed by increasingly changing the lag distance (h) to obtain an ordered set of semi-variances from the spatial feature or data. The assumption is that closer objects are related than distant ones. In this sense, spatial features or objects located at shorter distances (h) apart are likely similar or have low variance. No variance is expected for two objects or features at zero h. Spatial variance increases with distance between spatial points or features; near observation tend to share more characteristics than distant ones. However, beyond a certain distance threshold, referred to as the range, the target features or observations are no longer spatially correlated [66]. The range indicates the distance of spatial heterogeneity, and below this distance, the data are statistically dependent. At the lag distance equal to the range, the maximum variance is attained; known as sill, this estimates the true variance in the target spatial objects or features. After the sill, the semi-variogram curve flattens off as there is no further change in the variance of the ground feature. The nugget effect or y-intercept explains a combination of errors due to data sampling and variations in the data at smaller spatial scales relative to the distance between spatial variable.
The semi-variogram model for a random spatial variable depends on parameters, such as the lag distance (h) at which spatial correlation is estimated, the choice of model, nugget or y intercept, and sill (variance in empirical data). In our analysis, the parameters, such as the variogram model, the nugget, and sill, were not set beforehand; we computed an experimental semi-variogram for the estimated canopy gap fraction and fitted a set of three common variogram models to the experimental variogram [66]: Exponential, Spherical, and Gaussian models. We assumed that forest vegetation, as well agricultural lands and grasslands, have an isotropic distribution. The best fit model was automatically selected, and its estimated parameters used to model the semi-variogram. Further, we interpret the spatial correlation in canopy gap fraction based on the reported semi-variogram curves.
A graphical abstract of the used methods and analyses is provided in Figure 5.

Landscapes
In-situ DHP sampling  Figure 5. Framework for the used methods and estimation of canopy gap distribution. The tools or software, used at each analysis step, are indicated within braces. The black arrows = all acquired information from preceding steps in the analysis; the blue arrows = informed selection of data or algorithm from available options.

Canopy Gap Estimation from DHPs
A boxplot analysis shows differences in canopy gap distribution for cocoa agroforests versus transition forests ( Figure 6). Generally, the estimated gap fraction ranged between 2.58 to 58.23%, and, for the sample cocoa agroforests, canopy gaps were more variable in the Efoulan landscape compared to that of Bokito. The canopy gap fraction in cocoa agroforests ranged between 4.22 to 58.22%, and the range for transition forest was 2.57 to 4.46%. The differences in gap fraction between the two sites were, actually, not unexpected considering the disparities in the predominant vegetation structure both sites, and generally, more so due to the cocoa management practices. Cocoa tree density is higher in CAFS in the Center region (1217 to 1644 tress ha −1 ) than in the south region (1048 trees ha −1 ), and the reverse for the density of non-cocoa trees [46,67,68]. However, the range and distribution of canopy gap fraction is constraint by tree density and structure in the sampled plots.

Relationship between Canopy Gap Fraction and SAR Backscatter
The statistics of stepwise regression analysis are summarized in Table 4. The stepwise regression model A indicated a combination of four backscatter features as important predictors of gap fraction, and the predictors explained 30.4% of the variance in canopy gap fraction (R 2 = 0.304, F (4,129) = 14.1, p < 0.001). The VV intensity contributed significantly in predicting canopy gap fraction (β = 1.23, p < 0.001), as did the log-transformed VV intensity (β = −2.51, p < 0.001) and ratio of VV on VH (β = −1.97, p < 0.001). For the model B, the log-transformed VV and VH backscatter, as well as their ratio, were significant predictors of canopy gap fraction. Figures 7 and 8 illustrate the distribution of significant predictors of canopy gap fraction based on the stepwise regression model. The SAR backscatter of the co-polarization band, VV, varies between cocoa agroforests and transition forests. The VH backscatter was a significant predictor only in model B, and its distribution did not vary between the aggregated land use types (Figure 8). The cross-polarized (VH or HV) backscatter is particularly sensitive to volume scattering over forest canopy [69]. Thus, the 25 pixels (model B) cover an area of 50 × 50 m 2 which, compared to 5 pixels (model A), was large enough to provide more backscatter information that relates to vegetation volume, which may be similar in the considered land cover and land use categories. Cocoa afroforests have a similar canopy structure to tropical secondary forests [67], and it is difficult to differentiate them on RS data collected from optical sensors [70]. The cross-polarization backscatter may, therefore, be regarded as less sensitive to the canopy gaps in cocoa agroforests for the sampled landscapes.

Bootstrap Regression Prediction
The non-parametric regression models, based on bootstrap resampling, showed that the modeled gap fractions from the sample data can provide reliable inference for the variability in canopy gaps in other locations in the population (landscapes). For both bootstrap models, the 95% Confidence Interval (CI) of both the coefficient of determination (r 2 ) and RMSE did not include zero (Figure 9), indicating that the model's prediction performance was reliable, without an instance of predicting a gap fraction of 0%. As well, a similar observation is showed for the 95% CI of the intercept and coefficients in model A (Table 5). On average, about 30.0% of the variance in canopy gap fraction was explained or modeled by the used backscatter variables in Model B with a 95% CI of 10.7 to 55.04% (Figures 9a,b). The average prediction error was 8.4% (Figures 9c,d). Thus, about 70% of the variance in gap fraction may be explained by other variables that were either not measured or captured by the variables used in models A and B. However, predicting canopy gap fraction from VV backscatter (in decibels), compared to the other variables, had a low bias (−0.13, S.E of 1.95) and a narrow CI range ( Table 5). This is indicative of the sensitivity of VV backscatter to the estimated canopy gap fraction. Moreover, the 95% CI of VV coefficient did not cross zero, which, once more, justifies that the estimates may be generalized for predictions about the population (the landscapes). The VH backscatter, again, was sensitive to canopy gap fraction only in model B. However, for model B, the 95% CI of intercept (−7.07, 157.69) and VH (−357.2, 480.4) included zero (Table 5); an indication that their parameters may not provide reliable information for making inference on canopy gap for the landscapes. In addition, the 95% CI for the intercept of model B included zero, which reveals a likely violation of the underlying assumption in the parametric regression model-the 95% CI (3.727, 132.58) did not cross zero.

Neural Network (NN) Prediction
The trained NN are compared in terms of their parameters and in prediction error of canopy gap fraction ( Table 6). The RMSE of the predicted gap fraction is 7.18% for NN1, 8.02% for NN2, and 8.00% for NN3. These values are lower than the RMSE for the stepwise and RF models. The relative errors (MAE) shows high training bias for the NN with lowest error (NN3) (Figure 10). Although its RMSE was lower than for the other NN, NN1 did not generalize better than NN3, in terms of gradient descent (loss) and MAE. However, a relatively high training bias is observable for NN3, which, if addressed, can reduce the prediction error.

Random Forest (RF) Prediction
The RF regression prediction error (RMSE of 8.48%) for gap fraction was lower than for the stepwise regression model which considered only a subset of backscatter parameters. The model performance accuracy was 40.94%, and, from the RF importance variable plot, the co-polarization backscatter intensity (Gamma0VV and Gamma0VV in decibel) contributed most to this ( Figure 11). Thus, the co-polarization backscatter showed higher potential in predicting canopy cover. Based on the RF model results, prediction maps of the study sites are illustrated in Figure 12. The canopy gap fraction range was similar for both study sites (6.92% to 48.96%). The predicted mean canopy gap was 19.81% (SD: 10.09%) and 17.49% (SD: 8.70%), respectively, for Bokito and Efoulan sites. A semi-variogram analysis revealed a low spatial correlation in canopy gap for the Bokito landscape with a distance range of 18.9 m, compared to the 51.1 m range in the Efoulan landscape (Figure 13c,d). This reflects the difference in vegetation type and distribution between the landscapes. A view of some cocoa agroforestry plantation areas shows the level of mosaic and discontinuity in canopy cover within and across farms (Figures 12e-h).

Discussion
We estimated canopy gap fraction as the proportion of gap within the vegetation canopy based on DHP taken below the cocoa trees or at the forest floor (1.3 m above ground). The estimates from DHP do not refer to canopy gaps above the cocoa trees since the DHP do not represent the entire vertical canopy profile. The transmitted energy from C-band wavelength is limited to top parts of vegetation canopy [71], and, in the case of CAFS, the energy can penetrate to the canopy stratum constituted by cocoa trees. Thus, the used backscatter values, by large, represent attenuated energy from the canopy gaps above cocoa trees. Using the SAR backscatter variable, we explained up to 30% of the variance in the estimated gap fraction with 8.4% error. However, including measurements of canopy gap in a vertical profile may improve the predictions from C-band SAR. In addition, the use of backscatter from longer wavelengths, L-band, for example, may better explain variability in estimated gap fraction.
Our results had RMSE and r 2 values that comparable to other published literature. For instance, Meyer et al. [72], in estimation of LAI in temperate forests by comparing several indices from Landsat-8 and Sentinel-2 images, observed r 2 values in ranging between 0.328 to 0.454. However, our analysis is based on SAR backscatter intensity which, unlike information from optical images as in the aforementioned reference, is not sensitive to vegetation spectral information or leaf phenological status. In predicting LAI of vineyards from remote sensing optical (Sentinel-2) and SAR (S-1) images, Beeri et al. [73] observed r 2 values in the range 28.3% to 31.0%. In this study, we did not apply the machine learning models to predict canopy gap fraction; rather, we explored state-of-the-art algorithms to show-case the potentials of SAR images in supporting canopy cover inventory and informing management interventions in cocoa production landscapes.
Canopy gap fraction represents the proportion of sky hemisphere as viewed from a specific point [4]. Our field measurements of canopy cover were based on a DHP with a non-zero angle of view (AOV) perspective of the canopy. The non-zero AOV (180 • for the used fisheye lens) had a wide canopy area and, mainly, side view of tree crowns at each sampling point. Thus, the canopy cover maps may well be interpreted as angular perspective of canopy cover and not a vertical (or map) projection as such. These maps reflect the spatial distribution of vegetation types and canopy trees in both landscapes, and it should be interpreted with consideration of a prediction error of 8.4%.
Unlike the low frequency microwaves (L-and P-bands) that can reach the understorey canopy and underlying soil, the high frequency C-band microwave penetration is limited to the upper canopy strata [41]. Moreover, the sensitivity of either co-or cross-polarization S-1A backscatter intensity to vegetation canopy vary with the crop type [56,57]. In our experiment, however, the VH polarization was less performant in predicting canopy cover based on the bootstrap regression results. The cross-polarized backscatter (VH) significantly contributed in explaining variability in gap fraction only in model B. Ideally, the dominant canopy components, often timber trees species, in cocoa agroforests and transition forests comprise similar tree species and canopy structure. Most of the tree species are light-tolerant and deciduous, which results to a less dense canopy structure. Thus, since the C-band microwave penetrate forest canopy only to a limited extent, CAFS and transition forests may experience similar volume scattering of microwaves as such. This may explain the similar distribution of the VH backscatter across the aggregated land use categories (Figure 8), and its poor performance in modeling canopy cover. Obvious differences between the two land use/cover types relate to the density of canopy trees, which may also be due to spatial fluctuations in the vertical canopy profile. For this, the distribution of VV backscatter intensity varies between land use aggregates (Figures 7 and 8).
Making use of all the backscatter intensity and ratios, the MLP NN showed a high potential in retrieving canopy gap (or cover) status over cocoa agroforests and adjacent transition forests. Though the canopy gap prediction errors (MAE) for the NN were in a range of 5 to 7%, the RMSE values were lower than those from bootstrap regression models that used a few backscatter predictors. The NN1 showed reduction in training error to about 2% with increase in model iteration, but the model was poor in generalizing new data. The high number of predictor bands for the low sample data may explain the high variance or model over-fitting beyond 75 model runs or iterations. So far, further tuning of NN hyper-parameter, NN3, reduced over-fitting but did not improve the prediction error; meanwhile, the MAE was by far higher than that of the former NN.

Sensitivity of S-1A Backscatter (VV vs. VH) to Canopy Gap Fraction
The normalized backscatter coefficients (sigma nought) is a measure of the cross-sectional area of the, spatially random, scatterers within each resolution cell or pixel. Thus, the radar backscatter energy depends on the shape, dielectric constant, density, and spatial configuration of the ground targets in each pixel; there are also contributions from sensor properties, such as wavelength, incidence angle, and polarization. We used the gamma backscatter which corrects for effects of different incident angle on backscatter cross-section [32,54,69]. According to our model predictions by the bootstrap regression and RF, the VV backscatter was more sensitive or important in predicting canopy gap fraction. Literature has widely reported a high sensitivity of cross-polarized backscatter to forest canopy or volume. The gap fraction is related to LAI (or plant area index (PAI)) and a commonly used proxy for canopy volume; it refers to a different parameter of vegetation canopy. LAI has been reported to be correlated with backscatter volumes over forest and crop fields [57,60], and it is commonly estimated using DHP [74,75]. Our gap fraction estimates, based on DHP measurements, is the inverse of LAI [22]. Thus, we hypothesize a different relationship between the former and microwave backscatter: high sensitivity to the co-polarized (VV) backscatter to surface scattering in canopy gaps. A further explanation is that the C-band wavelength mildly penetrates forest canopies and, thus, relates to LAI at a lesser degree than to canopy gap. However, our models are subject of further fine-tuning and improvement considering the high variance in the predictions.
Several factors may, thus, influence information retrieval from SAR images. For this study, temporal changes in SAR backscatter intensity may have contributed to a reduced prediction potentials. However, for the period of field data collection, a time-series analysis showed differences in backscatter only between a few images ( Figures S2 and S3, and Table S1 in the Supplementary Materials). Moreover, our analysis was conducted for perennial tropical agricultural land use, and consistent changes in vegetation structure may be expected only between periods of major climatic and rainfall patterns: the dry and rainy seasons. Amongst other factors, SAR backscatter intensity is dependent on the geometry of ground features, as well as their dielectric properties. SAR backscatter intensity is also largely influenced by spatial distribution of ground features in each resolution cell; this intensity may vary between representative pixels for similar ground features--the salt and pepper or speckle effect in SAR images. Speckle effect in SAR images may contribute to degrading the sensitivity to measured parameters on ground features [76]; thus, canopy gap fraction prediction from the used SAR images may be negatively influenced by variability in backscatter intensity values for a similar range of gap fractions estimated by hemispherical photos. Nonetheless, an important source of uncertainty in our prediction is the hemispherical photos we took below the canopy (1.3 m above ground) which could result in an underestimation of actual gaps at the upper canopy strata. Here, the short wavelengths from Sentinel-1 C-band sensor penetrate at most (and interact with features in) the upper canopy strata.
Forests are often considered as heterogeneous volumes when in contact with microwave. The main scatterers from forest canopies comprise a combination of leaves, twigs, and branches that are randomly distributed and have different geometry. Over forest, cross-polarized backscatter is dominated by the contribution from canopy elements than by surface components. Thus, the cross-polarized backscatter often correlates best with biomass, and it is reported to have a high sensitivity to volume scattering [57,77]. However, the depolarization effect of vegetation reduces with increasing vegetation. According to Reference [69], there is a high contrast between H and V polarization over bare ground, and the contrast reduces with increasing amount of vegetation. This may provide additional explanation for the observed distribution of VH between land cover types and its prediction performance for canopy gap fraction.
In general, the small canopy elements are numerous and located higher in the canopy and the larger elements tend to be near the ground. The reduced penetration depth of C-band (and X-band) microwave into forest canopies is, therefore, due to the relatively large sizes of lower canopy elements compared to the short wavelengths. In dense forest canopies, therefore, the backscatter from such microwaves easily saturates and represents only a cross-section of the top canopy. However, the edge of forests or canopy trees is particularly significant for radar system as such edges can provide conditions that result in "double-bounce" backscatter [69,78]. The co-polarized backscatter is sensitive to conditions of double bounce which is provided by surface components of vegetation [69]. For the sparse and multi-strata canopy as in CAFS, canopy elements are to some extent arranged in vertical directions considering the gaps within and between different canopy strata. Thus, even for C-band microwave (Sentinel 1A in this case), we can expect double-bounce backscatter within such canopy conditions. For example, radar scattering between the elements of the canopy and/or trunk of a tree in the higher canopy stratum relative to canopy elements in a lower stratum. These conditions for double-bounce, and the vertical elements within canopies and canopy gaps, may explain the observed high importance of the VV backscatter in our prediction of canopy gap fraction.

Spatial Variability in Canopy Gap Fraction
Our experimental semi-variogram models showed that canopy gap distribution in the Bokito landscape is spatially correlated up to a distance of ≈18 m, unlike in the Efoulan landscape were spatial correlation in canopy gap is observable up to a distance range of ≈51 m. The semi-variogram for the canopy gap fraction in the forest-savannah transition landscape showed a combination of Gaussian and Spherical spatial distribution (Figure 13c). This is indicative of a short-scale continuity in canopy gaps in the landscape. The characteristic canopy tree and vegetation structure in the landscape may explain the variogram results. In most CAFS in the landscape, the top canopy stratum of shade trees comprise few, and often isolated, trees of the following predominant species: Ceiba pentandra (Figure 13e), Milicia excelsa; Canarium schweinfurthii, Ricinodendron heudelottii (Non-Timber Forest Product (NTFP) tree), Pycnanthus angolensis, Triplochiton scleroxylon, etc. The average crown diameter for mature specimens of these tree species is usually in the range of 15 to 25 m. For instance, based on forest inventory data in Nigeria, Reference [79] observed that the crown length of Milicia excelsa is in the range 3.6-27 m with a mean of 17 m. Thus, the spatial correlation in gap fraction at a range distance of ≈18 m may be contributed by the isolated large shade trees in CAFS.
For the forest landscape, the spatial distribution of canopy gap fraction follow an exponential semi-veriogram model (Figure 13d). There was spatial correlation in canopy gaps at long scales up to a distance of 51 m. The dominant canopy stratum in most CAFS, compared to the Bokito site, comprised a higher density of different timber and NTFP tree species (Figure 13f). Some common species identified on CAFS include: Pycnanthus angolensis, Irvingia gabonensis (NTFP), Milicia excelsa, Piptadeniastrum africanum Pentaclethra macrophylla, Albizia glaberrima, Tetrapleura tetraptera (NTFP), etc. A similar interpretation, therefore, stands that the spatial correlation in gap fraction up to distance of 51 m in the forest landscape is as a result of the typical spatial distribution of large shade trees in CAFS. The spatial correlation in canopy gap, however, applies more to cocoa agroforests with multi-storey shade trees and secondary forest areas as these were the main sampling areas. Field data on canopy gap fraction did not include other features of the landscapes, such as agricultural fields of other crop, bare land, and savannah vegetation (grass and shrub). Following the results from an experimental semi-variogram, variogram models may be further explored to ascertain the spatial distribution of canopy gaps at different spatial distances and directions.

Management Considerations for Canopy Gap Distribution
Observations in different landscapes have widely suggested that about 30-50% shade may suffice to sustain cocoa production in agroforestry systems [8,80,81]. Yet, the estimation of shade cover in multi-strata CAFS is daunting and challenging; often, farmers lack the appropriate technical skills to operate different forestry equipment for canopy cover assessment and are bound to rely on personal subjective observations of shade percentage in farms. Thus, for the large proportion of cocoa farmers, small-scale rural producers, such equipment widely do not serve the intended purpose. To circumvent such limitations towards appropriate shade management, an assessment of canopy gap distribution at a landscape scale is suggested in this study, which may guide management of shade trees and cocoa yield. We did not measure cocoa productivity in sampled farms; consequently, we do not directly relate farm productivity with canopy gap fraction estimates. However, over a one-year assessment, Reference [82] estimated the average potential yield of cocoa in the Bokito landscape to be 819 kg ha −1 , and the mean accessible yield of cocoa in mature CAFS of Central Cameroon (including Bokito, Ngomedzap, and Talba villages) was estimated to be 869 kg ha −1 yr −1 [45]. Thus, managers may relate, though not in a direct causal sense, the spatial distribution of canopy gaps with reported cocoa productivity in a said landscape.
Cocoa productivity is influenced by several ecological, economic, and environmental factors, which may vary from one production landscape to another. In ongoing joint stakeholders commitments towards zero deforestation cocoa, therefore, improving management strategies in relation to farm intensification, associated tree diversity, or farmers' technical knowledge [9,45,68] alone would be ineffective to sustain and enhance cocoa production. Climate parameters are currently widely at play in influencing farm productivity in cocoa landscapes. For instance, according to Reference [83], cocoa productivity in Bokito is more susceptible to climatic parameters compared to a production zone in the forest landscape. However, due to a high prevalence of the pest that causes cocoa black pod disease, Pythopthora megakarya, and related farm management challenges, CAFS in the forest landscapes, South of Cameroon, have reported low productivity [84]. Thus, managers and stakeholders may require a "solution basket" to sustain and improve cocoa production, and, top in the basket, there is need to map productivity and identify areas likely vulnerable to erratic climate parameters. We suggest an assessment tool for potential changes in canopy structure in cocoa landscapes, which can guide sampling and managing shade tree density and canopy groves in landscapes as such.

Conclusions
In this study, we assessed the relationship between the canopy cover, with focus on cocoa agroforests, and the C-band SAR backscatter intensity. Field inventory was conducted in two cocoa production landscapes in Cameroon that differ in vegetation structure, agricultural practice, and land tenure, amongst other livelihood and institutional difference.
Our results show that the co-polarization backscatter, VV, was more sensitive to canopy gap compared to VH polarization. Moreover, of the significant bootstrap regression predictors of canopy gap fraction, the distribution of VV backscatter varied between the aggregated land use/cover categories. Still, the VV backscatter intensity contributed most to the explained variance in canopy gap fraction by RF regression. Thus, we suggest the VV backscatter to be more sensitive to canopy gaps in cocoa agforests, and this co-polarization backscatter coefficients can be subject of focus to improve predictions from C-band SAR data. Based on the RF model prediction, we provide canopy gap fraction maps for the study sites. Noteworthy, the canopy gap prediction errors (RMSE and MAE) for the NN models were lower than the RF and stepwise regression; this may be a potential basis for further landscape analyses. The results show potential of S1 for canopy gap fraction prediction in CAFS, although further fine-tuning is needed to obtain a better prediction accuracy.
The NN model results can be explored further in either of the following ways, or both, depending on available time or resources: first, further tuning of NN layers, number of neurons in each layer, and model regularization parameters can reduce the high bias and prediction error of NN Tuned1; a second and less feasible procedure is collection of additional and sufficiently large sample data and tuning of model parameters may reduce the over-fitting of NN Tuned1 and hopefully retain its potential low prediction error. For the second approach, the needed training data may also be retrieved from multi-temporal and multi-polarization SAR backscatter [61].
Until now, literature mainly reports shade cover percentage or levels as a proxy for shade tree management in cocoa agroforests. For example, Reference [85] suggested that a shade level of above 30% may not provide expected outcome of sustained cocoa crop production and biodiversity. While such values provide useful guide for shade management at the scale of individual CAFS, shade tree management plans at landscape levels may require information on the spatial distribution of canopy cover. Such information is yet unknown for cocoa production landscapes in Cameroon. The distance range of spatial correlation in gap fraction provides new information for the landscapes. We envisage that this is a valuable basis for inventorying and managing tree shade cover in CAFS at the landscape scale. As example, a practical application is in monitoring canopy cover status in CAFS landscapes and designing sustainable management practices, especially for areas likely vulnerable to changing and extreme weather conditions as prolonging dry seasons.
We have conducted an analysis based on field data and the availability of Sentinel-1 SAR in the context of tropical agroforestry landscapes. Our analysis and results are not meant to provide a "black box" or generic application for all of such landscapes. We used state-of-the-art machine learning algorithms to explore the potential of free S1 data in a landscape with typical vegetation cover, type of anthropogenic action and land use, and poor data availability. Worthy of note, this study supports a proof-of-concept, from applying C-band SAR backscatter information, and the findings are comparable to others reported in literature [76,86]. Nonetheless, they are based on backscatter intensity from C-band microwave, which only penetrates forest canopy to a certain degree. We think that improved results can be achieved using data from SAR images acquired using longer wavelengths as L-or P-band. For instance, data from up-coming satellite missions, such as NISAR (NASA-ISRO Synthetic Aperture Radar), which will be launched, hopefully, in 2022, may be very handy for such applications. However, due to overlapping satellite flight orbits, S-1 data acquisition has a high frequency of SAR images collected over areas at higher latitudes than at the equatorial regions. Likewise, up-coming SAR mission may also be launched with predetermined flight orbits and specific duty cycles for prioritized locations (e.g., India for NISAR) and lower image acquisition frequency over certain parts of the globe. Hence, currently operational S-1 C-band SAR image archives will remain a valuable source of complementary remote sensing data to be explored for different applications.  Table S1: Summary of mean difference in SAR backscatter (σ0) across the DHP field sampling window period in the study study landscapes; Tables S2 and S3: Summary statistics of extracted SAR backscatter.  Acknowledgments: This study was conducted under the Special Research Fund of Ghent University for students from developing countries. The research was conducted under the guidance of Robert De Wulf. Field work was supported by the World Agroforestry Center in Cameroon. Field inventory was assisted by local resource persons, administrative personnel, and the cocoa farmers who granted access to their plantations. We thank the European Space Agency for the freely accessible archive of Sentinel-1 SAR images. We also thank to the developers (and contributors) of the R and Python data mining and programming software packages. We thank the four anonymous reviewers for their constructive comments and suggestions that helped to improve the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.