Evaluating Combinations of Sentinel-2 Data and Machine-Learning Algorithms for Mangrove Mapping in West Africa

Abstract: Creating a national baseline for natural resources, such as mangrove forests, and monitoring them regularly often requires a consistent and robust methodology. With freely available satellite data archives and cloud computing resources, such large-scale monitoring and assessment is now more accessible. Yet, few studies examine the reproducibility of such mangrove monitoring frameworks, especially in terms of generating consistent spatial extent. Our objective was to evaluate a combination of image processing approaches to classify mangrove forests along the coast of Senegal and The Gambia. We used freely available global satellite data (Sentinel-2) and a cloud computing platform (Google Earth Engine) to run two machine learning algorithms, random forest (RF) and classification and regression trees (CART). We calibrated and validated the algorithms using 800 reference points collected using high-resolution images. We further ran 10 iterations for each algorithm, utilizing unique subsets of the initial training data. While all iterations resulted in thematic mangrove maps with over 90% accuracy, the mangrove extent ranges between 827–2807 km² for Senegal and 245–1271 km² for The Gambia, with one outlier for each country. We further report "Places of Agreement" (PoA) to identify areas where all iterations for both methods agree (506.6 km² and 129.6 km² for Senegal and The Gambia, respectively) and thus have high confidence in predicting mangrove extent. While we acknowledge the time- and cost-effectiveness of such methods for landscape managers, we recommend utilizing them with utmost caution, along with post-classification on-the-ground checks, especially for decision making.


Introduction
Mangrove forests cover approximately 0.7% of tropical forest area around the world [1–4], in more than 118 tropical and sub-tropical countries. Yet, these forests can store three to four times more carbon per equivalent area compared to tropical forests [5]. In particular, mangrove forests in carbonate, peat-dominated settings are likely to store 25–50% more soil organic carbon compared to mangroves in deltaic and estuarine coastal settings [6]. These forests are also known to host 1.6% of the total tropical forest biomass (considering both above- and below-ground biomass) [7].
Remote sensing provides a time- and cost-effective approach for natural resource monitoring at large scales, especially at the national level. Particularly for mangrove mapping, remote sensing methods have been widely used. Most of these studies use optical satellite data, especially Landsat, due to its longer temporal coverage and ease of data accessibility. With the availability of active satellite data, many studies are increasingly utilizing radar data from sensors including the Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (ALOS PALSAR), RADARSAT-2, and the Shuttle Radar Topography Mission (SRTM) for quantifying mangrove extent and other biophysical characteristics [15,58–65]. While the availability of Sentinel-1 data from the European Space Agency (ESA) has shown promise for continued use of radar data in mangrove mapping in the coming years, historical land cover mapping and monitoring often need to rely solely on optical remote sensing data due to the lack of radar data before the 1990s.
In terms of methods, prior studies use a range of classification techniques such as iterative self-organizing data analysis (ISODATA) clustering, maximum likelihood classification (MLC), hybrid approaches, random forest (RF), classification and regression trees (CART), support vector machine (SVM), and object-oriented classification, among others [16,17,34,37–41,43,66]. With the advent of cloud computing platforms with free access to petabytes of geospatial data, such as Google Earth Engine (GEE), it has become increasingly accessible and straightforward to analyze enormous amounts of satellite imagery covering large regions [37,67–72]. While GEE offers more than 15 classification techniques, most studies rely on machine-learning algorithms [37,68–72], such as CART and RF, since these have proven to be among the more robust methods for land cover classification. Such methods based on free data and robust algorithms can be particularly beneficial for regular monitoring, including tracking Sustainable Development Goal (SDG) indicators.
Landscape managers in many developing countries struggle to establish a consistent methodology for SDG monitoring and assessments, both spatial and temporal. This is primarily due to the lack of computing resources required for method development using high-resolution satellite data, limited accessibility to high-resolution satellite data, and the challenging physical environment for collecting the data required for method calibration and validation. Such systematic monitoring is even more challenging for wetland forests due to the difficult terrain and often remote location. While significant advances have been achieved in satellite-derived monitoring in the recent past, few studies focus on how easily landscape managers can adopt the methodology for on-the-ground monitoring, or evaluate the performance of machine-learning algorithms in identifying mangrove extent in fragmented, heterogeneous, and rapidly changing landscapes. In this study, our objective is to evaluate freely available satellite data and machine-learning algorithms available on GEE, specifically RF and CART, to predict mangrove extent in the West African countries of Senegal and The Gambia. We evaluated the performance of these two classifiers by running 10 iterations for each, using Sentinel-2 images for 2017 and comparing the range of mangrove extent and accuracy across iterations. Our objective was to examine whether a simple framework that relies on freely available geospatial resources can provide consistent and reliable mangrove estimates, rather than to identify the best model parameters for generating the most accurate land cover map of the study area.

Study Area
Senegal, covering a land area of 192,530 km², is home to over 12.7 million people, nearly two-thirds of whom live in the coastal region. The country has a tropical climate and heavier vegetation in the southern part, whereas the northern part is dominated by desert and grasslands influenced by a Sahelian climate (Figure 1), with projected changes in wet and dry extremes in the coming decades [73]. Approximately 70% of the population depends on agriculture, which covers 46% of the land area [74] and is highly vulnerable to ongoing and future climate variability and change. The Gambia is the smallest country in mainland Africa, covering a land area of 10,120 km². Other than permanent wetlands and grasslands, the country is dominated by croplands (Figure 1), covering approximately 60% of the land area [74]. Table 1 includes estimates of land cover types as extracted from the MODIS land cover type global data product (MCD12Q1) [75]. The mangroves in this landscape are primarily located in the Sine-Saloum and Casamance Deltas, but smaller areas are also located near Dakar and in the northern end. The Sine-Saloum Delta is located north of The Gambia in the Sahelian climate zone, with an average rainfall of 450–920 mm per year. The Casamance Delta is located south of The Gambia within the Sudanese-Guinean climate zone, with rainfall between 800 and 1700 mm per year. The region experiences monsoonal rainfall between June and September [76]. The coastal regions of Senegal experience microtidal (<2 m), semi-diurnal tides [77].
The region is dominated by two mangrove species. Rhizophora racemosa occurs along the tidal channels, while Avicennia germinans is generally located further from the channel in tidal flats that tend to have higher salinities [78]. More recently, R. mangle has been planted during large restoration efforts across the country at a density of 5000 stems per hectare [79]. Just landward of the mangrove margins are barren mud flats with high salinity, creating a distinct separation between the mangroves and upland vegetation [78]. Across the country, the mean mangrove height is 6.9 m, with a maximum height of 11.9 m [80].

Satellite Data
We accessed Sentinel-2 level-1C assets on GEE provided by the European Space Agency (ESA). We used the top-of-atmosphere (TOA) reflectance data, which included radiometric and geometric corrections following the methods described in the Sentinel-2 User Handbook [81]. Specifically, we used the GEE function "ee.ImageCollection" to filter the time-series data for the calendar year 2017 and considered all bands with a spatial resolution of 10 m or 20 m (bands 2–8a, 11, and 12). The TOA reflectance data used in this study generally retain considerable atmospheric signals. For studies considering biophysical properties of vegetation, TOA reflectance data should be corrected for atmospheric signals and the resulting surface reflectance data should be used. However, we converted the TOA reflectance data into a categorical land cover map (thematic information) in this study, hence our findings should not be influenced by our data choice. We then calculated the Normalized Difference Vegetation Index (NDVI) and added the NDVI band to each image. A total of 4153 images were considered for the study period and region, with cloud coverage ranging from 0% to 100% and averaging 34%. An annual composite was generated using a 'quality mosaic' that selected the cloud-free greenest pixels [82–84]. In other words, for each pixel, the observation with the maximum NDVI value in the entire time series determined the reflectance values of the remaining bands in the annual composite, in order to capture the vegetation pixels at the same phenological stage. This method was repeated for both Senegal and The Gambia. The quality mosaic served as the input image for the classifiers (Section 2.4).
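The greenest-pixel logic described above can be illustrated outside GEE with a minimal sketch. It assumes Sentinel-2 band 8 (near infrared) and band 4 (red) as the NDVI inputs; the function names and the reflectance values are hypothetical and are not taken from the study's scripts.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def greenest_pixel(observations):
    """Return the observation with the highest NDVI in one pixel's time series.

    Each observation is a dict of band reflectances; all bands of the
    selected date are carried into the composite together.
    """
    return max(observations, key=lambda obs: ndvi(obs["B8"], obs["B4"]))


# Hypothetical TOA reflectances for one pixel on three dates in 2017.
series = [
    {"date": "2017-02-10", "B4": 0.12, "B8": 0.20},  # dry season
    {"date": "2017-07-05", "B4": 0.30, "B8": 0.32},  # bright, cloud-like
    {"date": "2017-10-18", "B4": 0.05, "B8": 0.45},  # peak greenness
]
best = greenest_pixel(series)
print(best["date"], round(ndvi(best["B8"], best["B4"]), 2))
```

Because clouds are bright in both the red and near-infrared bands, their NDVI is low, which is why selecting the maximum-NDVI observation also tends to reject cloudy dates.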

Training and Testing the Classifiers
We collected a total of 800 reference points for the four land cover classes (mangrove, water, other vegetation, and sand/soil) for the Senegal-Gambia landscape (Figure 2). We used both high-resolution images available on GEE and the greenest-pixel composite to facilitate reference data collection. We considered homogeneous patches of a specific land cover (i.e., the same land cover for at least 9 Sentinel-2 pixels) for collecting reference points. We avoided fragmented landscapes to minimize mixed-pixel issues and to avoid collecting reference points from the edge of a particular land cover. For this reason, we have fewer points in The Gambia. However, the landscape was classified as a whole, and not for each country separately. Hence, fewer training points in The Gambia should not severely affect the outputs, as long as land covers have similar spectral signatures in both countries. We used a stratified random sampling approach based on a visual assessment of the relative proportion of different land covers in the coastal zone of the study area, and collected 500 points for the mangrove class and 100 points for each of the other three land covers. We used half of the 800 points for training the classifiers (i.e., 'train' points on GEE), and the other half for accuracy assessment (i.e., 'test' points on GEE). We reported producer's, user's, and overall accuracy as well as the kappa (κ) coefficient [85]. Producer's accuracy is the complement of the error of omission, i.e., of the proportion of test points in the class being evaluated that were incorrectly classified in another category and thus omitted from the 'truth' class identified by the test points. User's accuracy is the complement of the error of commission, i.e., of the proportion of pixels that were incorrectly included in the class being evaluated.
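As a worked illustration of these metrics, the sketch below derives overall, producer's, and user's accuracy and the kappa coefficient from a confusion matrix. The matrix is hypothetical (400 test points split 250/50/50/50 across the four classes, mirroring the sampling design) and does not reproduce the study's results.

```python
def accuracy_metrics(matrix, classes):
    """matrix[i][j] = test points of reference class i that were mapped as class j."""
    n = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(len(classes))]
    row_tot = [sum(row) for row in matrix]         # reference totals per class
    col_tot = [sum(col) for col in zip(*matrix)]   # mapped totals per class
    overall = sum(diag) / n
    # Producer's accuracy = 1 - omission error; user's accuracy = 1 - commission error.
    producers = {c: diag[i] / row_tot[i] for i, c in enumerate(classes)}
    users = {c: diag[i] / col_tot[i] for i, c in enumerate(classes)}
    # Kappa compares observed agreement with the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa


classes = ["mangrove", "water", "other vegetation", "sand/soil"]
# Hypothetical confusion matrix (rows: reference, columns: mapped).
matrix = [
    [248, 0, 2, 0],
    [0, 40, 0, 10],
    [2, 0, 48, 0],
    [0, 12, 0, 38],
]
overall, producers, users, kappa = accuracy_metrics(matrix, classes)
print(round(overall, 3), round(kappa, 3))
print(round(producers["mangrove"], 3), round(users["water"], 3))
```

In this made-up example most of the confusion is between water and sand/soil, so those two classes pull the user's accuracy down while the overall accuracy stays above 90%.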

The Classifiers
The RF is an ensemble of tree-based classifiers where each classifier uses a random vector sampled independently from the original training set (Figure 3a), and each tree casts a vote for the most popular class [87,88]. The RF uses 'bootstrap aggregating' or 'bagging' [78], a method that draws N examples with replacement (where N is the size of the original input training data) to select the training data for each tree. Each pixel is assigned to a class based on the most popular vote from all tree predictors (Figure 3a). The number of trees and the number of variables per split in an RF classifier are defined by the user. Since the classifier performance is not sensitive to the number of variables per split [89], limiting this value to the square root of the number of input variables (a default value for RF in GEE and R statistical software) can help reduce the computational complexity and decrease correlation among the trees [89]. Unlike decision tree (DT) classifiers (such as CART), RF does not require pruning. However, RF classifiers are more complex than DT classifiers, are less intuitive due to this inherent complexity, and can be computationally intensive. We used 10 trees and the square root of the number of inputs for variables per split in this study. Since the objective of this study is to evaluate the performance of machine-learning algorithms that can be easily reproduced in developing countries with limited computing resources, and not to find the best parameters for these algorithms, we decided to generate simple yet robust models with the default parameters available on GEE.
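The two mechanisms described above, bagging and majority voting, can be sketched in a few lines. This is an illustration only and not the GEE implementation: the helper names are hypothetical, the count of 11 inputs assumes ten reflectance bands plus NDVI, and rounding the square root to the nearest integer is an assumption about how the rule is applied.

```python
import math
import random
from collections import Counter


def bootstrap_sample(points, rng):
    """Bagging: draw N points with replacement, where N = len(points)."""
    return [rng.choice(points) for _ in points]


def variables_per_split(n_inputs):
    """Square-root rule for the number of candidate variables at each split."""
    return max(1, round(math.sqrt(n_inputs)))


def majority_vote(votes):
    """Return the class that received the most votes across the trees."""
    return Counter(votes).most_common(1)[0][0]


rng = random.Random(42)
training_ids = list(range(400))          # ids of 400 training points
bag = bootstrap_sample(training_ids, rng)

# Hypothetical votes cast by 10 trees for a single pixel.
tree_votes = ["mangrove"] * 6 + ["other vegetation"] * 3 + ["water"]
print(len(bag), len(set(bag)))
print(variables_per_split(11), majority_vote(tree_votes))
```

Each bootstrap sample has the same size as the training set but contains repeated points, so every tree sees a slightly different subset of the reference data.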
The CART is a decision-rule based classifier that operates in a tree-structured decision space (Figure 3b) and is a modern-day analog to the DT approach [90]. Within CART, input data are recursively split at each decision node (Figure 3b), also known as a greedy splitting approach, based on a statistical test (such as the Gini index) to increase the homogeneity of the training data in the resulting nodes. Since a complex tree runs the risk of overfitting, thus reducing the accuracy of the classified output, pruning the tree (i.e., removing the tree sections that do not contribute to increased accuracy) is an important step in a DT classifier. One known limitation of CART is high variance across samples, leading to high variability in predicted classes and estimates [91]. For the CART classifiers used in this study, we used the default value of 10 for both the cross-validation factor for pruning and the maximum depth of the tree (i.e., the maximum level to which the initial tree can grow), in order to minimize computational resource usage, which is often a limitation in many developing countries. A standard error threshold of 0.5 was used to determine the simplest tree with an accuracy comparable to the minimum cost-complexity tree.
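A single greedy split based on the Gini index can be sketched as follows. This is a simplified illustration for one input variable with hypothetical NDVI values and labels; it omits recursion, pruning, and cross-validation, and it is not the GEE CART implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Greedy search for the threshold on one variable that minimizes the
    size-weighted Gini impurity of the two child nodes."""
    n = len(values)
    best = (None, gini(labels))  # no split: impurity of the parent node
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical NDVI values for six training points.
ndvi_values = [0.05, 0.10, 0.15, 0.70, 0.75, 0.80]
labels = ["sand/soil"] * 3 + ["mangrove"] * 3
threshold, impurity = best_threshold(ndvi_values, labels)
print(gini(labels), threshold, impurity)
```

Because the split is chosen to fit the training points at hand, a different training subset can move the threshold, which is one way to picture the high variance of CART noted above.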
The output classified images for both classifiers have a spatial resolution of 20 m. We calculated the area under mangroves by multiplying the number of pixels classified as mangrove by the cell size. We further calculated the area under the "places of agreement" (PoA) that were classified as mangrove pixels by both algorithms (Figure 3c). All output images were further clipped to the low elevation coastal zone (LECZ) with elevation ≤40 m, a criterion widely used to define the LECZ (e.g., see [16]). All analyses were performed using GEE and ArcMap 10.5.1.
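The area and PoA calculations reduce to simple arithmetic, sketched below under the assumption of square 20 m pixels (400 m² each). The pixel sets are hypothetical, and the pixel count in the last line is back-calculated from the reported Senegal PoA area purely to show the unit conversion.

```python
def mangrove_area_km2(n_pixels, cell_size_m=20):
    """Area = pixel count x pixel footprint, converted from m^2 to km^2."""
    return n_pixels * cell_size_m ** 2 / 1_000_000


def places_of_agreement(masks):
    """Pixels classified as mangrove in every mask (set intersection)."""
    return set.intersection(*masks)


# Hypothetical mangrove masks, each a set of (row, col) pixel indices.
rf_mask = {(0, 0), (0, 1), (1, 1)}
cart_mask = {(0, 1), (1, 1), (2, 2)}
poa = places_of_agreement([rf_mask, cart_mask])
print(sorted(poa), mangrove_area_km2(len(poa)))

# 1,266,475 pixels of 20 m correspond to 506.59 km^2.
print(mangrove_area_km2(1_266_475))
```

The same intersection applies unchanged to any number of masks, for example the 10 iterations of one classifier or all 20 iterations together.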

Model Cross-Validation
For each of the two classifiers, we ran 10 iterations to determine the range of accuracies of the classified maps (Figure 3c). To do so, for each iteration we randomly selected 400 training points out of the 800 total points using stratified random sampling in ArcMap 10.5.1, and then trained the classifiers on GEE. In other words, each iteration had its own unique set of 400 points for training and 400 points for testing. We used the same iteration-specific training and testing sets for both classifiers.
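The repeated splitting can be sketched as below. This is not the ArcMap procedure used in the study: it assumes that stratification means taking half of each class, uses integer ids in place of point geometries, and uses the iteration number as a random seed.

```python
import random


def stratified_split(points_by_class, train_fraction=0.5, seed=0):
    """Randomly split each class separately so that class proportions are
    preserved in both the training and the testing set."""
    rng = random.Random(seed)
    train, test = [], []
    for label, points in points_by_class.items():
        shuffled = list(points)
        rng.shuffle(shuffled)
        k = int(len(shuffled) * train_fraction)
        train += [(p, label) for p in shuffled[:k]]
        test += [(p, label) for p in shuffled[k:]]
    return train, test


# 800 reference points: 500 mangrove and 100 for each other class.
reference = {
    "mangrove": range(0, 500),
    "water": range(500, 600),
    "other vegetation": range(600, 700),
    "sand/soil": range(700, 800),
}

# One split per iteration; both classifiers would share the same split.
splits = [stratified_split(reference, seed=i) for i in range(10)]
train, test = splits[0]
print(len(train), len(test))
```

Reusing the same split for RF and CART within an iteration means that differences between the two maps reflect the classifier rather than the training data.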

Accuracy Assessment
Table 2 lists accuracies per class (both producer's and user's accuracy) for the two algorithms used in this study. We report the average accuracy with standard deviations using all 10 iterations for both algorithms (Table 2). The mangrove class has the highest user's accuracy among the four land cover classes for both algorithms (99.2%–99.56%), closely followed by the 'other vegetation' class (95.7%–96.74%). Both classifiers were only moderately successful in distinguishing between water and the sandy soil often present along the river (user's accuracy ranging between 71.49%–75.61% for water and 75.28%–77.53% for sand/soil). The producer's accuracy follows the same per-class pattern. The average overall accuracy of the RF-generated classified image is 93.44% (κ = 0.89), while that of the CART-generated image is 92.18% (κ = 0.86) (Figure 4).

Mangrove Extent in Senegal and The Gambia
While all 20 iterations for the two classifiers show over 90% overall accuracy, the resulting maps vary widely in terms of mangrove extent (Figure 4). The extents predicted by CART have a wider range and a lower average than those predicted by RF for both countries (Figure 4). The 10 iterations for RF and CART agree on 714.28 km² and 507.26 km² of mangroves in Senegal, respectively (Figure 5a). Similarly, the overlapping mangrove areas are 237.14 km² and 131.05 km² for The Gambia as per RF and CART, respectively (Figure 5a). Both classifiers agree on approximately 506.59 km² under mangrove cover in Senegal, whereas the PoA cover around 129.64 km² in The Gambia (Figure 5b).
For Senegal, mangrove extent varies from 990 km² to 2726 km² according to the 10 iterations of the RF classifier (Figure 5c), while that for CART varies from 826 km² to 4396 km², with an outlier of 15,352 km² (Figure 5d). For The Gambia, mangrove extent ranges from 340 km² to 964 km² as per RF (Figure 5c), and from 245 km² to 1271 km², with an outlier of 3630 km², as per CART (Figure 5d).

Discussion
The objective of the current work was to evaluate open data and tools for regular monitoring, especially suitable for in-country technicians who can complete the analysis and share results with landscape managers and decision-makers, who can then make an informed decision. We ran ten iterations for each of the algorithms (20 in total) to derive the range of accuracies and mangrove extent generated by these algorithms. In general, this cross-validation indicates agreement among the RF iterations, suggesting robustness of this classifier. Seven out of ten RF iterations for Senegal predict mangrove extent in the range of 2050–2726 km², with one iteration predicting less than 1000 km² (Figure 4). For The Gambia, the RF predictions have a wider range, with three iterations predicting mangrove extent of less than 360 km², five iterations predicting in the range of 670–760 km², and two iterations predicting over 800 km² of mangrove cover (Figure 4). The CART iterations predict a wider range of mangrove extent than the RF iterations for both countries. For Senegal, while three iterations predict approximately 1000 km² of mangroves, three iterations predict in the range of 1500–1800 km². One CART iteration grossly overestimates mangrove extent (15,352 km²; Figure 5d) even though it had a high overall accuracy (93.5%), likely due to the inherent CART property of wide variability in predicted estimates [91]. This pattern of gross overestimation by at least one CART iteration also holds true for The Gambia, where one iteration predicted 3630 km² of mangrove extent (Figure 5d). Four out of ten iterations predicted ≤400 km² of mangroves in The Gambia, while three iterations predicted mangroves in the range of 550–610 km². These findings indicate that the same satellite data-classifier combination, based on a discrete classification of Sentinel-2 pixels, can provide a range of estimates for mangrove extent (Figure 4), running into the problem of over- or under-prediction. In this study, this wide range of estimates is an artifact of the different sets of training/testing points, which underscores the importance of collecting reference points from 'pure pixels', i.e., avoiding mixed environments such as swampy savannas or salty bare soils. This lack of consistency across iterations, along with occasional overestimations, also highlights the need for cross-validation in mapping and monitoring natural resources so as to avoid false positives. There are several ways of achieving higher accuracy, including tuning the classifiers by identifying the most accurate parameters, and collecting a highly reliable set of training/testing points. However, our findings also highlight the need to examine the outputs more carefully, and not just focus on the accuracy of the output maps.
One way to avoid false positives would be to run multiple iterations of the selected classifier and then identify the PoA. In this study, we first quantified PoA for each classifier separately, and then quantified PoA between the two classifiers (Figure 5b). However, for landscape managers, the spatial distribution of PoA might be more meaningful than the overall area. The PoA approach described here can serve as an a priori data set and help managers by directing their efforts toward validating the regions where the iterations did not agree. In terms of SDG reporting, it will be helpful to include a spatially explicit uncertainty map with each annual report to highlight regions with high confidence in prediction but, more importantly, regions with high uncertainty because of the wide range in model predictions. Another advantage of this PoA approach is its flexibility in terms of input satellite data. While we used Sentinel-2 optical data, Sentinel-1 or radar data from other sources might be better suited for regions with persistent cloud cover, such as tropical countries in Africa or Asia. Decision-makers interested in examining long-term change trajectories might also consider utilizing the entire Landsat record (also available on GEE).
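One possible way to turn a stack of iterations into the kind of spatially explicit uncertainty layer suggested above is to record, for each pixel, the fraction of iterations that mapped it as mangrove. The sketch below is an assumption about how such a layer could be built, not a method used in the study; the function names, the three labels, and the pixel ids are hypothetical.

```python
def agreement_fraction(masks, pixels):
    """For each pixel, the fraction of iterations that mapped it as mangrove."""
    n = len(masks)
    return {p: sum(p in mask for mask in masks) / n for p in pixels}


def confidence_label(fraction):
    """Collapse the agreement fraction into three reporting categories."""
    if fraction == 1.0:
        return "mangrove (PoA)"
    if fraction == 0.0:
        return "non-mangrove"
    return "uncertain"


# Hypothetical mangrove masks from four iterations over four pixels.
pixels = ["a", "b", "c", "d"]
masks = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"a", "b"}]
fractions = agreement_fraction(masks, pixels)
labels = {p: confidence_label(f) for p, f in fractions.items()}
print(fractions)
print(labels)
```

Pixels labeled "uncertain" are the ones where field validation effort would be most informative, while the fraction itself gives a graded view of how strongly the iterations disagree.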
It should be noted that any direct comparison between available datasets using other input data or methods should not be used for tracking national-level progress, or lack thereof. To the best of our knowledge, the most updated mangrove estimates for these countries are from 2016, derived from a 2010 baseline (Table 3) developed under the Global Mangrove Watch (GMW) project [1]. That study used ALOS PALSAR and Landsat data to generate a baseline for 2010, and then used JERS-1, ALOS PALSAR, and ALOS-2 PALSAR-2 to quantify changes between 1996 and 2016 from the 2010 baseline. As per this dataset, mangrove extent in 2016 is 1288.17 km² and 577.48 km² in Senegal and The Gambia, respectively. Another recent study [93] provided 2012 mangrove estimates for Senegal and The Gambia (Table 3), among other countries, based on a combined estimation derived from three existing datasets: the global forest gains/losses and fractional cover [94], mangrove forests of the world [38], and terrestrial ecoregions of the world [95]. The primary methodological difference between [93] and other existing data sets (Table 3) is that [93] assigns a sub-pixel percentage value to each mangrove pixel identified by the discrete classification system adopted in other studies, thus reporting only a fraction of the mangrove extent compared to other estimates that were based on presence/absence of mangrove pixels. The authors of [93] pointed out that the discrete classification approach is often plagued with overestimation. However, it is challenging, if not almost impossible, to adopt a continuous classification approach for SDG reporting, because such an approach needs a historical comparison, and the known accuracy or detailed field data from prior years are often missing. It is also important to consider mangroves as an ecosystem, where mud banks, salty bare soils, and tidal channels are considered integral parts of the ecosystem. However, such detailed mapping often requires a prolonged field data collection effort, which might not be a pragmatic recommendation everywhere. The PoA approach presented here relies on input data of a finer spatial scale (20 m) and provides a middle ground between a complicated continuous mapping effort and the overestimation resulting from discrete classification based on coarser-resolution input data. Even though internet connectivity issues in many countries might still prevent in-country analysts from generating and/or displaying maps on GEE, with freely available data, robust methods, and cloud computing platforms, it is easier than ever to conduct regular monitoring of natural resources. However, our findings indicate that solely reporting estimates, without uncertainty attached to the report, could lead to erroneous decision-making. It is thus our recommendation to consider the spatial distribution of the ecosystem of interest (mangrove forests in this study), and to report a confidence map or uncertainty analysis along with the thematic map. Such an approach not only looks beyond the accuracy assessments of thematic maps but can also reduce the necessity for post-classification on-the-ground validation.

Conclusions
Landscape managers often need to conduct repeated monitoring of natural resources, such as mangrove forests, on an annual or semi-annual basis. Such monitoring using consistent methods is even more important for documenting progress at a national level, e.g., tracking SDG environmental indicators. Satellite data are freely available on cloud computing platforms, along with easily implementable robust methods, such as machine-learning algorithms including RF and CART, thus offering unprecedented ease in processing enormous amounts of data with no local requirement for advanced computing resources. While many prior studies have used such approaches in quantifying mangroves and other natural resources, our findings indicate that predictions can have a wide range depending on the classifier and the set of training and testing data used. At least one iteration for each of the classifiers used in this study grossly overestimated mangrove extent in Senegal and The Gambia. Hence, such an approach must also include a confidence map to avoid the risk of under- or over-predicting mangrove extent. We acknowledge the potential of utilizing such nearly automated approaches in decision making over a larger region, but recommend using them with uncertainty analysis, especially in heterogeneous landscapes.

Figure 1. Land cover types and remotely sensed surface features in the study area. Data on land cover types were extracted from the MODIS land cover type global data product (MCD12Q1.006; spatial resolution: 500 m) following the International Geosphere-Biosphere Programme (IGBP) classification for 2016 using the Google Earth Engine (GEE) platform. Inset map shows the location of the study area within Africa. Maps were created in ArcGIS 10.5.1.

Figure 2. Spatial distribution of training and testing points from one of the 10 iterations against the backdrop of very high-resolution satellite data (spatial resolution: 1 m) provided by [86].

Figure 3. Methodological framework used in this study. Panels (a) and (b) show the internal structures of a random forest (RF) classifier (modified after [92]) and a classification and regression tree (CART) classifier, respectively. Panel (c) shows the overall workflow with input data (Sentinel-2 top-of-atmosphere (TOA) reflectance data, along with reference points for model training and testing), Google Earth Engine (GEE) classifiers, and output maps, including the places of agreement (PoA) map.

Figure 4. Left panel shows the range (n = 10 for each classifier) of overall accuracies for the classified maps generated by the random forest (RF) and the classification and regression trees (CART) classifiers. Right panel shows the range of mangrove estimates generated by the RF and CART classifiers for Senegal and The Gambia in 2017.

Figure 5. Spatial distribution of mangrove forests in Senegal and The Gambia as classified by different iterations of the two classifiers: (a) mangrove extent identified by all iterations in Random Forest (RF) and Classification and Regression Trees (CART) showing places of disagreement; (b) Places of Agreement (PoA) between the two classifiers from a total of 20 iterations; (c) spatial distribution of mangroves from the RF iterations with maximum and minimum extent; (d) spatial distribution of mangroves from the CART iterations with maximum and minimum extent.

Table 1. Proportion of area under different land cover types in 2016 (in %) as extracted from [75] for Senegal and The Gambia.

Table 2. The range of per-class producer's and user's accuracies and overall accuracy for the classifications generated by random forest (RF) and classification and regression trees (CART).

Table 3. Comparison of mangrove extent estimates for Senegal and The Gambia derived from existing data sets and from the work presented here. The estimates from the existing data sets are approximations and were extracted from the original data sets using ArcMap 10.5.1.