Fodder Biomass Monitoring in Sahelian Rangelands Using Phenological Metrics from FAPAR Time Series

: Timely monitoring of plant biomass is critical for the management of forage resources in Sahelian rangelands. The estimation of annual biomass production in the Sahel is based on a simple relationship between satellite annual Normalized Difference Vegetation Index (NDVI) and in situ biomass data. This study proposes a new methodology using multi-linear models between phenological metrics from the SPOT-VEGETATION time series of Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) and in situ biomass. A model with three variables—large seasonal integral (LINTG), length of growing season, and end of season decreasing rate—performed best (MAE = 605 kg·DM/ha; R 2 = 0.68) across Sahelian ecosystems in Senegal (data for the period 1999–2013). A model with annual maximum (PEAK) and start date of season showed similar performances (MAE = 625 kg·DM/ha; R 2 = 0.64), allowing a timely estimation of forage availability. The subdivision of the study area in ecoregions increased overall accuracy (MAE = 489.21 kg·DM/ha; R 2 = 0.77), indicating that a relation between metrics and ecosystem properties exists. LINTG was the main explanatory variable for woody rangelands with high leaf biomass, whereas for areas dominated by herbaceous vegetation, it was the PEAK metric. The proposed approach outperformed the established biomass NDVI-based product (MAE = 818 kg·DM/ha and R 2 = 0.51) and should improve the operational monitoring of forage resources in Sahelian rangelands.


Introduction
Livestock farming is the most widespread human activity and most dominant form of land use in rangeland ecosystems [1]. Worldwide, it contributes 40% of the agricultural gross domestic product, and provides income for more than 1.3 billion people and nourishment for at least 800 million food-insecure people [2]. In the West African Sahel in particular, livestock is the primary renewable resource [3] and the region is characterized by the extensive use of pastures in rangelands. Rainfall here is considered to be the main driving factor responsible for fluctuations in natural vegetation [4][5][6] and, therefore, fluctuations in grazing stock, for which growth is limited to 3-4 months of the rainy season (between July and October), are closely linked to rainfall. Because of unpredictable intra-annual variations in rainfall, the vegetation is often exposed to a water shortage in the growing season, sometimes leading to severe droughts, food shortages, and production deficits [7]. Given the random and recurring environmental stress, estimating plant biomass in the rangelands is important in assessments of livestock fodder availability [8]. Depending on the needs of the agricultural monitoring systems, however, estimates should be provided as early as possible in the growing season so that stakeholders can take early decisions and identify areas with large variation in (and potential for) vegetation productivity [9].
The estimation of biomass production using remote sensing data has been the subject of many studies in the Sahel [10][11][12][13][14][15][16]. The Normalized Difference Vegetation Index (NDVI) is the most commonly used satellite index in the region for quantifying the temporal monitoring of vegetation [17]. Another widely used indicator is Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), recognized as a key variable in the assessment of vegetation productivity [13,18,19]. FAPAR is defined as the fraction of radiation absorbed by the green vegetation elements in the 400-700 nm spectral domain under specified illumination conditions [20]. It is directly linked to photosynthesis and therefore expresses a canopy's energy-absorption capacity [21]. Many authors have studied the relationship between FAPAR and NDVI [7,[22][23][24][25][26][27], which has been shown to be generally linear for green vegetation, particularly in the semi-arid environment of the Sahel [28,29].
The Centre de Suivi Ecologique (CSE) has been estimating total annual biomass in Senegal on an operational basis since 1987 in order to monitor fodder availability in the national pastoral rangelands. The method used is inspired by an approach proposed in 1983 by [10] and is based on an empirical relationship between the satellite seasonal integrated NDVI and in situ measured biomass at the end of the growing season. The same approach is used by Niger's Ministry of Livestock and Animal Industries to assess forage availability in the rangelands at the end of the season. Although very useful for the spatial management of pastoral resources, this approach based on a simple linear regression has several limitations, including: (i) its temporal scale is restricted to biomass data of the ongoing year not being used again in the following year, which leads to high costs for annual data collection; (ii) its low predictive potential is due to the large time gap between data collection (mid-October) and published results (late December); and (iii) this is a single-predictor model, which generally omits details of reality [30].
The satellite-derived seasonal metrics of plant phenology and seasonal dynamics in reflectance and greenness [31] could be very useful for establishing multiple-predictor models for estimating plant biomass production. Many authors have endorsed the applicability of these phenological metrics for crop [9] and grassland [32] phenological monitoring, particularly for analyzing African farming systems [33]. Recently, phenological metrics based on the FAPAR time series were recommended by [34] for providing relevant early information on biomass production in the Sahel. When associated with ground biomass data, such variables would enable early warning models to be established and would improve the use of early warning systems (EWS) in planning emergency responses and food aid interventions in the case of a forthcoming crisis [35].
A significant relationship (p < 0.001) between the field measurements of herbaceous biomass and the cumulated FAPAR over the growing season was found by [34] in Senegal, although only 34% of the in situ biomass was explained by the linear regression. The study focused only on herbaceous biomass, however, and recently [36] highlighted the importance of woody plants in the satellite time series of vegetation indices in the Sahelian ecosystems.
In this context, the present study sought to develop an operational system for monitoring total biomass, including both herbaceous and woody leaf biomass, rather than focusing exclusively on using remote sensing data to forecast vegetation productivity [35,37]. The proposed method is based on multiple linear regression models using phenological variables derived from the seasonal dynamics of the FAPAR SPOT-VEGETATION time series and ground measurements of total biomass production collected in different Sahelian ecosystems in Senegal over 15 years. The specific objectives of the study were to: (i) determine the proper metrics and establish the best model for total biomass estimation across the study area; (ii) verify whether or not the disaggregation of the overall dataset into ecoregions and related metrics improved the estimates; and (iii) validate the multiple-predictor models developed, including an early warning model, based on the comparison with ground measurements as well as with the CSE biomass product that is currently used for fodder-monitoring in Senegal.

Study Area and Ground Control Sites
The study area covers the northern and eastern parts of Senegal between 12.9°N and 16.8°N latitude and 12.3°W and 16.5°W longitude (an area of about 98,556 km² ; Figure 1) and includes both the natural grazing and pastoral areas of the country. The mean annual rainfall varies between 250 and 750 mm from north to south, based on the average rainfall estimates (RFE) time series (1999-2013) of the Famine Early Warning Systems Network (FEWS Net) with 8 km spatial resolution [38]. The terrain is relatively flat, with a maximum altitude of 163 m in the southeastern part of the study area.
The CSE established 36 ground control sites in 1987 for monitoring biomass at the end of the growing season in Senegal's pastoral areas. Of these sites, 24 were monitored until now and are in relatively homogeneous vegetation zones of 3 × 3 km 2 , making them ideal for comparison with moderate/coarse resolution remote sensing data. They are representative of the main geomorphological forms of sampled landscapes [12]. In this study, these 24 ground control sites were grouped into four ecological zones: Northern Sandy Pastoral Region (ECOnorth), Ferruginous Pastoral Region (ECOeast), Mixed Pastoral-Agricultural Region (ECOwest), and Eastern Transition Region (ECOsouth). The ECOwest area corresponds to a combination of the Southern Sandy Pastoral Region (north) and Agricultural Expansion Region (south) of the initial classification proposed by [39]. Detailed information on the ecoregions used is provided in Table 1 (see also [36,39]).

Biomass Data Collection
The in situ biomass data used in this study were obtained from the CSE database. The data covered the 1999-2013 period, apart from 2004 when no data was collected. The biomass data were not regularly collected for all monitoring sites due to occasional lack of logistics or the early passage of bush fires before planned conductance of field campaigns. Biomass measurements were conducted annually at the end of the growing season (late October), separately for the herbaceous and woody layers. The herbaceous and woody leaf biomass values were subsequently added together to provide an estimate of total biomass production.

Herbaceous Biomass Collection
The collection technique used for the herbaceous layer was the stratified sampling line developed by the International Livestock Centre for Africa (ILCA) for monitoring pastoral ecosystems in Mali's Gourma region [40]. Taking a 1000 m transect, each meter on one side of the measuring line is allocated to one of four density/production strata, ranging from 0 to 3: 0 = bare soil; 1 = low production; 2 = medium production; and 3 = high production. Then, between 35 and 100 plots of one square meter are chosen randomly along the transect, taking account of the variability of different strata. The biomass in each plot is cut close to the ground and weighed with a precision scale. After three strata (low, medium, and high production) are re-sampled, only three samples of about 200 g of fresh material are taken back to the laboratory and dried in an oven (three samples for each strata = nine samples for the sites). They are dried for 48 h at 110 °C in order to obtain the dry matter.
The dry matter rate, obtained by dividing the dry weight of the sample by the green weight, is then integrated into an equation for calculating the herbaceous biomass production. For a given stratum, the calculation equation is written as follows: Where Ph is the dry herbaceous biomass (kg· ha −1 ) in the stratum, fr is the relative frequency of the strata along the transect, pm is the average green weight (g· m −2 ) measured in the field, ms is the dry matter rate, and 10 is a conversion factor for translating g· DM/m² into kg· DM/ha.
The herbaceous dry matter biomass production of the site is given by adding together the biomass production of all three strata (low, medium, and high).

Woody Leaf Biomass Collection
The leaf biomass of trees and shrubs was sampled for each site in two steps, one repeated every 2 years and one annually. In the first step, every 2 years, four circular plots were delineated and centered 200, 400, 600, and 800 meters from the 1000 m-long transect. The plot size depended on vegetation density and varied from 1 to 1 /16 ha. The plots tended to be larger in the open land that characterizes the Sahelian area in the north of the study area, and relatively smaller in the North Sudanian domain where the woody vegetation cover is denser. Within these plots, the parameters measured were individual height, number of live trunks (some species, such as Guiera senegalensis and Boscia senegalensis, can have several trunks), plant cover, circumference at the base of the trunk, and phenological state (flowers, fruits, etc.). In the second step, the most representative species were sampled annually and, for each species type, 10 twigs were defoliated and fresh leaf biomass weighed. About 200 g of fresh leaves were dried and weighed in order to determine the leaf dry matter content. The primary foliage production of a given site reflected the total amount of leaf biomass produced for all trees and shrubs. Therefore, for an individual plant belonging to a given species, the primary foliage production (Pi) in kg was reached via an allometric relationship that integrates its circumference at the base. This relationship has been established for certain species of trees in the Sahel [12,16] as a result of the work done by [41] and [42] in Mali. The expression is written as follows: where a and b are two constants, depending on the species, and C is the base circumference of the trunk in cm. The primary foliage production in kg of one species (Pe) in the four sampling plots of the site is obtained using the following formula where n represents the number of individual plants inventoried: The correction of this primary foliage production (including fruits, if any) [41] into foliar production (Pf) (i.e., material that is available for the ongoing season and can be eaten by livestock as fodder) is done using the formula: where Pve is the average weight of fresh foliar biomass (kg) of the 10 twigs, Mse is the dry matter rate in %, Ps0 is the standard average dry weight of foliar biomass (kg) of 10 twigs (reference value), and S is the total area of the four sampling plots in hectares. Finally, the leaf biomass of one site was obtained by adding together the foliar production of all the inventoried woody species.

Filtering the Ground Dataset
The ground measurements conducted by different people over 15 years were subject to uncertainty in sampling, measuring, and post-processing, which inevitably resulted in some unrealistic outliers. The post-processing of data (typing handwritten values into digital values) was never quality-checked, thus, in order to filter out obviously erroneous observations related to typos, etc., the datasets went through a two-step filtering: (1) first, the data were examined through an exploratory analysis of all observations using a boxplot with boundaries that were ±1.5 times the interquartile range and (2) since biomass production in Sahel is highly dependent on rainfall ( Figure 2), a second filtering of biomass estimates was conducted with outliers identified in the first step to remove unrealistic values that showed no relation to the rainfall estimates obtained from FEWS Net. This rainfall data has a resolution of 8 km and is thus able to capture the heterogeneous pattern. Only those observations for which the difference between the anomalies of biomass and annual rainfall was less than 65% were selected. This threshold value was determined by referring to the total number of biomass anomalies explained by the linear regression with rainfall anomalies (  . Note that the relationship between rainfall and biomass is strengthened when woody leaf and herbaceous biomass are added together.

CSE Biomass Product
The total biomass estimated from the CSE based on a NDVI single-predictor model was used here for comparison purposes. The CSE biomass estimates are derived annually at the end of the growing season by a simple linear regression between the integrated NDVI over the growing season and ground biomass measurements for the corresponding year. The growing season is determined by the visual analysis of dekadal (10-day) images throughout the May-October period. The start of the season corresponds roughly to the dekadal in May that displays generalized plant growth for the whole study area and the end of the season is fixed at the third dekadal of October. Between 1987 and 1999, this method was implemented using the seasonal integrated NDVI (i.e., seasonal weighted average) from the Advanced Very High Resolution Radiometer (AVHRR) of the National Oceanic and Atmospheric Administration (NOAA) satellites acquired in Local Area Coverage (LAC) format at the CSE receiving station in Dakar. Since 2000, the 1-km NDVI S10 products derived from atmospherically corrected SPOT-VEGETATION reflectances based on the maximum composition value (MVC) algorithm at 10-day intervals have been used [43]. Here, we used only the CSE biomass estimates from 2000, which are based on the same SPOT-VEGETATION satellite sensor as our biomass estimates. It has to be noted that the aim of the research presented here was not a direct one-to-one comparison between models, since different datasets were used for their calibration. On the contrary, the comparison is done at the level of biomass output estimates and the CSE biomass product represents the reference product used operationally for fodder-monitoring in Senegal.

Phenological Metrics from FAPAR Time Series
We used the 1999-2013 time series of the GEOV1 Copernicus Global Land FAPAR product derived from the SPOT-VEGETATION instrument. The products are freely available at a 1-km resolution and 10-day intervals on http://land.copernicus.eu/global. The principles for generating this FAPAR dataset are based on the use of neural networks that were first trained with CYCLOPES reflectances and existing MODIS and CYCLOPES FAPAR products [44]. In order to take benefit from their complementarities, MODIS and CYCLOPES products were fused by assigning higher weights to the MODIS (CYCLOPES) product for the high (low) FAPAR values, with weight being driven by previous validation studies [45,46]. The trained neural network algorithm was then applied using the directionally normalized top of canopy reflectance in the red, near infrared, and shortwave infrared bands, and the cosine of the sun zenith angle at the time of satellite overpass (i.e., about 10:30 for VEGETATION). For details on the algorithm used to estimate GEOV1 products, we refer to [20]. Recent validation studies indicated that GEOV1 outperformed MODIS, CYCLOPES, and JRC-SeaWIFS FAPAR products in both accuracy and precision [47].
Phenological metrics were derived from the GEOV1 FAPAR time series. Although compositing techniques were used for synthesizing dekadal GEOV1 products, these could have been affected by artifacts related to the presence of clouds [48] and residual atmospheric effects [49]. Prior to using them for phenological detection, therefore, filtering the noise was essential [47]. This is particularly important in the Sahel zone where the vegetation has a rapid phenological cycle associated to the short rainy season when most of the optical satellite observations are missing or affected by noise [50]. GEOV1 FAPAR time series were filtered using the Savitzky-Golay (SG) fitting method available via TIMESAT software [51]. SG is a simplified least squares fit convolution that can be understood as a weighted moving average filter with weighting given as a polynomial of a certain degree [49]. This filtering method was demonstrated to improve other existing temporal filters for reconstructing satellite time series in terms of the accuracy as compared to the original data by ensuring robustness to noise and missing data, while preventing over-smoothing [52][53][54]. The accuracy of the estimation of the timing of phenological stages for the start, maximum, and end of the season as derived from the Savitzky-Golay reconstructed time series is about 10 days for the sites with a percentage of missing data <10% (which is mostly the case in the study area, except in some regions mostly located in the south close to the Gambia River; Figures A1 and A2) but shows a rapid decay with the percentage of missing data and the length of the period with missing data [52]. The key parameters used in the TIMESAT SG filter are: seasonal parameter = 0.5 (to fit one season per year), number of envelope iterations = 2 (number of iterations for upper envelope adaptation), adaptation strength = 2 (strength of the envelope adaptation, maximum = 10), and window size = 4 (half-window for SG filtering, also defining the degree of smoothing) [55]. The beginning and end of the growing season were estimated at 20% and 50% of the FAPAR seasonal amplitude. These latter values were chosen after a qualitative analysis was conducted of the growth profile for different pixels throughout the study area. The result was a smoothed curve fitted to the upper envelope of the FAPAR values of the time series. From the smoothed profiles, 11 phenological metrics were calculated (Table 2 and Figure 3). All metrics were computed at the pixel scale (1-km spatial resolution) for each year between 1999 and 2013. The annual values were then averaged for each site over a 3 × 3 pixel window to match, as far as possible, the spatial sampling of ground data.  Phenological metrics are represented on the ECOsouth curve. For acronyms see Figure 1 and Table 2.

Reduction of Explanatory Variables and Model Development
In order to understand the relationship between phenological metrics and total biomass across the Senegalese rangelands, a Pearson correlation analysis was performed on the whole dataset (n = 263). Only those variables that were significantly correlated to total biomass at the 99% significance level were considered for the study. The significant phenological metrics were sorted according to their importance in predicting total biomass using the Partial Least Squares (PLS) regression method [56]. This technique is generally used to build predictive models when the number of explanatory variables is large or presents multi-collinearity. The usefulness of such a method to determine the most important predictor variable in predicting a response variable was discussed by [57]. The ranking was performed using the statistical criterion Variable Importance in the Projection (VIP) [58]. Only the most important phenological metrics with a VIP ≥ 0.8 [59] related to total biomass estimation were selected so as to reduce the number of variables for model development and therefore avoid the computational problems.
In the development of statistical models, single selection parameters could be biased and only a combination of different unrelated parameters should be used to assess model performance. Several performance criteria for measuring the quality of the models have been proposed in the literature based on minimizing a penalty parameter. In this study, we used the Akaike Information Criterion (AIC) and the adjusted coefficient of determination (Adj. R 2 ), which constitutes a good indicator of model robustness [60]. The Adj. R 2 is a corrected value of the R 2 , taking account of the influence related to the number of predictor variables in a given model [61] and also considering the sample size.
In order to detect the influence of observations on the model's adjustment and to verify the homoscedasticity and normality of residuals, we analyzed the Cook's Distance, the Studentized Residuals, and the QQ-Plot [62]. The Variance Inflation Factor (VIF) indicator was also used for detecting multi-collinearity between the variables in the models. Only those models in which all included variables had a VIF below 10 were selected [63]. Table 3 presents the main criteria and indicators used for model selection.

. Bootstrap Resampling and Model Verification
The small sample size of the ground dataset does not allow a reliable assessment of the bias and variance report of the evaluated models and extraction of a validation sample of adequate size simultaneously. In order to overcome this, we used the bootstrap validation technique [64] to assess the predictive ability of the models [65]. This technique involves using randomized subsets of the observations in the original sample [57]. It provides random samples of the same size as the original sample, each of which is referred to as the bootstrap sample [66]. The number of bootstrap samples was set at 200, with a sample rate of 0.75. Statistical calculations were performed using these bootstrap samples, for which the standard deviation of the re-sampled statistics was the empirical standard error of the statistics generated by the original sample [67].
The parameters used to assess the predictive ability of the final models were the Mean Absolute Error (MAE) and the Normalized Mean Absolute Error (NMAE) ( Table 4). The MAE was chosen for the model verification because it provides an unambiguous measure of the magnitude of the average error and is therefore more appropriate than the Root Mean Square Error (RMSE) for dimensioned evaluations of average model-performance error [68]. Finally, only the best models that included, at most, three variables with the highest Adj. R 2 , while minimizing the AIC, MAE, and NMAE criteria, were selected.
Where n is the number of observations; p is the number of predictor variables; R 2 j is the correlation coefficient for regression of Xj with the (p − 1) other variables; SSE is the sum of squared errors; Oi is the observed biomass in the field, Pi is the predicted biomass by model; Bm is the mean observed biomass; A is the number of relevant components for prediction; wa is the loading weight; SSa is the sum of squares explained by the ath component; (wa/‖wa‖)² is the importance of the corresponding variable [58].

Relationship between Total Biomass and Phenological Variables
The results of the Pearson correlation analysis (Table 4) showed that the phenological variables extracted were all significantly correlated to the total biomass (p < 0.0001), except for the increasing rate during the green-up stage (LDERIV). The large FAPAR integral (LINTG) had the highest correlation value, while the lowest correlation was produced by LDERIV. Therefore, only LDERIV was removed, and all the remaining phenological variables were used to assess total biomass across the Senegalese rangelands.

Importance of the Explanatory Variables in Total Biomass Prediction
Key variables were identified by VIP in order to further reduce the number of variables for model development. At the scale of the whole study area, the results confirmed that LINTG was the most important phenological metric, whereas at the ecoregion scale, the maximum FAPAR (PEAK) was the most important ( Figure 4). Variables of minor importance (VIP < 0.8) were removed and the remaining variables for model selection differed among the ecoregions. For example, SOS was the least important variable for ECOnorth and ECOwest, whereas in ECOeast and ECOsouth, the least important variable was middle of season (PMID).  Figure 1). Table 5 presents the best-performing models built across the study area and per ecoregion. It shows that the study area model (Model_SA) with the LINTG, LOS, and RDERIV variables could be considered the most suitable ones for estimating total biomass across the study area at the end of the growing season, with an adjusted R 2 (Adj. R 2 ) of 0.67 and an NMAE of 26%. Although slightly less strong (Adj. R 2 = 0.62) compared with Model_SA, the early warning model (Model_EW) using the PEAK and SOS variables also showed good accuracy (NMAE = 27.3%). For the ecoregions, the models showed good accuracy, with NMAE usually below 24%, apart from ECOnorth (NMAE = 31%). The Adj. R 2 values were generally low and varied between 0.15 (ECOwest) and 0.49 (ECOeast). The selected phenological metrics varied among ecoregions. According to the VIP, PEAK was ranked as the most important variable in predicting total biomass throughout four ecoregions (Figure 4). However, according to the predicting performance of the ecoregion models, PEAK was selected only in models for ECOnorth and ECOwest, whereas in ECOeast and ECOsouth, the models included LINTG (Table 5), suggesting a preferential selection of variables depending on ecoregion. LOS was also an important metric in modeling total biomass across ECOeast and ECOsouth. Table 5. Calibration and bootstrap verification performances of multiple linear regression models for total biomass estimation across the study area and per ecoregion. "n" is the size of the original sample used for calibration and "n_test" is the size of the bootstrap sample used for statistical calculations of verification. For other acronyms, see Tables 2 and 3.

Comparison with the NDVI-Based CSE Biomass Product
In order to assess the improvement in the quality of biomass estimates provided by the multiple linear models, the estimates from Model_SA, Model_EW, and the ecoregion models were compared with the CSE biomass product (Figures 5 and 6). The total biomass estimates for 1999 were removed because the CSE estimates were based on LAC AVHRR data for that year (cf. Section 2.3.1). The relationship between observed and estimated total biomass was plotted using the same validation dataset, comprising n = 247 samples. The relationship between satellite and ground estimates of biomass was highly significant in all cases (p < 0.01). Model_SA (R 2 = 0.68, Figure 5a), Model_ EW (R 2 = 0.64, Figure 5b), and the ecoregion models (R 2 = 0.77; Figure 5c) outperformed the CSE product (R 2 = 0.51, Figure 5d). The CSE product was the least accurate, with an MAE of 818.46 kg· DM/ha, whereas the ecoregion models used throughout the study area, with varying metrics per ecoregion, provided the most reliable estimates (MAE = 489.21 kg· DM/ha). Total biomass estimates from the ecoregion models provided the highest slope (0.78) and the lowest offset (517) values of the linear regression equation with observed data, indicating an improvement in prediction accuracy when disaggregating the study area and applying models related to ecological characteristics. The ecoregion models allowed a better sampling of the space of biomass and corrected the slight saturation effects observed for Model_SA and Model_EW (Figure 5a,b) where high biomass values were missrepresented. The validation with ground measurements demonstrated that the new developed models improved the estimation of total biomass as compared with the CSE current product. This was confirmed by the temporal evolution of the estimated total biomass from these models between 1999 and 2013 ( Figure 6). Along the full time series, estimates provided by Model_SA, Model_EW, and the ecoregion models had similar values, unlike the CSE biomass products that were generally found to be over-estimating total biomass. This over-estimation was clearly apparent in 2010, when estimated total biomass by CSE products exceeded 5000 kg· DM/ha, whereas the observed biomass value was about 3000 kg· DM/ha, similar to the values provided by the multiple-predictor models.

Testing the Multiple-Predictor Model for Early Warning
The early warning model (Model_EW) was tested and applied in 2002 and 2010 (Figure 7a,b), selected because of extreme biomass production in these years compared with the 1999-2013 mean across the study area. Total biomass production was particularly low in 2002 (as shown in Figure 2a), with a deficit of about 23%, whereas in 2010, there was a surplus, with an excess of about 25%. The extrapolated results with Model_EW visually reflected these exceptional anomalies of total biomass production. In 2002, particularly north of Linguè re and Rané rou, estimated total biomass was very low, with values generally below 500 kg· DM/ha (Figure 7a). Total biomass values higher than 4000 kg· DM/ha were obtained only south of Tambacounda. In contrast, in 2010, about 60% of the study area was characterized by total biomass production above 3000 kg· DM/ha (Figure 7b).
In addition, a strong and significant relationship was revealed by the scatterplot of observed and predicted total biomass anomalies from Model_EW (Figure 8a). There was a similar relationship between predicted total biomass and the rainfall anomalies (Figure 8b). This indicated that the early warning model was able to predict total biomass spatially ( Figure 7) and temporally ( Figure 8

Discussion
The seasonal analysis of FAPAR patterns along the 1999-2013 time series enabled us to retrieve 11 phenological metrics for total biomass modeling in the Senegalese rangelands. With the exception of LDERIV, all the phenological metrics were significantly correlated to total biomass and were therefore used to develop prediction models. Model development involved using a multiple regression approach with a 15-year time series of in situ biomass data. The results showed that the model with the three input variables LINTG, LOS, and RDERIV was the most suitable for estimating total biomass production across the study area, with a high adjusted R 2 , while minimizing the MAE, indicating good fit and accuracy of the model. The three variables selected for this model have been discribed in the literature in relation to their ecological function. The cumulated FAPAR during the growing season was found to be a well-suited proxy of biomass production [21,34,69]. The length of the season provides information on the timing of vegetation growing start and end. In various terrestrial ecosystems (e.g., savannah and grasslands), the LOS metrics are positively correlated with annual carbon sequestration, and thus with biomass productivity, simply due to the fact that more days are available for carbon assimilation [70]. The decay rate is strongly species-dependent [71]. Thus, by applying this metric information, the spatial patterns of dominating species are included, as well as other factors like grazing and burning. The performance of the models varied among the Sahelian ecoregions studied, with different metrics selected for each ecological region, adapted to local ecological conditions. The ecoregion models reduced the MAE on the total biomass estimates of 19%, compared with models calibrated over the whole study area. This meant that the subdivision of the study area into ecoregions increased the overall accuracy of estimates, confirming the study by [72] in Chinese rangelands. The adjusted R 2 of the models per ecoregion, however, was generally low (less than 0.50). This can be explained by the reduced number and limited distribution of ground monitoring sites per ecoregion, giving an unequally distributed time series with a low dynamic range of values, in addition to possible remaining uncertainties related to the ground sampling method. This could be mitigated by using wider ecological regions where the distribution of monitoring sites is selected to reflect the spatial distribution of total biomass. This approach is supported by [36], who found that a wider spatial coverage of biomass data, including different ecological areas over many years, could improve results and reduce data uncertainties in the Senegalese rangelands. The poor fit for ecoregion models could also be explained by the classification of the ecoregions, based on the integration of various biophysical and socio-economic components of the Senegal landscape [39].
Therefore, the ecoregions that comprise several environmental factors might not sufficiently reflect the spatial variation of plant biomass production. From this perspective and according to [73], who agree that vegetation type is the main source of landscape heterogeneity across Senegal, future studies should apply other types of classifications related solely to the vegetation growth cycle, (e.g., "phenoregions" based mainly on phenological parameters, such as SOS, EOS, and LOS; an example is provided by [74]).
In addition, disaggregation to the ecoregion scale provides important information on the most appropriate phenological metrics for monitoring vegetation because metrics are closely linked to specific ecological characteristics, such as soil type, rainfall, woody cover, and species patterns (Table 1). These four points are closely intertwined and differ between the ecoregions. They control the biomass production in a different way, resulting in varying production levels of herbaceous and woody leaf biomass for each ecoregion. The selection of different metrics for the best performing ecoregion models indicates that these differences might be reflected in the metrics selected. LINTG was found to be the best proxy for total biomass production in ECOeast and ECOsouth, whereas in ECOnorth and ECOwest, the best proxy was the PEAK metric. The LINTG variable is known to represent the total amount of biomass produced by the entire vegetation cover, including woody and herbaceous vegetation [75]. Its selection in ECOeast and ECOsouth, therefore, seems to be related to the presence of woody strata providing a high woody leaf biomass production. Likewise, the PEAK metric was selected as the main proxy for ECOnorth and ECOwest. These regions have a high herbaceous biomass production (Table 1), the growth of which is highly dependent on rainfall and its intra-seasonal distribution. The relevance of PEAK for grassland biomass monitoring in northern Senegal has been endorsed by [34], supporting our results. LOS was found to be an important variable for ECOeast and ECOsouth, providing information throughout the growing period. In addition, among the 10 phenological metrics used for developing models, SINTG and AMPL were the only ones not selected in the best-performing models, either for the study area or at the ecoregion scale. Although significantly correlated with ground total biomass, SINTG and AMPL are therefore thought to reflect mainly the seasonal herbaceous cycle signal [71] and are of minor importance compared with LINTG and PEAK.
Until now, fodder biomass in the Sahelian rangelands of Senegal has been estimated using a single-predictor model retrieved annually by simple linear regression between a vegetation index and ground biomass measurements [76]. This method is cumbersome and expensive, with data having to be collected annually and processed before providing information on the available forage. Although the established method gives useful biomass estimations, the multiple-predictor models with phenological variables have proved to improve the available biomass estimates in terms of accuracy. Overall, the advantages and improvements with the proposed approach can be summarized thus: (i) the phenological variables used are retrieved more precisely, pixel by pixel, by applying a fixed rule for the detection of the start and end of the growing season, instead of by a visual and general approximation of these two dates across the whole studied area; (ii) the multiple-predictor models are closer to reality because they take account of all or part of the interactions among phenological factors and therefore provide more reliable estimates; (iii) calibrated on a 15-year time series of biomass sampling data, the new approach is more robust than the conventional one, which is based on field sampling data for a single year [72], and it allows current year predictions to be made without additional field work; and (iv) the multiple-predictor models allow the use of phenological metrics available early in the growing season to predict fodder biomass at the end of the season. A multiple-predictor model tailored to provide early biomass predictions is potentially very useful for the early warning monitoring systems in rangeland ecosystems in general and in Sahelian countries in particular, where most livelihoods are very dependent on fodder biomass. The usefulness of these phenological metrics for early warning of food insecurity in the Sahel zone has been noted recently by [34,35]. It is possible to link the SOS variable with the PEAK one in order to establish a precise early warning model, although the PEAK variable is often detected relatively late in the growing season (i.e., on average, in the second dekadal of September). With these two variables, the model gave valuable results when applied to 2002 and 2010, demonstrating the ability to provide information on a deficit or surplus in fodder production in extreme years. The early warning model outperformed the CSE prediction, which over-estimated actual biomass production in 2010. Such models could have mitigated the effects of the Sahel crisis in 2012, which was caused by late and irregular rains and the prolonged dry spells in 2011 [77]. In Senegal, this crisis led to a decline in agropastoral production that threatened the food security of more than 739,000 people of the population [77]. In the future, such early warning models should enable stakeholders to take decisions as early as September (current year as biomass shortage) with regard to livestock by triggering protocols designed for livestock management (e.g., Opé ration de Sauvegarde du Bé tail [78]) in Senegal. The phenological anomalies for a particular year as compared to the long-term baseline characteristics of the seasonal cycle derived from FAPAR time series at the dekadal time step would allow the set up of critical indicators for food security [54,79]. The operational delivery of near real-time as well as long-time series of biophysical variables in the Copernicus Global land service from PROBA-V [80] and forthcoming Sentinel-3 [81] will ensure continuity of the early warning biomass monitoring system. A recalibration of the proposed models is required for a year with very particular climatic conditions [82] or in case of change in the satellite system (i.e., satellite/sensor drifts/end-of-life) [83]. Special attention would be devoted to the early warning model in its future use insofar as [34] found a large heterogeneity in the strength of the relation between the cumulated FAPAR over the growing season (i.e., biomass production) and the SOS and PEAK variables. More analyses are needed to better understand the relations between ground biomass and applied metrics.

Conclusions
The multiple-predictor models using phenological metrics and 15 years of ground observation data showed robust performance and gave accurate estimates of fodder biomass production across the Senegalese rangelands. The phenological variables selected in the predictor models depend on the production level and the ratio between the total woody leaf and herbaceous biomass. The large integral (LINTG) of Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) appears well-suited in pastoral areas characterized by a dense woody cover and a high leaf biomass production, whereas the seasonal maximum (PEAK) metric is preferently selected in grazing areas with lower woody density and a high herbaceous biomass production. The subdivision of the study area into ecoregions increased the overall accuracy of the multiple-predictor models. The validation with ground measurements shows that the proposed approach improves fodder resource monitoring in Senegal, providing more reliable and accurate estimates as compared to the current CSE biomass product. In addition, it allows reducing time and cost because, upon first model calibration, biomass estimation for a given year can be obtained without additional field work. On the contrary, the CSE model requires a yearly calibration with dedicated ground biomass measurements. Finally, using phenological metrics that are available relatively early in the growing season (i.e., PEAK and SOS), the proposed models can provide timely information on forage availability in rangelands. This allows helping stakeholders to make early decisions about possible livestock production deficits and related food insecurity. It constitutes an important benefit as compared to the current state of biomass estimation in Senegal, which is based on a single-predictor model that ingests the NDVI data at the end of the growing season. In order to enhance the performance of ecoregion models, future studies should apply classification schemes centered on the vegetation growth cycle (e.g., "phenoregions" based mainly on remotely sensed phenology). Further research is required to better understand the relation between satellite-derived phenological metrics and ecosystem properties.