Modeling Biomass Production in Seasonal Wetlands Using MODIS NDVI Land Surface Phenology

Plant primary production is a key driver of several ecosystem functions in seasonal marshes, such as water purification and secondary production by wildlife and domestic animals. Knowledge of the spatio-temporal dynamics of biomass production is therefore essential for the management of resources—particularly in seasonal wetlands with variable flooding regimes. We propose a method to estimate standing aboveground plant biomass using NDVI Land Surface Phenology (LSP) derived from MODIS, which we calibrate and validate in the Donana National Park’s marsh vegetation. Out of the different estimators tested, the Land Surface Phenology maximum NDVI (LSP-Maximum-NDVI) correlated best with ground-truth data of biomass production at five locations from 2001–2015 used to calibrate the models (R2 = 0.65). Estimators based on a single MODIS NDVI image performed worse (R2 ≤ 0.41). The LSP-Maximum-NDVI estimator was robust to environmental variation in precipitation and hydroperiod, and to spatial variation in the productivity and composition of the plant community. The determination of plant biomass using remote-sensing techniques, adequately supported by ground-truth data, may represent a key tool for the long-term monitoring and management of seasonal marsh ecosystems.


Introduction
Plant primary production is a key driver of ecosystem dynamics and can thus influence several ecosystem functions, such as water purification capacity and secondary production by animals.Knowledge of the spatio-temporal dynamics of plant biomass production is essential to inform the management of natural resources, in conservation areas and in agro-pastoral systems [1][2][3][4]-particularly in the Mediterranean and semiarid regions, where inter-annual changes in precipitation often result in large variations in plant production [5].
Traditional methods for plant biomass estimation are based on in-situ observations.They can be highly accurate but often involve intensive field work and destructive methods, which makes them costly and inapplicable to inaccessible or sensitive areas, or when involving endangered species [4,6,7].Remote sensing constitutes an increasingly used alternative [8], based on the relationship between satellite-derived metrics and primary production [7].Remote sensing may allow for the non-destructive, high-resolution coverage of large, remote, and/or inaccessible areas, such as mountains [9], deserts [10], or wetlands [4,[11][12][13].Remote sensing allows for the reconstruction of historical trends as well, using satellite image time series: for example, the reconstruction of the hydroperiod in Doñana marsh from 1974-2014 [14], or the assessment of rangeland conditions in semiarid regions [15].The most widely used methods to monitor vegetation are based on the use of vegetation indexes, such as the Normalized Difference Vegetation Index (NDVI) or the Enhanced Vegetation Index (EVI), as proxies of aboveground biomass [7,16,17].However, the use of these indexes is also subjected to limitations and criticism; for example, they have been shown to saturate asymptotically at high biomass values [12,18].
In general, the assessment of plant production can be based on the analysis of single (i.e., one-date) images, bi-temporal change detection, or temporal trajectory analysis, followed by the interpretation of results over time [19].For vegetation, a traditional group of methods relies on quantifications of the differences in statistical metrics of the vegetation-index time series like, for example, the beginning and end of the growing season, the maximum and minimum values, the annual mean, or the variance [9].In regions with strongly seasonal climates, production is typically assessed by searching for anomalies in the current NDVI against the average of the whole time series, or against reference values from the same period of the year, which informs about the current status of vegetation as compared to other seasons, or to an average condition [20].This is made at predefined fixed dates, which works well when seasonal cycles are regular, but is often problematic when they vary across years due to climatic or environmental variability.In such cases, observed anomalies in NDVI data may simply constitute a temporal shift of the growth season-i.e., an early (positive NDVI anomaly) or delayed (negative NDVI anomaly) start of the growth season [10].Other key problems may be related to the lack of consistency and reliability of the NDVI images used for analysis due to noise or errors, especially when a single image per year is used.Examples include variation in viewing and illumination geometry, resolution and calibration, digital quantization errors, ground and atmospheric conditions, as well as orbital and sensor degradation [7,21].
To overcome these limitations, the use of smoothed NDVI time series including a number of consecutive growing seasons (instead of a single image per growing season) is being proposed.Such time series analyses make use of all the information accumulated at the end of the growing season to estimate the parameters describing vegetation phenology (e.g., [21,22]).Indeed, the study of vegetation phenology has become very relevant in several realms, such as productivity and the carbon cycle (e.g., [23,24]), climate change and its impacts on ecosystems [25][26][27], as well as crop and pasture monitoring [10].During the last decade, on-the-ground phenological studies have been complemented by studies focusing on large-scale remote sensing [28], technically referred to as Land Surface Phenology (LSP, [12]).LSP can be defined as the timing of recurring changes in the reflectance of electromagnetic radiation from the land surface due to concurrent life-cycle changes of vegetation [29].It is generally measured by deriving either vegetation parameters (e.g., leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR) or vegetation indexes (e.g., NDVI, EVI) from remote-sensing data [9,[30][31][32].These vegetation indexes are used to maximize the extraction of variability assigned to certain plant features (e.g., leaf area, canopy cover, photosynthetic activity) while minimizing other unwanted effects (e.g., geometric, soil color, or atmospheric effects), thus enhancing the information contained in spectral reflectance data [12,33].LSP is then characterized using different mathematical procedures such as the identification of global/local thresholds and points of maximum increase/decrease, curve fitting and the subsequent extraction of inflection points or thresholds, and harmonic analysis [20].
In this article, we present a method for estimating plant biomass production in seasonal wetlands based on the NDVI from the Moderate-resolution Imaging Spectroradiometer (MODIS).Method development included the comparison of the two different approaches discussed above, namely the use of single images versus the characterization of LSP using the whole time-series; as well as the use of different estimators within each of these two approaches to estimate biomass production across the whole study area for the 16-year series.
We developed and applied this method in a particularly challenging study area: the semiarid marsh and wetlands of the Guadalquivir river estuary (Doñana National Park, SW Spain; 'Doñana marsh' hereafter).As in many arid and semiarid regions, the determination of biomass production is particularly challenging due to the flooding regime, the color influence of soils, and the spatial variation in vegetation communities and species composition [34][35][36][37].The Doñana marsh consists of a diverse and complex array of ecosystems affected by a highly dynamic interplay among vegetation, soil and water [38], whose prolonged land-use history fostered a mix of natural and semi-natural vegetation [39].Its vegetation provides habitat and food for a highly diverse fauna, making the area a biodiversity hotspot; but grazing by wild and domestic herbivores largely determines plant standing crop and may result in overgrazing, particularly during dry years [40,41].Studying the primary production of the marsh vegetation is therefore essential for the management and conservation of the Doñana National Park; while its huge size and accessibility problems during most of the flooding period makes this a particularly challenging task using solely on-the-ground approaches.

Study Area
The study focuses on the helophyte community of the Doñana marsh, an iconic wetland included in the Doñana National Park (SW Spain, Figure 1, 37 • 01 N, 6 • 26 W).Doñana has a sub-humid Mediterranean climate characterized by mild winters and hot summers, and rainfall concentrated in autumn (October-December) and spring (March-May).The Doñana marsh is a seasonal floodplain with a flooding regime that depends on rainfall [14,42].The helophyte community is strongly synchronized with flooding, starting to grow after the water level reaches a peak (February-March), then growing rapidly to create a vegetation layer of approximately 1 m height, and becoming senescent by August when the marsh is dry [42].More specifically, we focused on nearly-monospecific stands of saltmarsh bulrush (Bolboschoenus maritimus) belonging to the phytosociological association Bolboschoenetum maritimi.The saltmarsh bulrush represent one of the key primary producers of the marsh and thus sustains many elements of its food chain, including wintering waterfowl that consumes its tubers and seeds (e.g., greylag geese Anser anser; [43,44]).Domestic (cattle and horse) and wild (red deer, fallow deer and wild boar) ungulates also make use of the study area, grazing on saltmarsh bulrush and other plants [40].
Vegetation species composition fluctuates interannually as a consequence of climatic variability.To define the limits of the study area in the Doñana marsh, we selected an area as homogenous as possible, by performing an unsupervised classification in ENVI 5.4 using five classes and 10 iterations.We used the descriptive statistics (mean, median, standard deviation, maximum value and minimum value) of a time series of NDVI images from the Landsat satellite TM and ETM+ sensors from 1984 to 2015 [45].The resulting class that spatially corresponded better with saltmarsh bulrush dominance for the study period defined the study area limits.
The study area includes two estates with different ownership regimes (public and communal land, respectively).During the study period (2001-2015, see below), there were no major changes in the area apart from natural variation in rainfall (typical in a Mediterranean climate), small changes in the hydrology of the marsh, and moderate shifts in the stocking rates allowed at communal land (which had decreased during the 1990s, but remained relatively stable during the 2000s and tended to increase after 2010).

Satellite Data
Remote sensing data consisted of satellite images from the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor onboard the TERRA satellite (NASA).In particular, we used a raw time series of 387 images provided by the data service platform from the University of Natural Resources and Life Sciences of Vienna (BOKU; [46]).This platform offers a modification of Global MOD13Q1 data, which is the NDVI vegetation index product provided by NASA Land Processes Distributed Active Archive Center, in smooth and raw images (from 2000 to the present) every 16 days as the product of an algorithm that calculates the Maximum Value Composite [47].

Biomass Data
Plant biomass data consisted of two datasets: one used to calibrate and select among alternative remote sensing models, and another used to validate the best model.The calibration dataset belongs to a long-term study on the impact of ungulate grazing initiated in 1982 by Ramón Soriguer (Doñana Biological Station).Currently, it is the only long-term data set available on above-ground biomass for the Doñana marsh.It was designed neither for this study nor to be a ground-truthing set for data extracted from satellite images; hence, it presents some limitations such as being restricted to localities that remain accessible in high-flood years [40].Calibration data consisted of regular annual harvests of aboveground biomass production (standing crop, in kg dw/ha) in five fixed locations within the Doñana marsh (Figure 1) from 2001 to 2015, amounting to 75 samples [40].These five locations were selected due to their accessibility and representativeness of the saltmarsh bulrush community.One of the calibration pixels (C.1 in Figure 1) was placed outside the study area, but it showed a similar vegetation community (dominated by saltmarsh bulrush) and ecological characteristics.For each calibration location and date, biomass production values are based on 5

Satellite Data
Remote sensing data consisted of satellite images from the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor onboard the TERRA satellite (NASA).In particular, we used a raw time series of 387 images provided by the data service platform from the University of Natural Resources and Life Sciences of Vienna (BOKU; [46]).This platform offers a modification of Global MOD13Q1 data, which is the NDVI vegetation index product provided by NASA Land Processes Distributed Active Archive Center, in smooth and raw images (from 2000 to the present) every 16 days as the product of an algorithm that calculates the Maximum Value Composite [47].

Biomass Data
Plant biomass data consisted of two datasets: one used to calibrate and select among alternative remote sensing models, and another used to validate the best model.The calibration dataset belongs to a long-term study on the impact of ungulate grazing initiated in 1982 by Ramón Soriguer (Doñana Biological Station).Currently, it is the only long-term data set available on above-ground biomass for the Doñana marsh.It was designed neither for this study nor to be a ground-truthing set for data extracted from satellite images; hence, it presents some limitations such as being restricted to localities that remain accessible in high-flood years [40].Calibration data consisted of regular annual harvests of aboveground biomass production (standing crop, in kg dw/ha) in five fixed locations within the Doñana marsh (Figure 1) from 2001 to 2015, amounting to 75 samples [40].These five locations were selected due to their accessibility and representativeness of the saltmarsh bulrush community.One of the calibration pixels (C.1 in Figure 1) was placed outside the study area, but it showed a similar vegetation community (dominated by saltmarsh bulrush) and ecological characteristics.For each calibration location and date, biomass production values are based on 5 samples spaced 40 meters along a 200 m transect.Each sample consist of the aboveground biomass present in a randomly oriented rectangle 10 cm wide by 100 cm long, clipped at ground level using an electric grass shear.Samples were transported to the lab, sorted by species discarding dry biomass from the previous year, dried in paper envelopes (72 h at 60 • C) and weighted (accuracy: ±0.01 g).After two years of monthly sampling to assess the phenological cycle, harvests were taken twice a year: (i) one in May-August to estimate peak biomass production (date adjusted to flooding levels, which determine plant phenology), and (ii) another in September-October to estimate biomass "leftovers" (biomass production not consumed by herbivores after the end of the growth season) [40].We used as ground-truth the annual maximum biomass estimate sampled at each location regardless of the month in which it was collected.
Validation data were collected in August 2016 following a new stratified random sampling scheme.In order to ensure that the sampling areas covered adequately the complete productivity range, we first classified the study area in three biomass-production categories using the averaged value of maximum NDVI from 2001 to 2015: low (<0.5 NDVI), medium (0.5-0.6 NDVI) and high (>0.6NDVI).These thresholds divided the area in three sub-areas of similar size.Then, among all the MODIS pixels we randomly selected 3 pixels within each NDVI category (250 × 250 m, 9 MODIS pixels in total).Within each MODIS pixel we randomly selected 3 Landsat pixels (30 × 30 m).Within each Landsat pixel we randomly selected four sampling points (1 × 1 m; Figure 1).This design resulted in a total of 108 sampling points (12 biomass estimates per MODIS pixel).The design was chosen to adequately sample the spatial variability in biomass in the study area, to obtain more precise estimates of mean MODIS pixel biomass.The sampling design was aimed at also providing information on the spatial scale at which variation in biomass production occurs and how it relates to the resolution provided by MODIS and Landsat images.Aboveground biomass samples were collected using 100 × 100 cm squares with the electric grass shear.Samples were transported to the lab, sorted by species discarding dry biomass from the previous year, dried in paper envelopes (72 h at 80 • C), and weighted (accuracy: ±0.1 g).

Environmental Data
Environmental data were used to evaluate model robustness and analyze model results.They consisted of two variables: (1) hydroperiod, i.e., the number of days that each pixel remained flooded in each annual cycle estimated by Díaz-Delgado et al. [14]; and (2) precipitation data, in particular the cumulative precipitation in the hydrometeorological year from 1 September to 31 August (data provided by Doñana's Long-Term Monitoring program, ESPN at ICTS-RBD).

Biomass Production Models
We used three approaches to model annual biomass production based on NDVI data and selected the best-performing model using the calibration dataset (see above).The calibration dataset was also used to select among alternative model parameters (see model 3, below) and to obtain a relationship between NDVI-based estimates and biomass production.The three model types were based on the following information: 1.
The maximum NDVI value observed in each given year and MODIS pixel ('Maximum-NDVI' hereafter).

2.
The NDVI value at the time at which peak biomass occurs in an average year (8 May), for each given year and MODIS pixel ("May-NDVI" hereafter).The average time of the biomass peak was calculated as the mean date of the maximum NDVI values observed at the five MODIS pixels included in the calibration dataset from 2001 to 2015.
The calibration procedure was also used to inform the choice of three settings that must be decided by the user before fitting the curves to NDVI data [49], namely: (1) The baseline value of the phenological curve, a parameter that discards all the values below a specific NDVI value from the growth season under analysis.(2) The criterion that defines the beginning and end of the growth season.We evaluated two options: a fixed threshold value and a fixed proportion of the seasonal amplitude observed during each growth season.(3) The fitting method used to filter noise in the data: Savitzky-Golay filter, Asymmetric Gaussian and Double Logistic.For all other settings, we used the default values in TIMESAT, namely: no spike method, one season per year, no adaptation to the upper envelope of the curve, and normal adaptation strength.
All subsequent analyses were done using packages "car" [50] and "lmodel2" [51] in R [52].We fitted linear regression models using the annual ground-truth values of biomass production (calibration dataset) at each location and year as response variables and the NDVI-based estimates at a MODIS pixel and year as predictor variables.Models were fitted to untransformed and transformed variables (linear, exponential, logarithmic, power and log-log regressions), to test for an improved fit.We selected among alternative models using the proportion of explained variance as estimated by the R 2 [53], the root mean square error (RMSE) and the percentage of the normalized RMSE (calculated dividing for the mean value).
In addition, we used the calibration data and the best model to evaluate the robustness of model predictions, i.e., to test whether the relationship between NDVI-based estimates and observed biomass production was influenced by (i) changes in two key environmental variables (precipitation and hydroperiod), and (ii) spatial variation in productivity (i.e., among-site variation in soil fertility).For the first purpose, we used multiple regression of observed biomass production on NDVI-based estimates and either precipitation or hydroperiod, and compared them to the univariate regression (only NDVI-based estimates) using F-tests [53], and adjusted R 2 values.For the second purpose, we compared the relationship between biomass production and NDVI-based estimates among the five calibration locations (which showed consistent differences in biomass-production range across the whole data series; Figure 2).We used an Analysis of Covariance (ANCOVA, [53]) to assess the differences in the mean slope of biomass in relation to the sampling locations used for calibration.We included "location" as a fixed, categorical factor.A significant effect of the interaction between the continuous (NDVI-based estimates) and fixed (location) factors would indicate that the slopes of the relationship between NDVI and biomass production varies significantly among localities, thus a common calibration line should not be used across the whole study area.

Model Validation
The best model was validated with the new validation (2016) dataset.For validation, we regressed with a major axis regression the biomass field data on the predictions of the best NDVI-based model, evaluating whether the slope differed significantly from 1 (i.e., whether model predictions significantly over-or under-estimated observed values), and estimating model performance with the RMSE and the percentage of the normalized RMSE.In addition, since field data indicated a high variability in species composition in the validation dataset, we evaluated whether model performance was affected by such variability (i.e., if model predictions were better in areas dominated by B. maritimus).For this purpose, we performed a multiple regression with NDVI-based estimates and plant composition (the proportion of B. maritimus biomass present in each sample) as independent variables, following arcsine (square root) transformation of the latter variable to ensure residuals' normality and homoscedasticity.
We used these estimates to analyze the spatial and temporal variability in biomass production.Change in spatial variability over time was analyzed using IDRISI Earth Trends Modeler [54] to calculate the Theil-Sen slope estimator.This is a temporal trend estimator more robust than the least-squares slope because it is much less sensitive to outliers and skewed data.In our analysis, it was used to identify pixels where biomass increased or decreased, considering a significance level of (α = 0.05).The main driver of temporal variability in biomass production was analyzed by regressing biomass production (averaged across the whole study area) on cumulative precipitation, calculated over five different time intervals (September-March, September-April, September-May, September-June and September-July) to identify the period over which precipitation was most influential (i.e., the one providing the best fit).

Model Parametrization
Estimators based on a single image (Maximum-NDVI and May-NDVI) were obtained directly from the MODIS images.For the two NDVI estimators modeled using TIMESAT (LSP-Maximum-NDVI and LSP-Accumulated-NDVI), the first step was to choose the three model settings that resulted in the best calibration:

•
Baseline value: The best results were obtained with a baseline value of 0.27, which corresponds to the average value of NDVI in September across the whole study area-i.e., the NDVI value of senescent B. maritimus vegetation on dry marsh soil.This baseline value resulted in a much better regression fit than using no fixed baseline value (R 2 = 0.63 vs. R 2 = 0.22, in the best-performing model and filter: LSP-Accumulated-NDVI with Savitzky-Golay, see below).Other baseline values, like the NDVI value of open water (NDVI = 0.31), resulted in the failure to recognize the growing season-probably because it results in large variations in baseline values between early-and late-flooding years, which is unrelated with plant primary production.

•
Beginning and end of the growth season: The criterion based on a proportion of the seasonal amplitude performed better than the one based on a fixed threshold value, which resulted in TIMESAT failing to recognize the growth season for most of the years, due to their strong inter-annual variability.Among the different threshold-amplitude values tested, a value of 10% performed best (R 2 = 0.65, as compared to R 2 = 0.63 for 3% and R 2 = 0.61 for 5%, in the best-performing model and filter: LSP-Maximum-NDVI with Savitzky-Golay, see below), allowing for the recognition of the growth season of all years and succeeding with the filtering of the noise.

•
Fitting method: The metrics derived from the Savitzky-Golay filter performed slightly better than those obtained with the other two methods (Table 1).Hence, we solely use and report this method hereafter.

Model Calibration
The results showed that there was a statistically significant relationship between each of the four NDVI biomass estimators (Maximum-NDVI, May-NDVI, LSP-Maximum-NDVI, LSP-Accumulated-NDVI) and biomass production.The best results were obtained with a log transformation of the response variable (ground-truth biomass production), (Table 2; Appendix A Table A1).The two estimators based on Land Surface Phenology models performed considerably better, with LSP-Maximum-NDVI providing the best fit.Parameter values from this calibration fit (Table 2, Figure 2) are used hereafter to estimate biomass production from LSP-Maximum-NDVI values.The use of multiple regression models including as a second predictor a key environmental variable (either precipitation or hydroperiod) did not improve the fit of the best model with a single predictor (LSP-Maximum-NDVI, Table 3).Therefore, the relationship between LSP-Maximum-NDVI and biomass production was apparently not influenced by either of these two environmental variables.
The ANCOVA including the five calibration locations as a categorical factor indicated that the slopes were not heterogeneous (i.e., the interaction location * LSP-Maximum-NDVI was not significant: F (9, 61) = 0.36, P = 0.83).However, the model with location without interaction showed a better fit (adjusted R 2 = 0.85) than the model without location (R 2 = 0.64, Table 3).This indicates that the localities differ significantly in average biomass production across the years, but the slope between LSP-Maximum-NDVI and biomass production is not affected by such variation.The use of multiple regression models including as a second predictor a key environmental variable (either precipitation or hydroperiod) did not improve the fit of the best model with a single predictor (LSP-Maximum-NDVI, Table 3).Therefore, the relationship between LSP-Maximum-NDVI and biomass production was apparently not influenced by either of these two environmental variables.
The ANCOVA including the five calibration locations as a categorical factor indicated that the slopes were not heterogeneous (i.e., the interaction location * LSP-Maximum-NDVI was not significant: F (9, 61) = 0.36, P = 0.83).However, the model with location without interaction showed a better fit (adjusted R 2 = 0.85) than the model without location (R 2 = 0.64, Table 3).This indicates that the localities differ significantly in average biomass production across the years, but the slope between LSP-Maximum-NDVI and biomass production is not affected by such variation.

Model Validation
Biomass production varied considerably among validation plots, ranging from less than 500 kg dw/ha to almost 3000 kg dw/ha (mean = 1595 kg dw/ha, median = 1524 kg dw/ha, standard deviation = 678 kg dw/ha) (Figure 3).Species composition showed an unexpectedly high variation among sampling localities.B. maritimus represented 47% of the biomass production, followed by Eleocharis palustris (28%) and Scirpus lacustris (10%).B. maritimus was dominant in 5 of the 9 MODIS pixels, while the other 4 pixels were dominated by E. palustris (3 pixels) and S. lacustris (1 pixel).The results of the validation exercise showed that the model based on LSP-Maximum-NDVI could explain a reasonable percentage of the variance of the biomass (R 2 = 0.70; Figure 4), particularly regarding the high spatial and temporal variability present in the study area.The prediction error estimate, based on RMSE, was 354 kg dw/ha and the %RMSE was 22%.The slope of the predictedobserved relationship did not differ significantly from 1 (95% CI = 0.30; 1.07), indicating that model predictions neither under-nor over-estimate observed biomass production.Variability in species composition did not significantly influence the relationship between measured and predicted biomass production.Results of the multiple regression model including LSP-Maximum-NDVI and the proportion of B. maritimus showed that the latter did not significantly influence biomass yield (Adj. R 2 = 0.61 in contrast to R 2 = 0.70; significance of proportion of B. maritimus t-test = −0.39,p-value = 0.71).

Trend Analysis
Based on our LSP-Maximum-NDVI model, we produced 16 maps representing biomass production per pixel (in kg dw/ha) for each growth season between 2001 and 2016 (Figure 5).The average value per pixel (across all years) was 3869 ± 1781 kg dw/ha.The maps reveal a high spatial variation in biomass production, resulting from a combination of high spatial heterogeneity and high inter-annual variation.In some years, such as 2010, there were extensive areas with high biomass production (up to 10,000 kg dw/ha); while in other years, such as 2005, biomass production was one order of magnitude lower (i.e., it did not reach 1000 kg dw/ha at any pixel across the study area).However, the areas with high and low biomass production were not stationary, but strongly varied among years.For example, in 2001 biomass production peaked at the southern part of the study area, while in 2010 it peaked at its northernmost part.
Values of the Theil-Sen slope estimator showed that there is a general trend towards diminishing biomass production over the last 16 years-i.e., there were more areas where biomass production decreased than areas where it increased (Figure 6A).Biomass production tended to decrease in the central part of the study area, whereas it tended to increase in its periphery.A comparison with the spatial distribution of the average biomass production from 2001 to 2016 revealed that biomass production tended to decrease in areas with high productivity (high average biomass production) and to increase in areas with low productivity (low average biomass production) (Figure 6A,B) (R = −0.24,t-test (725) = −6.51,p-value = 1.38 × 10 −10 ).Disentangling the causes behind this pattern probably deserves further analyses.Variability in species composition did not significantly influence the relationship between measured and predicted biomass production.Results of the multiple regression model including LSP-Maximum-NDVI and the proportion of B. maritimus showed that the latter did not significantly influence biomass yield (Adj. R 2 = 0.61 in contrast to R 2 = 0.70; significance of proportion of B. maritimus t-test = −0.39,p-value = 0.71).

Trend Analysis
Based on our LSP-Maximum-NDVI model, we produced 16 maps representing biomass production per pixel (in kg dw/ha) for each growth season between 2001 and 2016 (Figure 5).The average value per pixel (across all years) was 3869 ± 1781 kg dw/ha.The maps reveal a high spatial variation in biomass production, resulting from a combination of high spatial heterogeneity and high inter-annual variation.In some years, such as 2010, there were extensive areas with high biomass production (up to 10,000 kg dw/ha); while in other years, such as 2005, biomass production was one order of magnitude lower (i.e., it did not reach 1000 kg dw/ha at any pixel across the study area).However, the areas with high and low biomass production were not stationary, but strongly varied among years.For example, in 2001 biomass production peaked at the southern part of the study area, while in 2010 it peaked at its northernmost part.
Values of the Theil-Sen slope estimator showed that there is a general trend towards diminishing biomass production over the last 16 years-i.e., there were more areas where biomass production decreased than areas where it increased (Figure 6A).Biomass production tended to decrease in the central part of the study area, whereas it tended to increase in its periphery.A comparison with the spatial distribution of the average biomass production from 2001 to 2016 revealed that biomass production tended to decrease in areas with high productivity (high average biomass production) and to increase in areas with low productivity (low average biomass production) (Figure 6A,B) (R = −0.24,t-test (725) = −6.51,p-value = 1.38 × 10 −10 ).Disentangling the causes behind this pattern probably deserves further analyses.Inter-annual variability in biomass production (summed across the whole study area) was strongly influenced by annual precipitation-with cumulative precipitation from September to April exerting the strongest influence of all periods tested.The best fit between both variables was obtained using a log-log transformation, which indicates that the effect is stronger at low precipitation values and saturates when precipitation is very high (R 2 = 0.69, F(1, 14) = 30.8,p-value = 7.15 × 10 −5 ; Figure 7).
Inter-annual variability in biomass production (summed across the whole study area) was strongly influenced by annual precipitation-with cumulative precipitation from September to April exerting the strongest influence of all periods tested.The best fit between both variables was obtained using a log-log transformation, which indicates that the effect is stronger at low precipitation values and saturates when precipitation is very high (R 2 = 0.69, F(1, 14) = 30.8,p-value = 7.15 × 10 −5 ; Figure 7).

Discussion
We have shown that MODIS Global MOD13Q1 NDVI data provides a good source of information for estimating biomass production in a challenging situation-a seasonal marsh characterized by high spatio-temporal variation in precipitation and hydroperiod [55].While the use of a single image per growth season provided estimates of reasonable quality (39-41% of variance explained in the calibration dataset), modeling the phenological cycle using Land Surface Phenology (LSP) techniques considerably improved the quality and robustness of such estimates (65% and 70% of variance explained using LSP-Maximum-NDVI, in the calibration and the validation datasets, respectively; see also [56]).Furthermore, biomass production estimates derived from the bestperforming model for the whole study area and time period indicate a strong role of a key climatic driver, the inter-annual variation in precipitation; and a pattern of spatio-temporal change (decreasing yields in the most productive areas) that could be consistent either with changes in vegetation community composition due to marsh siltation and changes in hydroperiod [14] or with the impact of a key biotic driver, overgrazing by domestic and wild herbivores.
The modeling process was particularly challenging because marshes are highly dynamic and heterogeneous wetland ecosystems where the reflectance signal can change rapidly, sometimes within hours or days [6,38].Despite these challenges, the four different, NDVI-based estimators predicted biomass production with reasonable quality (39-65% of variance explained during calibration).However, the two NDVI biomass estimators derived from TIMESAT models of LSP performed significantly better than those based on a single image per year only-reinforcing previous suggestions that LSP may improve biomass determination in complex ecosystems [20,57].The improved performance of LSP estimators is probably caused by the higher sensitivity of singleimage estimators to several sources of error and noise, such as sensor resolution and calibration, digital quantization errors, ground and atmospheric conditions, or orbital and sensor degradation

Discussion
We have shown that MODIS Global MOD13Q1 NDVI data provides a good source of information for estimating biomass production in a challenging situation-a seasonal marsh characterized by high spatio-temporal variation in precipitation and hydroperiod [55].While the use of a single image per growth season provided estimates of reasonable quality (39-41% of variance explained in the calibration dataset), modeling the phenological cycle using Land Surface Phenology (LSP) techniques considerably improved the quality and robustness of such estimates (65% and 70% of variance explained using LSP-Maximum-NDVI, in the calibration and the validation datasets, respectively; see also [56]).Furthermore, biomass production estimates derived from the best-performing model for the whole study area and time period indicate a strong role of a key climatic driver, the inter-annual variation in precipitation; and a pattern of spatio-temporal change (decreasing yields in the most productive areas) that could be consistent either with changes in vegetation community composition due to marsh siltation and changes in hydroperiod [14] or with the impact of a key biotic driver, overgrazing by domestic and wild herbivores.
The modeling process was particularly challenging because marshes are highly dynamic and heterogeneous wetland ecosystems where the reflectance signal can change rapidly, sometimes within hours or days [6,38].Despite these challenges, the four different, NDVI-based estimators predicted biomass production with reasonable quality (39-65% of variance explained during calibration).However, the two NDVI biomass estimators derived from TIMESAT models of LSP performed significantly better than those based on a single image per year only-reinforcing previous suggestions that LSP may improve biomass determination in complex ecosystems [20,57].The improved performance of LSP estimators is probably caused by the higher sensitivity of single-image estimators to several sources of error and noise, such as sensor resolution and calibration, digital quantization errors, ground and atmospheric conditions, or orbital and sensor degradation [7]; and to the rapid changes in the NDVI signal in heterogeneous ecosystems-which may bias such estimators, for example, if an image is taken after a rainfall episode [38].LSP makes use of the information gathered across the complete growth season to produce a smooth NDVI curve that integrates the whole vegetation cycle, thus reducing noise and errors [12,21,58,59].On the one hand, the difference among fitting methods was marginal for the best-performing predictor (LSP-Maximum-NDVI; Table 1), suggesting that the smoothing provided by all fitting procedures sufficed to remove noise and ensure predictor quality-in contrast with works reporting that the over-smoothing introduced by the Asymmetric Gaussian and Double Logistic methods affected the accuracy of parameter estimates [21].On the other hand, the use of a baseline criterion that removed the influence of water removed the strong bias introduced on NDVI-based estimates by early-flooding years-which caused a drop in NDVI values, unrelated to plant productivity.
Besides their statistical properties, the choice of estimator may influence its potential use by managers or policy makers.Management applications that rely on an early prediction of the season's standing crop, for example to adjust the stocking rates of domestic herbivores (cattle and horses), will be best served by those based on single images taken at early dates-such as the May-NDVI, chosen to coincide with the average NDVI maximum without requiring the uptake of ulterior images to identify the exact time of the season's maximum.Similarly, one of the two indicators based on LSP can be calculated at a much earlier point than the other-since LSP-Maximum-NDVI only requires the maximum value to be reached, while LSP-Accumulated-NDVI can only be calculated at the end of the growth season.Under such circumstances, it might be more useful to use a statistically weaker estimator that can be estimated earlier, as long as the associated decrease in accuracy is acceptable.Unfortunately, single-image estimators such as May-NDVI had a much lower accuracy than LSP-based estimators (39-41% vs. 65-70% of variance explained).We therefore recommend the use of LSP-Maximum-NDVI, which provides the best estimates at a relatively early date.
Estimators based on NDVI have been shown to saturate asymptotically at high biomass values [12,18].While the relationship between NDVI and biomass production was multiplicative (i.e., the slope decreased with increasing NDVI, following a logarithmic relationship), the best-performing estimator LSP-Maximum-NDVI was far from reaching a plateau at the highest biomass production values we measured.As a consequence, estimates based on LSP-Maximum-NDVI performed reasonably well in the validation exercise.We cannot rule out, however, a saturation of these estimators in situations (years or localities) with higher biomass production-which would result in a disproportionate increase in prediction errors.We decided to build our models using NDVI because it is the most frequently used vegetation index, but as it is prone to saturation, and to noise caused by soil color and water, it would be interesting to test whether models can be improved using EVI, a vegetation index less prone to these problems [60].
Testing the robustness and validating the performances of the best estimator with independent data was particularly relevant given the high heterogeneity, complexity and unpredictability of the Doñana marsh ecosystems [14,61].Validation yielded satisfactory levels of predictive ability, particularly given the characteristics of the study system and the high variation in species composition detected.More importantly, the estimator also proved to be robust to the influence of environmental variables (precipitation and hydroperiod), spatial variation in baseline productivity, and species composition-suggesting that it can be safely used under the variety of situations present in the Doñana marshes, as well as in similar systems.
The analysis of the spatial and temporal variation of biomass production in the Doñana marsh confirmed that production is both highly variable and highly heterogeneous.Based on previous studies we expected precipitation, which determines the flooding regime, to account for a large percentage of the variation in biomass production [62].Indeed, precipitation explained 69% of the temporal variation in biomass production (summed across the study area).The relationship between precipitation and biomass production was however non-linear, indicating that biomass production is strongly dependent on precipitation in dry years but it tends to saturate in very wet years (similar to what Coe et al. [63] report).Whether this saturation results from self-thinning effects (intra-and inter-specific competition) and/or from the negative effect of prolonged inundation on plant development remains a topic for future studies.
The effect of herbivores on the marsh vegetation is another important source of variability.Specifically, changes in plant consumption caused by variation in the number and distribution of domestic (cattle and horses) and wild (fallow deer Dama dama, red deer Cervus elaphus, wild boar Sus scrofa) herbivores have been shown to determine the abundance and distribution of plant biomass, reducing it severely in dry years [40].The spatio-temporal trends detected using the Theil-Sen slope estimator suggest that biomass production has decreased, during the last 16 years, precisely in the areas where this production was more abundant.This pattern could be consistent with changes in vegetation community composition due to temporal trends in mean hydroperiod [14] probably due to marsh siltation.However, they could also be reflecting the effect of overgrazing by herbivores, which may be expected to concentrate their grazing (thus consuming more biomass) in the areas with higher biomass yield-particularly in dry years with low biomass production.Indeed, herbivores do not distribute uniformly in the marsh; they move tracking food and water availability, and avoiding heavily flooded areas.Mapping the biomass is an important first step to monitor and manage the effects of herbivores [5,39,64].It can support management programs that rationalize the number of domestic animals and find a dynamic balance between cattle and vegetation [65,66], helping to prevent land degradation, soil erosion and biodiversity loss [67].In this regard the study of the vegetation patterns could be improved by correlating the changes in vegetation biomass with hydroperiod trends, and with the spatial distribution and movements of domestic and wild herbivores.The modeling process in a heterogeneous ecosystem such as the Doñana marsh could also benefit from increasing the spatial resolution using other sensors such as Landsat.

Conclusions
We show that by using Land Surface Phenology (LSP) techniques and relatively simple statistical models, it is possible to provide accurate estimations of plant biomass production in a large seasonal wetland, the Doñana marsh.Estimators based on LSP models provided substantially better predictions than those based on a single image, and were robust to environmental variation and spatial heterogeneity.Model predictions indicate that the marsh areas with highest productivity coincide with those in which productivity has been declining during the last 16 years-suggesting changes in vegetation communities or the potential effect of overgrazing by wild and domestic herbivores.The estimation of plant biomass using remote-sensing techniques, adequately supported by ground-truth data, may represent a key tool for the long-term monitoring and management of ecosystems, especially in protected areas where the natural world and human activities coexist.

Figure 1 .
Figure 1.(A) Location of the Doñana National Park in the southwest of Spain.(B) Location of the study area inside the Doñana National Park marsh.(C) Ground-truth biomass plots inside the study area.(D) Zoom to a MODIS validation pixel that exemplifies the sample design stratification.(E) Picture of the helophyte community (May 2016).(F) Picture of an area heavily grazed by cattle (June 2016).

Figure 1 .
Figure 1.(A) Location of the Doñana National Park in the southwest of Spain.(B) Location of the study area inside the Doñana National Park marsh.(C) Ground-truth biomass plots inside the study area.(D) Zoom to a MODIS validation pixel that exemplifies the sample design stratification.(E) Picture of the helophyte community (May 2016).(F) Picture of an area heavily grazed by cattle (June 2016).

Table 2 .
Results of model calibration.Relationship between each of the four NDVI estimators tested and biomass production.Best fits were obtained with ln (y) = a * x + b transformation.The other two transformations, y = a * x + b and ln (y) = a * ln (x) + b, are included in Appendix A. SE = Standard Error.RMSE = Root mean square error.%RMSE = Percentage of RMSE.
Relationship between each of the four NDVI estimators tested and biomass production.Best fits were obtained with ln (y) = a * x + b transformation.The other two transformations, y = a * x + b and ln (y) = a * ln (x) + b, are included in Appendix A. SE = Standard Error.RMSE = Root mean square error.%RMSE = Percentage of RMSE.

Figure 2 .
Figure 2. Model calibration.Relationship between the best NDVI estimator tested (LSP-Maximum-NDVI) and the logarithm of biomass production (kg dw/ha).Continuous line: regression line.Dotted lines: 95% confidence intervals.The dot colors represent the five different locations of the calibration biomass plots.

Figure 2 .
Figure 2. Model calibration.Relationship between the best NDVI estimator tested (LSP-Maximum-NDVI) and the logarithm of biomass production (kg dw/ha).Continuous line: regression line.Dotted lines: 95% confidence intervals.The dot colors represent the five different locations of the calibration biomass plots.

Figure 3 .
Figure 3. Results of the validation survey.Aboveground biomass production per species at each of the nine MODIS pixels sampled.N = 12 sample plots per pixel.

Figure 3 . 18 Figure 4 .
Figure 3. Results of the validation survey.Aboveground biomass production per species at each of the nine MODIS pixels sampled.N = 12 sample plots per pixel.

Figure 4 .
Figure 4. Model validation.Major axis regression between measured and predicted biomass.Continuous line: regression line.Dotted lines: 95% confidence intervals.

Figure 5 .
Figure 5. Model predictions.Estimated biomass production (in kg dw/ha) per pixel across the study area.

Figure 6 .
Figure 6.Trend analysis.(A) Changes in biomass production from 2001 to 2016, based on the Theil-Sen slope estimator.Positive values (blue colors): increase.Negative values (red colors): decrease.(B) Average biomass production (kg dw/ha) from 2001 to 2016.All categories except the one for "nonsignificant results" indicate Theil-Sen slope estimator values significantly different from zero.

Figure 5 .
Figure 5. Model predictions.Estimated biomass production (in kg dw/ha) per pixel across the study area.

Figure 5 .
Figure 5. Model predictions.Estimated biomass production (in kg dw/ha) per pixel across the study area.

Figure 6 .
Figure 6.Trend analysis.(A) Changes in biomass production from 2001 to 2016, based on the Theil-Sen slope estimator.Positive values (blue colors): increase.Negative values (red colors): decrease.(B) Average biomass production (kg dw/ha) from 2001 to 2016.All categories except the one for "nonsignificant results" indicate Theil-Sen slope estimator values significantly different from zero.

Figure 6 .
Figure 6.Trend analysis.(A) Changes in biomass production from 2001 to 2016, based on the Theil-Sen slope estimator.Positive values (blue colors): increase.Negative values (red colors): decrease.(B) Average biomass production (kg dw/ha) from 2001 to 2016.All categories except the one for "non-significant results" indicate Theil-Sen slope estimator values significantly different from zero.

Figure 7 .
Figure 7. Effect of cumulative precipitation (from September to April) on biomass production (average across the study area).Continuous line: regression fit.Dotted line: 95% confidence intervals.Note the log-transformation in both axes.

Figure 7 .
Figure 7. Effect of cumulative precipitation (from September to April) on biomass production (average across the study area).Continuous line: regression fit.Dotted line: 95% confidence intervals.Note the log-transformation in both axes.

Table 1 .
Comparison among TIMESAT curve-fitting methods to predict B. maritimus biomass using R 2 .

Table 2 .
Results of model calibration.

Table 3 .
Model robustness to environmental and spatial variation.Results of multiple regression models of biomass production on LSP-Maximum-NDVI plus a key environmental variable (either precipitation or hydroperiod, as continuous variables) or spatial location (as a categorical variable).Adj.R 2 = adjusted R 2 .DV = dummy variable.SE = Standard Error.RMSE = Root mean square error.%RMSE = Percentage of RMSE.

Table 3 .
Model robustness to environmental and spatial variation.Results of multiple regression models of biomass production on LSP-Maximum-NDVI plus a key environmental variable (either precipitation or hydroperiod, as continuous variables) or spatial location (as a categorical variable).Adj.R 2 = adjusted R 2 .DV = dummy variable.SE = Standard Error.RMSE = Root mean square error.%RMSE = Percentage of RMSE.