Crop Biomass Mapping Based on Ecosystem Modeling at Regional Scale Using High Resolution Sentinel-2 Data

We evaluate the potential of using a process-based ecosystem model (BEPS) for crop biomass mapping at 20 m resolution over the research site in Manitoba, western Canada driven by spatially explicit leaf area index (LAI) retrieved from Sentinel-2 spectral reflectance throughout the entire growing season. We find that overall, the BEPS-simulated crop gross primary production (GPP), net primary production (NPP), and LAI time-series can explain 82%, 83%, and 85%, respectively, of the variation in the above-ground biomass (AGB) for six selected annual crops, while an application of individual crop LAI explains only 50% of the variation in AGB. The linear relationships between the AGB and these three indicators (GPP, NPP and LAI time-series) are rather high for the six crops, while the slopes of the regression models vary for individual crop type, indicating the need for calibration of key photosynthetic parameters and carbon allocation coefficients. This study demonstrates that accumulated GPP and NPP derived from an ecosystem model, driven by Sentinel-2 LAI data and abiotic data, can be effectively used for crop AGB mapping; the temporal information from LAI is also effective in AGB mapping for some crop types.


Introduction
Biomass is the organic matter produced by plants and animals with accumulated energy directly or indirectly from photosynthesis. Within botanic biomass, agricultural biomass is characterized by two major types, food-based crops and non-food-based crops, albeit with no clear boundaries between these categories. In the following, we briefly review the biomass mapping methods based on remote sensing data for two major grain-crops, oil and starch crops, within the category of food-based crops.
The methodologies for biomass mapping of herbaceous grain crops are also applicable to energy crops. Ahamed et al. [1] extensively reviewed remote sensing methods for biomass mapping for feedstock production with a focus on using various vegetation indices (VIs), biophysical and biochemical variables, such as leaf area index (LAI, or L) and chlorophyll content. Recently, Chao et al. [2] categorized the methods for biomass mapping of energy crops into five types: VI-based simple statistical analysis, Synthetic Aperture Radar (SAR) backscatter-based analysis, Net Primary Productivity (NPP) models with GPP, NPP and measured AGB for several crops are examined in Section 3; the correlation between AGB and individual LAI or LAI time-series is also examined. The uncertainties of GPP and NPP simulations for crop biomass mapping are discussed in Section 3. Section 4 summarizes the key conclusions.

Ecosystem Model for Crop GPP and NPP Simulation
The Boreal Ecosystems Productivity Simulator (BEPS) is used to simulate crop GPP and NPP in this study [37,38]. BEPS has been developed since 1997 and has evolved from a daily version [38] to the current hourly version [39], with process-based modules to simulate water, energy and carbon budgets [40,41]. Although BEPS was initially developed for boreal ecosystems, it has been evolved and expanded to cover temperate and tropical ecosystems at regional and global scales [42][43][44][45][46] and has been re-structured for parallel computation and data assimilation [46][47][48]. BEPS has also been used for estimations of winter wheat yield in China [49] and cotton yield in the US [50], as C3 plants share the same enzymatic mechanism-based photosynthesis theory at the leaf-level, i.e., Farquhar's leaf-level biochemical model [51], and BEPS has a "two-leaf" approach [52] to upscale the ecosystem GPP to canopy-level for various ecosystems including crops [38]. NPP is then modeled as the difference between photosynthesis (GPP) and autotrophic respiration. BEPS has the mechanism to simulate CO 2 fertilization, using CO 2 concentration data from https://www.esrl.noaa.gov/gmd/ccgg/trends/global.html (accessed on 20 July 2019).
The maximum rate of Rubisco carboxylation (V cmax ) of crops varies with cultivars and nutrition condition. Although BEPS has the model structure to assimilate a dynamic variable V cmax , the V cmax values are fixed at 120 µmoL m −2 s −1 for different crops in this study, to be consistent with our previous crop modeling work [48], since pixel-level V cmax measurements are currently unavailable at the spatial resolution of Sentinel-2.
The C4 plants have less photorespiration and are more efficient in photosynthesis although their leaf V cmax is low, e.g., at 40 ± 1 µmoL m −2 s −1 [53] for maize (corn). Since BEPS has no C4 module, we test if the increase in V cmax to 120 µmoL m −2 s −1 for C4 corn using a C3 model could partially compensate the underestimation of GPP simulation due to the lack of a photorespiration module for the purpose of corn biomass mapping.

The Link between Carbon Fluxes and Biomass
In this study, BEPS uses daily LAI and hourly meteorological data as inputs and outputs GPP and NPP for each hour (g C m −2 h −1 ); the hourly GPP and NPP are then binned into daily scale in units of g C m −2 day −1 . Using GPP as an example, the accumulative GPP in unit of g C m −2 on a specific day is defined as the sum of daily GPP from day 1 to day n: where GPP A is also referred to as GPP for simplicity, hereafter; GPP i is the daily total GPP for day i; day 1 is defined as the first day when LAI is larger than 0. As summarized in the Introduction, there is justification to link crop GPP to crop biomass. Given an accumulated crop GPP (g C m −2 ) simulated by the BEPS over a given period of time in the growing season, the accumulated crop biomass (B, in units of g m −2 ) in the same period is estimated as: where c 1 is the carbon use efficiency (CUE, the ratio of NPP to GPP) and ranges from 0 to 1 for different species [54]; CUE is often set to 0.5 as an average for global ecosystems [55]; c 2 is the biomass to carbon ratio. As summarized by Schlesinger [23], the C content of biomass is almost always found to be between 45 and 50%; therefore, c 2 is close to 2.0. As CUE increases with a decrease in temperature [56,57], CUE in high latitudes is above 0.5. Since c 2 is slightly larger than 2.0, the product of c 1 and c 2 is expected to be larger than 1.0 in high latitudes. Given the small variations in c 1 and c 2 in a specific area, it is desirable to directly link GPP to biomass by a single coefficient C for practical applications: Equation (3) assumes a linear relationship between GPP and biomass. The correlation coefficient between GPP and AGB for a few crop types will be examined since only AGB data are available in this study.
Based on Equations (2) and (3), crop yield (Y, in unit of Kg Ha −1 ) can also be estimated from the end-of-season crop GPP, although it is not the focus of this study: where c 3 is the harvest index (HI), the ratio of harvested grain mass to total aboveground dry biomass; and c 4 is 10, the factor to convert the unit from g m −2 to kg to Ha −1 . As reported by Lambert, Traoré, Blaes, Baret and Defourny [30], the AGB explains more variations in crop yield than the peak LAI values. Similarly, we also compare crop biomass to accumulated LAI or NPP in a period in order to test their correlations. Accumulated LAI is defined as the sum of daily LAI from the date when LAI is larger than zero to a specific date in the growing season; the daily LAI is interpolated from remote sensed derived LAI. Similar to GPP, NPP and accumulated LAI also contain information from a time series rather than a single measurement.
To estimate AGB from NPP: where c 5 is ratio of AGB to total biomass. c 5 may vary with different crops and can be calibrated; in this study, c 5 is set to as 80%. A full diagram of the data processing flow is illustrated in Figure 1. where c1 is the carbon use efficiency (CUE, the ratio of NPP to GPP) and ranges from 0 to 1 for different species [54]; CUE is often set to 0.5 as an average for global ecosystems [55]; c2 is the biomass to carbon ratio. As summarized by Schlesinger [23], the C content of biomass is almost always found to be between 45 and 50%; therefore, c2 is close to 2.0. As CUE increases with a decrease in temperature [56,57], CUE in high latitudes is above 0.5. Since c2 is slightly larger than 2.0, the product of c1 and c2 is expected to be larger than 1.0 in high latitudes. Given the small variations in c1 and c2 in a specific area, it is desirable to directly link GPP to biomass by a single coefficient C for practical applications: Equation (3) assumes a linear relationship between GPP and biomass. The correlation coefficient between GPP and AGB for a few crop types will be examined since only AGB data are available in this study.
Based on Equations (2) and (3), crop yield (Y, in unit of Kg Ha −1 ) can also be estimated from the end-of-season crop GPP, although it is not the focus of this study: where c3 is the harvest index (HI), the ratio of harvested grain mass to total aboveground dry biomass; and c4 is 10, the factor to convert the unit from g m −2 to kg to Ha −1 . As reported by Lambert, Traoré, Blaes, Baret and Defourny [30], the AGB explains more variations in crop yield than the peak LAI values. Similarly, we also compare crop biomass to accumulated LAI or NPP in a period in order to test their correlations. Accumulated LAI is defined as the sum of daily LAI from the date when LAI is larger than zero to a specific date in the growing season; the daily LAI is interpolated from remote sensed derived LAI. Similar to GPP, NPP and accumulated LAI also contain information from a time series rather than a single measurement.
To estimate AGB from NPP: where c5 is ratio of AGB to total biomass. c5 may vary with different crops and can be calibrated; in this study, c5 is set to as 80%. A full diagram of the data processing flow is illustrated in Figure 1.

Crop LAI and Above-Ground Biomass Collection from the SMAPVEX16-MB Field Campaign
In this study, ground measurements of AGB are from the SMAP Validation Experiment 2016 in the Carman/Elm Creek region of southern Manitoba, Canada (SMAPVEX16-MB). This experiment was conducted to support NASA's SMAP satellite post-launch calibration/validation activities [58]. Figure 2 shows the location of the SMAPVEX16-MB experiment and the distribution of the AGB sampling plots. A total of 50 fields were selected for sampling in this campaign, with each field covering an area of 800 m × 800 m. These 50 fields represent the dominant crops within the study area. According to the AAFC Annual Crop Inventory, soybeans, spring wheat and canola accounted for approximately 70% of crops grown in the study area, with corn, oats, and other field beans accounting for approximately 17% of the study area in 2016. These sites are within a region of approximately 26 km × 48 km.
In this study, ground measurements of AGB are from the SMAP Validation Experiment 2016 in the Carman/Elm Creek region of southern Manitoba, Canada (SMAPVEX16-MB). This experiment was conducted to support NASA's SMAP satellite post-launch calibration/validation activities [58]. Figure 2 shows the location of the SMAPVEX16-MB experiment and the distribution of the AGB sampling plots. A total of 50 fields were selected for sampling in this campaign, with each field covering an area of 800 m × 800 m. These 50 fields represent the dominant crops within the study area. According to the AAFC Annual Crop Inventory, soybeans, spring wheat and canola accounted for approximately 70% of crops grown in the study area, with corn, oats, and other field beans accounting for approximately 17% of the study area in 2016. These sites are within a region of approximately 26 km × 48 km. Field sampling was conducted on 13 different dates, 13,15,18,20,27,and 28 June and 5,6,11,12,17,20, and 21 July . Each field had 16 sampling locations for soil moisture, Field sampling was conducted on 13 different dates, 13,15,18,20,27,and 28 June and 5,6,11,12,17,20, and 21 July. Each field had 16 sampling locations for soil moisture, although data on crop vegetation were collected from only 6 of these sites (Locations 2, 3, 10, 11, 13 and 14). The objective during SMAPVEX16-MB was to collect crop data approximately once per week during the experiment. Given this goal, each field was sampled 5 times for the purpose of characterizing crop growth. This strategy resulted in 380 data points to be compared with GPP simulated at 20 m resolution from Sentinel-2 data.
During sampling, all above-ground crop biomass was cut at the soil surface, placed in plastic bags to prevent water loss, and immediately transported to a portable field lab for weighing. For crops with wide row spacing (corn and beans), 5 plants were harvested in each of two rows. Biomass (g) per unit area (m 2 ) was determined using wet and dry weights and plant density. For narrowly placed crops (canola, oats and wheat), all biomass was collected within a 0.5 x 0.5 m square. After wet weights were determined, plants were transported to a drying room. Here plants were left for approximately one week, until stable dry weights determined that plants had completely dried. The weights of these samples were further calibrated using samples dried by an oven (60 • C for 48 h). This dataset includes samples for six crop types, i.e., oat, soybean, wheat, corn, canola and black bean. A complete description of biomass collection is detailed at https://nsidc.org/ data/SV16M_V/versions/1 (accessed on 30 July 2019) and by Bhuiyan, McNairn, Powers, Friesen, Pacheco, Jackson, Cosh, Colliander, Berg, Rowlandson, Bullock and Magagi [58].
The water content of plant organs varies with specific crop type and seasons; after the gross biomass is dried, the dry matter can be directly compared to the BEPS simulation of the net carbon assimilation.
Along with biomass measurements, LAI was measured with hemispherical digital photos [59]. In this technique, a camera with a fish-eye lens captured photos of the crop canopy with the camera positioned at least 50 cm above or below the canopy. The camera was mounted on a pole if the picture was taken above the canopy. Seven photos were taken along two transects (14 photos in total) at three sampling sites for each of the fields. Crews were instructed to take one photo then move approximately 5 m to take the next photos. Field crews positioned themselves relative to the sun to minimize shadowing in the photos. These photos were post-processed to estimate LAI using the CanEye software. All 14 photos were averaged to provide one estimate of LAI per sample site. Prior to the use of CanEye, the photos were enhanced to increase the quality through the use of ViewNX-2 (Nikon) software. Once enhancements were implemented, the images were then imported in the CanEye software for classification and LAI calculation. The parameters of the software were configured in accordance with the photos, i.e., number of lines and columns, julian date and latitude. Masking for any non-soil or data acquisition anomalies were applied where necessary in addition to any removal of photos that were problematic.
In the data processing, all green leave, stems, and spikelets are included. Since the measurements were taken in summer, the plants did not senescence at the time of measurements. Therefore, the ground LAI measurements include total green elements. This corresponds well to Sentinel-2-derived LAI as the simple ratio index is sensitive to all elements with chlorophyll pigments.

Key Input Data for BEPS Simulation
Sentinel-2 Level-1C top-of-atmosphere (TOA) reflectance products are used to derive LAI. These products are organized into tiles of 100 by 100 km 2 in UTM projection and WGS84 ellipsoid and are available from the Copernicus Open Access Hub (https://scihub. copernicus.eu/dhus/#/home (accessed on 1 June 2019)). Two tiles (14UNA and 14UNV) covering the SMAPVEX16-MB region were selected and mosaicked. For BEPS simulation, a region of approximately 40 km × 60 km is subset to match the region shown in Figure 2.
The Sen2Cor atmospheric correction processor [60] is used to convert the TOA reflectance to top-of-canopy (TOC) reflectance, i.e., Sentinel-2 Level-2A product. A Simple Ratio index (SR), defined as a ratio of near-infrared band to red band reflectances (or band 8A to band 4 reflectances from Sentinel-2) was used to derive LAI at 20 m resolution, while taking into account the effect of bidirectional reflectance distribution [61]. We chose band 8A in 20 m resolution rather than band 8 in 10 m resolution because band 8A is more sensitive to LAI.
In [61], the LAI is modeled as where θ s is the solar zenith angle, θ v is the view zenith angle, and ϕ is the relative azimuth angle between the sun and the viewer. The bidirectional reflectance distribution function (BRDF) kernel f BRDF is simulated using a physically based geometrical optical model [62]. In this way, the effect of angular changes to SR is minimized. Although we used the terminology "LAI" in this study, our retrieved LAI from Sentinel-2 includes the area of all green elements. Although Sentinel-2 has a high revisit frequency, the availability of cloud-free and cloud shadow-free images is limited (Table 1). A locally adjusted cubic-spline capping (LACC) method is used to smooth the time-series of LAI and discretize the smoothed LAI curve into daily values for BEPS modeling [63]. In LACC, a variable local smoothing parameter, which controls the local smoothness of the fitted curve, is automatically determined according to the local curvature of the original seasonal variation pattern. An iteration procedure is designed to produce a seasonal capping curve by progressively replacing abnormally low values with fitted values. Normally, less than 5 iterations are required to produce a smoothed capping curve. The lower-right corner in our study area is excluded from GPP simulation owing to too few Sentinel-2 images available for LACC inputs.
Six meteorological variables including air temperature, relative humidity, wind speed, and atmosphere pressure at 2 m above the surface, and incoming solar shortwave flux and total precipitation at the surface level are used to drive BEPS. These variables are acquired from a global reanalysis dataset, Modern-Era Retrospective Analysis for research and Applications, Version 2 (MERRA-2) [64].
In BEPS, the soil up to a 2 m depth is stratified into five layers for the simulation of soil water and heat fluxes. Four soil attributes including texture, bulk density, organic content and reference depth are extracted from the Soil Grids (https://soilgrids.org (accessed on 15 June 2019)) that provide global soil attributes at 250 m resolution [65]. Clay and fine loamy soils dominate in the eastern and southern parts of the study area and coarse loamy and sandy soils are prominent in the western part ( Figure S1). To match the pixel resolution of Sentinel-2 data, the meteorological variables are resampled to 20 m resolution using a bilinear method, and the soil attributes are resampled to 20 m using the nearest neighbor method.
To map crop biomass in the study area, the annual crop inventory map products (in 30 m resolution) from Agriculture and Agri-Food Canada (AAFC) are used [66].

Correlation between BEPS-Simulated Crop GPP and Ground Measurements
The uncertainty of Sentinel-2-derived LAI time-series after smoothing by LACC from our study has a Root Mean Square Error (RMSE) of 1.1 m 2 m −2 (R 2 = 0.83) compared with the ground-measured plant area index (including crop stems and heads) collected during the SMAPVEX16-MB campaign [67]. A previous validation for the same LAI algorithm reported an RMSE of 0.81 m 2 m −2 [68]. Knowledge of the starting and ending dates of the growing season, for each field, is important when simulating crop GPP; this information is clear from the LAI time series. Since the crops in different fields were sowed or planted on different dates, we simply run BEPS from the beginning of 2016. To avoid a numerical problem in the modeling, we have set the minimum LAI to 0.01; these two settings result in an intercept in the GPP-biomass relationships as shown in Figure 3 for all the crop types and in Figure 4, by crop type. To map crop biomass in the study area, the annual crop inventory map products (in 30 m resolution) from Agriculture and Agri-Food Canada (AAFC) are used [66].

Correlation between BEPS-Simulated Crop GPP and Ground Measurements
The uncertainty of Sentinel-2-derived LAI time-series after smoothing by LACC from our study has a Root Mean Square Error (RMSE) of 1.1 m 2 m −2 (R 2 = 0.83) compared with the ground-measured plant area index (including crop stems and heads) collected during the SMAPVEX16-MB campaign [67]. A previous validation for the same LAI algorithm reported an RMSE of 0.81 m 2 m −2 [68]. Knowledge of the starting and ending dates of the growing season, for each field, is important when simulating crop GPP; this information is clear from the LAI time series. Since the crops in different fields were sowed or planted on different dates, we simply run BEPS from the beginning of 2016. To avoid a numerical problem in the modeling, we have set the minimum LAI to 0.01; these two settings result in an intercept in the GPP-biomass relationships as shown in Figure 3 for all the crop types and in Figure 4, by crop type.    The quality of the LAI time series is important for accurate estimation of AGB. As shown in Figure S2 and Table 1, two Sentinel-2 images are missing in August during key growth periods. This gap leads to inadequate inputs to the LACC method and subsequently, the failure of LAI time-series smoothing, i.e., extreme values in the peak growing season, as observed for the fields in the lower-right corner of our study area. The errors in LAI can propagate into the biomass estimation as indicated by the red circle in Figure S3 and have led to some overestimations in biomass estimation. Although an overall strong correlation between simulated GPP and AGB is observed, these errors highlight the limitation of using Sentinel-2 data alone for biomass mapping. Fusion of Sentinel-2 data with other optical and/or SAR images is one strategy to improve the success of biomass mapping [69].
After excluding the sites with large biases in LAI, Figure 3 shows that the correlation between BEPS-simulated GPP and ground-measured AGB is stronger with R 2 of 0.82. As the comparison is conducted at a pixel size of 20 m while the AGB is sampled in a lesser area, the AGB samples may not capture all the variations in crop growth present within a 20 m pixel.
Although the V cmax is likely to vary within the study area, it is set as a constant value. Figure 3 indicates that a unique GPP-biomass relationship is still practical for regional applications.
The slope of 1.03 for the linear relationship between BEPS-simulated GPP and groundmeasured AGB, as shown in Figure 3, is reasonable. The slope between GPP and total biomass in high latitudes should be larger than 1.0; however, GPP and only AGB are compared here. The fraction of AGB to total biomass is less than 1, e.g., 78% for maize [70]. Therefore, the CUE (larger than 1) compensates the AGB-to-total-biomass ratio (less than 1), leading to the slope between GPP and AGB close to 1.
The coefficient of determination (R 2 ) between the simulated GPP and AGB ranges from 0.65 to 0.96 for the six crop types (Figure 4). It is worth noting that the two legume crops, black bean and soybean, are characterized with smaller slopes of 0.71 and 0.4, respectively. Legume-cereals have a strong N rhizodeposition and are often cultivated in rotation cropping to enhance crop yield [71]. The N rhizodeposition is associated with large amounts of C rhizodeposition accompanying the senescence of roots and nodules which are made of C [72]. Therefore, a significant amount of biomass is allocated to the below-ground biomass for these crops, leading to a smaller ratio of AGB to total GPP. Qiao et al. [73] found that soybean has greater C rhizodeposition than maize, and the amount of C rhizodeposition varies greatly with soil fertility and organic amendment, presenting a challenge in using GPP for biomass mapping by BEPS or other crop growth models.
Although non-linearity is shown between GPP and AGB of corn, correlation is still high (R 2 = 0.76) (Figure 4). This suggests that C3 models may be used to simulate GPP of C4 plants if some parameters are adjusted accordingly, although this is subject to further investigation. The correlations between LAI and crop AGB are shown in Figures S4 and S5. For all crop types, crop LAI explains only 50% of variation in crop AGB, significantly lower than that explained by crop GPP ( Figure S4); this is because LAI only explains mostly the top part of total AGB; as crop structure varies, the non-leaf part the AGB has a significant difference; for example, corn has far more non-green AGB than the soybean does. This suggests that the cumulative GPP correlates better to AGB which is also a cumulative variable, than it does to the green LAI, which indicates the status of the crop. By crop type, the LAI variations of canola, oat, and wheat explain only 61%, 59%, and 0% of variations in their corresponding AGB, respectively. The LAI variations of black bean, corn, and soybean explain 94%, 81%, and 62% of the variations in their AGB, respectively, higher than that of canola, oat and wheat. This difference can be explained by the growing stages: unlike canola, oat, and wheat, the AGB of black bean, corn and soybean cover only the vegetative growth stage of crop phenology. This is evidenced by the stronger correlation between leaf biomass and LAI (R 2 = 0.55) than between AGB and LAI (R 2 = 0.32%) as shown in Figures S6 and S7. Ideally, the LAI should be highly correlated with leaf biomass. However, LAI values derived from Sentinel-2 are equivalent "green" LAIs, resulting from the accumulation leaf pigments and leaf senescence [74]. Therefore, satellite-derived LAI correlates more with green biomass including green stems and fruits [75] and the R 2 of 0.55 is still considered to be high.

Correlation between BEPS-Simulated Crop NPP and Ground Measurements
The R 2 between the simulated NPP and AGB for all the plots is 0.83 ( Figure 5); individual R 2 ranges from 0.56 to 0.95 for the six crop types ( Figure 6). There are minor improvements in R 2 compared to GPP-derived AGB, except for black bean for which the R 2 is already very high, and for soybean. Similar to GPP-derived AGB, a non-linearity between simulation and measurements is still observed for corn. Since we used the same biomass respiration rates for different crops in BEPS simulations, the minor differences between GPP-derived and NPP-derived AGB are expected. Overall, the RMSE and mean absolute error (MAE) for AGB estimation from NPP are 115.7 and 85.8 g m −2 , respectively.
shown in Figures S6 and S7. Ideally, the LAI should be highly correlated with leaf biomass. However, LAI values derived from Sentinel-2 are equivalent "green" LAIs, resulting from the accumulation leaf pigments and leaf senescence [74]. Therefore, satellite-derived LAI correlates more with green biomass including green stems and fruits [75] and the R 2 of 0.55 is still considered to be high.

Correlation between BEPS-simulated Crop NPP and Ground Measurements
The R 2 between the simulated NPP and AGB for all the plots is 0.83 ( Figure 5); individual R 2 ranges from 0.56 to 0.95 for the six crop types ( Figure 6). There are minor improvements in R 2 compared to GPP-derived AGB, except for black bean for which the R 2 is already very high, and for soybean. Similar to GPP-derived AGB, a non-linearity between simulation and measurements is still observed for corn. Since we used the same biomass respiration rates for different crops in BEPS simulations, the minor differences between GPP-derived and NPP-derived AGB are expected. Overall, the RMSE and mean absolute error (MAE) for AGB estimation from NPP are 115.7 and 85.8 g m −2 , respectively.

Figure 5.
The correlation between BEPS-simulated crop net primary production (NPP) and groundmeasured above-ground biomass for six crop types (oat, soybean, wheat, corn, canola and black bean) for SMAPVEX16-MB.

Correlation between Accumulated LAI and Ground Measurements
The R 2 between the accumulated S2 LAI and AGB for all the plots is 0.85, which is similar to GPP (R 2 = 0.82) and NPP (R 2 = 0.83) (Figure 7); individual R 2 ranges from 0.59 to 0.97 for the six crop types (Figure 8) with decreased R 2 for soybean, oat and canola. Notably, the nonlinearity between corn GPP or NPP and AGB is not observed in Figure 8 but the opposite is observed for oat.
Although AGB may be estimated at a similar accuracy by using accumulated S2 LAI for a specific crop type, we believe that GPP-and/or NPP-based approaches produced a reliable overall result for the whole study area. Here, BEPS performs on par with accumulated LAI in biomass estimation because: (1) the accumulated LAI already contains adequate seasonal information for biomass estimation; (2) the meteorological variables do not change much since the study area is small; and (3) certain management information (irrigation, fertilizer use, crop-specific parameters) is unavailable to BEPS. If the study area is big enough to cover different climate zones, we could expect that the BEPS-based estimates will explain more variation in AGB than the accumulated LAI.

Correlation between Accumulated LAI and Ground Measurements
The R 2 between the accumulated S2 LAI and AGB for all the plots is 0.85, which is similar to GPP (R 2 = 0.82) and NPP (R 2 = 0.83) (Figure 7); individual R 2 ranges from 0.59 to 0.97 for the six crop types (Figure 8) with decreased R 2 for soybean, oat and canola. Notably, the nonlinearity between corn GPP or NPP and AGB is not observed in Figure 8 but the opposite is observed for oat.

Spatial Distribution of BEPS-Simulated Crop GPP and AGB
BEPS-simulated GPP for the whole growing season is shown in Figure 9. The lowerright corner is excluded from analysis due to biases in LAI retrievals. In the study area, the crop's total GPP in the whole growing season ranges from 600 to 2000 g C m −2 yr −1 . Lower GPP values in the upper-right corner are associated with a region of heavy clay soil; for this clay area, the matric water potential can be very low even when the soil water content is relatively high, limiting available water to the crops ( Figure S1). The area with center-pivot irrigation (circles; potatoes according to Figure 1) is characterized with high GPP values up to 2000 g C m −2 yr −1 . The variation of GPP is generally small within each field, while considerably larger among different fields, reflecting individual farmer's management on irrigation and fertilizer use.  Table 1. shows the averaged AGB by crop type in 2016 derived for the SMAPVEX16-MB study area. It is found that the average AGB among crop-type is clear while the one-standard deviation of AGB for each crop type is relatively small; we suggest that the small deviation is due to the small study area with the similar climate condition. Two sample t-test results suggest that the AGB average values among crop types are significantly different (p < 0.05) except for Oats vs. Triticale, Triticale vs. Spring Wheat, and Winter Wheat vs. Hemp (Table S1).  The regressions in Figures 3 and 4 are used to derive an AGB map for the study area ( Figure 10). For the six target crops, their explicit GPP-AGB relationships are applied; for the rest of crop types, the regression coefficients averaged for all crop types in Figure 3 are applied. In Figure 10, distinct variations among fields are revealed. In the study, the total AGB ranges from 200 to 2000 g m −2 . The peas and beans are characterized with low AGB values (blue area), e.g., 5.4 tons ha −1 for soybean ( Figure 11 and Table S1), while potato field has the largest AGB (16.6 tons ha −1 ). Compared to Figure 9, more variations among fields are observed since the crop-specific calibration from Figure 4 is applied. This suggests that although the accumulative GPP correlates well to the observed AGB, early crop identification/inventory within the crop growing season is critically important for crop growth monitoring and yield prediction.

Challenges in Using Sentinel-2 Data for Crop Biomass Mapping
Crop identification/classification, crop growth monitoring and crop yield estimation/prediction are among the most important remote sensing applications in agriculture, and they are tightly linked. Crop identification from satellite data provides the basis for annual crop inventory; while crop inventory provides the first order information for yield prediction; a crop classification map is also an important input to the ecosystem model, where the crop-specific parameters can be assigned. Crop growth monitoring from remote sensing involves mapping time-series of crop biophysical parameters (LAI, height, clumping, etc.) and biochemical parameters (e.g., leaf chlorophyll content, Vcmax); a biophysical parameter, AGB, can also be mapped from SAR images at a certain accuracy; the ecosystem model (or some crop growth models) can integrate these remote sensed information, meteorological data, and some available management information to produce reliable GPP, NPP and biomass. Crop biomass, along with crop-specific HI, can then provide more information for yield prediction/estimation than the crop inventory can. In this study, we tested BEPS for biomass mapping using Sentinel-2-derived LAI time-series as main input and with an existing archive of crop inventory map from AAFC.
Although crop AGB is robustly estimated for the majority of our study area, there are a few challenges in applying the same methodology to larger regions at continental and global scales. The same challenges may also apply to general crop growth models with explicit modules of crop phenology development and carbon allocations.
The first challenge is to acquire consistent LAI time-series that are free of cloud contamination. In our study area, there is a one-month gap in August without cloud-free Sentinel-2 images. Although temporal smoothing can mitigate the impact of these gaps to some extent, missing LAI estimates for cloud-contaminated pixels leads to incorrect AGB estimation. It is also a challenge to produce smoothing of LAI time series for near realtime AGB estimation, given the knowledge needed of crop growth stage. To track crop phenology for reliable estimation of AGB, one solution would be to fuse multi-satellite

Challenges in Using Sentinel-2 Data for Crop Biomass Mapping
Crop identification/classification, crop growth monitoring and crop yield estimation/prediction are among the most important remote sensing applications in agriculture, and they are tightly linked. Crop identification from satellite data provides the basis for annual crop inventory; while crop inventory provides the first order information for yield prediction; a crop classification map is also an important input to the ecosystem model, where the crop-specific parameters can be assigned. Crop growth monitoring from remote sensing involves mapping time-series of crop biophysical parameters (LAI, height, clumping, etc.) and biochemical parameters (e.g., leaf chlorophyll content, Vcmax); a biophysical parameter, AGB, can also be mapped from SAR images at a certain accuracy; the ecosystem model (or some crop growth models) can integrate these remote sensed information, meteorological data, and some available management information to produce reliable GPP, NPP and biomass. Crop biomass, along with crop-specific HI, can then provide more information for yield prediction/estimation than the crop inventory can. In this study, we tested BEPS for biomass mapping using Sentinel-2-derived LAI time-series as main input and with an existing archive of crop inventory map from AAFC.
Although crop AGB is robustly estimated for the majority of our study area, there are a few challenges in applying the same methodology to larger regions at continental and global scales. The same challenges may also apply to general crop growth models with explicit modules of crop phenology development and carbon allocations.
The first challenge is to acquire consistent LAI time-series that are free of cloud contamination. In our study area, there is a one-month gap in August without cloud-free Sentinel-2 images. Although temporal smoothing can mitigate the impact of these gaps to some extent, missing LAI estimates for cloud-contaminated pixels leads to incorrect AGB estimation. It is also a challenge to produce smoothing of LAI time series for near real-time AGB estimation, given the knowledge needed of crop growth stage. To track crop phenology for reliable estimation of AGB, one solution would be to fuse multi-satellite data at similar spatial resolutions, such as from Landsat 7 and 8 optical data and/or Synthetic Aperture Radar data from Sentinel-1 [76].
The second challenge is to perform rigorous model calibration for different crop types, especially for carbon allocations of crops for different cultivars and species, as plants develop different acclimation processes to maximize their photosynthetic capacity. For example, the AGB to GPP ratios range from 0.4 to 1.02 for the six target crops. This is especially true for legume crops with root rhizodeposition. Though not tested in our study, tuber crops such as potatoes can account for a significant amount of underground carbon, and their AGB might have been overestimated in our study. The accuracy of using C3 model to simulate AGB of C4 plant (corn in this study) is acceptable but not optimal.
The third challenge is to acquire soil nutrition conditions of the field and relate this information to photosynthetic capacity of crops. Although some crop growth models have a mechanism to relate nitrogen pool of crop system to leaf V cmax , such nitrogen management information is unavailable at the regional and national scales. Although Sentinel-2 has the red-edge bands for retrieval of leaf chlorophyll content [77][78][79][80], how leaf chlorophyll content correlates with leaf V cmax among species, crop growth stages and radiation conditions are not completely understood.
The fourth challenge is to acquire pixel-level irrigation information for crop growth models. Without correct irrigation information as an input, crop GPP and NPP may be underestimated by the models [46]. Sentinel-1 C-band SAR may be used for soil moisture mapping but only when the canopy is sparse [81]. The SMAP L-band radiometer has superior sensitivity to surface soil moisture even at moderate canopy water content, but its spatial resolution is low (40 km) [82]. Although some studies have integrated SAR data at high resolution and with SMAP radiometer data to produce soil moisture at a 1 km resolution [83], it is still impractical to do this at the 20 m resolution.
The last challenge for crop biomass mapping using models is the limitation of computation resources. For example, it takes half a day to simulate crop GPP at a 20 m resolution using a 54-core computer (2.5 Ghz) for our study area. This means that, for these process-based deterministic models, the computation of GPP mapping for a county at 20 m resolution is comparable to the GPP simulation of a global ecosystem at the 0.1 • resolution. If a data assimilation approach, such as the ensemble Kalman filter (EnKF) is used to assimilate any crop biophysical parameter into a crop growth model, the computation will be increased by an order of 1 to 2, suggesting that it is almost impossible to make use of EnKF for data assimilation at the current cost of computation.
These challenges, although not limited to Sentinel-2 data applications, suggest that although BEPS (hourly version) has an advanced model structure to simulate canopyscale photosynthesis [52], this requires significant computation effort. In contrast, some crop growth models are executed at a daily step and/or LUE-based photosynthesis model [15,20,21], but with the strength of crop traits modeling and handling of soil nutrition and water management. However, such an approach necessitates availability of these crop traits at large scales. Operational crop biomass (and yield) mapping may require a compromise between large scale and high spatial, and simplicity of the model structure. Mapping biomass at field scales (segmentation from crop identification map) might be a good alternative approach to address current big-data challenges.
Similar to this work, two recent studies have used Sentinel-2-derived LAI for biomass and yield mapping. Punalekar et al. [84] reported that the R 2 between pasture biomass and LAI ranges from 0.16-0.73 to 0.22-0.76, for the NDVI-derived LAI and radiative transfer model-based LAI, respectively. Lambert, Traoré, Blaes, Baret and Defourny [30] demonstrated that the R 2 values between LAI and yield are 0.68, 0.62, 0.8 and 0.48 for cotton, maize, millet and sorghum, respectively. Here, we show that the R 2 values between GPP and AGB range from 0.65 to 0.96 for six annual crops which dominate Manitoba's agro-ecosystem. Despite these aforementioned challenges, our study demonstrates the advantage of using LAI time-series and integrating both biotic and abiotic factors for biomass mapping at very high spatial resolutions.

Conclusions
This research explores the potential of integrating high spatio-temporal leaf area index (LAI) estimates derived from Sentinel-2 spectral reflectance data with an ecosystem model for crop above-ground biomass (AGB) mapping at the regional scale. The accuracy of this approach is evaluated using crop AGB field data collected during the SMAPVEX16-MB experiment conducted in western Canada, in 2016. Considering abiotic factors derived from meteorological data and soils data and LAI from Sentinel-2, we find that the modelsimulated crop gross primary production (GPP) at a 20 m resolution explains 82% of the variation in the annual crop AGB for six annual crop types. The results outperform the biomass mapping using LAI alone. The correlation between the model-simulated GPP and ground measured AGB for each of the six crop types is also strong, suggesting that our approach is reliable for AGB mapping at a high spatial resolution. Similarly, crop NPP and accumulative LAI explain 83% and 85% of the variation in the crop AGB, suggesting the importance of using temporal information from remote sensing.
The challenges for crop AGB mapping at high resolution are also discussed. These include difficulty in acquiring temporally consistent LAI time series due to cloud contamination, the lack of soil nutrition and water management information, the need for model calibration at the regional and national scales, and the high cost of computation. A practical approach for crop biomass mapping should include a compromise among data accessibility, model complexity, and computational resource availability.
Supplementary Materials: The following are available online at https://www.mdpi.com/2072-429 2/13/4/806/s1, Figure S1: Soil texture at the SMAPVEX16 field campaign. Figure S2: (a) Successful smoothing of LAI time-series derived from Sentinel-2; (b) Unsuccessful LAI smoothing due to missing of LAI data in the growing season (August) in 2016. Figure S3: BEPS-simulated crop GPP vs. above-ground biomass for all crop types in the study area including these fields with wrong LAIs as shown in the red circle. Figure S4: The correlation between crop above ground biomass (AGB) and Sentinel-2 derived LAI for all crops in the study area. Figure S5: The correlation between crop above ground biomass (AGB) and Sentinel-2 derived LAI for each of the crop types in the study area. Figure S6: The correlation between crop leaf dry biomass and Sentinel-2 derived LAI for all crop types in the study area. Figure S7: The correlation between crop leaf biomass and Sentinel-2 derived green LAI for each crop types in the study area. Table S1: Statistics of above-ground biomass for each crop type in the study area.