Modeling Gross Primary Production of Midwestern US Maize and Soybean Croplands with Satellite and Gridded Weather Data

Gross primary production (GPP) is a useful metric for determining trends in the terrestrial carbon cycle. To estimate daily GPP, the cloud-adjusted light use efficiency model (LUEc) was developed by adapting a light use efficiency (LUE, ε) model to include in situ meteorological data and biophysical parameters. The LUEc uses four scalars to quantify the impacts of temperature, water stress, and phenology on ε. This study continues the original investigation in using the LUEc, originally limited to three AmeriFlux sites (US-Ne1, US-Ne2, and US-Ne3) by applying gridded meteorological data sets and remotely sensed green leaf area index (gLAI) to estimate daily GPP over a larger spatial extent. This was achieved by including data from four additional AmeriFlux locations in the U.S. Corn Belt for a total of seven locations. Results show an increase in error (RMSE = 3.5 g C m−2 d−1) over the original study in which in situ data were used (RMSE = 2.6 g C m−2 d−1). This is attributed to poor representation of gridded weather inputs (vapor pressure and incoming solar radiation) and application of gLAI algorithms to sites in Iowa, Minnesota, and Illinois, calibrated using data from Nebraska sites only, as well as uncertainty due to climatic variation. Despite these constraints, the study showed good correlation between measured and LUEc-modeled GPP (R2 = 0.80 and RMSE of 3.5 g C m−2 d−1). The decrease in model accuracy is somewhat offset by the ability to function with gridded weather datasets and remotely sensed biophysical data. The level of acceptable error is dependent upon the scope and objectives of the research at hand; nevertheless, the approach holds promise in developing regional daily estimates of GPP.


Introduction
Gross primary production (GPP) in maize and soybean crops is an important measure for quantifying large scale carbon fluxes and plant productivity. GPP is defined as the total amount of carbon dioxide fixed by photosynthesis in a given area over a unit of time. GPP is a useful metric in determining the patterns and dynamics of the terrestrial carbon cycle [1] and it is essential in the study of ecosystem respiration and biomass accumulation [2]. The need to quantify the North American carbon sink necessitates precise carbon dioxide flux measurements [3] and while in situ data sources are available for this quantification, they represent field level data collection at specific locations. Extending GPP from field to regional scales can identify larger patterns and dynamics

Study Site Summary
Data from seven AmeriFlux Midwestern agricultural sites in Nebraska, Illinois, Iowa, and Minnesota were used in this study ( Figure 1 and Table 1). All sites are characterized by a humid continental climate (Dfa Koeppen climate classification) with hot humid summers and cold winters ( Figure 2). The Nebraska sites are located at the University of Nebraska Eastern Nebraska Research and Extension Center near Mead, NE. All three Nebraska sites (US-Ne1, US-Ne2, and US-Ne3) have been under no-till or minimum-till management aside from initial disking conducted in 2001. Conservation-plow tillage has been in effect at US-Ne1 (only) since 2005. Only US-Ne1 and US-Ne2 are irrigated. Water applied was in accordance with best practices to reduce water stress and administer fertilizer, herbicide, and pesticides. The Illinois site, US-Bo1, is located approximately 12 km south of Champaign, Illinois. The site is not irrigated and is under no-till management administered by NOAA. The two Iowa Brooks Field sites, US-Br1 and US-Br3, are located approximately 8 km southwest of Ames, Iowa. The sites are not irrigated and are under a tillage management system typical throughout the Upper Midwest Corn Belt. The Minnesota Rosemount site, US-Ro1, is located approximately 24 km south of Saint Paul, Minnesota. This field is not irrigated and is under chisel plow tillage in the fall following maize harvest and in the spring after soybean harvest in the fall. US-Ro1 is managed jointly by the University of Minnesota and USDA. Each site was equipped with an eddy covariance tower measuring key fluxes on a continuous basis. For these sensors, measurement heights were adjusted at each site (except US-Bo1) either once per season (US-Ne1, US-Ne2, US-Ne3) or at more frequent intervals (US-Ro1, US-Br1, US-Br3) based on crop height to keep the fetch area within the field during unstable, neutral, or moderately stable atmospheric conditions. Each site was equipped with an eddy covariance tower measuring key fluxes on a continuous basis. For these sensors, measurement heights were adjusted at each site (except US-Bo1) either once per season (US-Ne1, US-Ne2, US-Ne3) or at more frequent intervals (US-Ro1, US-Br1, US-Br3) based on crop height to keep the fetch area within the field during unstable, neutral, or moderately stable atmospheric conditions.

Daymet Data
Daymet weather data were obtained from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC) Daymet (Version 2) [16]. The dataset was selected due to its relatively fine spatial resolution of 1 × 1 km. The dataset provides a grid of mosaicked gridded estimates of daily weather parameters. Parameters are generated using a network of ground observation sites to provide input data which is then interpolated and extrapolated using Daymet model algorithms. Daily data for respective years of the selected AmeriFlux study sites (referred to as "field years") were extracted from the various sites using the "Single Pixel Extraction" tool (https://daymet.ornl.gov/dataaccess.html#SinglePixel) with specified latitude, longitude, and date range. Minimum temperature (Tmin, °C), maximum temperature (Tmax, °C), incoming shortwave radiation (Rg, J m −2 d −1 ), average vapor pressure (P, kPa), snow-water equivalent (kg m −2 ), precipitation (mm), and day length (s) were downloaded in CSV format in which average daily

Daymet Data
Daymet weather data were obtained from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC) Daymet (Version 2) [16]. The dataset was selected due to its relatively fine spatial resolution of 1 × 1 km. The dataset provides a grid of mosaicked gridded estimates of daily weather parameters. Parameters are generated using a network of ground observation sites to provide input data which is then interpolated and extrapolated using Daymet model algorithms. Daily data for respective years of the selected AmeriFlux study sites (referred to as "field years") were extracted from the various sites using the "Single Pixel Extraction" tool (https://daymet.ornl.gov/ dataaccess.html#SinglePixel) with specified latitude, longitude, and date range. Minimum temperature (Tmin, • C), maximum temperature (Tmax, • C), incoming shortwave radiation (Rg, J m −2 d −1 ), average vapor pressure (P, kPa), snow-water equivalent (kg m −2 ), precipitation (mm), and day length (s) were Remote Sens. 2020, 12, 3956 6 of 21 downloaded in CSV format in which average daily temperature, vapor pressure deficit, and incoming PAR were calculated in Microsoft Excel. Average daily temperature (T avg ) was calculated as: Vapor pressure deficit (VPD) was calculated using vapor pressure, P, as: Incoming PARin (µmol m −2 d −1 ) was calculated from incoming shortwave radiation (Rg) as: The constant 2.07 is computed as the product of the fraction of total shortwave at the top of the atmosphere, 0.45 Rg, J m −2 d −1 [17] and approximate energy conversion over the PAR range, 4.6 µmol J −1 [18].

MODIS Data
MODIS v.5 eight-day composite surface reflectance (ρ) (products MOD09Q1 and MYD09Q1, Terra and Aqua, respectively) for band 1 (620-670 nm, ρ red ) and band 2 (841-876 nm, ρ NIR ) at 250 m spatial resolution were obtained using Google Earth Engine. Reflectances were used to calculate a wide dynamic range vegetation index (WDRVI) time series at eight day intervals as: where α is a weighting parameter (typically 0.1-0.2 and set at 0.1 here). WDRVI is the normalized difference vegetation index (NDVI) when α equals 1; WRDVI equals zero when α equals (ρ red /ρ NIR ) [19]. WDRVI values from the pixel nearest the geographic center of each field were used for analysis to avoid mixed pixels from adjacent areas due to the 250 m pixel resolution. Green leaf area index (gLAI) values were estimated from the WDRVI time series following the approach of Nguy-Robertson et al. [20], with coefficients determined from ground samples at the Nebraska sites: gLAI = 5.06 · (WDRVI value) + (−0.47) {Maize} (5) The gLAI data were interpolated to continuous daily values for the growing season using Curve Expert 1.4 software (Hyams Development, https://www.curveexpert.net) and a cubic spline interpolation algorithm.

AmeriFlux Data
AmeriFlux (https://ameriflux.lbl.gov/) half-hourly or hourly GPP mean values were used to calibrate and validate the LUEc. GPP mean values were calculated from AmeriFlux half-hourly or hourly averages of net ecosystem exchange (NEE, µmol m −2 s −1 ) and ecosystem respiration (Re, µmol m −2 s −1 ). At all non-Nebraska sites NEE was assumed equal to canopy carbon dioxide flux (Fc, µmol m −2 s −1 ) as storage term calculations were not provided in these datasets and are likely small for these crop systems. NEE values were screened employing light response relationships. Values were graphed in relation to concurrent AmeriFlux measured half-hourly or hourly averages of PAR (µmol m −2 s −1 ) and a nonrectangular hyperbola model fitted to a period of five days. Outliers, typically more than three standard deviations from the model, were removed. For field years in which PAR was not provided, Equation (3) was used to approximate incident PAR fluxes using values of incoming solar radiation (Rg). Interpolation of daytime NEE and nighttime AmeriFlux measured Remote Sens. 2020, 12, 3956 7 of 21 ecosystem respiration (Re, µmol m −2 s −1 ) were then conducted. Half-hour averages were converted to half hour fluxes by multiplying the averages by 1800 s (half-hour) −1 ; hourly averages were converted to hourly fluxes by multiply the averages by 3600 s h −1 . Daily GPP was determined as the sum of daily NEE (half-hourly NEE fluxes (hourly for Mead sites)) and daily Re (half-hourly Re fluxes (hourly for Mead sites)): daily GPP = daily NEE − daily Re (7) where NEE and Re are positive when the flux is toward the surface. Daily GPP values are in units of g C m −2 d −1 .
Field year growing season start dates were determined to be days in which daily GPP values commenced to be greater than zero and field year growing season end dates were determined to be days in which daily GPP values fell below the zero threshold.

GPP Modeling Approach
The cloud-adjusted light use efficiency model (LUEc) [13] is an adaptation of the simple light use efficiency model of Monteith [21] based on earlier progressive work [3,7] to incorporate scalars to estimate daily GPP as: ε 0 is the daily light use efficiency during clear sky conditions, APAR is the daily photosynthetically active photon flux absorbed by the green portion of the canopy (Equation (10)); the scalars account for the impact of diffuse light (C scalar ), air temperature (T scalar ), water stress (W scalar ), and phenology (P scalar ). ε 0 and the four scalars combined represent the well-known daily light use efficiency term, ε [21]: Canopy extinction coefficients are the same as those used in the previous study [13] and are the average value of k for each crop when gLAI is greater than 1.5 m 2 m −2 and dead LAI is less than 0.5 m 2 m −2 . Calculations of the scalars follow those previously mentioned using gridded Daymet and MODIS-derived LAI values instead of AmeriFlux on-site meteorological and biophysical observations as model input. This was done to meet our objective of evaluating the LUEc applicability in using input data derived from remotely sensed satellite data and gridded weather datasets to evaluate the potential to provide regional GPP estimates. A summary of each scalar is provided here but the reader is directed to Nguy-Robertson et al. [13] for details.
C scalar . C scalar accounts for the effects of diffuse lighting on photosynthesis [3]. Plants tend to use diffuse light more efficiently than direct sunlight. ε will increase on overcast days, where lighting is more diffuse than on clear days. C scalar is calculated as: The term β is the sensitivity of daily light use efficiency to diffuse light and PAR d is the diffuse PAR flux. As PAR d is not commonly measured, the ratio PAR d /PAR in can be approximated using a cloudiness coefficient (CC) term [13]: which is determined from PAR in and PAR potential (PAR pot ): Remote Sens. 2020, 12, 3956 8 of 21 PAR pot refers to the estimated potential amount of incoming PAR flux, accounting for influences such as time of year, latitude, atmospheric pressure, and elevation as calculated by Weiss and Norman [17] with corrections reported in Nguy-Robertson et al. [13].
T scalar . Initially developed by Raich et al. [22] in regions excluding croplands, T scalar takes into account temperature effects on photosynthesis. The scalar is calculated using mean daily temperature, T avg , calculated as the average of Daymet minimum and maximum temperatures. The constants for minimum temperature, maximum temperature, and optimal temperature (10, 48, and 28 • C, respectively) of Kalfas et al. [23] designed specifically with maize were used to define T scalar as: while croplands were not examined in Raich et al. [22], values for ecoregions were consistent. Therefore, for this study, soybean optimal temperatures were assumed to be similar to those of maize. W scalar . The W scalar accounts for water stress effects on photosynthesis and was calculated based on VPD using Equation (2) and associated constants [13]. This scalar was modeled by combining the approach of Wu et al. [24] and Maselli et al. [25], as water stress can be introduced through both atmospheric water deficits and soil water deficits. Unlike models in which the scalar remains constant until a critical threshold for VPD is reached, the LUEc approach has no threshold for VPD and the scalar was allowed to vary based on Daymet data.
σ Wscalar is a term for the curvature parameter for water stress, proposed by Gilmanov et al. (2013) [26] to account for varying convexity in the relationship between photosynthesis and water stress: While famers can irrigate to mitigate the impacts of VPD, the site-specific volume of applied water to a field is not inherent to the LUEc. P scalar . P scalar accounts for leaf phenology, as immature leaves do not photosynthesize as efficiently as mature leaves, and leaves that are senescing do not have an optimal photosynthetic capacity [27,28]. The scalar is calculated using gLAI, a constant maximum gLAI value (gLAI max ) specific to each crop, and σ Pscalar, the curvature parameter for the relationship between photosynthesis and phenology:

Calibration and Validation
Four basic steps were followed in this study: (1) Field Year Selection. For statistical purposes, 75% of field years were chosen randomly without replacement (i.e., 47 out of the 65 total field years) to calibrate the model (Table 2), to ensure at least one field year was reserved for each crop type for validation purposes for a total of 18 field years (Table 3). (2) LUEc Calibration. The iterative training process using an R script as described in Nguy-Robertson et al. [13] was used to determine ε 0 , β, σ Wscalar , and σ Pscalar values for irrigated and nonirrigated maize and soybean crops. The training utilized input values from the selected calibration field year flux data, Daymet derived T avg , VPD, APAR, and T scalar , and remotely-derived gLAI. The iterative process trained the parameters using a step-by-step method in which each scalar was estimated one at a time. During the iteration in which a parameter is calculated, assumptions are made about the other parameters to mimic optimal field conditions. Once the parameters are calculated, the iterations are repeated using the entire calibration dataset to produce ε 0, β, σ Wscalar , and σ Pscalar values with corresponding standard deviations. The calibrated dataset includes AmeriFlux-derived GPP, Daymet-derived T avg , VPD, CC, APAR, and T scalar , and remotely derived gLAI data. Additionally, gLAI max values for irrigated and rainfed soybean and maize were estimated from MODIS-derived gLAI data (Equations (5) and (6)). From these outputs, scalars were calculated (Equations (8)-(15)). (3) Daily GPP Estimation. Data from validation field year datasets (Table 3) and parameters derived from the iterative process along with specified constants (Table 4), were used to calculate daily values of the scalars, C scalar , W scalar , and P scalar (Equations (11), (14), and (15)) and APAR (Equation (10)), from which daily GPP values were estimated (Equation (8)). (4) LUEc Validation. The estimated daily GPP values were compared to the observed AmeriFlux determined daily GPP using linear regression (α = 0.05), with corresponding R 2 , root mean square error (RMSE), and mean normalized bias (MNB).

All Sites Years Combined
The parameters ε 0, β, σ Wscalar , σ Pscalar , and gLAI max used to drive the LUEc, differed slightly from those reported previously [13] (Table 4). By using field years from geographically separated locations in the calibration process, the parameters were generalized to best fit conditions across all study locations of the northern Midwest region rather than just for the Nebraska sites as in the original Nguy-Robertson et al. [13] study.
Daily GPP for maize and soybean for the selected validation field years (Table 3), estimated using calculated daily values of C scalar , W scalar , and P scalar , and APAR, were compared to observed AmeriFlux-determined daily GPP and indicate a strong correlation (r = 0.89; R 2 = 0.80), a slope of 0.93 ± 0.02, and an overall RMSE of 3.5 g C m −2 d −1 (Figure 3a) for all field years; however, the MNB was 30.5%. The results varied with crop. RMSE for overall maize field years was 3.7 g C m −2 d −1 (r = 0.90; R 2 = 0.82, and slope of 0.93 ± 0.02), with a MNB of 17.9% (Figure 3b) while RMSE for overall soybean field years was 3.2 g C m −2 d −1 (r = 0.80; R 2 =0.64, and a slope of 0.89 ± 0.05) with a MNB of 53.6% (Figure 3c). Residual plots did not highlight any potential issues with the data and demonstrated data were of a normal distribution.

All Sites Years Combined
The parameters ε0, β, σWscalar, σPscalar, and gLAImax used to drive the LUEc, differed slightly from those reported previously [13] (Table 4). By using field years from geographically separated locations in the calibration process, the parameters were generalized to best fit conditions across all study locations of the northern Midwest region rather than just for the Nebraska sites as in the original Nguy-Robertson et al. [13] study.
Daily GPP for maize and soybean for the selected validation field years (Table 3), estimated using calculated daily values of Cscalar, Wscalar, and Pscalar, and APAR, were compared to observed AmeriFlux-determined daily GPP and indicate a strong correlation (r = 0.89; R 2 = 0.80), a slope of 0.93 ± 0.02, and an overall RMSE of 3.5 g C m −2 d −1 (Figure 3a) for all field years; however, the MNB was 30.5%. The results varied with crop. RMSE for overall maize field years was 3.7 g C m −2 d −1 (r = 0.90; R 2 = 0.82, and slope of 0.93 ± 0.02), with a MNB of 17.9% (Figure 3b) while RMSE for overall soybean field years was 3.2 g C m −2 d −1 (r = 0.80; R 2 =0.64, and a slope of 0.89 ± 0.05) with a MNB of 53.6% (Figure 3c). Residual plots did not highlight any potential issues with the data and demonstrated data were of a normal distribution.

Maize and Soybean Field Years
While several sites were relatively unbiased (<15% MNB), the model tended to overestimate GPP overall. Most of the underestimation occurred at low GPP values (Figures 5 and 6). Another trend is

Maize and Soybean Field Years
While several sites were relatively unbiased (<15% MNB), the model tended to overestimate GPP overall. Most of the underestimation occurred at low GPP values (Figures 5 and 6). Another trend is

Maize and Soybean Field Years
While several sites were relatively unbiased (<15% MNB), the model tended to overestimate GPP overall. Most of the underestimation occurred at low GPP values (Figures 5 and 6). Another trend is that the correlation between measured and modeled GPP values is stronger with maize than with soybean datasets (R 2 of 0.65-0.87 for maize and 0.55-0.75 for soybeans).
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 21 that the correlation between measured and modeled GPP values is stronger with maize than with soybean datasets (R 2 of 0.65-0.87 for maize and 0.55-0.75 for soybeans).

Discussion
This study utilized seven fields from four geographic locations. The agreement between modeled and measured values is not as strong as for maize and soybean GPP at sites in southeastern Nebraska alone in the original study [13] (overall slope of 0.94 ± 0.02, R 2 = 0.91, RMSE = 2.6 g C m −2 d −1 , and MNB = 1.7%). Thus, the MNB of 30.5% across all sites in the current study is of concern (Figure 3a). It was expected that errors would be larger when using modeled input data as opposed to measured input data specific to the site. This was the case for Cui et al. [1], who attained an RMSE of 2.97 g C m −2 d −1 when comparing their modeled GPP data to in situ derived GPP of mixed land cover in the Heihe River Basin, China. Three of the seven fields, which include 36 of 65 total field years, are located in southeastern Nebraska and are the same as those used in the original study. With over half of the field years from this specific location, a calibration bias may have occurred. This bias may have led to smaller errors for the Nebraska sites and larger errors for the non-Nebraska sites than would be the case if the field years were more evenly represented for model calibration ( Table  2).
In addition, there are inherent errors and uncertainties in the gridded Daymet data [29], attributed to weaker agreement between measured and modeled daily GPP values. Additionally, the gLAI algorithms used in this study were developed using the Mead, NE sites [20] and may not be ideal estimators of gLAI for the other sites due to characteristics not captured in the site-specific

Discussion
This study utilized seven fields from four geographic locations. The agreement between modeled and measured values is not as strong as for maize and soybean GPP at sites in southeastern Nebraska alone in the original study [13] (overall slope of 0.94 ± 0.02, R 2 = 0.91, RMSE = 2.6 g C m −2 d −1 , and MNB = 1.7%). Thus, the MNB of 30.5% across all sites in the current study is of concern (Figure 3a). It was expected that errors would be larger when using modeled input data as opposed to measured input data specific to the site. This was the case for Cui et al. [1], who attained an RMSE of 2.97 g C m −2 d −1 when comparing their modeled GPP data to in situ derived GPP of mixed land cover in the Heihe River Basin, China. Three of the seven fields, which include 36 of 65 total field years, are located in southeastern Nebraska and are the same as those used in the original study. With over half of the field years from this specific location, a calibration bias may have occurred. This bias may have led to smaller errors for the Nebraska sites and larger errors for the non-Nebraska sites than would be the case if the field years were more evenly represented for model calibration ( Table 2).
In addition, there are inherent errors and uncertainties in the gridded Daymet data [29], attributed to weaker agreement between measured and modeled daily GPP values. Additionally, the gLAI algorithms used in this study were developed using the Mead, NE sites [20] and may not be ideal estimators of gLAI for the other sites due to characteristics not captured in the site-specific algorithms used in this study. Unfortunately, ground data were not available at the other AmeriFlux locations to generate more generic gLAI algorithms. The underestimation of early and late season GPP is attributed to the sensitivity of gLAI algorithms to soil background at low LAI and the inaccurate representation of canopy architecture with the given k constants when foliage density is low. In both cases, any noise in the approach is amplified due to low values of GPP during these periods.
The correlation between measured and modeled GPP values tended to be higher with maize than with soybean datasets (Figure 3b,c). The association among individual sites varied as well (Figures 5 and 6). Factors contributing to a trend of a stronger association with the maize data include (1) the increased sensitivity to diffuse light of soybean plants compared to maize (β of 0.294 ± 0.002 for soybean, 0.181 ± 0.014 for maize) combined with the concerns regarding PARin data (discussed below); (2) estimates of gLAI may be a factor as gLAI algorithms for maize and soybean were developed for the Mead, NE training sites (US-Ne1, US-Ne2, and US-Ne3) and may not adequately estimate gLAI for the other locations which may have different plant densities (there is also uncertainty associated with separating green and nongreen leaf material during senescence [30]); (3) there were fewer soybean field years (17) than maize field years (30) with which to train the scalars, which may have allowed for more error in the soybean scalars due to the smaller sample size ( Table 2); (4) the soybean growing season is much shorter than for maize, so there are fewer data points to calibrate/validate the model; (5) the temperature scalar was selected based on optimal temperatures for maize. There are physiological differences between C3 and C4 plants that may account for minor shifts in the optimal temperature range. However, using crop-specific temperature ranges will complicate the model and require in-season crop type maps. This would reduce the applicability of generating regional estimates.
To further understand errors in the GPP estimates, preliminary investigations were undertaken to determine the potential contribution of Daymet-derived values, scalars of the LUEc, and gLAI.

Daymet-Derived PARin and VPD
Daymet-derived values are compared to those measured at the AmeriFlux sites in Figure 7. T ave values (Equation (1), using Daymet T min and T max ) showed good agreement with AmeriFlux measurements (Figure 7a). Daily PARin values and VPD (Equation (2) using Daymet vapor pressure) were less consistent (R 2 of 0.48 and 0.54, respectively) (Figure 7b,c).
Overestimation of PARin (Figure 7b) would result in a decrease in the C scalar value and is self-correcting in the calibration process which accounts for some of the differences in ε 0 and β (Table 4). By isolating PAR and LAI as variables for GPP modeling, Suyker and Verma [3] showed that PAR was a predominant factor in GPP variability. Gilabert et al. [31] employed an optimization scheme by adjusting values of PAR, the fraction of APAR, and ε (using the Monteith approach of Equation (9) with the product of the fraction of APAR and PAR equal to APAR). They found that PAR and the fraction of APAR were the most important factors in estimating daily GPP. The present study suggests an unsystematic error in PARin (and thus APAR) values. This unsystematic error in PARin is attributed to the Daymet calculation of shortwave radiation from which PARin is estimated (Equation (3)) and/or our attempt to upscale field-level PARin and GPP measurements to the Daymet 1 km grid. Overestimation of PARin will overestimate the amount of light available for photosynthesis in APAR calculations (Equation (10)) and reduce the model's ability to properly account for contributions of diffuse light to GPP (Equation (11)). Other sources of input for PARin are likely needed [32,33].
Likewise, the apparent poor representation of the Daymet vapor pressure product for a specific site location and the fact that the inherent error is less predictable implies that a new source may be needed for this variable. One such source for both vapor pressure and PARin data could be PRISM (Parameter-elevation Relationships on Independent Slopes Model; PRISM [34]), which Mourtzinis et al. [29] found more accurate than Daymet. Underestimation of Daymet-derived VPD is likely due to the air mass over agricultural fields, especially irrigated agricultural fields, having lower VPD than reported for grids in which the fields are located. Underestimation of VPD increases the W scalar value, which may inaccurately represent water stress conditions at field sites. Additionally, the calibration is influenced by 17 irrigated field years at the Mead, NE sites (US-Ne1 and US-Ne2; 58% of the field years used to calibrate the LUEc). This may have biased the σ Wscalar parameter to better reflect irrigated crop management practices while the other five fields are under rainfed crop management. The modeled VPD would likely be overestimated at the two Mead, NE sites due to VPD being anthropogenically lowered through the irrigation practice. While the rainfed sites have higher annual precipitation rates than the Mead, NE sites, the timing is irregular and can contribute to greater uncertainty; irrigation at US-Ne1 and US-Ne2 supplements rainfall and thus reduces uncertainty. Thus, irrigation bias could increase error in modeled GPP at the rainfed sites.
Remote Sens. 2020, 12, x FOR PEER REVIEW 16 of 21 which may inaccurately represent water stress conditions at field sites. Additionally, the calibration is influenced by 17 irrigated field years at the Mead, NE sites (US-Ne1 and US-Ne2; 58% of the field years used to calibrate the LUEc). This may have biased the σWscalar parameter to better reflect irrigated crop management practices while the other five fields are under rainfed crop management. The modeled VPD would likely be overestimated at the two Mead, NE sites due to VPD being anthropogenically lowered through the irrigation practice. While the rainfed sites have higher annual precipitation rates than the Mead, NE sites, the timing is irregular and can contribute to greater uncertainty; irrigation at US-Ne1 and US-Ne2 supplements rainfall and thus reduces uncertainty. Thus, irrigation bias could increase error in modeled GPP at the rainfed sites.

LUEc Scalars
The scalars were generalized to better estimate GPP across the various sites within the region rather than calibrate scalars for each site, causing them to be less precise for any given location than those calculated for the Nebraska sites in the previous study [13]. The LUEc was run with all scalars set to a value of 1 to investigate their impact in the LUEc on the GPP calculation using the gridded

LUEc Scalars
The scalars were generalized to better estimate GPP across the various sites within the region rather than calibrate scalars for each site, causing them to be less precise for any given location than those calculated for the Nebraska sites in the previous study [13]. The LUEc was run with all scalars set to a value of 1 to investigate their impact in the LUEc on the GPP calculation using the gridded datasets. GPP calculated with the scalars set to 1 had an RMSE of 5.35 g C m −2 d −1 , an R 2 of 0.71, and an MNB value of 104.3% (Figure 8). Although the scalars may have been more effective if more accurate gridded input data were available, the model performed better overall with the scalars derived in this study using the Daymet and MODIS data than when run strictly as a LUE model. This supports the need to include meteorological data (stressors) in estimates of GPP, in contrast to an observation of Tramontana et al. [8] that predictions based on remotely sensed data gave optimally accurate estimates.

LUEc Scalars
The scalars were generalized to better estimate GPP across the various sites within the region rather than calibrate scalars for each site, causing them to be less precise for any given location than those calculated for the Nebraska sites in the previous study [13]. The LUEc was run with all scalars set to a value of 1 to investigate their impact in the LUEc on the GPP calculation using the gridded datasets. GPP calculated with the scalars set to 1 had an RMSE of 5.35 g C m −2 d −1 , an R 2 of 0.71, and an MNB value of 104.3% (Figure 8). Although the scalars may have been more effective if more accurate gridded input data were available, the model performed better overall with the scalars derived in this study using the Daymet and MODIS data than when run strictly as a LUE model. This supports the need to include meteorological data (stressors) in estimates of GPP, in contrast to an observation of Tramontana et al. [8] that predictions based on remotely sensed data gave optimally accurate estimates.

gLAI Estimates
LAI is a primary consideration in estimating net primary productivity [35]. The gLAI values used in this study were not verified at the non-Nebraska sites as no gLAI measurements are available for those sites. Thus relationships previously developed for the three Nebraska sites [13] were used. Of concern is the relationship with the gLAI after peak green-up. In addition to the fact that the algorithms were developed over a geographically limited area, the uncertainty associated with separating green and nongreen leaf material during senescence likely also contributed to less reliable gLAI estimates during senescence. Algorithms using a red edge band are less sensitive to nongreen

gLAI Estimates
LAI is a primary consideration in estimating net primary productivity [35]. The gLAI values used in this study were not verified at the non-Nebraska sites as no gLAI measurements are available for those sites. Thus relationships previously developed for the three Nebraska sites [13] were used. Of concern is the relationship with the gLAI after peak green-up. In addition to the fact that the algorithms were developed over a geographically limited area, the uncertainty associated with separating green and nongreen leaf material during senescence likely also contributed to less reliable gLAI estimates during senescence. Algorithms using a red edge band are less sensitive to nongreen leaf material [20]; however, the temporal resolution would have been reduced and increased uncertainty in interpolation had MERIS or Sentinel-2 data been selected.

APAR Importance in GPP Estimates
Overestimation of PARin greatly affects APAR in a manner that causes overestimation of GPP, which will inflate the MNB. Estimated APAR (Equation (10)) for the three Nebraska sites was plotted as a function of measured APAR in an attempt to determine how well the model was estimating the variable. Similar data were not available from the AmeriFlux website for the other sites. Two of the inputs in calculating APAR are gLAI and PARin, and a good portion of the increased error is attributed to the APAR term in the GPP equation (Equation (9)). When compared to measured APAR, estimated APAR for the three Nebraska sites (US-Ne1, US-Ne2, and US-Ne3) had a combined RMSE of 10.5 mol m −2 d −1 , with an r of 0.72 and an R 2 of 0.53 (Figure 9a). When the data are separated into green-up (DOY ≤ 220) and post-green peak (DOY > 220) sets, it becomes apparent that there is a stronger correlation between measured and modeled APAR during the green-up phase than the post green-up phase (Figure 9b,c). inputs in calculating APAR are gLAI and PARin, and a good portion of the increased error is attributed to the APAR term in the GPP equation (Equation (9)). When compared to measured APAR, estimated APAR for the three Nebraska sites (US-Ne1, US-Ne2, and US-Ne3) had a combined RMSE of 10.5 mol m −2 d −1 , with an r of 0.72 and an R 2 of 0.53 (Figure 9a). When the data are separated into green-up (DOY ≤ 220) and post-green peak (DOY > 220) sets, it becomes apparent that there is a stronger correlation between measured and modeled APAR during the green-up phase than the post green-up phase (Figure 9b,c).

Conclusions
The main objective of this study was to evaluate the potential of the LUEc enhanced four-scalar light use efficiency approach of Nguy-Robertson et al. [13] to provide regional GPP estimates. By incorporating MODIS imagery and gridded weather data into the LUEc, daily GPP was estimated over the entire growing season in seven U.S. Corn Belt agricultural sites in four Midwestern US locations. Overall RMSE between measured GPP and modeled GPP for all field years was 3.5 g C m −2 d −1 . This is an increase in RMSE of 0.9 g C m −2 d −1 compared to the in situ-based study [13]. However, the MNB of 30.5% is of particular concern. Generalized scalars were developed for maize and soybeans when the model was calibrated with remotely sensed derived gLAI data and gridded weather data from all seven sites. GPP estimates were better at some sites (Mead, NE) than at others. Accuracy of the LUEc estimates decreased when remotely sensed and gridded weather data inputs were used compared to the results of Nguy-Robertson et al. [13]. While accuracy decreased, so did the amount of resources required to run the model. The findings of this study indicate that the LUEc can provide acceptable daily GPP estimates for regions outside eastern Nebraska (for which the model was originally tested) using gridded and remotely sensed data. The LUEc performs better than a LUE model that does not incorporate stressors, supporting the need to include meteorological data (stressors) in estimating daily GPP. Future research should be conducted to determine if more generalized coefficients can be derived to allow the model to function accurately at broader regional scales.