Combining High-Resolution Remote Sensing Products with a Crop Model to Estimate Carbon and Water Budget Components: Application to Sunﬂower

: The global increase in food demand in the context of climate change requires a clear understanding of cropland function and of its impact on biogeochemical cycles. However, although gas exchange between croplands and the atmosphere is measurable in the ﬁeld, it is di ﬃ cult to quantify at the plot scale over relatively large areas because of the heterogeneous character of landscapes and di ﬀ erences in crop management. However, assessing accurate carbon and water budgets over croplands is essential to promote sustainable agronomic practices and reduce the water demand and the climatic impacts of croplands while maintaining su ﬃ cient yields. From this perspective, we developed a crop model, SAFYE-CO 2 , that assimilates high spatial- and temporal-resolution (HSTR) remote sensing products to estimate daily crop biomass, water and CO 2 ﬂuxes, annual yields, and carbon budgets at the parcel level over large areas. This modeling approach was evaluated for sunﬂower against two in situ datasets. First, the model’s output was compared to data acquired during two cropping seasons at the Aurad é integrated carbon observation system (ICOS) instrumented site in southwestern France. The model accurately simulated the daily net CO 2 ﬂux (root mean square error (RMSE) = 0.97 gC · m − 2 · d − 1 and determination coe ﬃ cient (R 2 ) = 0.83) and water ﬂux (RMSE = 0.68 mm · d − 1 and R 2 = 0.79). The model’s performance was then evaluated against biomass and yield data collected from 80 plots located in southwestern France. The model was able to satisfactorily estimate biomass dynamics and yield (RMSE = 66 and 54 g · m − 2 , respectively). To investigate the potential application of the proposed approach at a large scale, given that soil properties are important factors a ﬀ ecting the model, a sensitivity analysis of two existing soil products (GlobalSoilMap and SoilGrids) was carried out. Our results show that these products are not su ﬃ ciently accurate for inclusion as inputs to the model, which requires more accurate information on soil water retention capacity to assess water ﬂuxes. Additionally, we argue that no water stress should be considered in the crop growth computation since this stress is already present because of remote sensing information in the proposed approach. This study should be considered a ﬁrst step to fulﬁll the existing gap in quantifying carbon budgets at the plot scale over large areas and to accurately estimate the e ﬀ ects of management practices, such as the use of cover crops or speciﬁc crop rotations on cropland C and water budgets.


Introduction
Sunflower (Helianthus annuus L.) was introduced by Spaniards in Europe during the 16th century as an ornamental plant species before its oil started to be used for food during the 19th century. Currently, sunflower is cultivated on more than five continents, with Ukraine and Russia being the largest producers, followed by the EU-28. In France, sunflower crops cover 590,000 ha, with an annual production of 1.36 M tonnes, resulting in an average yield of 2.3 t·ha −1 (source: FAO, http://www.fao.org/faostat/en, 10-year average between 2007 and 2016). In France, this oilseed crop species is grown mainly in the central-western (Nouvelle Aquitaine) and southwestern (Occitanie) regions. Sunflower is a summer crop; it is mainly grown under rainfed conditions and has a low water-use [1]. In southwestern France, the temperate Mediterranean-like climate (dry and warm summers) results in sunflower crops being subjected to frequent and severe water stress conditions that impact crop productivity at both field and landscape/regional scales. These variations should be quantified to better manage croplands. Moreover, in this context, accurate estimation of yield, carbon (C), and water budgets at the field level over large areas remains challenging. The heterogeneous character of landscape in terms of climatic and soil conditions as well as the diversity of agricultural practices make the upscaling of agronomic models difficult. Thus, a precise understanding of the cropland C budget is needed, as well as agronomic indicators (e.g., water-use efficiency) useful for monitoring crop species, such as sunflower.
Crop models were first used for research purposes [2,3] in the early 1970s. During the 1980s, they became more diversified and complex to meet management needs and to support field-scale decision-making ( [4,5] and farming applications [6]). Today, a considerable number of crop models exist. They can be classified into three groups: 1.
C-driven models, or what we call the "de Wit school" crop growth simulation models [7], are based on the assimilation of C by photosynthetic parts of the plant, such as the SUCROS [8] or GECROS [9] models. These models, on the basis of their hierarchical architecture, are complex, require numerous parameters, and are difficult to calibrate; thus, they are difficult to upscale to large areas. 2.
Water-driven models are based on a predetermined relation between crop growth rate and transpiration according to water productivity parameters. Compared to that of the previous class, the approach of this class results in models whose structure is less complex, leading to fewer parameters. Among the most representative ones are the AquaCrop [10] or AqYield [11] models. 3.
Radiation-driven models, which have been constructed on the basis of [12]'s approach, convert intercepted or absorbed solar radiation into aboveground or total biomass via a radiation-use efficiency parameter, significantly reducing the number of photosynthetic parameters needed. Most crop models are radiation driven, including the CERES [13], EPIC [14], STICS [15], SAFY [16], and SAFY-WB [17] models.
The model presented in the current study belongs to both the first and third groups. Indeed, it evolved from (i) the SAFY-WB model [17], which belongs to the third group and allows the simulation of biomass production, yield, and water fluxes and budget, and from (ii) the SAFY-CO 2 model [18], which calculates biomass production as the difference between plant assimilation and respiration.
To reduce the number of model parameters and to increase the robustness of the models upon upscaling, several remote sensing data assimilation approaches for crop models have been developed. The benefits of using HSTR satellite products in crop models have already been demonstrated in several studies [18,[23][24][25][26]. These approaches are either based on establishing relations between remote sensing products and crop productivity [27] or on calibrating the model's parameters [28][29][30]. Such approaches have already been applied to sunflower crops for biomass estimation [31,32]. Those studies showed that accounting for the spatial heterogeneity in sunflower development is crucial for the accurate estimation of crop dynamics; nevertheless, these models are not able to estimate cropland C budgets. In this context, we propose a newly developed agro-meteorological model, the Simple Algorithm for Yield, Evapotranspiration, and CO 2 fluxes estimates (SAFYE-CO 2 ), that assimilates HSTR products, and we validate the model for use in sunflower. Adapted from models developed by [17,18], this model allows us to simulate cropland growth, yield, and CO 2 and water fluxes at a large scale because it requires little input data (e.g., little to no crop management data).
Initially, the Simple Algorithm For Yield estimates (SAFY) model was developed to estimate crop biomass and production and was first validated for use in winter wheat [16], followed by sunflower and maize [23,31]. Those studies demonstrated the benefits of assimilating the HSTR green area index (GAI) derived from optical remote sensing satellite data for crop monitoring by calibrating several phenological parameters as well as light-use efficiency. Later, this model was coupled with a water module based on the FAO-56 approach [33] to estimate cropland evaporation (Ev) and transpiration (TR) and thus water stress and water needs. This resulted in the SAFY-WB (water balance) model, which was validated for use in wheat [17] and maize [34]. More recently, Pique et al. [18] showed the benefit of using HSTR GAI products for assessing the components of the C budget over winter wheat crops by adapting the SAFY model to a C module, which led to a new model called SAFY-CO 2 .
In line with these studies, a new model, coupling the SAFY-CO 2 model with the same water module as that in the SAFY-WB model, has been constructed and is evaluated here. The new model is called the SAFYE-CO 2 model (E stands for evapotranspiration (ETR)), which aims to assess daily biomass production and CO 2 and water fluxes, as well as annual yields and the C and water budget over croplands. The development and validation of the SAFYE-CO 2 model for sunflower was driven by the following perspectives: (i) Simulation of the water flux and of the soil water provides an opportunity to estimate the effects of water stress on crop growth, which could improve biomass and CO 2 flux estimates for summer crops (e.g., sunflower); and (ii) SAFY-CO 2 has already been validated for use in winter wheat [18], and validation for use in sunflower would allow us to simulate C and water budgets for sunflower-winter wheat crop rotations, which are very common in southwestern France.
In terms of upscaling the model over large areas and simulating the effects of water stress on crop production and C budgets, the objective of this study was twofold: (i) To analyze whether a coupled remote sensing/modeling approach, such as that involving SAFYE-CO 2 , is able to accurately reproduce sunflower function, i.e., biomass, yield, and CO 2 and water fluxes, at the ground-atmosphere interface; and (ii) to evaluate the impact of current soil map products on the simulation accuracy as well as the additive value of coupling SAFY-CO 2 with a water module to improve sunflower simulation.
In the following sections, the materials and methods are described. Afterward, the results section starts with a validation of the SAFYE-CO 2 model against biomass, yield, and CO 2 and water flux measurements at the Auradé integrated carbon observation system (ICOS) instrumented site. A sensitivity analysis of the model to the soil parameters was then performed. Finally, SAFYE-CO 2 was validated via biomass and yield data collected from more than 80 fields during five years, covering a wide range of soil and climatic conditions. Maps of the components of the C budget over a cropping season are then presented. The study ends with a discussion on the benefits and limitations of our approach, raising issues concerning the assessment of accurate C and water budgets at the plot scale over large areas and discussing potential improvements for simulating those variables.

Materials
The proposed study used a large amount of data from several sources to calibrate, validate, and upscale our model. In addition to the meteorological inputs and soil property data needed for the model, high-resolution remote sensing data were required for calibration. The validation of the model is based on CO 2 and water flux measurements, biomass production, and yield. The last two originated either from destructive measurements carried out at the FR-Aur site (see Section 2.2) or Remote Sens. 2020, 12, 2967 4 of 25 from surveys carried out in 2013, 2014, and 2015 [35]. The following sections present these data as well as the study area.

Study Area
The study area is located in southwestern France and encompasses 4 administrative departments (Haute-Garonne, Gers, Aude, and Ariège; see Figure 1). It is part of the Regional Spatial Observatory-Southwest (OSR-SO, https://osr.cesbio.cnrs.fr/) located near Toulouse and includes the instrumented ICOS agricultural site of Auradé (FR-Aur). The region is characterized by a temperate and Mediterranean-like climate, with dry and warm summers (see Table 1). The Garonne River flows across the study area from south to north and has shaped the topography into terraces and hill landscapes that resulted from the erosion of old terraces. This has led to heterogeneous soils called "boulbènes" (silty soil on terraces) and "terrefort" (clayey-limestone soils on hills), characterized by contrasting textures (especially with respect to the clay content) and with variations in depth within each type of soil.
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 28 from surveys carried out in 2013, 2014, and 2015 [35]. The following sections present these data as well as the study area.

Study Area
The study area is located in southwestern France and encompasses 4 administrative departments (Haute-Garonne, Gers, Aude, and Ariège; see Figure 1). It is part of the Regional Spatial Observatory -Southwest (OSR-SO, https://osr.cesbio.cnrs.fr/) located near Toulouse and includes the instrumented ICOS agricultural site of Auradé (FR-Aur). The region is characterized by a temperate and Mediterranean-like climate, with dry and warm summers (see Table 1). The Garonne River flows across the study area from south to north and has shaped the topography into terraces and hill landscapes that resulted from the erosion of old terraces. This has led to heterogeneous soils called "boulbènes" (silty soil on terraces) and "terrefort" (clayey-limestone soils on hills), characterized by contrasting textures (especially with respect to the clay content) and with variations in depth within each type of soil.
According to the 2015 French Land Parcel Identification System (LPIS, or the Registre Parcellaire Graphique in French), agricultural activity represents 66% of the study area (focus in Figure 1), which includes forests, grasslands, orchards, etc. Sunflower crops constituted 19.1% of the agricultural area. In the region, sunflower is usually sown between April and May and is harvested between the end of August and September. The climatic conditions during the crop cycle of sunflower are characterized by high radiation and strong evaporative demand, while low and erratic rainfall occurs in the summer ( Table 1). The selected study area is very seldom irrigated, like elsewhere in France (<5% of the sunflower area; [36]). Owing to its deep root system, its thick and hairy epidermis, and its adaptive capability (through a reduction of the photosynthetic rates), sunflower is  According to the 2015 French Land Parcel Identification System (LPIS, or the Registre Parcellaire Graphique in French), agricultural activity represents 66% of the study area (focus in Figure 1), which includes forests, grasslands, orchards, etc. Sunflower crops constituted 19.1% of the agricultural area.
In the region, sunflower is usually sown between April and May and is harvested between the end of August and September. The climatic conditions during the crop cycle of sunflower are characterized by high radiation and strong evaporative demand, while low and erratic rainfall occurs in the summer ( Table 1). The selected study area is very seldom irrigated, like elsewhere in France (<5% of the sunflower area [36]). Owing to its deep root system, its thick and hairy epidermis, and its adaptive capability (through a reduction of the photosynthetic rates), sunflower is moderately sensitive to water deficit. However, hydric stress occurring before pre-flowering can lead to early senescence, ultimately causing losses in yield and oil content.

Flux Site and Experimental Instrumentation
Since 2005, the FR-Aur site ( Figure 1) has been highly instrumented for the continuous collection of micrometeorological, meteorological, soil, and vegetation data (see [37] for more details) in the context of the OSR-SO. FR-Aur is also part of the European Research Infrastructure Consortium ICOS (https://www.icos-cp.eu/), which is a European network documenting long-term greenhouse gas concentrations in the atmosphere and greenhouse gas exchange between the surface and the atmosphere in several ecosystems under different pedoclimatic conditions (crops, grasslands, forests, etc.). The plot area is 23.5 ha and is highly heterogeneous because of the topography (17.3 m height difference), which results in different soil depths (from 100 cm to more than 200 cm) and thus different crop yield potentials. The FR-Aur field belongs to a conventional grain farm where only mineral fertilization is applied. The plot is characterized by a rapeseed/winter wheat/sunflower/winter wheat rainfed rotation. Sunflower was grown during the 2006-2007 and 2015-2016 cropping seasons (see Table 1 for sowing and harvest dates). Hereafter, the crop years will be abbreviated by "CY" followed by the harvest year (e.g., CY-07 for the 2006-2007 crop season).

Meteorological Data
The SAFYE-CO 2 model requires meteorological data as the main input, while vegetation remote sensing was used to constrain the modeled dynamics. Specifically, air temperature (T AIR ) at 2 m, incoming global radiation (R G ), rain, and reference ETR (ET 0 ) were needed. These variables were provided either from in situ measurements acquired at the FR-Aur site for local validation or from SAFRAN (système d'analyse fournissant des renseignements adaptés à la nivologie) reanalysis data [38] for the spatial analysis.
At FR-Aur, meteorological data are recorded every 30 min and averaged on a daily basis (see [37] for more details). These data were used to calculate the ET 0 using the Penman-Monteith formula [33].
The SAFRAN meteorological data were used as inputs to simulate the sunflower experiments of 2013, 2014, and 2015. SAFRAN reanalysis data are provided over a grid of 8 × 8 km 2 over France. In this study, the climatic data were estimated for each field using a bilinear interpolation of the 4 closest grid points.
The climatic conditions of all simulated years are summarized in Table 1. Each variable was averaged from the 15th of April to the 15th of September, corresponding to the average sunflower growing season in the region. The crop cycles of CY-07 and CY-16 were quite similar, although 2016 was slightly hotter and sunnier than 2007 (Table 1), leading to slightly higher biomass production in 2016. In 2013 and 2014, the growing seasons were colder compared to the season during 2015. The 2013 growing season (CY-13) was the wettest, and the CY-14 growing season was the driest. Note for the sake of clarity, only sowing and harvest dates of CY-07 and CY-16 are indicated in Table 1. Indeed, more than one field was simulated for the others CY (respectively 9, 21, and 49 for CY-13, CY-14, and CY-15), thus the sowing and harvest dates vary by plot, which explains the empty cases of Table 1.

Biomass and Yield Data
Destructive measurements were made at FR-Aur to characterize the sunflower dry aboveground mass (DAM) dynamics during CY-07 and CY-16 (see Table 2). In CY-07, 30 plants were collected monthly inside the flux tower footprint area distributed on either side of the eddy covariance tower towards the prevailing winds. The DAM was obtained by drying samples in an oven at 65 • C for 48 h. The total DAM produced (before harvest) was measured by cutting plants from a 24-m 2 area and measuring them. The grain yield (YLD) was provided by the farmers for CY-07, and the grains were separated from the heads and weighed to obtain yield for CY-16. Extensive field campaigns were conducted in 2013, 2014, and 2015 to collect biomass and yield data across the study area ( Figure 1). In 2013, 9 elementary sampling units (ESUs, 20 × 20 m 2 ) corresponding to 8 different fields were monitored during the growing season. For each ESU and at 4 times during crop development (the last sampling was performed just before harvest), 5 samples of aboveground vegetation consisting of 10 adjacent plants were collected (except for the first date, when only 3 plants were collected), oven dried at 50 • C for 48 h, and then weighed. DAM was calculated by dividing the total biomass by the area sampled, calculated as the product of the interrow width and the length of the sampling area. On the last sampling date, the grains were separated from the heads and weighed to calculate the yield.
In 2014 and 2015, yield data were collected for more than 70 and 116 fields, respectively. They were provided by the farmers, and because some of the yield values were averaged over several fields, only yield data corresponding to one field were considered (see Table 2). Two or three visits were made to each field in the 2014 and 2015 campaigns to characterize the crop growth status (e.g., canopy heterogeneity, presence of weeds, pest and disease pressure). From those observations, we defined quality scores, and low-quality data were discarded. These data corresponded to fields where (i) vegetation-free areas larger than 2 m 2 were observed, and (ii) weeds covered more than 50% of the field. After discarding all these data (115), we kept data for 70 fields for the model evaluation ( Table 2). None of the fields monitored in 2013, 2014, and 2015 were irrigated.

Flux Data
Turbulent fluxes of CO 2 , water vapor (ETR and latent heat), sensible heat, and momentum were measured continuously using the eddy covariance method [39][40][41] by the combination of an open-path fast infrared gas analyzer (LI-7500, LI-COR) and a 3-D sonic anemometer (CSAT3, Campbell Scientific). The flux tower was installed near the center of the field (see Figure 1). EdiRe software (Clement, 1999) was used to calculate the ETR and net CO 2 fluxes (see [18] and [37] for details). Flux filtering, quality controls, and gap filling procedures were performed in accordance with the CarboEurope-IP recommendations (http://www.carboeurope.org/).
The net ecosystem exchange (NEE) of CO 2 was partitioned into gross primary productivity (GPP) and ecosystem respiration (R ECO ) according to the methods proposed by [42] and adapted for croplands Remote Sens. 2020, 12, 2967 7 of 25 by [37]. During the period when bare soil was exposed (deduced from field observations), the GPP was set to 0, while the R ECO was set equal to the NEE.

Soil Data
To run the SAFYE-CO 2 model, data on the soil moisture at field capacity (θ FC ) and the permanent wilting point (θ WP ) are needed for each considered field to estimate the daily soil available water content (AWC). This information can be derived either from in situ sampling and laboratory analysis or can be inferred from soil texture products by applying pedotransfer functions (PTFs). In this study, we used both methodologies: data from in situ soil sampling at FR-Aur and data from the global soil databases GlobalSoilMap (GSM [43]) and SoilGrids (SG [44]) at both FR-Aur and at the other 78 fields. These soil products were used to estimate the θ FC , θ WP , and soil depth for all the considered fields in this study, including those at FR-Aur. The global soil databases provide the fraction of sand, clay, and silt at a 90-(GSM) or 250-m (SG) resolution as well as the soil depth. SG is based on machine learning and remote sensing data, while GSM uses a data-mining technique and a bottom-up approach [45].
We used the PTFs proposed by [46] (Equations (9) and (10)) to calculate the volumetric soil moisture at field capacity (θ FC ) and at permanent wilting point (θ WP ). The estimations of the θ FC , θ WP , and soil depth for FR-Aur using in situ samples, SG, and GSM products are summarized in Table 3 for the 3 soil layers considered in the proposed modeling approach (see Section 3.1). The soil types of all the fields monitored in 2013, 2014, and 2015 were characterized based on the textures derived from the GSM and SG products (for the top 30 cm of the soil) and considering the United States Department of Agriculture (USDA) classification system (see Figure 2). Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 28

Soil Data
To run the SAFYE-CO2 model, data on the soil moisture at field capacity (θFC) and the permanent wilting point (θWP) are needed for each considered field to estimate the daily soil available water content (AWC). This information can be derived either from in situ sampling and laboratory analysis or can be inferred from soil texture products by applying pedotransfer functions (PTFs). In this study, we used both methodologies: data from in situ soil sampling at FR-Aur and data from the global soil databases GlobalSoilMap (GSM, [43]) and SoilGrids (SG, [44]) at both FR-Aur and at the other 78 fields. These soil products were used to estimate the θFC, θWP, and soil depth for all the considered fields in this study, including those at FR-Aur. The global soil databases provide the fraction of sand, clay, and silt at a 90-(GSM) or 250-m (SG) resolution as well as the soil depth. SG is based on machine learning and remote sensing data, while GSM uses a data-mining technique and a bottom-up approach [45].
We used the PTFs proposed by [46] (Eqs. (9-10)) to calculate the volumetric soil moisture at field capacity (θFC) and at permanent wilting point (θWP). The estimations of the θFC, θWP, and soil depth for FR-Aur using in situ samples, SG, and GSM products are summarized in Table 3 for the 3 soil layers considered in the proposed modeling approach (see Section 3.1). The soil types of all the fields monitored in 2013, 2014, and 2015 were characterized based on the textures derived from the GSM and SG products (for the top 30 cm of the soil) and considering the United States Department of Agriculture (USDA) classification system (see Figure 2). Considering all monitored fields, the GSM products suggest that 64% of the soils are clay loams, 23% are loams, 11% are silty clay loams, and the remaining soils vary between clay, silty clay, and silty loams. Compared with the GSM products, the SG products suggest less heterogeneity among the soil types and classify 87% of the soils as clay loams, 12% as loams, and less than 1% as silty clay loams. Considering all monitored fields, the GSM products suggest that 64% of the soils are clay loams, 23% are loams, 11% are silty clay loams, and the remaining soils vary between clay, silty clay, and silty Remote Sens. 2020, 12, 2967 8 of 25 loams. Compared with the GSM products, the SG products suggest less heterogeneity among the soil types and classify 87% of the soils as clay loams, 12% as loams, and less than 1% as silty clay loams.

Multisatellite Optical Images
The authors of [47,48] highlighted the usefulness of assimilating HSTR satellite products into crop models, as it increased the accuracy in reproducing crop dynamics and/or reduced the number of model parameters. Thus, the accuracy of our approach is partly determined by the availability of remote sensing images. Indeed, to successfully calibrate some of the model parameters (see Section 3.3), a certain number of images at key dates had to be available. An extensive set of HSTR products derived from several sensors was thus used in this study. The main characteristics in terms of the spatial resolution, temporal sampling, and spectral bandwidth of each sensor are detailed in Table 4, and the number of images (all in all, 147) from each sensor is shown in Figure 3. For the purpose of the study, a temporal series of remote sensing products, averaged at the plot scale, were needed. Products of different spatial (from 8 to 30 m) and temporal (from 1 to 16 days) scales were thus used in order to get the most complete time series ( Figure 3).

Multisatellite Optical Images
The authors of [47,48] highlighted the usefulness of assimilating HSTR satellite products into crop models, as it increased the accuracy in reproducing crop dynamics and/or reduced the number of model parameters. Thus, the accuracy of our approach is partly determined by the availability of remote sensing images. Indeed, to successfully calibrate some of the model parameters (see Section 3.3), a certain number of images at key dates had to be available. An extensive set of HSTR products derived from several sensors was thus used in this study. The main characteristics in terms of the spatial resolution, temporal sampling, and spectral bandwidth of each sensor are detailed in Table 4, and the number of images (all in all, 147) from each sensor is shown in Figure 3. For the purpose of the study, a temporal series of remote sensing products, averaged at the plot scale, were needed. Products of different spatial (from 8 to 30 m) and temporal (from 1 to 16 days) scales were thus used in order to get the most complete time series (Figure 3).
The authors of [23] showed that the advantages in terms of time sampling of using GAI estimates from multiple sensors outweigh the drawbacks in terms of sensor measurement mismatch. In their study, the authors compared the normalized difference vegetation index (NDVI) of different sensors and showed that, during the growing season (NDVI > 0.2) and over various crops (sunflower, maize, wheat, etc.), linear relationships (R values between 0.97 and 0.99 and relative root mean square error (rRMSE) values between 5.3% and 10.9%) exist between pairs of sensors (including DEIMOS-1, Formosat-2, Landsat-8, and SPOT4 (Take5) [49]). The KALIDEOS processing chain (http://kalideos.cnes.fr) for atmospheric, radiometric, and geometric corrections was used to derive surface reflectance values from the satellite data. The mean geometric correction accuracy was close to 0.2 pixels [50], which is satisfactory considering the area of the studied fields.

From Image Reflectance to GAI Estimates
The remotely sensed GAI estimates were derived from surface reflectance values using the Biophysical Variables Neural NETwork (BV-NNET) tool [51]. BV-NNET is a trained artificial neural network based on the inversion of the radiative transfer model PROSAIL [52]. The GAI estimates take into account the spectral and directional information of each sensor, which are of paramount importance when working with multiple satellites. The GAI value estimated by BV-NNET is an "effective GAI", which does not consider the clumping effect and could lead to underestimation compared to that of the "true GAI", mainly for high GAI values [53]. For each simulated field, the The authors of [23] showed that the advantages in terms of time sampling of using GAI estimates from multiple sensors outweigh the drawbacks in terms of sensor measurement mismatch. In their study, the authors compared the normalized difference vegetation index (NDVI) of different sensors and showed that, during the growing season (NDVI > 0.2) and over various crops (sunflower, maize, wheat, etc.), linear relationships (R values between 0.97 and 0.99 and relative root mean square error (rRMSE) values between 5.3% and 10.9%) exist between pairs of sensors (including DEIMOS-1, Formosat-2, Landsat-8, and SPOT4 (Take5) [49]). The KALIDEOS processing chain (http://kalideos.cnes.fr) for atmospheric, radiometric, and geometric corrections was used to derive surface reflectance values from the satellite data. The mean geometric correction accuracy was close to 0.2 pixels [50], which is satisfactory considering the area of the studied fields.

From Image Reflectance to GAI Estimates
The remotely sensed GAI estimates were derived from surface reflectance values using the Biophysical Variables Neural NETwork (BV-NNET) tool [51]. BV-NNET is a trained artificial neural network based on the inversion of the radiative transfer model PROSAIL [52]. The GAI estimates take into account the spectral and directional information of each sensor, which are of paramount importance when working with multiple satellites. The GAI value estimated by BV-NNET is an "effective GAI", which does not consider the clumping effect and could lead to underestimation compared to that of the "true GAI", mainly for high GAI values [53]. For each simulated field, the mean GAI was calculated taking into account all the pixels of the considered plot after the application of an offset of 10 m to avoid edge effects.

The SAFYE-CO 2 Model
The SAFYE-CO 2 model was constructed by coupling the SAFY-CO 2 model [18] with the water module of the SAFY-WB model [17], which was adapted from the FAO-56 method [33]. SAFY-CO 2 is a parsimonious agro-meteorological model running on a daily basis. It is driven by T AIR and R G as meteorological inputs. It simulates crop dynamics (GAI), crop production (DAM, YLD), CO 2 fluxes (GPP, R ECO , and NEE), and the annual net ecosystem C budget (NECB). Coupling with the water module, which can easily be added or removed, is needed to estimate crop TR and soil Ev and thus soil water content (SWC). From those data, soil water stress and plant water needs can be calculated. Adding this module requires two additional meteorological inputs, rain and ET 0 .
Since all equations involved in the SAFYE-CO 2 model have already been described and detailed in other studies (SAFY [16]; SAFY-WB [17]; SAFY-CO 2 [18]), we will summarize here the main methods and the interactions between the water module and the SAFY-CO 2 model. All related parameters discussed in this section are summarized in Table 5.
In the SAFYE-CO 2 model, GPP is first estimated as a function of the R G , a climatic efficiency (ε C ), the absorbed photosynthetically active radiation (APAR) calculated from the GAI, and the light-interception coefficient (K EXT ), a function of the light-use efficiency taking into account the fraction of the diffuse radiation (ELUE A and ELUE B ; see [18]), and a function accounting for the decrease in canopy photosynthesis capacity during senescence (sR 10 ). The daily estimates of GPP are modulated by (i) a thermal stress function defined by three cardinal temperatures limiting crop growth (T MIN , T MAX , T OPT ), and (ii) a water stress function (WtS V , see below). Net primary productivity (NPP) is estimated as the difference between the GPP and autotrophic respiration (R A ). R A is estimated by the methods proposed by [54], which separates R A into two components: maintenance respiration (R M ) and growth respiration (R GR ). R M is computed as a function of the NPP of the previous day, a "Q 10 -type" R M equation [55] (with two crop-specific parameters: Q 10 and R 10 ), and the sR 10 coefficient. R GR is estimated considering the daily amount of biomass produced and using a R GR coefficient parameter (Y G ) as proposed by [56] and later improved by [57]. Heterotrophic respiration (R H ) is estimated from a simple Q 10 first-order exponential equation (Q 10_H ) and a reference respiration parameter (Rh REF ). CO 2 fixed by the vegetation is converted into dry biomass via a plant C parameter (C VEG ) and partitioned between aboveground DAM and belowground parts according to the methods proposed by [58]. This process depends on three parameters (f 0 , f ∞ , and c) that were adapted for sunflower crops based on data from [59].
Like for SAFY, the daily increase in the GAI is estimated from the daily increase in DAM, the specific leaf area (SLA), and partition-to-leaf function (defined by parameters Pl A and Pl B ). The decrease in the GAI starting with senescence is estimated based on a threshold from the growing degree days (starting from crop emergence as detected by the remote sensing data), from the senescence parameter (SEN A ), and from the rate of senescence (SEN B ). The final yield is simply deduced from the maximum value of DAM multiplied by a harvest index (HI). The set of equations used to describe the previous methods are detailed in the study by [18].
In SAFY-CO 2 , the soil is a single layer whose characteristics (parameters) are used to calculate R H . In the water module of SAFYE-CO 2 , the soil is divided into 3 layers: A surface layer of constant depth (D EVAP ), subject to Ev and TR; a root layer, subject to TR only, that increases in depth during the growing season at a rate (V R ) that tracks crop growth and limited by a maximum root depth (D ROOTS ); and a constant depth (D) deep layer. The last one is used to initialize the water content of the root layer during root growth and to control for the effects of percolation and drainage. The soil module is detailed in the paper of [17]. The soil AWC is essentially updated daily, taking into account rain (inputs), Ev, and TR simulated based on the FAO-56 methods [33]. The AWC is defined by the water fluxes (input and output) and the total water capacity (TWC) (i.e., TWC = θ FC − θ WP ). TR depends on a maximal TR coefficient (Kcb MAX ), a reduction coefficient (K TRP ), a critical relative humidity parameter (Dfe), the GAI, and the maximum relative filling rate between the surface and root layer.
The water module also computes two water stress functions, one that drives Ev (WtS E , Equation (1), considering the AWC of the surface layer) and another that drives the growth of vegetation (WtS V , Equation (2), considering the surface and root layers) through GPP estimation. These functions represent the main link between the vegetation and the soil module and are defined as follows: where SM (m 3 ·m −3 ) is the SWC of the surface layer and where θ FC (m 3 ·m −3 ) and θ WP (m 3 ·m −3 ) are the water content at field capacity and at the permanent wilting point, respectively. Dfe is the critical relative filling rate for water stress, and RH evap and RH root are the relative humidity of the surface and root layers, respectively. In the present study, the calculation of Ev was modified from that of [17] according to the methods of [60] because differences were observed between simulated and observed ETR during bare soil periods and because of the dependence of Ev on ET 0 . The reason is that the FAO-56 method for estimating Ev was developed to reproduce homogeneous grass stands, whereas the soil is often completely bare during the fallow period. Additionally, [60] asserted that, in addition to being ET 0 dependent, Ev depends on the hydraulic state of the soil surface layer. When ET 0 is high, the first several centimeters of the soil rapidly become dry, slowing the capillary rise of the remaining water. A new equation for estimating Ev was thus adopted: where ET 0 (mm·d −1 ) is the reference evapotranspiration, K BS (−) is a correction coefficient for bare soil, Ev A (1/(mm·d −1 )) and Ev B (−) are the evapotranspiration coefficients, and F COVER is the vegetation cover fraction. F COVER is estimated from the maximal vegetation fraction cover (K COV ), an exponential coefficient (E COV ), and the GAI. F COVER can be derived from remote sensing and thus drives the model, but previous quantitative experiments did not show significant enhancement of the outputs; thus, we calculated it here from the GAI and two parameters (see Table 5).
The water module is impacted by the vegetation module through the calculation of F COVER (which considers the GAI), and in turn, the vegetation module is impacted by the water module through the water stress function (WtS V ).

Model Parameters
In addition to requiring meteorological data, the SAFYE-CO 2 model requires crop-specific parameters. These parameters came from several sources. Most of them could be obtained from the literature, some could be estimated via in situ measurements, and the rest were obtained from calibration (see Table 5). The model could run using literature and in situ parameters only, but when the last category of parameters is not constrained by remote sensing, the model cannot cope with changing conditions in the fields, which renders the outputs inaccurate. The calibration procedure based on the satellite-derived GAI allows optimization of the phenological parameters and the A parameter of the light-use efficiency function. Once these parameters are calibrated, the outputs are representative of the GAI extraction area (e.g., pixels, fields, farms). To calculate the plot-scale C budget, data supplied by farmers, concerning the amount of C imported into fields as organic amendments, are eventually needed.

Model Parametrization and Calibration
The calibration procedure incorporating remotely sensed GAI was detailed in the report by [18]. Six phenological parameters (Pl A , Pl B , SEN A , SEN B , SLA, D 0 ) as well as the A parameter of the light-use efficiency (see Table 5) were calibrated during this process for each field and each year independently so that the simulated GAI matched the remotely sensed GAI. This procedure allows the model to accurately reproduce crop growth. This inversion process is adapted from the Nelder-Mead simplex method [61], in which a priori values and minimum/maximum boundaries are set for each retrieved parameter. A quadratic difference cost function-root mean square error (RMSE)-is minimized using the adapted simplex between the remotely sensed GAI and the simulated GAI. For minimization, a global approach based on 30 different a priori samples is used to avoid local minima while preventing unnecessary runs. Once completed, the inversion process results in a set of parameters, allowing accurate reproduction of the remotely sensed GAI while avoiding biomass estimates as well as water and CO 2 flux estimates (see Section 3.1).

Model Validation and Sensitivity Analysis
The validation strategy adopted in this study is summarized in Figure 4. It relies on several types of datasets. First, we evaluated the ability of SAFYE-CO 2 to simulate crop growth and development (leaf area index (LAI) and DAM), YLD, and CO 2 (GPP, R ECO and NEE) and water (Ev, TR) fluxes. To accomplish this, the model was run for the two crop seasons of sunflower at the FR-Aur site (CY-07 and CY-16) using in situ (IS) meteorological data. Additionally, in situ measurements were used to estimate θ FC and θ WP and the soil depth. These results are presented in Section 4.1.
For upscaling of the model, the impact of the soil products (GSM and SG were used to estimate θ FC , θ WP , and soil depth) on the model's performance must be evaluated. For this purpose, the SAFYE-CO 2 model was run on the same two crop years at FR-Aur using the GSM or the SG products to estimate soil water retention parameters. The outputs were compared with those obtained with in situ measurements instead of soil products. Additionally, the results obtained with SAFYE-CO 2 were compared to those from the original SAFY-CO 2 version to assess whether the coupling of SAFY-CO 2 with the water module improves the model's outputs or not; Section 4.2.1 describes those results. Finally, the SAFY-CO 2 and SAFYE-CO 2 models were used to simulate DAM and YLD over a large area. In addition to the two crop years at FR-Aur, the fields from the 2013, 2014, and 2015 campaigns were simulated, and in situ measurements of DAM and YLD were used for validation. In situ meteorological variables were used to simulate the crop years at FR-Aur, while data from SAFRAN were used to simulate the other fields/crop years. The findings related to this exercise are described in Section 4.2.2. All simulations were performed from 15 November to 14 November of the following year, usually representing one cropping season in the area of study.

Local Assessment of the Model's Performances
The SAFYE-CO2 model was first tested against the in situ data of the FR-Aur site. Figure 5 presents the correlation between the simulated and observed GAI, DAM, and CO2 (GPP, RECO, and NEE) and water (ETR) fluxes for the two crop years of sunflower. Only data corresponding to the crop cycle (i.e., from emergence to harvest) were included to compute the GPP statistics, while the entire simulation period was considered to compute the model performances when estimating the other flux variables.
Since the model is calibrated by minimizing the quadratic difference between simulated and remotely sensed GAI, the estimation of GAI is used to verify that the inversion procedure is correctly performed rather than being used to validate the modeled GAI. Consequently, the statistical results related to the GAI estimation are very good (R 2 = 0.99, RMSE = 0.099 m 2 .m −2 , and rRMSE = 6.9 %), indicating that the sunflower's GAI dynamics are accurately reproduced by the model. The DAM is also robustly estimated in terms of the correlation (R 2 = 0.96) and error (RMSE = 66 g.m −2 and rRMSE = 16.2 %) data, and the yield (only 2 values at FR-Aur) is also estimated with a low error (RMSE = 53.8 g.m -2 ). Accurate estimation of biomass, and thus yield, is a prerequisite for assessing NECB with high precision. CO2 fluxes are correctly simulated, although compared with CY-07 CO2 fluxes, CY-16 CO2 fluxes are estimated with lower accuracy due to the lack of satellite images throughout the agricultural season. Indeed, for CY-16, only four images were available during sunflower development (it was the beginning of Sentinel's mission). In particular, no images were available from 30 July to 28 September (see Figure 3), leading to a senescence period that was poorly documented by the remotely sensed GAI and a model-simulated senescence stage that was too early. The poor representation of the senescence phase in 2016 led to an underestimation of GPP and RECO. This period is easily identifiable on the scatter plots ( Figure 5C-D) and corresponds to the green points underestimated by the model. Indeed, the mean bias of the simulated GPP and RECO is higher for CY-16 (respectively −2.21 and −0.69 gC.m −2 .d −1 ) than for CY-07 (respectively −0.11 and −0.16 gC.m −2 .d −1 ) as well as the slopes associated to these data (respectively 0.75 and 0.65 for CY-07 and 1.22 and 1.00 for CY-16). Nevertheless, when enough images are available to describe the crop

Local Assessment of the Model's Performances
The SAFYE-CO 2 model was first tested against the in situ data of the FR-Aur site. Figure 5 presents the correlation between the simulated and observed GAI, DAM, and CO 2 (GPP, R ECO , and NEE) and water (ETR) fluxes for the two crop years of sunflower. Only data corresponding to the crop cycle (i.e., from emergence to harvest) were included to compute the GPP statistics, while the entire simulation period was considered to compute the model performances when estimating the other flux variables.
Since the model is calibrated by minimizing the quadratic difference between simulated and remotely sensed GAI, the estimation of GAI is used to verify that the inversion procedure is correctly performed rather than being used to validate the modeled GAI. Consequently, the statistical results related to the GAI estimation are very good (R 2 = 0.99, RMSE = 0.099 m 2 ·m −2 , and rRMSE = 6.9%), indicating that the sunflower's GAI dynamics are accurately reproduced by the model. The DAM is also robustly estimated in terms of the correlation (R 2 = 0.96) and error (RMSE = 66 g·m −2 and rRMSE = 16.2%) data, and the yield (only 2 values at FR-Aur) is also estimated with a low error (RMSE = 53.8 g·m −2 ). Accurate estimation of biomass, and thus yield, is a prerequisite for assessing NECB with high precision. CO 2 fluxes are correctly simulated, although compared with CY-07 CO 2 fluxes, CY-16 CO 2 fluxes are estimated with lower accuracy due to the lack of satellite images throughout the agricultural season. Indeed, for CY-16, only four images were available during sunflower development (it was the beginning of Sentinel's mission). In particular, no images were available from 30 July to 28 September (see Figure 3), leading to a senescence period that was poorly documented by the remotely sensed GAI and a model-simulated senescence stage that was too early.
The poor representation of the senescence phase in 2016 led to an underestimation of GPP and R ECO . This period is easily identifiable on the scatter plots ( Figure 5C,D) and corresponds to the green points underestimated by the model. Indeed, the mean bias of the simulated GPP and R ECO is higher for CY-16 (respectively −2.21 and −0.69 gC·m −2 ·d −1 ) than for CY-07 (respectively −0.11 and −0.16 gC·m −2 ·d −1 ) as well as the slopes associated to these data (respectively 0.75 and 0.65 for CY-07 and 1.22 and 1.00 for CY-16). Nevertheless, when enough images are available to describe the crop dynamics (notably key dates, such as those corresponding to emergence, maximum vegetation, start of senescence, harvest), the crop production as well as the CO 2 and water fluxes are estimated with acceptable accuracy (see [18]).
Remote Sens. 2020, 12, x FOR PEER REVIEW 16 of 27 of senescence, harvest), the crop production as well as the CO2 and water fluxes are estimated with acceptable accuracy (see [18]). In the proposed approach, the errors associated with GPP and RECO compensate for each other when estimating NEE (which is calculated from the difference between the two variables). This resulted in an accurate estimation of the NEE (RMSE of 0.97 gC.m −2 .d −1 and R² of 0.83) despite the lack of images during the 2016 senescence phase.
The model is also able to reproduce the ETR (Ev+TR) dynamics of the two crop years with good accuracy (RMSE of 0.6 mm.d −1 , rRMSE = 40%, and R 2 of 0.79). For the CO2 fluxes, the ETR is underestimated by the model for CY-16 during the senescence period, which led to a decrease in TR before the end of the observed crop cycle ( Figure 5F).
In the proposed approach, RH depends only on the soil temperature, while it is known to be GPP and R ECO are estimated with relatively low errors (RMSE of 2.36 gC·m −2 ·d −1 and 0.96 gC·m −2 ·d −1 , respectively, and rRMSE of 39.5% and 40.9%, respectively) and good correlations (R 2 of 0.81 and 0.83, respectively).
In the proposed approach, the errors associated with GPP and R ECO compensate for each other when estimating NEE (which is calculated from the difference between the two variables). This resulted in an accurate estimation of the NEE (RMSE of 0.97 gC·m −2 ·d −1 and R 2 of 0.83) despite the lack of images during the 2016 senescence phase.
The model is also able to reproduce the ETR (Ev+TR) dynamics of the two crop years with good accuracy (RMSE of 0.6 mm·d −1 , rRMSE = 40%, and R 2 of 0.79). For the CO 2 fluxes, the ETR is underestimated by the model for CY-16 during the senescence period, which led to a decrease in TR before the end of the observed crop cycle ( Figure 5F).
In the proposed approach, R H depends only on the soil temperature, while it is known to be impacted by many other factors (e.g., AWC and soil organic C content). In their study, [18] discussed taking into account AWC in the estimation of R H . We thus estimated R H according to the [62] Equation III.5 that accounts for AWC. During the two simulated crop years at FR-Aur (data not shown), the temperature-dependent-only formulation performed better (RMSE = 0.5 gC·m −2 ·d −1 and R 2 = 0.36) than the combined temperature-and AWC-dependent equation (RMSE = 0.73 gC·m −2 ·d −1 and R 2 = 0.06), even when considering in situ AWC to estimate R H (RMSE = 0.61 gC·m −2 ·d −1 and R 2 = 0.15). We therefore kept the first-order exponential equation that depends only on soil temperature for the rest of the study.

Impact of Soil Data and of Coupling with the Water Module on Model Performance
The two sunflower crop years at FR-Aur were simulated with the SAFYE-CO 2 model using field capacity, the permanent wilting point, and soil depth estimated from (i) the in situ measurements, (ii) the GSM products, and (iii) the SG products. These two crop years were also simulated with (iv) the SAFY-CO 2 model to evaluate the benefits of coupling with the water module compared to the previous version. The different evaluated configurations of the model are referred to as the following: (i) SAFYE-CO 2 IS , (ii) SAFYE-CO 2 GSM , (iii) SAFYE-CO 2 SG , and (iv) SAFY-CO 2 . Figure 6 summarizes the statistics (RMSE and R 2 ) obtained when testing the different configurations of the model in simulating the target variables (GAI, DAM, GPP, R ECO , NEE, and ETR) for CY-07 and CY-16. There are no statistical data on ETR for SAFY-CO 2 , as this version does not simulate water flux. The differences observed between the different tested configurations are mainly due to water stress. In addition to simulating water fluxes (Ev, TR), the water module impacts vegetation photosynthesis and growth through a water stress function (WtS V ). Therefore, correctly estimating AWC is essential for accurate simulation of crop growth.
Regarding CY-07, the GAI and R ECO are estimated with nearly the same accuracy for each model configuration. Concerning the GAI, the good performance of each model configuration indicates that all the configurations were able to accurately fit the remotely sensed GAI regardless of the soil data used. The errors in R ECO are also similar between the different configurations for two reasons: (i) The heterotrophic component of R ECO is estimated in the same way for all versions; and (ii) the autotrophic component, depending on the GPP, differs between versions only at the end of the crop cycle when water stress occurs. However, greater differences occur for the DAM, GPP, NEE, and ETR estimates, where SAFYE-CO 2 GSM and SAFYE-CO 2 SG show low accuracy. This is due to the overestimation of the water stress at the end of the season, which in turn underestimates the water fluxes and GPP and thus the DAM. Indeed, TWC is underestimated for both SAFYE-CO 2 GSM and SAFYE-CO 2 SG during this period (see Section 2.2.4), leading to an underestimation of AWC, which induces stress at the end of the cropping season. For CY-16, the same observation can be made concerning the estimations of the GAI. Each version is able to accurately reproduce GAI dynamics. However, SAFYE-CO 2 IS largely outperformed SAFYE-CO 2 GSM and SAFYE-CO 2 SG for all the other variables. SAFYE-CO 2 SG and SAFYE-CO 2 GSM show high errors when estimating GPP and ETR because, like CY-07, those versions largely overestimated water stress, inducing early senescence and thus an underestimation of the CO 2 and water fluxes. Despite the observed water stress during both simulated years (at the end of the crop cycle), the best performances in estimating CO 2 fluxes were achieved by the SAFY-CO 2 model that did not simulate water stress, indicating that the water stress is already taken into account through the remotely sensed GAI. Additionally, the water module adds an additional level of complexity, while the performance of SAFY-CO 2 is already very high.
following: i) SAFYE-CO2 IS , ii) SAFYE-CO2 GSM , iii) SAFYE-CO2 SG , and iv) SAFY-CO2. Figure 6 summarizes the statistics (RMSE and R 2 ) obtained when testing the different configurations of the model in simulating the target variables (GAI, DAM, GPP, RECO, NEE, and ETR) for CY-07 and CY-16. There are no statistical data on ETR for SAFY-CO2, as this version does not simulate water flux. The differences observed between the different tested configurations are mainly due to water stress. In addition to simulating water fluxes (Ev, TR), the water module impacts vegetation photosynthesis and growth through a water stress function (WtSV). Therefore, correctly estimating AWC is essential for accurate simulation of crop growth.   Figure 7 shows the components of the NECB estimated with the different configurations of the model compared with observed values: The net annually cumulated CO 2 fluxes (net annual ecosystem productivity (NEP)), the C input as seeds (C INP ), and the C exported at harvest (C EXP ). Because of the poor estimates of TWC deduced from GSM and SG products, SAFYE-CO 2 GSM and SAFYE-CO 2 SG overestimate NEP and underestimate C EXP for both simulated years. Both models' configurations overestimate water stress, especially at the end of the season when AWC is low, which leads to premature senescence and thus an early harvest. In the absence of vegetation, the only CO 2 flux is R H (flux towards the atmosphere), so shortening of the crop cycle by the two model configurations leads to an overestimation of NEP. The same reason explains the underestimation of C EXP . The more stressed the vegetation is, the lower the total biomass production and the yield are. On the other end, both SAFYE-CO 2 IS and SAFY-CO 2 were able to accurately reproduce the NECB for CY-07 (−4.3 and −10.8% for SAFYE-CO 2 IS and SAFY-CO 2, respectively) and for CY-16 (+15 and −6.2% for SAFYE-CO 2 IS and SAFYE-CO 2 , respectively), as both configurations were able to accurately simulate the components of the C budget.
configurations leads to an overestimation of NEP. The same reason explains the underestimation of CEXP. The more stressed the vegetation is, the lower the total biomass production and the yield are. On the other end, both SAFYE-CO2 IS and SAFY-CO2 were able to accurately reproduce the NECB for CY-07 (−4.3 and −10.8% for SAFYE-CO2 IS and SAFY-CO2, respectively) and for CY-16 (+15 and −6.2% for SAFYE-CO2 IS and SAFYE-CO2, respectively), as both configurations were able to accurately simulate the components of the C budget.

Biomass and Yield Estimates
Apart from the two sunflower crop years available at FR-Aur, the model was evaluated for DAM and YLD over numerous fields in 2013, 2014, and 2015 (see Section 2.2.2) to assess its ability to reproduce those variables under a variety of climatic, soil, and management conditions. Since no soil in situ measurements were available at the plot scale in 2013, 2014, and 2015, the SAFYE-CO 2 IS configuration could not be tested. In the previous sections, TWC derived from GSM and SG were compared to the value derived from in situ measurements performed at FR-Aur (Table 3). The values associated with the global soil databases were lower than those observed in situ, and only a few variations were observed in terms of texture (particularly for SG, Figure 2). The TWC distributions derived from GSM and SG for all the studied fields are presented in Figure 9 and were compared to values derived from in situ measurements collected nearby (i.e., 45 fields located in the northwestern part of the study area-experiment described by [18]). The values of TWC derived from the global soil databases are low compared to those obtained from the previous studied fields, and very little spatial variation is present. The TWC derived from GSM ranges between 0.11 and 0.14 m 3 ·m −3 , while the TWC derived from SG is even less variable-between 0.12 and 0.13 m 3 ·m −3 . Considering the in situ measurements, soils in the study area present more variation, with the TWC between 0.10 and 0.16 m 3 ·m −3 . These low and quite spatially constant TWC values derived from the global soil databases undoubtedly lead to an overestimation of water stress and thus to an underestimation of the final DAM. Nevertheless, SAFY-CO 2 , which does not simulate water flux or water stress, better estimated DAM (R 2 of 0.91 and RMSE of 73 g·m −2 ) and especially better reproduced the final highest DAM values (red dots in Figure 8E), confirming that SAFYE-CO 2 GSM and SAFYE-CO 2 SG simulated too much water stress in 2013.
Regarding YLD estimates ( Figure 8B,D,F), all versions have difficulty reproducing the observed values accurately. This is possibly due to several factors. First, YLD is calculated from the maximum value of DAM using a constant HI. The HI of sunflower is known to vary with the length of the grain filling period and with temperature [63] but also with water stress [64]; moreover, there is some additional variations due to crop variety [65]. These factors are not yet accounted for in the proposed approach. Additionally, the observed yields were provided by the farmers, and may have high uncertainty. Indeed, compared to the "true" yield, those data are subject to significant grain losses at harvest.
reproduce those variables under a variety of climatic, soil, and management conditions. Since no soil in situ measurements were available at the plot scale in 2013, 2014, and 2015, the SAFYE-CO2 IS configuration could not be tested. Figure 8 presents DAM and YLD simulated by SAFYE-CO2 GSM (A,B), SAFYE-CO2 SG (C,D), and SAFY-CO2 (E,F). Although SAFYE-CO2 GSM and SAFYE-CO2 SG showed good overall performance at estimating DAM over the whole dataset (R² of 0.78 and 0.74, respectively; RMSE of 110 and 119 g.m −2, respectively), these two models' configurations had difficulty accurately estimating the final DAM for the 2013 field campaign. This could be explained by the water stress simulated at the end of the season, which slows down crop growth and thus underestimates final biomass. This inaccurate estimation of water stress was the result of poor estimation of the TWC. In the previous sections, TWC derived from GSM and SG were compared to the value derived from in situ measurements performed at FR-Aur (Table 3). The values associated with the global soil databases were lower than those observed in situ, and only a few variations were observed in terms of texture (particularly for SG, Figure 2). The TWC distributions derived from GSM and SG for all the studied fields are presented in Figure 9 and were compared to values derived from in situ measurements collected nearby (i.e., 45 fields located in the northwestern part of the study area-experiment described by [18]). The values of TWC derived from the global soil databases are low compared to those obtained from the previous studied fields, and very little spatial variation is Regarding YLD estimates ( Figure 8B,D,F), all versions have difficulty reproducing the observed values accurately. This is possibly due to several factors. First, YLD is calculated from the maximum value of DAM using a constant HI. The HI of sunflower is known to vary with the length of the grain filling period and with temperature [63] but also with water stress [64]; moreover, there is some additional variations due to crop variety [65]. These factors are not yet accounted for in the proposed approach. Additionally, the observed yields were provided by the farmers, and may have high uncertainty. Indeed, compared to the "true" yield, those data are subject to significant grain losses at harvest.

4.3.C Budget Assessment at the Landscape Scale
Since the SAFYE-CO2 model has difficulty reproducing sunflower growth if no accurate information on soil TWC is available (i.e., SAFYE-CO2 GSM and SAFYE-CO2 SG , which is the case at the landscape scale), the SAFY-CO2 model was used to simulate the components of the C budget of CY-15 at the landscape scale ( Figure 9) for more than 200 fields (black dashed square in Figure 1). Figure 10 presents the corresponding NEP (A), CEXP (B), and resulting NECB (C).
Considering all 200 fields, the simulated NEP is between −91 (sink) and 232 (source) gC.m −2 .yr −1, while the amount of C exported at harvest is between 68 and 155 gC.m −2 .yr −1 . This results in an NECB between 64 and 300 gC.m −2 .yr −1 , which means that each simulated field behaves as a C source whose flux is towards the atmosphere. The topography seems to slightly affect sunflower development. This tendency can be seen in Figure 10A; plots where the C assimilation is higher (lowest NEP) are located mainly in plains and valleys. The same observation can be made for the CEXP term, meaning that the highest yields are observed in plains or in the lowest parts of hillsides. However, high yields are also observed in hilly areas, and poor yields are also observed in plains. Concerning the NECB, Figure 9. TWC of the fields simulated in this study derived from GSM and SG and of 45 fields within the study area estimated with in situ (IS) measurements.

C Budget Assessment at the Landscape Scale
Since the SAFYE-CO 2 model has difficulty reproducing sunflower growth if no accurate information on soil TWC is available (i.e., SAFYE-CO 2 GSM and SAFYE-CO 2 SG , which is the case at the landscape scale), the SAFY-CO 2 model was used to simulate the components of the C budget of CY-15 at the landscape scale ( Figure 9) for more than 200 fields (black dashed square in Figure 1). Figure 10 presents the corresponding NEP (A), C EXP (B), and resulting NECB (C). Considering all 200 fields, the simulated NEP is between −91 (sink) and 232 (source) gC·m −2 ·yr −1, while the amount of C exported at harvest is between 68 and 155 gC·m −2 ·yr −1 . This results in an NECB between 64 and 300 gC·m −2 ·yr −1 , which means that each simulated field behaves as a C source whose flux is towards the atmosphere. The topography seems to slightly affect sunflower development. This tendency can be seen in Figure 10A; plots where the C assimilation is higher (lowest NEP) are located mainly in plains and valleys. The same observation can be made for the C EXP term, meaning that the highest yields are observed in plains or in the lowest parts of hillsides. However, high yields are also observed in hilly areas, and poor yields are also observed in plains. Concerning the NECB, the tendency is less clear as the C budget is conditioned by the NEP and the C EXP term, whose signs may be opposite, with each favorably affected by good soil conditions (not short of water). Additionally, NEP can be affected by spontaneous regrowth events or weeds, increasing the CO 2 net assimilation, albeit with no effect on yield. Differences in the availability of water resources for crop management can also explain the spatial variability in the NECB and its components. All plots simulated here are sources of C because the length of the sunflower cycle is short, which leaves a long fallow period that releases C from R H to the atmosphere. This observation is consistent with the results of a meta-analysis conducted for more than 15 flux crop sites in Europe by [66].

SAFYE-CO2 Overall Performances
The SAFYE-CO2 model for sunflower was validated in this study. Section 4.1 presents the results achieved with this approach using in situ soil measurements to calculate SWC parameters and soil depth. The performances for simulating the DAM and yield are within the range of those of other studies. As an example, [67] compared the performances of three crop models (AquaCrop, CropSyst, and WOFOST) for simulating the aboveground biomass and yield of sunflower. Those authors showed that under rainfed conditions (similar to the conditions in the present study), the accuracy of the models ranged from 63 (WOFOST) to 204 (AquaCrop) g.m −2 , while SAFYE-CO2 estimated DAM at FR-Aur with an error of 66 g.m −2 . Concerning yield, SAFYE-CO2 (using in situ measurements to estimate the TWC) provided estimates with an accuracy (error of 54 g.m −2 ) similar to that shown by [67], who reported errors ranging from 70 (AquaCrop) to 94 g.m −2 (CropSyst). However, the performances for estimating yield with SAFYE-CO2 should be taken with caution because of the small number of comparisons with field observations (only 2 corresponding to the Auradé site in 2007 and 2016), meaning that additional validation exercises are needed.
CO2 fluxes are also robustly estimated by the SAFYE-CO2 model and are within the range of other models. The authors of [68] evaluated the CERES-EGC model against CO2 fluxes over several European flux sites, including FR-Aur. The model reproduced the net CO2 fluxes over the growing period of CY-07 with an error of 30 gC.m −2 , while SAFYE-CO2 IS (using in situ measurements for soil description) estimated NEP (throughout the whole cropping year) with an error of −12.1 gC.m −2 . Additionally, in 2016, despite the lack of satellite images covering the period of senescence at FR-Aur, the model provided accurate estimates of NEE (mainly because the errors in GPP and RECO compensate for each other). This accuracy in estimating CO2 fluxes makes possible the realistic estimation of the NECB at the plot scale with the proposed approach. SAFYE-CO2 IS was able to reproduce NECB with very good accuracy (relative errors of −4.3% and +15% for CY-07 and CY-16, respectively). SAFYE-CO2 IS also performed well at estimating ETR. During the two simulated crop years at FR-Aur, the model reproduced ETR with a low error (RSME = 0.68 mm.d −1 ), meaning that our approach can also be used to calculate annual water budgets for sunflower plots.

SAFYE-CO 2 Overall Performances
The SAFYE-CO 2 model for sunflower was validated in this study. Section 4.1 presents the results achieved with this approach using in situ soil measurements to calculate SWC parameters and soil depth. The performances for simulating the DAM and yield are within the range of those of other studies. As an example, [67] compared the performances of three crop models (AquaCrop, CropSyst, and WOFOST) for simulating the aboveground biomass and yield of sunflower. Those authors showed that under rainfed conditions (similar to the conditions in the present study), the accuracy of the models ranged from 63 (WOFOST) to 204 (AquaCrop) g·m −2 , while SAFYE-CO 2 estimated DAM at FR-Aur with an error of 66 g·m −2 . Concerning yield, SAFYE-CO 2 (using in situ measurements to estimate the TWC) provided estimates with an accuracy (error of 54 g·m −2 ) similar to that shown by [67], who reported errors ranging from 70 (AquaCrop) to 94 g·m −2 (CropSyst). However, the performances for estimating yield with SAFYE-CO 2 should be taken with caution because of the small number of comparisons with field observations (only 2 corresponding to the Auradé site in 2007 and 2016), meaning that additional validation exercises are needed. CO 2 fluxes are also robustly estimated by the SAFYE-CO 2 model and are within the range of other models. The authors of [68]  also performed well at estimating ETR. During the two simulated crop years at FR-Aur, the model reproduced ETR with a low error (RSME = 0.68 mm·d −1 ), meaning that our approach can also be used to calculate annual water budgets for sunflower plots. At the landscape scale, coupling with a soil water module tended to lower biomass and yield estimates. The poor estimates of TWC derived from soil products (GSM and SG) led to an underestimation of the final biomass and, thus, yield. In contrast, the biomass was accurately estimated by the SAFY-CO 2 model (R 2 = 0.91 and RMSE = 73 g·m −2 ), and the estimations were within the range of those of models specific to sunflower: Using the OILCROP-SUN model, [21] estimated DAM with an error of 170 g·m −2 , while using the SUNFLO model, [22] simulated DAM with an error of 114 g·m −2 . In contrast, the yield was poorly estimated by SAFY-CO 2 in terms of its correlation (R 2 = 0.31), but the error was still within the range or better than those other studies. The error achieved by SAFY-CO 2 in estimating YLD was 51 g·m −2 , while errors with OILCROP-SUN and SUNFLO were 80 and 50 g·m −2 , respectively (in [21,22]). The poor results achieved for yield with SAFY-CO 2 can be explained in part by differences in the HI between sunflower plots. Indeed, sunflower HI is known to change with genotype, management, and environment [63,64] and depends on many factors that are not yet accounted for in the proposed approach. Accounting for varietal potentials [65] and for abiotic stress after flowering as in CropSyst [19] is an avenue for improvement with our approach.

Prerequisite to Compute Water Fluxes Over a Wide Area
Estimating water fluxes and thus water stress over croplands requires an accurate description of soil properties (θ FC , θ WP , and soil depth). This need has been highlighted in the present study. Soil parameters were derived from in situ measurements when available or from soil texture products for upscaling. The two existing soil products tested in this study (GSM and SG) did not meet the needs for the expected accuracy. Indeed, we demonstrated (see Section 4.2) that the use of soil products led to an underestimation of the soil TWC, which in turn led to an overestimation of the water stress and an underestimation of the biomass and yield compared to the in situ data. However, this conclusion concerns only our study area, which is not representative of the spatial scale of GSM and SG products. Indeed, those products provide, among other things, soil texture and depth at a fine resolution (SG: 250 m; GSM: 90 m) all across the globe. As such, it would be difficult to expect those products to accurately describe differences in soil properties between plots over a small area of study. Additionally, taking water fluxes into account in our approach does not improve the estimation of biomass or CO 2 fluxes, even under water-stressed conditions (see Section 4.2.2). Both sunflower crop years at FR-Aur experienced water stress (especially at the end of the season), and SAFY-CO 2 was still able to accurately reproduce the crop dynamics and production as well as the CO 2 flux, indicating that the plant stresses (i.e., water and nitrogen) were already correctly taken into account through the remotely sensed GAI. Vegetation stress slows down (or even stops) growth, which is observable through GAI. Therefore, we believe that coupling SAFY-CO 2 with a soil water module is useful for estimating ETR and water budgets or for assessing crop water-use efficiency [1] when safeguarding soil and water resources, but this coupling does not seem to improve the simulations of the annual C budget components. During the two sunflower crop years at FR-Aur, simulations were performed using SAFYE-CO 2 , with no feedback of the SWC on GPP computation (WtS V = 1, Equation (2)). The accuracy in estimating biomass and CO 2 flux was similar to that of SAFY-CO 2 ; thus, the accuracy was slightly better than that of simulations considering the water stress function. Concerning the estimation of water fluxes, the same accuracy was obtained with or without the consideration of effects of water stress on vegetation development. We thus argue that SAFYE-CO 2 should be used in a decoupled way between vegetation growth and SWC since information about water stress is already provided by the remotely sensed GAI. However, further analysis that includes additional water flux data and the simulation of other crop species should reveal whether the water module should be used systematically in a decoupled way, allowing us to estimate water fluxes in the absence of vegetation stress.

Limitations and Potential Improvements
In addition to simulating crop dynamics and production, our approach fills the gap in assessing the C and water budget simultaneously at the plot scale over large areas. Notably, to achieve this, information on both organic fertilization and eventually irrigation at the plot scale are needed. This information is not yet available at a landscape spatial scale, which can lead, for example, to great uncertainties in the C budget calculations in areas with livestock farming. However, this drawback could be compensated by the combination of our approach with the use of data in farm management information systems (FMISs). Additionally, the representation of R H is quite simplistic in our approach and would require improvement. As shown in the present study and as discussed by [18], this simple method allows for the accurate assessment of the annual C budget, but when considering relatively long time scales (e.g., crop rotation), the effects of other variables, such as soil organic matter content, or processes, such as priming effects, will have to be taken into account. For those reasons, the SAFYE-CO 2 model should be coupled with a soil organic matter model (e.g., AMG [66][67][68][69]; RothC [70]) in the future.
Another limitation of this approach concerns the availability of remote sensing images. To date, we used only optical data in this approach, which is thus limited in the case of cloudy sky conditions. Our method could even be inoperable if long periods of cloudy sky conditions occur during crop development. This constraint is partially mitigated because of the ongoing Sentinel missions that will provide high spatial-(10 m) and temporal-resolution (5 days) images all across the globe. Additionally, the assimilation of the C-band synthetic-aperture radar (SAR) from Sentinel-1 should be considered to lift this constraint, as shown in many studies [71][72][73][74]. Finally, thermal-infrared remote sensing, which allows the estimation of vegetation water stress [75], could be considered to better constrain the model.

Conclusions
The newly established SAFYE-CO 2 model was validated at local and landscape scales for sunflower crops against a wide range of in situ measurements. The model was able to accurately reproduce biomass, yield, CO 2 flux, and ETR at the FR-Aur flux site when using in situ measurements of texture to deduce soil parameters (field capacity and permanent wilting point) and soil depth. For upscaling the model to the landscape scale, the sensitivity of the model to those soil parameters estimated from existing state-of-the-art soil databases (GlobalSoilMap and SoilGrids) was tested. Biomass, over a population of fields, was correctly estimated by SAFY-CO 2 but underestimated by SAFYE-CO 2 . This is possibly due to the underestimation of the total soil water capacity derived from the soil products, which leads to overestimations of water stress. Moreover, the low heterogeneity and spatial variability of the derived soil parameters over the study area, compared to observed values, indicates that the current accuracy of soil products (GSM and SG) is not sufficient to provide an accurate description of TWC, especially for approaches developed at the field scale. Since SAFYE-CO 2 requires information about soil water capacity that is not yet available at the plot scale over wide areas with sufficient accuracy, in the case where only a C budget is needed, we suggest using SAFY-CO 2 for large-scale simulation, as the assimilation of frequent HSTR GAI data enables the indirect incorporation of the integrated impact of water stress. For the same reason, we argue that crop growth in SAFYE-CO 2 can be decoupled from soil water dynamics by omitting the water stress function to obtain water flux estimates as well as accurate CO 2 flux estimates. However, this approach needs further validation using other flux sites to ensure the robustness of the model for simulating various soil and climatic conditions and management practices.