Assessing and Modeling Ecosystem Carbon Exchange and Water Vapor Flux of a Pasture Ecosystem in the Temperate Climate-Transition Zone

The rising frequency of extreme weather events and global warming are greatly challenging pastoral ecosystem productivity, particularly in the temperate climate-transition regions. While this could cause greater gross primary production (GPP) mainly contributed by the warm-season vegetation, the consequences for the dynamics of net ecosystem exchange (NEE) and hydrological responses (e.g., evapotranspiration, ET) on an ecosystem level are poorly known. Here, we investigated the evolution of plant phenology, nutritive value, energy balance, and carbon/water budgets of a cool-season dominated pastoral ecosystem in the temperate zone; integrating both eddy covariance (EC) flux measurement and simulation modeling-based uncertainty analysis. Throughout the two-year duration (2017–2018) of this study, the entire pasture ecosystem remained a strong carbon sink (NEE = −1.23 and −1.95 kg C m−2, respectively) with 74% and 62% of available energy loss explained by EC fluxes, respectively. The cumulative ET was 735.8 and 796.8 mm, respectively; and the overall ecosystem water use efficiency (EWUE) were calculated as 6.5 g C kg−1 water across both growing seasons. The above-ground biomass yield agreed with the cumulative GPP and was inversely correlated with grass nutritive value. The uncertainty analysis indicated that accurate EC flux gap-filling models could be constructed using support vector machine trained time-series models (NEE, R2 = 0.77, RMSE = 11.8; ET, R2 = 0.90, RMSE = 73.8). The performance benchmarking tests indicated that REddyProc-based gap-filling performance was very limiting and highly variable (NEE, R2 = 0.21–0.64; ET, R2 = 0.79–0.87), particularly for estimating NEE. Overall, the warm-season vegetation encroachment greatly filled the production gap of cool-season grasses, leading to greater cumulative NEE and EWUE on a system level, compared with those from many other reported field-crop or grassland studies using EC approaches. The complex and dynamic nature of grassland ecosystems greatly challenged the conventional REddyProc-based EC flux gap-filling performance. However, accurate machine learning models could be constructed for error/uncertainty control purposes and, thus, should be encouraged in future studies.


Introduction
Pasture ecosystems serve important functions in providing high-quality feed to herbivores, producing biofuel feedstocks, and offering great ecological services such as carbon sequestration and biodiversity. However, the resilience and sustainability of these ecosystems have been challenged by climate change, weed encroachment, and anthropogenic intervention; leading to great uncertainties over the carbon/water budget and energy balance dynamics on a system level [1][2][3]. Thus, capturing and modeling the carbon/water vapor footprints of these pasture ecosystems using advanced instrumentation methods

Study Site Description
The study was conducted at Lascassas, TN, USA. The primary soil type was Hillwood gravelly silt loam (12% to 20% slopes, Alfisol, thermic Oxyaquic Fraglossudalf). The long-term (>30 years) average annual precipitation and mean annual temperature were 1397 mm and 15 • C, respectively. The entire pasture was measured as 12.8 ha with the geometric center located at the latitude of 35 • 53 19.66 N and the longitude of 86 • 16 23.77 W. The grassland pasture was originally established in 2008 and was dedicated for hay production or periodical cattle grazing. It was primarily managed as a cool-season grass dominated natural pastoral ecosystem without application of chemical fertilizers, pesti-Agronomy 2021, 11, 2071 3 of 18 cides, or herbicides. No grazing activities were allowed since the establishment of the EC towers in late 2016. Biomass harvest was performed when the majority of the cool-season grasses reach boot stage. In particular, hay was harvested, cured, and baled from the entire pasture twice in (17 July and 6 September) 2017 and (15 July and 17 September) 2018. Botanical composition was evaluated before EC data collection in late 2016, indicating 40% of tall fescue,~10% of kentucky bluegrass,~20% of orchardgrass, and~30% of other forbs and legumes. Warm-season grass encroachment (35-40% of botanical composition) was typically observed during the hot summer months to early fall (July to August) mainly consisting of purpletop (Tridens flavus (L.) Hitchc.), johnsongrass (Sorghum halepense (L.) Pers.), and dallisgrass (Paspalum dilatatum Poir.); and the warm-season grass population dwindled rapidly in late fall (September to October).
The entire EC-tower system was established in late August 2016 and was continuously recording data until late August of 2018. This time window included two complete growing cycles of typical cool-season grasses which starts from the beginning of the stockpiling season (or fall-planting season) in the early autumn to the end of summer-dormancy season of the following year. In Middle Tennessee, fall-planting (from late August to Mid-September) of cool-season grasses is generally preferred for controlling weed outbreak and Sclerotinia homoeocarpa F.T. Bennett infection [12] compared to spring planting. The detailed patterns of daily average ambient air temperature (Tair), soil volumetric water content (SVWC), and precipitation during each growing season are shown in Figure 1. 86°16′23.77″ W. The grassland pasture was originally established in 2008 and was dedicated for hay production or periodical cattle grazing. It was primarily managed as a cool-season grass dominated natural pastoral ecosystem without application of chemical fertilizers, pesticides, or herbicides. No grazing activities were allowed since the establishment of the EC towers in late 2016. Biomass harvest was performed when the majority of the cool-season grasses reach boot stage. In particular, hay was harvested, cured, and baled from the entire pasture twice in (17 July and 6 September) 2017 and (15 July and 17 September) 2018. Botanical composition was evaluated before EC data collection in late 2016, indicating ~40% of tall fescue, ~10% of kentucky bluegrass, ~20% of orchardgrass, and ~30% of other forbs and legumes. Warm-season grass encroachment (35-40% of botanical composition) was typically observed during the hot summer months to early fall (July to August) mainly consisting of purpletop (Tridens flavus (L.) Hitchc.), johnsongrass (Sorghum halepense (L.) Pers.), and dallisgrass (Paspalum dilatatum Poir.); and the warm-season grass population dwindled rapidly in late fall (September to October). The entire EC-tower system was established in late August 2016 and was continuously recording data until late August of 2018. This time window included two complete growing cycles of typical cool-season grasses which starts from the beginning of the stockpiling season (or fall-planting season) in the early autumn to the end of summer-dormancy season of the following year. In Middle Tennessee, fall-planting (from late August to Mid-September) of cool-season grasses is generally preferred for controlling weed outbreak and Sclerotinia homoeocarpa F.T. Bennett infection [12] compared to spring planting. The detailed patterns of daily average ambient air temperature (Tair), soil volumetric water content (SVWC), and precipitation during each growing season are shown in Figure 1.  In particular, the cumulative growing-season precipitation was 1212 and 1353 mm; and the average Tair was 15.8 and 14.3 • C for the first and the second growing seasons, respectively. The average SVWC was 26.6 and 31.4% with the values mostly between 8.6 to 36.2% and 18.4 to 39.7% during the two consecutive growing seasons, respectively.

Eddy Covariance Tower Establishment and Other Accessory Instruments
The instrument tower consisted of a complete EC system and other accessory meteorological sensors, including a Gill-WM WindMaster 3-D Sonic Anemometer (Gill Instruments Ltd., Lymington, Hampshire, UK, at 3 m height) and an LI-7500RS open-path infrared gas analyzer (IRGA, Li-Cor Inc., Lincoln, NE, USA at a 3 m height) integrated with the SMARTFlux System. The software estimates quality of each flux data point using a flux quality flag obtained from the integral results of the steady-state and the development turbulence tests. The flagging policy is in compliance with the documented standards specified by the ICOS (Integrated Carbon Observation System), AmeriFlux, and FLUXNET; which uses numerical values '0', '1', and '2' to indicate high-quality, suitable for budget analysis, and poor-quality data; respectively.
Supplementary instruments were included in the LI-7900-102 Biomet System Package (Li-Cor Inc., Lincoln, NE, USA

Flux Data Processing and Gap Filling
The daily summary logged files containing 30 min flux values were collected and stored as tab-delimited text files. The REddyProc Software Package from the Max Planck Institute for Biogeochemistry was used for flux data gap-filling and partitioning of net ecosystem exchange (NEE) into ecosystem respiration (ER) and gross primary production (GPP). In particular, the night-time based algorithms [13] were used for NEE partitioning; and the Ustar-based methods in conjunction with the Moving Point Test threshold estimation algorithms [14] were used for flux data filtering. The detailed information regarding the data filtering, gap-filling, and partitioning [13,15] could be found at the REddyProc website: https://www.bgc-jena.mpg.de/bgi/index.php/Services/REddyProcWebRPackage (accessed date: 24 January 2021). According to the sign convention for presenting NEE data, negative flux values indicate overall CO 2 sequestration (fluxes from the atmosphere to the ecosystem) and positive values indicate overall CO 2 release (fluxes from the ecosystem to the atmosphere). Throughout the entire duration of this study, 16.5% of NEE data and 15.9% of latent heat flux (LE) data were classified as "gap" either caused by missing values or identified by the Ustar-filtering algorithms. Evapotranspiration (ET) was calculated using LE based on equations described by Rajan et al. [16]. Obvious outliers (NEE beyond ±50 µmol m −2 s −1 ) [17,18] were removed manually from the dataset. All daily cumulative values, including NEE, ER, GPP, and ET; were calculated by integrating the value-by-seconds plotting using the "trapz" function (trapezoidal numerical integration) in MATLAB.
The energy balance and closure of the entire ecosystem were evaluated by regressing the 30 min values of total turbulent fluxes (H + LE) with the available energy accumulation [net radiation (Rn)-soil heat flux (G)] within each growing season. Specifically, non-gapfilled high-quality data, indicated by '0 quality flag values for corresponding LE and H, and those ranging from −200 to 500 and from −200 to 800 W m −2 of H and LE, respectively; were retained for this task [19,20]. After applying all filtering criteria mentioned above, 28% of data were removed from the energy balance/closure analysis. Finally, both coefficient of determination (R 2 ) and root mean square error (RMSE) were used for model fitting assessment and the results were presented by growing season.

Plant Biomass Sampling and Nutritive Value Estimation
Plant biomass samples were taken from three 0.3 m 2 quadrats at a 5 cm cutting height from randomly selected locations from April to August within each growing season. Sampling dates were 13 May, 26 June, 20 July, 27 August, for the first growing season; and 12 May, 19 June, 17 July, 25 August, for the second growing season. This sampling schedule was designed to avoid winter dormancy seasons and to effectively capture the end-of-spring growth of cool-season grasses, summer growth of the majority of warmseason grasses, and the biomass production of cool-season grasses during stockpiling seasons within each year. All biomass samples were immediately stored in paper bags and then shipped back to the laboratory and dried in a commercial oven at 60 • C to a constant weight. Cumulative biomass production was then correlated with GPP using simple linear regression. Following dry matter measurement, all samples were ground to fine powders that could pass 1 mm screen using a Wiley mill (Comeau Technique Ltd., Vandreuil-Dorion, Quebec, Canada). All powder samples were then scanned on a

Modeling-Based Uncertainty Analysis and Benchmarking
To assess the efficacy and accuracy of the REddyProc-based gap filling algorithms on NEE and LE, we implemented a series of statistical/machine learning algorithms and benchmarked our results against REddyProc-produced results. In particular, NEE and LE were of primary interests, because both variables could be directly measured by EC systems and later used for GPP/ER partitioning and ET calculation, respectively. To complete this task, we first extracted the high-quality flux data according to the u-star filtering algorithm specified by REddyProc [22] and signal intensity associated with each flux data point, which retained the 30-min flux data that have a signal intensity of at least 60% and a u-star quality flag value of "zero," indicating high-quality data point. Following that, the entire dataset was randomly and evenly partitioned into 10 subsets of similar size (10-fold cross validation, 10-fold CV). Each time, one subset of data were removed from the whole dataset to create artificial gaps and then gap-filled by REddyProc. This process was repeated 10 times until all ReddyProc predicted data were obtained and recombined to form a complete predicted dataset, which was then regressed with the original high-quality dataset for performance evaluation. In particular, we tested three different gap-filling conditions used by REddyProc [22]; including: only the target variables (NEE or LE) are missing but Rg, Tair, and VPD are available (Rg + Tair + VPD); both Tair and VPD are missing but Rg is available (Rg); and all three variables are missing (None) by artificially removing corresponding variable from the input dataset. Meanwhile, we also implemented our own least-square-based linear models (LSLM), artificial neural network models (ANN), and SVM models for gap-filling based on the same 10-fold CV regime. Additionally, 12 easy-to-measure EC-independent (without requiring IRGA or Sonic Anemometer) driver variables (features) were selected for this task, including hour readings, Rg, Tair, Tsoil, RH, VPD, Photosynthetic Photon Flux Density, Soil Heat Flux, Rn, wind speed, precipitation, and SVWC. The rationale for this was that we wanted to evaluate the possibility for accurately predicting high-resolution (30 min) flux data just based on easy-to-measure environmental variables without relying on high-cost IRGA or sonic anemometer. For the linear model, we used the least-square leaner, sparse-reconstruction-by-separable-approximation optimizer, and lasso-based regularization. For ANN, we constructed feedforward neural net with three hidden layers with ten, eight, and five neurons within each layer, respectively. For SVM, we used Gaussian kernel with grid-search optimizer identified hyper-parameters on standardized data. Additionally, instead of using the entire nine-subset training data, we also tested the performance of using just the 7-or 14-day flanking window (FW) around the artificial gap for model training processes, which closely mirror the gap-filling algorithms (dt ≤ 7 or 14 days) used by REddyProc [22]. Modeling performances were evaluated based on simple linear regression analysis between the predicted data and original data, and the final R 2 and RMSE were presented. The overall data analysis scheme is summarized in Figure 2.
flux data just based on easy-to-measure environmental variables without relying high-cost IRGA or sonic anemometer. For the linear model, we used the least-squ leaner, sparse-reconstruction-by-separable-approximation optimizer, and lasso-bas regularization. For ANN, we constructed feedforward neural net with three hidden l ers with ten, eight, and five neurons within each layer, respectively. For SVM, we us Gaussian kernel with grid-search optimizer identified hyper-parameters on standardiz data. Additionally, instead of using the entire nine-subset training data, we also test the performance of using just the 7-or 14-day flanking window (FW) around the artific gap for model training processes, which closely mirror the gap-filling algorithms (dt or 14 days) used by REddyProc [22]. Modeling performances were evaluated based simple linear regression analysis between the predicted data and original data, and final R 2 and RMSE were presented. The overall data analysis scheme is summarized Figure 2.

Energy Balance Analysis and Daily Cumulative Carbon Fluxes
The simple linear regression models used in the energy balance/closure analysis indicated great correlation between the available energy (Rn-G) and the sum of the turbulent fluxes (H + LE; Figure 3).
In particular, the first growing season data yielded an R 2 value of 0.91 and an RMSE value of 57 W m −2 , and the second growing season yielded an R 2 value of 0.83 with an RMSE value of 79 W m −2 . The slope results from both seasons indicated partial energy balance closure (<1), indicating that measured turbulence fluxes accounted for 74% and 62% of available energy loss during each growing season, respectively. The rest of the available energy was primarily transferred to the above-heat-flux-sensor soil layer, vegetation canopy (mainly used for photosynthesis), and the atmospheric air layer between the soil horizon and the IRGA. Foot-print analysis [23] indicated that at least 70-80% of flux data was from within the research field throughout the entire growing season. The overall patterns of daily cumulative NEE, GPP, and ER for the entire pasture ecosystem are depicted in Figure 4.

Energy Balance Analysis and Daily Cumulative Carbon Fluxes
The simple linear regression models used in the energy balance/closure analysis indicated great correlation between the available energy (Rn-G) and the sum of the turbulent fluxes (H + LE; Figure 3). In particular, the first growing season data yielded an R 2 value of 0.91 and an RMSE value of 57 W m −2 , and the second growing season yielded an R 2 value of 0.83 with an RMSE value of 79 W m −2 . The slope results from both seasons indicated partial energy balance closure (<1), indicating that measured turbulence fluxes accounted for 74% and 62% of available energy loss during each growing season, respectively. The rest of the available energy was primarily transferred to the above-heat-flux-sensor soil layer, vegetation canopy (mainly used for photosynthesis), and the atmospheric air layer between the soil horizon and the IRGA.
Foot-print analysis [23] indicated that at least 70-80% of flux data was from within the research field throughout the entire growing season. The overall patterns of daily cumulative NEE, GPP, and ER for the entire pasture ecosystem are depicted in Figure 4.   The daily maximum NEE reached up to 21.1 and 20.7 g C m −2 for the first (early fall 2016 to late summer 2017) and second growing seasons (early fall 2017 to late summer 2018), respectively; and the minimum NEE reached up to −33.5 and -43.6, g C m −2 , respectively. The seasonal average daily NEE was −5.2 and −7.8 g C m −2 . Large fluctuation was observed in daily cumulative GPP, with the average values of −21.7 and −23.9, maximum values of −0.02 and −0.37, and minimum values of −49.3 and −49.8 g C m −2 , for the first and the second growing seasons, respectively. Following similar trends, the average, maximum, and minimum ER values were 16.5, 41.3, and 0.07 g C m −2 , respectively, for the first growing season; and 11.8, 36.9, and 0.61 g C m −2 , respectively, for the second growing season. The seasonal cumulative NEE, GPP, and ER were −1.23, −3.46, and 2.23 kg C m −2 , respectively, for the first growing season; and −1.95, −2.98, and 1.03 kg C m −2 , respectively, for the second growing season. Again, the negative signs indicate carbon fluxes from the atmosphere into the ecosystem and vice versa. . The cumulative seasonal ET was 735.8 and 796.8 mm for the first and the second growing seasons, respectively. We regressed the monthly cumulative GPP with ET based on a simple linear regression, yielding a linear model (R 2 = 0.57, p-value < 0.01) with a slope of 6.48 (average EWUE = 6.48 g C kg −1 ) and an intercept of 139.8 g C m −2 .

Plant Biomass and Nutritive Value
The end-of-spring season biomass production were 5.7 and 6.4 Mg ha −1 in April and peaked at 8.2 and 9.8 Mg ha −1 in May in the first and second growing seasons; respectively ( Figure 6).
July sampling was conducted after hay cutting and, thus, resulted in very low biomass production (average: 2.2 Mg ha −1 ). Fall-season stockpile growth increased the biomass back to 7.8 and 5.6 Mg ha −1 on average for the first and second growing seasons, respectively. Grass nutritive value, indicated by RFV, was the greatest in April (average: 95.5), and started declining in the summer. The lowest values reached 66 in the first growing season. A dramatic RFV increase was detected in July immediately after each hay cutting and started declining again as biomass increased in the stockpiling season. The dynamics of grass CP level closely mirrored those of RFV. Particularly, the average late-spring and after-cutting CP level was around 15% in April and 16% in July. The low CP levels were found in June and August, averaging at 7.6 and 9.4%, respectively. Lignin concentration started low in the beginning of the growing season at 6.4% and peaked at 8.7% in June. Following hay cutting, lignin concentration dropped to 5.9 and 3.9%, in the first and second growing seasons, respectively. Overall, the cumulative above-ground biomass agreed with the cumulative GPP, yielding an R 2 value of 0.66 across both growing seasons (Figure 7).

Flux Data Modeling, Uncertainty Analysis, and Benchmarking
For NEE gap-filling, the overall performance of SVM-based models outcompeted all others ( Table 1).
The R 2 values of the regression between ANN-predicted NEE and the original NEE were satisfactory based on a 10-fold CV and a 7-day FW settings. However, a big reduction in performance was found when 14-day FW setting was used (R 2 = 0.02), which was unexpected. Meanwhile, the 7-day FW setting appeared to be a better setting compared with others. Likewise, SVM-based models yielded the optimal performance across all partition settings with an average R 2 value of 0.89 and an average RMSE value of 78.2. Both ANN-based models and the REddyProc software produced very similar results. The LSLM-based models generally provided the worst performance indices across all partitioning settings, and 10-fold CV generally provided the best performance for linear models. Interestingly, no performance plunge was detected under ANN-based LE models. Obviously, REddyProc gap-filling algorithms could provide reliable predictions for LE, but its performance on NEE prediction appeared to be limited. In general, the regressions between REddyProc-based predictions and original values explained more variation as more features were included in the models (from none to all three variables). Interestingly, Rg-only models appeared to be the best model (R 2 = 0.64) for NEE gap-filling. Finally, Figures 8 and 9 depict the performance of the best gap-filling machine learning models for NEE and LE, respectively. The best results generated by REddyProc-based models were also included in both figures for comparison purposes. The optimal results of NEE and LE gap-filling were both provided by SVM-based models using 7-day FW setting.

Plant Biomass and Nutritive Value
The end-of-spring season biomass production were 5.7 and 6.4 Mg ha −1 in April and peaked at 8.

Plant Biomass and Nutritive Value
The end-of-spring season biomass production were 5.7 and 6.4 Mg ha −1 in April and peaked at 8.2 and 9.8 Mg ha −1 in May in the first and second growing seasons; respectively ( Figure 6).
. Figure 6. Evolution of biomass, relative feed value (RFV), crude protein (CP) concentration, and lignin concentration from April to August within the 2017 and 2018 growing seasons of a grass-dominated grassland ecosystem located in Middle Tennessee. A one-time hay harvest was conducted around each mid-July. 95.5), and started declining in the summer. The lowest values reached 66 in th growing season. A dramatic RFV increase was detected in July immediately after hay cutting and started declining again as biomass increased in the stockpiling se The dynamics of grass CP level closely mirrored those of RFV. Particularly, the av late-spring and after-cutting CP level was around 15% in April and 16% in July. Th CP levels were found in June and August, averaging at 7.6 and 9.4%, respectively. L concentration started low in the beginning of the growing season at 6.4% and peak 8.7% in June. Following hay cutting, lignin concentration dropped to 5.9 and 3.9%, first and second growing seasons, respectively. Overall, the cumulative above-gr biomass agreed with the cumulative GPP, yielding an R 2 value of 0.66 across growing seasons (Figure 7).

Flux Data Modeling, Uncertainty Analysis, and Benchmarking
For NEE gap-filling, the overall performance of SVM-based models outcompet others (Table 1).  Table 1. Benchmarking performance of gap-filling artificially created missing net ecosystem exchange (NEE) or latent heat flux (LE) data points. Results were evaluated as R 2 and root mean square error (RMSE) values (in parenthesis) based on a simple linear regression between predicted and eddy covariance-measured values. Particularly, last-square linear models (LSLM), artificial neural network (ANN), and support vector machine (SVM) algorithms were used and evaluated based on different cross validation settings (10-fold CV, 7/14 day flaking windows). Three different variable settings used by REddyProc were compared, including: global radiation, ambient air temperature, and vapor pressure deficit are all available (Rg + Tair + VPD); both Tair and VPD are missing but Rg is available (Rg); and all three variables are missing (None). optimal results of NEE and LE gap-filling were both provided by SVM-based mod using 7-day FW setting.

Energy Dynamics and Carbon Flux
The energy closure values obtained were generally within the range (0.6-0.9 natural grassland ecosystems reported by other studies [24,25], which is typically lo in natural complex ecosystems such as pastures compared with row-cropping syste [26]. The remaining energy fraction can be retrieved from energy storage in the top layer above the heat flux plates, and from the vegetation canopy between the soil surf and the IRGA sensor head; which primarily consists converted energy by photosynthe The values found in this study (0.74 and 0.62 for the first and the second growing sons, respectively) were close to the lower bound within the normal range for C3 pla [27], indicating that the EC systems properly captured and recorded energy fluxes on ecosystem level. The slightly lower energy closure values observed in this study could attributed to the fact that global warming gradually pushes the climate transition zo towards the north, causing significant encroachment of summer annual/perennial spe (e.g., large crabgrass (Digitaria sanguinalis (L.) Scop.) purpletop, johnsongrass, dallisgr

Energy Dynamics and Carbon Flux
The energy closure values obtained were generally within the range (0.6-0.9) of natural grassland ecosystems reported by other studies [24,25], which is typically lower in natural complex ecosystems such as pastures compared with row-cropping systems [26]. The remaining energy fraction can be retrieved from energy storage in the top soil layer above the heat flux plates, and from the vegetation canopy between the soil surface and the IRGA sensor head; which primarily consists converted energy by photosynthesis. The values found in this study (0.74 and 0.62 for the first and the second growing seasons, respectively) were close to the lower bound within the normal range for C3 plants [27], indicating that the EC systems properly captured and recorded energy fluxes on an ecosystem level. The slightly lower energy closure values observed in this study could be attributed to the fact that global warming gradually pushes the climate transition zones towards the north, causing significant encroachment of summer annual/perennial species (e.g., large crabgrass (Digitaria sanguinalis (L.) Scop.) purpletop, johnsongrass, dallisgrass, etc.) during late summer/early fall growing seasons. This warm-season encroachment scenario complemented the depleted production of cool-season species during hot summer months, therefore, significantly increased overall photosynthetic activity and consequentially increased biomass production and eco-efficiency. Additionally, this could also be caused by the increased heat storage portion in open canopy dominated grassland sites as reported by Wilson et al. [28].
Unlike warm-season annual/perennial grasses, which produce the majority of biomass during the late summer-early fall of each year; the growth of cool-season grasses typically peaks at the spring and fall seasons of each year with summer (summer slump) and winter dormancy periods spliced in between [29]. This bimodal growth distributions of cool-season grasses were also observed in this study (indicated by the GPP pattern in Figure 4) and are generally well-known in grassland ecosystem community. However, what is less known is the large day-by-day variation of GPP and NEE caused by different environmental and managerial factors. Particularly, hay bailing events on 17 July and 6 September 2017, as well as on 15 July 2018 rapidly transitioned the ecosystem from a net carbon sink to carbon source for a short period of time as indicated by the shifting of NEE from negative to positive numbers. Similar findings were also reported by Zhou et al. [10] from a warm-season, old world bluestem dominated grassland in Oklahoma. It was also noteworthy that GPP responded more rapidly than NEE after bailing, turning the entire ecosystem into a carbon emission source for approximately 12 days after the September cutting in 2017 and about 5 days after the July cutting in both 2017 and 2018. Additionally, the July cutting removed a small portion of senesced warm-season grasses and very little cool-season grasses that were still in summer dormancy, thus, resulting in shorter duration of responses. Following a short duration of carbon emission period, both GPP and NEE increased rapidly triggered by vigorous regrowth of young plant tissues.
The large negative daily GPP values (strong carbon sink) observed in this study were primarily caused by the warm-season grass production in summer. Additionally, greater abundance of summer-season precipitation and SVWC in the first growing season had also resulted in more negative GPP in the first growing season (Figures 1 and 4). However, the first season's NEE indicated a less carbon sink than the second season, mainly because of the increased summer respiration intensity, caused by the higher night-time temperatures associated with more precipitation (Figures 1 and 4). Therefore, unlike the semiarid environment, precipitation might actually reduce carbon sequestration potential at an ecosystem level in the humid temperate regions. The seasonal cumulative GPP/NEE values obtained in this study were more negative, indicating strong biomass productivity and carbon sequestration capacity. Zhou et al. [10] also reported that GPP for a warmseason old world bluestems-based pasture only reached up to 21 g C m −2 day −1 , which was very similar to those values reported by Rajan et al.) [16] and Wagle et al. [20], but only about half of what we found in our grassland ecosystem. This magnitude differences were likely caused by the fact that the majority of the warm-season grass species found in the Southeastern U.S. feature great biomass production capacity (e.g., purpletop, johnsongrass, and dallisgrass), which are well-adapted to the humid southeastern weather and typically have much larger canopy architecture than those warm-season short-bunchy grasses found in the semiarid western U.S. regions. Additionally, in an ecosystem carbon exchange comparison study, Zeri et al. [30] pointed out that the extended growing seasons of perennial warm-season grasses offer excellent carbon sequestration capacity compared to many annual species. As expected, the seasonal cumulative GPP/NEE values were substantially greater than those reported values from other managed cool-season monoculture pastures or warm-season row-cropping systems. For example, Shurpali et al. [31] reported that the annual maximum cumulative GPP of a cool-season reed canary grass (Phalaris arundinaceae L.) pasture in eastern Finland only reached −689 g C m −2 . In another study, a kentucky bluegrass monoculture pasture only yielded an annual NEE value of −1.1 kg C m −2 [32]. Cotton et al. [33] found that a perennial warm-season hay pasture in Fort Valley of Georgia provided an annual NEE value of −1.62 kg C m −2 , which was similar to our results from the first growing season (−1.78 kg C m −2 ) but still substantially lower than the second growing season (−2.68 kg C m −2 ). In our study, we attributed the low seasonal GPP/NEE values (strong carbon sink) to the fact that the pasture was partially managed as a natural ecosystem with few biomass production gaps and almost whole-year-long growing season involving succession of dominance between cool and warm-season grass species.
Based on NEE data, the entire ecosystem remained a net carbon sink during the majority of the time within each growing season (Figure 4). This contradicts the findings reported by Skinner [9], indicating that a cool-season pasture ecosystem in Pennsylvania with very similar grass species composition could become a carbon source during the winter months. However, previous studies also indicated that transitioning from row-crop to perennial grass systems could greatly enhance GPP and NEE, but it might take more than one or two years before turning the entire ecosystem into a carbon sink [8]. The longer time period since the establishment of this pasture ecosystem might play a role in causing the differences between our study and the one reported by Skinner [16]. Meanwhile, Skinner [34] also found that the threshold temperature for C3-dominated pastures to remain actively sequestering carbon should be about −4 • C, below which is a temperature range not commonly seen in the middle Tennessee region (Figure 1). The climate transition zone of the U.S. typically has mild winter temperature and ample precipitation throughout the growing season. Therefore, the grass-dominated pasture ecosystems were rarely of a net carbon source even during the winter months.

Evapotranspiration and Ecosystem Water Use Efficiency
Information relating to quantification of ET of C3 species-dominated ecosystems is extremely limited. The EC-measured daily ET values (2.0 to 2.1 mm day −1 , Figure 5) obtained from this study were lower than that reported from a soybean monoculture study conducted in the Mississippi Delta region (4.4 mm day −1 ) [27]. This is mainly caused by the differences in solar radiation intensity and solar elevation angle affected by the latitude difference. As expected, the daily ET values were smaller than that from sorghum (Sorghum bicolor L.) and maize (Zea mays L.) fields found in semiarid environments [20]. On the ecosystem level across the two growing seasons, GPP was significantly (p < 0.05) correlated with ET but the model only explained 37% of variation in the data. This indicates that the impact of precipitation/ET on carbon uptake of perennial cool-season grass dominated systems in the climate transition zones of the U.S. is moderate, as this region typically receives ample precipitation year-round. In general, C3 species have lower WUE than C4. However, summer warm-season species encroachment could significantly increase the overall EWUE. Therefore, this grassland ecosystem provided greater average EWUE (6.5 g C kg −1 ) than that reported from warm-season perennial monoculture systems, such as old world bluestem (3.3 to 4.1 g C kg −1 water); 2.5 to 3.1 g C kg −1 water) [9,19] and switchgrass (3.1 g C kg −1 water) [16].

Biomass Productivity and Plant Nutritive Value
The decline in plant nutritive value with advancing maturity and accumulating biomass yield was observed in this research, as indicated in other previous studies [35,36]. Summer dormancy of cool-season grasses greatly decreased the overall biomass production but this effect was largely offset by warm-season encroachment, which maintained the biomass production at above the −4.5 Mg ha −1 level. Apparently, late-summer cutting significantly reduced standing biomass, and the new growth greatly increased plant nutritive value, reflected in RFV, CP, and lignin levels. The initial spring-season grass CP concentration (around 15%) largely represented the mixture of tall fescue, orchardgrass, and kentuky bluegrass; and we considered this number within the typical range of coolseason grasses without N fertilization [4]. As expected, CP concentration dropped to a very low level in summer as warm-season grasses became more abundant. Additionally, it seemed like CP was more responsive towards regrowth of juvenile tissues compared to the overall digestibility and voluntary intake responses as indicated by RFV values.
In particular, RFV is a quality index that considers both NDF and ADF values, as well as their derived voluntary intake and dry matter digestibility levels [37]. Overall, the pasture ecosystem maintained RFV values between fair to utility levels.

Model Performance
The REddyProc-based gap-filling and partitioning algorithms developed and maintained by the Max Planck Institute for Biogeochemistry are widely accepted as one of the standard methods in the EC flux measurement community [22,26]. The program determines flux gaps during the low turbulent mixing values (u-star filtering) based on low friction velocity, which are identified using moving point method, breakpoint detection, or bootstrapping uncertainty analysis. The purpose for this filtering process was to remove biased flux estimates caused by undetectable advection found in the stable nocturnal surface layer below the IRGA sensor head [38]. Meanwhile, it also uses a look-up table approach based on similarities of meteorological conditions found within a 7-or 14-day time frame for gap-filling. In the original paper, Wutzler et al. [22] benchmarked their results against the BGC online tool (BGC16, 2016 version MPI-BGC in Jena Sect. 3). However, comparisons were only made between the BGC16 and REddyProc predicted values, which indicated excellent agreement; but no comparisons against EC-measured real data were performed. Therefore, we think the benchmarking results obtained from this study could be very informative to the EC flux measurement community. Additionally, Acevedo et al. [39] and Hunt et al. [6] also pointed out that the standard deviation of the vertical wind speed could serve as a better sorting parameter, but the results obtained using both were similar. Therefore, we accepted the more popular u-star-based filtering results in this study as one of the main criteria for selecting high-quality data.
The results from this study indicated that non-parametric, non-linear, kernel methods, such as SVM generally provided superior gap-filling performance than REddyProc. The ANN-based models could be used for predicting both NEE and LE gaps, but the performance stability remained questionable, which could be caused by the uncertainties associated with the structure parameters (e.g., number of hidden layers and neurons in each layer). Overall, gap-filling LE data could be completed with much greater accuracies than NEE. This is anticipated, because the carbon dynamics on an ecosystem level are governed by much more complex plant-soil-microbes interplays than LE, thus, leading to greater challenges for the modeling processes. The 10-fold CV approach used much more training data than either the 7-or 14-day FW methods. However, only one model was constructed for each routine (training and testing), consisting of nine-fold training and one-fold testing. On the other hand, both the 7-and 14-day FW methods required training and testing on each individual gap data point, thus, resulted in a much greater number of model constructions, but with substantially less training data within each routine. Overall, the 10-fold CV approach was computationally cheaper and more time-efficient. The 7-or 14-day FW methods inherently considered the time-series correlations, as only those observed data temporally close to the artificial gap data points were used for model training. Therefore, the slightly improved performance observed from the FW methods indicated that there exists some time-series correlation in the dataset but not significantly impacting the overall gap-filling performance. Finally, except for the abnormality observed under the ANN-based 14-day FW NEE gap-filling models, little difference was found between the 7versus 14-day FW methods. This indicated that: although the threshold time length for accurate model training was unknown, a 7-day FW dataset (2 × 7 days × 48 half hours = 672 sample vectors for each training process) should contain enough information for constructing satisfactory prediction models.
To date, only a few studies have been conducted in designing and evaluating gapfilling methods for EC flux data. For example, approaches using the mean diurnal variation and the meteorological similarity-based look-up table approaches were compared by Falge et al. [40], but again, no real EC measured value-based comparisons were conducted. In another study, Hui et al. [41] constructed a multiple imputation algorithm based on ex-pectation and maximization algorithms but the predicted results lacked consistency across various forest ecosystems. There is another ANN-based gap-filling algorithm originally proposed by Hsu et al. [42], but was largely designed for hydrological modeling not dedicated for general EC flux data. In some most recent study, Kunwor et al. [43] designed three non-linear regression approaches for gap-filling NEE, but yielded variable prediction accuracy and limited reliability. Hayek et al. [44] indicated that a simple parametric model of missing storage incorporated with a PAR-extrapolated advective respiration loss approach could yield comparable, even slightly better results for gap-filling forest NEE flux data than REddyProc. However, its applicability and effectiveness in row-crop or pasture-based ecosystem studies remain unclear. Likewise, an earlier popular EC gap-filling model developed by Dunn et al. [45] also had a strong focus on forest ecosystems. Before the advent of REddyProc, ANN-based algorithms were generally regarded as one of the best gap-filling approaches [46]. However, as indicated in this study, the uncertainties associated with its structure parameter setting and lack of stability might hinder its popularity in the future. We considered SVM as the optimal machine learning algorithm for EC gap-filling. Overall, REddyProc-based algorithms provided limited performance while predicting NEE but was relatively reliable for LE.

Conclusions
Taken together, a grassland pasture exemplifies a complex and dynamic ecosystem. This case study used EC tower-based flux measurement approaches, integrated with field data measurement and simulation modeling to investigate the evolution of plant phenology, nutritive value, energy balance, and carbon/water fluxes of a cool-season grassland ecosystem in the temperate climate transition region. Climate change has caused significant summer warm-season encroachment, thus, resulted in high ecoefficiency (high GPP) and EWUE. The modeling-based uncertainty analysis indicated that REddyProc can effectively predict LE data, but NEE gap filling should proceed with caution. Finally, building novel EC gap-filling models using machine learning algorithms should be important future direction.