Remote Sensing-Based Estimation of Advanced Perennial Grass Biomass Yields for Bioenergy

A sustainable bioeconomy would require growing high-yielding bioenergy crops on marginal agricultural areas with minimal inputs. To determine the cost competitiveness and environmental sustainability of such production systems, reliably estimating biomass yield is critical. However, because marginal areas are often small and spread across the landscape, yield estimation using traditional approaches is costly and time-consuming. This paper demonstrates the (1) initial investigation of optical remote sensing for predicting perennial bioenergy grass yields at harvest using a linear regression model with the green normalized difference vegetation index (GNDVI) derived from Sentinel-2 imagery and (2) evaluation of the model’s performance using data from five U.S. Midwest field sites. The linear regression model using midsummer GNDVI predicted yields at harvest with R2 as high as 0.879 and a mean absolute error and root mean squared error as low as 0.539 Mg/ha and 0.616 Mg/ha, respectively, except for the establishment year. Perennial bioenergy grass yields may be predicted 152 days before the harvest date on average, except for the establishment year. The green spectral band showed a greater contribution for predicting yields than the red band, which is indicative of increased chlorophyll content during the early growing season. Although additional testing is warranted, this study showed a great promise for a remote sensing approach for forecasting perennial bioenergy grass yields to support critical economic and logistical decisions of bioeconomy stakeholders.


Introduction
Global efforts on the adoption of a biomass-based economy are increasing, particularly in the European Union and the United States (U.S.), to significantly displace our dependence on fossil-based energy and products in the coming decade [1]. For instance, in the U.S., the Biomass Research and Development (BRD) Act of 2000 [2] has paved the way for the Bioeconomy Initiative, a concerted, inter-federal agency effort that started in 2013, which is geared towards maximizing the utilization of the nation's abundant biomass resources for bioenergy and biobased products [3]. In 2014, the biobased products sector had contributed USD 393 billion and 4.2 million jobs to the U.S. economy [4]. As the market for bioproducts (renewable biochemicals and biopolymers) and bioenergy grows, grasses are grown for animal consumption (e.g., open grazing or for forage/harvested as hay) [31][32][33]. The differences between forage and bioenergy production systems result in different needs for biomass estimation from remote sensing models. For example, a forage production system may include a single or two-cut harvest regime, and harvest timing (early or midseason) will be based on optimizing nutrient and crude protein concentrations to improve feed quality [34][35][36]. Under a sustainable bioenergy production system, perennial grass biomass is typically harvested once after senescence or a killing frost at the end of each growing season to allow for nutrients to translocate to belowground tissues, thereby reducing the next season's nutrient requirements and improving stand longevity [7,[37][38][39][40][41]. The differences in harvest timing and harvest frequency for these two applications can impact the crop's biomass yield and quality. Management practices, such as fertilizer regimes, herbicide applications, and tillage practices, can also affect the optical properties of the crops captured in the imagery by influencing the chlorophyll and nitrogen content in leaf and stem tissues [42][43][44], the amount of canopy gaps or exposed ground [45], and ground cover composition (e.g., crop residues and weeds) [44,45]. Additionally, local weather and climate, soil properties, and genetic modifications in newly developed crops and cultivars can impact a plant's physiochemical properties (e.g., chemical and pigment constituents and abundance, moisture content, internal leaf structure, and canopy architecture) and therefore their optical properties. Despite the demonstrated success of remote sensing modeling for crop-yield estimation, changes in the crop's optical properties require the models to be evaluated and recalibrated for newly released perennial bioenergy crops for use in rapid and reliable forecasting of biomass yields.
The objectives of this study were to formulate a parsimonious optical remote sensing model using spectral vegetation indices and evaluate its predictive power for the dry biomass yields at-harvest of large-scale switchgrass (Panicum virgatum L.) production systems on marginal land using five study locations in the U.S. Midwest. The ultimate goal is to develop a methodology for rapid, reliable, and cost-effective forecasting of bioenergy crop yields that is spatially resolved at the subfield scale. In this study, biomass yield refers to the harvested aboveground switchgrass biomass (dry), cut between 10 and 15 cm above the soil surface.

Study Area
This study comprises five field sites in four U.S. Midwest states, including two sites in Illinois (Brighton (39 • (Figures 1 and 2). All sites selected for this study are considered marginal based on the landowner's observation of historical production data. In particular, the production costs of traditional row crops such as corn, soybean, and wheat have been generally higher than the income from yields generated, resulting in an economic loss for the producers. The landowners' observations also suggest that historical row crop yields for these sites have been 25-50% lower than the county-or state-reported averages. In most cases, the reduced yields are a result of poor soil quality due to soil erosion (i.e., loss of nutrient-rich topsoil) and soil moisture impediments (e.g., too dry, poorly drained, or high nutrient leaching).   Photos of the study sites. The Brighton, Illinois (IL), and Urbana, IL, sites show 'Independence' switchgrass (SW) from July 2020 and September 2020, respectively. The Madrid, Iowa (IA), site shows 'Shawnee' SW foreground, Independence SW in the middle, and 'Liberty' SW in the top right before the tree line (July 2020). The Ithaca, Nebraska (NE), site shows Independence SW in the foreground and a low diversity mixture in the background (June 2020). The South Shore, South Dakota (SD), site shows 'Sunburst' SW as a border, corn on the left, and 'Carthage' SW and a soybean on the right (July 2020).
Field site characteristics and crop management are included in Table 1. Precipitation data were provided by on-site or locally available weather stations. Treatments (three replicates each) and planting design varied across the site locations based on field (size and arrangement) and climate (temperature) conditions. The switchgrass cultivars and other warm-season perennial grasses evaluated are shown in Figure 1. The switchgrass cultivars included in the study consist of upland ('Carthage', 'Sunburst', 'Shawnee'), lowland ('Independence'), and mixed (hybrid: 'Liberty') ecotypes. Switchgrass was planted in the Photos of the study sites. The Brighton, Illinois (IL), and Urbana, IL, sites show 'Independence' switchgrass (SW) from July 2020 and September 2020, respectively. The Madrid, Iowa (IA), site shows 'Shawnee' SW foreground, Independence SW in the middle, and 'Liberty' SW in the top right before the tree line (July 2020). The Ithaca, Nebraska (NE), site shows Independence SW in the foreground and a low diversity mixture in the background (June 2020). The South Shore, South Dakota (SD), site shows 'Sunburst' SW as a border, corn on the left, and 'Carthage' SW and a soybean on the right (July 2020).
Field site characteristics and crop management are included in Table 1. Precipitation data were provided by on-site or locally available weather stations. Treatments (three replicates each) and planting design varied across the site locations based on field (size and arrangement) and climate (temperature) conditions. The switchgrass cultivars and other warm-season perennial grasses evaluated are shown in Figure 1. The switchgrass cultivars included in the study consist of upland ('Carthage', 'Sunburst', 'Shawnee'), lowland ('Independence'), and mixed (hybrid: 'Liberty') ecotypes. Switchgrass was planted in the spring of 2019 at three of the five sites (i.e., Brighton, IL; Iowa; and South Dakota). Lowland Land 2021, 10, 1221 6 of 22 ecotypes such as 'Independence' and 'W105L' were also planted at the South Dakota site; however, due to heavy winterkill in 2019, the plots were replanted with soybean and winter wheat for subsequent years and, therefore, are not included in Table 1. The majority of the Nebraska site was planted in 2012, including the Liberty switchgrass, big bluestem mixture ('Goldmine' and 'Bonanza'), and a low diversity mixture (side-oats grama, Indiangrass, big bluestem), and the exception was a single plot of Independence switchgrass, which was planted in the spring of 2019. The Urbana site was the most recently established site, with a spring planting in 2020 (Table 1). All sites were planted with a no-till drill with a targeted planting rate of 300 seedlings/m 2 and a 15-18 cm row spacing. Switchgrass stand establishment was evaluated in the fall of the planting year or the following spring using a frequency grid [46]. Plots with an average frequency of less than 25% were overseeded the following season. For the Iowa, South Dakota, and Brighton sites, no large-scale harvest occurred at the end of the establishment year. Instead, biomass was removed (via burning, mowing/raking) in early 2020 prior to the start of new growth. Additional management information and soil data for each site are included in Table S1. inter wheat for subsequent years and, therefore, are not included in Table 1. The majorty of the Nebraska site was planted in 2012, including the Liberty switchgrass, big luestem mixture ('Goldmine' and 'Bonanza'), and a low diversity mixture (side-oats rama, Indiangrass, big bluestem), and the exception was a single plot of Independence witchgrass, which was planted in the spring of 2019. The Urbana site was the most reently established site, with a spring planting in 2020 (Table 1). All sites were planted with no-till drill with a targeted planting rate of 300 seedlings/m 2 and a 15-18 cm row spacing. witchgrass stand establishment was evaluated in the fall of the planting year or the folowing spring using a frequency grid [46]. Plots with an average frequency of less than 5% were overseeded the following season. For the Iowa, South Dakota, and Brighton ites, no large-scale harvest occurred at the end of the establishment year. Instead, biomass as removed (via burning, mowing/raking) in early 2020 prior to the start of new growth. dditional management information and soil data for each site are included in Table S1.  [47]. £ Data provided by [48]. ꞎ Data provided by [49]. Ꞩ Data provided by .

Data
The data used for modeling were (1) switchgrass yields measured at harvest between ovember and December 2020 for the five study sites and (2) 10 m resolution Sentinel-2 atellite imagery from the 2020 April-October period. To monitor ground conditions, plot ictures and field observations including chlorophyll content of upper canopy leaves usng a SPAD-502 chlorophyll meter (Minolta, Konica Minolta Sensing Europe B.V.) were ollected during the growing season. Perennial grass dry yield was collected after a killing rost (air temperature below −2.2 °C) in the fall of 2020 from each study site. At the Iowa, ebraska, and Brighton sites, full plot biomass was measured by cutting the entire plot ith a mower at a cutting height of 10-15 cm. Subsamples of cut biomass were collected rom the windrows for measuring moisture and dry matter contents. The windrows were hen baled and weighed. In South Dakota, the plots were also mowed and baled, but only swath (5.5 m in width) up the center of each plot was cut and baled for biomass yield. ubsamples were also collected from the windrows, but the moisture content of the bales as directly measured using a moisture probe. spring of 2019 at three of the five sites (i.e., Brighton, IL; Iowa; and South Dakota). Lowland ecotypes such as 'Independence' and 'W105L' were also planted at the South Dakota site; however, due to heavy winterkill in 2019, the plots were replanted with soybean and winter wheat for subsequent years and, therefore, are not included in Table 1. The majority of the Nebraska site was planted in 2012, including the Liberty switchgrass, big bluestem mixture ('Goldmine' and 'Bonanza'), and a low diversity mixture (side-oats grama, Indiangrass, big bluestem), and the exception was a single plot of Independence switchgrass, which was planted in the spring of 2019. The Urbana site was the most recently established site, with a spring planting in 2020 (Table 1). All sites were planted with a no-till drill with a targeted planting rate of 300 seedlings/m 2 and a 15-18 cm row spacing. Switchgrass stand establishment was evaluated in the fall of the planting year or the following spring using a frequency grid [46]. Plots with an average frequency of less than 25% were overseeded the following season. For the Iowa, South Dakota, and Brighton sites, no large-scale harvest occurred at the end of the establishment year. Instead, biomass was removed (via burning, mowing/raking) in early 2020 prior to the start of new growth. Additional management information and soil data for each site are included in Table S1. verage of 2019 and 2020. € Data provided by [47]. £ Data provided by [48]. ꞎ Data provided by [49]. Ꞩ Data provided by ].

Data
The data used for modeling were (1) switchgrass yields measured at harvest between November and December 2020 for the five study sites and (2) 10 m resolution Sentinel-2 satellite imagery from the 2020 April-October period. To monitor ground conditions, plot pictures and field observations including chlorophyll content of upper canopy leaves using a SPAD-502 chlorophyll meter (Minolta, Konica Minolta Sensing Europe B.V.) were collected during the growing season. Perennial grass dry yield was collected after a killing frost (air temperature below −2.2 °C) in the fall of 2020 from each study site. At the Iowa, Nebraska, and Brighton sites, full plot biomass was measured by cutting the entire plot with a mower at a cutting height of 10-15 cm. Subsamples of cut biomass were collected from the windrows for measuring moisture and dry matter contents. The windrows were then baled and weighed. In South Dakota, the plots were also mowed and baled, but only a swath (5.5 m in width) up the center of each plot was cut and baled for biomass yield. Subsamples were also collected from the windrows, but the moisture content of the bales was directly measured using a moisture probe. Harvested area was measured for each plot across all field sites. A small-plot combine harvester (Wintersteiger Cibus plot har-  [47]. £ Data provided by [48]. spring of 2019 at three of the five sites (i.e., Brighton, IL; Iowa; and South Dakota). Lowland ecotypes such as 'Independence' and 'W105L' were also planted at the South Dakota site; however, due to heavy winterkill in 2019, the plots were replanted with soybean and winter wheat for subsequent years and, therefore, are not included in Table 1. The majority of the Nebraska site was planted in 2012, including the Liberty switchgrass, big bluestem mixture ('Goldmine' and 'Bonanza'), and a low diversity mixture (side-oats grama, Indiangrass, big bluestem), and the exception was a single plot of Independence switchgrass, which was planted in the spring of 2019. The Urbana site was the most recently established site, with a spring planting in 2020 (Table 1). All sites were planted with a no-till drill with a targeted planting rate of 300 seedlings/m 2 and a 15-18 cm row spacing. Switchgrass stand establishment was evaluated in the fall of the planting year or the following spring using a frequency grid [46]. Plots with an average frequency of less than 25% were overseeded the following season. For the Iowa, South Dakota, and Brighton sites, no large-scale harvest occurred at the end of the establishment year. Instead, biomass was removed (via burning, mowing/raking) in early 2020 prior to the start of new growth. Additional management information and soil data for each site are included in Table S1.  [47]. £ Data provided by [48]. ꞎ Data provided by [49]. Ꞩ Data provided by [50].

Data
The data used for modeling were (1) switchgrass yields measured at harvest between November and December 2020 for the five study sites and (2) 10 m resolution Sentinel-2 satellite imagery from the 2020 April-October period. To monitor ground conditions, plot pictures and field observations including chlorophyll content of upper canopy leaves using a SPAD-502 chlorophyll meter (Minolta, Konica Minolta Sensing Europe B.V.) were collected during the growing season. Perennial grass dry yield was collected after a killing frost (air temperature below −2.2 °C) in the fall of 2020 from each study site. At the Iowa, Nebraska, and Brighton sites, full plot biomass was measured by cutting the entire plot with a mower at a cutting height of 10-15 cm. Subsamples of cut biomass were collected from the windrows for measuring moisture and dry matter contents. The windrows were then baled and weighed. In South Dakota, the plots were also mowed and baled, but only a swath (5.5 m in width) up the center of each plot was cut and baled for biomass yield. Subsamples were also collected from the windrows, but the moisture content of the bales was directly measured using a moisture probe. Harvested area was measured for each plot across all field sites. A small-plot combine harvester (Wintersteiger Cibus plot har- spring of 2019 at three of the five sites (i.e., Brighton, IL; Iowa; and South Dakota). Lowland ecotypes such as 'Independence' and 'W105L' were also planted at the South Dakota site; however, due to heavy winterkill in 2019, the plots were replanted with soybean and winter wheat for subsequent years and, therefore, are not included in Table 1. The majority of the Nebraska site was planted in 2012, including the Liberty switchgrass, big bluestem mixture ('Goldmine' and 'Bonanza'), and a low diversity mixture (side-oats grama, Indiangrass, big bluestem), and the exception was a single plot of Independence switchgrass, which was planted in the spring of 2019. The Urbana site was the most recently established site, with a spring planting in 2020 (Table 1). All sites were planted with a no-till drill with a targeted planting rate of 300 seedlings/m 2 and a 15-18 cm row spacing. Switchgrass stand establishment was evaluated in the fall of the planting year or the following spring using a frequency grid [46]. Plots with an average frequency of less than 25% were overseeded the following season. For the Iowa, South Dakota, and Brighton sites, no large-scale harvest occurred at the end of the establishment year. Instead, biomass was removed (via burning, mowing/raking) in early 2020 prior to the start of new growth. Additional management information and soil data for each site are included in Table S1.  [47]. £ Data provided by [48]. ꞎ Data provided by [49]. Ꞩ Data provided by [50].

Data
The data used for modeling were (1) switchgrass yields measured at harvest between November and December 2020 for the five study sites and (2) 10 m resolution Sentinel-2 satellite imagery from the 2020 April-October period. To monitor ground conditions, plot pictures and field observations including chlorophyll content of upper canopy leaves using a SPAD-502 chlorophyll meter (Minolta, Konica Minolta Sensing Europe B.V.) were collected during the growing season. Perennial grass dry yield was collected after a killing frost (air temperature below −2.2 °C) in the fall of 2020 from each study site. At the Iowa, Nebraska, and Brighton sites, full plot biomass was measured by cutting the entire plot with a mower at a cutting height of 10-15 cm. Subsamples of cut biomass were collected from the windrows for measuring moisture and dry matter contents. The windrows were then baled and weighed. In South Dakota, the plots were also mowed and baled, but only a swath (5.5 m in width) up the center of each plot was cut and baled for biomass yield. Subsamples were also collected from the windrows, but the moisture content of the bales was directly measured using a moisture probe. Harvested area was measured for each plot across all field sites. A small-plot combine harvester (Wintersteiger Cibus plot har-Data provided by [50].

Data
The data used for modeling were (1) switchgrass yields measured at harvest between November and December 2020 for the five study sites and (2) 10 m resolution Sentinel-2 satellite imagery from the 2020 April-October period. To monitor ground conditions, plot pictures and field observations including chlorophyll content of upper canopy leaves using a SPAD-502 chlorophyll meter (Minolta, Konica Minolta Sensing Europe B.V.) were collected during the growing season. Perennial grass dry yield was collected after a killing frost (air temperature below −2.2 • C) in the fall of 2020 from each study site. At the Iowa, Nebraska, and Brighton sites, full plot biomass was measured by cutting the entire plot with a mower at a cutting height of 10-15 cm. Subsamples of cut biomass were collected from the windrows for measuring moisture and dry matter contents. The windrows were then baled and weighed. In South Dakota, the plots were also mowed and baled, but only a swath (5.5 m in width) up the center of each plot was cut and baled for biomass yield. Subsamples were also collected from the windrows, but the moisture content of the bales was directly measured using a moisture probe. Harvested area was measured for each plot across all field sites. A small-plot combine harvester (Wintersteiger Cibus plot harvester with 4 Kemper row independent cutter head, Service Point, IA) was used in Urbana to collect biomass yield from three areas in each plot (1.2 m × 9 m). We avoided whole-plot harvest to ensure the stand health during the establishment year. Biomass weight was directly measured by the harvester for each cut section, and subsamples of the cut biomass were collected from the combine for measuring moisture content.
The perennial grass yield dataset available for the model development is summarized in Table 2. The number of data points (or plots) ranged from 12 (Urbana and South Dakota)  Sentinel-2 satellite bottom-of-atmosphere reflectance images from April to October 2020 were obtained from the Copernicus Open Access Hub (https://scihub.coperni-cus.eu/ (accessed on 24 February 2020)). The time period was selected based on warm-season perennial grass phenology in the Midwest region with green-up period typically occurring in April or May and senescence occurring between September and October. Images with apparent cloud cover over the study sites based on visual interpretation were excluded from the download. Of the 37 images for each site (185 images total), a total of 61 Sentinel-2 multispectral images across the five study sites were obtained ( Table 3). The Sentinel-2 multispectral imagery of the five study sites is shown in Figure 3. The optical bands (i.e., blue, green, red, and near-infrared (NIR) bands) at 10 m resolution and the scene classification map at 20 m resolution were used for the model development. The spectral reflectance profiles of each of the five study sites are provided in Figure S1.

Image Processing and Model Development
The model development consists of image preparation, spectral index calculation, and linear regression model development. Prior to the index calculation, pixels affected by cloud and its shadow in each of the Sentinel-2 bottom-of-atmosphere reflectance images were excluded using the scene classification map layer on the same date, which is a part of Sentinel-2 products identifying several classes including clouds, cloud shadows, and saturated and defective pixels in the image (https://sentinels.copernicus.eu/web/sentinel/ technical-guides/sentinel-2-msi (accessed on 7 December 2020)). A total of 20 spectral indices that are known to correlate with vegetation properties were calculated using the images ( Table 4). The index values of pixels located only well within the boundaries of each of the plots were extracted to avoid possible spectral contamination from adjacent ground cover. The correlation between each index on each image date with yield was examined, and the index on a single date having the highest correlation with yields was identified.
A linear regression model for estimating yields was developed using the values from one of the index layers having the highest correlation with at-harvest yields. The model was developed using 75% of the available pixels from each plot and validated with the measured plot yields using the remaining 25% of the pixels in order to work with the limited data points. This process was iterated 31 times, and coefficient of determination (R 2 ), mean absolute error (MAE), and root mean squared error (RMSE) were calculated to evaluate the accuracy of yield estimates. For each site, each R 2 , MAE, and RMSE were averaged over the iterations, and those average values were used as the measure of predictive power of the model for the at-harvest yields for each site. A single linear regression model was developed for each site instead of multiple models for each cultivar due to the limited availability for yield estimates at harvest.

Analysis of Spectral Vegetation Indices in Relation to Switchgrass Yields
Of the 20 spectral vegetation indices calculated using the 10 m resolution Sentinel-2 multispectral imagery from April to October 2020, the seasonal trajectories of the green normalized difference vegetation index (GNDVI) [58,59] for the five study sites are shown in Figure 4 as examples since the trajectory of GNDVI seemed to represent the trajectories of indices that were highly correlated with harvestable biomass yields, which is described later in this section. The GNDVI values increased from early May and reached the peak greenness around mid-June at the Iowa, Brighton, and Nebraska sites. The South Dakota site, which is the furthest north in latitude, had a later start for greenness increase. The greenness at the Urbana site, the youngest of the five sites, did not start to increase until late June to beginning of July and exhibited a unique greenness trajectory.  Correlations between the spectral vegetation indices and dry biomass yield at harvest are summarized in Table 5. The strength of correlation between the index values and the dry biomass yields at harvest greatly varied across the sites. For the South Dakota site, 17 out of the 20 indices showed R 2 > 0.75, while index-yield correlations were moderate to high for the Brighton site (e.g., modified soil-adjusted vegetation index (MSAVI) [63], R 2 = 0.661). For the Iowa, Urbana, and Nebraska sites, a few indices showed strong correlations with yields, including the green chlorophyll index (GCI [56], R 2 = 0.828, 0.739, 0.884, and 0.878 for the Urbana, Iowa, Nebraska, and South Dakota sites, respectively) and GNDVI (R 2 = 0.842, 0.776, 0.873, and 0.857 for the Urbana, Iowa, Nebraska, and South Dakota sites, respectively). The indices based on the soil-adjusted vegetation index (SAVI [68]) were also correlated with the biomass yields for the Iowa site (R 2 = 0.737, 0.738, and In general, the trajectories of index values were very similar across the crops/cultivars within each site, with some exceptions. In particular, the Nebraska site showed very uniform trajectories across the crops (Figure 4: liberty switchgrass, big bluestem, and the low diversity mix). The major exception was the Independence switchgrass plot. The GNDVI trend was outside the 2 standard deviations from the mean. Although the stand of Independence switchgrass (height and thickness) at the Nebraska location looked very different from the other crop plots, due to the lack of plot replication, it is difficult to verify the cause of this variation. With the current dataset, the exclusion of the plot increased the R 2 between GNDVI and the harvested yield from 0.313 to 0.873. Therefore, the Independence plot was excluded from the model development for the Nebraska site.
The other observed visual crop difference at the Nebraska site occurred after day-of-year (DOY) 260. The GNDVI values for Liberty switchgrass declined more quickly than those for the low diversity mixture and big bluestem plots, resulting in a more defined split between the crop types by the end of the growing season.
In Brighton, a similar split in GNDVI trends between switchgrass cultivars was observed until DOY 219 (Figure 4). Shawnee switchgrass maintained higher GNDVI values than Independence and Liberty switchgrass by the end of the season. At the Iowa site, the majority of the Independence switchgrass plots had lower GNDVI values for the bulk of the growing season (DOY 157-247) compared to the other cultivars, resulting in a greater spread in GNDVI values by cultivar type for this time period. Plot 305, shown in gray in Figure 4, showed a unique trajectory in GNDVI values across the season. In fact, this plot was reseeded at the end of August 2020 due to low plant density (i.e., <25% threshold). In contrast to plot 305, the other two plots in Iowa that required reseeding (i.e., plots 307 (Liberty switchgrass) and 104 (Independence switchgrass)) showed comparable GNDVI trajectories across the season to other plots of the same cultivar type. Although no distinct pattern in GNDVI trajectory specific to any switchgrass cultivar type was observed, the South Dakota and Urbana sites showed variation in late June through the end of the season. Similar overall patterns in seasonal trajectory were observed for other spectral vegetation indices.
Correlations between the spectral vegetation indices and dry biomass yield at harvest are summarized in Table 5. The strength of correlation between the index values and the dry biomass yields at harvest greatly varied across the sites. For the South Dakota site, 17 out of the 20 indices showed R 2 > 0.75, while index-yield correlations were moderate to high for the Brighton site (e.g., modified soil-adjusted vegetation index (MSAVI) [63], R 2 = 0.661). For the Iowa, Urbana, and Nebraska sites, a few indices showed strong correlations with yields, including the green chlorophyll index (GCI [56], R 2 = 0.828, 0.739, 0.884, and 0.878 for the Urbana, Iowa, Nebraska, and South Dakota sites, respectively) and GNDVI (R 2 = 0.842, 0.776, 0.873, and 0.857 for the Urbana, Iowa, Nebraska, and South Dakota sites, respectively). The indices based on the soil-adjusted vegetation index (SAVI [68]) were also correlated with the biomass yields for the Iowa site (R 2 = 0.737, 0.738, and 0.737 for SAVI, MSAVI, and optimized SAVI (OSAVI) [66], respectively) and Brighton site (R 2 = 0.643 across SAVI, MSAVI, and OSAVI). The modified simple ratio index showed a good correlation for the Urbana and Nebraska sites as well (R 2 = 0.743 and 0.827, respectively) ( Table 5). The average switchgrass plant frequencies across cultivars in Iowa and Brighton (30% and 46%, respectively) in the spring/summer of 2020 were lower than those of the other field sites (Nebraska 85%, Urbana 50%, South Dakota 100%). These two sites also had the largest range in switchgrass plant frequency compared to the other sites.
The strongest correlations of the index values with yields also varied by timing ( Table 5). The indices in June (DOY 155-180) showed strong correlation with the dry biomass yields at harvest for the Brighton, Nebraska, and Iowa sites. For the Urbana site, the index-yield correlations were strong in the late growing season (i.e., early October). For the South Dakota site, the correlations varied across the indices.
Of the 20 indices examined, indices utilizing the green and NIR spectral bands such as GCI and GNDVI were consistently identified as those strongly correlated with the yields across the sites ( Table 5). The correlations of the indices using the green and NIR spectral bands were stronger than those of indices based on the red and NIR bands (e.g., NDVI) that are commonly used for vegetation studies. Based on the observation of a nearly consistent strong correlation of GNDVI with at-harvest biomass yields across the study sites, the subsequent modeling was carried out exclusively using GNDVI.
Scatter plots of the harvested dry biomass yields and the average GNDVI values corresponding to each of those plots on the date of highest correlation for each study site are presented in Figure 5. All sites except for the Brighton site showed strong correlations between GNDVI and yield (R 2 > 0.75). The GNDVI-yield correlation was moderate for the Brighton site, but index-yield correlations across the indices examined were lower than those of the other sites. Thus, modeling yield using GNDVI was reasonable for this study; GNDVI was calculated using Equation (1): where NIR and G are bottom-of-atmosphere reflectance values of the NIR and green spectral bands, which correspond to the spectral band 8 (B8) and band 3 (B3) of Sentinel-2 satellite imagery, respectively. Brighton site, but index-yield correlations across the indices examined were lower than those of the other sites. Thus, modeling yield using GNDVI was reasonable for this study; GNDVI was calculated using Equation (1): where NIR and G are bottom-of-atmosphere reflectance values of the NIR and green spectral bands, which correspond to the spectral band 8 (B8) and band 3 (B3) of Sentinel-2 satellite imagery, respectively.

Prediction of Switchgrass Dry Biomass Yields at Harvest
The overall performance of the linear regression models using GNDVI for predicting perennial grass dry biomass yields at harvest for the five sites is summarized in Table 6, along with the characteristics of dry biomass yield data used for the model development. Based on the 31 iterations, the yields predicted using the GNDVI linear regression model showed moderate to strong correlations with at-harvest yields ranging from R 2 = 0.592 (Brighton) to R 2 = 0.879 (South Dakota). The average of MAE ranged from 0.377 Mg/ha (Urbana) to 0.589 Mg/ha (Brighton), and the average of RMSE ranged from 0.399 Mg/ha (Urbana) to 0.685 Mg/ha (Brighton). The average MAE and RMSE indicated that the yields predicted by the GNDVI linear regression model had comparable errors across the five sites, except for the Urbana site, which is characterized by the lowest minimum and maximum measured yields and the smallest measured yield range (Tables 2 and 6). Table 6. Performance of linear regression models using the green normalized difference vegetation index 1 and characteristics of perennial grass dry biomass yield data used for model development.

Site
Image Date (DOY) R 2 MAE RMSE

Prediction of Switchgrass Dry Biomass Yields at Harvest
The overall performance of the linear regression models using GNDVI for predicting perennial grass dry biomass yields at harvest for the five sites is summarized in Table 6, along with the characteristics of dry biomass yield data used for the model development. Based on the 31 iterations, the yields predicted using the GNDVI linear regression model showed moderate to strong correlations with at-harvest yields ranging from R 2 = 0.592 (Brighton) to R 2 = 0.879 (South Dakota). The average of MAE ranged from 0.377 Mg/ha (Urbana) to 0.589 Mg/ha (Brighton), and the average of RMSE ranged from 0.399 Mg/ha (Urbana) to 0.685 Mg/ha (Brighton). The average MAE and RMSE indicated that the yields predicted by the GNDVI linear regression model had comparable errors across the five sites, except for the Urbana site, which is characterized by the lowest minimum and maximum measured yields and the smallest measured yield range (Tables 2 and 6). Table 6. Performance of linear regression models using the green normalized difference vegetation index 1 and characteristics of perennial grass dry biomass yield data used for model development.

Site
Image The yields predicted by the GNDVI linear regression model showed stronger correlations for the sites having higher minimum and maximum measured yields at harvest (i.e., Nebraska and South Dakota) than for those having lower minimum and maximum yields (i.e., Urbana and Brighton) (Tables 2 and 6). The predicted yields also correlated more strongly with the measured yields for the sites having the greater range of measured yields (i.e., Nebraska and South Dakota) than those with the limited yield range (i.e., Urbana and Brighton).
Examples of scatter plots of the measured switchgrass yields at harvest and predicted yields are presented in Figure 6. Each scatter plot was selected from the iteration resulting in the median R 2 for each site. Specific performance measures would vary across iterations, but the predicted yields showed a strong linear relationship with the measured yields across the five sites. The model showed the lowest median R 2 for the Brighton site with a greater difference between RMSE and MAE ( Figure 6), which is also suggested in Table 6. The model appears less robust for the Urbana site than other sites, as indicated by the slope and intercept deviating from 1 and 0, respectively, in the example (slope = 6.904, intercept = 9.134, Figure 6). Examples of scatter plots of the measured switchgrass yields at harvest and predicted yields are presented in Figure 6. Each scatter plot was selected from the iteration resulting in the median R 2 for each site. Specific performance measures would vary across iterations, but the predicted yields showed a strong linear relationship with the measured yields across the five sites. The model showed the lowest median R 2 for the Brighton site with a greater difference between RMSE and MAE ( Figure 6), which is also suggested in Table 6. The model appears less robust for the Urbana site than other sites, as indicated by the slope and intercept deviating from 1 and 0, respectively, in the example (slope = 6.904, intercept = 9.134, Figure 6). Examples of the switchgrass dry biomass yields for the five study sites that were predicted using the GNDVI linear regression model are shown in Figure 7. The maps were generated by applying the model developed using plot-level data to image pixels. Overall, the relative yield (i.e., high and low yields) and variation of predicted yields reflect the amount and range of the measured yields across the sites (Table 2). For example, the Urbana site, which has the lowest minimum and maximum measured yields (i.e., 1.19-4.35 Mg/ha) and the smallest yield range (i.e., 3.16 Mg/ha) of all sites, showed the lowest predicted yields and the most limited variation across the site. In contrast, the Nebraska site, which has the largest measured yield range (i.e., 6.96 Mg/ha) of all sites, exhibited the greatest variation in the predicted yields (Table 2 and Figure 7). Across crop types at the Nebraska site, the highest yields occurred in plots fertilized with 56 kg N ha −1 compared to 28 kg N ha −1 . Relationships between the predicted yields and fertilizer application were also observed for some plots in other sites (Figures 1 and 7). For the South Dakota site, the model predicted relatively homogeneous yields within each of the high-yielding plots but predicted noticeable internal variation for the plots having moderate to low yields. The difference between higher-and lower-yielding plots was not always associated with fertilizer rate (Figures 1 and 7). For the Brighton and Iowa sites, switchgrass cultivar type can be seen contributing to yield variation (Figures 1 and 7). Examples of the switchgrass dry biomass yields for the five study sites that were predicted using the GNDVI linear regression model are shown in Figure 7. The maps were generated by applying the model developed using plot-level data to image pixels. Overall, the relative yield (i.e., high and low yields) and variation of predicted yields reflect the amount and range of the measured yields across the sites ( Table 2). For example, the Urbana site, which has the lowest minimum and maximum measured yields (i.e., 1.19-4.35 Mg/ha) and the smallest yield range (i.e., 3.16 Mg/ha) of all sites, showed the lowest predicted yields and the most limited variation across the site. In contrast, the Nebraska site, which has the largest measured yield range (i.e., 6.96 Mg/ha) of all sites, exhibited the greatest variation in the predicted yields (Table 2 and Figure 7). Across crop types at the Nebraska site, the highest yields occurred in plots fertilized with 56 kg N ha −1 compared to 28 kg N ha −1 . Relationships between the predicted yields and fertilizer application were also observed for some plots in other sites (Figures 1 and 7). For the South Dakota site, the model predicted relatively homogeneous yields within each of the high-yielding plots Land 2021, 10, 1221 16 of 22 but predicted noticeable internal variation for the plots having moderate to low yields. The difference between higher-and lower-yielding plots was not always associated with fertilizer rate (Figures 1 and 7). For the Brighton and Iowa sites, switchgrass cultivar type can be seen contributing to yield variation (Figures 1 and 7).
Land 2021, 10, x FOR PEER REVIEW 17 of Figure 7. Examples of switchgrass dry biomass yield (Mg/ha) for the five study sites.

Discussion
Despite the limited data available, the simple linear regression model using GND showed promise for predicting harvestable switchgrass dry biomass yields at a plot sc (i.e., 0.2-0.4 ha). Except for the Urbana site, calculated index values in midsummer (i DOY 165 through 193) were often strongly correlated with the perennial grass yields harvest for all sites across the spectral vegetation indices examined (R 2 as high as 0.8 MAE and RMSE as low as 0.539 Mg/ha and 0.616 Mg/ha, respectively, except for the tablishment year). Switchgrass and other warm-season perennial grasses often acceler their growth in June followed by flowering, which was observed to start by the middle Figure 7. Examples of switchgrass dry biomass yield (Mg/ha) for the five study sites.

Discussion
Despite the limited data available, the simple linear regression model using GNDVI showed promise for predicting harvestable switchgrass dry biomass yields at a plot scale (i.e., 0.2-0.4 ha). Except for the Urbana site, calculated index values in midsum-mer (i.e., DOY 165 through 193) were often strongly correlated with the perennial grass yields at harvest for all sites across the spectral vegetation indices examined (R 2 as high as 0.879; MAE and RMSE as low as 0.539 Mg/ha and 0.616 Mg/ha, respectively, except for the establishment year). Switchgrass and other warm-season perennial grasses often accelerate their growth in June followed by flowering, which was observed to start by the middle to end of July for most cultivars. Because plants shift their energy allocation from vegetative growth to flowering at that time, the greatest correlation between vegetation index values is around the same time, which indicates the maximum green biomass of the season, and at-harvest biomass yield is understandable. This suggests that yields may be fairly reliably predicted a few months before harvest. In this study, the harvestable dry biomass yields were predicted 152 days prior to harvest on average. Factors that could influence the prediction accuracy when using midsummer spectral index values include the amount of precipitation, such as prolonged droughts that would influence soil moisture status. Sources of uncertainty in predicted yields could also include timing of harvest as observed in our previous study (not published) and cultivar variation in physiology and chemistry as they determine plant spectral responses [72]. Because fertilizer and herbicide applications and tillage practices influence the nitrogen status of crops [42][43][44], the amount of exposed ground within the stands [45], and the composition of exposed ground (e.g., crop residues and weeds) [44,45], these factors should also be carefully considered when modeling yields using spectral vegetation indices.
Another source of uncertainty in predicting switchgrass yields is associated with image availability. The availability of Sentinel-2 imagery for switchgrass yield prediction is constrained by periods of high cloud cover, which is a common limitation for satellite imagery. In this study, a total of 185 images were available for the study sites from April to October 2020. However, only 61 images were obtained and utilized in subsequent analyses across the sites (Table 3). For many of the sites, satellite imagery around the time of flowering (mid to end of July for most grasses and switchgrass cultivars, except for Independence switchgrass which flowers closer to the end of August) was not available due to cloud cover. This raised a question on the likelihood of satellite image availability that could provide sufficient predictive power for at-harvest biomass yields for each site. Even though the GNDVI seasonal trajectories show fairly comparable values around the selected image dates across the sites, the impact of image dates on yield prediction is uncertain. Furthermore, due to cloud cover, the availability of imagery during the summer months in subsequent years for multiyear studies is also not guaranteed. Although unmanned aerial vehicles and other technologies have been utilized to mitigate the issue of cloud cover, these technologies can be time-consuming and costly and could require rigorous data processing for proper analysis. Technologies that could increase satellite image availability or methodologies for simulating seasonal trajectory would greatly help in reducing such uncertainty.
The comparison of 20 spectral vegetation indices showed that the NIR spectral band is the key to predicting biomass yields, which is consistent with a number of existing vegetation studies using remote sensing [15,73]. The index comparison also indicated a greater information value of the green spectral band for predicting at-harvest yields of switchgrass and other warm-season perennial grasses than the red band. Gitelson et al. [58,59] found that spectral reflectance in the green spectral region is more sensitive to plant chlorophyll content than that in the red spectral region. As a result, GNDVI showed a stronger correlation with chlorophyll contents across the plants examined than NDVI, which utilizes the NIR and red bands [59]. Switchgrass aboveground biomass is accumulated through vegetative growth or elongation of stems from emergence in the spring through flowering, with the growth period (green-up to anthesis) dependent upon geographic location (temperature and daylength) and ecotype [74][75][76]. During the vegetative growth period, plant chlorophyll content increases [75]. This pattern was observed in the Iowa, Brighton, and Nebraska sites with green-up observed in imagery to occur in late April/early May and continue into June and July, which corresponded to the measured chlorophyll con-tent in the field during this time. In mid-to late summer, the GNDVI values plateaued, which appeared to coincide with flowering. Then, GNDVI values gradually decreased as switchgrass began senescing.
The pattern observed between the predictive power of the GNDVI linear regression model measured by R 2 and the characteristics of measured yields (i.e., minimum, maximum, and range) may be explained by the strength of spectral signal associated with switchgrass/perennial grass biomass in the field during the image collection. Dense switchgrass growth, which would indicate greater biomass volume than sparse growth, is likely represented by pixel values in the image. In contrast, spectral signals from sparse switchgrass growth likely contain noise from background such as soil or plant residue and obscure true signals of switchgrass in pixel values. Although it is out of scope for the present study, the influence of vertically projected cover and height of switchgrass and presence of other materials on remote sensing-based yield prediction could be examined with field observations of plant height, canopy photos, and weed cover during the peak growing season.
Phenological variations observed in the GNDVI seasonal trajectories can reflect the age of switchgrass or perennial grass stands and geographic location. The later start of the greening at the South Dakota site compared to the other sites can be explained by its higher latitude and colder temperatures later in the spring compared to the other sites. The unique greenness pattern of cultivars for the Urbana site can be explained by the stand age. Because the study year (2020) was the establishment year for this site, the timing of biomass accumulation was expected to be different from the other sites. The delayed increase in GNDVI can in part be due to the planting date (DOY 141) and the slow establishment, which is typical of perennial crops. Additionally, literature has shown that switchgrass flowering during the establishment year can occur later than what is observed in the following years after establishment [77].
At the Nebraska site, the difference in GNDVI seasonal trajectory between Independence switchgrass and the other perennial grasses may also be associated with differences in stand age. The Independence plot was established in 2019 while the other plots were established in 2012. The differences may also be associated with phenological characteristics such as canopy structure and leaf chlorophyll. The Independence stand was taller and thicker than the other perennial grasses and Liberty switchgrass cultivar in 2020. It also has a bluer leaf coloration with a waxy coating compared to the other switchgrass cultivars and perennial grasses used in this study. However, due to the lack of plot replication at the Nebraska site along with other phenological data, it is difficult to determine the causes of variation at this time. Additional years of data, collection of phenological data across cultivars, and observing trends between cultivar types across the other field locations may shed light on these seasonal trends. The GNDVI trajectory showed patterns specific to cultivar types at some sites, which suggests that cultivar-specific yields may be modeled using optical imagery with a sufficient number of replicate plots for each cultivar type. This could help the selection of high-yield cultivars.

Conclusions
This paper presented the initial investigation on the predictive power of spectral vegetation indices derived from optical satellite imagery, more specifically GNDVI, for at-harvest switchgrass and other warm-season perennial dry biomass yields using a linear regression model. The study utilized 10 m resolution Sentinel-2 multispectral imagery that was collected from April to October 2020 over the five study sites located in the U.S. Midwest states-Iowa, Illinois, Nebraska, and South Dakota. The key findings of the study were as follows:

1.
The linear regression model using midsummer GNDVI predicted at-harvest perennial grass yield with R 2 as high as 0.879 and MAE and RMSE as low as 0.539 Mg/ha and 0.616 Mg/ha, respectively, except for the establishment year. The selection of image date in this study was based on image availability; thus, there is a possibility that the GNDVI linear regression models has a greater predictive power for at-harvest yields than presented in this study.

2.
The GNDVI linear regression model predicted at-harvest switchgrass yields as early as 152 days before the date of harvest on average, except for the year of establishment. More frequent cloud-free imagery than used in the present study or simulated seasonal trajectories of spectral vegetation indices could allow us to answer more precisely how early and accurately at-harvest yields can be predicted using remote sensing.

3.
While the NIR spectral band was found to be one of the key spectral bands, the green band appeared to have a greater contribution for predicting at-harvest switchgrass and other perennial grass dry biomass yields than the red band. This is consistent with existing studies [58,59] demonstrating that the sensitivity of the green spectral reflectance to plant chlorophyll content, which is indicative of green biomass volume, is greater than that of the red spectral reflectance.
The model formulated in this study will be further tested using additional datasets from 2021. Downscaling of the model should also be explored to estimate switchgrass and other warm-season perennial grass yields at a 10 m scale, which would enable the integration of the data into a predictive model for marginal lands. Additional future studies will include determining (1) if a single-date spectral vegetation index from the optimal available image would provide satisfactory at-harvest switchgrass yield prediction, (2) if simulated seasonal trajectories of a spectral vegetation index could improve accuracy and robustness of switchgrass yield prediction, (3) how early switchgrass yields can be predicted using remote sensing, (4) if remote sensing models developed for a limited number of sites using one year could be applicable for predicting switchgrass yields of other years and other sites, and (5) how vertically projected cover and height of switchgrass and presence of other materials would influence remote sensing-based yield prediction. Additional areas of research include the uniqueness of establishment year for estimating harvestable switchgrass yields and potential impacts of herbicide and fertilizer treatments and tillage practices on biomass yield estimate using remote sensing.
In addition to biomass yields, an ability to predict biomass moisture and nutrient content at harvest in advance would provide actionable information to biorefineries in planning storage, transport, and adjustment to conversion formulas or preprocessing. Thus, predictive modeling for biomass moisture and nutrient content using remote sensing should also be explored.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/land10111221/s1, Figure S1: The spectral reflectance profile of the five study sites; Table S1: Additional field characteristics and management data, Table S2: Plant frequency across perennial grass crops/cultivars in the spring-summer of 2020. Funding: This research was funded by the U.S. Department of Energy, Energy Efficiency and Renewable Energy, Bioenergy Technologies Office, grant number DE-EE0008521. This research was supported by the USDA Agricultural Research Service (USDA-ARS). USDA is an equal opportunity provider and employer. This manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory ("Argonne"). Argonne, a U.S. Department of Energy (DOE) Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.