Multidimensional Aspects of Sustainable Biofuel Feedstock Production

: Bioenergy is becoming increasingly relevant as an alternative to fossil fuels. Various bioenergy feedstocks are suggested as environmentally friendly solutions due to their positive impact on stream health and ability to sequester carbon, but most evaluations for bioenergy feedstocks have not evaluated the implications of bioenergy crop production holistically to date. Through the application of multi-objective optimization on 10 bioenergy feedstock rotations in a Michigan watershed, a Pareto front is searched to identify optimal trade-off solutions for three objective functions representing stream health, environmental emissions/carbon footprint, and economic feasibility. Various multi-criteria decision-making techniques are then applied to the resulting Pareto front to select a set of most-preferred trade-off solutions, which are compared to optimal solutions from each individual objective function. The most-preferred trade-off solutions indicate that a diverse mix of rotations are necessary to optimize all three objectives, whereas the individually optimal solutions do not consider a diverse range of feedstocks, thereby making the proposed multi-objective treatment an important and pragmatic strategy.


Introduction
The U.S. Energy Information Administration (EIA) predicts that global energy consumption will increase 50% by 2050 [1]. In efforts to reach this demand in a sustainable way around the world, affordable clean energy was included in the Sustainable Development Goals adopted by all United Nations Member States in 2015 [2]. Innovation and production of renewable energy have been growing around the world, with 29% of the global electricity capacity [3] and 3% of transportation fuels [4] accounted for by renewables in 2018. Currently, hydropower, wind, and solar make up the majority of the renewable electricity sector, while biofuels account for all of the renewable transportation fuels [3].
In 2005, the U.S. Department of Energy set a series of goals, including the production of 36 billion gallons of renewable fuel by 2022 [5]. They projected that over 30% of the 2005 U.S. petroleum consumption could be displaced by one billion tons of dry biomass, sustainably produced on an annual basis in the U.S., without threatening the food and feed supply [6]. Biofuel production would be beneficial in the U.S. to decrease dependency on foreign oil, help to mitigate climate change, create jobs, and diversify domestic energy sources [7,8]. Meanwhile, negative environmental impacts from land use change, greenhouse gas (GHG) emissions, and decreased biodiversity from increased corn-based ethanol production have already been observed [9,10].
When developing a sustainable and cost-effective alternative to current agricultural practices, multiple aspects of large-scale bioenergy crop production must be considered [7,11]. These aspects include stream health, life cycle analysis (i.e., environmental emissions), and economic feasibility considerations. For instance, nutrient leaching, erosion, and decreased water holding capacity are all increasingly apparent problems faced by farmers due to deteriorating soil quality around the U.S., especially in heavily cultivated states in the Midwest [12]. It has been observed that soil organic matter, required for nutrient retention, erosion control, and water holding capacity, is heavily impacted by conventional agricultural practices [13]. In addition, eutrophication occurs as a result of excess nutrients in the water originated from agricultural runoff, being a potent ecological stressor, which has been shown to greatly impact fish fitness and populations, deteriorating stream health [14]. Incorporating rotations can decrease nutrient leaching through varied nutrient utilization [15]. Likewise, the incorporation of perennial and cover crops with living root systems in the fall season can decrease nutrient leaching by utilizing nutrients and water during the wet periods [16]. On the other hand, approximately 76% of all GHG emissions in the U.S. were attributed to burning fossil fuels in 2017 [17]. Even though the combustion of fuel will always lead to GHG emissions, bioenergy crop cultivation counteracts the emissions by taking in carbon dioxide through photosynthesis to produce biomass. Production and combustion of biofuels generates a closed-loop system between carbon sequestering feedstock production, conversion into fuel, and combustion emissions. For instance, Argonne National Laboratory estimates that biofuels can reduce atmospheric GHG produced by emissions up to 115% when compared to gasoline [18]. It is worth noting that the life cycle analysis (LCA) of bioenergy crops is largely influenced by the aboveto below-ground biomass ratio. Perennial grasses, such as switchgrass and miscanthus, have large quantities of below-ground biomass, allowing for greater carbon sequestration through both biomass and soil organic carbon production [19]. Regarding economic factors, there is already a strong market for bioenergy products in the U.S. Particularly, Michigan is in the top 20% of states for ethanol consumption and biomass feedstock electricity [20]. That being said, corn kernels are currently the primary biofuel feedstock [21]. Conversion of cellulosic feedstocks, such as corn stover, switchgrass, and miscanthus, requires additional pretreatment processes, which have not yet been widely implemented on a large scale [22]. Even though perennial bioenergy feedstocks tend to be less energy-intensive than conventional crops once established [23], their lower market value must be taken into account when optimizing for sustainable bioenergy feedstock production [24].
Stream health, life cycle analysis, and economic feasibility must be taken into account to provide policymakers and stakeholders with an environmentally feasible solution for agricultural land use. Therefore, in this study, agricultural land use for bioenergy feedstocks is optimized considering macroinvertebrate-and fish-based stream health indicators, environmental emissions (total nitrogen, total phosphorus, sediments, and sequestrated carbon), and net economic return. Studies have been conducted on the potential for bioenergy feedstock cultivation [25][26][27], but they do not consider all the aforementioned aspects simultaneously. This work aims to take all three key dimensions (i.e., stream health, life cycle analysis, and economic feasibility) into account to optimize for the most sustainable bioenergy cultivation solution in the Honeyoey Creek-Pine Creek watershed in Michigan, U.S. The objectives of this study are to design a novel framework which incorporates the linkages and feedbacks among land use change due to biofuels feedstock production, stream health, environmental benefits, economic return, and optimization techniques. Multi-objective optimization within the framework aims to maximize the benefits of biofuel feedstock production, while minimize potential negative environmental consequences to generate quantitative support for consensus decision-making models. The three objectives of this study were to (1) characterize the mechanism(s) by which atmospheric, terrestrial, and in-stream variables affect aquatic ecological integrities (fish and macroinvertebrate communities) and water quality; (2) establish a framework that quantifies the overall environmental impacts of land use change; and (3) develop a multi-objective optimization tool for the selection and placement of bioenergy crops.

Study Area
The Honeyoey Creek-Pine Creek watershed (Hydrologic Unit Code 0408020202) is located in the Saginaw River watershed in Michigan, U.S. (Figure 1). The Saginaw River watershed is identified as an Area of Concern by the U.S. Environmental Protection Agency (EPA) [28]. Nonpoint pollution control is one of the main priorities for action, due to the high level of contaminated sediments detected and degraded fisheries observed in the watershed. Currently, agriculture is the dominant land use within the study area, taking up approximately 52% of the 1100 km 2 land area followed by forests (~23%), wetlands (~17%), pasturelands (~5%), and urban (~3%). The domination of agricultural land within the region has a profound impact on stream health, economy, and the surrounding environment.

Modeling Process Overview
This study focuses on bioenergy feedstock production in Michigan, the third most agriculturally diverse state in the U.S., which also has high interest in bioenergy feedstocks [29]. An overview of the models and optimization process can be seen in Figure 2. Four models (i.e., hydrological model, stream health model, life cycle assessment, and economic model) were used to generate three scores (i.e., stream heath, environmental, and economic scores) for multi-objective optimization of bioenergy feedstock production in the study area. Figure 2 shows the inputs and outputs associated with each model. First, a hydrological model was calibrated and validated using climatological and physiographical data for the Honeyoey Creek-Pine Creek watershed. Ten bioenergy crop rotations were then simulated by the hydrological model. The output of the hydrological model includes daily values for streamflow and sediment and nutrient (total nitrogen and phosphorus) loads that resulted from bioenergy crop selection and placement within the study area in a 12-year period. Then, key ecologically-relevant hydrologic indicators were obtained from the simulated daily streamflow time-series in order to feed the stream health model, which represents the impact on freshwater ecological integrity in terms of macroinvertebrates and fish-based indicators. A life cycle assessment was also conducted on all ten bioenergy crop rotations and paired with the corresponding sediment, total nitrogen, and total phosphorus loads from the hydrological model to develop an environmental score. An economic model was used to calculate the average annual net return associated with each bioenergy crop over a 12-year period. The three resulting scores were then fed into a multi-objective optimization algorithm, which generated a set of Pareto-optimal solutions. The developed GIS-based multi-criteria decision system and optimization can be used by policymakers and stakeholders to guide bioenergy crop selection and placement to maximize economic return while minimizing potential negative stream health and environmental impacts.

Hydrological Model Description
The impact of bioenergy crops on water resources was analyzed using the Soil and Water Assessment Tool (SWAT, rev. 622), which is a hydrological model developed by Arnold et al. [30] for the United States Department of Agriculture, Agricultural Research Service (USDA, ARS). SWAT uses both physiographical and climatological data to simulate water movement and pollutant transport through watersheds. In the Honeyoey Creek-Pine Creek watershed, the most important observations are changes in streamflow and movement of nutrients, pesticides, and sediments [30,31], which all vary with land use [32]. The inputs of SWAT are used to estimate erosion and sediment transport via runoff through the Modified Universal Soil Loss Equation (MUSLE) [33]. Additionally, the nitrogen cycle and phosphorus cycle are integrated to estimate the movement of nutrients via both runoff and lateral subsurface flow [33].
In order for SWAT to produce a reliable watershed model, data on both the physiographical and climatological characteristics of the watershed were required. First, data as related to the physical characteristics of the watershed, such as topography, soil type, and land use were acquired from different resources. Concerning the topography, the U.S. Geological Survey (USGS) National Elevation Dataset (NED) at 30-m resolution provided topographical information for the watershed [34]. Soil data were obtained through the USDA Natural Resource Conservation Service (NRCS) SSURGO database, which varies in scale of accuracy between 1:12,000 and 1:63,360 [35]. Land use data were provided by the USDA Cropland Data Layer for the study period at 30-m resolution [36]. Climatological data from the period of the study, 2003-2014, were also collected for model development from the National Climate Data Center (NCDC). Daily precipitation and maximum and minimum temperatures were used from 16 weather stations within the watershed [37].

Hydrological Model Calibration and Validation
Both calibration and validation were performed on the SWAT model using observed flow to ensure model accuracy. Calibration is the act of adjusting model parameters to produce outputs that are comparable to observed data within the same time-frame, whereas validation evaluates how the calibrated model parameters fit observed data outside of the time-frame used for calibration [30]. Data from 2003-2008 was used for calibration, while data from 2009-2014 was used for validation. Accuracy was evaluated using eight different statistical measures: Nash-Sutcliffe efficiency (NSE), Kling-Gupta efficiency (KGE), modified KGE-statistic (KGE'), ratio of the root mean square error to the standard deviation of measured data (RSR), index of agreement (d), coefficient of determination (R 2 ), percent bias (PBIAS), and root mean square error (RMSE cms) [38][39][40]. Observed values used for daily streamflow calibration/validation were collected from the Pine River Near Midland USGS gauging station (ID 04155500). At the same time, the results of the model calibration/validation are presented in Table S1 in Supplementary Materials. Due to the lack of water quality stations within the area of study, water quality model parameters were taken from a previously calibrated SWAT model for the Saginaw River watershed [41].

Bioenergy Crops Implementation
Ten bioenergy crop rotations were considered for agricultural land in the Honeyoey Creek-Pine Creek watershed. Corn, corn-stover, corn-soy rotation, corn-stover-soy rotation, soybean, corn-soy-canola, sorghum, sorghum-soy, miscanthus, and switchgrass were all considered. The SWAT model divided the watershed into subbasins, which were further divided into Hydrologic Response Units (HRUs). HRUs are geographic units with distinctive land use, soil, and slope features [42]. In this study, it was assumed that there was one HRU per subbasin and the ten bioenergy crops were the only crops grown in the Honeyoey Creek-Pine Creek watershed. This simplification allowed for the complex multiobjective optimization even though the crop distribution is not commonly observed in Michigan. Corn, corn-stover, corn-soy rotation, corn-stover-soy rotation, soybean, corn-soycanola, sorghum, sorghum-soy, miscanthus, and switchgrass were all considered. For each rotation, tilling practices were assumed to be conservative with disk cultivators followed by soil finishers. Corn, soy, and canola were assumed to be glyphosate resistant. Fertilizer rates were based on recommendations from Michigan State University Extension field studies [29,[43][44][45] rather than current soil makeup for each agricultural field. Additionally, synthetic fertilizers were the only assumed nutrient input. All fields were assumed to consist of 1.75% soil organic matter, as suggested for central Michigan [46].
Both corn and corn-stover consisted of five years of corn and one-year soybean, with the corn leaving the stover on the field during the winter dormant period and corn-stover harvesting 38% of post-corn grain harvest biomass and requiring additional fertilizer [47,48]. Continuous soybean consisted of two years of soy, followed by one year of corn. The cornsoy and corn-stover-soy rotation were similar with corn in the first year followed by soy in the second. Again, 38% of stover was harvested after corn grain harvest in the cornstover-soy rotations, which was supplemented with additional fertilizers. Phosphorus was applied bi-annually during the soybean year of the rotation and was considered to remain on the field for the corn, reducing potential leaching into waterbodies [49]. During the corn-soy-canola rotation, the corn and soy were planted similarly to the corn-soy rotation, with the third year being a winter canola variety planted in the fall, after soybean harvest, and harvested in the late spring [50]. Sorghum had an operation schedule similar to corn with five years of sorghum followed by one year of soy. Fertilizer, herbicide, and seeding rate for this crop were all based on suggestions from the University of Wisconsin Extension office [51]. Sorghum was planted in the late spring when the ground was between 15.5-18.5 • C about three weeks after corn [51]. The sorghum-soy rotation alternated sorghum and soy on a yearly basis with similar additives to the sorghum rotation and soy rotation. Both miscanthus and switchgrass were scheduled based on a 12-year lifespan with the first three years dedicated to the establishment [29]. Both perennials were planted in the first year then had supplementary seeding in the second year before becoming fully robust in the third. Herbicides, fertilizer, and seeding rate were all based on suggestions from MSU Extension [29].

Stream Health Model
Fuzzy logic methods are robust for modeling stream health, due to their ability to model complex and non-linear systems [52][53][54]. The fuzzy logic methods use graphical membership functions (MFs) to describe the importance of a degree of membership from 0 (no membership) to 1 (full membership). An adaptive neuro-fuzzy inference system (ANFIS) is a fusion method, which uses artificial neural networks (ANN) to build optimal fuzzy logic (MFs) [55]. This method was used for this study due to being advantageous for evaluating ecological systems [56,57]. Previously developed ANFIS models produced by Woznicki et al., 2016 were adapted for the Honeyoey Creek-Pine Creek study area. Both linear and non-linear shapes were tested for best fit, including trapezoidal, triangular, Gaussian, Gaussian composite, and generalized bell. The most fitting shape was generated as a final set of linear and constant MFs, which were extracted as the stream health score for each scenario. Since flow is a "master variable" in ecohydrological studies as flow regime alteration can significantly impact stream health and aquatic life [58], ecologically-relevant hydrologic indicators were identified as key explanatory variables for predicting different macroinvertebrate-and fish-based indicators.
The macroinvertebrate indicators included the number of Ephemeroptera, Plecoptera, and Trichoptera (EPT) taxa, Family-Level Index of Biotic Integrity (FIBI), and the Hilsenhoff Biotic Index (HBI). For fish, the Index of Biotic Integrity (IBI) was used. The indicators were determined for the model through careful analyzation of 262 macroinvertebrate and 193 fish monitoring sites in the Saginaw River watershed [59,60]. Five MF shapes and two to four MFs were tested for each output (EPT, FIBI, HBI, and IBI). EPT taxa is a commonly used indicator for gauging the sensitivity of the taxa to varying pollutants [61], where lower values represent potential aquatic community degradation [54]. FIBI is similar by measuring pollution levels based on family-level taxa between the range of 0 to 45, again from poor to excellent, respectively [54]. HBI classifies stream health from excellent to very poor, 0 to 10, respectively, based on chemical properties [62]. Finally, IBI is a widely used indicator, which accounts for physical and chemical stream properties related to fish based on levels of human disturbance [63]. IBI is evaluated between the range of 0 to 100 from high to low levels of disturbance [54].

Life Cycle Analysis
A model developed by Thelen et al. [46] was used to calculate the carbon footprint of each bioenergy crop rotation. The LCA model uses values observed at the Great Lakes Bioenergy Research Center in addition to other well-documented values to determine the carbon footprint of synthetic fertilizer additions [64][65][66], herbicides and insecticides [67], tilling, machine usage, drying, and transporting [68]. Carbon footprint of conversion within the biofuel refinery and consumer combustion were not taken into account for the feedstocks due to a lack of established values. Parameters from the SWAT bioenergy feedstock operation schedule were used to calculate the carbon footprint of each bioenergy feedstock. Soil organic matter was calculated as suggested by model developers, and labor and fuel demands were estimated based on averages from the Michigan Agricultural Statistics 2013-2014 and MSU Extension 2018 Custom Machine and Work Rates [69,70]. An example of the Soybean LCA can be seen in Table S2 in Supplementary Materials.
Each crop was modeled for one year and one hectare, except for miscanthus and switchgrass, which were modeled for three separate years. For field crops, each year was assumed to be approximately the same and was multiplied based on the aforementioned rotation schedules. The twelve-year average GHG footprint was calculated in kilograms of sequestrated carbon per hectare (kg/ha) for each rotation. The first two years of switchgrass and miscanthus were different due to added inputs for the establishment period; then, the third year was assumed to represent fully established crops for the following ten years. According to the USDOE, GHG emissions associated with land use change must be considered when evaluating the carbon footprint of bioenergy crops, which was accounted for in the model based on changes in soil organic matter from tilling practices, biomass changes, and perennial versus annual crops [71].
Yields for each crop were estimated based on observed values in Michigan or similar climate. The corn and soybean estimated yields were based on the average yield of Michigan farms from the Michigan Agricultural Statistics 2013-2014 [66]. Corn-stover yield was calculated using 38% of the estimated biomass production by Zea mays L., which is assumed not to change with latitude [46]. Further, 38% removal is suggested to be a sustainable rate of removal, especially when paired with reduced till or soybeans within the rotation [43]. Canola yield was based on MSU Extension values [50]. Sorghum yield was estimated based on the average MSU Extension guideline [72]. Both switchgrass and miscanthus were assumed to have steady yields after three years of establishment with 90% harvesting efficiency and the first three years yielding 0%, 50%, and 100%, respectively [29].

Economic Analysis
In efforts to be consistent with the stream health index and carbon footprint data, each rotation was evaluated economically for 12 years (2003-2014). For each rotation, the annual net return was calculated based on USDA Economic Research Service (ERS) national data for commodity crops [73], or budgets produced by MSU Extension [74]. None of the net returns included government program benefits and the bioenergy feedstock value was assumed to be comparable to national commodity crop value for the given years. An example economic analysis of soybeans can be seen in Table S3 of the Supplementary Materials. USDA ERS data contain average farm cost and revenue information on a per hectare basis for commodity crops in the Northern Crescent region, which includes Michigan [75]. Annual net returns for corn, corn-stover, soybeans, and sorghum were extracted from the period between 2003 and 2014. Factors including the gross value of production, operating costs, and allocated overhead were included for each crop. Budgets produced by MSU Extension for switchgrass and miscanthus were modified to include interest on operating capital, land cost, labor cost, insurance, and marketing. Net return for canola was calculated using seed and gross value of production data from the USDA ERS [73]. Canola yield was based on annual ton/ha observed in Ontario for respective years [76], and the rest was assumed linear based off the example budget produced by the University of Tennessee for canola cultivation with conventional tillage [77].
Each net return was calculated on an annual, per hectare basis. The average value was taken over the 12 years and extracted for application on the agricultural land within the Honeyoey Creek-Pine Creek watershed. The full economic analysis can be seen in Table S4 in the Supplementary Materials. Final net return associated with each crop rotation was taken into consideration by the multi-objective optimization tool for each subbasin.

Multi-Objective Optimization
A multi-objective optimization problem was formulated to identify the optimal spatial distribution of bioenergy feedstock rotations and placements within the area of study, while simultaneously considering stream health, environment, and economic aspects of production. For this problem, there were 111 decision variables (one for each agricultural subbasin in the study area). Each decision variable could take a discrete value between 0 and 9, where each value represented a particular crop rotation, thereby making 10 different options for crop rotation in each subbasin. Therefore, 10,111 spatial combinations of crop rotations comprised the entire search space, and an optimization algorithm, rather than a manual analysis, was necessary to identify the optimal or, near-optimal solutions.
Three objective functions, one for each criterion, were formulated based on (1) simulations of stream health indicators, (2) water quality loads (i.e., total nitrogen, total phosphorus, and sediments) along with estimates for carbon footprint, and (3) net return. Each objective was obtained based on spatially aggregated values for the entire area of study. Thus, weighted-average values for EPT taxa, HBI, FIBI, IBI, sediments, total nitrogen, total phosphorus, carbon footprints, and net returns were obtained using the subbasins' surface areas as weights.
The stream health objective function was determined by aggregating the individual stream health indices into an overall score. For this purpose, each stream health index was normalized to having values between 0 and 1, with 0 representing excellent stream health conditions and 1 representing poor conditions. The overall stream health score was determined as the average of the normalized EPT taxa, HBI, FIBI, and IBI indices. Likewise, the environmental objective function resulted from an aggregation of normalized water quality and carbon footprint estimates. For this case, all water quality values (i.e., total sediment, total nitrogen, and total phosphorus) were given a weight of 1/6 each, whereas carbon footprints were given a weight of 1/2. Similar to the stream health score, values close to 0 represented excellent environmental conditions, whereas values close to 1 represented poor conditions. Finally, the third objective function was derived directly from estimated net return. This objective function was also normalized to have values between 0 and 1, with 0 being very profitable and 1 the opposite. Extreme values for normalizing the individual stream health indices are well defined and were obtained from the literature [78]. The maximum value for EPT normalization was taken as 15 based on observed values in the Saginaw River watershed. Meanwhile, the extreme values required for normalizing water quality loads, carbon footprint, and cost values were determined after running the models and comparing the results when implementing a particular crop rotation everywhere. As a result, the original multi-objective optimization problem of maximizing profit and stream health while minimizing the negative environmental impacts was transformed into a minimization problem.
The Unified Non-Dominated Sorting Genetic Algorithm III (U-NSGA-III) [79] was implemented to address the multi-objective minimization problem stated above. U-NSGA-III is a recently developed algorithm capable of solving single-, multi-, and many-objective problems. This algorithm is an extension of the NSGA-III algorithm [80], which is a population-based method based on reference directions, non-domination sorting, and evolutionary operators (i.e., recombination and mutation) for identifying Pareto-optimal solutions. It is worth mentioning that these algorithms have been successfully applied in water resources and crop modeling problems [81][82][83]. The U-NSGA-III algorithm was implemented in python 3.7 using the pymoo library [84]. The algorithm parameters chosen here are standard and recommended, and are given as follows: (i) the number of reference directions (assigned here equal to 100), (ii) the stopping criteria (set as a maximum number of 680 generations), (iii) recombination and mutation probabilities (assigned as 0.9 and the reciprocal of the number of decision variables, respectively), and (iv) distribution indices for each genetic operations (i.e., Simulated Binary Crossover-SBX, and polynomial mutation, with values of 10 and 20, respectively). It is worth mentioning that the initial population was comprised of 90 randomly generated rotation combinations plus 10 predefined combinations where a particular crop rotation was implemented everywhere within the study area. A python interface to SWAT and stream health ANFIS models in MATLAB was also developed for executing the multi-objective optimization framework. Well-spaced reference points for generating the reference direction of step (i) were obtained using the recently developed Riesz s-Energy method [84], which is available in the pymoo library [84].
The convergence to a near-optimal Pareto front was analyzed using the Hypervolume indicator and the number of Pareto points at the end of each generation (see Figure 3). Hypervolume measures the collective volume of dominated region by the obtained solutions in the objective space. A set of Pareto solutions is considered better, if the Hypervolume is higher. The Hypervolume indicator was used to analyze how well the method performed at finding Pareto-optimal solutions at the end of each generation [85]. Eventual convergence of the Hypervolume indicator and number of Pareto points with generations indicate that the Pareto front has stabilized well at the end of the optimization run. The goal for the optimization was to attain convergence in the Hypervolume indicator while generating the maximum number of Pareto points possible from the given parameters. In order to obtain both the Hypervolume and the number of Pareto points for each generation, the following procedure was performed: (i) at the end of each generation, a Pareto front was obtained from the parent population of the current generation; (ii) a new set of solutions was created containing both the Pareto front from the previous step and the Pareto front from the previous generation; (iii) a new Pareto front was determined from the set of solutions of the previous step; (iv) from the new Pareto front, the closest solutions to 500 reference directions were selected; (v) the hypervolume indicator and the number of Pareto points was computed using the filtered set of solutions from the previous step. The filtered Pareto front for the final generation was employed for the subsequent decision-making analyses.

Most-Preferred Trade-Off Solutions
Finding a set of Pareto solutions by a multi-objective optimization method is a great feat, but it must also be followed by a multi-criterion decision-making (MCDM) task to select one or a few preferred solutions. This task usually involves stakeholders and decision-makers. To identify the most-preferred trade-off solutions for the biofuel feedstock production problem, three approaches were implemented: (i) compromise programming using the 2 metric [86], (ii) pseudo-weight method [87], and (iii) knee points identification [88].
In the compromise programming approach, the p metric is computed for each solution, representing the distance between that solution and a reference point. This metric is computed as follows: where M is the number of objectives, f m is the value for the m objective function for the solution of interest, z m is the value for the m objective function for the reference point, and p is a user-defined parameter (>0). When p is 2 in Equation (1), 2 is simply the Euclidian distance. In general, the reference point is chosen as the ideal point, which in this problem is the origin. The best trade-off solution is the one that is closest to the ideal point (i.e., solution with the lowest 2 metric). It is worth noting that before applying Equation (1), the objective functions are standardized to values between 0 and 1.
In the pseudo-weight method, a weight vector is obtained for each Pareto-optimal solution. The i-th element in the vector represents the weight (or importance) of the i-th objective function. The sum of all elements in the vector is forced to one, so that the pseudo-weight vector represents the relative importance of each objective function for a particular solution. Equation (2) is used to compute the pseudo-weight w i for the i-th objective function: where f min i and f max i are the minimum and maximum values of i-th objective function, respectively. The weight is in proportion to the normalized difference of i-th objective value from its maximum value. This means that solutions that are closer to the minimum objective value has a higher weight value for that objective, loosely allocating a higher preference of the objective to the solution. In this study, the solution with the most balanced pseudo-weight solution was chosen as the solution having the closest target pseudo-weight vector of (1/3, 1/3, 1/3). Any other target vector, preferring a particular objective over others, may also be used. In this study, three additional preferred solutions were obtained by using corresponding target vectors where one objective had twice the importance of any other objective (i.e., Knee points refer to Pareto solutions that show high trade-offs, well above the average trade-off of the entire Pareto front. In other words, they are points where a small gain in one objective function in moving to a neighboring solution comes from a high loss in at least one other objective function. This does not support the move, hence a natural preference to select a knee point for a pragmatic reason. For this approach, the high tradeoff calculation procedure, proposed by Rachmawati and Srinivasan [88] and included in the pymoo library, was implemented. In this procedure, high trade-off points were declared as the ones presenting trade-offs with neighboring solutions above an upper threshold. This threshold was defined as the average trade-off plus twice the standard deviation using the entire Pareto-optimal set of solutions. After a number of knee points (high trade-off points) are identified, the one having the most balanced pseudo-weight vector among them was selected.

Economic, Environmental, and Stream Health Impacts of the Best-Case Scenarios
The initial population comprised of 90 randomly generated combinations and 10 predefined combinations for crop rotations evolved into a well-spread and evenly distributed convex Pareto front (see Figure S1 in Supplementary Materials and Figure 4). Figure 3 indicates that U-NSGA-III progressed towards a near-optimal Pareto front given the stable behavior of both the Hypervolume indicator and number of Pareto-optimal points at the end of the optimization run. The final Pareto front (Figure 4) is comprised of 233 optimal solutions, and was obtained after 680 generations (i.e., 68,000 function evaluations). An almost uniform distribution of objective vectors in the front is evident from the figure.
The Pareto front was further divided into three clusters showing the closeness of the Pareto-optimal solutions to each individual objective optimum (i.e., economic, environmental, and stream health). Clusters of solutions for the three objectives were generated using the k-means clustering approach, which gathers points around the mean of all points [89]. Figure 4 also shows the most-preferred solutions obtained using the compromise programming approach with equally balanced pseudo-weights, and the most balanced knee point. All these compromise solutions fall within the stream health cluster. As can be seen in Figure 4, the clusters are well distributed almost equally, with the environmental cluster consisting of 93 solutions, the stream health cluster consisting of 78 solutions, and the economic cluster consisting of 62 solutions. The solutions obtained for each cluster are further analyzed to capture their central tendency and dispersion ( Figure 5). In general, the economic solutions were observed to have the highest variability, whereas the stream health was observed to have the lowest. The stream health indicators were calculated based on hydrologic indicators (i.e., water quantity), and the median of all three clusters was observed to be within 1 point of one another for all four stream health indicators (Figure 5a-d). Therefore, the small variations in stream health indicators were expected as the change in agricultural land use from food production to bioenergy feedstock production does not significantly alter the flow regimes of the nearby rivers and streams. Concerning the best solutions for the stream health (SH) cluster, the best median values for the EPT, HBI, and IBI indicators were observed under this condition (SH) compared to the environmental (EN) and economic (EC) cluster solutions (Figure 5a-c). The only indicator that ranked second-best was the FIBI indicator, as seen in Figure 5d. The high scores from the EPT, HBI, and IBI balance out the slightly lower FIBI score given that all four values held equal weight within the optimization.
Both the median and interquartile range of the SH cluster for the four indicators were very similar to the mean and interquartile range of the EN cluster. These similarities can also be observed in the higher density of solutions in the EN and SH clusters within Figure 4 and may be explained by the lower sediment and nutrient loads to which the taxa of macroinvertebrates are highly sensitive (see Figure 5a,b) [53,90]. The EN cluster held the optimum solution for each variable besides total nitrogen, where it had the highest median load and EC cluster had the lowest (Figure 5f). This is most likely a result of the higher biomass producing crops, such as miscanthus and corn, requiring higher amounts of nitrogen inputs than legumes, which fix their own nitrogen through a symbiotic relationship with nitrogen-fixing bacteria [48]. Finally, soybeans, a legume, were observed to be the most profitable during the time period used for this study (Table S4 in Supplementary Materials), but their low biomass could explain the poor EPT, HBI, IBI, and FBI scores and higher sediment loss and nutrient loads of the EC cluster (Figure 5a-h). Crop rotations observed within the EN cluster solutions had lower sediment erosion and higher carbon sequestration than SH or EC clusters, likely due to the increased quantity of high biomass producing crops, which was also observed by Lemus and Lal [91].

Distribution of Bioenergy Feedstocks within the Watershed for Most-Preferred Trade-Off Solutions
Three most-preferred trade-off solutions from the Pareto front were determined through different methods. First, the Euclidean norm approach, was applied to identify the most-preferred trade-off solution based on the distance of the points to the origin. The pseudo-weight approach was also applied, which determined the best solution that was closest to a uniform pseudo-weight value of 1/3 for each objective [92]. Finally, the most balanced high trade-off solution was determined by finding the knee point with the most balanced pseudo-weight vector ( Figure 4). Figure 6 compares the rotation distribution of the calculated most-preferred tradeoff solutions (Figure 6g-i), the preferred trade-off solutions when one of the pseudoweights has twice the importance of any of the other two (Figure 6d-f), and the individually optimum solution for stream health (Figure 6a), environment (Figure 6b), and economy ( Figure 6c). It can be observed that miscanthus was beneficial for the environment (Figure 6b), accounting for all of the agricultural land. Love and Nejadhashemi [8] also observed that significant shifts towards miscanthus production resulted in the lowest levels of total sediment and phosphorus loads. Additionally, Agostini et al. [93] observed the highest level of carbon sequestration with miscanthus among studied bioenergy crops. On the other hand, soybeans were the most profitable as represented in the optimal economic rotation distribution (Figure 6c), with the entire optimal economic watershed observed to be covered by soybeans. The stream health and all six preferred trade-off solutions represented diverse rotation distributions over the entire watershed. Even though they included varying distributions of different crops, miscanthus was commonly distributed over all scenarios, besides the optimal economic, on the south edge of the watershed. It can be observed in Figure 6b,c,e,f that the solutions across the Pareto front varied in a distinctive pattern along the left boundary between the best environment and best economic solutions. Even though the environmentally and economically optimal solutions consisted of monocrops, the preferential environment solution was observed to have an almost equal split sorghum-soybean to miscanthus and the preference for profit was balanced with soybean, sorghum-soybean, corn-soybean, and miscanthus (Figure 6e,f). The points along the front's boundary were observed to be interspersed with miscanthus within the environmental cluster and completed with sorghum-soybean within the economic cluster until a small void was reached ( Figure 4). On one side of the void were solutions with primarily sorghum-soybean, while on the other side was primarily soybean. The sorghum-soybean rotation was therefore rated relatively highly for environmental benefits whilst also profitable in this study, which follows the observations of Sperow [94] who also observed higher soybean and sorghum carbon and economic benefit.
Similar rotation distribution patterns can be observed on the right side of the front with Figure 6a,d,g,i having corn-stover similarly placed towards the east side of the map in each of the solutions. Aside from corn-stover and miscanthus in the middle of the watershed, the optimal solutions close to the best stream health solution were observed to be quite diverse in comparison to the solutions from the left side of the front ( Figure 6). Stream health scores were calculated primarily based on flow regimes as a result of previously performed sensitivity analyses [95]. Flow regimes have been observed to change the most significantly with increased urbanization or agriculture, neither of which were simulated within this study. This led to a very small margin of variation between the stream health scores and low impact of specific crop rotations on the individual stream health scores.
The solutions represented in Figure 6 shows the benefit of using multi-objective optimization in bioenergy feedstock rotation planning. The best solution for each objective (Figure 6a-c) were observed to be different than the best trade-off solutions (Figure 6d-i) with the single objective optimal solutions presenting less biodiversity than the best tradeoff solutions. Our results were supported by findings showing that biological diversity is important for both ecosystem [96] and risk aversion associated with monocultures [97]. The best trade-off approach from multi-objective optimization provided solutions that were sustainable for the region through decreased negative impacts to stream health and the environment and increased economic impact, overall increasing the watershed's sustainability and resiliency.

Comparison of Evaluation Metrics between Optimal Solutions and Current Land Use Scenario
The goal of this section of the study is to compare the differences between the individual evaluation metrics of the complete current agricultural land use to the optimized individual and most-preferred trade-off solutions. Table 1 shows the percent difference for each evaluation metric between the current agricultural practices in the watershed, the most-preferred trade-off solutions and three extreme solutions.
The current watershed agricultural makeup is 0.001% field pea, 0.004% sugar beet, 5% alfalfa, 6% winter wheat, 17% soybean, and 72% corn. Currently, only 5% of the Honeyoey Creek-Pine Creek watershed is covered by perennial crops, which can be observed by the 81% decrease in sediment load and 601% increase in carbon sequestration from the best environment solution (Table 1). Interestingly, the total nitrogen load was higher in all of the calculated optimal solutions than the reference scenario (Table 1). This could be attributed to the additional food crops grown in the watershed not considered within this study, such as alfalfa, which is a legume and maintains year-round groundcover [98]. Compared to the three best trade-off solutions, slight water quality change was observed with improvement from the EPT, HBI, and IBI indices and degradation of the FIBI score, which follows the minimum changes in stream health score also observed by Herman et al. [78]. Table 1 also shows the importance of taking multiple objectives into account when optimizing bioenergy feedstock landscape within a watershed. Both the net sequestrated carbon and net return column showed the greatest magnitude of percent change between the current and optimal solutions. The greatest net sequestrated carbon change was observed to coincide with the greatest degradation of profit of the best environment solution, while the least amount of change for both metrics coincided with the best economic solution. All of the Pareto-optimal solutions, best stream health, and best environmental solution included perennial bioenergy feedstocks (Figure 6), which currently hold a much lower market value than the other crops (Table S4 in Supplementary Materials). Glithero et al. [99] showed that in order to have the greatest carbon sequestration, high biomass perennial bioenergy feedstocks would have to significantly increase in value, which was matched with our observations within Table 1. Trade-off solutions are therefore very valuable for decision-makers when evaluating agricultural land use within a watershed in the current climate and economy.

Conclusions
Multi-objective optimization has been shown to be useful for applications where a decision is influenced by multiple factors. In this paper, the multi-objective optimization approach was shown to be effective in generating a set of individual optimal solutions for bioenergy feedstock production within a watershed-based on stream health, environmental, and economic criteria. Up to this point, bioenergy feedstock application has been based on satisfying only one or two objectives. Optimizing for multiple objectives ensures a solution is chosen that is sustainable for the earth, decision-makers, and economy.
To show the potential of this approach, nine trade-off solutions were determined. The first three solutions put 100% weight on each of the three objectives separately, whereas the remaining six put equal weight on all three objectives. After analyzing the rotation distribution for each of the trade-off solutions, it was noted that miscanthus had the biggest environmental impact, whereas soybean had the largest net return. In comparison, the stream health and most-preferred trade off solutions were observed to be diverse with at least nine out of ten rotations considered for application over the entire watershed.
The Pareto front was found to have discontinuities, particularly near small economic score solutions. The study revealed that contrary to general practices of optimizing a linearly weighted sum of objectives, the use of a global evolutionary multi-objective optimization has a better scope of addressing such irregularities. The best economic solutions usually exhibit non-linear properties and therefore finding a representative Pareto front by an efficient nonlinear optimization algorithm provides decision-makers a comprehensive and flexible tool for choosing a single most-preferred solution. Patterns in the front were also identified with common rotations found within each cluster. These patterns will allow decision-makers to understand the trend of rotations for specific goals for their bioenergy feedstock rotation problem.
The Pareto front contains an unbiased array of optimal solutions. In this study, we showed the benefit of using the Pareto front as a decision-making tool through selecting the most-preferred trade-off solutions based on decision-maker's preferences. Importantly, we have also shown that due to the availability of multiple Pareto solutions, they can be analyzed to extract key insights about the common properties of them, which can stay as domain-knowledge for future applications. In addition, we demonstrated the use of other decision-making approaches to select a preferred solution from the Pareto front. One popular idea is for the decision-maker to provide an aspiration point containing three desired objective values and then identify the Pareto solution which is closest to the supplied aspiration point. This paper showed the potential for multi-objective optimization on determining bioenergy feedstock rotations on agricultural land within a watershed. In this approach, each subbasin was assumed to apply only one bioenergy feedstock rotation over the entire area for the 12-year period. Additionally, it was assumed that all agricultural land within the watershed would be used for bioenergy feedstock production. Finally, the LCA was performed within the scope of seed to harvest rather than well-to-wheel. Further research is necessary to make this optimization more applicable to agricultural areas within the US and quantify emission savings by incorporating other factors affecting bioenergy crop expansions, such as food security and societal and political perceptions.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.