Establishment of Regional Phytoremediation Buffer Systems for Ecological Restoration in the Great Lakes Basin, USA. I. Genotype × Environment Interactions

Poplar remediation systems are ideal for reducing runoff, cleaning groundwater, and delivering ecosystem services to the North American Great Lakes and globally. We used phyto-recurrent selection (PRS) to establish sixteen phytoremediation buffer systems (phyto buffers) (buffer groups: 2017 × 6; 2018 × 5; 2019 × 5) throughout the Lake Superior and Lake Michigan watersheds comprised of twelve PRS-selected clones each year. We tested for differences in genotypes, environments, and their interactions for health, height, diameter, and volume from ages one to four years. All trees had optimal health. Mean first-, second-, and third-year volume ranged from 71 ± 26 to 132 ± 39 cm3; 1440 ± 575 to 5765 ± 1132 cm3; and 8826 ± 2646 to 10,530 ± 2110 cm3, respectively. Fourth-year mean annual increment of 2017 buffer group trees ranged from 1.1 ± 0.7 to 7.8 ± 0.5 Mg ha−1 yr−1. We identified generalist varieties with superior establishment across a broad range of buffers (‘DM114’, ‘NC14106’, ‘99038022’, ‘99059016’) and specialist clones uniquely adapted to local soil and climate conditions (‘7300502’, ‘DN5’, ‘DN34’, ‘DN177’, ‘NM2’, ‘NM5’, ‘NM6’). Using generalists and specialists enhances the potential for phytoremediation best management practices that are geographically robust, being regionally designed yet globally relevant.


Introduction
Ninety-five percent of the United States' surface freshwater and 20% of the world's freshwater reserve are contained within the Great Lakes Basin [1,2]. The Basin provides ecosystem services (including clean drinking water) to over 34 million people in the United States and Canada [3][4][5], and contributes substantially to the United States' economy, with a gross regional product (GRP) estimated at nearly 4.1 trillion USD [6]. Anthropogenic dictability in performance that can complicate selection of proper genotypes for commercial use. Therefore, in tree breeding, multi-environmental trials (MET) are implemented to evaluate the degree and pattern of G × E interactions, as well as to test the robustness of genotypes in different environments [45,46].
However, sometimes phenotypic plasticity can hinder expression of G × E interactions. For example, testing of hybrid poplar clones for growth performance and robustness at different sites in the Midwestern USA showed low G × E interaction and high plasticity of tested genotypes, indicating their suitability for growing on a wider spectrum of habitats [46,47]. In poplar research, G × E interactions and definitions of generalists vs specialists have been studied for various traits ranging from biomass production [48][49][50] and rooting ability [51], to physiological traits related to productivity [36] and drought tolerance [52], to wood properties [53,54]. Often, G × E interactions are also determined by the genomic groups of the hybrids, where components of different Populus species (e.g., P. deltoides Bartr. ex. Marsh, P. maximowiczii A. Henry, P. nigra L., P. trichocarpa Torr. et Gray) significantly contribute to the expression of specific traits by a given genotype, and are reflected in rooting ability and adaptability to certain climate, latitude or soil-water conditions [48,49,55,56]. Therefore, clonal testing to maximize understanding of G × E interactions is paramount in establishing successful poplar plantings for a broad range of end uses and ecosystem services, including environmental objectives such as phytoremediation. For example, optimizing G × E interactions may be most appropriate for small-scale applications with site-specific needs, while minimum G × E interactions are needed to deploy superior and robust clones at a justifiable cost for large-scale commercial systems [46].
The current study is a component of a regional phytoremediation testing network originally funded by the Great Lakes Restoration Initiative (GLRI) [2] to select and test the ability of new poplar genotypes to reduce surface runoff and prevent groundwater contamination at landfills and similar sites in the Great Lakes Basin. We implemented phyto-recurrent selection, a stepwise greenhouse-and field-based approach that combines crop and tree improvement strategies to evaluate, identify and select superior-performing clones based on multiple testing cycles [32,57]. Results of the initial screening of candidate clones in greenhouse experiments, previously published by Rogers et al. [58], were used to develop the MET testing network consisting of sixteen phytoremediation buffer systems (i.e., phyto buffers) at ten sites located in the Lake Superior (i.e., Michigan's Upper Peninsula) and Lake Michigan (i.e., eastern Wisconsin) watersheds. Our objective was to test for differences in genotypes, environments, and their interactions for health, height, diameter, and volume during early field establishment (i.e., from one to four years after planting). Our results identify poplar clones with maximum phytoremediation potential that can be established across a wide range of environmental conditions. These data are useful for clonal selection to maximize phytoremediation potential in future phytotechnologies, regardless of specific site conditions or genotypes deployed.

Site Description
A regional phytotechnologies network consisting of sixteen phytoremediation buffer systems (i.e., phyto buffers) was established in 2017 (×6 phyto buffers), 2018 (×5), and 2019 (×5) in the Lake Superior watershed of the Upper Peninsula of Michigan, USA and the Lake Michigan watershed of eastern Wisconsin, USA. Multiple phyto buffers were installed at some locations, resulting in ten field testing sites throughout the network (Figure 1). The sites ranged in latitude from 46.7840 to 42.8382 • N and in longitude from −89.1291 to −86.5976 • W, which is consistent with poplar productivity supplysheds in the region [47,[59][60][61]. Site-related climate properties are shown in Table 1. Twentyyear historical monthly averages for precipitation, temperature, and drought were determined across each growing season (April to October) for the time period 2000 to 2020 and summed (precipitation) and averaged (temperature, drought) to obtain final annual values. Based on the nearest weather station to each site, precipitation (P, mm) along with maximum (T max , • C) and minimum (T min , • C) air temperatures were obtained from the National Oceanic and Atmospheric Administration (NOAA) National Climate Data Center (https://www.ncdc.noaa.gov/cdo-web/, accessed on 20 January 2021). Average temperature (i.e., T avg = ((T max + T min )/2), • C) and the difference between maximum and minimum temperatures (i.e., T diff = T max − T min , • C) were calculated for each site. Daily growing degree days (GDD day ) were calculated as the average temperature minus a base temperature of 10 • C (i.e., GDD day = T avg − 10 • C) and summed across each growing season. Average annual growing degree days for each site (GDD annual ) were estimated by averaging the seasonal GDD values from 2000 to 2020. The United States Drought Monitor (https://droughtmonitor.unl.edu/, accessed on 20 January 2021) was accessed to obtain drought index scores according to percent area within each county belonging to four drought index categories: D0 (abnormally dry), D1 (moderate drought), D2 (severe drought), D3 (extreme drought), and D4 (exceptional drought). Categories D3 and D4 were negligible and, therefore, not reported in Table 1. Buffer-specific soil properties were acquired from the USDA Natural Resources Conservation Service (NRCS) Web Soil Survey (https://websoilsurvey.sc.egov.usda.gov/, accessed on 20 January 2021) and are provided in Table 2. region [47,[59][60][61]. Site-related climate properties are shown in Table 1. Twenty-year historical monthly averages for precipitation, temperature, and drought were determined across each growing season (April to October) for the time period 2000 to 2020 and summed (precipitation) and averaged (temperature, drought) to obtain final annual values. Based on the nearest weather station to each site, precipitation (P, mm) along with maximum (Tmax, °C) and minimum (Tmin, °C) air temperatures were obtained from the National Oceanic and Atmospheric Administration (NOAA) National Climate Data Center (https://www.ncdc.noaa.gov/cdo-web/ accessed on 20 January 2021). Average temperature (i.e., Tavg = ((Tmax + Tmin)/2), °C) and the difference between maximum and minimum temperatures (i.e., Tdiff = Tmax−Tmin, °C) were calculated for each site. Daily growing degree days (GDDday) were calculated as the average temperature minus a base temperature of 10 °C (i.e., GDDday = Tavg−10 °C) and summed across each growing season. Average annual growing degree days for each site (GDDannual) were estimated by averaging the seasonal GDD values from 2000 to 2020. The United States Drought Monitor (https://droughtmonitor.unl.edu/ accessed on 20 January 2021) was accessed to obtain drought index scores according to percent area within each county belonging to four drought index categories: D0 (abnormally dry), D1 (moderate drought), D2 (severe drought), D3 (extreme drought), and D4 (exceptional drought). Categories D3 and D4 were negligible and, therefore, not reported in Table 1. Buffer-specific soil properties were acquired from the USDA Natural Resources Conservation Service (NRCS) Web Soil Survey (https://websoilsurvey.sc.egov.usda.gov/ accessed on 20 January 2021) and are provided in Table 2.

Clone Selection
Phyto-recurrent selection was conducted through a polycyclic evaluation process to choose superior genotypes for out planting [32,58,62]. For each phyto buffer, soils were collected in the field, brought to the USDA Forest Service, Institute for Applied Ecosystem Studies in Rhinelander, Wisconsin, USA (Figure 1), and processed (i.e., dried and sifted to remove large debris) for use as soil treatments during three greenhouse selection cycles [58]. The number of clones tested in each cycle decreased from 140 (cycle 1) to 60 (cycle 2) to 15 (cycle 3) based on multiplicative rank summation indices incorporating tree survival, growth, physiology, and health [63]. Twelve clones were selected for cycle 4 field testing, resulting in three buffer groups related to phyto buffer establishment in 2017, 2018, and 2019. Phyto buffers within buffer group years had the same twelve clones, but the twelve clones differed among the three buffer groups based on cycle 3 phyto-recurrent selection results. In the current study, separate analyses were conducted for each buffer group. Each set of buffer group clones consisted of genotypes that have: (1) been commonly used for commercial and/or research purposes in the region (i.e., Common clones), (2) a rich history of testing but are still at the experimental stage (i.e., Experimental), and (3) been bred, tested, and selected at the University of Minnesota Duluth, Natural Resources Research Institute (NRRI) and show promise for broad-ranging applications (i.e., NRRI) [46,56]. Specific clones and their genomic groups are listed in Table 3. Table 2. Soil properties of sixteen phytoremediation buffer systems (i.e., phyto buffers) comprising a regional phytotechnologies network established from 2017 to 2019 in the Lake Superior watershed of the Upper Peninsula of Michigan, USA and the Lake Michigan watershed of eastern Wisconsin, USA.    (Table S1). Prior to planting, rocks and other large obstructions were removed and the soil was tilled to a depth of 30 cm. Thus, at planting, there was no competition from weeds and/or grasses. To reduce potential competition effects over time, subsequent site maintenance throughout the study period included: (1) tilling to a depth of 30 cm, (2) hand weeding to a minimum diameter of 0.61 m around each individual tree, and (3) removing rocks and other obstructions. For each phyto buffer, a minimum of one maintenance cycle per month throughout each growing season was conducted.
At each of the six (2017) or five (2018, 2019) phyto buffers, trees were planted in a randomized complete block design (RCBD) with eight blocks and twelve clones at a spacing of 2.44 m × 2.44 m (i.e., 1680 trees ha −1 ). Due to field space constraints, four blocks were planted at Slinger, Wisconsin. Clones were arranged in randomized complete blocks to minimize effects of potential environmental gradients, especially those related to runoff and soil physico-chemical properties. Two border rows were established on the perimeter of each phyto buffer to reduce potential border effects [64,65]. All phyto buffers were fenced using 2.3 m tall Trident extra strength deer fencing (Trident Enterprises, Waynesboro, PA, USA) to eliminate potential impacts from white-tailed deer (Odocoileus virginianus Zimmerman) browse. At the end of the first growing season, tree survival across all phyto buffers and clones was 97.1 %, with 95.5%, 96.2%, and 99.6 % survival for the 2017, 2018, and 2019 buffer groups, respectively. All trees that died were replanted with the same genotype to ensure full stocking of 1680 trees ha −1 . However, the replanted trees were not included in the analyses below.

Field Measurements
At the end of each growing season, tree height to the nearest 0.1 m was measured from the ground to the apical bud. Diameter was measured at 10 cm above the soil surface for one-and two-year-old trees and at breast height (i.e., DBH at 1.37 m) for three-year-old trees. All diameter estimates were determined to the nearest 0.1 cm. Tree volume (V) was calculated from height (H) and diameter (D; including one-and two-year diameter and DBH) using the model: V = D 2 × H [66]. Starting in 2020, trees of the 2017 buffer group were too tall to continue height measurements at the requisite precision noted above. As a result, 2020 DBH measurements collected from 2017 buffer group trees were used to estimate mean annual increment (MAI; Mg ha −1 yr −1 ). In this particular case, individualtree biomass was estimated using the general model: Biomass = 10 a0 × DBH a1 while applying genomic-group specific coefficients from Headlee and Zalesny [67] (P. deltoides 'D': a 0 = -0.65, a 1 = 2.01; P. deltoides × P. nigra 'DN': a 0 = −1.02, a 1 = 2.36; P. deltoides × P. maximowiczii 'DM': a 0 = −1.03, a 1 = 2.33; P. nigra × P. maximowiczii 'NM': a 0 = −0.50, a 1 = 1.94). Individual-tree biomass estimates were then scaled to MAI using standard metric conversion factors and stocking of 1680 trees ha −1 at four years after planting.

Health Assessments
Based on a modified methodology of Rogers et al. [58], individual tree health parameters were scored using a five-category qualitative scale ranging from 1 to 5, where 1 = optimal health, 2 = good health, 3 = moderate health, 4 = poor health, and 5 = dead (i.e., health score was inversely related to health). Two researchers scored each parameter to promote consistency in ratings. There were six parameters: (1) vigor, (2) defoliation, (3) leaf discoloration, (4) chlorosis, (5) leaf scorch, and (6) leaf spots. A multiplicative weighted summation index was used to calculate final health index values, with vigor receiving a coefficient of 0.25 and all other parameters having a health coefficient equal to 0.15. Health assessments were not conducted in 2020.

Data Analysis
Health (of all buffer groups) and MAI (of the 2017 buffer group) data were subjected to analyses of variance ( The ages (representing tree growth after each growing season) were analyzed as the repeated measure. To account for pseudo-replication over time, six different covariance structures (i.e., vc, cs, ar (1), toep, ante (1), un) were tested in PROC MIXED to determine which one provided the best model fit based on the lowest BIC scores. Using these covariance structures, ANOVA were conducted in PROC MIXED for all traits, and multiple comparisons analyses were conducted to identify significant differences among least-squares means for main effects and interactions as noted above.

Biomass and Growth
Differences for buffer and clone main effects were significant for fourth-year mean annual increment of the 2017 buffer group trees (MAI2017(2020)) (P < 0.0001), yet the buffer × clone interaction governed MAI2017(2020) (P < 0.0001) (Table S2). MAI2017(2020) ranged from 1.10 ± 0.73 ('7300502' at Whitelaw) to 7.67 ± 0.5 Mg ha −1 yr −1 ('NM5' at Menomonee Falls (East)), with an overall mean of 3.20 ± 0.51 Mg ha −1 yr −1 ( Figure 5). The largest trees were grown at Menomonee Falls (West), which had 55.4% greater MAI2017(2020) than at Whitelaw, the buffer with the smallest trees. The range in MAI2017(2020) was broader for clones, with 'NM5' exhibiting 69.8% more biomass than 'NC14106' that had the smallest trees of any genotype. While MAI2017(2020) varied across genotypes, trends across clone groups were nonexistent, with Common, Experimental, and NRRI clones performing similarly across buffer × clone combinations. Many buffer × clone interactions resulted in MAI2017(2020) values that were significantly greater than the overall mean, with the most notable being

Biomass and Growth
Differences for buffer and clone main effects were significant for fourth-year mean annual increment of the 2017 buffer group trees (MAI 2017(2020) ) (P < 0.0001), yet the buffer × clone interaction governed MAI 2017(2020) (P < 0.0001) (Table S2). MAI 2017(2020) ranged from 1.10 ± 0.73 ('7300502' at Whitelaw) to 7.67 ± 0.5 Mg ha −1 yr −1 ('NM5' at Menomonee Falls (East)), with an overall mean of 3.20 ± 0.51 Mg ha −1 yr −1 ( Figure 5). The largest trees were grown at Menomonee Falls (West), which had 55.4% greater MAI 2017(2020) than at Whitelaw, the buffer with the smallest trees. The range in MAI 2017(2020) was broader for clones, with 'NM5' exhibiting 69.8% more biomass than 'NC14106' that had the smallest trees of any genotype. While MAI 2017(2020) varied across genotypes, trends across clone groups were non-existent, with Common, Experimental, and NRRI clones performing similarly across buffer × clone combinations. Many buffer × clone interactions resulted in MAI 2017(2020) values that were significantly greater than the overall mean, with the most notable being 'NM5' outperforming the mean at four of the six buffers: Caledonia (East), Menomonee Falls (East), Menomonee Falls (West), and Slinger. In contrast, 'NC14106' had significantly less MAI 2017(2020) than the overall mean at three buffers: Bellevue (West), Menomonee Falls (East), and Whitelaw. In addition to these changes in magnitude, there were distinct changes in MAI2017(2020) ranks that defined the genotypes as generalists or specialists. In particular, clones exhibited generalist MAI2017(2020), with four exceptions (Table 4). First, '99059016' had stable performance across five of the six buffers, with Whitelaw (i.e., the buffer with its lowest rank of eleventh) having 60.9% less MAI2017(2020) than Bellevue (West), where '99059016' ranked fifth. Second, '7300502' had broad variation in MAI2017(2020) across all six buffers, with a 75.3% reduction in MAI2017(2020) at Whitelaw (rank = 12) versus Bellevue (West) (rank = 3). Third, 'DN177' was a specialist because of its high MAI2017(2020) at Slinger (where it ranked fourth), which was 36.7% greater than Menomonee Falls (East) (rank = 9). Fourth, despite higher ranking (rank = 7) at Whitelaw relative to other buffers for 'DN34', MAI2017(2020) at Whitelaw was 60.6% less than that of Caledonia (East) (rank = 11), resulting in its specialist response (Table 4). However, this classification for 'DN34' should be interpreted with caution, especially given that 'DN34' ranked tenth at four buffers and eleventh at one buffer,   In addition to these changes in magnitude, there were distinct changes in MAI 2017(2020) ranks that defined the genotypes as generalists or specialists. In particular, clones exhibited generalist MAI 2017(2020) , with four exceptions (Table 4). First, '99059016' had stable performance across five of the six buffers, with Whitelaw (i.e., the buffer with its lowest rank of eleventh) having 60.9% less MAI 2017(2020) than Bellevue (West), where '99059016' ranked fifth. Second, '7300502' had broad variation in MAI 2017(2020) across all six buffers, with a 75.3% reduction in MAI 2017(2020) at Whitelaw (rank = 12) versus Bellevue (West) (rank = 3). Third, 'DN177' was a specialist because of its high MAI 2017(2020) at Slinger (where it ranked fourth), which was 36.7% greater than Menomonee Falls (East) (rank = 9). Fourth, despite higher ranking (rank = 7) at Whitelaw relative to other buffers for 'DN34', MAI 2017(2020) at Whitelaw was 60.6% less than that of Caledonia (East) (rank = 11), resulting in its specialist response (Table 4). However, this classification for 'DN34' should be interpreted with caution, especially given that 'DN34' ranked tenth at four buffers and eleventh at one buffer, indicating stable performance across buffers. Table 4. Clonal rank for mean annual increment (MAI2017(2020)) measured after the fourth growing season of twelve poplar clones tested in six phytoremediation buffer systems (i.e., phyto buffers) established in 2017 (i.e., the 2017 buffer group) in the Lake Michigan watershed of eastern Wisconsin, USA. The buffer × clone × year interaction was significant for height (P = 0.0483), diameter (P = 0.0018), and volume (P = 0.0001) of the 2017 buffer group trees (Table S3). VOLUME 2017 of trees measured in 2017 (VOLUME 2017(2017) ) ranged from 3.1 ± 54.4 ('7300502' at Whitelaw) to 459.6 ± 22.6 cm 3 ('7300502' at Slinger), with an overall mean of 132.0 ± 38.7 cm 3 , while VOLUME 2017(2018) (i.e., 2017 buffer group trees measured in 2018) ranged from 503.5 ± 1619.6 ('7300502' at Whitelaw) to 13,027.0 ± 1402.6 cm 3 ('NM2' at Slinger), with an overall mean of 5765.2 ± 1132.3 cm 3 (Table 5). VOLUME 2017(2019) ranged from 1668.7 ± 3018.1 ('7300502' at Whitelaw) to 26,652.0 ± 1848.0 cm 3 ('NM5' at Menomonee Falls (East)), with an overall mean of 10,530.6 ± 2109.5 cm 3 ( Table 5). Across all buffer × clone × year combinations, volume increased 43.7-fold from the first year to the second year after planting, and then 1.8-fold from the second year to the third year. After the first and second growing seasons, the largest trees were grown at Slinger, which had 1301 and 268% greater volume than the buffer with the smallest trees (Whitelaw), respectively. For the third growing season, trees at Menomonee Falls (East) were largest, with VOLUME 2017(2019) being 338% greater than Whitelaw, which had the smallest trees.

Phyto Buffer
The range in volume was narrower for clones, with 'DN5' having 204% greater volume than the least productive clone ('99059016') for VOLUME 2017(2017) , 'NM5' being 86% greater than '7300502' for VOLUME 2017(2018) , and 'NM5' being 114% greater than 'NC14106' for VOLUME 2017(2019) . With the exception of the first growing season where NRRI clones exhibited the least overall volume followed by Experimental and Common (most volume) genotypes, Common clones had the largest and Experimental clones the smallest trees, with NRRI genotypes being intermediate for VOLUME 2017(2018) and VOLUME 2017(2019) . Trends in height and diameter of the 2017 buffer group trees were similar to volume (Tables S4  and S5). Furthermore, in addition to the broad variability in the magnitude of differences among buffer × clone × year combinations, the frequency and magnitude of changes in rank within and across years defined genotypes as generalists (i.e., minimal rank changes) or specialists (i.e., broad variation resulting in ≥5 rank changes for at least one buffer × clone × year pair) (Table S6). In particular, the NRRI clones ('99038022', '99059016', '9732-36') were high-level generalists characterized by nearly universal stability in ranks across buffers within measurement years, less than three substantial (i.e., >5 ranks) rank changes across all three-way combinations, and moderate to high rank stability over time. Clones 'DM114' and 'NC14106' were also generalists, exhibiting moderate rank stability within years and consistent ranks over time. The remaining genotypes were specialists. Clone '7300502' had the most variability in early ranks of all clones, followed by high rank stability in later years that were not consistent over time. For example, '7300502' ranked first at Slinger in 2017 but then twelfth at this buffer in 2018 and 2019. Clones 'NM2', 'NM5', and 'NM6' were high-level specialists characterized by broad variability in ranks across buffers within measurement years, frequent substantial rank changes across all three-way combinations, and moderate stability over time. Similarly, 'DN5', 'DN34', and 'DN177' were specialists, albeit with more moderate rank variation than the 'NMx' genotypes (Table S6).  The buffer × clone × year interaction was significant for height, diameter, and volume (P < 0.0001) of the 2018 buffer group trees (Table S3). VOLUME 2018(2018) ranged from 12.2 ± 38.4 ('DN5' at Marquette) to 185.7 ± 25.3 cm 3 ('NM2' at Manitowoc), with an overall mean of 71.2 ± 26.3 cm 3 , while VOLUME 2018(2019) ranged from 287.8 ± 1518.7 ('DN5' at Marquette) to 11,085.0 ± 930.0 cm 3 ('NM2' at Manitowoc), with an overall mean of 3418.4 ± 1035.2 cm 3 ( Table 6). VOLUME 2018(2020) ranged from 261.0 ± 3882.2 ('DN5' at Marquette) to 27,220.0 ± 2377.4 cm 3 ('NM5' at Manitowoc), with an overall mean of 8826.0 ± 2646.2 cm 3 ( Table 6). Across all buffer × clone × year combinations, volume increased 48-fold from the first year to the second year after planting, and then 2.6-fold from the second year to the third year. After the first growing season, the largest trees were grown at Caledonia (West), which had 529% greater volume than Marquette, the buffer with the smallest trees. During the second and third growing seasons, the largest trees were from Manitowoc, which had 1079 and 1744% greater volume than the buffer with the smallest trees (Marquette), respectively. As with volume for the 2017 buffer group trees, there was less variability among clones than buffers, with '9732-31' having 156% greater volume than the least productive clone ('7300502') for VOLUME 2018(2018) , 'NM5' being 163% greater than '7300502' for VOLUME 2018(2018) , and 'NM5' being 143% greater than 'DM114' for VOLUME 2018(2018) . During the first growing season, NRRI clones had the greatest overall volume, while Common clones exhibited the largest trees at two and three years after planting. For all three years, Experimental clones had the least volume. Trends in height and diameter of the 2018 buffer group trees were similar to volume (Tables S7 and S8). Moreover, as with 2017 buffer group trees, changes in magnitude and rank of 2018 buffer group clones across buffer × year combinations defined them as generalists or specialists (defined as for Table S6 above) (Table S9). Similar to 2017, the NRRI clones ('9732-11', '9732-24', '9732-31', '9732-36') were high-level generalists with universal stability in ranks, few rank changes greater than five ranks, and high rank stability over time. Clone 'DM114' was also a generalist, having only two substantial rank changes and nearly identical ranks from 2017 to 2019. All other genotypes were specialists. As in 2017, clone '7300502' had a high level of early rank variability and low to moderate rank consistency over time; clones 'NM2', 'NM5', and 'NM6' were high-level specialists with broad rank variability, numerous substantial rank changes, and moderate stability as trees aged; 'DN2', 'DN5', and 'DN34' were consistent specialists with moderate levels of rank changes and age-dependent stability (Table S9).
The buffer × clone × year interaction was significant for height (P = 0.0079), diameter (P < 0.0001), and volume (P < 0.0001) of the 2019 buffer group trees (Table S3). VOLUME 2019(2019) ranged from 8.6 ± 26.9 ('DN177' at Ontonagon (North)) to 396. ± 26.7 cm 3 ('99038022' at Escanaba (West)), with an overall mean of 88.7 ± 26.7 cm 3 , while VOLUME 2019(2020) ranged from 92.6 ± 612.7 ('NM5' at Ontonagon (North)) to 8909.6 ± 573.1 cm 3 ('NM2' at Escanaba (West)), with an overall mean of 1440.4 ± 575.1 cm 3 ( Table 7). Across all buffer × clone × year combinations, volume increased 16.2-fold from the first year to the second year after planting. After the first and second growing seasons, the largest trees were grown at Escanaba (West), which had 1008 and 1066% greater volume than the buffer with the smallest trees (Ontonagon (North)), respectively. For VOLUME 2019(2019) , '99038022' had 101% greater volume than the least productive clone ('DN177'), while 'NM2' was 164% greater than '9732-11' for VOLUME 2019(2020) . NRRI clones exhibited the greatest first-year volume, followed by Common and Experimental genotypes. For VOLUME 2019(2020) , ranks of NRRI, Experimental, and Common clones changed, with Common genotypes having the most VOLUME 2019(2020) followed by Experimental and NRRI (least volume) clones. Trends in height and diameter of the 2019 buffer group trees were similar to volume (Tables S10 and S11). Furthermore, 2019 clones were classified as generalists or specialists (defined as for Table S6 above) (Table S12). Table 6. Volume (cm 3 ) (±one standard error) of twelve poplar clones tested in five phytoremediation buffer systems (i.e., phyto buffers) established in 2018 (i.e., the 2018 buffer group) in the Lake Superior watershed of the Upper Peninsula of Michigan, USA and the Lake Michigan watershed of eastern Wisconsin, USA. Trees were measured following the 2018, 2019, and 2020 growing seasons. Volume values with different letters within a clone column across measurement years are different at P < 0.05.

Genotype × Environment Interactions
Understanding genotype by environment (G × E) interactions is a necessary step for identifying and selecting poplar clones used for phytoremediation and associated phytotechnologies [68]. Poplar phenotypes are a function of their genotype, environment, and genotypic response to specific site conditions [49]. Phyto-recurrent selection has been used to choose superior poplar genotypes in the Midwestern United States [32,68]. Using both generalist and specialist genotypes enhances ecosystem services provided by phytoremediation applications. Deploying generalists with low G × E interactions and robust productivity across the region may be beneficial for cost and operational efficiencies [47,56], while specialists with high G × E interactions may maximize productivity, phytoremediation potential, and overall benefits of ecosystem services [26,28,49]. In the current study, sixteen phytoremediation buffer systems (i.e., multi-environmental trials (MET)) were established to evaluate trends in G × E interactions and identify generalist and specialist poplars in order to reduce runoff and clean groundwater (Figure 1).
In this study there were significant main (buffer, clone, and year) and interaction effects on tree health and growth parameters. In particular, interactions involving the buffer main effect were major factors governing clonal productivity. Buffer effects reflect tree responses to combined edaphic and local climatic conditions, and influence clonal performance traits such as: stem biomass production [69,70]; foliar fungal microbiomes [71]; leaf characteristics [72]; and diameter, height, and wood volume production [49,73]. In the current study, the broad spectrum of MET buffers with varying soil and climate conditions led to a wide range in clonal performance related to changes in both genotypic magnitude and ranks over time. More favorable climatic conditions (i.e., warmer, more precipitation; Table 1) and adequate soils for poplar cultivation (e.g., suitable texture, water-air properties, pH; Table 2) likely led to greater volume of most clones at Menomonee Falls, Slinger, Manitowoc and Caledonia. On the other hand, lower performance of clones at Marquette and Ontonagon can be attributed to less favorable climatic and soil conditions such as lower precipitation, temperature, and soil pH (i.e., more acidity). Similar results were obtained by Hansen et al. [74] who showed that soil water availability played a key role in the productivity of woody biomass plantations. In the current study, sites with irrigation or shallow water tables exhibited the greatest wood volume, which was further corroborated through poplar biomass productivity modeling in the Midwestern United States [59,60]. Overall, G × E interactions resulted in mean annual increment (MAI) of four-year-old trees from the current 2017 buffer group ranging from 1.1 to 7.8 Mg ha −1 yr −1 (mean = 3.2 Mg ha −1 yr −1 ), which agreed with results for similarly-aged (i.e., 3 to 5 years) poplars in the region, whose MAI ranged from 0.6 Mg ha −1 yr −1 to 7.1 Mg ha −1 yr −1 [49,75].
Multi-environmental trials are key tools for defining gains achieved through identifying genotype characteristics, their stability, and relevance of their interaction with varying environmental conditions (i.e., G × E interactions) [76]. Although G × E interactions can have a significant impact on the precision of breeding value estimates, often resulting in decreased genetic gain [53], matching superior species and clones to particular site and growing conditions has been critical in maximizing the productivity of SRWC plantations [70]. Tree age is an important factor shown to govern G × E interactions for poplars. Although Riemenschneider et al. [48] found significant G × E interactions first occurring at three years after planting, Semerci et al. [52] recorded significant G × E interactions for growth and phenology traits in one-year-old poplar clones grown on sites with different water availability in Turkey. Similarly, in the present study, all tested traits exhibited G × E interactions after the first year in all three buffer groups (e.g., for one-to four-year-old trees). In contrast to the results presented here, greenhouse phyto-recurrent selection experiments with soils from the six phyto buffers of the 2017 buffer group [Bellevue (West), Caledonia (East), Menomonee Falls (East), Menomonee Falls (West), Slinger, and Whitelaw] showed a lack of G × E interactions for root-shoot ratio and growth performance index of many poplar clones tested at the current MET. Nevertheless, there were significant G × E interactions for tree health [58]. Such differences may be attributed to variability in environmental conditions between the greenhouse and field buffers and/or the length of the experiment (i.e., months versus years).
Regardless, such results have indicated that G × E interactions in poplar clones vary during the life cycle of the trees. Zalesny and Headlee [25] found significant G × E interactions for biomass and carbon production in both 10-and 20-year-old poplar plantations, despite negligible genotypic effects on both traits for 20-year-old trees. The presence/absence of G × E interactions within clones during the production cycle also can be expressed by variability in growth patterns across clones. Netzer et al. [77] recorded that some clones had greater biomass productivity in the second half of the stand rotation, while Ghezehei et al. [78] recorded both lack and presence of significant differences in clonal productivity of four-and eight-year-old poplars.
Changes in both magnitude and ranks across buffer × clone × year combinations defined the G × E interactions of the current study [43,44]. Different classifications were found for volume versus mean annual increment (MAI), which supports the need for longterm monitoring throughout plantation development [68]. One explanation for differences between these traits may be related to the age at which the trees were measured. As noted previously, in the Southeastern United States, clonal rankings in poplar wood volume production changed with increasing stand age [70,79]. In this study, all volume estimates were from one-to three-year-old trees, while MAI was determined for trees after their fourth growing season, the start of the mid-rotation growth stage for poplars used for phytoremediation [80]. Similar changes were also apparent when evaluating measurement years within individual buffer groups. That is, oftentimes clonal rankings dramatically changed as trees aged (e.g., '7300502' had the greatest volume at Slinger during the establishment year only to have the least volume at this buffer after two and three growing seasons). A second explanation for differences in classifications between volume and MAI may be related to individual clones expressing higher levels of genetic variation and phenotypic plasticity as they responded to highly variable and changing soil conditions both within and across growing seasons at the phyto buffers. Guet et al. [72] reported that P. deltoides and P. nigra (which were the most common species used as parents in the current study) exhibited high levels of such genetic variation and plasticity, allowing them to better adapt to site-level spatial and temporal heterogeneity. These responses could lead to a greater propensity for specialist growth performance, and may explain why some NRRI clones ('9732-11', '9732-24', '9732-31', '9732-36') shifted from generalists to specialists in the current study (Table S13). In contrast, Nelson et al. [46] identified most of these clones as being geographically robust across both latitudinal and longitudinal gradients in North America. One of these clones, '9732-36', had very consistent volume production in our 2017 and 2018 buffer groups (as well as MAI in our 2017 buffer group), yet trended towards specialist responses at phyto buffers established in 2019. This may have been due to a negligible relationship between phenotypic plasticity and G × E interactions. As in our interpretation (and its definition), Des Marais et al. [81] linked G × E interactions to changes in clonal ranking and growth performance (i.e., variance-changing interaction) [56].
This concept of variance-changing interaction also supports differences in classifying generalist and specialist clones of the current study within individual measurement years associated with buffer × clone interactions for specific buffer groups. Specifically, individual response group designations for buffer × clone × year combinations (from Tables S6, S9 and S12) may differ somewhat from final classifications listed in Table S13, given the need to assess stability and magnitude of ranks within years and over time (i.e., classifying the clones holistically). With the exception of the NRRI clones, most other genotypes from the P. deltoides × P. nigra 'DN' genomic group (with 'DN2' being the only exception), as well as the P. deltoides 'D' clone '7300502' and all clones of the P. deltoides × P. maximowiczii 'DM' and P. nigra × P. maximowiczii 'NM" genomic groups exhibited consistent classifications across buffer groups. Nevertheless, of particular interest was that individual clones within genomic groups (or breeding groups, for the 'DN' hybrids) performed similarly, indicating that selection of genomic groups may be effective for early phyto-recurrent selection cycles (i.e., when choosing base populations for testing). Such genomic group trends have been reported for the same or related genotypes used in other phytoremediation applications [35,82]. Overall, the preponderance of specialist clones in the current study supports the need for phyto-recurrent selection in order to match genotypes to sites for small-scale applications with location-specific requirements (i.e., see variation in ranks for 'NM5' from Tables S6, S9 and S12), as well as the parallel need for continued testing of new genetic material, such as the NRRI clones, to select robust genotypes with minimal G × E interactions that can be used for large-scale, commercial applications at a justifiable cost while providing a multitude of ecosystem services across the rural to urban continuum [46].

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10 .3390/f12040430/s1; Table S1: Dates of planting for each phytoremediation buffer system (i.e., phyto buffer); Table S2: Probability values from analyses of variance for health and mean annual increment (MAI); Table S3: Probability values from analyses of variance for height, diameter, and volume; Table S4: Height for the buffer × clone × year interaction (2017 buffer group); Table S5: Diameter