Application and Analysis of a Composite Sampling Strategy to Cost-Effectively Compare Nutritive Characteristics of Perennial Ryegrass Cultivars in Field Trials

Abstract: Pasture nutritive value is economically important in south-eastern Australian dairy production systems, yet measurement of nutritive characteristics in pasture cultivar evaluation trials is not routinely undertaken, primarily due to cost. An approach aiming to reduce the total laboratory analysis costs in multi-harvest field trials by testing some entries as composite samples is provided. A field trial evaluating 31 trial entries sown in 4 replicates was used. On nine harvest occasions, samples were collected from each plot, dried, ground and analysed using near infrared spectroscopy for key nutritive characteristics (metabolisable energy (ME), crude protein (CP) and neutral detergent fibre (NDF)). Additionally, composite samples of 17 of the 31 entries from each harvest were created by combining sub-samples of material from each of four replicate plots into a single sample that was also analysed. A linear mixed model (LMM) analysis accounting for spatial and temporal variation as well as spatial and temporal correlations was conducted, comparing the full data model where all plots at all harvests were tested individually to a data model where some entries were evaluated as individual plots and others as composites. The precision and accuracy of the estimates for the two models were similar and best linear unbiased prediction (BLUP) means of the composite sampling strategy model were comparable to the full data model. It was concluded that if composite sampling is used in conjunction with testing samples from individual plots on a selection of cultivars, statistically valid inferences are possible and the total cost of determining the nutritive characteristics of perennial ryegrass cultivars in field trials can be reduced.


Introduction
Perennial ryegrass (Lolium perenne L.) underpins the "home grown" forage supply on dairy farms in south-eastern Australia, with farmers able to choose from more than 60 commercially available cultivars [1]. The relative merit of these cultivars is routinely assessed in small plot trials managed by pasture seed companies and industry groups. These trials measure forage mass (kg DM/ha) at a sequence of harvests throughout the year; however, they do not routinely measure the nutritive characteristics of the cultivars, despite them being economically important in pasture-based dairy production systems. Metabolisable energy (ME) intake is the primary determinant of milk production in dairy cows [2], and therefore determining the ME concentration of pastures is important. Crude protein (CP) and fibre, routinely measured as neutral detergent fibre (NDF), are other important nutritive characteristics.
Analysis of nutritive characteristics can be laborious and costly when there are many samples to test over multiple harvests/grazings and years, such as the case in perennial ryegrass cultivar evaluation trials. These trials commonly test around 30 cultivars of perennial ryegrass in a replicated (row-column) design, with current industry protocols specifying the use of four replicates for the determination of forage mass [3]. However, these protocols do not address testing cultivars for nutritive characteristics. Anecdotally, if nutritive characteristics are measured, to save costs, trial operators may elect to test a single replicate plot of each cultivar or create a composite sample by combining material from each plot of a cultivar and testing that sample. This approach results in the loss of field replication and does not allow for sound statistical analysis.
An approach has been proposed in wheat variety trials whereby a combination of individual replicate samples and composite samples are used in a way that can reduce the total number of samples requiring laboratory analysis and therefore cost, yet retain an ability to account for some spatial variation and replication in field trials [4]. If resources are unlimited, sampling and testing every individual plot in a fully replicated trial design is recommended, as this method provides the greatest precision and accuracy. However, as budget constraints are common, the approach tested on wheat variety trials [4] offers a compromise that is statistically superior to testing all cultivars as composite samples and losing all field replication. Our study aimed to test the applicability of this concept to perennial ryegrass field trials using an approach that is practical to implement and enables statistically valid inferences to be made. The emphasis of this research was to validate a methodological approach that reduces the costs of analysis, thereby facilitating the collection of data across more sites and seasons.
A more cost-effective method of testing the nutritive characteristics in pasture cultivar evaluation is likely to increase the availability of such information for dairy farmers and enable them to make more informed decisions on which cultivars to sow when this information becomes available through tools such as the forage value index (FVI) [5-7]. The recently developed FVI for south-eastern Australia [5,6] provides dairy farmers with an economic basis for perennial ryegrass cultivar selection. The index is based on trial data and estimated regionally based seasonal economic values of extra yield (kg DM/ha) [5]. However, the tables currently available to farmers in Australia [8] only rank perennial ryegrass cultivars on seasonal forage mass. To date, the FVI has not included any nutritive characteristics of perennial ryegrass as only limited information on individual cultivars is available in Australia [9]. Information on estimated ME has recently been added to the New Zealand FVI for perennial ryegrass [7]. However, this is currently limited to reporting mean values for functional groups (mid-heading diploids, late-heading diploids and tetraploids) rather than individual cultivars due to insufficient cultivar-specific data [10]. In Ireland, 'grass quality', presented as dry matter digestibility (DMD), is one of the sub-indices of the pasture profit index (PPI) [11]. The 'grass quality' data in the PPI are drawn from the published recommended list system managed by the Department of Agriculture, Food and the Marine in the Republic of Ireland. In these lists, the DMD values are from plot samples analysed throughout the growing season at a single site [12]. The primary reason for limited data is the cost of obtaining such data for many cultivars across environments. If a more cost-effective method of sampling and analysing field data can be implemented, it follows that more data on the nutritive characteristics of cultivars will become available.

Trial Design
The trial evaluated a total of 31 pre-commercial lines and commercially available cultivars of perennial ryegrass (herein referred to as trial entries) in a row-column design (4 rows and 32 columns) with four replicates. There were 22 diploid and 9 tetraploid lines evaluated. Early, mid and late flowering entries were included in the trial. Each trial entry was sown in a single plot per replicate measuring 1 m × 5 m. 'Victorian' perennial ryegrass containing standard endophyte (SE), which is often used as a reference entry in Australia, was sown in 2 plots per replicate (row), creating a field layout with 32 plots per replicate (row) (Figure 1).
Di-ammonium phosphate (DAP) containing 18% nitrogen (N) and 20% phosphorus (P) was applied at a rate of 100 kg/ha at sowing. Additional N was applied as urea throughout the growing season, with the total N applied as urea between sowing and the conclusion of the study equivalent to 400 kg N/ha. A fertilizer blend containing 7% P, 10% potassium (K) and 8.8% sulphur (S) was applied in July 2016.
On 9 harvest occasions between October 2015 and December 2016, samples from each of the 128 plots were collected using hand shears to cut the pasture to a simulated grazing height of 5 cm from at least 5 different points randomly selected in the plot, with the remainder of the plot mown to a 5 cm residual also. On all occasions, samples were collected in the morning between 1000 and 1200 h. These sub-samples were combined to form a single sample per plot. Sampling times occurred when pasture was at a growth stage consistent with a time where farmers would choose to graze. All sampling times therefore occurred where the average pasture mass of plots was between 2000 and 3500 kg DM/ha.
Following collection, samples were stored on ice prior to oven drying at 60 °C for at least 48 h. The dried samples were ground through a 1 mm screen using a Cyclotec™ 1093 sample mill (Foss, Hilleroed, Denmark). Composite samples were generated for 17 of the trial entries (Figure 1) by combining ground subsamples of equal weight from each of the 4 replicate plots corresponding to the trial entry at each sampling time to create a single composite sample for that trial entry. This sample was mixed thoroughly prior to laboratory analysis.
Agronomy 2020, 10, x; doi: FOR PEER REVIEW www.mdpi.com/journal/agronomy
Figure 1. Plot layout. Trial entries in unshaded plots had all replicates tested for nutritive characteristics, whereas shaded plots only had 1 composite sample tested.

Laboratory Analysis
Individual and composite samples were analysed for in vitro dry matter digestibility (IVDMD), CP and NDF at the Department of Jobs, Precincts and Regions research laboratory in Horsham, Victoria, Australia using near-infrared spectroscopy (NIRS). The NIRS spectra were collected on all samples using a Rapid Content Analyzer (XDS, Foss Analytical AB, Höganäs, Sweden) in conjunction with WinISI II v.1.04 software (Infrasoft International, LLC, PA, USA). NIRS calibrations for IVDMD, CP and NDF had previously been derived on large sample populations collected over multiple seasons and using a range of forages from multiple locations using published procedures [13]. This database comprised close to 800 samples, including perennial ryegrass, other annual and perennial forages, concentrates and grains. Standard errors of prediction for IVDMD, CP and NDF were 2.0, 1.0 and 2.5% DM, respectively. A comparison between NIRS and wet chemistry was undertaken for a randomly selected 10% of the samples, with resultant correlations (R²) of 0.95, 0.99 and 0.97, respectively [14]. Any spectral outliers from the calibrations had analysis by NIRS repeated and were further analysed using wet chemistry techniques. Reference methods used for NIRS calibrations were as follows: IVDMD using a pepsin-cellulase technique [15] with analytical values adjusted using a linear regression based on similar samples of known IVDMD, CP using the Kjeldahl method and NDF [16] including amylase and sodium sulphite on a DM basis. Metabolisable energy was estimated [17] as: (1)

Composite Sampling Strategy
Creating composite samples for 17 of the 31 trial entries being tested reduced the number of samples analysed for each harvest from 128 to 77.
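The sample counts above follow directly from the trial layout, and the arithmetic can be sketched as below. This is a minimal illustration, not part of the original analysis; the function name is hypothetical, and it assumes each composited entry occupies exactly one plot per replicate (so the reference entry sown in two plots per replicate is among the individually tested entries).

```python
def samples_per_harvest(plots_per_replicate, replicates, composited_entries):
    """Return (full, reduced) laboratory sample counts for one harvest.

    Assumption: each composited entry has exactly one plot per replicate,
    so compositing replaces `replicates` plot samples with one sample.
    """
    full = plots_per_replicate * replicates
    individual = full - composited_entries * replicates
    reduced = individual + composited_entries
    return full, reduced


# Layout described in the text: 32 plots per replicate, 4 replicates,
# 17 entries tested as composites.
full, reduced = samples_per_harvest(plots_per_replicate=32, replicates=4,
                                    composited_entries=17)
saving = 1 - reduced / full  # proportional reduction in samples tested
print(full, reduced, round(100 * saving, 1))
```

With these inputs the count falls from 128 to 77 samples per harvest (60 individual plot samples plus 17 composites), a reduction of just under 40%.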
The data structure combining individual plot and composite samples was analysed using a linear mixed model (LMM), as described in Section 2.5, and retained the experimental layout information (row and column) on all 128 plots. Each composited trial entry measurement was randomly assigned to one of the four plots (Figure 1) it came from and the remaining three plots were recorded as missing values. The analyses were then performed, accounting for temporal and spatial correlation with heterogeneity in residual variance and covariance. The composite data structure was simulated and analysed one thousand times and the average results of these simulations were then compared with the results from the full data model (with all 128 plots).
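The random allocation of each composite value to one of its source plots can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary keys and the toy values are assumptions, and the LMM fitting step (done in ASReml-R in this study) is not reproduced.

```python
import random


def composite_structure(plots, composite_values, rng):
    """Build one realisation of the composite data structure.

    plots: list of dicts with keys 'row', 'col', 'entry', 'value'.
    composite_values: {entry: laboratory value of its composite sample}.
    Plots of non-composited entries keep their own value. For each
    composited entry, one randomly chosen plot carries the composite value
    and its other plots are set to None (missing).
    """
    by_entry = {}
    for i, p in enumerate(plots):
        if p["entry"] in composite_values:
            by_entry.setdefault(p["entry"], []).append(i)
    out = [dict(p) for p in plots]  # copy so the input is left unchanged
    for entry, idx in by_entry.items():
        keep = rng.choice(idx)
        for i in idx:
            out[i]["value"] = composite_values[entry] if i == keep else None
    return out


# Toy layout (hypothetical values): 2 entries x 4 replicate rows,
# with entry "B" tested as a composite.
plots = [{"row": r, "col": c, "entry": e, "value": 10.0 + r + c}
         for r in range(1, 5) for c, e in enumerate(["A", "B"], start=1)]
cdm = composite_structure(plots, {"B": 12.3}, random.Random(1))
n_missing = sum(p["value"] is None for p in cdm)
```

Because row and column positions are kept for every plot, including those recorded as missing, the spatial terms of the model can still be defined over the whole field layout.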

Using a Single Replicate of 17 Composited Cultivars
The mixture of individual plot samples for 14 cultivars and a randomly selected plot sample (instead of a composite sample of four plots) for 17 cultivars was also analysed using an LMM, hereby referred to as the single data model (SDM). Temporal and spatial correlation with heterogeneity in residual variance and covariance was accounted for as in Section 2.3. A single plot sample out of the four plot samples was randomly selected for each of the 17 trial entries and the remaining three plots were recorded as missing values. This data structure was simulated and analysed one thousand times and the average results of these simulations were then compared with the results from the full data model (FDM) and composite data model (CDM).
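The repeat-and-average step can be sketched as below for the single-replicate case. All names and values are hypothetical, and the summary used here (the mean of the retained values) is only a stand-in for refitting the LMM on each simulated data set.

```python
import random
from statistics import mean


def single_replicate_structure(plots, single_entries, rng):
    """Keep one randomly chosen plot value for each entry in single_entries
    and set that entry's other plots to None (missing)."""
    out = [dict(p) for p in plots]
    for entry in single_entries:
        idx = [i for i, p in enumerate(plots) if p["entry"] == entry]
        keep = rng.choice(idx)
        for i in idx:
            if i != keep:
                out[i]["value"] = None
    return out


def average_over_simulations(plots, single_entries, summary, n_sim=1000, seed=0):
    """Average a summary statistic over n_sim random realisations."""
    rng = random.Random(seed)
    return mean(summary(single_replicate_structure(plots, single_entries, rng))
                for _ in range(n_sim))


def observed_mean(data):
    """Stand-in for a model fit: mean of the non-missing plot values."""
    return mean(p["value"] for p in data if p["value"] is not None)


# Toy data (hypothetical): entry "A" tested on all plots, "B" on one plot.
plots = [{"entry": e, "rep": r, "value": v}
         for e, vals in {"A": [11.0, 11.2, 10.9, 11.1],
                         "B": [10.0, 12.0, 11.0, 11.0]}.items()
         for r, v in enumerate(vals, start=1)]
avg = average_over_simulations(plots, ["B"], observed_mean)
```

The only difference from the composite structure is the value carried by the retained plot: here it is that plot's own measurement rather than the composite of all four plots.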

Statistical Analysis
The data on each nutritive characteristic for the FDM, CDM and SDM were analysed using the LMM methodology implemented using restricted maximum likelihood (REML) [18] in ASReml-R (VSN International, Hempstead, UK) [19]. The fixed effects included the main effects of harvests and the linear effects of rows and columns to account for non-stationary global variation across the field within a harvest. The random effects included a two-way separable model structure that considered cultivars within harvests as the treatment structure. This enabled best linear unbiased prediction (BLUP) of nutritive characteristics. The temporal genotypic correlation of observations on the same plot from consecutive harvests was modelled by a first-order autoregressive process. With multi-harvest data, the repeated measure structure of the data was accounted for using the full variance-covariance structure of residuals from different harvests, which allowed for both the heterogeneity of residual variances at different harvests and heterogeneity in covariance (correlation). Furthermore, we accounted for spatial correlations between observations by including autocorrelation of order one in both the row and column directions. The CDM and SDM, which had the same number of plot sample values, were compared using the Akaike information criterion (AIC) along with residual variance.
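To make the first-order autoregressive spatial structure concrete, the sketch below builds the correlation matrix for a small grid of plots. It assumes the common separable form, in which the plot-by-plot correlation matrix is the Kronecker product of a row AR(1) matrix and a column AR(1) matrix; the function names and autocorrelation values are hypothetical and are not estimates from this trial.

```python
def ar1(n, rho):
    """n x n first-order autoregressive correlation matrix: rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


# Hypothetical autocorrelations for a small 2-row x 3-column grid.
rho_row, rho_col = 0.3, 0.5
spatial = kronecker(ar1(2, rho_row), ar1(3, rho_col))

# Plots are ordered column within row, so index 0 is (row 1, column 1)
# and index 5 is (row 2, column 3): one row apart and two columns apart.
corr_far = spatial[0][5]  # equals rho_row ** 1 * rho_col ** 2
```

Under this structure the correlation between two plots decays with their separation in each direction, which is why keeping the row and column coordinates of every plot matters even when some plot values are missing.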
For both the full data model and the composite data model, harvest and seasonal BLUP means for trial entries were generated. A year was split into five seasons on a calendar month basis (Table 1) consistent with the dry matter yield values in the FVI [5]. Seasonal BLUP means were computed first by generating a two-way table of predicted means of 'Cultivar' by 'Harvest', then averages were taken to 'collapse' the multiple harvest means into seasonal means (for example, all the means from harvest dates in 'early spring' were averaged to give the 'early spring' predicted mean for that trial entry). The correlations between the seasonal BLUP means from the composite model and the full model were calculated for ME, CP and NDF. We present three types of correlation coefficient: Spearman's rank correlation coefficient (SRCC), Pearson's correlation coefficient (PCC) and Lin's concordance correlation coefficient (LCCC). The SRCC measures the correlation between ranks in two variables, while the PCC measures the linear relationship between two variables. The LCCC measures the agreement between two methods of analysis as the product of a coefficient of accuracy and the Pearson correlation coefficient (coefficient of precision). The coefficient of accuracy in the LCCC reflects the conformity of the mean linear relationship between the composite and full data models to the 45° line of agreement through the origin, while the Pearson correlation coefficient falls from unity with increasing random error (decreasing precision in the relationship).
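The three coefficients can be computed as in the sketch below, which uses only the Python standard library. The function names and the toy values are assumptions for illustration, and Lin's coefficient is written in its usual form, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), with 1/n moments.

```python
from statistics import mean


def pearson(x, y):
    """Pearson's correlation coefficient (PCC)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(v):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (LCCC)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Toy seasonal means (hypothetical): the second set is shifted by a constant.
full_means = [11.0, 11.4, 11.8, 12.2]
comp_means = [v + 0.5 for v in full_means]
pcc, srcc, lccc = (f(full_means, comp_means) for f in (pearson, spearman, lin_ccc))
```

In this toy case the constant shift leaves the PCC and SRCC at 1, because the linear relationship and the ranking are unchanged, while the LCCC drops to about 0.62 because the points no longer lie on the 45° line. The ratio LCCC/PCC recovers the coefficient of accuracy.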

Results
The overall mean values for estimated ME, CP and NDF were similar between the three models (Table 2). Average estimated ME in the full data model was 0.18 and 0.08 MJ ME/kg DM higher than the CDM and SDM, respectively. The values for CP and NDF only differed by 0.07 and 0.5 percentage units, respectively, between the FDM and CDM and by 0.3 and 0.7 percentage units, respectively, between the FDM and SDM. The linear row and column effects were also similar between the three models for the three nutritive characteristics. Residual variance was the greatest in the SDM, followed by the CDM, and was lowest in the FDM for all traits. Trial entry variance was smaller in the SDM and CDM compared to the FDM for each nutritive characteristic (Table 2). The residual variance for NDF was about 26% higher in the SDM compared to the full data model and it was the biggest change in residual variances for these three characteristics. The trial entry variance for estimated ME in the CDM and SDM was about half of the full data model, and this was the biggest change in trial entry variance for these three traits. Except for the column autocorrelation for estimated ME and CP, slightly higher values in the CDM and SDM were obtained for row and column autocorrelation.
The precision at which these parameters were estimated was generally slightly higher for the FDM than the CDM and SDM. The standard error estimates of the parameters from the CDM were lower than the estimates from SDM. For all three traits, a lower AIC was obtained for the CDM compared to the SDM. The residual variance was always lower for the CDM than the SDM. The AIC, along with residual variance, indicated the CDM was superior to the single data model, hence we compared the BLUPs derived from the CDM to BLUPs derived from the FDM.
The BLUP means derived from the CDM were very highly correlated with the FDM for all nutritive characteristics and seasons (Figure 2). The SRCC was 0.87 or higher (Table 3) for estimated ME in each season. The PCC was also 0.87 or higher (Table 3) for the same characteristic within each season. Autumn, early spring, late spring and summer all had LCCC values of 0.73 or higher, whereas the winter LCCC was 0.60 for ME. The SRCC, PCC and LCCC were all 0.74 or higher for CP in each season. The SRCC for CP was highest (0.94) in summer and lowest (0.78) in autumn. In most cases, all three correlation coefficients were slightly lower for NDF compared to the other nutritive characteristics examined. As with CP, NDF had the highest SRCC (0.80) in summer and the lowest SRCC (0.63) in autumn. The PCC for NDF was 0.75 or higher in each season. The LCCC was the lowest (0.53) in late spring and highest (0.76) in early spring for NDF.

Discussion
The results of this study suggest that a more cost-effective way of determining the nutritive characteristics of pasture cultivars in small plot evaluation trials is possible. In our study, the use of a combination of composite and individual plot sampling led to about a 40% reduction in sample testing cost whilst retaining some replication and an ability to account for spatial variation. The analyses of composite data (77 samples on each sampling occasion (60 individual plot samples plus 17 composite samples)) produced similar results to those when all plots (128 plots/sampling occasion) were used in the analyses, albeit with less precision. Using a single replicate plot value in place of the composite of four plots is a possible alternative, but our simulation results show that the composite data model performed better, with a lower AIC and lower residual variance. We expected that roughly half the individual plot samples, along with randomly assigned composite data samples, would be enough to estimate the variation along the row and column directions across the field within a harvest. As expected, average residual variance was always higher in the composite data model compared to the full data model, and therefore the standard error of the overall mean and other parameters was also slightly greater. In practice, these differences are not large and did not affect the statistical significance of any parameters in the two model types.
As seen in Figure 2, there were some differences in BLUP means between the full data model and the composite data model, both in terms of accuracy and precision, but the relatively high SRCC between trial entry ranking in the full data model and the composite data model suggested that the results are very similar when ranking the trial entries across all seasons, particularly for estimated ME and CP. This indicates the method's suitability for application in plant breeding programs selecting for forage nutritive characteristic traits. However, if enough resources are available, we recommend using results from fully replicated data which provide the greatest precision and accuracy. The results of this study also demonstrate the potential for a composite data modelling approach to be applied in cultivar evaluation trials, such as those that underpin the FVI [5,6], as this system is based on the ranking of cultivars. In the case of perennial species, such as perennial ryegrass, this method could be applied across years as well as in multiple harvests. Where multiple trial locations are used, the model statement could be expanded to account for genotype × environment interactions, as in the multi-environment analysis methodology suggested in a study on perennial crops [20]. A database of field trials underpins the FVI, with the confidence in cultivar rankings improving as more trials are added to the dataset. In the case of nutritive characteristics, implementation of a composite sample strategy would allow data on nutritive characteristics to be available to farmers sooner and more sites to be sampled within a fixed budget. Based on our experience, we suggest the number of trial entries that are composited should be around 50% or less.
The statistical analysis used here adopted the concept tested on wheat variety trials [4], but the statistical method was simplified, as the wheat variety trial work involved some non-standard design matrices which are not built into ASReml-R [19]. ASReml-R is a statistical software commonly used to analyse variety trials and the intent of our study was to develop a method for pasture trials that could be readily adopted by others. The approach used to analyse the mixture of composite and individual replicate samples is simple and easy to implement on similar field trials.
We acknowledge that these results are based on multi-harvest data from a single year of one trial. The growing seasons in which the trial was harvested were typical of this environment, with an average of 17 t/ha of DM grown during the measurement period. This supports the conclusion that the relationships between the full and composite datasets were not unduly influenced by environmental effects such as drought and waterlogging. This paper is a demonstration of a method to show that it is feasible to conduct a relatively simple statistical analysis that enables the use of a mixture of composite and individual plot samples from a cultivar evaluation trial. The method is also applicable where data from multiple sites are available. In this instance, the model statement can be adjusted to suit the analysis required.
Aerial and in-field methods to measure the nutritive characteristics of pastures are in development [21,22]. These technologies have the potential to make the in situ measurement of nutritive characteristics in pasture trials possible. However, considerable work is yet to be done on the calibration and validation of these technologies. In the interim, applying a sampling and analysis strategy as described here is a viable option where resources are limited.

Conclusions
Our approach of using an LMM with random allocation of composite samples to plots enabled the analysis of a mixture of individual and composite samples in a simple and efficient manner, allowing a reduction of about 40% in the number of samples tested at each harvest in this study. This is particularly important for trials running over multiple years, as is the case with perennial pasture species. The reduction in cost makes the analysis of nutritive characteristics more feasible for pasture seed companies and other industry groups evaluating cultivars in field trials. However, wherever possible, all individual plot samples should be tested; where it is not feasible to test all plot samples, a mixture of composite and individual plot samples can be employed. The methods described in this paper would allow more extensive measurement of nutritive characteristics across sites and seasons than is currently routine. The lack of information is one of the key limitations of the widescale interpretation of cultivar and environmental effects on the nutritive value of pastures. If more information on the nutritive characteristics of cultivars becomes available, this could lead to a greater understanding of the seasonal nutritive value of perennial ryegrass in pasture-based dairy production systems.