Spectral Reflectance Modeling by Wavelength Selection: Studying the Scope for Blueberry Physiological Breeding under Contrasting Water Supply and Heat Conditions

To overcome the environmental changes occurring now and predicted for the future, it is essential that fruit breeders develop cultivars with better physiological performance. During the last few decades, high-throughput plant phenotyping and phenomics have been developed primarily in cereal breeding programs. In this study, plant reflectance, at the level of the leaf, was used to assess several physiological traits in five Vaccinium spp. cultivars growing under four controlled conditions (no-stress, water deficit, heat stress, and combined stress). Two modeling methodologies [Multiple Linear Regression (MLR) and Partial Least Squares (PLS)] with or without (W/O) prior wavelength selection (multicollinearity, genetic algorithms, or in combination) were considered. PLS generated better estimates than MLR, although prior wavelength selection improved MLR predictions. When data from the environments were combined, PLS W/O gave the best assessment for most of the traits, while in individual environments, the results varied according to the trait and methodology considered. The highest validation predictions were obtained for chlorophyll a/b (R²Val ≤ 0.87), maximum electron transport rate (R²Val ≤ 0.60), and the irradiance at which the electron transport rate is saturated (R²Val ≤ 0.59). The results of this study, the first to model modulated chlorophyll fluorescence by reflectance, confirm the potential for implementing this tool in blueberry breeding programs, at least for the estimation of a number of important physiological traits. Additionally, the differential effects of the environment on the spectral signature of each cultivar show that this tool could be directly used to assess their tolerance to specific environments.


Introduction
Agriculture is essential to human survival, but it is being critically affected by global warming, compromising the food security of a population that is predicted to continue to increase over the coming decades [1]. Elevated temperatures, changes in precipitation patterns and more frequent extreme weather events have given rise to greater intra- and inter-annual variability in the yield and quality of crops [2]. These new climatic scenarios make it necessary to further our understanding of the principal projected stresses (drought and heat stress) impacting plant performance, and to develop relevant information to generate cultivars capable of overcoming more complex environmental scenarios.
To the extent that breeders can consider useful additional information (e.g., plant physiology) to plan each season's crosses and perform early selection of material suitable for each breeding program's aims, it should be possible to improve both efficiency (cost and time) and productivity (higher proportion of new cultivars carrying advantageous characteristics) [3,4].
Unfortunately, the practical and economic limitations of performing complex evaluations (e.g., plant water status, photosystem status, leaf gas exchange, photosynthetic pigment contents) of a large number of genotypes have been one of the main constraints to performing in-depth phenotypic characterization, especially under water deficit or heat stress conditions [3,5,6]. This highlights the urgent need to develop faster and cheaper methodologies to estimate key physiological traits, which have proven to be relevant in the multi-dimensional characterization of the phenotype (phenomics) [7], especially in fruit breeding programs [3]. Nevertheless, it is important to highlight that these new technologies need to be calibrated and validated; therefore, they must first be developed in conjunction with traditional reference measures to establish correlations and the degree of realistic predictability.
Among the remote sensing technologies with potential for high-throughput plant phenotyping and phenomics, reflectance spectroscopy is probably the most promising. When the leaf receives radiation from the sun, part is transmitted to lower leaf layers, part is absorbed by the chlorophylls and other pigments, and the rest is reflected; the reflectance is therefore the ratio of the reflected to the incident radiation. The spectral signature (graphical representation of the reflectance for each wavelength) is closely associated with the absorption of certain wavelengths linked to specific characteristics or plant conditions [8][9][10][11]; plant spectral reflectance can thus be used for characterizing the effects of abiotic and biotic stresses [3,9,12,13]. For example, in a healthy leaf, chlorophyll pigments absorb in the blue (400-500 nm) and red (600-700 nm) range, generating a higher reflection in the green wavelengths (500-600 nm); carotene also has a strong absorbance in the blue range. Most of the past research has concentrated on measurements in the visible (VIS; ~400-770 nm) and near infrared (NIR; ~770-1300 nm) spectrum, although new studies are also covering the UV (~300-400 nm) and the short wavelength infrared 1 (SWIR1; ~1300-1900 nm) and 2 (SWIR2; ~1900-2500 nm) [14,15].
Reflectance spectroscopy has gained importance and is being widely used in eco-physiological studies for predicting traits of interest [10,16]. However, most of the studies have focused on the use of Spectral Reflectance Indices (SRIs) (denoting relationships between specific wavelengths or spectrum bands), paying less attention to the use of a wider proportion of wavelengths or the full spectrum [11,[17][18][19][20].
Because of the large volume of data generated by spectral measurements, it is common to use multivariate techniques for trait modeling [21]. The simplest method is Multiple Linear Regression (MLR), which is a well-known statistical procedure based on ordinary least squares regression [22,23]. In principle, MLR can be used with many predictors; however, if their number becomes too large, it is likely to generate a model that fits the sampled data but fails to predict new data, a phenomenon called over-fitting [24,25]. Furthermore, an important assumption of the MLR method is that the independent variables are linearly independent, which is not always true for spectral data due to multicollinearity (MC). One alternative to MLR is the use of Partial Least Squares (PLS), which is a bilinear modeling method where information in the original x variables is projected onto a small number of latent variables called PLS components; characteristics of principal component analysis and multiple linear regression are used to reduce the complex variability to a smaller number of relevant factors [26][27][28]. PLS is thus a method for constructing predictive models when there are many factors and high multicollinearity.
To avoid problems associated with MLR models in spectral studies, it is suggested that predictors are selected prior to the modeling to reduce their number, leaving only those that contribute most to the model. Due to spectral data characteristics, in which one or several sets of predictors provide the same information, the use of an MC analysis contributes towards eliminating those predictors with redundant contributions to the model [29,30]. Another alternative for the selection of independent variables is Genetic Algorithm (GA) analysis, widely used in chemical spectrometry [21,[31][32][33][34][35][36][37]. GAs are adaptive procedures for finding solutions, or numerical optimizations, and are inspired by the theory of evolution: in a living environment, the "best" individuals have a greater chance of surviving and a higher probability of spreading their genes by reproduction [38]. In particular, the best chromosomes (with higher fitness) are allowed to survive, mutate and recombine to generate offspring, and after a number of generations have elapsed, only those selected chromosomes (equivalent to sets of wavelengths) are preserved [39].
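The survive, mutate and recombine cycle described above can be illustrated with a minimal sketch. It is not the GA configuration used in this study: the function name `ga_select`, the population size, mutation rate, single-point crossover, elitism, and the toy fitness function are all assumptions chosen for illustration. In a spectral application the fitness would typically be a cross-validated measure of model performance for the wavelengths switched on in each chromosome.

```python
import random

def ga_select(n_wavelengths, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Evolve boolean masks over wavelengths; return (best_mask, history)."""
    rng = random.Random(seed)
    # Each chromosome is a list of booleans: True keeps that wavelength.
    population = [[rng.random() < 0.5 for _ in range(n_wavelengths)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        survivors = ranked[:pop_size // 2]      # the fitter half survives
        children = [ranked[0][:]]               # elitism: best kept unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_wavelengths)   # single-point recombination
            child = mother[:cut] + father[cut:]
            # Mutation: flip each gene with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

# Toy fitness (hypothetical): reward agreement with an arbitrary target mask.
target = [i % 3 == 0 for i in range(12)]

def toy_fitness(mask):
    return sum(m == t for m, t in zip(mask, target))

best_mask, history = ga_select(12, toy_fitness)
```

Because the best chromosome is copied unchanged into the next generation, the best fitness recorded in `history` can never decrease, which is the property that lets the procedure converge on a preserved set of wavelengths.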
Despite numerous studies relating the spectral signature (total or partial) to different traits, further investigation is required. A significant number of articles rely on PLS as the principal multivariate technique; nevertheless, comparisons with other methodologies have not always been considered. The literature suggests that modeling with MLR, after selecting wavelengths, would generate models with similar predictive ability to PLS [30,38,40,41]. For example, [42] indicated that the selection of wavelengths with lower MC would improve the performance of MLR models for spectral data with a large number of samples. The best modeling methodology for physiological trait prediction (e.g., PLS vs. MLR), and whether the model should consider the whole spectrum or a prior wavelength selection, remain unclear.
We propose that it is possible to estimate complex physiological traits using models based on the spectral signature, providing fruit breeders with additional relevant information for plant selection (parents to be crossed or advanced lines) under adverse abiotic conditions (water deficit or heat stress, and their combination). Thus, through the comparison of the coefficients of determination of validation, the aim of this study was to evaluate the potential use of spectral reflectance, at the leaf level, to estimate physiological traits (plant water status, chlorophyll content, leaf gas exchange, and the working state of the photosynthetic apparatus) by determining the performance of PLS and MLR in relation to a previous wavelength selection procedure (GA, MC or in combination). A secondary goal was to recognize the effects of environmental constraints on the leaf spectral signature, considering the differences between and within cultivars.

Experimental Trial and Plant Material
The experimental trial was conducted at the Universidad de Talca, Campus Lircay (35°24′20″S, 71°38′5″W), Talca, Chile, in the course of the 2013/14 growing season. During dormancy in the previous year (June 2012), 2- to 3-year-old plants of northern highbush blueberry (Vaccinium corymbosum L.; 'Bluegold' and 'Liberty'), southern highbush blueberry (V. corymbosum L.; 'Bluecrisp' and 'Star') and rabbiteye (V. ashei Reade; 'Bonita') were established in 20 L pots, containing a substrate mixture of sand, peat moss, and sawdust (1:3:1); according to [43], a mature blueberry is shallow-rooting with most roots only exploring up to 0.4 m deep, indicating that the cylindrical pot (305 × 383 mm) should provide enough space for a young root system to grow without restrictions. Plants were maintained outdoors until early August 2013 (mid-winter in Chile) when they were moved into two greenhouses (12 × 9 m) covered with alveolar polycarbonate sheeting (6 mm; 86% solar transmission); one of them (only roof covered) was considered to reproduce the conditions of ambient temperature (at), while the second one, completely enclosed with temperature control set to keep approximately 5 °C above at, simulated an overheated environment (at+5) (temperature records in Figure S1). Both greenhouses had two water regimes, full irrigation (FI) and water deficit (WD) with approximately one-third of the water supplied under FI. Therefore, four environmental conditions were generated: (i) at-FI; (ii) at-WD; (iii) at+5-FI; and (iv) at+5-WD.
There were three benches (blocks) in each greenhouse, and both water regimes were applied to each one. Each individual bench included all cultivars and two replications of each water regime; thus, six plants represented each environmental condition.

Measurements
Assessments started when 20-30% of the fruit in each cultivar reached maturity (berry skin 100% blue coverage) on each bush. Measurements were performed on fully illuminated tissue (leaves or shoots) from the edge of the middle-third of the canopy, at solar zenith (±2 h) during completely sunny days.
Evaluations of chlorophyll (Chl) content, leaf gas exchange, modulated chlorophyll fluorescence, and spectral reflectance were conducted on the same leaves. Stem water potential (SWP) was measured in shoots of neighboring buds. Descriptions of the measurements are given below.

SWP
Plant water status was assessed on one shoot (15 cm) per plant using a pressure chamber (PMS600, PMS Instruments Company, Corvallis, OR, USA). Prior to the measurement, shoots were covered with aluminum foil and plastic Ziploc™ bags for two hours [44].

Chl Content
From one leaf per plant, one disc of 0.32 cm² was extracted from each side of the central vein in the middle of the leaf [44]. Both discs were placed in 1.5 mL of the organic solvent N,N-dimethylformamide for 48 h in the dark at 4 °C [45]. The concentrations of Chl a, b and total were determined using a UV/VIS spectrophotometer (T80+, PG Instruments Ltd., England) at 664.5 and 647 nm, according to the equation proposed by [46]. Additionally, the ratio between Chl a and b (Chl a/b) was calculated.

Leaf Gas Exchange
On one leaf per plant, CO2 assimilation rate (A), stomatal conductance (gs), transpiration rate (E), and internal CO2 concentration (Ci) were recorded using an infrared gas analyzer (Ciras 2, PP Systems, Amesbury, MA, USA) with a leaf cuvette of 2.5 cm². Measurements were performed using a light source (CRS131, PP Systems, Amesbury, MA, USA) at saturation conditions (1500 µmol m⁻² s⁻¹), a flow rate of 0.15 L min⁻¹ and 390 ppm of CO2.

Modulated Chlorophyll Fluorescence
On one leaf per plant, the working state of the photosynthetic apparatus was evaluated by fast light curves using a portable fluorometer (PAM 2500, Walz, Effeltrich, Germany). The equipment was configured to deliver 20 pulses of actinic light at different levels of photosynthetically active radiation (PAR), between 0 and 2700 µmol m⁻² s⁻¹. The parameters evaluated were those reported by [47], which were useful to discriminate between environments with water deficit and heat stress: effective photochemical quantum yield of photosystem II [Y(II)], coefficient of non-photochemical fluorescence quenching (qN), coefficient of photochemical fluorescence quenching (qP), initial slope of the fast light curve (Alpha), maximum electron transport rate (ETRmax), and the irradiance at which the electron transport rate is saturated (IK), or in other words, the PAR value at the point of intersection of the horizontal line between ETRmax and Alpha.
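The geometric description of IK can be written as a short relation. Assuming that the initial part of the fast light curve is approximated by the straight line ETR = Alpha × PAR, and that saturation is represented by the horizontal line ETR = ETRmax, the two lines intersect where Alpha × PAR = ETRmax. The symbols below follow the parameter names used in the text; the linear approximation of the initial slope is the only assumption.

```latex
\alpha \cdot I_K = \mathrm{ETR}_{\max}
\quad\Longrightarrow\quad
I_K = \frac{\mathrm{ETR}_{\max}}{\alpha}
```

Under this relation, a leaf with a lower ETRmax at the same Alpha saturates at a proportionally lower irradiance.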

Spectral Reflectance
Spectral reflectance was measured using a portable spectrometer (FieldSpec 3 Jr., Analytical Spectral Devices ASD Inc., Boulder, CO, USA) with a spectral range of 350-2500 nm. The spectrometer fiber (25°) was inserted (32°) into a contact probe device (ASD Inc., Boulder, CO, USA) with a halogen light source (5 W to prevent the leaf blade from burning), keeping a constant distance from the leaf (10 mm), and generating a measuring spot of ~10 mm diameter. The contact probe was calibrated every 15 min using a white reference tile (Spectralon, ASD Inc., Boulder, CO, USA) for scatter correction. RS3 software (ASD Inc., Boulder, CO, USA) was used to calibrate and control the spectrometer and to acquire the spectral signatures. The equipment was configured to integrate three samples (350-2500 nm) per scan, and each leaf was scanned ten times. Reflectance data were extracted using View Spec Pro 2008 software (ASD Inc., Boulder, CO, USA). For leaf scan averaging, exploratory analysis of high-resolution spectral reflectance was performed using Spectral Knowledge software (SK-UTALCA) [19].

Modeling Analysis
Multivariate regression analysis was performed for each of the four individual environments (at-FI, at-WD, at+5-FI, and at+5-WD), but also considering the four conditions taken together (All). The regression models were developed using MLR (MATLAB, version 7.8.0 (R2009a), MathWorks, Inc., Natick, MA, USA) and PLS (The Unscrambler X, version 10.4, Computer Aided Modelling Camo, Trondheim, Norway). PLS was implemented by considering the number of components that maximized the calibration of the models in each analysis. A mean-centering procedure was carried out on all datasets prior to MLR and PLS to remove the offset effect. The validation of each model was evaluated using leave-one-out cross-validation (LOO).
Each modeling methodology (PLS or MLR) considered the complete spectral signature (PLS W/O or MLR W/O), which was contrasted with that in which a selection of wavelengths was previously performed by: (i) multicollinearity (MC PLS or MC MLR); (ii) genetic algorithms (GA PLS or GA MLR); (iii) MC and then GA (MC+GA PLS or MC+GA MLR); and (iv) GA and then MC (GA+MC PLS or GA+MC MLR). The final number of wavelengths after each of the selection procedures is given in Table S1.
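As an illustration of what a multicollinearity-based selection step may look like, the sketch below keeps a wavelength only if its absolute Pearson correlation with every wavelength already kept is below a threshold. The source does not describe the exact MC criterion used, so the greedy rule, the function names `pearson` and `drop_collinear`, the threshold of 0.95, and the toy reflectance values are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def drop_collinear(spectra, max_abs_r=0.95):
    """spectra: list of samples, each a list of reflectances per wavelength.
    Returns the indices of the wavelengths that are kept."""
    columns = list(zip(*spectra))   # one tuple of values per wavelength
    kept = []
    for j, col in enumerate(columns):
        # Keep wavelength j only if it is not redundant with any kept one.
        if all(abs(pearson(col, columns[k])) < max_abs_r for k in kept):
            kept.append(j)
    return kept

# Toy data (hypothetical): wavelength 1 is exactly twice wavelength 0.
spectra = [
    [0.1, 0.2, 0.5],
    [0.2, 0.4, 0.1],
    [0.3, 0.6, 0.4],
    [0.4, 0.8, 0.2],
]
kept_indices = drop_collinear(spectra)
```

In this toy case the second wavelength carries the same information as the first and is discarded, while the third is retained. The reduced set could then be passed to a GA (MC+GA), or the same filter could be applied to the GA output (GA+MC).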
As a way to simplify the analysis and discussion of the multiple combinations generated for the study (15 traits, five environmental conditions, two multivariate analyses, and five wavelength selection methods), only the values of the coefficient of determination for the validation procedure (R²Val) were considered for the comparisons. Detailed information on the calibration and validation coefficients of determination, and root mean square error (RMSE) of the calibration and validation are given in Table S2.
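The validation statistics can be sketched as follows. For brevity the example uses a single-predictor least squares line as a stand-in for the MLR and PLS models of the study, and it computes the validation coefficient of determination as 1 − SSres/SStot, which is one common definition; the source does not state which formula the software applied. The function names and the toy data are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def loo_validation(x, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - p) ** 2 for obs, p in zip(y, preds))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    r2_val = 1 - ss_res / ss_tot        # validation coefficient of determination
    rmse_val = sqrt(ss_res / len(y))    # validation root mean square error
    return r2_val, rmse_val

# Toy data (hypothetical): reflectance at one wavelength vs. a measured trait.
reflectance = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
trait = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
r2_val, rmse_val = loo_validation(reflectance, trait)
```

Because every prediction is made for a sample that was excluded from the fit, the validation statistics are less optimistic than their calibration counterparts, which is why they were preferred for comparing methodologies.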

Determining the Environmental Effects in the Leaf Spectral Signature
As a way to easily characterize the environmental effects on the spectral signature, magnifying the differences between the compared reflections, a practical methodology is proposed. For each cultivar, the ten scans per leaf were averaged to generate the spectral signature of the replicate. Then, the six replicates were also averaged, constituting the spectral signature of the evaluated environment. For each cultivar and wavelength, analyses of variance (ANOVAs) were performed to verify whether the reflectance measurements were statistically different (Table S3).
When statistical differences were found between environments at each wavelength, the average reflectance values of the control treatment and the different adverse conditions (at-WD, at+5-FI and at+5-WD) were subtracted (at-FI vs. at-WD, at+5-FI or at+5-WD). Finally, at each wavelength assessed, the reflectance differences between environments were plotted per cultivar.
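The averaging and subtraction steps can be summarized in a short sketch. The function names, the toy values (two replicates of two scans over three wavelengths), and the externally supplied list of significant wavelengths are hypothetical; the ANOVA itself is not reproduced here. The sign convention (stressed minus control) is also an assumption, since the text does not state the direction of the subtraction.

```python
def mean_signature(signatures):
    """Average several spectra (equal-length lists) wavelength by wavelength."""
    n = len(signatures)
    return [sum(values) / n for values in zip(*signatures)]

def environment_signature(replicates):
    """replicates: list of replicates, each a list of scans (spectra).
    Scans are averaged per replicate, then replicates are averaged."""
    return mean_signature([mean_signature(scans) for scans in replicates])

def reflectance_difference(control, stressed, significant):
    """Stressed minus control reflectance; None where no significant difference."""
    return [s - c if keep else None
            for c, s, keep in zip(control, stressed, significant)]

# Toy data (hypothetical): 2 replicates x 2 scans x 3 wavelengths.
control_reps = [
    [[0.10, 0.40, 0.30], [0.12, 0.42, 0.30]],
    [[0.10, 0.44, 0.32], [0.12, 0.46, 0.32]],
]
stressed_reps = [
    [[0.12, 0.39, 0.30], [0.14, 0.41, 0.32]],
    [[0.12, 0.39, 0.30], [0.14, 0.41, 0.32]],
]
significant = [True, True, False]   # assumed outcome of the per-wavelength ANOVA

control = environment_signature(control_reps)
stressed = environment_signature(stressed_reps)
difference = reflectance_difference(control, stressed, significant)
```

Plotting `difference` against wavelength for each adverse condition gives one curve per environment, with gaps where the ANOVA found no difference from the control.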

Results
In general, PLS was a better approximation than MLR and this was evident throughout all the analyses. Regardless of the methodology of wavelength selection, the validation models indicated an R²Val ≥ 0.45, at least in some environments, for SWP, Chl (a, b, total, and a/b), ETRmax and IK (Figures 1-4, respectively). When all environmental data were combined (All), in general, the wavelength selection (simple or double) did not improve the estimation by PLS, but it was enhanced under each particular condition (at-FI, at-WD, at+5-FI, or at+5-WD). At the same time, MLR coefficients of determination of validation increased, particularly when a double selection was considered. In contrast, independent of the modeling methodology and the wavelength selection applied, the worst results obtained were for the estimation of Alpha, qP, qN and Y(II).

Stem Water Potential (SWP)
PLS performed better than MLR in the prediction of SWP (Figure 1). PLS W/O (full spectrum) showed the best estimation when all environments were considered together (All; R²Val = 0.29) and under water deficit conditions (at-WD; R²Val = 0.48). Wavelength selection proved to be an adequate procedure for modeling SWP by PLS for plants growing under fully irrigated conditions with (at+5-FI; R²Val = 0.38 considering GA) or without heat stress (at-FI; R²Val = 0.31 considering GA+MC). When both of the adverse conditions were present, the prediction level was low (R²Val < 0.06) and the best approaches were found when a prior wavelength selection was performed.
MLR allowed estimation of SWP under at+5-FI conditions with an R²Val of 0.36, when a double wavelength selection (MC+GA) was considered. Under the same conditions with the full spectrum (W/O), MLR performed poorly.

Chlorophyll Content (Chl a, Chl b, Chl total, and Chl a/b)
For the environments individually or in combination, the estimations of Chl a, b, total and a/b (Figure 2a-d, respectively) were higher under PLS. PLS W/O was the best approximation (R²Val = 0.36, 0.69, 0.61, and 0.87 for Chl a, b, total and a/b, respectively) when all environments were processed as one. Either with or without wavelength selection, there were no large differences in the estimation of Chl a under at-FI (0.18 < R²Val < 0.23) or at+5-WD (0.25 < R²Val < 0.27). Nevertheless, under at-WD and at+5-FI the coefficients of determination were improved when part of the spectral signature was selected to build the PLS models (0.26 < R²Val < 0.58 and 0.18 < R²Val < 0.26, respectively) (Figure 2a). Considering some magnitude differences associated with the environment and the prediction methodology, Chl total followed similar trends to Chl a (Figure 2c). With some exceptions, the estimation of Chl b did not improve when predictors were selected prior to modeling with PLS (Figure 2b); the R²Val was higher than for Chl a (All: 0.69; at-FI: 0.43; at-WD: 0.62; at+5-FI: 0.71; at+5-WD: 0.47). Higher R²Val was found in Chl a/b with PLS (Figure 2d); All: 0.87 (W/O), at-FI: 0.85 (W/O), at-WD: 0.80 (GA), at+5-FI: 0.84 (MC), and at+5-WD: 0.72 (GA).
In the case of MLR, and irrespective of the variable and the environmental conditions, the estimations were always higher when a wavelength selection process was considered.

Leaf Gas Exchange (A, gs, E, and Ci)
Although PLS generally proved to be a better approach to the estimations of leaf gas exchange, none of the modeling analyses were able to reach coefficients of determination higher than 0.44.
Among the four variables studied, the estimation of the CO2 assimilation rate (A) had the highest coefficients of determination under each condition (Figure 3a). Using PLS, when all environments were considered as one (All), there were no substantial differences between the wavelength selection methodologies (0.34 < R²Val < 0.38), while the at-FI, at+5-FI, and at+5-WD models were improved by GA (0.44, 0.43, and 0.43, respectively). For the estimation of A under at-WD, the highest R²Val (0.22) was obtained by performing a double selection (GA+MC PLS). In the case of MLR, wavelength selection was always better than the whole spectrum (MLR W/O).
Similar to A, when all environmental conditions were combined, the estimation of gs did not vary among the PLS modeling approaches (0.24 < R²Val < 0.29) (Figure 3b). Wavelength selection by GA was the best methodology for stomatal conductance under at-FI and at-WD (0.33 and 0.35, respectively), and double selection (MC+GA PLS) was best for at+5-FI and at+5-WD (0.31 and 0.06, correspondingly). Again, estimates by MLR were always higher when a wavelength selection procedure was considered, although it resulted in lower coefficients of determination than PLS.
Regarding E, wavelength selection prior to PLS improved the estimates (Figure 3c). A double selection by MC+GA increased R²Val when all environments were combined (All) and also under at-FI and at+5-FI (0.23, 0.43, and 0.43, respectively), while GA+MC improved the estimation under at-WD (0.27). When the double stress (at+5-WD) was present, the R²Val was very low, regardless of either the PLS or MLR methodology.
When the internal CO2 concentration (Ci) was estimated for All, at-FI, and at-WD, the best approaches were from modeling with PLS and considering a previous selection by GA+MC (R²Val = 0.21, 0.23, and 0.35, respectively) (Figure 3d). In the case of plants subjected to heat stress alone (at+5-FI), selection by MC and modeling with PLS gave the highest coefficients of determination (0.41). The selection by GA+MC and subsequent modeling with MLR was the best methodology to estimate the Ci in blueberry plants growing under the double stress (water deficit and heat stress) (R²Val = 0.34).

Modulated Chlorophyll a Fluorescence [Y(II), qN, qP, ETRmax, IK, and Alpha]
PLS models showed higher R²Val than MLR. With the exception of ETRmax and IK, the estimates of these variables had an R²Val lower than 0.26 (Figure 4).
When data of the effective photochemical quantum yield of photosystem II of all environments were combined (All), there were no major disparities in the estimates of [Y(II)] by GA PLS or MC+GA PLS (R²Val = 0.11 and 0.10, respectively) (Figure 4a). Double selection by MC+GA also improved the estimation under at-FI, at-WD, and at+5-FI (0.17, 0.26, and 0.17, respectively). In the case of at+5-WD, W/O and MC+GA did not show large differences (0.07 and 0.06, respectively). Regarding qN, MC+GA PLS was the best estimation methodology under All, at-FI, and at+5-FI (R²Val = 0.18, 0.22, and 0.22, respectively) (Figure 4b). In the case of at-WD, PLS W/O generated coefficients of determination of 0.14, whereas at+5-WD was lower than 0.1 (MLR W/O). For qP, R²Val values higher than 0.1 were obtained under at-FI (GA+MC PLS: 0.15), at-WD (GA PLS: 0.17), and at+5-FI (MC+GA PLS: 0.11) (Figure 4c).
The best approach when modeling ETR max (Figure 4d) in All and at-WD was PLS W/O (R 2 Val = 0.54 and 0.40, correspondingly).For at-FI and at +5 -FI, there were no major differences between PLS W/O and MC+GA PLS (~0.60, and 0.28, respectively).In regard to at +5 -WD there was no variation between PLS W/O and MC PLS (R 2 Val = 0.53).With respect to IK (Figure 4e) under All and at-FI, modeling with PLS W/O resulted in the best estimates (R 2 Val = 0.35 and 0.59, respectively).GA PLS was the best wavelength selection method for at-WD (R 2 Val = 0.43), whereas MC+GA PLS was best for at +5 -FI and at +5 -WD (R 2 Val = 0.56 and 0.35, correspondingly).
Finally, Alpha was the trait with the lowest estimates, independent of the wavelength selection procedure and the modeling methodology considered (R 2 Val < 0.1).
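For readers less familiar with the validation statistics reported above, the following minimal sketch shows one way they can be computed. It assumes that R²Val is the squared Pearson correlation between observed and predicted values of the validation set; the alternative definition, 1 - SSres/SStot, gives different values when predictions are biased, and this section does not state which one was used. The function names and the numbers in the usage note are hypothetical.

```python
from math import sqrt


def r2_val(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: this is the definition of R2 used for validation; it is
    not confirmed by the text.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

Under this definition, predictions that are perfectly correlated with the observations but systematically biased (for example, observed 1, 2, 3, 4 against predicted 2, 4, 6, 8) still yield R² = 1, which is why RMSE is usually reported alongside it, as in Table S2.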

Determining the Environmental Effects in the Leaf Spectral Signature
When the spectral signatures of each genotype were compared (Figure 5a,c,e,g,i; Table S3), differences between environments were clearer between those blueberries growing with and without heat stress (red and blue lines, respectively); dissimilarities were more consistent from ~740 nm onwards. Within each heat condition, differences between environments were difficult to examine. On the other hand, the proposed methodology to contrast the control spectral signature (at-FI) with the characteristic reflectance under each adverse condition (at-WD, at+5-FI and at+5-WD) allowed the magnification of the differences between the compared reflectances (Figure 5b,d,f,h,j). In terms of magnitude, a greater number of differences were observed with at+5-FI (black lines) and at+5-WD (red lines) than in an environment without heat stress (at-WD; blue lines) (Figure 5b,d,f,h,j).
When the averaged reflectances under the adverse conditions were compared with that of the control (Figure 5b,d,f,h,j), a larger range of differences was observed, recognizing four comparable stretches: (i) ~350-710 nm: V. ashei 'Bonita' (Figure 5j) behaved differently to V. corymbosum cultivars (Figure 5b,d,f,h); V. corymbosum was more reflective than the control, whereas V. ashei showed no differences between environments; (ii) ~710-1450 nm: before the peak at 1450 nm, the responses varied according to the treatment and cultivar; this was the only section where all the cultivars proved to respond differently among the conditions studied. In general, the differences between at+5-FI and at+5-WD were minor in the southern V. corymbosum 'Bluecrisp' and 'Star' (Figure 5f,h) and V. ashei 'Bonita' (Figure 5j), but larger in the northern V. corymbosum 'Bluegold' and 'Liberty' (Figure 5b,d). In relation to at-WD, due to its lesser distance from at-FI, the northern V. corymbosum 'Bluegold' (Figure 5b) and V. ashei 'Bonita' (Figure 5j; blue line) appeared to be less influenced by water deficit than the rest of the cultivars; (iii) ~1450-1880 nm: the spectral signatures differed in a small region common to all cultivars, the plateau between both extremes (~1570-1700 nm), in which a greater number of different patterns between environments was observed. The southern V. corymbosum 'Star' (Figure 5h) was notable because it showed the largest differences between at-FI and at-WD, and when heat stress was present its spectral signatures were not influenced by the water content in the pots (at+5-FI and at+5-WD; black and red lines); and (iv) ~1880-2500 nm: in the significant common section (2100-2250 nm), the 'Star' cultivar (Figure 5h) under at-WD differed the most from the control, and when heat stress was present, the spectral signatures of at+5-FI and at+5-WD were relatively similar. Additionally, V. corymbosum 'Bluegold' (Figure 5b) and V. ashei 'Bonita' (Figure 5j) seemed to be less influenced than the rest of the cultivars when they were grown under at-WD.
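The comparison against the control described above can be sketched in a few lines. The example below is illustrative only: it assumes that the difference is computed as the stressed signature minus the control (the direction of the subtraction is not stated in this section), that the per-wavelength p-values come from an ANOVA such as the one summarized in Table S3, and that non-significant wavelengths are simply left blank, as the interrupted lines in Figure 5 suggest. The function names are hypothetical.

```python
def mean_signature(spectra):
    """Average several spectra (equal-length lists) wavelength by wavelength.

    Can be applied twice: first to the scans of a leaf, then to the
    replicates of an environment.
    """
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


def difference_spectrum(control, stressed, p_values, alpha=0.05):
    """Stressed minus control reflectance at each wavelength.

    Wavelengths whose p-value is not below alpha are returned as None,
    mimicking the interruptions in the plotted difference lines.
    Assumption: the sign convention (stressed - control) is illustrative.
    """
    return [
        s - c if p < alpha else None
        for c, s, p in zip(control, stressed, p_values)
    ]
```

For instance, with a control of [0.10, 0.45], a stressed signature of [0.12, 0.50] and p-values of [0.20, 0.01], only the second wavelength is retained, with a difference of about 0.05 reflectance units.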

Discussion
The incorporation of physiological assessments in plant breeding programs will become increasingly important, not only for the estimation of specific traits, but also for the simultaneous integration of a number of different characters, improving the chances of identifying cultivars well-adapted to the more challenging environments predicted for the future [3,4].
Studies in which physiological traits have been estimated by spectral reflectance are varied in terms of equipment used (e.g., manufacturer, resolution and spectral sample, reproducibility and accuracy), methodology of reflectance measurements (e.g., calibration procedure, distance to the measured object, use and intensity of light source, angle of measurement, screening of a single spot or moving the fiber across the plot, number of scans per plot, and integration time), and data analysis (e.g., the criterion of elimination of spectral noise, with or without prior wavelength selection, and linear and nonlinear approximations). Therefore, comparing results between studies is not only difficult but, for the same reasons, often inappropriate.
Methodologically, the first point of interest is the modeling criterion of evaluating the environmental conditions together (All), which usually performs better when higher coefficients of determination are sought by increasing the trait range [20]. Interestingly, the results of the present study suggest that blueberry models, in general, had a better performance when the environments were considered separately, at least when a proximal approach (leaf clip with light source) is used. When [17] and [20] estimated several physiological and productive traits by non-proximal spectral reflectance (80 cm above the canopy), they concluded that the estimations were always improved when data from the contrasting water supply conditions (fully irrigated, mild and severe water deficit) were combined.
In most instances (i.e., variables, environments, wavelength selection criterion, and modeling methodology), PLS was superior to MLR. In this regard, the PLS method transforms the space of spectral properties so that the resulting factors represent the maximum variation in the covariance of the variable being evaluated. This produces efficient data compression and therefore a better calibration model compared to other linear statistical methods [48-50]. Furthermore, the PLS method is based on a projection of the predictors (x) and response (y) variables into a set of latent variables (or PLS factors) and corresponding scores, minimizing the dimensionality of the data while maximizing the covariance between x and y variables [27].
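The projection idea can be made concrete with a small sketch of the first latent variable of a single-response PLS model (PLS1). This is not the implementation used in the study: it extracts only one component, uses plain Python lists instead of a numerical library, and all names are hypothetical. The weight vector is proportional to the covariance between each wavelength and the trait, which is what makes the resulting score the direction of maximum covariance with the response.

```python
def pls1_first_component(X, y):
    """Fit a one-component PLS1 model.

    X: list of samples, each a list of reflectance values per wavelength.
    y: list of trait values, one per sample.
    Returns the unit-length weight vector and a predict(row) function.
    """
    n, p = len(X), len(X[0])
    # Center predictors and response.
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weights proportional to X'y, i.e. the covariance of each wavelength
    # with the trait, normalized to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores: projection of each centered spectrum onto the weights.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centered response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(row):
        score = sum((row[j] - x_means[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return w, predict
```

Even when two wavelengths are perfectly collinear, as neighboring bands often nearly are, the single latent variable is well defined, whereas an ordinary least squares fit on the same two columns would not have a unique solution. This is one intuition for why PLS tolerates full-spectrum input better than MLR.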
On the other hand, as expected, MLR W/O was not a suitable methodology for modeling spectral data. A prior wavelength selection, especially MC, commonly improved the estimation (R²Val values) of the different variables analyzed (Figures 1-4 and Table S2). Although the MLR R²Val values were lower than those of PLS, there are studies where a previous selection of wavelengths improved the predictive ability of MLR over PLS, or at least both methodologies were of equal strength [30,40,41].
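To illustrate why removing collinear wavelengths helps MLR, the sketch below applies a simple greedy filter: a wavelength is kept only if its absolute Pearson correlation with every wavelength already kept is below a threshold. This is an assumed, generic procedure for illustration; the exact multicollinearity (MC) criterion used in the study is not described in this section, and the threshold of 0.95 and all names are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A constant band carries no correlation information; treat as 0.
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def select_low_collinearity(bands, threshold=0.95):
    """Greedy collinearity filter (illustrative, not the authors' method).

    bands: dict mapping wavelength (nm) to its reflectance values
    across samples. Wavelengths are scanned in ascending order.
    """
    kept = []
    for wl in sorted(bands):
        if all(abs(pearson(bands[wl], bands[k])) < threshold for k in kept):
            kept.append(wl)
    return kept
```

With reflectance measured at 700, 701 and 1450 nm, where the first two bands vary together across samples, the filter would keep 700 and 1450 nm and discard 701 nm, reducing the number of predictors passed to MLR.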
The variables analyzed in this study are considered key for physiological breeding approaches [51], many of which have not previously been modeled by the use of spectral reflectance, especially in fruit breeding programs [3,4]. Among the characters delivering valuable information about plant physiological status, photosynthetic pigments are probably the most extensively studied, especially the chlorophylls, where a strong association with the nitrogen content has been described [10,52,53]. Independent of the plant species studied or the modeling methodology utilized for predicting chlorophyll concentration (Chl a, Chl b or Chl total), the coefficients of determination reported in the literature are usually high (R²Val > 0.65) [10,41,48,54,55], which is consistent with the results presented in this work. Interestingly, the higher R²Val values that were obtained for the estimation of Chl a/b values are also superior to those documented by other authors, which have been generated either via predictive models or through the study of SRIs [56]. The chlorophyll a and b ratio (Chl a/b) is a good indicator of stress in higher plants [57,58]. Stress conditions generate oxidative damage, causing a decrease in the chlorophyll content due to degradation of chlorophylls, deficiency in chlorophyll synthesis, and also because of changes in the thylakoid membrane structure [59]. Low values of Chl a/b in leaves under drought conditions are caused by higher Chl a degradation rates compared to Chl b due to the conversion of Chl a to Chl b by the oxidation of the methyl group on ring II to an aldehyde group [60,61]. In this sense, using the same individuals assessed in the current study, [47] proved that Chl a/b was relevant in identifying blueberry genotypes subjected to heat stress.
Due to the time involved in screening plant water status via pressure chambers, there has been significant effort invested in estimating SWP by remote sensing technologies [19]. In this study, the R²Val values of the SWP under individual environmental conditions were, in general, lower than those found in other species such as olives (R²Val ~0.7) [19,62] or grapevines (0.71 < R²Val < 0.84) [63-65].
the control treatment, probably also associated with the drought tolerance characteristics of this species; if this is true, it would mean that, in the case of 'Bluegold', the similarity of the spectral signatures of at-FI and at-WD (blue line in Figure 5b) could be due to a greater capacity to tolerate a lack of water. These preliminary results would indicate that the presented methodology (changes relative to the control) could be considered for the identification of tolerant/susceptible genotypes for a specific environmental condition.

Conclusions and Future Perspectives
In general, PLS showed better results than MLR, regardless of the methodology used in the selection of the wavelengths. Considering the low predictive values of MLR, with few exceptions, there was an increase in trait estimation (R²Val) using MC over GA for predictor selection. Chl a/b, ETRmax and IK were among the variables with the highest coefficients of determination of validation, which contrasted with the poor estimations reached for Alpha, qP, qN and Y(II). The results of this study reaffirm that, by the modeling of the spectral reflectance, it is possible to assess some key traits (e.g., Chl a/b, ETRmax and IK) that could begin to be used in blueberry breeding programs oriented to adaptation to the new challenging environmental scenarios.
Even if it could be argued that it is disadvantageous to work with potted plants under controlled environments and reflectance assessed by a proximal approach, as performed by other authors in maize and wheat (Yendrek et al., 2017; Silva-Perez et al., 2018) [15,86], there are still some important advantages to highlight: (i) the minimum spectral noise compared to a non-proximal measurement helps to identify the potential of this technology in fruit breeding programs; (ii) under field conditions, it is difficult to study spectral signature changes of isolated and combined environmental conditions (drought and heat); (iii) in the case of commercial blueberry production, an important part of the southern highbush cultivars are grown in pots with peat moss and in greenhouses [87,88], primarily subjected to high temperature conditions, so breeders are also programming crosses for these types of environments; and (iv) assessments are considerably faster than the classic eco-physiological measurements, allowing the screening of a much higher number of genotypes in a short period of time [15,86]. The selected genotypes can then be tested in more sophisticated field trials as part of the advanced variety selection [4,89].
For increasing measurement speed and to allow integration of a larger proportion of the canopy, future efforts should consider a non-proximal approach. To the extent that the use of phenomics is of major relevance and a greater number of seasons are available for modeling, the implementation of a completely independent validation of calibration models should be undertaken. As proposed by [20], a multivariate classification modeling approximation (e.g., PLS-DA) should also improve the predictive capacity, at least for the group of elite genotypes.
The preliminary results from the study of the spectral signature with respect to the control treatment could represent an easier and more direct way to evaluate the tolerance or susceptibility to a specific environmental condition. In this sense, continuous measurements of the spectral signature through periods of adverse conditions could help us gain better identification and hence understanding of the wavelengths involved and the magnitudes associated with the particular condition.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2072-4292/11/3/329/s1, Table S1: Number of predictors after a wavelength selection process (W/O: no-wavelength selection; MC: multicollinearity; GA: genetic algorithms; and their combination: MC+GA or GA+MC) for each trait and environmental condition (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; at+5-WD: with water stress and heat stress; and All: all environments combined), Table S2: Coefficients of determination (R²) and root mean square error (RMSE) of calibration (Cal) and validation (Val) for each trait in different environments (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; at+5-WD: with water stress and heat stress; and All: all environments combined) and modeled by partial least squares (PLS) and multiple linear regression (MLR), considering five wavelength selection methods: without selection (W/O) or full spectrum, multicollinearity (MC), genetic algorithms (GA), and the combinations MC+GA or GA+MC, Table S3: P-values of the analyses of variance (ANOVAs) for each cultivar and wavelength (nm), for the averaged spectral signature under four contrasting environments: without water stress or heat stress, only water stress, only heat stress, and water and heat stress. For each cultivar, the ten scans per leaf were averaged to generate the spectral signature of the replica. Then, the six replicates were again averaged, constituting the spectral signature of the evaluated environment. P-values below 0.05 are in bold font. Figure S1: Temperature records (11:00 and 17:00 h) in the ambient temperature (at) and elevated temperature (at+5) greenhouses during the days of measurement. The solid line represents the temperature in at, while the line with crosses represents the temperatures in at+5. The line with circles represents the differences between the temperatures of the two greenhouses (at - at+5).

Figure 1. Coefficients of determination of the validation (R²Val) process of the estimation of stem water potential in different environments (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; at+5-WD: with water stress and heat stress; All: all environments combined) and modeled by partial least squares (PLS) and multiple linear regression (MLR), considering five wavelength selection methods: without selection (W/O) or full spectrum, multicollinearity (MC), genetic algorithms (GA), and the combinations MC+GA or GA+MC.

Figure 2. Coefficients of determination of the validation (R²Val) process of the estimation of chlorophyll (Chl) a (a), Chl b (b), total Chl (c) and Chl a/b (d), in different environments (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; at+5-WD: with water stress and heat stress; All: all environments combined) and modeled by partial least squares (PLS) and multiple linear regression (MLR), considering five wavelength selection methods: without selection (W/O) or full spectrum, multicollinearity (MC), genetic algorithms (GA), and the combinations MC+GA or GA+MC.

Figure 3. Coefficients of determination of the validation (R²Val) process of estimation of CO2 assimilation rate (a), stomatal conductance (b), transpiration rate (c) and internal CO2 concentration (d), in different environments (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; at+5-WD: with water stress and heat stress; All: all environments combined) and modeled by partial least squares (PLS) and multiple linear regression (MLR), considering five wavelength selection methods: without selection (W/O) or full spectrum, multicollinearity (MC), genetic algorithms (GA), and the combinations MC+GA or GA+MC.

Figure 4. Coefficients of determination of the validation (R²Val) process of estimation of the photochemical quantum yield of photosystem II (a), the coefficient of non-photochemical fluorescence quenching (b), the coefficient of photochemical fluorescence quenching (c), the maximum electron transport rate (d), the irradiance at which the electron transport rate is saturated (e) and the initial slope of the fast light curve (f), in different environments (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; at+5-WD: with water stress and heat stress; All: all environments combined) and modeled by partial least squares (PLS) and multiple linear regression (MLR), considering five wavelength selection methods: without selection (W/O) or full spectrum, multicollinearity (MC), genetic algorithms (GA), and the combinations MC+GA or GA+MC.

Figure 5. Spectral signatures (a,c,e,g,i) of V. corymbosum ('Bluegold', 'Liberty', 'Bluecrisp', and 'Star') and V. ashei ('Bonita') growing under four environmental conditions (at-FI: without water stress or heat stress; at-WD: only water stress; at+5-FI: only heat stress; and at+5-WD: with water stress and heat stress). Comparisons between control (at-FI) and each environmental condition are represented by the subtraction of reflectance at each wavelength (b,d,f,h,j; at each wavelength, interruptions in the lines indicate sections without statistical differences, p < 0.05, between the environments).