#### 3.2. Analysis for Sensitive Wavebands and Optimal Vegetation Indices for Breeding Line Yield-Prediction

To identify the hyperspectral reflectance wavebands sensitive to yield, the yields of 2ndYYT 2015 (Figure 3A), 1stYYT 2015 (Figure 3B) and the NJRIKY test 2015 (Figure 3C) were analyzed against their corresponding average hyperspectral data at R2, R4, R5 and R6. Across the tests, the wavelengths with the maximum and minimum correlation coefficients between spectral reflectance and seed yield fell in 750~950 nm and 454~710 nm, respectively (Figure 3).
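
The band-wise screening described above can be illustrated with a minimal sketch that computes the Pearson correlation between plot-mean reflectance and yield for each waveband. The function names and the toy reflectance and yield values are hypothetical and serve only to show the calculation; they are not data from the tests.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def band_correlations(reflectance, yields):
    """Map each wavelength (nm) to its correlation with plot yield."""
    return {wl: pearson(vals, yields) for wl, vals in reflectance.items()}


# Hypothetical toy data: 4 plots, 2 wavebands (illustrative values only).
yields = [2.0, 2.5, 3.0, 3.5]
reflectance = {
    670: [0.08, 0.07, 0.06, 0.05],  # red band: falls as yield rises
    800: [0.40, 0.45, 0.50, 0.55],  # near-infrared band: rises with yield
}

corr = band_correlations(reflectance, yields)
most_positive = max(corr, key=corr.get)  # band with the maximum coefficient
most_negative = min(corr, key=corr.get)  # band with the minimum coefficient
```

With real data, `reflectance` would hold one entry per hyperspectral band, and the two extremes would mark the most positively and most negatively correlated regions of the spectrum.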

Based on the 2ndYYT 2015 (Figure 4A) and 1stYYT 2015 (Figure 4B) data, contour maps of the determination coefficients of the linear regressions between yield and the two-band NDVI and RVI at the R5 stage were established using the “plsregress” function in MATLAB. The dark red area represents the zone of highest correlation, and the most sensitive bands for yield-prediction were concentrated in the range of 550 nm to 750 nm.
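
As a rough illustration of how such a two-band map can be built, the sketch below evaluates every ordered band pair, computes the index for each plot, and stores the R^{2} of its simple linear regression against yield. It assumes the standard two-band forms NDVI = (R1 − R2)/(R1 + R2) and RVI = R1/R2, which are not spelled out in this section, and it uses the squared Pearson correlation rather than MATLAB's “plsregress”. The toy values are hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no information
    return sxy * sxy / (sxx * syy)


def ndvi(r1, r2):
    """Two-band normalized difference index (assumed standard form)."""
    return (r1 - r2) / (r1 + r2)


def rvi(r1, r2):
    """Two-band ratio index (assumed standard form)."""
    return r1 / r2


def two_band_r2(reflectance, yields, index):
    """R^2 between yield and the index for every ordered pair of bands."""
    grid = {}
    for b1 in sorted(reflectance):
        for b2 in sorted(reflectance):
            if b1 == b2:
                continue
            values = [index(a, b) for a, b in zip(reflectance[b1], reflectance[b2])]
            grid[(b1, b2)] = r_squared(values, yields)
    return grid


# Hypothetical toy data: 5 plots, 3 wavebands (illustrative values only).
yields = [2.0, 2.4, 2.9, 3.3, 3.8]
reflectance = {
    550: [0.10, 0.11, 0.09, 0.12, 0.10],
    670: [0.08, 0.07, 0.06, 0.05, 0.04],
    800: [0.40, 0.44, 0.47, 0.52, 0.55],
}

grid_ndvi = two_band_r2(reflectance, yields, ndvi)
grid_rvi = two_band_r2(reflectance, yields, rvi)
best_ndvi_pair = max(grid_ndvi, key=grid_ndvi.get)
```

Plotting `grid_ndvi` or `grid_rvi` over the two wavelength axes gives a contour map of the kind described above. For NDVI the map is symmetric, because swapping the two bands only flips the sign of the index.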

The relationships between vegetation indices and yield at single growth stages, analyzed in MATLAB, are listed in Table 3. The sensitive bands of the 1stYYT 2015 at the R2, R4, R5 and R6 growth stages were 750 nm and 770 nm, 750 nm and 770 nm, 634 nm and 674 nm, and 550 nm and 710 nm, respectively, while those of the 2ndYYT 2015 were 482 nm and 590 nm, 514 nm and 606 nm, 514 nm and 606 nm, and 550 nm and 710 nm, respectively. This indicated that the sensitive bands varied greatly between the two breeding line tests at the same growth stage, and also varied among growth stages within the same yield-test.

Table 3 also shows that the yields of the two tests were both highly correlated with canopy reflectance at the R5 stage, with maximum R^{2} values of up to 0.68 and 0.50, respectively. Therefore, the best growth stage at which to collect UAV hyperspectral reflectance data for yield-prediction using vegetation indices was R5, and the sensitive spectral bands for soybean yield-prediction lay within 454~850 nm. The other growth stages ranked below R5 in the order R2, R6 and R4. When the 10 vegetation indices were ranked by their determination coefficients for each growth stage in the two yield-tests, NDVI and RVI always ranked in the top two (Table 3). Since NDVI and RVI based on the filtered, optimized bands were the two most sensitive indices, they were selected for establishing the yield-prediction models in this research.

#### 3.3. Optimized Reflectance-Sampling Unit-Size for Organizing the UAV Hyperspectral Reflectance Data

From the UAV reflectance data set of the breeding lines, the hyperspectral data of each plot were obtained using the vector image georeferenced with the hyperspectral image. Twenty-one reflectance-sampling unit-sizes were designed using ENVI combined with IDL (Table S3); each plot image and vector map at each spatial scale were read, and the 21 datasets of average spectral reflectance per plot were extracted (Figure S2). The canopy spectral reflectance did not differ significantly among the spatial sampling unit areas in the 550~750 nm visible bands, but differed significantly in the 750~850 nm near-infrared region. To select the best sampling unit-size for hyperspectral reflectance and eliminate plot marginal effects, the hyperspectral reflectance plot data of 2ndYYT 2015, 1stYYT 2015 and NJRIKY at the R5 growth stage were used. The CVs of the red band, near-infrared band, NDVI, RVI and VOG1 of the three tests were calculated from the spectral information extracted with the 21 reflectance-sampling unit-sizes. The smaller the coefficient of variation (CV), the better the reflectance-sampling unit-size.

Figure 5 shows that the CVs of the red band, near-infrared band, NDVI, RVI and VOG1 ranged over 0.15~0.18, 0.16~0.18, 0.13~0.14, 0.01~0.02 and 0.05~0.06 for 2ndYYT 2015; 0.12~0.15, 0.11~0.15, 0.15~0.20, 0.03~0.04 and 0.04~0.05 for 1stYYT 2015; and 0.83~0.98, 1.10~1.19, 0.37~0.48, 0.05~0.07 and 0.05~0.05 for NJRIKY. The CV was larger when the sampling unit area was very small or very large, probably because a too-small unit is subject to fluctuations while a too-large unit includes the marginal effect of the plot. However, the CVs of the band values and vegetation indices differed only slightly among the 21 sampling areas. The reflectance-sampling unit-areas with stable CVs were approximately 2.1~8.1 m^{2}, 1.2~5.2 m^{2} and 1.0~2.7 m^{2} for 2ndYYT 2015, 1stYYT 2015 and NJRIKY, respectively (Figure 5). Thus, when the sampling unit-size was about 20% to 80% of the total plot area, the canopy reflectance data obtained could be used for plot-yield prediction. In establishing the prediction models below, the upper end of the optimal sampling unit-area range was preferred, since all the hyperspectral data can be obtained from one flight and no additional expense is needed.
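
The CV comparison across sampling unit-sizes can be sketched as follows. The sketch assumes that the CV is the sample standard deviation divided by the mean of the plot values, and that a unit-size counts as "stable" when its CV lies within a chosen tolerance of the smallest CV. Both the tolerance and the toy NDVI values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV as sample standard deviation over mean (assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values)


def cv_by_unit_size(samples):
    """Map each sampling unit area (m^2) to the CV of its plot values."""
    return {area: coefficient_of_variation(vals) for area, vals in samples.items()}


def stable_unit_sizes(cvs, tolerance):
    """Unit areas whose CV is within `tolerance` of the smallest CV."""
    best = min(cvs.values())
    return sorted(area for area, cv in cvs.items() if cv - best <= tolerance)


# Hypothetical NDVI values of 4 plots at 4 sampling unit areas (m^2).
ndvi_samples = {
    0.5: [0.60, 0.80, 0.70, 0.90],  # very small unit: noisy
    2.0: [0.74, 0.76, 0.75, 0.75],
    5.0: [0.73, 0.77, 0.75, 0.75],
    9.0: [0.65, 0.85, 0.70, 0.80],  # very large unit: plot margins included
}

cvs = cv_by_unit_size(ndvi_samples)
stable = stable_unit_sizes(cvs, tolerance=0.02)
```

In this toy case the two intermediate unit areas are retained, mirroring the pattern in which the CV rises at both the small and the large end of the range.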

#### 3.4. Identification of Major Factors for the Establishment of Plot-Yield Prediction Models

In establishing the plot-yield prediction models, each material set was separated into two subsets for mutual checks, which was done automatically in MATLAB. The materials, 1,103 lines in total, were organized and coded as 1stYYT 2015 (A1 + B1), 2ndYYT 2015 (A2 + B2) and 2ndYYT 2016 (A3 + B3) (Table S1). The total of the three sets was coded as A4 + B4 (= A1 + B1 + A2 + B2 + A3 + B3), with A4 (= A1 + A2 + A3) including 551 lines and B4 (= B1 + B2 + B3) including 552 lines. The 165 lines of 1stYYT 2015 promoted to the second-year yield-test in 2016 were designated A5, while the 48 lines of the second-year yield-test in 2015 retained in the second-year yield-test in 2016 were designated B5. The 165 + 48 = 213 lines in 2015 were designated A6 and the 213 lines in 2016 were designated B6; therefore, A5 + B5 = A6 + B6 = 426 lines. The prediction models were constructed based on A1 + B1, A1, B1, A2 + B2, A2, B2, A3 + B3, A3, B3, A4 + B4, A4, B4, A5, B5, A6 + B6, A6 and B6, a total of 17 material groups (Tables S1, S4 and S5).

The 17 material sets were used to screen for major factors to be included in the yield-prediction models. Exponential, linear and logarithmic regressions on one vegetation index (RVI or NDVI) at R5 were established using Excel 2007 (Tables S4 and S5). The results showed that the difference in R^{2} between RVI and NDVI was not significant, and that the R^{2} values of the linear function were somewhat larger and more stable across all material sets. Among the models in Table S4, the linear regressions y = 3E-05x + 0.6526 (x = RVI (618, 674)) and y = −2E-05x + 0.2055 (x = NDVI (618, 674)) both had an R^{2} of 0.61 for A1 + B1 (1stYYT 2015), and the two linear regressions on NDVI or RVI both had an R^{2} of 0.49 for A2 + B2 (2ndYYT 2015). A similar situation was observed for the other material groups, such as A1, B1, A2 and B2, which indicates that both NDVI and RVI were relevant to the construction of plot-yield prediction models. On this basis, a linear function of two vegetation indices (NDVI and RVI) at the R5 stage was established for the second round of yield-prediction model assessment (Table S6).
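
The comparison of the three functional forms can be sketched as below. The sketch assumes the common trendline convention in which the exponential form is fitted as ln(y) against x and the logarithmic form as y against ln(x), with R^{2} taken in the transformed space; whether Excel 2007 was used in exactly this way is not stated in the text. The toy index and yield values are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Least-squares slope, intercept and R^2 of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy * sxy / (sxx * syy)


def compare_forms(index_values, yields):
    """R^2 of linear, exponential and logarithmic fits (inputs must be positive)."""
    log_x = [math.log(v) for v in index_values]
    log_y = [math.log(v) for v in yields]
    return {
        "linear": linear_fit(index_values, yields)[2],
        "exponential": linear_fit(index_values, log_y)[2],  # ln(y) = a + b * x
        "logarithmic": linear_fit(log_x, yields)[2],        # y = a + b * ln(x)
    }


# Hypothetical toy data in which yield is exactly linear in the index.
rvi_values = [2.0, 3.0, 4.0, 5.0, 6.0]
yields = [1.5, 2.0, 2.5, 3.0, 3.5]

r2 = compare_forms(rvi_values, yields)
best_form = max(r2, key=r2.get)
```

Because the toy yields were generated from a straight line, the linear form scores highest here; with field data the three R^{2} values would typically be much closer to one another.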

#### 3.5. Establishment and Evaluation of Yield-Prediction Models Using Normalized Difference Vegetation Index (NDVI) and Ratio Vegetation Index (RVI) at R5

The second-round yield-prediction models were established from the 17 material groups and are listed in Table 4 (the model equations are listed in Table S7). As indicated above, the program took a random half of the lines to establish the yield-prediction model and the other half to validate it. Linear models composed of NDVI and RVI at R5 were established for each of the 17 material groups. In Table 4, the established models are evaluated by their modelling precision, comprising the modelling determination coefficient R_{M}^{2} and the modelling root mean square error (RMSE_{M}), and by their verification precision, comprising the verification determination coefficient R_{V}^{2} and the verification root mean square error (RMSE_{V}). For a comprehensive evaluation that balances modelling and verification, the two parts were summed as R_{S}^{2} and RMSE_{S}, respectively. In Table 4, the model M_{A1} had the largest R_{S}^{2} (1.30), followed in turn by M_{A1+B1}, M_{B1}, M_{A5}, M_{A2}, M_{A6} and M_{B5} with R_{S}^{2} of 1.21, 1.19, 1.19, 1.13, 1.12 and 1.06. The corresponding RMSE_{S} values of these seven models were 0.541, 0.651, 0.740, 0.580, 0.503, 0.519 and 0.674, respectively. These models were established with modelling sample sizes of 48 to 266 lines from a single yield-test. For the models M_{A4}, M_{B4} and M_{A4+B4}, based on modelling sample sizes of 275~551 lines drawn from three sets of yield-tests, the R_{S}^{2} values were all 0.91 and the RMSE_{S} values were 0.724, 0.802 and 0.819, respectively. The other models were inferior to the above ones in precision.
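
The evaluation scheme above (random half for modelling, the other half for verification, then summed statistics) can be sketched in a few lines. This is an illustration under stated assumptions, not the MATLAB procedure used in the study: the model is fitted by ordinary least squares through the normal equations, R^{2} is taken as 1 − SS_res/SS_tot, and the plot data are synthetic and noise-free, with RVI derived from NDVI as (1 + NDVI)/(1 − NDVI).

```python
import math
import random


def solve(a, b):
    """Solve the square system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(features, y):
    """Least-squares coefficients [b0, b1, ...] of y = b0 + b1*x1 + ..."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, features):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], f)) for f in features]


def r2_and_rmse(y, pred):
    """R^2 as 1 - SS_res/SS_tot (assumed definition) and RMSE."""
    mean = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, pred))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y))


def evaluate_split(features, y, seed=0):
    """Fit on a random half, verify on the other half, and sum the statistics."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    model_idx, verif_idx = idx[:half], idx[half:]
    x_m, y_m = [features[i] for i in model_idx], [y[i] for i in model_idx]
    x_v, y_v = [features[i] for i in verif_idx], [y[i] for i in verif_idx]
    coef = fit_linear(x_m, y_m)
    r2_m, rmse_m = r2_and_rmse(y_m, predict(coef, x_m))
    r2_v, rmse_v = r2_and_rmse(y_v, predict(coef, x_v))
    return {"R2_M": r2_m, "RMSE_M": rmse_m, "R2_V": r2_v, "RMSE_V": rmse_v,
            "R2_S": r2_m + r2_v, "RMSE_S": rmse_m + rmse_v}


# Synthetic, noise-free plots: yield = 1.0 + 2.0 * NDVI + 0.1 * RVI.
ndvi_r5 = [0.50, 0.55, 0.60, 0.65, 0.70, 0.72, 0.75, 0.78, 0.80, 0.85]
features = [(n, (1 + n) / (1 - n)) for n in ndvi_r5]
yields = [1.0 + 2.0 * n + 0.1 * r for n, r in features]

result = evaluate_split(features, yields)
```

Since the synthetic yields contain no noise, `R2_S` approaches its upper bound of 2 and `RMSE_S` approaches 0. Values such as R_{S}^{2} = 1.30 therefore correspond to an average R^{2} of 0.65 over the modelling and verification halves.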

#### 3.6. Establishment and Evaluation of Yield-Prediction Models Using NDVI and RVI at Multiple Stages

The yield-prediction models for the 17 material sets in Table 4 involved only two vegetation indices at a single growth stage, R5. Using more vegetation indices at multiple growth stages might improve the model precision, and this was examined in MATLAB. From the 1stYYT 2015 and 2ndYYT 2015 data, all 10 vegetation indices and growth stages were screened for the best plot-yield prediction models; the maximum coefficients of determination for models with the 10 vegetation indices reached 0.69 and 0.59. Since 1stYYT 2015 (A1 and B1) was the material set from which the best model in Table 4 came, its major results are introduced here. For the yield-prediction models based on combinations of two and three growth stages, the maximum model R^{2} was 0.73 when 9 vegetation indices were involved. The best combination of three growth stages was R2, R5 and R6; when 10 vegetation index variables were introduced, the maximum model R^{2} was 0.74. As the number of growth stages and vegetation indices increased, the model R_{M}^{2} increased, but not significantly. Table S6 shows that two-stage models are better than single-stage models, and that among the two-stage models the combinations R2 and R5, then R6 and R5, and then R4 and R5 are in turn better than the others. However, the number of vegetation indices involved made little difference, so a smaller number (2 vegetation indices) was preferred for simplicity of the models.

Based on the above results, the third-round yield-prediction models were established in MATLAB for the 17 material sets with two growth stages (R5 + R4 for each material set, and R5 + R2, R5 + R6 and R5 + R4 for the A1 and A6 material sets) and two vegetation indices (NDVI and RVI), giving a total of 21 yield-prediction models, which were then evaluated further. As before, half of each set of breeding lines was used for modelling and the other half for validation. The results are summarized in Table 5 (the model equations are listed in Table S8).

The yield-prediction models composed of two vegetation indices at two growth stages were more precise than those composed of two vegetation indices at the single R5 growth stage in terms of determination coefficient (R_{M}^{2}, R_{V}^{2} and R_{S}^{2}) and root mean square error (RMSE_{M}, RMSE_{V} and RMSE_{S}) for all 17 material sets. Among the material sets, the best models were those obtained from 1stYYT 2015, i.e., M_{A1+B1-2}, M_{A1-1}, M_{A1-2}, M_{A1-3} and M_{B1-2}; the second were those from 2ndYYT 2015, i.e., M_{A2+B2-2}, M_{A2-2} and M_{B2-2}; the third were M_{A6+B6-2}, M_{A6-1}, M_{A6-2} and M_{A6-3} (but not M_{B6-2}), together with M_{A5-2} and M_{B5-2}; the fourth were those from the total of the three sets of breeding lines, i.e., M_{A4+B4-2}, M_{A4-2} and M_{B4-2}. This roughly coincides with the R5 single-stage models in that model precision depends on the source materials. Models from a single test were usually better than those from different tests even when the sample size (total number of lines) increased; for example, M_{A1+B1-2} and M_{A2+B2-2}, but not M_{A3+B3-2}, are better than M_{A4+B4-2}.

Among the 1stYYT 2015 models, the R_{S}^{2} of M_{A1-1}, M_{A1-2}, M_{A1-3}, M_{B1-2} and M_{A1+B1-2} (-1 means R5 and R2, -2 means R5 and R6, -3 means R5 and R4) were 1.41, 1.34, 1.32, 1.24 and 1.22, with RMSE_{S} of 0.457, 0.540, 0.541, 0.640 and 0.631, respectively. Among the 2ndYYT 2015 models, the R_{S}^{2} of M_{A2+B2-2}, M_{A2-2} and M_{B2-2} were 1.28, 1.17 and 1.00, with RMSE_{S} of 0.703, 0.606 and 0.603, respectively. Among the material sets with two years of data, the R_{S}^{2} of M_{A6+B6-2}, M_{A6-1}, M_{A6-2}, M_{A6-3} and M_{B6-2} were 1.03, 1.17, 1.15, 1.17 and 0.44, with RMSE_{S} of 0.680, 0.517, 0.550, 0.550 and 0.709, respectively. The R_{S}^{2} of M_{A5-2} and M_{B5-2} were 1.26 and 1.09, with RMSE_{S} of 0.622 and 0.615, respectively. Among the combined material sets, the R_{S}^{2} of M_{A4+B4-2}, M_{A4-2} and M_{B4-2} were 0.94, 0.94 and 0.93, with RMSE_{S} of 0.761, 0.653 and 0.814, respectively. From the above, the superior models were those constructed from the A1, A1+B1, B1, A2+B2, A5 and A6 material sets, and the superior growth stage combination was R5+R4, given that the best vegetation index combination was NDVI and RVI. All the models had potential for breeding line yield selection except M_{A3+B3-2}, M_{A3-2}, M_{B3-2} and M_{B6-2}, while M_{A4+B4-2}, M_{A4-2} and M_{B4-2} were kept for further checking.

#### 3.7. Further Comparison and Selection of Best-Fitted Plot-Yield Prediction Models for Yield Breeding Programs

The verification of the models in Table 5 was limited to the other half of the breeding lines in the same material set, whereas an accepted yield-prediction model is to be used for a broad range of breeding materials; these models should therefore be further validated with more breeding materials. Our method was twofold: one part evaluated the verification root mean square errors (RMSE_{V}) for all the breeding line sets tested (1,103 lines in total), and the other evaluated the coincidence between the model-predicted and the breeders’ actual yield selection results.

Table 6 shows the results of the evaluation of the verification root mean square errors (RMSE_{V}). All the models were evaluated with the three sets of yield-tested breeding lines, 1stYYT 2015 (A1 + B1), 2ndYYT 2015 (A2 + B2) and 2ndYYT 2016 (A3 + B3), and their total set (A4 + B4). The models M_{A1-2}, M_{A2+B2-2}, M_{A2-2} and M_{A6-2} had lower RMSE_{V} for all four breeding line sets in addition to higher determination coefficients in Table 5, while the models M_{A4+B4-2}, M_{A4-2} and M_{B4-2} had small RMSE_{V} for all four material sets but only medium determination coefficients in Table 5.

The results of the evaluation of the coincidence between the model-predicted and the breeders’ actual yield selection are shown in Table 7. The coincidence was good in the four material sets for the above four models (M_{A1-2}, M_{A2+B2-2}, M_{A2-2} and M_{A6-2}) selected from Table 6. After a further comprehensive comparison, the models M_{A1-2}, M_{A6-2} and M_{A4-2} showed good coincidence rates for all the selection categories (eliminated, reserved and promoted) in all the populations and were chosen for plot-yield prediction in yield breeding programs (see Table 7 and its notes for details).
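
One way to picture a coincidence rate is sketched below: each line is assigned to a selection category from its actual and from its predicted yield, and the rate for a category is the share of lines actually in that category that the model places in the same one. The threshold-based categorization and the threshold values are assumptions made for illustration only; the breeders' actual selection rules are given in Table 7 and its notes. The toy yields are hypothetical.

```python
def categorize(yield_value, reserve_at, promote_at):
    """Assign a selection category from a yield (t/ha) using two thresholds."""
    if yield_value >= promote_at:
        return "promoted"
    if yield_value >= reserve_at:
        return "reserved"
    return "eliminated"


def coincidence_rates(actual, predicted, reserve_at, promote_at):
    """Per-category share of lines whose predicted category matches the actual one."""
    totals, hits = {}, {}
    for a, p in zip(actual, predicted):
        actual_cat = categorize(a, reserve_at, promote_at)
        predicted_cat = categorize(p, reserve_at, promote_at)
        totals[actual_cat] = totals.get(actual_cat, 0) + 1
        hits[actual_cat] = hits.get(actual_cat, 0) + int(actual_cat == predicted_cat)
    return {cat: hits[cat] / totals[cat] for cat in totals}


# Hypothetical actual and model-predicted yields (t/ha) of six breeding lines.
actual_yield = [2.0, 2.4, 3.0, 3.2, 3.8, 4.1]
predicted_yield = [2.2, 2.9, 3.1, 2.7, 3.6, 4.0]

# Hypothetical thresholds for the reserved and promoted categories.
rates = coincidence_rates(actual_yield, predicted_yield, reserve_at=2.8, promote_at=3.5)
```

In this toy case half of the eliminated and half of the reserved lines are matched, while both promoted lines are matched.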

M_{A1-2} is a linear model derived from the first part of the 1stYYT 2015, a material set of 133 breeding lines with yield ranging between 1.836 and 4.680 t ha^{−1} and growth period ranging between 99 d and 112 d. M_{A4-2} is also a linear model, derived from the first part of the three sets of tests, a material set of 275 breeding lines with yield ranging between 1.656 and 4.757 t ha^{−1} and growth period ranging between 96 d and 116 d. M_{A6-2} is likewise a linear model, derived from the group of selected and retained breeding lines from 1stYYT 2015 and 2ndYYT 2015, with two years’ data of 106 breeding lines, yield ranging between 2.380 and 4.925 t ha^{−1} and growth period ranging between 101 d and 116 d. The formulae of the three recommended models and of the other prediction models are listed in Table S8 with their corresponding hyperspectral reflectance bands. The three plot-yield prediction models can be used for breeding lines in yield-test nurseries within the corresponding yield and growth period ranges; a single model or all three models can be used simultaneously in the same yield-test nursery.

In addition, the 21 models in Table 5 were also validated with the NJRIKY (A + B) population to imitate the plant-to-line selection precision. Tables S9 and S10 show that the above models M_{A1-2} and M_{A4-2} (but not M_{A6-2}) were also suitable for yield-prediction in plant-derived-line selection.