Phenotypic Variation and Relationships between Grain Yield, Protein Content and Unmanned Aerial Vehicle-Derived Normalized Difference Vegetation Index in Spring Wheat in Nordic–Baltic Environments

.


Introduction
Common wheat (Triticum aestivum L.) is a valuable and extensively cultivated cereal worldwide.Global wheat production has increased, rising from 715 million tons in 2015 to 780 million tons in 2021 [1].The significance of wheat extends to the Baltic countries (Latvia, Estonia, Lithuania) and the Nordic nations (Denmark, Norway, Sweden, Finland), where wheat production collectively accounts for approximately 7% and 5% of the total European wheat production, respectively [1,2].
Agronomy 2024, 14, 51 2 of 20 The climatic conditions in the Nordic-Baltic region facilitate the cultivation of both types of wheat-spring and winter wheat.Notably, the majority of the harvested area in the Baltic countries is dedicated to winter wheat, with only about 20% of the area allocated to spring wheat cultivation [3].Given that a significant portion of spring wheat production in the Baltic countries and Norway is earmarked for breadmaking, breeding programs in this region simultaneously prioritize traits related to productivity and baking quality, including grain yield (GY) and grain protein content (GPC) [4][5][6][7].Wheat with a moderate-to-high GPC (exceeding 12%) is considered essential for bread making, while wheat with lower GPC levels is typically used for cookies, noodles, or animal feed [8].
The GY and GPC are complex quantitative characteristics controlled by multiple genes, influenced by genotype, environment, and genotype-environment interactions.Breeders often rely on grain yield and quality, along with their contributed performance and phenotypic expression, when selecting crop cultivars under mega-environment tests [9].Abiotic stresses, such as water stress, heat, and nutritional deficiency, can negatively impact crop yield and quality [10,11].Therefore, each genotype should undergo evaluation for several years in different locations to identify high-performing and stable genotypes, recommending them for cultivation in target areas.
Wheat, as a cereal crop, has a high demand for nitrogen (N), crucial for plant growth, development, GY formation, and GPC accumulation and essential for baking and processing quality.Wheat growth has an absolute requirement for N, and crop yield and quality heavily depend on substantial N inputs [12].
Crop breeding needs to be intensified and made more efficient, utilizing new phenomics methods to predict the GY and GPC, both of which are interrelated parameters [13].Consequently, remote sensing is gradually gaining recognition as an alternative to traditional, destructive field sampling methods and laboratory testing.It serves as a tool for finding the nitrogen (N) status in wheat [14].Unmanned aerial vehicle (UAV) platforms equipped with multiple sensors, capable of swiftly scanning entire fields, have proven to be useful tools for gathering non-destructive data on the physiological state of plant canopies [15,16].Multispectral data have been analysed for UAV-based estimation of wheat grain yield under different water and nitrogen conditions and at various wheat growth stages [17,18].
The normalized difference vegetation index (NDVI), calculated from multispectral data, stands out as the most popular index for vegetation assessment [19].This index finds utility in various fields, including agriculture [20].The NDVI is valuable in estimating biomass [21], indicating plant stress [22], determining chlorophyll concentration [23], and assessing other characteristics.Numerous studies on wheat have demonstrated the potential use of NDVI readings as a tool to differentiate and identify high-yielding wheat genotypes, even in different nitrogen environments [24][25][26].
There is a lack of published results on the agronomic and UAV remote sensing phenotyping of spring wheat genotypes in multi-environment experiments, particularly within the unique conditions of the Baltic region, which would offer valuable insights for breeding programs.The aim of this study was to identify the best performing and stable spring wheat genotypes for grain yield and grain protein content in different Nordic-Baltic environments.
The objectives of this study were to: • Assess the variation in the GY, GPC, and UAV-derived NDVI across different years and nitrogen fertilization rates in the Nordic-Baltic region; • Evaluate the performance and stability of spring wheat genotypes concerning the GY and GPC under varying N fertilization rates and diverse agrometeorological conditions;

•
Explore the quantitative relationships between GY, GPC, and UAV-derived NDVI variables across multi-environmental trials during different plant growth stages.
Using UAV phenotyping technologies combined with field evaluations, we intended to determine whether superior genotypes can be accurately identified with this methodology, and at which stages of plant development the assessment should be performed.

Plant Materials
In this experiment, sixteen spring wheat genotypes were used, which include cultivars and improved breeding lines mostly developed in the last decade in the spring wheat breeding programs of the Baltic States and Norway.These programs include the Agricultural Resources and Economics Institute (Latvia, LV), the Lithuanian Agricultural and Forestry Research Centre (Lithuania, LT), and the Centre of Estonian Rural Research and Knowledge (Estonia, EE), as well as the Norwegian University of Life Sciences (Norway, NO) and Graminor (Norway).Detailed information on the selected spring wheat genotypes and their origin and descriptions are given in Table S1.S2.

Field Trials
In each location, spring wheat genotypes were sown in field trials set up with two contrasting nitrogen (N) fertilization rates applied before sowing, i.e., 75 and 150 kg ha −1 N of compound NPK fertilizer, referred to as low N (N75) and high N (N150) rates, respectively.The high N treatment represents the typical fertilization rate for spring wheat in Norway and the Baltic states, whereas the low N treatment reflects less intensive management, allowing for the evaluation of the performance of genotypes at a reduced N supply.The experiment was designed in split plots, with the main plot being the N rates (N75, N150) and the sub-plot being the wheat genotypes.
The harvested plot size for each genotype across different locations was 9-15 m 2 , arranged in two randomized replicates.The seeding rate was 500 kernels m −2 in LV and LT, and 600 kernels m −2 in EE and NO.Trials were conducted at the proper time, from 21 April to 2 May, depending on the local meteorological conditions at each location.Treatment of trials with pesticides included the application of herbicides and fungicides in all locations based on standard procedures.Foliar fungicide treatments (1-2 applications) were applied at each location, but insecticides were used when the control thresholds of insects was exceeded.Following the ripening stage, the trial material was harvested using a combine harvester.
Daily weather data (precipitation amount, mm; • C; average day temperature, • C) were obtained from services provided by research institutions.The climate normal, or long-term average (LTA, 1991-2020), was used to compare two-year data at each location.

Phenotyping of Field Traits
Days to maturity (DM, days) were calculated from visually recorded maturity dates when 50% of the plants were in the respective developmental stage.Plant height (PH, cm) was manually measured at the ripening stage in three separate places within the plot and calculated as the average height of a sample of a fertile stem from the soil surface to the top of the spike.Grain yield (GY, g m −2 ) was recorded for each experimental plot and adjusted to a moisture content of 14%.Grain protein content (GPC, %) was measured using a near-infrared reflectance spectrometer (Perten Inframatic 9200/Perten Instrument AB) in Norway (NO), and near-infrared transmittance whole grain analysers were used in Estonia (EE) (InfratecTM/Foss), Latvia (LV) (InfratecTM NOVA/Foss), and Lithuania (LT) (Infratec 1241/Foss).

UAV Data Acquisition and Processing
A DJI Phantom 4 Multispectral quadcopter (DJI, Inc., Shenzhen, Guangdong, China) was used to acquire image data in LV, LT, and NO.This commercial UAV model is commonly employed in agronomy research and monitoring [27].A DJI Phantom 4 RTK (Real Time Kinematics) quadcopter (DJI, Inc., Shenzhen, China) equipped with a Micasense RedEdge MX multispectral camera (AgEagle Aerial Systems, Seattle, WA, USA) was used to acquire image data in EE.
Multispectral UAV images were captured at each location from an altitude of 20 m, resulting in a ground sample distance (GSD) of 1 cm.The images were taken with a minimum of 75% overlap in the front and side directions, using a 90 degree (NADIR) camera position.Each flight mission generated approximately 5000 TIFF images, with five images acquired for each shot.
Four UAV flight missions were planned in each of eight field trials, corresponding to a certain time of plant vegetative and generative development based on the Zadoks growth scale [28]: the middle of vegetative wheat plant development (tillering, growth stage 25/GS25), spike emergence or heading (GS55), the middle of grain filling (milk development stage, GS75), and late grain filling (dough development stage, GS85) (Table 1).However, the developmental stages differed slightly between genotypes due to differences in phenological development.The result was a series of average reflectance values of the field trial plots in each of the five spectral bands (red, green, blue, red edge, near-infrared).Aerial images were processed in PIX4Dmapper by Pix4D (Lausanne, Switzerland), photogrammetry software for drone mapping.Processed orthophotos were georeferenced with an average root mean square error (RMSE) of 3-5 cm.To maintain this precision and compatibility between data series taken at different time periods, ground control points (GCPs) were set up and measured and maintained throughout the entire vegetation season.Also, for each flight mission, a multispectral photo of the MAPIR calibration panel was taken, which was used to calibrate the results so that the results could be compared between locations and missions under different lighting conditions.Spectral reflectance values were extracted from final orthophoto mosaics and cropped to field boundaries.The QGIS zonal statistics tool was used to extract mean reflectance values from the orthophoto images, which served as the input for the calculation of NDVI values.The vectorized plot layer served as an AOI (area of interest), containing rectangles for each of the trial plots.A vector layer was created using a digitization technique using a georeferenced orthophoto image from the field where plots were clearly visible after germination.
The NDVI was calculated in QGIS based on a combination of visible, red, and NIR wavelengths with the following Equation (1): where R NIR is the reflectance in the near-infrared band and R red is the reflectance in the red band.

Data Statistical Analysis
The data for each of the 16 genotypes across 16 environments (4 locations, 2 years, 2 N rates) were initially studied using a descriptive analysis, including maximum and minimum values, variance, standard deviation, and mean.All derived measurements were calculated from the individual plot values.
For the split-plot trial design, a linear mixed-effects model ANOVA was applied.In the first model, the two main factors, genotype and N rate, and their interaction were fixed, while the field block was considered random.Calculations were carried out separately for the data from the 8 environments (combination of location and year).The partitioning of the percentage of factors was computed from the total sum of the square (TSS, %).In the second model, three main factors, genotype, year, and N rate, and their interactions were fixed, and the field block was random; calculations were carried out separately for the data from each location.Mean comparisons between genotypes, trial years, and N rates were performed using Fisher's least significant difference (LSD) (p < 0.05).
A multivariate stability analysis was performed using the additive main effects and multiplicative interaction model (AMMI), which combines an ANOVA for the genotype and environment main effects with a principal component analysis of genotype-environment interactions [29].Each year-location combination was considered as an environment; therefore, eight environments were accounted for in the stability analysis of the 16 genotypes under high N and low N rates.
A stability analysis was carried out using the WAASB index [30].The index was calculated as follows (2): where WAASB i is the weighted average of absolute scores of the i-th genotype; IPCA ik is the score of the i-th genotype in the k-th IPCA (interaction principal component axis), and EP k is the amount of the variance explained by k-th IPCA.The genotype with the lowest WAASB score deviates least from the mean performance across environments and is therefore considered as the most stable.Pearson correlation analyses were conducted between GY, GPC, and NDVI values at four plant growth stages in each of the 8 environments, separately for high and low N rates.Linear regression models were employed to examine the relationships between NDVI values at each plant growth stage (as independent variables) and GY and GPC (as dependent variables).Adjusted determination coefficients (adj.R 2 ), root mean squared errors (RMSEs), and p-values were reported to characterize regression models.
Data results were visualized with MS Excel extension Real Statistics and R 4.2.3 statistical program [31] packages: lme4, datasets, agricolaeimetan.

Variation in Meteorological Data
Overall meteorological conditions were markedly different between the trial years of 2021 and 2022.However, in comparison to the 30-year long-term average (LTA) data (1991-2020), there are both similar trends and differences between environments (Figure 1).In 2021, April, May, and August experienced lower average daily air temperatures than the LTA in all trial locations, but in June and July, temperatures exceeded the average annual observations by 1.7 to 4.0 • C and 1.5 to 4.2 • C, respectively (Figure 1a).In April, May, and June in all test locations, and in July in EE and LT, the amount of precipitation was significantly reduced (Figure 1b).In the 2022 season, April and May were generally relatively cool, but in June, the average daily air temperature in all trial locations exceeded the LTA (Figure 1a).In June, the monthly rainfall was reduced in the trials in NO, EE, and LV.Heavy rains occurred in LT in June, when 256% more precipitation compared to the LTA was noted (Figure 1b).
relatively cool, but in June, the average daily air temperature in all trial locations exceeded the LTA (Figure 1a).In June, the monthly rainfall was reduced in the trials in NO, EE, and LV.Heavy rains occurred in LT in June, when 256% more precipitation compared to the LTA was noted (Figure 1b).Overall, the growing season of 2022 was considered as a more favourable season for wheat growth and productivity than the earlier one due to the lower average temperatures and higher rainfall.

Variation in Traits by Location and Year
The results of an analysis of variance performed for GY and GPC data showed that in all locations, the effect (sum of squares) of different years was highly significant (p < 0.001) with the highest contribution to both traits' variability (Table S2).Mean GY values varied significantly by year in each location, where the lowest yielding trial location/year was EE in 2021 (466.8 g m −2 ), and the highest yielding location/year was LV in 2022 (622.2 g m −2 ) (Table 2).When comparing the two trial years, the average grain yield was significantly higher in 2022 than in 2021 in three locations (NO, LV, LT), while in LT, it was the opposite.Overall, the growing season of 2022 was considered as a more favourable season for wheat growth and productivity than the earlier one due to the lower average temperatures and higher rainfall.

Variation in Traits by Location and Year
The results of an analysis of variance performed for GY and GPC data showed that in all locations, the effect (sum of squares) of different years was highly significant (p < 0.001) with the highest contribution to both traits' variability (Table S2).Mean GY values varied significantly by year in each location, where the lowest yielding trial location/year was EE in 2021 (466.8 g m −2 ), and the highest yielding location/year was LV in 2022 (622.2 g m −2 ) (Table 2).When comparing the two trial years, the average grain yield was significantly higher in 2022 than in 2021 in three locations (NO, LV, LT), while in LT, it was the opposite.Mean GPC values also varied substantially from one location and year to another.The mean GPC values among four locations ranged from 11.7 to 15.6% in 2021 and from 10.9 to 14.5% in 2022, with the lowest value in LT in 2022 (10.9%), and the highest value in LV in Agronomy 2024, 14, 51 7 of 20 2021 (15.6%).Differences were observed between locations in the GPC depending on the trial year.In LV, EE, and LT, a significantly higher GPC was obtained in 2021, but not in NO, where it was, on the contrary, 2.1% higher in 2022.
The UAV-derived NDVI obtained at four growth stages showed also significant variation among trial years (Table 3).Overall, in all four growth stages, mean value differences varied significantly and did not show consistent similar variations compared to data obtained between years.For instance, in GS55, where comparatively the highest mean NDVI values were detected in LV and LT, this trait was significantly higher in 2022, while in EE it was higher in 2021; however, in NO, in both trial years, we obtained similar NDVI values.

The Impact of N Rate on Phenotypic Variation in Traits
As shown in Figure 2, the impact of a high N rate was pronounced for the GY at all trial locations; significantly higher GYs in such conditions were obtained in EE, LV, and NO (Figure 2a; Table S4).Opposite results were detected in LT, where in 2021, significantly (p < 0.001) higher average GY values were obtained under low N conditions, but in 2022, N treatment did not have a significant effect.In LV and NO, a higher N regime response to GY variation was detected in 2021 compared to 2022.
A high N fertilization rate promoted the formation of significantly (p < 0.001) higher GPCs at all locations (Figure 2b; Table S4).
The N rate showed a significant effect on the variation in NDVI values measured at the tillering stage (GS25) only in three environments (EE_2021; NO_2021, LT_2022) (Table 4).In all locations, the NDVI was evaluated during heading (GS55) and milk (GS75) growth stages, and in most cases, significantly higher NDVI values were obtained under a high N supply compared to those obtained in low N conditions.At all locations, the NDVI variation was also significantly affected by the N fertilization regime in the late grain filling stage (GS85), with significantly (except in EE_2022) higher values under high N application.An analysis of variance confirmed that in all location-year environments, the genotype-by-nitrogen (G × N) interaction was not significant for any of the investigated traits (Table S4).An analysis of variance confirmed that in all location-year environments, the genotypeby-nitrogen (G × N) interaction was not significant for any of the investigated traits (Table S4).

Mean Performance and Stability of Wheat Genotypes under Contrasting Environments
The mean performance of GY and GPC and their corresponding WAASB stability indices and trait ranking values under high and low N rates for spring wheat genotypes are presented in Table 5. Genotype effects were significant for investigated traits.Three spring wheat genotypes were detected as high yielding under both low N (GY 547.4-564.1 g m −2 ) and high N (GY 592.5-609.5)rates, confirming the lack of a significant G × N interaction effect.Overall, for the GY, the ranking order of WAASB values for genotypes differed between the N rates.Genotypes DS-655-7-DH and DS-720-3-DH had a below average GY stability ranking via WAASB over all investigated environments.In turn, the top-yielding DS-17-16-DH exhibited the best stability only under a low N rate.Genotype 990-2 on the contrary ranked first via WAASB under N150, but only ranked thirteenth under N75.The variety Betong showed comparatively good GY results and stability under both high and low N rates.The other genotypes, characterized by a high stability ranking via WAASB in both N rates, showed below-average yield levels.The genotype Zombi had the same stability rank across both N rates.
Table 5. Grain yield (GY), grain protein content (GPC), and corresponding weighted average of absolute scores (WAASB) stability index estimates and their ranking for genotypes in the trial averaged over eight environments (combination of location and year) under low N and high N rates.GPCs among different genotypes varied from 11.7 to 14.0% under low N, and from 13.2 to 16.0% under high N.The results showed that genotypes 876 and Runar, with the lowest yield in the investigated environments, exhibited the highest (first and second) GPCs in both N rates.However, in both N rates, these genotypes exhibited some of the lowest stability rankings via WAASB among the 16 tested genotypes.The top-yielding genotype DS-17-16-DH exhibited one of the lowest GPCs under both N rates, where the ranking of WAASB stability values between both N rates distinctly differed.Voore simultaneously combined above average GPCs (13.1 and 14.7% under N75 and N150, respectively) and showed a good stability ranking (fourth) under high N. Overall, the top ranking WAASB values for the GPC were detected for genotypes 013-074, 990-2, and Hiie (all grown under high N) with the GPC approaching the trait average values.
Genotypes DS-17-16-DH, DS-655-7-DH, DS-720-3-DH, 013-01, and DS-638-5-DH were high yielding over all investigated environments and were characterized comparatively by the longest average growing period (DM 106.9-108.6 days) among the 16 tested genotypes (Table 6).All high-yielding genotypes also obtained the highest NDVI values in GS75 and GS85.High-yielding genotypes DS-17-16-DH and DS-638-5-DH were characterized with comparatively short PHs, but for DS-655-7-DH and DS-720-3-DH, this trait was above average.Wheat genotypes 876 and Runar were at the top of GPC rankings, and were characterized by the comparatively tallest plants (fourth and first ranked, respectively), fewer DM (102.8 and 103.1 days, respectively), and the lowest NDVI values in all growth stages.Different patterns in terms of NDVI profiles between genotypes were observed.For instance, for DS-655-7-DH, comparatively high values of the NDVI were detected in all investigated growth stages, but for high-protein cultivar Runnar, the highest NDVI values were only observed in GS25 (Table 6).* Genotypes are ranked based on GYs over 16 environments (combination of year × location × N rates); PH-plant height; DM-days to maturity; R-ranking order; N75-75 kg N ha −1 ; N150-150 kg N ha −1 ; GS25-tillering growth stage; GS55-heading growth stage; GS75-milk growth stage; GS85-dough growth stages; LSD-least significant difference (α = 0.05).

Correlations between NDVI and GY, GPC
As the above-described results obtained in the trial locations demonstrated the significant effect of year and N rate on the variation in the GY and GPC, relationships with the NDVI were therefore analysed separately in each of the 16 environments.Correlation analyses generally indicated a strong positive relationship between the NDVI and GY at the N75 rate, which differed between environments and plant development stages (Figure 3a).The strongest and most consistent correlations were found in LT and NO in 2022, which were statistically significant at all developmental stages (p < 0.05), except for LT at GS85.In almost all environments, the strongest correlations were found at GS75; all of them, apart from EE in 2022, were significant (p < 0.01).Correlations from 2021 were more similar between environments compared to 2022.

GS85.
In almost all environments, the strongest correlations were found at GS75; all of them, apart from EE in 2022, were significant (p < 0.01).Correlations from 2021 were more similar between environments compared to 2022.At the N150 fertilization rate, the overall strength of correlations between the NDVI and grain yield (GY) was similar to that at N75. Coefficients were the highest (r > 0.75) for NO in 2022 and LT in 2021 (GS55).For most environments and growth stages, the correlation between the NDVI and grain yield was positive, except for LV (GS25), where there was a significant (p < 0.05) negative correlation (Figure 3b).Overall, at GS25, correlative relationships between the GY and NDVI remained lower compared to the respective results of later growth stages, with the notable exception of EE in 2021, where GS25 had the strongest significant correlation (p < 0.01).
For the GPC, correlations with NDVI were notably weaker and mostly negative compared to the GY, or remained non-significant (Figure 4a).At N75, strong negative correlations (r > −0.5) were found for GS25 (in NO in 2022, EE in 2021, and LV 2021).All the significant correlations at N75 between the GPC and NDVI remained negative.GS75 was also the stage with the strongest overall correlations compared to other development stages.However, in most environments, at GS25 and GS55, correlations between NDVI and GPC were non-significant.
At N150, the correlation coefficients between GPC and NDVI were mostly negative or non-significant (Figure 4b).A notable exception to this was in NO (in 2021 and 2022), where significant negative correlations (r < −0.25) were found at GS55, GS75, and GS85.Similarly, in EE in 2021 and 2022, the negative correlation was rather strong (r < −0.5) for GS55 and for GS75 and GS85.In LT in 2022, the positive correlation was significant (p > 0.05) at GS25-the only significant one for the GPC and NDVI at N150.At the N150 fertilization rate, the overall strength of correlations between the NDVI and grain yield (GY) was similar to that at N75. Coefficients were the highest (r > 0.75) for NO in 2022 and LT in 2021 (GS55).For most environments and growth stages, the correlation between the NDVI and grain yield was positive, except for LV (GS25), where there was a significant (p < 0.05) negative correlation (Figure 3b).Overall, at GS25, correlative relationships between the GY and NDVI remained lower compared to the respective results of later growth stages, with the notable exception of EE in 2021, where GS25 had the strongest significant correlation (p < 0.01).
For the GPC, correlations with NDVI were notably weaker and mostly negative compared to the GY, or remained non-significant (Figure 4a).At N75, strong negative correlations (r > −0.5) were found for GS25 (in NO in 2022, EE in 2021, and LV 2021).All the significant correlations at N75 between the GPC and NDVI remained negative.GS75 was also the stage with the strongest overall correlations compared to other development stages.However, in most environments, at GS25 and GS55, correlations between NDVI and GPC were non-significant.
At N150, the correlation coefficients between GPC and NDVI were mostly negative or non-significant (Figure 4b).A notable exception to this was in NO (in 2021 and 2022), where significant negative correlations (r < −0.25) were found at GS55, GS75, and GS85.Similarly, in EE in 2021 and 2022, the negative correlation was rather strong (r < −0.5) for GS55 and for GS75 and GS85.In LT in 2022, the positive correlation was significant (p > 0.05) at GS25-the only significant one for the GPC and NDVI at N150.

Linear Relationships between NDVI and GY, GPC
A linear regression analysis showed that the most significant relationships between GY and NDVI values were found at the milk growth stage (GS75) and heading growth stage (GS55) (Table 7).Conversely, the dough growth stage (GS85) only had two significant relationships.Specific environments stood out; for instance, NO trials at GS85 were highly significant (p < 0.001) for both years and at both N rates.Lithuania for 2022 showed the highest R 2 at GS75 and at N75 (0.82, p < 0.001) and an exceptionally low RMSE value (29.96).Overall, NO trials in 2022 showed the most consistent and significant relationships with the GY for both years and N rates.
Differences in significance between the two N rates were also present; for the GY, trials with N150 showed more significant relationships compared to N75 (23 and 17, respectively, from a total of 32 measured relationships in Table 7).This was especially clear at GS25 and GS85.RMSE values for N75 were on average half of those for N150.Additionally, in EE trials, the GY at GS75 was non-significant at N75 but highly significant at N150 for both years (adj.R 2 = 0.40 and 0.42).In LT trials, the GY was highly significant for both GS55 and GS75 in both years.
The GPC showed fewer significant relationships with NDVI values compared to GY.Similarly, as for the GY, most environments showed significant relationships at GS75 and GS85, with N150 having a slightly higher number of these.In contrast to GY, differences in RMSE values for the GPC were on average similar between N75 and N150.For all tested environments, the NO trial in 2022 showed the most consistent and significant relationships with the GPC at GS75 and GS85 (adj.R 2 = 0.23-0.55).

Linear Relationships between NDVI and GY, GPC
A linear regression analysis showed that the most significant relationships between GY and NDVI values were found at the milk growth stage (GS75) and heading growth stage (GS55) (Table 7).Conversely, the dough growth stage (GS85) only had two significant relationships.Specific environments stood out; for instance, NO trials at GS85 were highly significant (p < 0.001) for both years and at both N rates.Lithuania for 2022 showed the highest R 2 at GS75 and at N75 (0.82, p < 0.001) and an exceptionally low RMSE value (29.96).Overall, NO trials in 2022 showed the most consistent and significant relationships with the GY for both years and N rates.
Differences in significance between the two N rates were also present; for the GY, trials with N150 showed more significant relationships compared to N75 (23 and 17, respectively, from a total of 32 measured relationships in Table 7).This was especially clear at GS25 and GS85.RMSE values for N75 were on average half of those for N150.Additionally, in EE trials, the GY at GS75 was non-significant at N75 but highly significant at N150 for both years (adj.R 2 = 0.40 and 0.42).In LT trials, the GY was highly significant for both GS55 and GS75 in both years.
The GPC showed fewer significant relationships with NDVI values compared to GY.Similarly, as for the GY, most environments showed significant relationships at GS75 and GS85, with N150 having a slightly higher number of these.In contrast to GY, differences in RMSE values for the GPC were on average similar between N75 and N150.For all tested environments, the NO trial in 2022 showed the most consistent and significant relationships with the GPC at GS75 and GS85 (adj.R 2 = 0.23-0.55).

Discussion
Given that the price of common wheat is determined by the GPC in most wheat markets, including the Nordic-Baltic region, this study focused on the GY and GPC as the most critical economic traits.This comprehensive multifactorial study of spring wheat genotypes, incorporating remote phenotyping data, marks a pioneering effort under the conditions of the Nordic-Baltic region, supplying practical knowledge for breeders.

Variations in Meteorological Conditions across Trial Years and Locations Resulted in Differences in the GY and GPC
The impacts of climatic variability have been observed in the last few decades, where drought stress is a major factor that affects the wheat GY and GPC, increasing the variability in these traits under Baltic region conditions [7].A meta-analysis based on wheat data showed that drought significantly decreased GYs and increased GPCs [32].In our study, drought conditions were recorded in 2021, but to a different degree, in different growing periods, and for varying durations across the four locations, resulting in substantial GY and GPC variation.In 2021, when higher average daily temperatures and lower precipitation were seen, a significantly lower GY was obtained in three out of the four trial locations (EE, LV, and NO).Considering that the complete amount of N fertilizer was applied before sowing, the heavy rainfall experienced in LT in June of 2022 caused leaching of N from the upper soil layer, which is likely the main reason for both the lower GY and GPC in this location.Elevated temperatures during grain filling can result in high protein percentages because these conditions tend to reduce starch synthesis more than protein synthesis [33].In 2021, temperature and moisture conditions were more favourable for the formation of a higher mean GPC compared to 2022 in EE, LV, and LT, but in NO, the mean value of the GPC did not exceed 12% because of comparatively lower mean daily temperatures in June and July in this location.
In our study, the estimated field traits were significantly affected by N fertilization rates in almost all locations, with some exceptions, which can also be explained by differences in meteorological conditions between locations.It is generally acknowledged that an appropriate increase in nitrogen supply improves plants' yields, grain quality, and resistance to abiotic stresses [34].Recently, several attempts have been made to compare plants' yield penalties due to drought and heat stresses under low and high nitrogen supplies in the field conditions for wheat.Giménez et al. (2021) [35] proved that an increased temperature overnight resulted in an increased tiller mortality only under a high nitrogen supply.A recent study showed that a high nitrogen supply mitigated the adverse effects of drought and heat stresses when they were applied separately.However, when the two stresses were combined, wheat growth and grain yield were negatively affected by a high N supply [36].In 2021, comparatively worse drought conditions were present in EE and LT; however, the timing of water shortages was different in these locations.In EE, plants experienced drought in June, while the water deficit was mostly in July in LT.In EE, the average yield was significantly higher under a high nitrogen supply, while it was significantly higher under a low nitrogen supply in LT in 2021.The obtained results of yield reductions under a high N rate imply that a high nitrogen supply under a water deficit together with unusually high temperatures at anthesis and during grain filling might cause higher plant sensitivity to drought stress.Considering that drought and high temperatures are becoming more frequent due to global warming, the rates of N fertilization might need to be revised to avoid not only pollution of the environment, but also yield penalties.
In this study, the UAV-derived NDVI obtained at four growth stages also showed significant variation between trial years, although there was no marked consistency between locations due to the aforementioned differences in meteorological variables.The NDVI, as a basic measure of plant vitality and vegetation density, can be influenced by plant growth stages, vegetation cover percentages, plant health and stress, weather conditions, and crop management [37].In a study by Thapa et al. (2019) [38] in winter wheat, it was detected that as the water stress increased, NDVI values across seasons decreased.Our results showed that the NDVI exhibited a significant difference between N treatments consistently in all locations during GS55, GS75, and GS85, with higher values found under high N compared with low N conditions, although in the study of Yousfi et al. (2019) [39], the NDVI did not show significant differences under contrasting N conditions (N0 and N160).

Spring Wheat Genotypes Grown under Contrasting N Rates Differed in Stability and Exhibited Different Patterns in NDVI Profiles during Seasons
Genotypes and their genetic characteristics are also key factors that affect the variation in traits in spring wheat.Our results showed that genotype effects were highly significant (p < 0.001) for all traits, showing the presence of wide genetic variation among the genotypes.This is explainable, as the set of 16 genotypes included in this study consisted of both older and newly registered spring wheat genotypes (Table S1), which also differed significantly in terms of PH and the length of the DM (Table 6).Higher input prices and environmental concerns are now also focused on common wheat breeding programs towards low-input agricultural practices, and, more particularly, low N input management [40].In our research, we were also interested in determining the significant genotype-by-nitrogen (G × N) interaction variability for the studied traits, which would indicate wheat tolerance to N constraints.An analysis of variance showed that in our study, the G × N interaction was not significant for any of the investigated traits (Table S4).Other studies have found both significant [25] and non-significant [41] effects of this interaction factor on GY variability.Hitz et al. (2017) [25] indicated that to select for high yield potential (or NUE), breeding material must be evaluated under both low and high N conditions.A high yield potential only under low N conditions will not be enough to ensure success, since the ability to respond to high rates of N is also important to achieve wide cultivation of a genotype.The three top-yielding spring wheat genotypes DS-655-7-DH, DS-17-16-DH, and DS-720-3-DH (all originated from LT) have been characterized by breeders as high-yielding breeding lines, which was also confirmed in our study, as these genotypes showed the best results under both low and high N rates.The results also showed that the grain yield is strongly associated with the DM; a longer vegetation period comes with a higher GY.In turn, early-maturing genotypes Voore, 876 (both from EE) and Runar (NO) provided the lowest yield, but at the same time exhibited a higher GPC under both N rates.The focus on more yield-stable genotypes will prevent N losses in unfavourable years [42].We found signs that a genotype that is stable with respect to variation in one influencing factor will not always be relatively stable in response to variation in a different influencing factor, as also shown by Weiner at al. (2021) [43].We determined genotype performance and stability under different N rates in eight environments by using the WAASB stability index, which is based on the AMMI model [30].Only a limited amount of published research was found on yield stability investigations under different N rates for wheat [44].We found that genotypes had different yield stabilities depending on the N rate.DS-17-16-DH was the most stable for GY only under low N, but genotype 990-2 in contrast has the best WAASB stability results under high N.The genotypes Zombi, with a below average GY, and 013-074, with a below average GPC, has a high WAASB stability ranking across both rates of N.
An analysis of variance showed that a comparatively high magnitude of the genotype as a factor effect for the NDVI was seen in the milk (GS75) and dough (GS85) development stages, except in EE in 2022 (Table S4).We recognized that all identified high-yielding genotypes were characterized by the highest NDVI values in GS75 and GS85 (Table 6).During earlier stages of plant development (GS25; GS55), the characteristics of those genotypes showed different patterns in terms of NDVI profiles.For instance, for top-yielding DS-655-7-DH, comparatively high values of the NDVI were detected in all investigated growth stages, while the old, early-maturing, high-protein cultivar Runar showed one of the highest NDVI values only in GS25.

Correlations and Linear Relationships between the GY, GPC, and UAV-Derived NDVI Highlight Key Crop Growth Stages
We used the UAV-derived NDVI as the main vegetation index, since it has been widely used in vegetation assessments in general and wheat phenotyping in particular [19].Our preliminary checks showed that the use of red edge indices (in this case, NDRE) strongly correlates with the NDVI.The correlation between NDRE and yield is weak, not significantly better.A study of rapid wheat phenotyping using the NDVI in China showed that it had more consistent correlations with grain yield compared to other indices like NDRE and NGRDI [45].Wheat trials in Australia found that UAV-derived NDVI values were strongly correlated with the hand-held GreenSeeker sensor (R 2 = 0.85) [46], which can be considered as a baseline for such measurements.
Analysing the relationships between NDVI and grain yield (GY) data in each of the 16 environments, we found the strongest correlations between the NDVI and GY at GS75 (r = 0.74, p < 0.01), which shows that this period can be used for the exploration of the predictive power of the NDVI by linear models.By using a split-plot design, Sultana et al. (2014) [47] found that the NDVI had the strongest correlations (R 2 > 0.9) with GY in booting, grain filling, and maturity stages.
UAV data collected at the heading stage were superior for predicting GPCs, while the GY was more accurately predicted during the grain filling stage [48].A study in Idaho, USA [49], reported similar findings, with strong linear relationships between GY and NDVI values in the middle phase of plant development (R 2 = 0.78).Yousfi et al. (2016) [50] showed correlation coefficients between GY and NDVI ranging from 0.36 to 0.74.
Another main response variable-the grain protein content (GPC)-expressed a negative relationship with the NDVI, which is typical and has been reported in several studies [51,52].GPC prediction is less straightforward compared to GY prediction, often requiring machine learning methods and ensembles of predictors [53,54].Li-Hong et al. (2007) [55] concluded that using a single measurement of spectral reflectance was not enough to predict GPCs reliably in wheat.A study in China [54] found that in GPC prediction from UAV-derived NDVIs using machine learning methods, the RMSE did not exceed 0.74.Our study also found a weaker relationship between the NDVI and GPC compared to GY. Research has found that the weak relationship between the NDVI and GPC makes it a suboptimal index for protein content prediction in wheat [56].
A study using correlation analyses for wheat trials with high-resolution NDVI data for yield found average to strong correlations (r = 0.60-0.81)and was able to distinguish between various rates of fertilization [27].Here, we showed that N fertilization rates did not significantly affect the strength of correlations between the NDVI and GPC, since the highest r values were found at the same locations and growth stages for both N75 and N150.
Differences in correlations between environments were quite pronounced for all yield parameters.Two locations, NO and EE, stood out, especially for significant correlations between the NDVI and GY in GS25 and GS85, absent in other environments.Such differences in early and late stages of crop development could be explained by variations in biophysical parameters of the environments in these trials.Sánchez et al. (2023) [57] mention that NDVI values are influenced by soil type and moisture as well as local meteorological conditions, which are impossible to control in field trials.
Our study proved that UAV-derived NDVI data, combined with field-collected agronomic data, may be successfully employed for multi-environmental trials of wheat phenotyping.One of our main findings is that NDVI values at GS55 and GS75 were the most significant linear predictors for spring wheat GYs.Other studies [58,59] reported the highest R 2 values for linear relationships between the wheat NDVI and GY at these stages.However, it is important to note that uncertainties arise from differences in growing conditions during crop development stages.In our study, GY showed more significant relationships compared to GPC.This is also visible in many studies which use GY as the main response variable, see [60].
In our trials, a higher N application rate (N150) exhibited slightly more significant linear relationships compared to N75, although this was not clear from a correlation analysis.This effect was expected since the N fertilization rate is known to influence NDVI values [24,61]; however, the influence on R 2 values between environments was quite modest.
A single growing season with a major meteorological influence, for example, significant drought stress, can alter expected NDVI values.Thapa et al. (2019) [38] report

Figure 1 .
Figure 1.Meteorological parameters at the field trial locations (NO-Norway, EE-Estonia, LV-Latvia, LT-Lithuania) and seasons (2021, 2022) compared to 30-year long-term average (LTA) data (1991-2020): (a) deviation from the mean daily temperature from LTA; zero values on the y-axis represent the LTA; (b) monthly rainfall as a percentage of the LTA; 100% on the y-axis represents the LTA.
NO-Norway, EE-Estonia, LV-Latvia, LT-Lithuania; the values with different superscript letters in a row are significantly different between years (p < 0.05).

Figure 1 .
Figure 1.Meteorological parameters at the field trial locations (NO-Norway, EE-Estonia, LV-Latvia, LT-Lithuania) and seasons (2021, 2022) compared to 30-year long-term average (LTA) data (1991-2020): (a) deviation from the mean daily temperature from LTA; zero values on the y-axis represent the LTA; (b) monthly rainfall as a percentage of the LTA; 100% on the y-axis represents the LTA.

Figure 2 .
Figure 2. Boxplot showing the phenotypic distribution of (a) grain yield (GY) and (b) grain protein content (GPC) for 16 spring wheat genotypes grown in 16 environments.The black horizontal line in each boxplot is the median, the lower and upper box edges are the first and third quartiles, respectively, and the whiskers are the data minimum and maximum.The black circle in each plot is the mean for that class.Outliers are shown as open circles; EE-Estonia, LV-Latvia, LT-Lithuania, NO-Norway; 2021 and 2022-year of trials; N75-N rate with 75 kg N ha −1 ; N150-N rate with 150 kg N ha −1 ; a, b-significant differences (p < 0.05) between the mean values of two N rates within each location and year are shown by different superscript letters.

Figure 2 .
Figure 2. Boxplot showing the phenotypic distribution of (a) grain yield (GY) and (b) grain protein content (GPC) for 16 spring wheat genotypes grown in 16 environments.The black horizontal line in each boxplot is the median, the lower and upper box edges are the first and third quartiles, respectively, and the whiskers are the data minimum and maximum.The black circle in each plot is the mean for that class.Outliers are shown as open circles; EE-Estonia, LV-Latvia, LT-Lithuania, NO-Norway; 2021 and 2022-year of trials; N75-N rate with 75 kg N ha −1 ; N150-N rate with 150 kg N ha −1 ; a, b-significant differences (p < 0.05) between the mean values of two N rates within each location and year are shown by different superscript letters.
Field experiments were carried out over two consecutive years (2021 and 2022) at four geographically distant locations representing different agro-ecological conditions and soil types in the Nordic and Baltic region: the Centre of Estonian Rural Research and Knowledge (Jogeva, Eastern part of Estonia, 58 • 76 N, 26 • 24 E), Stende Research Centre (Dižstende, north-west Latvia, 57 • 18 N, 22 • 56 E), the Institute of Agriculture (Dotnuva, in central Lithuania, 55 • 39 N, 26 • 24 E), and the Vollebekk Research Station (Ås, south-eastern Norway, 59 • 39 N, 10 • 45 E).The physical and chemical properties of the experimental soils prior to sowing are detailed in Table

Table 1 .
UAV flight mission dates for multi-location trials.

Table 2 .
Mean values by years in the trial locations averaged over N fertilization rates for grain yield (GY) and grain protein content (GPC).

Table 2 .
Mean values by years in the trial locations averaged over N fertilization rates for grain yield (GY) and grain protein content (GPC).
NO-Norway, EE-Estonia, LV-Latvia, LT-Lithuania; the values with different superscript letters in a row are significantly different between years (p < 0.05).

Table 3 .
Mean values by years in the trial locations averaged over N fertilization rates for the UAV-derived NDVI at four wheat plant growth stages.

Table 4 .
Mean values of UAV-derived NDVIs obtained at four growth stages from eight environments (combination of locations and years) under low and high N application (N75, N150).
GS85-dough growth stage.NDVI mean values followed by different letters within columns of each pair of two N rates (N75-75 kg N ha −1 , N150-150 kg N ha −1 ) are significantly different (p < 0.05).

Table 4 .
Mean values of UAV-derived NDVIs obtained at four growth stages from eight environments (combination of locations and years) under low and high N application (N75, N150).

Table 6 .
Mean values of plant height (PH), days to maturity (DM), and NDVIs at four growth stages and their ranking for genotypes averaged over 16 environments (combination of locations × years × N rates).

Table 7 .
Values of RMSE, adjusted R 2 , and corresponding significance of linear regression analyses in multi-environment trials for the relationships between grain yield (GY) and grain protein content (GPC) as dependent variables and the NDVI as the independent variable.