Association Mapping of Physiological and Morphological Traits Related to Crop Development under Contrasting Nitrogen Inputs in a Diverse Set of Potato Cultivars

Ample nitrogen (N) is required for potato production, but its use efficiency is low. N supply strongly interacts with maturity type of the cultivar grown. We assessed whether variation among 189 cultivars grown with 75 or 185 kg available N/ha in 2 years would allow detecting quantitative trait loci (QTLs) for relevant traits. Using phenotypic data, we estimated various traits and carried out a genome-wide association study (GWAS) with kinship correction. Twenty-four traits and 10,747 markers based on single-nucleotide polymorphisms from a 20K Infinium array for 169 cultivars were combined in the analysis. N level affected most traits and their interrelations and influenced the detection of marker–trait associations; some were N-dependent, others were detected at both N levels. Ninety percent of the latter accumulated on a hotspot on Chromosome 5. Chromosomes 2 and 4 also contained regions with multiple associations. After correcting for maturity, the number of QTLs detected was much lower, especially of those common to both N levels; however, interestingly, the region on Chromosome 2 accumulated several QTLs. There is scope for marker-assisted selection for maturity, with the main purpose of improving characteristics within a narrow range of maturity types, in order to break the strong links between maturity type and traits like N use efficiency.


Introduction
In potato cropping, farmers often abundantly apply nitrogen (N) fertilizer to ensure profits because potato plants are highly responsive to extra N [1]. This practice reduces the nitrogen use efficiency (NUE) of the crop [2,3], which is already rather low because of a shallow root system and the common cultivation in sandy soils. Leaching of excess N causes eutrophication of ground and surface water and is, therefore, a serious threat to the environment in potato production areas. Governmental regulations limiting the N supply have been installed, and these make it necessary to improve NUE at lower levels of input. Moreover, the N fertilizer regulations are specified for maturity types, at least in the Netherlands, emphasizing the need to incorporate the effects of maturity type in NUE studies.
Effects of nitrogen availability on above-ground and below-ground crop development have been widely studied. High N input increases individual leaf size and leaf longevity [4,5] and promotes branching [6], thus supporting a sustained leaf production, which enlarges the period of full soil cover (SC) [7,8]. Therefore, the crop intercepts more solar radiation and accumulates more dry matter with more N [9], all resulting in higher yield, but lower NUE [2,3].
An increase in nitrate available for the plant was reported to lead not only to a reduction in the proportion of dry matter allocated to roots but also to an increase in the total root surface and root length [10]. Differences in N uptake efficiency of two cultivars were attributed to the general differences in root morphology and to a particular N response of the cultivars [10]. Moreover, high N availability tends to suppress or delay tuber bulking and affect dry matter partitioning between haulm and tubers [5]. Additionally, N input also affects tuber size and quality parameters, including tuber dry matter content, tuber starch content, tuber protein content, tuber nitrate content, and processing quality [4,11]. With more N, the proportion of large tubers was shown to increase and the fry color was shown to become darker, while the effect on tuber dry matter content was ambiguous [4,12,13].
Studies on canopy cover have shown a high correlation among the ability of genotypes to intercept photosynthetically active radiation, to change resource use efficiency, and to create tuber yield [9,14,15]. Khan [16] and Khan et al. [17,18] studied the canopy development (CDv) of potato using an eco-physiological model in which canopy growth is a function of thermal time, following the beta function as described by Yin et al. [19]. This methodology allows the dissection of the complex trait of canopy growth into model parameters with biological meaning [20][21][22]. The analysis of the curve parameters as new traits allowed capturing differences in N response among cultivars and maturity types, as well as among cultivars within the same maturity class, facilitating the understanding of the N effects on different stages of CDv [3,[16][17][18]23]. Furthermore, those canopy cover traits had high heritabilities [16], and some of them showed high correlations with yield, maturity, and N content, allowing an interpretation of how the NUE of potato is affected and showing potential as selection criteria for NUE [3,11,16]. In addition, these parameters were found to be related to genetic factors (quantitative trait loci; QTLs) that act during development of the canopy cover and are probably involved in the underlying physiological processes [24]. The combination of this eco-physiological growth model and QTL analysis is a two-step approach, where the first step is to model the complex trait identifying biologically relevant parameters demonstrating genetic variation, and the second step is to use these parameters as new traits to find QTLs [25,26]. In potato, the two-step procedure was used to study the dynamics of senescence and the adaptation in potato under different day lengths [27], and to identify QTLs related to canopy cover parameters [16], as well as QTLs related to the N effects on the canopy cover parameters in a diploid mapping population [24].
In recent years, association mapping approaches have become increasingly popular for genetic studies, offering a series of advantages that include higher mapping resolution and results that are applicable to a wider genetic background [28]. Association mapping (AM) identifies QTLs by examining marker-trait associations resulting from linkage disequilibrium (LD) between markers and trait functional polymorphisms across a set of diverse germplasms [28]. AM copes better with tetraploid, noninbred crops, such as potato [29], than linkage analysis using segregating biparental tetraploid populations for which tetrasomic inheritance is complicated [30]. AM can detect QTLs at the tetraploid level within a genetic background that is more representative of the breeding germplasm of the crop [31]. Moreover, AM procedures can effectively compare a greater portion of the variation within a species while the traditional linkage analysis is limited to the variation in the two parents of the segregating population [32]. However, in AM, it is important to consider the effect of population structure and/or kinship because any association may partially be caused by population admixture, leading to plausible but false marker-trait associations [28,32,33]. The success of association mapping efforts depends on the possibilities of separating LD due to genetic linkage from LD resulting from other causes [31].
Several papers reported on association mapping studies in tetraploid potato. Gebhardt et al. [34] and Simko [35] reported markers associated with resistance to diseases using a form of t-test. Malosetti et al. [31] proposed an AM approach based on mixed models with attention for the incorporation of the relationships between genotypes, whether induced by pedigree, population substructure, or otherwise. D'hoop et al. [36] applied a simple regression-based AM approach for quality traits in potato with promising results for these traits in a large set of tetraploid cultivars. In this paper, we combine the model for canopy development and association analysis to study the genetic basis of developmental physiological and agronomic traits in relation to contrasting N levels. We performed genome-wide AM for canopy development parameters and agronomic traits in a set of 169 tetraploid potato cultivars. Our cultivar set was phenotyped and studied for canopy development under contrasting N levels; in addition to effects of environmental factors, we observed genetic variation in the model traits and in agronomic traits [3], as required for a genome-wide association analysis. Moreover, we analyzed N-dependence of the detected QTLs to show the genetic response to such an important factor and to show the usefulness of the canopy development analysis in combination with genetics studies.
In summary, our objectives were to carry out a genetic analysis on model-based phenotypic variables associated with above-ground and below-ground development. Since these variables are known to be sensitive to nitrogen supply, the analysis was carried out using data of phenotyping experiments with two contrasting nitrogen input levels. Similarly, these variables vary among maturity types and show strong interactions between maturity type and nitrogen supply; we were, therefore, interested in the genetic background of the cultivar × nitrogen interaction and whether marker-trait associations were consistent across maturity types and nitrogen supplies.

Phenotypic Data
The phenotypic dataset used for the association analysis was collected and analyzed as described by Ospina et al. [3]. Here, we investigated the genotypic correlations among all traits at high N, low N, and across N input levels ( Figure 1). As the correlations were calculated on the basis of estimated genotypic main effect values, these are effectively genetic correlations (i.e., after excluding the effects due to other terms in the model; see Section 5.7). A summary of data for all traits per N level is included in Appendix A. For an explanation of the acronyms of the traits, see Sections 5.5 and 5.6. Looking at one trait at a time, the highest correlation at each N level was between the same traits. Exceptions were tm1, t2, te, TbwMx, TbnA, and mt_as. For each of these traits, the highest correlation (to whichever other trait) was different for low N and high N. In general, with the diagonal in the correlation matrices excluded, there were 169 out of 552  Genetic correlation matrices for both high and low N conditions are shown in Figure 1 (corresponding to the right upper and left lower triangles, respectively). A Mantel test to compare the two genetic correlation matrices showed a high and positive association between the Pearson correlations under high and low N (Mantel test r = 0.9384, significance = 0.001), which was also reflected in similar grouping of the traits in a hierarchical clustering using Pearson correlation as a similarity measure (see dendrograms in Figure 2). However, there were slight differences between the clustering results at each N level. Hierarchical cluster analysis of the traits at both N levels (HN and LN, high and low N, respectively); 1 minus the absolute correlation between each pair of traits was considered as a measure of dissimilarity. For an explanation of the acronyms of the traits, see Sections 5.5 and 5.6. The blue line is an arbitrary threshold at which clusters of traits resulting from the two dendrograms were compared.
The box plots ( Figure 3) illustrate the variation among maturity groups at both N levels for selected traits. The differences among maturity groups (Mt) were not significant for the traits tm1, AP3, SCYi, and TbwA (Appendix B; AP3 is shown as an example in Figure 3). [N] is an example of a trait for which the differences between Mt were significant, supported by a positive and high correlation with the maturity assessment (mt_as). AUC and yield were also significantly affected by Mt, showing a negative correlation with mt_as.
The effect of N level was significant for most traits except for AP1, te-t2, DM%, SCYi, and TbnMx (see also Appendix B). In Figure 3, DM% is shown as an example of a trait that was unaffected by N levels.
[N], AUC, and Y_DM were strongly influenced by N input. Moreover, AUC, which is a parameter accumulating temporal and spatial progression of canopy development (which is directly linked to the amount of intercepted light and, therefore, photosynthetic potential over the whole growth cycle) was highly positively correlated with yield at both N levels. More N input promoted vegetative growth and prolonged the growing period, and both effects may support higher yields, provided the growing season is long enough for the potato tuber yield to benefit from the prolonged canopy development. Lastly, the tuber size with maximum weight (TbwB) was significantly affected by N but not by Mt, and this trait is a representative of a group of traits that behaved consistently different than other traits included in the analysis at both N levels ( Figure 2  Looking at one trait at a time, the highest correlation at each N level was between the same traits. Exceptions were tm1, t2, te, TbwMx, TbnA, and mt_as. For each of these traits, the highest correlation (to whichever other trait) was different for low N and high N. In general, with the diagonal in the correlation matrices excluded, there were 169 out of 552 combinations with absolute correlation coefficients lower than 0.4 at both N levels (using 0.4 as threshold, i.e., equivalent to a significance level smaller than 3.5 × 10 −10 to prove that the correlation was different from 0, to describe the matrix; Figure 1). Additionally, there were more pairwise correlations with absolute values higher than 0.4 at low N than at high N, showing the overall effect of N on trait relationships.
The diagonal of the matrix ( Figure 1) shows the correlation coefficients between N levels for each trait. AP3 and te-t2 were the least consistent traits across N levels with very low values (0.18 and 0.20, respectively). The traits showing the highest positive correlations between N levels were DM%, Y_DM, mt_as, AUC, and TbnMx. Thus, the expected interaction of these traits with N level across the cultivar set was lowest.
Looking at hierarchical clustering of traits ( Figure 2), yield (Y_DM) was grouped closer (more similar) to the CDv parameters AUC, t2, and te, as well as to maturity mt_as, [N], and DM%, at low N than at high N. At high N, NUpt, Vx, and TbwMx were closer to yield. Furthermore, there were five traits clustering together under both N conditions. This group included all traits from tuber size-weight and size-number distribution but not TbwMx (left-hand side of both HN and LN hierarchical trees in Figure 2), all being highly correlated to each other, as they describe the same phenomenon: tuber size distribution.
The box plots ( Figure 3) illustrate the variation among maturity groups at both N levels for selected traits. The differences among maturity groups (Mt) were not significant for the traits tm1, AP3, SCYi, and TbwA (Appendix B; AP3 is shown as an example in Figure 3). [N] is an example of a trait for which the differences between Mt were significant, supported by a positive and high correlation with the maturity assessment (mt_as). AUC and yield were also significantly affected by Mt, showing a negative correlation with mt_as. Green "×" symbols followed by number codes represent outliers.

Association Mapping
The association mapping was performed with kinship correction to minimize false positive associations. The marker-trait associations reported here had −log10(p) values higher than 4 and explained at least 10% of the variance. The results of the association mapping are presented as marker-trait associations to generally describe the output of the analysis, to have an overall impression of the N level effect on the detection of associations in our dataset and to assess the colocalization of association with markers related to maturity. QTLs were defined using a linkage disequilibrium window of 8 Mb as mentioned in Section 5.
An overall summary is shown in Table 1A. The majority of the marker-trait associations were year-dependent, with only 166 associations (out of 950) detected in both years, reflecting a strong influence of environmental conditions. We focused on marker-trait associations detected in both years to compare the results for high and low N levels. In general, more marker-trait associations were detected under high N input than under low N.
Overall, 20 traits showed associations that were present in both years (irrespective of the N levels). A QTL for maturity assessment (mt_as) in our experiment was detected in the region on Chromosome 5, reported as maturity-related in the literature [37][38][39][40][41]. This region was an association hotspot with 11 traits (AP1, AP2, AUC, mt_as, [N], t1, t2, t2-t1, The effect of N level was significant for most traits except for AP1, te-t2, DM%, SCYi, and TbnMx (see also Appendix B). In Figure 3, DM% is shown as an example of a trait that was unaffected by N levels.
[N], AUC, and Y_DM were strongly influenced by N input. Moreover, AUC, which is a parameter accumulating temporal and spatial progression of canopy development (which is directly linked to the amount of intercepted light and, therefore, photosynthetic potential over the whole growth cycle) was highly positively correlated with yield at both N levels. More N input promoted vegetative growth and prolonged the growing period, and both effects may support higher yields, provided the growing season is long enough for the potato tuber yield to benefit from the prolonged canopy development. Lastly, the tuber size with maximum weight (TbwB) was significantly affected by N but not by Mt, and this trait is a representative of a group of traits that behaved consistently different than other traits included in the analysis at both N levels ( Figure 2).

Association Mapping
The association mapping was performed with kinship correction to minimize false positive associations. The marker-trait associations reported here had −log 10 (p) values higher than 4 and explained at least 10% of the variance. The results of the association mapping are presented as marker-trait associations to generally describe the output of the analysis, to have an overall impression of the N level effect on the detection of associations in our dataset and to assess the colocalization of association with markers related to maturity. QTLs were defined using a linkage disequilibrium window of 8 Mb as mentioned in Section 5.
An overall summary is shown in Table 1A. The majority of the marker-trait associations were year-dependent, with only 166 associations (out of 950) detected in both years, reflecting a strong influence of environmental conditions. We focused on marker-trait associations detected in both years to compare the results for high and low N levels. In general, more marker-trait associations were detected under high N input than under low N. Table 1. Number of marker-trait associations detected by genome-wide association analysis using data not corrected (A) and corrected (B) for maturity. The correction was done as explained in Section 5. The associations included fulfilled the criteria of having a −log 10 (p) value > 4 and explained variance >10%.

Figure 4.
Visualization of genome-wide association analysis (data not corrected for maturity); cm = common associations between N levels (green +), HN = associations at high nitrogen (red circle), and LN = associations at low nitrogen (blue circle). All traits are included, and only associations detected in both years with a −log10(p) value > 4 are shown. A higher intensity of the color corresponds to a higher number of marker-trait associations located in close proximity of each other.
At low N, 27.7% of the N-dependent marker-trait associations colocalized with mt_as (Table 1A). Seven traits (DM%, t2-t1, TbnA, TbnB, TbnMx, TbwB, and TbwMx) showed a total of 11 QTLs with markers not colocalizing with mt_as (Table 2). Moreover there was an N-dependent QTL for mt_as detected at low N on Chromosome 3 and a QTL for te-t2 detected at low N but colocalizing with mt_as on Chromosome 5. More QTLs were detected at high N level than at low N level, and even fewer QTLs were detected at both N levels ( Table 2).  Figure 4. Visualization of genome-wide association analysis (data not corrected for maturity); cm = common associations between N levels (green +), HN = associations at high nitrogen (red circle), and LN = associations at low nitrogen (blue circle). All traits are included, and only associations detected in both years with a −log 10 (p) value > 4 are shown. A higher intensity of the color corresponds to a higher number of marker-trait associations located in close proximity of each other.
Trait associations detected at both N levels (Table 1A, common to high N and low N) were considered N-independent associations. Eighty-eight percent of these associations were with markers also associated with mt_as within a window of 8 Mb (see Section 5) on Chromosomes 5 or 3. Eleven QTLs for eight traits were N-independent ( Table 2). Six of these QTLs (for AP2, AUC, [N], t2, te, and te-t2) were located on Chromosome 5. The other two traits were SCYi (with QTLs on Chromosomes 1, 2, and 11) and TbnB (with a QTL on Chromosome 3).
Marker-trait associations detected at only one N level were considered N-dependent associations. At high N, 45% of these associations involved mt_as-associated markers (on Chromosomes 5 and 3), all within the LD window of 8 Mb. Eleven traits with 24 QTLs were exclusively detected at high N (Table 2), and nine of these traits (AP1, SCYi, t1, t2-t1 TbnMx, TbwA, TbwMx, tm1, and Y_DM) did not have QTLs colocalizing with maturity assessment (Tables 1A and 2).
At low N, 27.7% of the N-dependent marker-trait associations colocalized with mt_as (Table 1A). Seven traits (DM%, t2-t1, TbnA, TbnB, TbnMx, TbwB, and TbwMx) showed a total of 11 QTLs with markers not colocalizing with mt_as (Table 2). Moreover there was an N-dependent QTL for mt_as detected at low N on Chromosome 3 and a QTL for te-t2 detected at low N but colocalizing with mt_as on Chromosome 5. More QTLs were detected at high N level than at low N level, and even fewer QTLs were detected at both N levels ( Table 2). "cm" represents N-independent QTLs, i.e., QTLs detected for both N levels. "HN" and "LN" represent N-dependent QTLs, i.e., QTLs exclusively detected at high N level or QTLs exclusively detected at low N level, respectively. For the trait acronyms, see Sections 5.5 and 5.6. Only QTLs detected in both years are included.

Association after Maturity Correction
Maturity had a strong effect on the traits measured at both high and low N, and the genomic region associated with maturity type was related to most of the traits measured. To gain more insight into the effect of maturity and to allow detection of maturity-independent QTLs, we corrected for the effect of the maturity classes (i.e., the differences in the means of the trait values between the maturity classes), effectively equalizing the trait means per maturity group. The relative differences between genotypes were maintained within a maturity group but corrected when comparing genotypes across maturity groups.
The total number of associations detected with the phenotypic data corrected for the maturity main effects (CD) was 348, with 181 markers (Table 1B), much lower than with the noncorrected data (NCD) ( Table 1A). The number of associations consistently found in both years at either N level was almost half compared with the NCD, but the number of markers involved was very similar (74 compared with 66 for the NCD and CD, respectively). This is because most of the trait associations colocalizing with the maturity assessment in the NCD disappeared after correction, as expected. The number of marker-trait associations common to both N levels after the correction was only eight, involving three traits (SCYi, TbnB, and DM%).
There were 24 QTLs for 11 traits commonly detected with both datasets (CD and NCD) (Table 3), with more QTLs detected at high N than at low N (13 and six, respectively), while five QTLs were common to both N levels. A QTL for SCYi on Chromosome 5 (no QTL detected for mt_as at this position) was consistently detected at both N levels in both analyses (with CD and NCD). Furthermore, there were 17 QTLs for 11 traits detected only after the maturity correction. Three QTLs were N-independent (for DM%, te-t2, and [N]), while 14 QTLs were N-dependent with seven QTLs at high N (for TbnA, SCYi, DM%, TbwMX, SCYi, and mt_as) and seven QTLs at low N (for tm1, t1, Vx, AP2, AP3, NUpt, and TbnA). There were regions accumulating QTLs for three or more traits on Chromosomes 3, 4, and 12. Table 3. Peak markers of QTLs consistently detected in both years (corrected and not corrected for maturity). For the trait acronyms, see Sections 5.5 and 5.6. N level: Whether the associations were detected at high nitrogen (HN), detected at low nitrogen (LN), or common to both nitrogen levels (cm).

Discussion
In this study, we combined canopy development modeling with an association mapping analysis to reveal the genetic basis of developmental, physiological, and agronomic traits with varying N availability. We applied established methodologies as used by D'hoop [36,42]. The association analysis was done after correction for relatedness, which is the accepted standard because it decreases the probability of false positives [31,43]. In potato, D'hoop et al. [42] showed an increased level of LD within specific cultivar groups demonstrating the importance of correcting for relatedness.
Canopy cover of potato shows a very large genetic variation [3,17,18,[44][45][46] and a large genotype × environment interaction [47][48][49]. It is controlled by multiple interacting genes, each having only a relatively small effect [27], like in many other crops [50][51][52][53]. Moreover, the genotype × environment interaction is strongly influenced by the maturity type of the cultivar [17,18]. However, the impact of the genotype × environment interaction on the various components of canopy cover over time is variable [3,17]. These effects come about through the impact of environmental factors on stem development (stem number, stem branching, and sympodial growth), leaf appearance, leaf expansion, and leaf senescence [54]. In the specific case of nitrogen supply, the interaction between the cultivar and nitrogen supply affects the number of branches (either the lower lateral branches or the sympodial branches at the top of the stem), the number of leaves on the main stem and on the different types of branches, the rate and duration of leaf expansion (resulting in the final size of the individual leaves), the duration of the life span of the leaf, and the rate of leaf senescence [55]. In close interaction with the maturity type [3], nitrogen supply supports a rapid development of the canopy early in the growing season and a long duration of canopy cover throughout the remainder of the season, thus enabling an advanced, enhanced, and prolonged light interception, allowing high tuber yields [3,17,56].
Our results showed effects of N levels on the relationship between traits based on the genetic correlation (Figure 2), similar to the results based on phenotypic correlation for both N levels reported by Ospina et al. [3]. We demonstrated the effect of N input on canopy development and yield traits (Figure 3), as well as the strong contribution of maturity type, which is the major factor determining development, to the genetic variation. The genetic variation resulted in QTLs consistently detected in both years at both N levels for 20 of the 24 traits included in this study. Many of these QTLs accumulated in a single region on Chromosome 5 that is known to be linked to maturity type as shown by Kloosterman et al. [40], who identified an allelic variation of the CDF1 (cycling DOF factor) gene at this locus which strongly influences phenology, plant maturity, and onset of tuberization, reflecting the importance of this region for quantitative developmental traits.
Effects of nitrogen availability on potato development were reported by many authors [2][3][4][5][6]9,[11][12][13][14]16]. As stated above, in general, more available nitrogen advances, enhances, and prolongs canopy cover as a result of improved haulm growth [57], as well as the initiation of more leaves with a longer life span [4]. Our canopy development model describes this elegantly and in a quantitative way with biologically meaningful parameters, thus illustrating the genotype × environment interaction in an analytical way. The three phases of canopy development (see Section 5.6) responded to N with cultivars having a faster buildup phase of the canopy, resulting in a shorter time to reach maximum coverage (t1) and a higher maximum cover (Vx) for a longer period (t2-t1) at high N input, all resulting in higher photosynthetic potential [3,14].
Most traits included in this study had relatively high genetic correlations between high and low nitrogen conditions, except for AP3, te-t2 (Phase 3 of CDv), and t1. These high correlations reflect the consistency of the genotypic behavior under varying N availability, at least for canopy development parameters associated with the period of maximum canopy cover. Phase 3 of CDv was difficult to phenotype precisely due to the senescence process itself, which starts with yellow leaves until an uncertain point when the canopy collapses. The yellowness could start early if conditions are not favorable but the crop continues to take up nitrogen. On the other hand, wind and rain can accelerate the collapse and those factors are difficult to predict. Therefore, Phase 3 parameters showed the largest random error, explaining the low heritabilities of AP3 and te-t2 [3]. The relationships between some of the traits based on their genetic correlation coefficients were slightly different between N input levels. For instance, yield has an absolute correlation with AUC higher at low N than at high N, as a result of changes in the relationship with other traits. Under high N input, there are no nutrient constraints for canopy development leading to an expansion of the duration of the potato growth phases [3,14]. However, it is known that, with high N input, important traits determining yield such as leaf area index (LAI) and radiation use efficiency (RUE) are positively affected [24]. LAI continues to increase even when the soil coverage is 100% [9,58]; the maximum coverage is also reached faster and sustained longer at high nitrogen input [3]. Therefore, although yield and AUC are highly correlated under both N conditions, the contribution of LAI and RUE under high N may not be fully captured by the AUC. This could also be reflected in the QTLs detected; QTLs common to both N levels for yield and AUC were found on Chromosome 5 ( Table 2) and colocalized with QTLs for maturity (mt_as), while QTLs exclusively detected at high N for yield were located on Chromosomes 9 and 12. A possible explanation is that the latter two might be associated with the contribution of RUE and/or LAI to yield.
Genomic regions with possible pleiotropic effects were detected on Chromosomes 2, 5, and 6 ( Figure 4). The QTL hotspot on Chromosome 5 was the most noticeable, accumulating QTLs for 50% of the traits on this maturity-related region, as similarly shown by Hurtado-Lopez et al. [39] with developmental traits related to senescence and flowering and with plant height. Most of the traits with QTLs in this region were highly correlated with maturity assessed in our trials (mt_as), emphasizing the importance of maturity and the genomic region on Chromosome 5 for crop development. Moreover, as a general remark, the colocalization of QTLs was mostly determined by the correlation between traits. Furthermore, there was an N dependency of some QTLs for several traits. The region on Chromosome 2 accumulated QTLs for six traits (AP1, t1, TbnA, TbwA, TbwMX, and SCYi) at high N input, while the region on Chromosome 6 was related to four traits (with QTLs for TbnB, TbnA, TbwB, and tm1) at low N input. This shows the strong effect of available N on the genetic response, as well as its complexity.
Regarding the N-dependent QTLs, at high N input, more QTLs involving more traits were detected than at low N input, along with a higher percentage of marker-trait associations on Chromosome 5. Gallais and Hirel [59] found in maize more QTLs for some traits at high N input than at low N input (vegetative development, N uptake, and yield components), while, for other traits, it was the opposite (N utilization efficiency and protein content). This is a reflection of the difference in the expression of the genetic variability between high and low N input that may be trait-dependent. The trait × nitrogen interaction was translated into a QTL × nitrogen interaction in those studies, as well as in our study.
N-independent associations were mostly located at the maturity locus on Chromosome 5 (data uncorrected for maturity). Khan [16] used a similar phenotyping approach to study potato canopy development and reported a major QTL hotspot on Chromosome 5 in a diploid biparental population (SH × RH) affecting all parameters of the canopy cover curve in several environments. Ospina [24], using the same diploid biparental mapping population, reported the same QTL region on Chromosome 5 at both high and low nitrogen levels. In addition, QTLs for growth and yield traits in this region were found in drought tolerance QTL mapping in the greenhouse of the C × E diploid mapping population [60] and in multiple environments for the same population [27,39]. Therefore, the overall and predominant effect of maturity on canopy development and on yield appears to be stable across different environments, nitrogen conditions, and populations.
N-independent QTLs different from those of the maturity locus on Chromosome 5 were found only for SCYi and TbnB (Figure 4). These two traits were not correlated with the maturity assessment (mt_as) (Figure 1). For SCYi, there were N-independent QTLs on Chromosomes 1, 2, 5, and 11, while, in the diploid mapping population, SH × RH [24], Nindependent QTLs were found on Chromosomes 5 and 10 (referred to as linkage groups V and X in the multitrait QTL analysis [24]). This might suggest that the genetic background, as well as the population type, influences the genomic regions related to a trait. For TbnB, we found N-independent QTLs, as well as a QTL at low N level, on Chromosome 3. Schönhals [61] also found associations for tuber number on this chromosome (as well as on Chromosomes 1, 5, and 6), using markers for candidate genes that were functionally related to tuber yield and starch. A comparison of the results with previous reports is difficult because different markers were used in different populations. For the markers used in the detection of QTLs with the SH × RH diploid biparental population by Ospina [24], there were no physical positions available (these makers were not used in this association analysis), while, for the SNPs used here in the association mapping, there are no genetic positions known on the SH × RH genetic map.
After the maturity correction, the number of N-independent marker-trait associations was drastically reduced. Since most of the traits were maturity-related, the maturity correction was expected to have a strong impact on the detection of QTLs. Only the Nindependent QTLs for SCYi and TbnB remained after the correction (these were not linked to maturity, and the traits did not correlate with maturity). Similarly, D'hoop et al. [62] showed the impact of maturity. In their phenotypic analysis, the presence or absence of maturity as a term in the model influenced the genotypic effects for two traits studied, underwater weight and maturity trait (both traits are physiologically correlated), but not for the majority of quality traits, which were not correlated with maturity. Their association analysis using maturity-corrected values in a model with a correction for relatedness showed a reduction in the number of marker-trait associations detected for these two traits (underwater weight and maturity), while, for other quality traits, there was no clear trend [63].
The maturity correction of our data using predefined information of the cultivars (the maturity grouping factor) was effective in removing the maturity effect, although it might have affected the detection of association, most probably reducing the power since the overall variation was reduced. However, it allowed detection of new QTLs ( Table 3) that did not colocalize in the main region related to maturity on Chromosome 5. For instance, a new N-independent QTL for DM% on Chromosome 3 was detected. This QTL was expected to be N-independent since the DM% trait did not have a significant nitrogen effect (Appendix B). The results with the diploid population SH × RH [24] also showed a QTL for DM% on Chromosome 3. In general, the maturity dependency of some QTLs resulted from the physiological relationship of canopy development traits and agronomic traits with maturity (these relationships were discussed by [3,24]). Kloosterman et al. [40] identified the causal gene within the Chromosome 5 maturity locus. Allelic variation of the CDF1 (cycling DOF factor) gene at this locus strongly influences phenology, plant maturity, and onset of tuberization. The CDF1 gene has a great effect on the plant life-cycle length by acting as a mediator between photoperiod and the tuberization signal. This major effect acts on several processes of the plant, resulting in a strong linkage between maturity and traits related to CDv and yield. Khan [16] also mentioned the dependency of tuber yield on its components (especially tuber bulking parameters and CDv traits), as these are physiologically and surely genetically related; genotypes with higher tuber bulking rates show limited haulm growth and canopy duration, leading to an early maturity type [16].
We showed how genetic factors determining canopy development and yield traits in potato cultivars interact with N levels. The different QTL regions detected for a trait under contrasting N conditions may imply that the phenotypes are the result of a tradeoff between these QTL regions. The detection of N-dependent QTLs emphasizes the importance of direct selection under limiting N conditions only if the QTLs contribute to the traits of interest. The contribution of genetic factors to growth and yield is affected by N input, with different interactions between the traits under low N than under high N and, therefore, different contributions of the traits to the observed phenotype. Ospina et al. [3] mentioned that, to breed for NUE under low input, the strategy should be to select for high yield under low N and combine this with a high responsiveness to more N input. This allows for selection of better adapted genotypes in N-limiting conditions. In addition, to bypass the strong linkage with maturity that is observed for developmental traits, NUE, and yield, the selections should be done within each maturity group. Thus, the phenotyping should be made more discriminative to exploit the variation in a narrow maturity category. Additionally, the strong correlation of most of the traits mentioned with maturity can mask useful genetic variation for these traits, as exemplified in this paper. An early selection is required to increase the number of individuals in the target maturity class. This might be achieved by developing marker selection for maturity. Small differences in the existing trait will then be detectable, and selectors and breeders may be able to identify new traits and be more discriminant in their assessments for these other traits.
In general, the reports in other crops on QTL detection under contrasting N conditions have shown the great influence of environmental conditions. For example, in barley, the detection of QTLs was reported to show an extensive G × E/QTL × E interaction, with QTLs changing between years irrespective of N levels [64]. In maize, contrasting results have been reported when looking for QTLs under high and low N input. QTLs for grain composition and NUE-related traits detected at low N corresponded to QTLs detected at high N input [65], while other results showed that QTLs for NUE were only detected at low N input [66]. Hirel et al. [67] suggested that, depending on the recombinant inbred line population, the response of yield to various levels of N fertilization could be different and, thus, controlled by a different set of genes. We only reported QTLs consistently detected in both years for each N level. Therefore, our research does not directly address the G × E interaction, for which more experiments would be needed.
However, our research also demonstrated extensive G × E/QTL × E interactions, as illustrated in Table 1; the number of marker-trait associations was significantly higher across all datasets (950 marker-trait associations detected irrespective of the year) than the number of associations detected in both years (only 166 marker-trait associations detected in both years). In the context of our experimental setup, nitrogen level was the major control factor driving the differences within this very constant physical environment. The total number of marker-trait associations (n = 166) detected in both years could be broken down in groups that were detected at both N levels (n = 50), only at high N (n = 69), or only at low N (n = 47) (Table 1). Moreover, there was a strong effect of maturity type on the detection of these marker-trait associations. When data were corrected for maturity type, only 86 marker-trait associations were found, which could be broken down into three groups; eight were detected at both N levels, 48 were only detected at high N, and 30 were only detected at low N (Table 1). Therefore, the nitrogen dependency of some QTLs could be interpreted as a QTL × N interaction.
Our approach focused on contrasting N input levels using a single N application, and this is a first step to understanding the genetic factors involved in the response of potato to N. It is important to mention that fertilization practices such as split application might have an additional effect on the plant response to nitrogen, especially in relation to the different maturity types. Additionally, soil mineral N supply during the growing season is difficult to control and understand [7] since it is a dynamic factor. Goffart et al. [68] mentioned that soil mineral N supply is influenced by several predictable and unpredictable factors, such as weather conditions, chemical and physical soil properties, type and evolution of organic matter previously incorporated in the soil, cultural practices, maturity type of the cultivar, and crop duration. This N dynamic in the soil could result in different levels of available N. This difference will affect the crop development response of the cultivar and, thus, the variation of the traits, thereby affecting the consistency in the detection of QTLs or marker-trait associations.
Lastly, the understanding of the influence of an intrinsic major genotypic factor such as maturity type is valuable to refine breeding strategies, as well as to develop cultivars suitable to low N input or otherwise limiting conditions. Furthermore, the results presented here suggest that breeding schemes should be done within maturity groups, with the main idea of improving characteristics that are highly influenced by maturity, such as DM%, N content, and NUE, within a maturity group.

Conclusions
We used a detailed phenotyping dataset based on a 2-year experiment with 169 cultivars or other genetic material, grown at two levels of nitrogen in an otherwise very uniform physical environment, and we analyzed that dataset using a dynamic canopy cover model with biologically meaningful parameters. Cultivar-specific parameter values of that model and the tuber growth characteristics model were associated with markers from a 20K Infinium array. Twenty-four traits and 10,747 SNP markers were combined in a GWAS with kinship correction. Nitrogen had a strong effect on most traits and on the correlations between these traits. Nitrogen also strongly affected the detection of marker-trait associations. We could show that some associations were detected at both nitrogen levels, whereas others were only found at a low or high nitrogen level. However, correction for maturity type proved to be essential for the interpretation of the data. After correcting for maturity, the number of QTLs detected became much lower, especially for those that were common to both N levels; however, interestingly, a region on Chromosome 2 accumulated several QTLs. Apparently, there are strong links between maturity type and traits associated with nitrogen husbandry of the potato crop.

Materials and Methods
The experimental design, data collection, and processing to generate the phenotypic information used in this paper were described in detail by Ospina et al. [3]. Therefore, a brief description suffices here.

Location and Planting Material
Experiments were carried out at the Agrico research and breeding station (Bant, Flevoland, The Netherlands), in 2009 and 2010. We used a set of 189 cultivars representing a wide diversity of potato cultivars commonly grown in Europe, ancestors, and progenitor clones (Appendix C). The set has been extensively used for association studies of quality traits, as described by D'hoop et al. [36,42].

Experimental Design and Treatments
In both experiments, two N levels were implemented: (i) high N, with 180 kg available N/ha (soil N and fertilizer N combined) as a standard conventional N input level, and (ii) low N, with 75 kg available N/ha as the low input variant. The amount of fertilizer required was calculated on the basis of a soil analysis done at the beginning of the growing season. Fertilizer application was split into two. A basic fertilizer treatment was applied just after planting (NPK) on the whole experimental field to reach the amount for low N. A second amount was applied to the high-N plots only, before the final ridging, using dolomite ammonium nitrate (DAN, 27-0-0). P and K were abundantly available for potato crop growth in both N treatments.
The experimental design was an unbalanced split-plot design, with 16 plants per genotype per field plot, with treatments (N) as whole plots (with no replicates), maturity groups as subplots randomized within whole plots, and cultivars nested and randomized within maturity subplots. An additional 16 (2009) or 20 (2010) field plots with a reference cultivar were planted at random across the field to estimate the plot-to-plot environmental variation without confounding cultivar variation.

Data Collection
Emergence date was estimated per plot as the first date when more than 50% of the plants in the plot had emerged (i.e., first leaf visible). The percentage of soil cover (SC) was assessed weekly over three plants per plot (same three plants) all through the growing season from emergence until harvest. Maturity was scored using a scale to assess the progress of senescence (modified from [37]) in which 1 = green canopy with the first flower buds, 2 = green haulm with abundant flowers, 3 = first signs of yellowness in the upper leaves, 4 = up to 25% of the plant with yellow leaves, 5 = up to 50% of the plant with yellow leaves or lost leaves, 6 = up to 75% as in 5, 7 = up to 90% of the plant yellowed or without leaves, and 8 = entire haulm brown or dead. This assessment is referred to as maturity assessment (mt_as) to avoid confusion with the maturity index used to form maturity groups as blocking factor (Mt).

Final Harvest
The final harvest took place as late as possible to allow late cultivars to complete their cycle. The whole experiment was harvested at once. Sixteen plants were harvested per plot, and the following tuber traits were assessed: (A) total tuber fresh weight; (B) tuber size and weight distribution, for which six size classes were included (0-30 mm, 30-40 mm, 40-50 mm, 50-60 mm, 60-70 mm, and >70 mm), recording the tuber number and tuber weight for each class; (C) tuber number per meter (obtained for the class 50-60 mm); (D) dry matter percentage (DM%), as the dry weight of a sample divided by its fresh weight, expressed as a percentage, where tubers from all size classes were cut using a French fries cutting machine before drying at 70 • C for 48 h; (E) N content ([N]) in the tubers, assessed using the Kjeldahl protocol.

Data Processing
A canopy development model was fitted using the NOLIN procedure of SAS/STAT ® , with percentage soil cover as the dependent variable of Beta thermal time counted from emergence day until each assessment date. Five parameters were estimated for each individual plot [16][17][18]. Four t-parameters were expressed in thermal days (td): tm1 (inflection point in the growing phase of the curve), t1 (when SC stabilized), t2 (start of senescence), and te (when canopy had completely senesced). The fifth parameter, Vx, was the maximum SC reached with percentage soil coverage (%SC) as unit.
A bell-shaped curve was fitted per plot for tuber weight and tuber number datasets separately (Tbw and Tbn respectively) to describe their distribution. Three parameters were estimated for each dataset following Equation (1) ( Figure 5).  (1)). Parameter names are explained in the text (see Section 5.5). Tb is either Tbw or Tbn, "A" is a dispersion parameter expressing how the weights/numbers were distributed across tuber size classes, "mcl" is the average size of each tuber size class, and "B" is the average size at which the maximum ("MX") weight/number occurs. The curve-fit parameters were named for each variable as follows: for Tbw data, TbwA, TbwB, and TbwMx; for Tbn data, TbnA, TbnB, and TbnMx.  (1)). Parameter names are explained in the text (see Section 5.5). Tb is either Tbw or Tbn, "A" is a dispersion parameter expressing how the weights/numbers were distributed across tuber size classes, "mcl" is the average size of each tuber size class, and "B" is the average size at which the maximum ("MX") weight/number occurs. The curve-fit parameters were named for each variable as follows: For Tbw data, TbwA, TbwB, and TbwMx; for Tbn data, TbnA, TbnB, and TbnMx.

Calculated Variables
On the basis of the parameters estimated with the CDv model, the following variables were calculated [16][17][18]: t2-t1 (duration of maximum SC in td), te-t2 (duration of senescence in td), Cm (maximum progression rate of %SC in %/td), AP1 (area under the curve for canopy buildup phase in %.td), AP2 (area under the curve for phase of maximum SC in %.td), AP3 (area under the curve for senescence phase in %.td), and AUC (area under the curve for the entire crop cycle in %.td). In order to express the agronomic variables in a standard way, subsequent calculations and conversions were done as follows: N content ([N]) in g/kg (determined only in tubers); DM% in percentage; dry matter yield (Y_DM) in kg/m 2 , i.e., Y × DM%/100; N uptake in tuber (NUpt) in g/m 2 , i.e., Y_DM × [N]; N use efficiency (NUE) as Y_DM/(N input) in kg/g; N utilization efficiency (NUtE), i.e., Y_DM/NUpt, in kg/g; N uptake efficiency (NUptE; NUpt/N input in g/g); and soil coverage yield index (SCYi = AUC/Y_DM in %.td/(kg/m 2 )). The variables were analyzed without transformation since there were no severe violations to the assumptions required for mixed model analysis. All trait acronyms are summarized in Appendix D.

Statistical Analysis
Data were analyzed with the Genstat package (16th edition). The model in Equation (2) combining information of both years was used for each N level.
where terms joined by " * " represent individual effects plus the interactions (yr * Mt = yr + Mt + yr.Mt), whereas terms joined by "." represent interaction only. The term yr represents year, clarifying that year effects include variation due to the experimental field. The term Mt is the maturity group excluding control plot information. Corrections for rows and columns are the random terms (yr.row and yr.col). The term Mt.G represents the cultivars nested within maturity groups, since maturity is an intrinsic characteristic of each cultivar.
Lastly, E represents the error. All random terms are underlined. The genetic correlations between traits were estimated as the Pearson correlations based on the estimated genotypic means, BLUEs, i.e., best linear unbiased estimates (excluding all other terms in Equation (2)). In addition, in order to understand relationships between traits and to define groups of traits, a divisive hierarchical cluster analysis was carried out, i.e., the analysis was top-down, performing recursive splits when going down the hierarchy. For this hierarchical cluster analysis, we used the absolute genetic correlations between traits as a similarity measure and applied the Ward minimum-variance method. The hierarchical cluster analysis was carried out separately for the two nitrogen levels. We excluded traits for which calculation included N input level, i.e., NUE, NutE, and NUptE. Additionally, biplots were generated to visualize relationships between traits per N level, included in Appendix E.

Association Mapping
The analysis included the 169 potato cultivars out of the total set of 189 cultivars for which genotypic data were available. SNP data were generated using a 20K Infinium SNP array [69]. A total of 14,587 markers were successfully scored in (a maximum of) five dosage classes per SNP using fitTetra [70]. The dosage classes were nulliplex, simplex, duplex, triplex, and tetraplex depending on the number of copies of the allele being quantified (0 to 4). Only SNPs having allele frequencies greater than 5% in at least two of the dosage classes were considered. Thus, frequencies of minor alleles were ignored, and a total of 10,747 SNPs were used to perform the GWAS.
The GWAS was performed using a mixed model including a kinship matrix to account for population structure. The kinship matrix was estimated using 764 SNP markers randomly distributed over the genome, expressed as (with random terms being underlined): Trait (y) = Marker (m) + genotype + residual where var(genotype) = Kσ g 2 and K = Kinship matrix.
Linkage disequilibrium between markers was extensively studied by Vos et al. [69] and D'hoop et al. [42] using the same cultivar set. From those studies, the linkage disequilibrium decay was estimated to be between 2 and 4 Mb. We considered LD as 4 Mb and a window of 8 Mb, i.e., the apposition of a marker ±4 Mb). For a full, detailed description of the 20K SNP array, see https://edepot.wur.nl/392278 (accessed on 20 July 2021).
The association analysis was done using fitted values for the observations, (BLUPs, i.e., best linear unbiased predictions) using the model of Equation (2). Four phenotypic datasets corresponding to combinations of both years and N levels were the input in this analysis. In Section 2, we only considered associations with a −log 10 (p) > 4. Then, the focus was to find marker-trait associations consistent across both years, for each N level. Next, we compared results from the two N levels defining associations detected at both N levels as common (cm), as well as those exclusive to either high N (HN) or low N (LN). The last two categories were considered N-dependent marker-trait associations.
Since maturity is known to have a strong effect on the traits considered in this study [3,[16][17][18], BLUEs excluding the main effect of maturity group were calculated. From each estimated value, the effect of the maturity class was subtracted, and these maturity type-corrected BLUEs were used as input in the association analysis with population structure correction.