Nucleotide Diversity and Association Analysis of ZmMADS60 with Root Length in the Maize Seedling Stage

Root length is a determining factor of the root system architecture, which is essential for the uptake of water, nutrients and plant anchorage. In this study, ZmMADS60 was resequenced in 285 inbred lines, 68 landraces and 32 teosintes to detect the nucleotide diversity and natural variations associated with root length. Nucleotide diversity and neutral tests revealed that ZmMADS60 might be selected in domestication and improvement processes. ZmMADS60 in maize retained only 40.1% and 66.9% of the nucleotide diversity found in teosinte and landrace, respectively. Gene-based association analysis of inbred lines identified nine variants that were significantly associated with primary root length (PRL), lateral root length (LRL), root length between 0 mm and 0.5 mm in diameter (RL005) and total root length (TRL). One single-nucleotide polymorphism SNP1357 with pleiotropic effects was significantly associated with LRL, RL005 and TRL. The frequency of the increased allele T decreased from 68.8% in teosintes to 52.9% and 38.9% in the landrace and inbred lines, respectively. The frequency of the increased allele of another significant SNP723 associated with PRL also decreased during the maize domestication and improvement processes. The results of this study reveal that ZmMADS60 may be involved in the elongation of primary and lateral roots in the seedling stage and that significant variants can be used to develop functional markers to improve root length in maize.


Introduction
Root systems are crucial for plant survival; they not only provide anchorage for the plant and acquisition of essential mineral nutrients and water from soil, but they also contribute to monitoring the changing environmental conditions [1]. One remarkable feature of immobile plants is that root systems are highly plastic to various environmental cues, such as the water content and nutrient levels in soil [2]. The root system architecture (RSA) can be modulated in several ways, such as primary root elongation, diameter, growth direction and adventitious and lateral root branching. These traits reshape the root system architecture, allowing plants to efficiently absorb water and nutrients for plant growth. Understanding genetic components and exploring the natural variation of root growth are key to breeding and engineering better root system architectures to enhance plant productivity. The maize root system comprises structurally and functionally different root types, including primary (PR), seminal (SR), crown (CR), brace (BR), and lateral roots (LR) [3]. A primary root emerges from a seed only two or three days after germination, followed about a week later by the formation of a variable number of seminal roots, which are embryonic [4]. All root types form postembryonic lateral roots, which are major determinants of the plant root system architecture [2]. These roots increase the total root length, biomass and surface area, potentially dramatically enhancing the contact area between the roots and soil for exploration of the soil environment for water and nutrients [5]. They are important for seedling vigor during early plant development. Some genes for these traits have been cloned in mutant analyses.
The rootless concerning crown and seminal roots (RTCS) gene encodes a 25.5-kDa lateral organ boundaries (LOB) domain protein that is a central regulator of auxin signaling, and the maize mutant RTCS is defective in the initiation of seminal roots and the shoot-borne root system [6]. The rum1 mutant is deficient in the initiation of embryonic seminal roots and post-embryonic lateral roots at the primary root. Map-based cloning demonstrated that rum1 encodes an Aux/IAA protein [7], and a transcriptional activator lateral root primordia 1 (LRP1) was further identified, the expression of which is repressed by the binding of rootless with undetectable meristem 1 (RUM1) to its promoter [8]. Association analyses in 74 maize inbred lines revealed several polymorphisms in rum1 associated with seedling root traits [9,10]. Numerous quantitative trait loci (QTL) associated with root traits have been identified under different environmental conditions in maize [11,12]. Several significant marker-trait associations with root traits have also been detected by genome-wide association analysis in diverse maize inbred lines [13]. However, most of these loci have not been cloned.
Transcription factors (TFs) play an important role in plant development [14]. A total of 3308 TFs belonging to 56 families have been identified in maize (MaizeSequence AGPv3.31) [15]. The MADS-box family comprises large genes with important functions in various aspects of flower development, flowering time control, inflorescence architecture, pollen development, seed/fruit development, and root development [16]. For example, FLOWERING LOCUS C (FLC) is a MADS-box gene, and FLC and its orthologous genes act as major regulators of flowering time in Arabidopsis, broccoli, and oilseed rape [16,17]. ZmMADS1, ZmMADS69 and ZAGL1 also function as flowering activators in maize [18-20]. OsMADS1 and OsMADS57 can control plant height and tillering in rice [16]. Members of the MADS-box gene family have been prominently studied during flower and plant development; however, the role of these types of TFs in root development has received relatively less attention. It has been reported that more than half of MADS-box genes are expressed in Arabidopsis roots [21]. XAL1/AGL12 and XAL2/AGL14 are involved in the regulation of PR elongation, and AGAMOUS-Like42 (AGL42), AGL16, AGL17, AGL18, and AGL21 are preferentially expressed in quiescent center (QC) cells. ANR1 plays a role in lateral root development in the presence of NO₃⁻ [22]. In rice, OsMADS25 can regulate the primary root length and lateral root (LR) density via auxin signaling [23], and the OsmiR156-OsSPL3 module directly activates OsMADS50 in the node to regulate crown root development in rice [24]. These studies highlight the important role of MADS-box genes in root development. To systematically study natural variations of MADS-box genes in maize, single-nucleotide polymorphisms (SNPs) in fifty MADS-box genes (Table S1) were filtered from a genotyping-by-sequencing (GBS) dataset [25] in 285 inbred lines.
A gene-based association analysis was conducted, and eight genes (Table S2) were significantly associated with seedling root traits. ZmMADS60 was the most significant gene and was associated with four root length traits (Table S2). In the present study, we further resequenced ZmMADS60 in 285 inbred lines, 68 landraces and 32 wild relatives, and a gene-based association analysis was conducted examining root traits in inbred lines. The objectives of this study were (1) to identify natural variations in ZmMADS60 associated with root length traits, (2) to detect favorable alleles and haplotypes within ZmMADS60 for root length, and (3) to examine whether ZmMADS60 was involved in maize domestication and improvement.

Plant Materials and Phenotypic Evaluation
In total, 285 inbred lines, 68 landraces and 32 teosintes were used in this study (Table S3). The root traits of inbred lines at the seedling stage were determined in a hydroponic system [25]. Seeds were sterilized in 10% H₂O₂ solution for 20 min, soaked in saturated CaSO₄ for 6 h, and then germinated on moist filter paper at 28 °C and 80% relative humidity in the dark for 2 days. Eight uniformly germinated seeds were selected and vertically rolled in germination roll paper (Anchor Paper Company, St Paul, MN, USA). The paper rolls were placed in black incubators containing 7.5 L nutrient solution. A completely randomized design with two replicates was used. Plants were harvested 14 days after germination, and the root number and primary and seminal root length were measured. The roots were then scanned and analyzed using WinRHIZO software (Pro 2004b, Canada). A total of 7 root traits were measured: primary root length (PRL), total root length (TRL), lateral root length (LRL), root length between 0 mm and 0.5 mm in diameter (RL005), root length between 0.5 mm and 1.0 mm in diameter (RL0510), root length between 1.0 mm and 1.5 mm in diameter (RL1015), and average root diameter (ARD).

ZmMADS60 Resequencing
Genomic DNA of inbred lines, landraces and teosintes was extracted from young leaves using the cetyltrimethylammonium bromide (CTAB) method. The ZmMADS60 gene was sequenced using targeted sequence capture technology on the NimbleGen platform by BGI Life Tech Co. [26]. The genomic sequence of ZmMADS60 (GRMZM2G152415) from the B73 inbred line (AGPv3.31) was used as a reference for target sequence capture following the manufacturer's protocols (Roche/NimbleGen) with modifications at the W. M. Keck Facility at Yale University [26]. DNA was sheared by sonication and adaptors were ligated to the resulting fragments. DNA fragments of the desired size were amplified by polymerase chain reaction (PCR), purified, and hybridized to the capture array at 42.0 °C using the manufacturer's buffer. The array was washed twice at 47.5 °C and three more times at room temperature. The resulting fragments were purified and subjected to DNA sequencing on the Illumina platform. The clean reads were mapped to the B73 reference genome sequence (AGPv3.31) by BWA with the settings 'mem -t 4 -k 32 -M' [27]; variant calling and gene sequence conversion were performed for all samples using GATK 4.0 [28].

Sequence Analysis, Genetic Diversity Analysis and Neutral Evolution Test
The ZmMADS60 gene sequences from all measured lines were aligned using MAFFT software [29] and manually improved using BioEdit [30]. The gene features [5′ untranslated region (UTR), 3′ UTR, introns and exons] were defined by gene annotation from MaizeSequence (B73, AGPv3.31). The sequence polymorphism analysis, genetic diversity analysis and neutral evolution tests were conducted using DnaSP 5.0 software [31]. π and θ were used to estimate the degree of genetic diversity within the tested population. Neutral evolution was tested using Tajima's D [32] and Fu and Li's tests [33].
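
As an illustration of the statistics named above, the following minimal Python sketch computes per-site π, Watterson's θ and Tajima's D from a set of aligned sequences. It is not the DnaSP implementation: the function name `diversity_stats`, the complete-deletion treatment of gapped or ambiguous sites, and the toy alignment are assumptions made only for this example.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return (pi, theta_w, tajima_d) for equal-length aligned sequences.

    pi and theta_w are per-site values. Sites with a gap or ambiguous
    base in any sequence are skipped (complete deletion, an assumption).
    """
    n = len(seqs)
    usable = 0       # sites where every sequence has A, C, G or T
    seg_sites = 0    # segregating sites among the usable sites
    pair_diffs = 0   # pairwise differences summed over usable sites
    for i in range(len(seqs[0])):
        column = [s[i].upper() for s in seqs]
        if any(base not in "ACGT" for base in column):
            continue
        usable += 1
        if len(set(column)) > 1:
            seg_sites += 1
            pair_diffs += sum(a != b for a, b in combinations(column, 2))

    k = pair_diffs / (n * (n - 1) / 2)       # mean pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    pi = k / usable
    theta_w = seg_sites / a1 / usable

    # Variance constants from the standard definition of Tajima's D
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    var = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (k - seg_sites / a1) / sqrt(var) if var > 0 else float("nan")
    return pi, theta_w, d


# Hypothetical toy alignment: four sequences, two singleton variants
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
pi, theta_w, tajima_d = diversity_stats(toy)
print(f"pi = {pi:.4f}, theta_W = {theta_w:.4f}, Tajima's D = {tajima_d:.3f}")
```

In the toy alignment both variants are singletons, so π falls below θ and D is negative, which is the pattern expected when rare alleles are in excess.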

Natural Variation of the ZmMADS60 Gene Associated with Root Traits in Inbred Lines
The association analysis between variants of the ZmMADS60 gene and the eight root traits was conducted using TASSEL v5.0 with mixed linear models (MLM). The top five principal components (PCs) and kinship (K) were used to control for population structure and relatedness and thereby reduce false positives. A total of 678 ZmMADS60-based markers with a minor allele frequency (MAF) ≥ 0.05 were selected for association analysis in 285 inbred lines. Using a Bonferroni correction based on 678 markers, the p-value threshold was 0.0015 (1/678, −log₁₀(p) > 2.83). A more stringent threshold (p < 0.001, −log₁₀(p) > 3.0) was applied to identify variants significantly associated with root traits.
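
The thresholds quoted above can be reproduced with a few lines of Python. This is only a worked check of the arithmetic; the variable and function names are illustrative, and the 0.05/n value is shown purely for comparison and is not a threshold used in this study.

```python
from math import log10

n_markers = 678  # markers with MAF >= 0.05, as stated in the text


def neg_log10(p):
    """Convert a p-value to the -log10 scale."""
    return -log10(p)


one_over_n = 1 / n_markers   # threshold stated in the text (1/678)
applied = 0.001              # threshold applied to call significant variants
family_wise = 0.05 / n_markers  # comparison only: 0.05/n, not used in the study

print(f"1/n      : p < {one_over_n:.4f}, -log10(p) > {neg_log10(one_over_n):.2f}")
print(f"applied  : p < {applied}, -log10(p) > {neg_log10(applied):.1f}")
print(f"0.05/n   : p < {family_wise:.2e}, -log10(p) > {neg_log10(family_wise):.2f}")
```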

Nucleotide Diversity and Selection of ZmMADS60 in Inbred Lines, Landrace and Teosinte
ZmMADS60 was resequenced in 285 inbred, 68 landrace and 32 teosinte accessions. After multiple sequence alignment, a total of 7429 bp of genomic region was sequenced, covering 1543 bp of the upstream region, 1019 bp of the 5′ UTR, 4755 bp of the coding region containing 11 exons and 11 introns, and 112 bp of the 3′ UTR (Table 1). Among these regions, 1199 variations were identified in all tested lines, including 1018 SNPs and 181 InDels. On average, SNPs and InDels were found every 7.30 bp and 41.04 bp, respectively. The highest frequencies of SNPs and InDels were found in the 5′ UTR (one every 6.21 bp and 29.97 bp, respectively). Sliding-window analysis showed that the overall nucleotide diversity (π × 1000) of the ZmMADS60 locus was 17.57. Among the four regions of ZmMADS60, the 3′ UTR was less diverse than the other regions (π × 1000 = 0.34), and the upstream region showed a high nucleotide diversity (π × 1000 = 37.59, Table 1). To investigate the selection mechanism of ZmMADS60 during maize domestication and improvement, the sequence conservation (C) and nucleotide diversity (π) were compared in the inbred lines, landrace and teosinte. For all test lines, the values of C and π × 1000 were 0.667 and 17.57, respectively (Figure 1). Compared with teosinte, the landrace and inbred lines showed higher conservation (C = 0.728, 0.826 and 0.855 for teosinte, landrace and inbred lines, respectively) and lower diversity (π × 1000 = 43.28, 25.96 and 17.36 for teosinte, landrace and inbred lines, respectively). The nucleotide diversity ratio was 40.1% and 66.9% for maize/teosinte and maize/landrace, respectively. The highest divergence between the inbred lines and teosintes was observed in the upstream regions and the seventh intron (Figure 1b).

Figure 1. […] denotes the number of single-nucleotide polymorphisms (SNPs) per 1000 bp; C represents sequence conservation; D and F represent Fu and Li's D* and F*. * indicates statistical significance at the p < 0.05 level; ** indicates statistical significance at the p < 0.01 level. (b) Nucleotide diversity (π) of inbred lines, landraces, and teosinte. π was calculated using the sliding-window method with a window size of 100 bp and a step length of 25 bp. A schematic diagram of the genomic region of ZmMADS60, including the upstream sequence and introns (light gray), the coding region (black), and the 5′ UTR and 3′ UTR (blue), is presented.
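
The summary figures reported in this section follow from simple arithmetic on the stated counts and diversity values. The short Python check below uses only numbers given in the text; the variable names are illustrative.

```python
# Region lengths in bp, as reported for the resequenced ZmMADS60 locus
region_bp = {"upstream": 1543, "5'UTR": 1019, "coding": 4755, "3'UTR": 112}
total_bp = sum(region_bp.values())

n_snps, n_indels = 1018, 181
snp_spacing = total_bp / n_snps      # average bp per SNP
indel_spacing = total_bp / n_indels  # average bp per InDel

# Nucleotide diversity (pi x 1000) per population, as reported
pi_x1000 = {"teosinte": 43.28, "landrace": 25.96, "inbred": 17.36}
retained_vs_teosinte = 100 * pi_x1000["inbred"] / pi_x1000["teosinte"]
retained_vs_landrace = 100 * pi_x1000["inbred"] / pi_x1000["landrace"]

print(f"total length      : {total_bp} bp")
print(f"one SNP every     : {snp_spacing:.2f} bp")
print(f"one InDel every   : {indel_spacing:.2f} bp")
print(f"inbred / teosinte : {retained_vs_teosinte:.1f}%")
print(f"inbred / landrace : {retained_vs_landrace:.1f}%")
```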

Association Analysis of Phenotypic Traits with ZmMADS60
Trait-marker association analysis was conducted to identify the association of root traits with nucleotide polymorphism of ZmMADS60 in 285 inbred lines. After quality control (minor allele frequency ≥ 0.05), 678 variants, including 425 SNPs and 253 InDels, were included in the association analysis. The unified mixed model with controls for both PCA and relative kinship (MLM+PCA+K) was employed to perform the marker-trait association analysis. A total of nine significant marker-trait associations involving four SNPs (SNP723, SNP726, SNP1319 and SNP1357) and two InDels (InDel-1576 and InDel727) were detected for primary root length (PRL), lateral root length (LRL), root length between 0 mm and 0.5 mm in diameter (RL005) and total root length (TRL) (Table 2, Figure 2). A total of 1, 2 and 3 variants were distributed upstream and in introns 4 and 7, respectively. The significant variants could explain 3.9-5.1% of the phenotypic variation.

A total of four significant variants were associated with primary root length (PRL) (Figure 3a), including two InDels (InDel-1576 and InDel727) and two SNPs (SNP723 and SNP726). Site −1576 consisted of a 1-bp insertion/deletion (InDel) polymorphism located in the upstream region. SNP723, SNP726 and InDel727, located in the fourth intron, showed strong LD (r² = 1) with each other in the inbred lines (Figure 3b). Among these significant sites, three major haplotypes, each containing more than 20 lines, were detected across 285 inbred lines (Figure 3c). The primary root length was compared between these haplotypes, and a significant difference was detected by ANOVA (p = 7.2 × 10⁻⁴). Hap1, carrying all increased alleles, had the longest primary root length, followed by Hap2, but no significant phenotypic difference was detected between these two haplotypes. Hap3, carrying all the decreased alleles, had the shortest root length. SNP723 was selected as the leading SNP, and the lines carrying the G allele had a significantly longer PRL than the group carrying the A allele (p = 7.9 × 10⁻⁴, Figure 3d). The allele frequencies among inbred lines, landraces and teosintes were analyzed. In teosintes, all the lines carried the increasing allele, but in landraces and inbred lines, the frequency decreased to 89.7% and 63.9%, respectively (Figure 3e). These results suggested that the variants might have been selected during the domestication and improvement of maize.

Two SNPs, SNP1319 and SNP1357, were significantly associated with total root length (TRL) and lateral root length (LRL) (Figure 4a). Three major haplotypes emerged from the two significant sites across inbred lines (Figure 4b). Both TRL and LRL showed significant differences between haplotypes (p = 5.2 × 10⁻⁵ for TRL and p = 9.5 × 10⁻⁶ for LRL; Figure 4c). The haplotype carrying all increased alleles (Hap1) had the longest root length compared with that carrying all the decreased alleles (Hap3). The most significant site was SNP1357, which was also significantly associated with root length between 0 mm and 0.5 mm in diameter (RL005) (Figure 5), and the frequency of the increased allele, SNP1357T, decreased from 68.8% in teosintes to 52.9% and 38.9% in the landrace and inbred lines, respectively (Figure 4e).
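
The pairwise LD statistic reported for the intron 4 variants (r² = 1) can be illustrated with a minimal Python sketch. It treats each inbred line as carrying a single allele per site, which is an assumption for illustration; the function name `ld_r2` and the toy allele lists are hypothetical and are not data from this study.

```python
def ld_r2(site_a, site_b):
    """r^2 between two biallelic sites, given one allele per line.

    Lines with a missing call (None) at either site are dropped.
    """
    pairs = [(a, b) for a, b in zip(site_a, site_b)
             if a is not None and b is not None]
    n = len(pairs)
    ref_a, ref_b = pairs[0]  # use the first line's alleles as reference
    p_a = sum(a == ref_a for a, _ in pairs) / n
    p_b = sum(b == ref_b for _, b in pairs) / n
    p_ab = sum(a == ref_a and b == ref_b for a, b in pairs) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Hypothetical alleles for six lines at two sites
linked_a = ["G", "G", "G", "A", "A", "G"]
linked_b = ["C", "C", "C", "T", "T", "C"]   # always co-occurs with site A
unlinked_b = ["C", "T", "C", "T", "C", "T"]

print(f"linked sites  : r2 = {ld_r2(linked_a, linked_b):.2f}")
print(f"unlinked sites: r2 = {ld_r2(linked_a, unlinked_b):.2f}")
```

When two sites always carry matching alleles, as assumed for the first toy pair, r² equals 1 and the sites are interchangeable as markers, which is why a single leading SNP can represent such a block.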

Discussion
RSA describes the spatial arrangement of root tissue within the soil [34], and developing crops with ideal root systems could enable them to capture more water and nutrients [35]. Many different factors are involved in shaping RSA, including the number, length, diameter, growth angle, elongation rate, and branching of lateral roots [34]. The manipulation of root traits could deliver increases in resource uptake; however, roots have received less attention than aboveground traits in maize breeding programs because they are hidden belowground and heavily influenced by the complex soil environment [36]. Direct selection for optimal RSA in the field is difficult to achieve, but manipulating genes or QTLs that influence RSA can deliver gains in resource use efficiency and yield. For example, introgression of natural variation in the DRO1 gene into rice lines to generate deeper roots could increase yield under dry conditions [37], and introgressing certain chromosomal fragments into target maize genotypes to increase root size contributed directly to efficient N uptake and higher yield [38]. Identifying genes or natural variations associated with desired root traits is crucial for root improvement. In this study, gene-based association analysis revealed that natural variations in the ZmMADS60 gene were significantly associated with seedling root traits in 285 inbred lines. Previous studies implicated the MADS-box genes AGL12 and AGL14 in primary root elongation, while AGL21 was involved in lateral root development in Arabidopsis [21,39]. OsMADS25 regulated primary root length and lateral root density, and OsMADS50 regulated crown root development in rice [23,24]. Here, ZmMADS60 was associated with primary root length (PRL), lateral root length (LRL), root length between 0 and 0.5 mm in diameter (RL005) and total root length (TRL), which indicated that this gene might be involved in elongation of the primary and lateral roots in the seedling stage.
An early robust root system helps plants capture more water and nutrients to enhance seedling growth in specific environmental settings [40,41]. For example, PSTOL1 can enhance early root growth in rice, thereby enabling plants to acquire more phosphorus and other nutrients, and introgression of this locus into locally adapted rice varieties is expected to considerably enhance productivity under low phosphorus conditions [40]. Lateral roots are the major determinant of total root length and are instrumental for water uptake in maize [3,4]. Our results indicated that SNP1319 and SNP1357 were significantly associated with LRL and TRL, and Hap1 was the favorable haplotype for improving LRL and TRL. These favorable alleles can be integrated into breeding programs to improve early root development in maize. However, there are two limitations to this study. (1) Because B73 was used as the reference, some large InDels or structural variations may have been missed, as the maize genome exhibits high levels of genetic diversity among different inbred lines [42]. (2) Roots grown under controlled conditions, such as paper roll and hydroponic systems, might not match those grown under field conditions [43]. Recent improvements in phenotyping under field conditions will aid in validating the effect of significant variations on useful agronomic traits [44].
Maize was domesticated in southwestern Mexico ~9000 years ago from its wild ancestor, teosinte [45], and the domestication underwent two stages: domestication selection and subsequent genetic improvement (postdomestication selection) [46]. As many traits related to plant development and morphology were the target of selection during domestication and improvement processes, transcription factors that usually orchestrate the activity of other genes are among the primary targets of selection [47]. Previous overviews have indicated that 43-81% of domestication genes encode transcriptional regulators, and MADS-box genes function as master regulators of plant development that have been important targets of the artificial selection associated with domestication [47]. A study that examined variation in 32 maize MADS-box genes and 32 randomly chosen maize loci found that eight MADS-box genes were selected during the domestication process, indicating that MADS-box genes were more frequent targets of selection during domestication than genes chosen at random [48]. Another MADS-box transcription factor, ZmMADS69, was found to be an activator of maize flowering, and the promoter region of ZmMADS69 has been a target of selection in the adaptation of maize to temperate regions [20]. In this study, ZmMADS60 in maize retained only 40.1% and 66.9% of the nucleotide diversity of that in teosinte and landrace, respectively. Neutral tests also revealed that ZmMADS60 might be selected during domestication and improvement processes. We found that the frequency of the increased allele of ZmMADS60 for root length decreased from teosinte to inbred lines; however, the root system has been overlooked during domestication and improvement processes owing to its subterranean nature [49]. Ancient humans indirectly selected alleles for the root system of maize by breeding for aboveground traits [50].
The root domestication syndrome in the common bean was associated with genes that were directly selected to increase seed weight, but had a significant effect on early root growth through a developmental pleiotropic effect [49]. Zhang et al. (2015) [51] also revealed that the underground nodal root number was generally regulated by the aboveground trait of flowering time via indirect selection during maize domestication. Indirect selection has retained large phenotypic variations in teosinte and landrace. To date, two genes underlying root architecture QTLs (DRO1 and PSTOL1) have been identified in landrace germplasm rather than elite breeding lines [37,40], highlighting teosinte and landrace as valuable resources for identifying elite natural variations to improve root traits in maize.