Genome-Wide Association Studies Using Multiple Models Reveal the Genetic Basis of Plant Architecture-Related Traits in Maize

Wang, Beibei; Wu, Penghao; Wu, Ruotong; Xie, Xinru; Ren, Zilong; Wang, Kaixiang; Ren, Jiaojiao

doi:10.3390/agronomy16070761


by Beibei Wang, Penghao Wu, Ruotong Wu, Xinru Xie, Zilong Ren, Kaixiang Wang and Jiaojiao Ren *

College of Agronomy, Xinjiang Agricultural University, Urumqi 830052, China

* Author to whom correspondence should be addressed.

Agronomy 2026, 16(7), 761; https://doi.org/10.3390/agronomy16070761

Submission received: 9 March 2026 / Revised: 31 March 2026 / Accepted: 3 April 2026 / Published: 5 April 2026

(This article belongs to the Section Crop Breeding and Genetics)


Abstract

Plant architecture-related traits are key agronomic traits affecting crop growth and yield. To unravel the genetic architecture of plant height (PH), ear height (EH), tassel length (TL), and tassel primary branch number (TPBN), 379 DH lines derived from 21 maize hybrids were used for genome-wide association study (GWAS) and genomic selection (GS) analyses. Although plant architecture-related traits were significantly influenced by genotype and genotype-by-environment interactions, moderate to high broad-sense heritability was observed for PH (81.3%), EH (79.6%), TL (86.4%), and TPBN (82.5%). Using six different models for GWAS, seven unique SNPs on chromosomes 1, 2, and 3 were identified for PH, 92 unique SNPs located on chromosomes 1 to 9 were identified for EH, three unique SNPs on chromosome 6 were detected for TL, and 18 unique SNPs located on chromosomes 1, 4, 5, 8, and 10 were identified for TPBN at the p-value threshold of 7.42 × 10⁻⁶. A few hotspot genomic regions associated with plant architecture-related traits were identified, located in bins 2.07, 4.07, 8.03, 6.01, and 10.00. A total of 144 putative candidate genes were identified, which were enriched in endocytosis and lipid biosynthetic process, electron carrier activity, chloroplast stroma, and plastid stroma. The prediction accuracy evaluated through 5-fold cross-validation was 0.44 for PH, 0.43 for EH, 0.31 for TL, and 0.30 for TPBN. When the training population size (TPS) reached 60–70% of the population or the marker density (MD) reached 3000, the prediction accuracy tended to stabilize, indicating that the optimal TPS and MD for GS were 60–70% and 3000, respectively. The highest prediction accuracy evaluated by using 30–5000 significant SNPs corresponding to the lowest p-value was 0.70 for PH, 0.85 for EH, 0.58 for TL, and 0.75 for TPBN, with an increase in accuracy of 59.1% to 150.0%. These results demonstrate that integrating GS with a subset of highly significant SNPs can substantially enhance prediction efficiency, thereby facilitating the selection of superior genotypes and accelerating the breeding of maize varieties with optimized plant architecture. This study has further elucidated the genetic basis of maize architecture-related traits and provided valuable information on how to implement GS to breed novel maize varieties with optimized plant types.

Keywords: maize; plant architecture-related traits; genome-wide association study (GWAS); genomic selection (GS)

1. Introduction

Ideal plant architecture plays a pivotal role in determining maize grain yield. The fundamental elements of an ideal maize plant architecture include plant height (PH), ear height (EH), ear architecture, tassel architecture, leaf angle, and root architecture. As the male inflorescence, the tassel affects pollination efficiency and resource allocation between male and female reproductive structures, with excessive tassel size potentially reducing yield potential [1]. Tassel morphology affects canopy light interception and photosynthetic efficiency. A proper tassel morphological structure is crucial to ensure coordinated development between ears and tassels and to increase grain yield. It has been reported that when maize plants were emasculated or had fewer tassel branches, the light transmittance in the lower leaves increased and the nutrient consumption of tassels decreased, resulting in a significant increase in yield [2]. Plant height directly affects yield, the distribution of canopy leaves, and resistance to lodging. Ear height can influence the lodging resistance of maize roots and stems, as well as the transport of photosynthetic products from leaves to the developing ears. A proper decrease in ear and plant height has been shown to increase the harvest index and economic yield of maize, enhance the utilization efficiency of light energy and nitrogen fertilizer, and bolster resistance to stress and lodging [3,4]. Therefore, in-depth analysis of the genetic mechanism of maize plant-related traits is of great significance in guiding maize breeding and promoting maize production.

In recent years, numerous scholars have employed the GWAS approach to evaluate the loci responsible for diverse traits, including PH, EH, TPBN, TL, disease resistance [5], and grain dehydration [6] in maize. Multi-environment GWAS for PH and EH was performed using 203 maize inbred lines genotyped via the GBS sequencing platform [7]. A total of 28 and 25 Corn-Belt-specific QTNs (quantitative trait nucleotide) for PH and EH, respectively, were identified using a multi-environment GWAS software (R software version 4.4.1) package called MRMLM (multi-locus random-SNP-effect mixed linear model), which is a powerful tool for discovering QTNs with significant QTN-by-environment interactions. Key plant architecture traits, including PH, EH, and TPBN, were identified using genome-wide association analysis and genomic prediction analysis in a fresh edible maize population consisting of 190 sweet corn inbred lines and 287 waxy corn inbred lines [8]. A total of 278,592 SNPs were identified by mixed linear model (MLM), including 184 for PH, 45 for EH, and 68 for TPBN. Both genome-wide association analysis and linkage analysis were employed to pinpoint the genetic architecture of inflorescence size, and 63 quantitative trait loci (QTL) for TPBN and 62 QTL for TL were identified [9].

GWAS models can be categorized into single-locus and multi-locus models. Single-locus GWAS methods scan only one SNP locus at a time, which is inconsistent with the true genetic model of complex traits [10]. Such methods typically assume independent locus effects and rely on stringent multiple-testing corrections, which can increase false negatives and overlook polygenic background influences [11]. Therefore, relying on single-locus GWAS methods alone to identify significantly associated loci has certain limitations. To tackle the issue of false negatives resulting from Bonferroni correction in single-marker scanning, researchers have proposed multi-locus methods to enhance detection power. These methods include BLINK (Bayesian-information and linkage-disequilibrium iteratively nested keyway) [12], FarmCPU (Fixed and Random Model Circulating Probability Unification) [13], and MLMM (multi-locus mixed model) [14]. Multi-locus models address these limitations by controlling polygenic background and reducing both false positives and false negatives [15].

To detect the genetic basis of ear-related traits, five models were employed: GLM (generalized linear model), MLM, CMLM (compressed mixed linear model), BLINK, and FarmCPU [16]. A total of 104 significant SNPs and 10 co-localized SNPs were identified. A total of six, 22, 14, and two genes were identified as significantly associated with TPBN, TL, EH, and PH, respectively, using seven models: GLM, MLM, MLMM, CMLM, ECMLM, SUPER, and BLINK [17]. Detecting the genetic basis of plant architecture traits in maize using different GWAS models can compensate for the limitations of a single model, improve statistical efficiency and accuracy, and thereby reduce false positives and false negatives.

Genomic prediction is a tool that employs markers to forecast the genetic merit of complex traits in offspring, facilitating selection and breeding processes [18]. The application of genomic prediction to selection is defined as genomic selection (GS) [19]. The prediction accuracy is assessed by calculating the correlation between the genomic estimated breeding values (GEBVs) and the true breeding values. With the development of high-throughput sequencing technologies and the reduction in sequencing costs, GS has also seen significant development in crop breeding, such as in maize [20,21,22], wheat [23,24], and rice [25,26]. Previous studies have shown that the predictive capability of GS depends on the accuracy of predictions, which can be affected by various factors such as trait heritability, prediction models, environmental and seasonal variations, the TPS, the genetic similarity between training and prediction sets, MD, and marker quality [27,28,29,30]. Using five inbred lines from four heterotic groups, a connected segregating population was developed, consisting of five subpopulations with 535 doubled haploid (DH) lines and 15 related testcross populations encompassing 1568 hybrids [31]. Although the prediction accuracy varied across populations and traits, the prediction accuracy of PH and EH exceeded 0.5. A genomic prediction analysis was conducted on plant architecture-related traits in sweet corn and waxy corn [8]. The prediction accuracy of PH, EH, and TPBN increased markedly as the TPS increased from 10% to 30% and reached a plateau when the TPS reached 80%. The prediction accuracy also sharply improved as the MD increased from 0 to 500 and reached a plateau when the MD reached 3000. Studying the effects of MD, TPS, and significant SNPs on prediction accuracy is of great significance for improving prediction accuracy and promoting the application of GS in maize breeding programs.

PH, EH, TL, and TPBN are important components of plant architecture. Most previous studies have relied on a single GWAS model and have not systematically integrated multi-model GWAS results with GS optimization strategies in multi-parent DH populations. In this study, 379 multi-parent doubled haploid (DH) lines were phenotyped for PH, EH, TL, and TPBN across different environments and genotyped using the 48 K liquid-phase hybridization probe capture technique for GWAS and GS. The main objectives of the present study were (1) to identify SNPs and putative candidate genes conferring plant architecture-related traits using four single-locus GWAS models (GLM, MLM, CMLM, and SUPER) and two multi-locus GWAS models (BLINK and FarmCPU); (2) to explore the potential of GS for plant architecture-related traits; and (3) to assess the impact of TPS, MD, and significant SNPs on the accuracy of GS.

2. Materials and Methods

2.1. Plant Materials and Trial Locations

In this study, 379 DH lines derived from 21 maize hybrids [32] were used for GWAS and GS. DH lines were maintained through strict self-pollination to ensure genetic purity. The multi-parent DH population was grown at Sangong Town experimental station, Changji City, Xinjiang, China (SG, 87°12′57″ E, 43°56′54″ N) and Dafeng Town experimental station, Hutubi County, Xinjiang, China (DF, 86°34′49″ E, 44°10′47″ N) during the summer of 2022; Ledong experimental station, Hainan Province, China (LD, 108°57′14″ E, 18°27′14″ N) during the winter of 2022; and Qitai County experimental station, Xinjiang, China (QT1 and QT2, 89°44′19″ E, 44°05′17″ N) during the summers of 2023 and 2024, respectively. The experiment was laid out in a randomized complete block design with single-row plots and two replicates in each environment. Each row was 2.5 m long, with a row spacing of 0.6 m and a plant spacing of 0.25 m. All field management practices, including fertilization, irrigation, pest and disease control, and weed management, followed local standardized procedures.

2.2. Phenotype Data Analysis

At the maturity stage, five plants from each row were randomly selected and measured for PH, EH, TPBN, and TL. PH was measured as the distance from the ground surface to the top of the plant’s tassel using a ruler (cm). EH was measured as the distance from the ground to the uppermost ear-bearing node. The TL was measured as the length from the lowest branch of the tassel’s main axis to its top. The TPBN was determined as the count of first-order lateral branches present during the peak period of pollen shedding in the maize plant.

R software version 4.4.1 [33] was used to perform descriptive statistical analysis, variance analysis, and correlation analysis. Using the “lmer” function in the lme4 package of R, variance components, best linear unbiased prediction (BLUP) values, and broad-sense heritability were calculated. BLUP values across all five environments were utilized for GWAS and GS. The broad-sense heritability (H²) on an entry-mean basis was evaluated as follows:

$$H^{2} = \frac{\sigma_{g}^{2}}{\sigma_{g}^{2} + \sigma_{ge}^{2}/n + \sigma_{e}^{2}/(nr)} \quad (1)$$

where $\sigma_{g}^{2}$ is the genetic variance, $\sigma_{ge}^{2}$ is the genotype-by-environment interaction variance, $\sigma_{e}^{2}$ is the residual variance, n is the number of environments, and r is the number of replications [34].
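Equation (1) can be evaluated directly once the variance components have been estimated. The following is a minimal Python sketch, assuming the entry-mean form in which the genotype-by-environment variance is divided by the number of environments and the residual variance by the number of environments times the number of replications. The function name and the numeric inputs are hypothetical and are not values from this study.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean broad-sense heritability from variance components."""
    if min(var_g, var_ge, var_e) < 0:
        raise ValueError("variance components must be non-negative")
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    # Phenotypic variance of an entry mean across environments and replicates.
    var_p = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / var_p


# Hypothetical variance components (illustration only, not study data),
# with five environments and two replicates as in the trial layout.
h2 = broad_sense_heritability(var_g=100.0, var_ge=40.0, var_e=150.0,
                              n_env=5, n_rep=2)
print(f"H2 = {h2:.3f}")
```

In practice the three variance components would come from the fitted mixed model rather than being typed in by hand.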

2.3. Genotyping

The genomic DNA extraction and genotyping of the DH population were carried out by China Golden Marker Biotech Co., Ltd. (Beijing, China). Genotyping was performed using the 48 K liquid-phase hybridization probe capture technique, including library preparation and sequencing. Raw reads were filtered with Trimmomatic-0.36 [35]. BWA 0.7.17 [36] was used to align the filtered reads to the Maize B73_RefGen_v4 reference genome for SNP calling. A total of 1,583,425 SNPs were generated. SNPs with a missing rate (MR) of more than 20% or a minor allele frequency (MAF) of less than 0.05 were removed using vcftools [37]. Finally, 134,785 high-quality SNPs were retained.
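The two quality filters can be illustrated per SNP. The sketch below is not the vcftools procedure used in the study; it is a plain-Python approximation that assumes homozygous DH genotypes coded as 0 or 2 copies of the alternative allele, with `None` for a missing call. The function name and the example genotype vectors are hypothetical.

```python
def snp_passes_filters(genotypes, max_missing=0.20, min_maf=0.05):
    """Keep a SNP only if missing rate <= max_missing and MAF >= min_maf.

    genotypes: allele dosages (0 or 2 for DH lines), None for missing calls.
    """
    n = len(genotypes)
    if n == 0:
        return False
    called = [g for g in genotypes if g is not None]
    missing_rate = (n - len(called)) / n
    if missing_rate > max_missing or not called:
        return False
    # Frequency of the alternative allele among called genotypes.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf


# Hypothetical genotype vectors (illustration only).
common_snp = [0, 0, 2, 2, 0, 2, 0, 2, 0, 0]           # MAF 0.40, no missing
sparse_snp = [0, 2, None, None, None, 2, 0, 2, 0, 0]  # 30% missing
rare_snp = [0] * 39 + [2]                             # MAF 0.025

print(snp_passes_filters(common_snp),
      snp_passes_filters(sparse_snp),
      snp_passes_filters(rare_snp))
```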

GWAS of the four traits was performed using six different models, MLM, CMLM, GLM, FarmCPU, BLINK, and SUPER, all implemented in the Genome Association and Prediction Integrated Tool (GAPIT) 3.0 [38]. False positives were controlled by including the first three principal components and the kinship matrix. The significance threshold for p-values was set at 7.42 × 10⁻⁶, calculated as 1/n, where n = 134,785 is the number of SNPs. Q-Q and Manhattan plots were generated using the “qqman” package in R [39]. For the MLM, CMLM, and GLM models, the phenotypic variation explained (PVE) by each SNP was calculated as the difference between the R² of the model with the SNP and the R² of the model without the SNP.
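The threshold and the PVE definition above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: it uses ordinary least squares with one covariate standing in for the population-structure terms, it is not the GAPIT implementation, and the helper name `r_squared` and all data values are hypothetical.

```python
def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    beta = [A[a][p] / A[a][a] for a in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Significance threshold: 1 divided by the number of retained SNPs.
n_snps = 134_785
threshold = 1 / n_snps
print(f"threshold = {threshold:.2e}")

# Hypothetical toy data: one covariate, one SNP (0/2 coding), one trait.
pc1 = [0.10, -0.20, 0.30, -0.10, 0.20, -0.30, 0.00, 0.15]
snp = [0, 0, 2, 2, 0, 2, 0, 2]
trait = [10.0, 9.5, 12.1, 11.4, 10.3, 11.2, 9.9, 12.0]

# PVE = R^2 (model with SNP) - R^2 (model without SNP).
pve = r_squared([pc1, snp], trait) - r_squared([pc1], trait)
print(f"PVE = {pve:.1%}")
```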

2.4. Enrichment and Pathway Analysis of Putative Candidate Genes

As linkage disequilibrium (LD) decayed at approximately 16.18 kb [40], candidate genes located within 16.18 kb upstream and downstream of significant SNPs were identified. These genes were annotated using both MaizeGDB (http://www.maizegdb.org/, accessed on 1 September 2025) and NCBI (https://www.ncbi.nlm.nih.gov/, accessed on 22 December 2025). GO (Gene Ontology) enrichment analysis was conducted through AgriGO v2.0 (http://systemsbiology.cau.edu.cn/agriGOv2/, accessed on 17 January 2026), while KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis was performed using clusterProfiler in R [41].
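The window-based candidate gene search can be sketched as an interval overlap test. This is a minimal illustration, assuming that a gene counts as a candidate when any part of it overlaps the ±16.18 kb window; the function name, gene identifiers, and gene coordinates are hypothetical, and only the window size and the SNP position come from the text.

```python
LD_DECAY_BP = 16_180  # 16.18 kb flanking distance on each side of a SNP


def genes_near_snp(snp_chrom, snp_pos, genes, flank=LD_DECAY_BP):
    """Return IDs of genes overlapping [snp_pos - flank, snp_pos + flank].

    genes: iterable of (gene_id, chromosome, start, end) tuples.
    """
    lo, hi = snp_pos - flank, snp_pos + flank
    return [gene_id for gene_id, chrom, start, end in genes
            if chrom == snp_chrom and start <= hi and end >= lo]


# Hypothetical gene models (illustration only, not B73 annotations).
toy_genes = [
    ("gene_A", "2", 200_070_000, 200_075_000),  # inside the window
    ("gene_B", "2", 200_100_000, 200_105_000),  # beyond the window
    ("gene_C", "4", 200_079_000, 200_080_000),  # different chromosome
    ("gene_D", "2", 200_090_000, 200_120_000),  # partially overlapping
]

candidates = genes_near_snp("2", 200_079_765, toy_genes)
print(candidates)
```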

2.5. Genomic Prediction Analysis

The genomic prediction analysis of PH, EH, TL, and TPBN was performed with the ridge regression best linear unbiased prediction (RRBLUP) model using the rrBLUP package [42] in R. The BLUP values across environments for each DH line were used as the phenotype data for GS analysis. A 5-fold cross-validation approach was employed, in which a randomly selected 80% of the population was used as the training population and the remaining 20% as the prediction population. This GS process was repeated 100 times, and the average prediction accuracy was computed.
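The cross-validation scheme can be sketched as follows. This is a simplified stand-in rather than the rrBLUP workflow: it fits a ridge regression with a fixed, hypothetical shrinkage parameter `lam` instead of estimating it by REML, and it runs on simulated genotypes and phenotypes. Accuracy is measured as the Pearson correlation between predicted and observed values in the held-out fold, averaged over folds and repeats.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * z for x, z in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ridge_fit(X, y, lam):
    """Return (intercept, marker effects) of a ridge fit on centred markers."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Z = [[row[j] - means[j] for j in range(p)] for row in X]
    A = [[sum(Z[i][a] * Z[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(Z[i][a] * (y[i] - y_mean) for i in range(n)) for a in range(p)]
    beta = solve(A, rhs)
    return y_mean - sum(m * b for m, b in zip(means, beta)), beta


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((x - mb) ** 2 for x in b)
    if va == 0 or vb == 0:
        return 0.0
    return sum((x - ma) * (z - mb) for x, z in zip(a, b)) / math.sqrt(va * vb)


def cross_validated_accuracy(X, y, lam=1.0, folds=5, repeats=100, seed=1):
    """Mean prediction accuracy over repeated k-fold cross-validation."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        order = list(range(len(y)))
        rng.shuffle(order)
        for k in range(folds):
            test = order[k::folds]          # about 20% of lines when folds=5
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            b0, beta = ridge_fit([X[i] for i in train],
                                 [y[i] for i in train], lam)
            pred = [b0 + sum(b * x for b, x in zip(beta, X[i])) for i in test]
            scores.append(pearson(pred, [y[i] for i in test]))
    return sum(scores) / len(scores)


# Simulated data (illustration only): 60 lines, 8 markers coded 0/2.
sim = random.Random(42)
effects = [1.0, -0.8, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0]
X = [[sim.choice((0.0, 2.0)) for _ in effects] for _ in range(60)]
y = [sum(e * x for e, x in zip(effects, row)) + sim.gauss(0.0, 0.5)
     for row in X]

accuracy = cross_validated_accuracy(X, y, lam=1.0, folds=5, repeats=20)
print(f"mean prediction accuracy = {accuracy:.2f}")
```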

An assessment was conducted to evaluate how MD, TPS, and significant markers (SM) influence prediction accuracy (rMG). To assess the impact of TPS on rMG, GS was implemented using all markers, with TPS varying from 10% to 90% of the total population size in increments of 10%. To evaluate the effect of MD on rMG, GS was conducted using a 5-fold cross-validation approach, with marker density varying from 10 to 10,000 (i.e., 10, 30, 50, 100, 300, 500, 1000, 3000, 5000, and 10,000). To assess the effect of SM on rMG, 30, 50, 100, 300, 500, 1000, 3000, 4000, and 5000 significant SNPs with the lowest p-values obtained from GWAS were selected for GS. Prediction accuracies were determined through 100 repetitions of 5-fold cross-validation.
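The marker subsets used in these comparisons can be built in two ways: random sampling for the marker density series and ranking by GWAS p-value for the significant marker series. The sketch below shows both under simple assumptions; the function names, SNP identifiers, and p-values are hypothetical, while the grids of set sizes are those listed above.

```python
import random

MD_GRID = [10, 30, 50, 100, 300, 500, 1000, 3000, 5000, 10000]
SM_GRID = [30, 50, 100, 300, 500, 1000, 3000, 4000, 5000]
TPS_GRID = [round(0.1 * i, 1) for i in range(1, 10)]  # 10% to 90%


def random_marker_subset(marker_ids, k, seed=0):
    """Randomly sample k markers, as in the marker density series."""
    return random.Random(seed).sample(sorted(marker_ids), k)


def top_k_snps(pvalues, k):
    """Return the k SNP IDs with the lowest GWAS p-values."""
    return sorted(pvalues, key=pvalues.get)[:k]


# Hypothetical p-values (illustration only).
toy_pvalues = {
    "snp_a": 3.0e-4,
    "snp_b": 6.7e-9,
    "snp_c": 0.20,
    "snp_d": 1.5e-6,
    "snp_e": 0.04,
}

print(top_k_snps(toy_pvalues, 3))
print(len(random_marker_subset(toy_pvalues, 2)))
```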

3. Results and Analysis

3.1. Genetic Diversity and Heritability of Plant Architecture-Related Traits

The descriptive statistical results of plant architecture-related traits in five environments are shown in Table 1. There were abundant phenotypic variations for all four traits in each environment. Among the five environments, the greatest differentiation of PH was observed in DF, where PH ranged from 158.80 cm to 306.40 cm, and the average PH was 254.01 cm. The greatest differentiation of EH was observed in QT2, where EH ranged from 37.83 cm to 155.90 cm, and the average EH was 101.39 cm. The greatest differentiation of TL was observed in QT1, where TL ranged from 19.38 cm to 49.66 cm, and the average TL was 32.59 cm. The greatest differentiation of TPBN was observed in LD, where TPBN ranged from 1 to 16, and the average TPBN was 6.04.

Analysis of variance (ANOVA) revealed highly significant differences among genotypes and genotype × environment (G × E) interactions (p < 0.01). This indicated that PH, EH, TL, and TPBN were not solely determined by genetic factors but were also significantly influenced by G × E interactions. The broad-sense heritabilities (H²) of PH, EH, TL, and TPBN were 81.3%, 79.6%, 86.4%, and 82.5%, respectively, suggesting that these traits were primarily influenced by genetic factors and are reliable for GWAS and GS (Table 1).

3.2. Correlation Analysis of Plant Architecture-Related Traits

Correlation analysis was conducted based on the combined phenotypic data from five environments. The PH was strongly positively correlated with EH (p < 0.01), indicating that the growth and development of PH and EH were coordinated with each other. The TPBN was significantly correlated with PH (p < 0.01) and EH (p < 0.05). However, TL showed no significant correlation with the other three traits (Figure 1).

3.3. Genome-Wide Association Analysis of Plant Architecture-Related Traits

GWAS was conducted based on the BLUP values across all five environments using GLM, MLM, CMLM, FarmCPU, BLINK, and SUPER models (Figure 2). Based on the defined significance threshold, a total of 186 SNPs were detected (Table S1).

For PH, 12 SNPs were identified on chromosomes 1, 2, and 3, including three identified by GLM, one by MLM, one by CMLM, one by FarmCPU, one by BLINK, and five by SUPER. For GLM, the most significant SNP 2_200079765 located at bin 2.07 showed the lowest p-value of 1.63 × 10⁻⁶. The PVE of this SNP was 6.31%. For SUPER, the most significant SNP 2_200079765 with the lowest p-value of 2.03 × 10⁻⁷ explained 2.50% of the phenotypic variation. The SNP 1_48336894 located at bin 1.03 was identified by both MLM and CMLM, with the PVE ranging from 5.60% to 5.74%. The SNP 2_200079765 located at bin 2.07 was identified by four models: GLM, FarmCPU, BLINK, and SUPER. The PVE estimated by BLINK was 9.99%. The SNP 2_200079718 located at bin 2.07 was identified by both GLM and SUPER. A hotspot region containing eight SNPs was identified at bin 2.07, of which two were identified by GLM, one by FarmCPU, one by BLINK, and four by SUPER.

For EH, 124 SNPs were identified along all chromosomes except chromosome 10. A total of 40 SNPs distributed on chromosomes 2, 3, 4, 5, 6, 8, and 9 were identified by GLM. The most significant SNP 4_175914150 with the lowest p-value of 6.68 × 10⁻⁹ was located at bin 4.07. It had a PVE of 8.11% and a SNP effect of −3.39. Three SNPs located at bin 4.07 were identified by MLM. The most significant SNP 4_175914150 with the lowest p-value of 1.36 × 10⁻⁶ explained 5.29% of the phenotypic variation. The same results were observed by CMLM. Seven SNPs distributed on chromosomes 2, 4, 6, 7, and 9 were identified by FarmCPU. The most significant SNP 4_175914150 showed the lowest p-value of 1.88 × 10⁻¹¹. Four SNPs on chromosomes 2, 4, 6, and 8 were identified by BLINK. The most significant SNP 4_175914150 with the lowest p-value of 2.24 × 10⁻¹³ had a PVE of 4.70%. A total of 67 SNPs on chromosomes 1, 2, 4, 5, 8, and 9 were identified by SUPER. The most significant SNP 4_175913984 located at bin 4.07 exhibited the lowest p-value of 5.41 × 10⁻¹². It had a MAF of 0.27 and a PVE of 0.07%. The SNP 4_175914150 was identified by all six models. Two SNPs, 4_175913984 and 4_175914258, were identified by four models. Two SNPs, 8_109465858 and 9_10871373, were identified by three models. In total, 17 SNPs were identified by two models. A hotspot region containing 51 SNPs was identified at bin 4.07, of which 15 were identified by GLM, three by MLM, three by CMLM, one by FarmCPU, one by BLINK, and 28 by SUPER. Another hotspot region containing 25 SNPs was identified at bin 8.03, of which four were identified by GLM, one by BLINK, and 20 by SUPER.

For TL, 12 SNPs were identified at bin 6.01, including three identified by GLM, two by MLM, three by CMLM, two by FarmCPU, and two by BLINK. Two SNPs, 6_74547563 and 6_74547652, were identified by five models. SNP 6_76700707 was identified by GLM and CMLM.

For TPBN, 38 SNPs distributed on chromosomes 1, 3, 4, 5, 8, and 10 were identified. A total of 14 SNPs on chromosomes 4, 5, 8, and 10 were identified by GLM. The most significant SNP 10_2209895 with the lowest p-value of 3.32 × 10⁻⁷ was located at bin 10.00 and explained 6.43% of the phenotypic variance. Six SNPs on chromosomes 1, 4, and 5 were identified by MLM. The most significant SNP 5_116545594 with the lowest p-value of 2.30 × 10⁻⁶ was located at bin 5.04 and explained 5.53% of the phenotypic variance. Five SNPs on chromosomes 1, 4, and 5 were identified by CMLM. The most significant SNP 5_116545594 with the lowest p-value of 3.66 × 10⁻⁶ explained 5.25% of the phenotypic variance. Five SNPs on chromosomes 5 and 10 were identified by FarmCPU. The most significant SNP 10_2209895 had the lowest p-value of 1.06 × 10⁻⁶. Five SNPs on chromosomes 1, 5, and 10 were identified by BLINK. The most significant SNP 1_192761375 with the lowest p-value of 1.07 × 10⁻⁹ was located at bin 1.06 and had a PVE of 6.15%. Three SNPs on chromosomes 3, 5, and 10 were identified by SUPER. The most significant SNP 5_35068669 with the lowest p-value of 1.18 × 10⁻⁶ was located at bin 5.03 and had a PVE of 4.03%. Two SNPs, 10_2209895 and 5_115751803, were identified by four models. Four SNPs, 1_299003064, 4_237221777, 4_237221779, and 5_116545594, were identified by three models. Five SNPs were identified by two models. A hotspot region containing 14 SNPs was identified at bin 5.04, of which seven were identified by GLM, three by MLM, two by CMLM, one by FarmCPU, and one by BLINK. Another hotspot region containing 10 SNPs was identified at bin 10.00, of which four were identified by GLM, four by FarmCPU, one by BLINK, and one by SUPER.

For each trait, one SNP identified by the most models was randomly selected for SNP effect analysis (Figure 3). For SNP 2_200079765 identified by four models, the PH of inbreds carrying the TT allele was significantly higher than that of inbreds carrying the CC allele. For SNP 4_175914150 identified by six models, the EH of inbreds carrying the CC allele was significantly higher than that of inbreds carrying the TT allele. For SNP 6_74547563 identified by five models, the TL of inbreds carrying the AA allele was significantly longer than that of inbreds carrying the CC allele. For SNP 5_115751803 identified by four models, the TPBN of inbreds carrying the CC allele was significantly higher than that of inbreds carrying the AA allele.

3.4. Putative Candidate Genes Associated with Significant SNPs

Based on the genome and annotation information of B73_RefGen_v4, 144 putative candidate genes were detected within 16.18 kb upstream and downstream of the significant SNPs associated with the four plant architecture-related traits, of which 109 genes have been annotated with known functions (Table S2). For PH, 12 putative candidate genes were detected, all of which have been annotated with known functions. For EH, 108 putative candidate genes were detected, of which 78 genes were functionally annotated. For TL, three putative candidate genes with known functions and one putative candidate gene with unknown function were identified. For TPBN, 20 putative candidate genes were detected, of which 16 were functionally annotated.

GO term analysis and KEGG pathway analysis were performed on the 144 putative candidate genes. A total of 131 genes were annotated with 437 GO terms (Table S3, Figure 4a), including 328 GO terms under biological process (BP), 46 under molecular function (MF), and 64 under cellular component (CC) categories. A total of 49 GO terms were enriched (p < 0.05), including 33 under BP, one under MF, and 15 under CC. The most enriched GO terms were endocytosis and lipid biosynthetic process under BP, electron carrier activity under MF, and chloroplast stroma and plastid stroma under CC. For KEGG pathway analysis, 19 putative candidate genes were assigned to 27 pathways. No significant enrichment was observed in these pathways (Table S4, Figure 4b).

Protein–protein interaction (PPI) analysis revealed seven distinct protein interaction groups (Table S5, Figure 5). Group 1 was composed of 11 proteins, with the hub protein being A0A1D6QBW9 (Zm00001d052003, 40S ribosomal protein S30). 40S ribosomal protein S30 is a component of the 40S subunit of the ribosome, involved in ribosome assembly and protein synthesis, and plays a pivotal role in various biochemical and developmental processes [43]. In Arabidopsis, overexpression of the S30 ribosomal subunit leads to transcriptional and metabolic changes, thereby affecting plant development and stress responses [44]. Group 2 was composed of four proteins, with the hub protein being A0A1D6GMS0 (Zm00001d013631, TATA-box-binding protein 2). TATA-box-binding protein plays a crucial role in the initiation of transcription by interacting with DNA-binding multiprotein factor TFIID.

3.5. Genomic Prediction Accuracy for Plant Architecture-Related Traits

The rMG evaluated from 5-fold cross-validation was 0.43, 0.44, 0.31, and 0.30 for EH, PH, TL, and TPBN, respectively. The rMG evaluated using all the SNPs with different training population sizes (TPS) is shown in Figure 6a. As TPS increased from 10 to 90%, the rMG was 0.35, 0.38, 0.41, 0.42, 0.43, 0.45, 0.47, 0.47, and 0.48 for EH; 0.39, 0.42, 0.44, 0.44, 0.46, 0.46, 0.47, 0.46, and 0.47 for PH; 0.27, 0.30, 0.31, 0.32, 0.32, 0.33, 0.33, 0.33, and 0.31 for TL; and 0.26, 0.29, 0.29, 0.31, 0.31, 0.31, 0.32, 0.33, and 0.33 for TPBN, respectively. The results indicated that the rMG increased as TPS increased. When the TPS reached 60–70%, the prediction accuracy reached a plateau.

The rMG evaluated through 5-fold cross-validation with different marker densities (MD) is shown in Figure 6b. As the MD increased from 10 to 10,000 (specifically, 10, 30, 50, 100, 300, 500, 1000, 3000, 5000, and 10,000), the rMG was 0.25, 0.32, 0.34, 0.37, 0.41, 0.42, 0.47, 0.45, 0.45, and 0.45 for EH; 0.30, 0.34, 0.39, 0.42, 0.44, 0.45, 0.46, 0.47, 0.47, and 0.47 for PH; 0.23, 0.27, 0.28, 0.30, 0.31, 0.32, 0.33, 0.34, 0.33, and 0.34 for TL; and 0.21, 0.16, 0.27, 0.29, 0.30, 0.31, 0.31, 0.32, 0.32, and 0.32 for TPBN, respectively. The results indicated that the rMG increased as MD increased. When the MD reached 3000, the prediction accuracy reached a plateau.

The impact of significant SNPs detected by different GWAS models on rMG is shown in Figure 7. For each model, the highest rMG for PH was 0.70 ± 0.1 when using 3000 significant SNPs detected by the MLM model, 0.70 ± 0.1 by using 1000 significant SNPs detected by the CMLM model, 0.63 ± 0.2 by using 300 significant SNPs detected by the SUPER model, 0.60 ± 0.2 by using 1000 significant SNPs detected by the GLM model, 0.59 ± 0.2 by using 1000 significant SNPs detected by the FarmCPU model, and 0.59 ± 0.2 by using 3000 significant SNPs detected by the BLINK model.

The highest prediction accuracy for EH was 0.85 ± 0.10 when using 3000 significant SNPs detected by the CMLM model, 0.85 ± 0.08 by using 4000 significant SNPs detected by the MLM model, 0.77 ± 0.10 by using 4000 significant SNPs detected by the FarmCPU model, 0.70 ± 0.10 by using 100 significant SNPs detected by the BLINK model, 0.67 ± 0.2 by using 3000 significant SNPs detected by the GLM model, and 0.67 ± 0.2 by using 5000 significant SNPs detected by the SUPER model.

The highest prediction accuracy of TL was 0.58 ± 0.1 by using 1000 significant SNPs detected by the CMLM model, 0.57 ± 0.2 by using 4000 significant SNPs detected by the MLM model, 0.53 ± 0.2 by using 3000 significant SNPs detected by the SUPER model, 0.51 ± 0.2 by using 3000 significant SNPs detected by the GLM model, 0.50 ± 0.2 by using 4000 significant SNPs detected by the FarmCPU model, and 0.50 ± 0.2 by using 3000 significant SNPs detected by the BLINK model.

The highest prediction accuracy of TPBN was 0.75 ± 0.1 by using 1000 significant SNPs detected by the MLM model, 0.71 ± 0.2 by using 4000 significant SNPs detected by the CMLM model, 0.67 ± 0.2 by using 500 significant SNPs detected by the GLM model, 0.65 ± 0.2 by using 100 significant SNPs detected by the BLINK model, 0.65 ± 0.2 by using 500 significant SNPs detected by the SUPER model, and 0.65 ± 0.2 by using 3000 significant SNPs detected by the FarmCPU model.
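The relative gain from using selected significant SNPs can be checked with simple arithmetic. The sketch below takes the baseline accuracies from the 5-fold cross-validation with all markers and the highest per-trait accuracies as summarized in the abstract, and computes the percentage increase for each trait; the variable names are illustrative.

```python
# Prediction accuracy with all markers (5-fold cross-validation).
baseline = {"PH": 0.44, "EH": 0.43, "TL": 0.31, "TPBN": 0.30}
# Highest accuracy with selected significant SNPs, as given in the abstract.
best = {"PH": 0.70, "EH": 0.85, "TL": 0.58, "TPBN": 0.75}

# Relative increase in percent: (best - baseline) / baseline * 100.
gain = {trait: (best[trait] - baseline[trait]) / baseline[trait] * 100
        for trait in baseline}

for trait, value in gain.items():
    print(f"{trait}: +{value:.1f}%")
```

The smallest and largest gains, for PH and TPBN, correspond to the 59.1% to 150.0% range quoted in the abstract.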

4. Discussion

4.1. Genetic Architecture of Plant Architecture-Related Traits

Plant architecture-related traits are key agronomic traits affecting crop growth [45], population photosynthetic efficiency [46], lodging resistance [47], nutrient utilization rates, and overall yield [48]. In this study, a multi-parent DH population was employed to assess the genetic basis of plant architecture-related traits across multiple environmental trials. ANOVA revealed significant genetic variance and genotype × environment variance for all four traits, indicating that the BLUP values across environments were more reliable for genetic analysis. The broad-sense heritability of plant architecture-related traits was moderate to high, with values ranging from 79.6% for EH to 86.4% for TL. These results were consistent with previous studies [5,8,49]. The broad-sense heritability for PH and EH estimated in 212 inbred lines from southeast China across four environments was 0.81 and 0.81, respectively [49]. The broad-sense heritability for TL evaluated in 182 maize inbred lines across three environments was 89% [50]. The broad-sense heritability for TPBN estimated in 190 sweet corn inbred lines and 287 waxy corn inbred lines ranged from 72% to 95% [8]. Moderate to high heritability indicates that the plant architecture-related traits are relatively stable genetic traits, predominantly influenced by genetic factors.

GWAS, which relies on LD, has been widely used to identify significant associated loci and candidate genes related to target traits, especially complex quantitative traits [51]. It is an effective method for detecting multiple allelic variations and helping to uncover favorable alleles for target traits [52]. GWAS takes advantage of historical recombination in populations with broad genetic diversity, resulting in high mapping resolution [53]. However, it exhibits a high rate of false positive associations due to population structure. The statistical models of GWAS have undergone continuous refinement, evolving from the simplest analysis of variance (ANOVA) to the GLM incorporating fixed-effect covariates, and further to the MLM that includes random effects [54]. Since quantitative traits are typically influenced by multiple factors, the MLM model is now widely used for single-marker scanning with population structure and polygenic background control. However, the computational time required for scanning tens of thousands of genetic markers remains extensive. Additionally, due to the stringent screening criteria imposed by Bonferroni correction, many important minor-effect loci often go undetected [55]. To achieve higher-resolution association mapping, subsequent mixed models have been continuously optimized and developed, leading to the creation of various software packages with improved analytical accuracy and enhanced detection power.

4.2. Multivariate Genome-Wide Association Study Models for Plant Architecture-Related Traits

In this study, four single-locus GWAS models (GLM, MLM, CMLM, and SUPER) and two multi-locus GWAS models (FarmCPU and BLINK) were used to detect SNP loci associated with plant architecture-related traits. A total of 186 significant SNPs were identified on chromosomes 1 through 10, including 60 identified by GLM, 12 identified by MLM, 12 identified by CMLM, 15 identified by FarmCPU, 12 identified by BLINK, and 75 identified by SUPER. The number of significant loci detected varied across models, highlighting the importance of model selection in GWAS. Seven unique SNPs at bins 1.03, 2.07, and 3.09 were identified for PH. In total, 92 unique SNPs located on chromosomes 1 to 9 were identified for EH. Three unique SNPs at bin 6.01 were detected for TL. In total, 18 unique SNPs located on chromosomes 1, 4, 5, 8, and 10 were identified for TPBN.
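The counts in this paragraph can be reconciled with a short tally. The sketch below only restates the numbers given above; the variable names are illustrative, and the interpretation of the difference as detections shared between models assumes that no SNP is counted as unique for more than one trait.

```python
# Significant SNP detections per GWAS model, as listed in the text.
per_model = {"GLM": 60, "MLM": 12, "CMLM": 12, "FarmCPU": 15, "BLINK": 12, "SUPER": 75}

# Unique significant SNPs per trait, as listed in the text.
unique_per_trait = {"PH": 7, "EH": 92, "TL": 3, "TPBN": 18}

total_detections = sum(per_model.values())
total_unique = sum(unique_per_trait.values())

# Detections beyond the unique set are SNPs found again by another model
# (assumption: no SNP is unique to more than one trait).
repeat_detections = total_detections - total_unique

print(total_detections, total_unique, repeat_detections)
```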

The MLM, which incorporates population structure (Q matrix) as a fixed effect and kinship (K matrix) as a random effect to control for false positives, is widely used for its robustness but may be more conservative, leading to fewer significant associations [54]. In contrast, the GLM tests each marker independently through a simple linear model without modeling kinship as a random effect, which may lead to false-positive associations.
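The distinction between the two models can be written compactly. The notation below is introduced here for illustration and follows the commonly used form of these models rather than any equation in the original text: y is the vector of phenotypes (here, BLUP values), Q holds the population-structure covariates with fixed effects β, s_j is the genotype vector of the tested SNP j with effect α_j, Z is the incidence matrix relating lines to the polygenic random effects u, K is the kinship matrix, and e is the residual.

$$
\begin{aligned}
\text{GLM:}\quad & \mathbf{y} = \mathbf{Q}\boldsymbol{\beta} + \mathbf{s}_j\alpha_j + \mathbf{e}, \\
\text{MLM:}\quad & \mathbf{y} = \mathbf{Q}\boldsymbol{\beta} + \mathbf{s}_j\alpha_j + \mathbf{Z}\mathbf{u} + \mathbf{e}, \qquad \mathbf{u} \sim N(\mathbf{0}, \sigma_g^2\mathbf{K}), \quad \mathbf{e} \sim N(\mathbf{0}, \sigma_e^2\mathbf{I}).
\end{aligned}
$$

The only structural difference is the term Zu: by absorbing covariance among related lines, it removes signal that the GLM would otherwise attribute to the tested marker, which is why the MLM tends to report fewer associations.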

4.3. Putative Candidate Genes for Plant Architecture-Related Traits

Seven unique SNPs at bins 1.03, 2.07, and 3.09 were identified for PH. A hotspot region controlling PH was identified at bin 2.07, which falls within the QTL interval (reference genome AGPv2: 194078723–194232549) for PH detected through joint population linkage analysis [55]. The SNP 2_200079765 was 529.84 kb away from the SNP chr2.S_193626872 identified by GWAS in a ROAM population consisting of 10 RIL populations [55]. In total, 11 putative candidate genes were identified in the hotspot region. The putative candidate gene Zm00001d006160 encodes DEAD-box ATP-dependent RNA helicase 7, which belongs to the RNA helicase family. The RNA helicase family is involved in various RNA metabolism processes in plants [56] and plays critical roles in vegetative and reproductive growth of plants [57]. The SNP 3_229008966 was 8.16 Mb away from the SNP 3_220846626 identified for PH in sweet corn and waxy corn [8]. SNP 3_229008966 is in LD with Zm00001d044469, encoding the heterogeneous nuclear ribonucleoprotein A3-like protein 2, which may play an important role in plant growth and development by participating in RNA metabolic regulation. The SNP 1_48336894, identified by both MLM and CMLM, was located within the gene model of Zm00001d028840, which encodes a DUF4378 domain protein that plays a crucial role in regulating plant growth and development. In Arabidopsis and rapeseed, overexpression of the AtDUF4 gene increased the size of plant organs, suggesting that the AtDUF4 gene might regulate cell wall formation and the expression of auxin transporter genes [58].

A total of 92 unique SNPs located on chromosomes 1 to 9 were identified for EH. A hotspot region controlling EH was detected at bin 4.07, which included the SNP chr4.S_179801568 associated with EH [55]. The SNP 2_4813105 located at bin 2.02 was 3.14 Mb away from the SNP S2_2 identified in 203 maize inbred lines [7]. The SNP 4_235421563 was 5.13 Mb away from the SNP 4_240547324 identified for EH [8]. The SNP 6_167238598 was 69.21 kb away from the SNP 6_167169385 detected for EH [8]. The SNP 1_13186808 was 808.11 kb away from SNP Affx-291404634 identified under well-watered conditions [59]. The SNP 4_184809560 was 191.66 kb away from SNP SYN21465 identified in 226 inbred lines [60]. The SNP 3_19067558 was 187.75 kb away from the SNP PZE-103026453 related to EH [60]. The SNP 5_15858401 was 1.86 Mb away from the SNP SYN27136 associated with EH [60]; SNP 5_15858401 is in LD with Zm00001d013631, encoding the TATA-box-binding protein 2. The SNP identified by five models was located in LD with Zm00001d052003, encoding 40S ribosomal protein S30. The putative candidate gene Zm00001d051856 was located at bin 4.06 at 172.4 Mb, near the previously reported qEP1-4 (159 Mb) [61]. This gene encodes an auxin-independent growth promoter protein that regulates EH through non-canonical hormonal pathways. This protein primarily influences cell elongation, internode development, and meristem activity to modulate plant architecture and yield potential.
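The physical distances quoted for SNPs named in the chromosome_position style can be checked directly from their identifiers. The sketch below is an illustration only: the helper names are hypothetical, and the comparison is meaningful only when both SNPs are given in coordinates of the same reference genome version, which is assumed here for the two pairs taken from the paragraph above.

```python
def parse_snp(name):
    """Split an identifier such as '4_235421563' into (chromosome, position in bp)."""
    chrom, pos = name.split("_")
    return int(chrom), int(pos)


def snp_distance_bp(snp_a, snp_b):
    """Absolute physical distance in bp between two SNPs on the same chromosome."""
    chrom_a, pos_a = parse_snp(snp_a)
    chrom_b, pos_b = parse_snp(snp_b)
    if chrom_a != chrom_b:
        raise ValueError("SNPs are on different chromosomes")
    return abs(pos_a - pos_b)


def format_distance(bp):
    """Report distances of 1 Mb or more in Mb, and shorter ones in kb."""
    return f"{bp / 1e6:.2f} Mb" if bp >= 1_000_000 else f"{bp / 1e3:.2f} kb"


pairs = [("4_235421563", "4_240547324"), ("6_167238598", "6_167169385")]
for a, b in pairs:
    print(a, b, format_distance(snp_distance_bp(a, b)))
```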

Three unique SNPs at bin 6.01 were detected for TL. The SNPs 6_74547563 and 6_74547652 were associated with the putative candidate gene Zm00001d036148, which encodes Whirly transcription factor 1 (WHY1). Whirly proteins are plant-specific transcription factors that regulate plant growth, development, and abiotic stress responses by binding to single-stranded DNA [62]. Prunus avium WHY1 has been shown to display different expression patterns during flower bud differentiation, fruit development, and cold acclimation [63].

A total of 18 unique SNPs located on chromosomes 1, 4, 5, 8, and 10 were identified for TPBN. The SNP 8_174181776 was located in the QTL interval (reference genome AGPv: 168835715-168984157) related to TPBN [55]. Seven SNPs, 5_115751803, 5_116545032, 5_116545543, 5_116545594, 5_116594336, 5_116594388, and 5_116627988, were located at bin 5.04. The SNP 5_116594336 was located in the gene model of Zm00001d015754, encoding nuclear pore complex protein NUP205. Nuclear pore complex proteins form the main channel controlling nucleocytoplasmic communication. They play a crucial role in various biological processes, such as plant growth and development, hormone signaling, and response to biotic and abiotic stress [64,65]. Zm00001d014178 was located at bin 5.03 within 35.06–35.07 Mb and near the previously reported qTBN-5 (10.3–11.1 Mb) [66,67].

4.4. Genomic Selection Strategies for Plant Architecture-Related Traits

Over the past decade, genomic selection (GS), a promising breeding method aimed at accelerating the speed and efficiency of the breeding process, has been widely used in plant and animal molecular breeding. MD, TPS, and the use of significant markers are key factors affecting prediction accuracy, and their impact was evaluated in this study. As MD and TPS increased, the prediction accuracy first increased and then stabilized. When TPS reached 60–70% or MD reached 3000, the prediction accuracy reached a plateau, consistent with previous reports that prediction accuracy increases with TPS and MD [68]. When 50–70% of the total genotypes were set as the training population and 3000–5000 markers were used for GS in a GWAS panel, a relatively high prediction accuracy with smaller standard deviation (SD) was observed for tar spot complex resistance [69], common rust resistance [70], kernel zinc concentration in maize [71], and leaf-related traits [72]. This is consistent with the findings of our study. The prediction accuracy of PH, EH, TL, and TPBN was greatly improved by using 100–3000 significant SNPs. The optimal number of significant SNPs varies depending on the genetic characteristics of the target trait and the GWAS model.
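The way TPS and MD enter such an evaluation can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it uses ridge regression as a stand-in for a ridge-type genomic prediction model, repeated random training/validation splits rather than the 5-fold scheme, and simulated, mutually independent markers for 379 homozygous lines. All names, the penalty setting, and the simulated effect sizes are assumptions made for illustration. Because the simulated markers are not in LD, the sketch shows how accuracy is measured (Pearson correlation between predicted and observed values) but is not expected to reproduce the plateau reported here.

```python
import numpy as np


def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression on centered markers, solved in its n x n dual form."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    gram = Xc @ Xc.T
    alpha = np.linalg.solve(gram + lam * np.eye(len(y_train)), y_train - y_mean)
    marker_effects = Xc.T @ alpha
    return (X_test - x_mean) @ marker_effects + y_mean


def prediction_accuracy(X, y, train_frac, n_markers, lam_per_marker=0.25,
                        n_reps=20, seed=0):
    """Mean Pearson r between predicted and observed values over random splits."""
    rng = np.random.default_rng(seed)
    n_lines, n_total = X.shape
    n_train = int(round(train_frac * n_lines))
    accuracies = []
    for _ in range(n_reps):
        cols = rng.choice(n_total, size=n_markers, replace=False)  # marker density
        order = rng.permutation(n_lines)
        train, test = order[:n_train], order[n_train:]             # TPS split
        pred = ridge_predict(X[np.ix_(train, cols)], y[train],
                             X[np.ix_(test, cols)], lam_per_marker * n_markers)
        accuracies.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accuracies))


# Simulated data (hypothetical): homozygous lines coded 0/2, 20 causal markers.
rng = np.random.default_rng(1)
n_lines, n_snps, n_qtl = 379, 500, 20
X = rng.choice([0.0, 2.0], size=(n_lines, n_snps))
effects = np.zeros(n_snps)
effects[rng.choice(n_snps, size=n_qtl, replace=False)] = rng.normal(size=n_qtl)
g = X @ effects
y = g + rng.normal(scale=np.sqrt(0.25 * g.var()), size=n_lines)  # H2 near 0.8

acc_by_tps = {f: prediction_accuracy(X, y, f, n_markers=n_snps)
              for f in (0.1, 0.3, 0.5, 0.7, 0.9)}
acc_by_md = {m: prediction_accuracy(X, y, 0.7, n_markers=m)
             for m in (50, 200, 500)}
print(acc_by_tps)
print(acc_by_md)
```

The penalty is scaled with the number of markers so that the shrinkage per marker stays comparable when MD changes; this is one simple choice among several.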

One of the main factors limiting the application of GS in complex traits is the relatively low prediction accuracy. Multiple studies [73,74,75] have shown that integrating significant SNPs into the GS model can enhance prediction accuracy. Compared with the prediction accuracy estimated by 5-fold cross-validation with all SNPs, the prediction accuracy increased from 0.51 to 0.75 for brace root penetrometer resistance, 0.39 to 0.75 for root number, and 0.36 to 0.76 for tier number when using 100, 100, and 300 significant SNPs with the lowest p-values, respectively [76]. In a previous study, the prediction accuracy of ear shank length and ear shank node number was significantly improved by using 500–1000 significant SNPs [77].
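The size of these improvements is easier to compare as a relative gain, (improved − baseline) / baseline. The sketch below applies this to the accuracies quoted above from [76] and to the values reported in the abstract of this study (all-SNP accuracy versus the best accuracy obtained with significant SNPs); the function and variable names are illustrative.

```python
def relative_gain_pct(baseline, improved):
    """Relative increase in prediction accuracy, in percent of the baseline."""
    return 100.0 * (improved - baseline) / baseline


# (baseline with all SNPs, accuracy with significant SNPs)
brace_root_study = {  # values quoted from [76]
    "penetrometer resistance": (0.51, 0.75),
    "root number": (0.39, 0.75),
    "tier number": (0.36, 0.76),
}
this_study = {  # values from the abstract
    "PH": (0.44, 0.70),
    "EH": (0.43, 0.85),
    "TL": (0.31, 0.58),
    "TPBN": (0.30, 0.75),
}

gains = {trait: relative_gain_pct(*pair) for trait, pair in this_study.items()}
for trait, pair in {**brace_root_study, **this_study}.items():
    print(f"{trait}: {pair[0]:.2f} -> {pair[1]:.2f} (+{relative_gain_pct(*pair):.1f}%)")
```

For this study, the smallest and largest gains (PH and TPBN) correspond to the 59.1% to 150.0% range given in the abstract and conclusions.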

Genomic prediction using significant SNPs detected by the MLM and CMLM models showed higher prediction accuracy than prediction using SNPs detected by the FarmCPU, SUPER, and BLINK models. This can be attributed to the inherent characteristics of these models. MLM and CMLM, as single-locus models, employ a stringent Bonferroni correction that tends to identify SNPs with relatively larger effect sizes, which are often more robust and transferable across environments and populations. In contrast, multi-locus models such as FarmCPU and BLINK are designed to detect a broader spectrum of associated loci, including those with smaller effects, but may introduce model-specific biases that affect the stability of selected SNPs for genomic prediction. Therefore, the observed differences in prediction accuracy reflect not only the statistical power of each GWAS model but also the compatibility between SNP selection strategies and the predictive model used. The choice of GWAS model and its alignment with the genetic architecture of the trait play a critical role in determining the effectiveness of significant SNP-based genomic prediction. When constructing SNP panels for genomic prediction, it is essential to comprehensively consider the genetic architecture of the target trait, the stability of LD between QTLs and markers, and the breeding objectives, rather than relying on a single indicator [78].
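Building a prediction panel from the most significant SNPs of a given GWAS model amounts to ranking markers by p-value and keeping the top k columns of the genotype matrix. The sketch below shows this selection step only, on made-up p-values and genotypes; the function name and the toy data are hypothetical and do not come from the study.

```python
import numpy as np


def top_k_snps(p_values, k):
    """Indices of the k SNPs with the lowest p-values, most significant first."""
    p_values = np.asarray(p_values, dtype=float)
    if not 0 < k <= p_values.size:
        raise ValueError("k must be between 1 and the number of SNPs")
    return np.argsort(p_values, kind="stable")[:k]


# Toy example: six SNPs with p-values from one GWAS model (made-up values).
p_values = np.array([0.20, 3e-7, 0.04, 1e-5, 0.60, 2e-3])
genotypes = np.array([  # rows: lines, columns: SNPs, coded 0/2
    [0, 2, 0, 2, 0, 2],
    [2, 2, 0, 0, 2, 0],
    [0, 0, 2, 2, 2, 2],
])

selected = top_k_snps(p_values, k=3)
panel = genotypes[:, selected]  # marker subset passed on to the prediction model
print(selected.tolist())
print(panel.shape)
```

Repeating the selection with the p-values of each GWAS model yields model-specific panels whose prediction accuracies can then be compared, as in Figure 7.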

In addition to MD, TPS, and significant markers, the prediction accuracy is also influenced by the heritability and genetic architecture of the target traits. In this study, EH and PH showed higher prediction accuracy than TL and TPBN, which did not mirror the heritability estimates, as TL had the highest heritability (86.4%). This may be due to the different genetic architectures of the traits. In summary, our study provides valuable information on how to implement GS for plant architecture-related traits in maize.

5. Conclusions

This study systematically investigates the genetic basis and prediction strategies for plant architecture-related traits in maize through GWAS and GS. The results showed that PH, EH, TL, and TPBN exhibited moderate to high broad-sense heritability. Using six different GWAS models, 120 significant SNPs were identified across multiple chromosomes, along with five hotspot genomic regions and 144 putative candidate genes enriched in endocytosis, lipid biosynthesis, and electron carrier activity. When the training population size reached 60–70% or marker density reached 3000, GS prediction accuracy stabilized. Notably, using 30–5000 significant SNPs with the lowest p-values improved prediction accuracy by 59.1% to 150.0%. This study elucidates the multi-locus genetic architecture of maize plant architecture traits, proposes an optimized genomic selection strategy integrating highly significant SNP subsets that substantially improve prediction efficiency while reducing genotyping costs, and provides valuable molecular resources, including identified SNPs, candidate genes, and five hotspot genomic regions for marker-assisted selection, fine mapping, and functional marker development in breeding ideal plant architecture.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16070761/s1, Table S1: Significantly associated SNPs for plant type-related traits revealed by the GWAS using different models; Table S2: The annotation of putative candidate genes related to plant type-related traits; Table S3: Annotated 437 GO terms for 131 putative candidate genes; Table S4: The 27 KEGG pathways for 19 putative candidate genes; Table S5: Seven distinct protein interactions.

Author Contributions

Writing—original draft preparation, B.W. and R.W.; writing—review and editing, B.W. and P.W.; visualization, Z.R., K.W., and X.X.; supervision, J.R. All authors have read and agreed to the published version of the manuscript.

Funding

The authors declare that financial support was received for the research and/or publication of this article. This research was funded by Xinjiang Uygur Autonomous Region Major Science and Technology Special Projects (grant number: 2022A02003-1); key research and development projects in Xinjiang Uygur Autonomous Region (grant number: 2024B02008-1); Tianshan Yingcai (grant number: 2022TSYCJU0003); National Natural Science Foundation of China (grant number: 32360491); Xinjiang Uygur Autonomous Region Natural Science Foundation key project (grant number: 2022D01D34); and Xinjiang Agriculture Research System (grant number: XJARS-02).

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author. Raw code is available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Chen, Z.; Wang, B.; Dong, X.; Liu, H.; Ren, L.; Chen, J.; Hauck, A.; Song, W.; Lai, J. An ultra-high density bin-map for rapid QTL mapping for tassel and ear architecture in a large F2 maize population. BMC Genom. 2014, 15, 433.
2. Lambert, R.J.; Johnson, R.R. Leaf angle, tassel morphology, and the performance of maize hybrids. Crop Sci. 1978, 18, 499–502.
3. Khush, G.S. Green revolution: The way forward. Nat. Rev. Genet. 2001, 2, 815–822.
4. Weng, J.; Xie, C.; Hao, Z.; Wang, J.; Liu, C.; Li, M.; Zhang, D.; Bai, L.; Zhang, S.; Li, X. Genome-wide association study identifies candidate genes that affect plant height in Chinese elite maize (Zea mays L.) inbred lines. PLoS ONE 2011, 6, e29229.
5. Li, H.; Yang, Q.; Fan, N.; Zhang, M.; Zhai, H.; Ni, Z.; Zhang, Y. Quantitative trait locus analysis of heterosis for plant height and ear height in an elite maize hybrid zhengdan 958 by design III. BMC Genet. 2017, 18, 36.
6. Li, N.; Lin, B.; Wang, H.; Li, X.; Chu, Z. Natural variation in zmfbl41 confers banded leaf and sheath blight resistance in maize. Nat. Genet. 2019, 51, 1540–1548.
7. Shu, G.; Wang, A.; Wang, X.; Chen, R.; Gao, F.; Wang, A.; Li, T.; Wang, Y. Identification of QTNs, QTN-by-environment interactions for plant height and ear height in maize multi-environment GWAS. Front. Plant Sci. 2023, 14, 1284403.
8. Dang, D.; Guan, Y.; Zheng, H.; Zhang, X.; Zhang, A.; Wang, H.; Ruan, Y.; Qin, L. Genome-Wide Association Study and Genomic Prediction on Plant Architecture Traits in Sweet Corn and Waxy Corn. Plants 2023, 12, 303.
9. Wu, X.; Li, Y.; Shi, Y.; Song, Y.; Zhang, D.; Li, C.; Buckler, E.S.; Li, Y.; Zhang, Z.; Wang, T. Joint-linkage mapping and GWAS reveal extensive genetic loci that regulate male inflorescence size in maize. Plant Biotechnol. J. 2016, 14, 1551–1562.
10. Xu, Y.; Yang, T.; Zhou, Y.; Yin, S.; Li, P.; Liu, J.; Xu, S.; Yang, Z.; Xu, C. Genome-Wide Association Mapping of Starch Pasting Properties in Maize Using Single-Locus and Multi-Locus Models. Front. Plant Sci. 2018, 9, 1311.
11. Merrick, L.F.; Burke, A.B.; Zhang, Z.; Carter, A.H. Comparison of Single-Trait and Multi-Trait Genome-Wide Association Models and Inclusion of Correlated Traits in the Dissection of the Genetic Architecture of a Complex Trait in a Breeding Program. Front. Plant Sci. 2022, 12, 772907.
12. Huang, M.; Liu, X.; Zhou, Y.; Summers, R.M.; Zhang, Z. Blink: A package for the next level of genome-wide association studies with both individuals and markers in the millions. GigaScience 2019, 8, giy154.
13. Liu, X.; Huang, M.; Fan, B.; Buckler, E.; Zhang, Z. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 2016, 12, e1005767.
14. Segura, V.; Vilhjálmsson, B.; Platt, A.; Korte, A.; Seren, Ü.; Long, Q.; Nordborg, M. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat. Genet. 2013, 44, 825–830.
15. Zhang, J.; Wu, Z.; Cai, M.; Liu, K.; Han, X.; Liu, C.; Han, G.; Wen, Y. Integrated single marker scanning and sparse Bayesian learning improves performance of detection for GWAS. Plant Methods 2026.
16. Qian, F.; Jing, J.; Zhang, Z.; Che, S.; Sang, Z.; Li, W. GWAS and meta-QTL analysis of yield-related ear traits in maize. Plants 2023, 12, 3806.
17. Zhao, X.; Wang, C.; Liu, J.; Han, B.; Huang, J. Molecular markers and molecular basis of plant type related traits in maize. Front. Genet. 2024, 15, 1487700.
18. Spindel, J.; Begum, H.; Akdemir, D.; Virk, P.; Collard, B.; Redoña, E.; Atlin, G.; Jannink, J.L.; McCouch, S.R. Genomic selection and association mapping in rice (Oryza sativa): Effect of trait genetic architecture, training population composition, marker number and statistical model on accuracy of rice genomic selection in elite, tropical rice breeding lines. PLoS Genet. 2015, 11, e1004982.
19. Meuwissen, T.; Hayes, B.; Goddard, M. Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157, 1819–1829.
20. Zhang, X.; Pérez-Rodríguez, P.; Semagn, K.; Beyene, Y.; Babu, R.; López-Cruz, M.A.; San Vicente, F.; Olsen, M.; Buckler, E.; Jannink, J.L.; et al. Genomic prediction in biparental tropical maize populations in water-stressed and well-watered environments using low-density and GBS SNPs. Heredity 2015, 114, 291–299.
21. Beyene, Y.; Semagn, K.; Mugo, S.; Tarekegne, A.; Babu, R.; Meisel, B.; Sehabiague, P.; Makumbi, D.; Magorokosho, C.; Oikeh, S.; et al. Genetic gains in grain yield through genomic selection in eight bi-parental maize populations under drought stress. Crop Sci. 2015, 55, 154.
22. Crossa, J.; Perez, P.; Hickey, J.; Burgueno, J.; Ornella, L.; Ceron-Rojas, J.; Zhang, X.; Dreisigacker, S.; Babu, R.; Li, Y.; et al. Genomic prediction in CIMMYT maize and wheat breeding programs. Heredity 2014, 112, 48–60.
23. Charmet, G.; Storlie, E.; Oury, F.; Laurent, V.; Robert, O. Genome-wide prediction of three important traits in bread wheat. Mol. Breed. 2014, 34, 1843–1852.
24. Bassi, F.M.; Bentley, A.R.; Charmet, G.; Ortiz, R.; Crossa, J. Breeding schemes for the implementation of genomic selection in wheat (Triticum spp.). Plant Sci. 2016, 242, 23–36.
25. Xu, S.; Zhu, D.; Zhang, Q. Predicting hybrid performance in rice using genomic best linear unbiased prediction. Proc. Natl. Acad. Sci. USA 2014, 111, 12456–12461.
26. Wang, X.; Li, L.; Yang, Z.; Zheng, X.; Yu, S.; Xu, C.; Hu, Z. Predicting rice hybrid performance using univariate and multivariate gblup models based on north carolina mating design ii. Heredity 2017, 118, 302–310.
27. Sharma, S.; Pinson, S.R.M.; Gealy, D.R.; Edwards, J.R. Genomic prediction and qtl mapping of root system architecture and above-ground agronomic traits in rice (Oryza sativa L.) with a multitrait index and bayesian networks. G3 Genes Genomes Genet. 2021, 11, jkab178.
28. Bertolini, E.; Manjunath, M.; Ge, W.; Murphy, M.D.; Inaoka, M.; Fliege, C.; Eveland, A.L.; Lipka, A.E. Genomic prediction of cereal crop architectural traits using models informed by gene regulatory circuitries from maize. Genetics 2024, 228, iyae162.
29. Heslot, N.; Yang, H.; Sorrells, M.; Jannink, J. Genomic selection in plant breeding: A comparison of models. Crop Sci. 2012, 52, 146–160.
30. Millet, E.; Kruijer, W.; Coupel-Ledru, A.; Prado, S.; Tardieu, F. Genomic prediction of maize yield across European environmental conditions. Nat. Genet. 2019, 51, 952–956.
31. Cheng, D.; Li, J.; Guo, S.; Wang, Y.; Xu, S.; Chen, S.; Liu, W. Genomic prediction for germplasm improvement through inter-heterotic-group line crossing in maize. Int. J. Mol. Sci. 2025, 26, 2662.
32. Fan, Z.; Lin, S.; Jiang, J.; Zeng, Y.; Meng, Y.; Ren, J.; Wu, P. Dual-Model GWAS Analysis and Genomic Selection of Maize Flowering Time-Related Traits. Genes 2024, 15, 740.
33. Gentleman, R.; Ihaka, R. R: A language and environment for statistical computing. Computing 2011, 1, 12–21.
34. Hallauer, A.; Carena, M.; Miranda Filho, J. Quantitative Genetics in Maize Breeding; Springer: New York, NY, USA, 2010.
35. Bolger, A.M.; Marc, L.; Bjoern, U. Trimmomatic: A flexible trimmer for illumina sequence data. Bioinformatics 2014, 15, 2114–2120.
36. Li, H.; Durbin, R. Fast and accurate long-read alignment with burrows–wheeler transform. Bioinformatics 2010, 26, 589–595.
37. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and vcftools. Bioinformatics 2011, 27, 2156–2158.
38. Wang, J.; Zhang, Z. Gapit version 3: Boosting power and accuracy for genomic association and prediction. Genom. Proteom. Bioinform. 2021, 19, 629–640.
39. Turner, S. qqman: An R package for visualizing GWAS results using Q-Q and manhattan plots. Biorxiv 2018, 3, 731.
40. Zeng, Y.; Xu, X.; Jiang, J.; Lin, S.; Fan, Z.; Meng, Y.; Maimaiti, A.; Wu, P.; Ren, J. Genome-wide association analysis and genomic selection for leaf-related traits of maize. PLoS ONE 2025, 20, e0323140.
41. Yu, G.; Wang, L.; Han, Y.; He, Q. Clusterprofiler: An r package for comparing biological themes among gene clusters. Omics A J. Integr. Biol. 2012, 16, 284–287.
42. Campos, G.; Hickey, J.; Ricardo, P.; Hans, D.; Calus, M. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 2013, 193, 327–345.
43. Finkelshtein, A.; Khamesa, H.; Tuan, L.; Rabanim, M.; Chamovitz, D. Overexpression of the ribosomal s30 subunit leads to indole–carbinol tolerance in Arabidopsis thaliana. Plant J. 2020, 105, 668–677.
44. Finkelshtein, A.; Khamesa, H.; Chamovitz, D. Overexpression of s30 ribosomal protein leads to transcriptional and metabolic changes that affect plant development and responses to stress. Biomolecules 2024, 14, 319.
45. Wang, W.; Zhang, W.; Jamil, M.; Tu, J.; Huang, L. Editorial: Molecular and genetic mechanisms of plant architecture regulation. Front. Plant Sci. 2024, 15, 142119.
46. Li, C.; Li, Y.; Song, G.; Yang, D.; Xia, Z.; Sun, C.; Zhao, Y.; Hou, M.; Zhang, M.; Qi, Z.; et al. Gene expression and expression quantitative trait loci analyses uncover natural variations underlying the improvement of important agronomic traits during modern maize breeding. Plant J. 2023, 115, 772–787.
47. Ren, Z.; Wang, X.; Tao, Q.; Guo, Q.; Duan, L. Transcriptome dynamic landscape underlying the improvement of maize lodging resistance under coronatine treatment. BMC Plant Biol. 2021, 21, 202.
48. Yin, X.; Bi, Y.; Jiang, F.; Guo, R.; Zhang, Y.; Fan, J.; Kang, M.S.; Fan, X. Fine mapping of candidate quantitative trait loci for plant and ear height in a maize nested-association mapping population. Front. Plant Sci. 2022, 13, 963985.
49. Wang, C.; He, W.; Li, K.; Yu, Y.; Zhang, X.; Yang, S.; Wang, Y.; Yu, L.; Huang, W.; Yu, H.; et al. Genetic Diversity Analysis and GWAS of Plant Height and Ear Height in Maize Inbred Lines from South-East China. Plants 2025, 14, 481.
50. Cao, X.; Lu, H.; Zhao, Z.; Lian, Y.; Chen, H.; Yu, M.; Wang, F.; Sun, H.; Ding, D.; Zhang, X.; et al. Mining Candidate Genes for Maize Tassel Spindle Length Based on a Genome-Wide Association Analysis. Genes 2024, 15, 1413.
51. Zhu, C.; Gore, M.; Buckler, E.; Yu, J. Status and prospects of association mapping in plants. Plant Genome 2008, 1, 5–20.
52. Shi, W.; Hao, C.; Zhang, Y.; Cheng, J.; Zhang, Z. A Combined Association Mapping and Linkage Analysis of Kernel Number Per Spike in Common Wheat (Triticum aestivum L.). Front. Plant Sci. 2017, 8, 1412.
53. Zuo, Z.; Li, M.; Liu, D.; Li, Q.; Huang, B.; Ye, G.; Wang, J.; Tang, Y.; Zhang, Z. GWAS Procedures for Gene Mapping in Diverse Populations with Complex Structures. Bio-Protocol 2025, 15, e5284.
54. Yu, J.; Pressoir, G.; Briggs, W.H.; Vroh Bi, I.; Yamasaki, M.; Doebley, J.F.; McMullen, M.D.; Gaut, B.S.; Nielsen, D.M.; Holland, J.B.; et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 2006, 38, 203–208.
55. Pan, Q.; Xu, Y.; Li, K.; Peng, Y.; Zhan, W.; Li, W.; Li, L.; Yan, J. The genetic basis of plant architecture in 10 maize recombinant inbred line populations. Plant Physiol. 2017, 175, 858–873.
56. Huang, C.K.; Shen, Y.L.; Huang, L.F.; Wu, S.J.; Yeh, C.H.; Lu, C.A. The DEAD-Box RNA Helicase AtRH7/PRH75 Participates in Pre-rRNA Processing, Plant Development and Cold Tolerance in Arabidopsis. Plant Cell Physiol. 2016, 57, 174–191.
57. Ohtani, M.; Demura, T.; Sugiyama, M. Arabidopsis root initiation defective1, a DEAH-box RNA helicase involved in pre-mRNA splicing, is essential for plant development. Plant Cell 2013, 25, 2056–2069.
58. Chen, G. Overexpression of the nuclear protein gene AtDUF4 increases organ size in Arabidopsis thaliana and Brassica napus. J. Genet. Genom. 2018, 45, 459–462.
59. Wen, X.; Li, H.Y.; Song, Y.L.; Zhang, P.Y.; Zhang, Z.; Bu, H.H.; Dong, C.L.; Ren, Z.Q.; Chang, J.Z. Genome-wide association study for plant height and ear height in maize under well-watered and water-stressed conditions. BMC Genom. 2025, 26, 745.
60. Lu, X.; Liu, P.; Tu, L.; Guo, X.; Wang, A.; Zhu, Y.; Jiang, Y.; Zhang, C.; Xu, Y.; Chen, Z.; et al. Joint-gwas, linkage mapping, and transcriptome analysis to reveal the genetic basis of plant architecture-related traits in maize. Int. J. Mol. Sci. 2024, 25, 2694.
61. Xi, X.; Lu, X.; Xue, C.; Li, J.; Pi, N.; Zhang, K.; Lu, Y. Qtl mapping of seven agronomic traits in maize based on the introgression lines. Chin. Sci. Bull. 2018, 63, 3103–3113.
62. Liu, H.; Wang, X.; Yang, W.; Liu, W.; Wang, Y.; Wang, Q.; Zhao, Y. Identification of whirly transcription factors in triticeae species and functional analysis of tawhy1-7d in response to osmotic stress. Front. Plant Sci. 2023, 14, 1297228.
63. Wang, L.; Hou, Q.; Qiao, G. Genome-Wide Identification and Expression Analysis of the Sweet Cherry Whirly Gene Family. Curr. Issues Mol. Biol. 2024, 46, 8015–8030.
64. Gu, Y. The nuclear pore complex: A strategic platform for regulating cell signaling. New Phytol. 2018, 219, 25–30.
65. Parry, G. Assessing the function of the plant nuclear pore complex and the search for specificity. J. Exp. Bot. 2013, 64, 833–845.
66. Xie, Y.; Wang, X.; Ren, X. A snp-based high-density genetic map reveals reproducible qtls for tassel-related traits in maize (Zea mays L.). Trop. Plant Biol. 2019, 12, 244–254.
67. Brewbaker, J. Diversity and genetics of tassel branch numbers in maize. Crop Sci. 2015, 55, 65–78.
68. Zhang, A.; Wang, H.; Beyene, Y.; Semagn, K.; Liu, Y.; Cao, S.; Cui, Z.; Ruan, Y.; Burgueno, J.; San Vicente, F. Effect of trait heritability, training population size and marker density on genomic prediction accuracy estimation in 22 bi-parental tropical maize populations. Front. Plant Sci. 2017, 8, 1916.
69. Cao, S.; Loladze, A.; Yuan, Y.; Wu, Y.; Zhang, A.; Chen, J.; Huestis, G.; Cao, J.; Chaikam, V.; Olsen, M.; et al. Genome-wide analysis of tar spot complex resistance in maize using genotyping-by-sequencing snps and whole-genome prediction. Plant Genome 2017, 10, plantgenome2016-10.
70. Ren, J.; Li, Z.; Wu, P.; Zhang, A.; Liu, Y.; Hu, G.; Cao, S.; Qu, J.; Dhliwayo, T.; Zheng, H. Genetic dissection of quantitative resistance to common rust (Puccinia sorghi) in tropical maize (Zea mays L.) by combined genome. Front. Plant Sci. 2021, 12, 12692205.
71. Guo, R.; Dhliwayo, T.; Mageto, E.; Palacios-Rojas, N.; Lee, M.; Yu, D.; Ruan, Y.; Zhang, A.; San Vicente, F.; Olsen, M. Genomic Prediction of Kernel Zinc Concentration in Multiple Maize Populations Using Genotyping-by-Sequencing and Repeat Amplification Sequencing Markers. Front. Plant Sci. 2020, 11, 534.
72. Yang, X.; Wu, P.; Cui, W.; Alimu, D.; Wang, K.; Ren, J. Genome-wide association studies and genomic selection for leaf-related traits in maize. Front. Plant Sci. 2025, 16, 1669346.
73. Herter, C.; Ebmeyer Ehum, T.; Miedaner, T. Accuracy of within-and among-family genomic prediction for Fusarium head blight and Septoria tritici blotch in winter wheat. Theor. Appl. Genet. 2019, 132, 1121–1135.
74. Rice, B.; Lipka, A. Evaluation of RR-BLUP genomic selection models that incorporate peak genome-wide association study signals in maize and sorghum. Plant Genome 2019, 12, 180052.
75. Cao, S.; Song, J.; Yuan, Y.; Zhang, A.; Ren, J.; Liu, Y.; Qu, J.; Hu, G.; Zhang, J.; Wang, C.; et al. Genomic prediction of resistance to tar spot complex of maize in multiple populations using genotyping-by-sequencing snps. Front. Plant Sci. 2021, 12, 672525.
76. Lin, S.; Xu, X.; Fan, Z.; Jiang, J.; Zeng, Y.; Meng, Y.; Ren, J.; Wu, P. Genome-wide association study and genomic selection of brace root traits related to lodging resistance in maize. Sci. Rep. 2024, 14, 31898.
77. Jiang, J.; Ren, J.; Zeng, Y.; Xu, X.; Lin, S.; Fan, Z.; Meng, Y.; Ma, Y.; Li, X.; Wu, P. Integration of gwas models and gs reveals the genetic architecture of ear shank in maize. Gene 2024, 938, 149140.
78. Wang, W.; Guo, W.; Le, L.; Yu, J.; Wu, Y.; Li, D.; Wang, Y.; Wang, H.; Lu, X.; Qiao, H.; et al. Integration of high-throughput phenotyping, GWAS, and predictive models reveals the genetic architecture of plant height in maize. Mol. Plant 2023, 16, 354–373.

Figure 1. Correlation analysis for plant architecture-related traits of maize. PH: plant height; EH: ear height; TL: tassel length; TPBN: tassel primary branch number. *** represents significant at p < 0.001.

Figure 2. QQ and Manhattan plots of maize plant height, ear height, tassel length, and tassel primary branch number obtained with six models. (a) PH: plant height; (b) EH: ear height; (c) TL: tassel length; (d) TPBN: tassel primary branch number.

Figure 3. Allele effects of the overlapping SNPs identified by different models. (a) Allele effect of SNP 2_200079765 for PH. (b) Allele effect of SNP 4_175914150 for EH. (c) Allele effect of SNP 6_74547563 for TL. (d) Allele effect of SNP 5_115751803 for TPBN. ** represents significant at p < 0.01, and **** represents significant at p < 0.0001.

Figure 4. Enrichment (a) and pathway (b) analysis of putative candidate genes for plant architecture-related traits.

Figure 5. Protein–protein interaction analysis.

Figure 6. Genomic selection prediction accuracy of plant architecture-related traits. (a) Genomic prediction accuracy for plant architecture-related traits with the training population size (TPS) ranging from 10 to 90% of the total population size. (b) Genomic prediction accuracy for plant architecture-related traits with the number of SNPs ranging from 10 to 10,000.

Figure 7. Genomic selection prediction accuracy estimated using 30–5000 significant SNPs detected by different GWAS models. (a) PH: plant height; (b) EH: ear height; (c) TL: tassel length; (d) TPBN: tassel primary branch number.

Table 1. Descriptive statistics, variance components, and broad-sense heritability (H²) of plant architecture-related traits in the multi-parent DH population.

Trait ^a	Environment	Min	Max	Mean	SD	CV	$\sigma^2_g$ ^b	$\sigma^2_{ge}$ ^b	$\sigma^2_e$ ^b	H² ^c
PH	SG	171.33	294.60	236.59	23.79	0.10	260.149 ***	183.396 ***	231.056 ***	81.3%
	DF	158.80	306.40	254.01	25.51	0.10
	QT1	177.00	302.60	235.50	20.96	0.09
	LD	147.58	289.54	206.83	38.50	0.19
	QT2	176.00	314.73	244.09	29.72	0.12
	Combined	147.58	314.73	241.16	27.79	0.12
EH	SG	50.78	141.76	97.30	16.17	0.17	103.568 ***	61.078 ***	142.914 ***	79.6%
	DF	44.30	149.40	107.53	16.77	0.16
	QT1	44.30	149.60	104.21	17.20	0.17
	LD	31.63	74.70	48.49	9.28	0.19
	QT2	37.83	155.90	101.39	21.51	0.21
	Combined	31.63	155.90	97.67	23.70	0.24
TL	SG	24.00	45.68	33.60	3.98	0.12	11.42 ***	6.766 ***	0.508 ***	86.4%
	DF	19.30	50.50	32.73	5.44	0.17
	QT1	19.38	49.66	32.59	4.71	0.14
	LD	22.47	49.68	34.48	4.70	0.14
	QT2	20.93	50.97	35.13	4.82	0.14
	Combined	19.30	50.97	33.84	4.86	0.14
TPBN	SG	2.00	16.40	7.54	2.36	0.31	2.177 ***	0.634 ***	3.3431 ***	82.5%
	DF	2.00	15.00	7.30	2.58	0.35
	QT1	2.20	15.40	7.22	2.16	0.30
	LD	1.00	16.00	6.04	2.25	0.37
	QT2	1.00	15.00	6.33	2.88	0.45
	Combined	1.00	16.40	6.87	2.53	0.37

^a PH: plant height; EH: ear height; TL: tassel length; TPBN: tassel primary branch number. ^b Variance components: σ²_g, genotypic variance; σ²_ge, genotype × environment interaction variance; σ²_e, error variance. *** significant at p < 0.001. ^c H², broad-sense heritability.
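
The heritability column in Table 1 can be related to the variance components with a short calculation. The sketch below assumes the common entry-mean formula H² = σ²_g / (σ²_g + σ²_ge/e + σ²_e/(e·r)), with e = 5 environments (SG, DF, QT1, LD, QT2, as listed in the table) and r = 2 replicates per environment. The replicate number is an assumption made for illustration, not a value taken from the table, and the function and variable names are hypothetical.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean broad-sense heritability from variance components.

    H^2 = var_g / (var_g + var_ge / n_env + var_e / (n_env * n_rep))
    """
    return var_g / (var_g + var_ge / n_env + var_e / (n_env * n_rep))


# Variance components copied from Table 1: (sigma2_g, sigma2_ge, sigma2_e)
TABLE1 = {
    "PH": (260.149, 183.396, 231.056),
    "EH": (103.568, 61.078, 142.914),
    "TL": (11.42, 6.766, 0.508),
    "TPBN": (2.177, 0.634, 3.3431),
}

N_ENV = 5  # environments listed in Table 1: SG, DF, QT1, LD, QT2
N_REP = 2  # ASSUMPTION: replicates per environment, not stated in the table

# Heritability per trait under the assumed design
H2 = {
    trait: broad_sense_heritability(*components, N_ENV, N_REP)
    for trait, components in TABLE1.items()
}

for trait, value in H2.items():
    print(f"{trait}: {value:.1%}")
```

Under these assumptions the sketch gives 81.3% for PH, 79.6% for EH, and 82.5% for TPBN, in line with the tabulated H². For TL it gives roughly 89% rather than the tabulated 86.4%, so the assumed design or a printed component may differ from what was used for that trait.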
