Article

The PIN-LIKES Auxin Transport Genes Involved in Regulating Yield in Soybean

College of Agriculture, Northeast Agricultural University, Harbin 150030, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Agronomy 2026, 16(2), 226; https://doi.org/10.3390/agronomy16020226 (registering DOI)
Submission received: 18 November 2025 / Revised: 8 January 2026 / Accepted: 14 January 2026 / Published: 17 January 2026
(This article belongs to the Special Issue Functional Genomics and Molecular Breeding of Soybeans—2nd Edition)

Abstract

PIN-LIKES (PILS) auxin transport genes play key roles in plant development, but their functions and molecular mechanism in soybean yield remain unclear. Here, we characterized the 44-member soybean GmPILS genes via comprehensive analyses. Phylogenetic analysis classified GmPILS into three subfamilies, with most proteins being hydrophobic, stable, and membrane-localized. Chromosomal distribution showed random scattering across 17 chromosomes, with gene duplication driving family expansion. Expression profiling identified GmPILS36 and GmPILS40 as seed-specific and differentially expressed between cultivated Suinong14 (SN14) and wild ZYD00006 (ZYD06) soybeans. Population genetic analyses revealed GmPILS40 experienced a domestication bottleneck without yield-related superior haplotypes, while GmPILS36 underwent selection during landrace-to-improved variety domestication. A coding region CC/TT natural variation in GmPILS36 (S/A substitution) was significantly associated with seed weight per plant and 100-seed weight, with the TT genotype conferring superior traits. This study provides insights into GmPILS genes’ evolution and identifies GmPILS36 as an important candidate gene for further functional study and investigation of the molecular mechanisms regulating soybean yield.

1. Introduction

Auxins are a class of small molecules synthesized autonomously within plants and play a crucial role in regulating plant growth and development in conjunction with other plant hormones [1,2,3]. Current research on the mechanisms of auxin function focuses on auxin biosynthesis, polar auxin transport, auxin signaling pathways, and the interactions between auxins and other plant hormones. Auxins also significantly influence seed development and crop yield [4]. The transmembrane transport of auxins is mediated by proteins such as the PIN family and the structurally similar PIN-LIKES (PILS) family [5,6]. Although PIN proteins exhibit 10–18% structural similarity to PILS proteins, the two groups form separate protein families, with PILS proteins being more evolutionarily conserved than PIN proteins. Moreover, PIN and PILS proteins, which are specific to eukaryotes, belong to a large protein superfamily with ancestral origins in bacteria and form two distinct subfamilies within this superfamily [5]. Typically, phosphorylation sites located at the C-terminus of PIN proteins can be phosphorylated by PINOID kinase, regulating their polar localization and the transport of auxins between and within cells [7]. In contrast, PILS proteins lack prominent phosphorylation sites and primarily influence auxin transport by affecting the flux from the endoplasmic reticulum to the nucleus, thereby regulating plant growth and development [8,9,10]. To date, eight PIN proteins and seven PILS proteins have been identified in Arabidopsis thaliana.
PIN and PILS proteins are involved in plant growth and development, with the PIN1 gene being the first gene cloned from the PIN family in Arabidopsis and currently the most extensively studied gene within this family [11]. Research indicates that auxins play a role in regulating grain development, ultimately determining grain size. Overexpression of GmGA3ox1 in Arabidopsis can significantly increase 1000-grain weight and single-plant yield [12]. In wheat, allelic variations of Pinb-2v3 can enhance 1000-grain weight by 8–10% [13]. In rice, the OsPIL15 knockout line shows a significant increase in grain weight compared to wild-type lines. Field yield measurements indicate that the OsPIL15 knockout line has increased yield, while the OsPIL15 overexpression line exhibits significantly reduced yield [14]. Transcriptome sequencing of various pineapple tissues, including roots, leaves, flowers, and fruits, has revealed that AcPILS2, AcPILS6b, and AcPILS6c are highly expressed in fruits, suggesting their potential involvement in regulating pineapple fruit development [15]. GmJAZ3 participates in the jasmonic acid signaling pathway and the cytokinin pathway regulating grain weight, and it plays a critical role in the transport of cytokinins through a regulatory network involving GmJAZ3-GmRR18a-GmMYC2a and GmCKXs, ultimately affecting seed weight [16]. In soybean, GmPIN1 is involved in the asymmetric distribution of auxins, which influences the architecture of soybean plants and the angle of petioles [17]. However, there are currently no reports on the role of PIN and PILS in regulating soybean seed development or seed weight. Therefore, exploring the GmPIN and GmPILS genes involved in soybean seed development could provide original genetic resources for soybean breeding efforts.
Soybean is an important source of protein and a major oilseed crop globally [18]. As one of the largest consumers of seeds, China has seen its demand for seed crops increase in parallel with the rising living standards of its citizens. This demand can be met through domestic cultivation and importation, and international soybean prices have recently continued to rise slowly [19]. Increasing soybean yield is therefore a key focus of soybean breeding in China. Soybean yield is influenced by factors such as 100-seed weight, seed weight per plant, seed length, and seed width [20]. Yield is a complex quantitative trait that is controlled by multiple genes. Although PILS proteins play a significant role in the growth and development of other plants, particularly in seed development, research on PILS proteins in soybean remains relatively limited, making it an area of considerable research significance.
In this study, by integrating bioinformatics, molecular biology, and population genetics approaches, we identified 44 GmPILS genes in soybean and screened for an important gene closely associated with soybean yield, GmPILS36, at multiple levels including the genome, transcriptome, and proteome. At the transcriptional and protein levels, GmPILS36 is relatively highly expressed in seeds, with increased expression during the later stages of seed development, specifically corresponding to the cell expansion stage, and it is localized to the cell membrane. Genomically, GmPILS36 has been under selective pressure during the domestication of soybean. Furthermore, GmPILS36 has two major haplotypes, with one key variant distinguishing these haplotypes being the CC/TT natural variation in its coding region, which leads to an S/A amino acid substitution. The TT genotype is associated with significantly higher seed weight per plant and 100-seed weight compared to the CC genotype. Thus, GmPILS36 is an important gene closely associated with soybean yield and 100-seed weight, with the natural variation CC/TT in its coding region serving as a key SNP responsible for changes in soybean yield. This study not only provides new insights into the genetic regulatory mechanisms of soybean yield traits but also offers valuable candidate gene resources for functional gene cloning in soybean.

2. Materials and Methods

2.1. Phylogenetic Analysis

The genomic sequence files, protein sequence files, and annotation files for the cultivated soybean variety ‘Williams 82’ were downloaded from the Ensembl Plants database, https://plants.ensembl.org/index.html (accessed on 14 September 2025). The hidden Markov model (HMM) profile for the PIN-Like domain (PF03547) was obtained from the InterPro website, http://www.ebi.ac.uk/interpro/ (accessed on 18 September 2025) [21]. Using HMMER 3.0 software, we searched the input protein sequences against this profile and retained candidate members of the soybean PILS gene family at a threshold of 0.001. TBtools v1.098 software [22] was used to extract the protein sequences of the identified PILS gene family, construct a phylogenetic tree, and visualize the phylogenetic tree along with the predicted motifs. Chromosomal distribution analysis and synteny analysis of the soybean PILS gene family were performed using the genomic and annotation files.
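The threshold-based filtering step can be illustrated with a minimal sketch. It assumes that the 0.001 threshold refers to the full-sequence E-value and that the search results were written in HMMER's tabular (`--tblout`) format, in which the full-sequence E-value is the fifth whitespace-separated column. The gene identifiers and scores in the sample table are made up for illustration and are not results from this study.

```python
# Hypothetical excerpt of an `hmmsearch --tblout` table. The target names and
# scores are invented for illustration only.
SAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias  ...
GeneA.1  -  Mem_trans  PF03547.1  1.2e-85  285.1  20.3  3.0e-85  283.8  20.3  1.0  1  0  0  1  1  1  1  example protein
GeneB.1  -  Mem_trans  PF03547.1  4.5e-04  18.2  0.1  6.0e-04  17.8  0.1  1.1  1  0  0  1  1  1  1  example protein
GeneC.1  -  Mem_trans  PF03547.1  2.3e-02  12.6  0.0  3.1e-02  12.2  0.0  1.0  1  0  0  1  1  1  1  example protein
"""


def filter_hits(tblout_text, max_evalue=1e-3):
    """Return (target, E-value) pairs whose full-sequence E-value is at or
    below max_evalue. Assumption: the cutoff applies to the full-sequence
    E-value, which is column 5 of the tblout format."""
    hits = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]
        evalue = float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, evalue))
    return hits


if __name__ == "__main__":
    for target, evalue in filter_hits(SAMPLE_TBLOUT):
        print(target, evalue)
```

In this toy table, only the first two entries pass the cutoff; the third is discarded because its E-value exceeds 0.001.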

2.2. Analysis and Visualization of Conserved Motifs and Phylogenetic Relationships in the Soybean PILS Gene Family

The protein sequences of the soybean GmPILS gene family were input into the MEME Suite to predict motifs within the protein sequences, https://meme-suite.org/meme/tools/meme (accessed on 28 September 2025). The results were downloaded in XML format and visualized using TBtools v1.098 software [22]. Using MEGA 12 software [23], the PILS family protein sequences were aligned using either the “Align by ClustalW” or the “Align by MUSCLE” option. A phylogenetic tree for the PILS family was constructed using the Neighbor-Joining method, with 1000 bootstrap replicates to estimate the reliability of the tree. The Gene Structure View (Advanced) feature in TBtools v1.098 [22] was used to load the NWK file from MEGA software and the XML file downloaded from the MEME website for a joint analysis, enabling the co-visualization of the phylogenetic tree and motifs for the soybean PILS genes.

2.3. Analysis of Physicochemical Properties and Prediction of Subcellular Localization

The theoretical isoelectric points, instability indices, and hydrophilicity/hydrophobicity of the soybean GmPILS proteins were predicted using ExPASy ProtParam, https://web.expasy.org/protparam/ (accessed on 1 October 2025). Subcellular localization of the soybean PILS gene family was predicted using the online tool Cell-PLoc 2.0, http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc-2/ (accessed on 5 October 2025).
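As an illustration of how the hydropathicity and stability classes are derived, the following minimal sketch computes the grand average of hydropathicity (GRAVY) as the mean Kyte-Doolittle hydropathy value over all residues, which is the definition ProtParam uses. The classification rules (GRAVY above 0 treated as hydrophobic, instability index below 40 treated as stable) follow the usual ProtParam conventions and are stated here as assumptions, since the exact cutoffs are not given in this section. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def classify(gravy_value, instability_index):
    """Assumed conventions: GRAVY > 0 is hydrophobic, index < 40 is stable."""
    hydropathy = "hydrophobic" if gravy_value > 0 else "hydrophilic"
    stability = "stable" if instability_index < 40 else "unstable"
    return hydropathy, stability


if __name__ == "__main__":
    # Toy peptide and a made-up instability index, for illustration only.
    value = gravy("MLIVAGS")
    print(round(value, 3), classify(value, 35.0))
```

The instability index itself is passed in as a number because its calculation relies on a dipeptide weight table that is not reproduced here.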

2.4. Synteny Analysis of the GmPILS Genes

The genomic sequence and annotation files downloaded from the Ensembl Plants database were imported into TBtools v1.098 software [22]. The process included extracting the chromosome backbone, preparing the gene density file, and generating an initial framework for the synteny file. The synteny file was then simplified and extracted, followed by the identification of target genes. Finally, the syntenic regions of the target genes were highlighted. The synteny mapping and graphical enhancement of the soybean PILS gene family were completed within TBtools v1.098 [22].

2.5. Analysis of Tissue Expression Patterns

Transcriptomic data for various soybean tissues (lateral roots, roots, root tips, stems, stem tips, leaves, and seeds) were downloaded from the SoyBase database, https://www.soybase.org/ (accessed on 14 October 2025). These data were used to analyze the tissue expression patterns of the soybean PILS gene family, with a preliminary selection of genes that are specifically expressed in soybean seeds. Using the Heatmap function in TBtools v1.098 software [22], the expression patterns of the soybean PILS gene family were analyzed based on log-scaled transcriptomic data for the cultivated soybean variety ‘Suinong 14’ and the wild soybean accession ‘ZYD00006’ during the early stages of seed development [23].

2.6. qRT-PCR Analysis

qRT-PCR was performed using seeds from the ‘Suinong 14’ and ‘ZYD00006’ varieties at the Cot and EM1 seed developmental stages. The definition of developmental stages in soybean is based on the SoyBase database, https://www.soybase.org/ (accessed on 2 October 2025). qPCR primers were designed using the NCBI website, https://www.ncbi.nlm.nih.gov (accessed on 13 October 2025), with GmActin11 (Glyma.18G290800) serving as the internal reference gene [24]. The primers were synthesized by Beijing Ruibo Xingke Biotechnology Co., Ltd. (Beijing, China). Fresh seeds (100 mg) from the soybean varieties ‘Suinong 14’ and ‘ZYD00006’ at the EM1 stage were placed in a 1.5 mL centrifuge tube, treated at 180 °C for 3 h, and then cooled with liquid nitrogen before being ground in a mortar. Total RNA was extracted from the soybean seeds using the FastPure Universal Plant Total RNA Isolation Kit. The HiScript III RT SuperMix for qPCR (+gDNA wiper) kit from Nanjing Novogene Biotechnology Co., Ltd. (Nanjing, China) was used to reverse transcribe the RNA into cDNA. Three technical replicates were run, and the relative expression levels of the candidate genes were calculated using the 2^(−ΔΔCt) method [25]. Each sample was analyzed in three biological replicates.
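The 2^(−ΔΔCt) calculation can be sketched as follows. This is a minimal illustration, assuming that technical replicate Ct values are averaged before normalization and that one sample is chosen as the calibrator (control); the Ct values in the example are made up and the function name is hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    Each argument is a list of technical-replicate Ct values. The reference
    gene (for example an actin gene) normalizes for input amount, and the
    control sample serves as the calibrator.
    """
    # dCt = Ct(target) - Ct(reference), computed within each sample.
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt = dCt(sample) - dCt(control).
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Made-up Ct values: the target amplifies two cycles earlier in the
    # sample than in the control after normalization.
    fold = relative_expression(
        ct_target_sample=[24.0, 24.0, 24.0],
        ct_ref_sample=[20.0, 20.0, 20.0],
        ct_target_control=[26.0, 26.0, 26.0],
        ct_ref_control=[20.0, 20.0, 20.0],
    )
    print(fold)
```

A ΔΔCt of −2 corresponds to a four-fold higher relative expression in the sample than in the calibrator.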

2.7. Plant Material and Field Management

To explore the genetic divergence between wild and cultivated soybeans, two accessions with distinct phenotypic characteristics were selected: ZYD00006 (ZYD06), an annual wild soybean with a small 100-seed weight of 3.1 g, and Suinong14 (SN14), a commercially released cultivar serving as the recurrent parent with a large 100-seed weight of 18.7 g. A total of 547 soybean accessions were planted at Xiangyang Farm of Northeast Agricultural University in Harbin, China, during the 2022 and 2023 growing seasons under consistent field conditions. Each accession was arranged in a single 5 m row with a ridge spacing of 65 cm and a plant spacing of 6 cm, giving approximately 80 plants per row. Sowing was performed by manual precision dibbling to ensure strict control of inter-plant distances, and field management was carried out in accordance with conventional agricultural practices.

2.8. Population Genetic Analysis

The 547 soybean accessions were used for population genetic analysis [26]. Population genetic parameters were estimated using well-established bioinformatics tools. Nucleotide diversity (π) and fixation index (FST) were calculated with DnaSP 5.10 [27] and VCFtools [28] across 500-kb sliding windows with a step size of 20 kb. Neutrality was assessed by calculating Tajima’s D within 500-kb genomic windows. Patterns of linkage disequilibrium were analyzed using PLINK [29], while haplotype networks were constructed in PopART, https://doi.org/10.5061/dryad.4n4j1 (accessed on 21 October 2025) employing the TCS algorithm [30].
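For readers unfamiliar with these statistics, the following minimal sketch computes nucleotide diversity (as the mean number of pairwise differences) and Tajima's D for a small set of aligned sequences, using the standard formulas from Tajima (1989). It is not the DnaSP or VCFtools implementation: it works on a toy alignment rather than 500-kb windows, reports π per alignment rather than per site, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences between aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns carrying more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D; returns None when there are no segregating sites."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (nucleotide_diversity(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy alignments, for illustration only.
    balanced = ["AA", "AT", "TA", "TT"]       # intermediate-frequency alleles
    singletons = ["AAA", "TAA", "ATA", "AAT"]  # only low-frequency alleles
    print(nucleotide_diversity(balanced), tajimas_d(balanced))
    print(nucleotide_diversity(singletons), tajimas_d(singletons))
```

The two toy alignments show the usual interpretation: an excess of intermediate-frequency alleles gives a positive D, whereas an excess of low-frequency alleles gives a negative D.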

2.9. Haplotype Analysis

Haplotype analysis of the candidate genes was conducted using an existing database of 547 soybean accessions. Haplotypes with a frequency exceeding 5.0% (0.05) in the population were considered predominant haplotypes of the GmPILS genes. Differences in 100-seed weight among the predominant haplotypes were tested using one-way ANOVA in SPSS 25.0 software (IBM, Chicago, IL, USA), followed by post hoc multiple comparisons using the LSD, Tukey’s HSD, and Duncan’s multiple range tests. Linkage disequilibrium (LD) analysis was performed using PLINK software [29] to calculate pairwise r² values between SNPs within the target genomic region. SNPs with a minor allele frequency (MAF) < 0.05 were filtered out prior to analysis to ensure reliable LD estimates.
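The frequency filter and the pairwise LD measure can be sketched as follows. This is a simplified illustration, not the PLINK procedure: it assumes phased haplotypes coded as 0/1 alleles, whereas PLINK by default estimates r² from genotype data. The haplotype labels, counts, and function names are hypothetical.

```python
from collections import Counter


def predominant_haplotypes(haplotypes, min_freq=0.05):
    """Return {haplotype: frequency} for haplotypes above min_freq."""
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return {hap: count / total
            for hap, count in counts.items()
            if count / total > min_freq}


def r_squared(snp_a, snp_b):
    """Haplotype-based LD r^2 between two biallelic SNPs (alleles 0/1).

    Returns None if either SNP is monomorphic, because r^2 is then undefined.
    """
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(a * b for a, b in zip(snp_a, snp_b)) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


if __name__ == "__main__":
    # Made-up haplotype calls for 100 accessions.
    calls = ["Hap1"] * 60 + ["Hap2"] * 37 + ["Hap3"] * 3
    print(predominant_haplotypes(calls))        # Hap3 (3%) is dropped
    print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))  # complete LD
    print(r_squared([0, 0, 1, 1], [0, 1, 0, 1]))  # no LD
```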

2.10. Statistical Analysis

A two-tailed Student’s t-test was performed using SPSS 25.0 software (IBM, Chicago, IL, USA) to assess the significance of differences in GmPILS gene expression between SN14 and ZYD06 seeds and of yield differences among the predominant GmPILS haplotypes in the 547 soybean germplasms, with the significance threshold set at p < 0.05. The normal distribution analysis and histogram visualization of the frequency distribution of seed weight per plant, 100-seed weight, seed length, and seed width for the 547 soybean accessions were also conducted using SPSS 25.0 software (IBM, Chicago, IL, USA).
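For reference, the test statistic behind a Student's t-test can be written out in a few lines. The sketch below assumes the classical equal-variance (pooled) form and returns only the t statistic and degrees of freedom; the two-tailed p-value would then be read from the t distribution, which SPSS does internally. The measurement values and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


if __name__ == "__main__":
    # Made-up measurements for two genotype groups.
    t, df = student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
    print(round(t, 4), df)
```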

3. Results

3.1. Phylogenetic and Conserved Motif Analysis of PILS Genes in Soybean

To investigate the evolutionary relationships among soybean, Arabidopsis, and wild soybean, 44 PILS family genes from soybean, 7 PILS family genes from Arabidopsis, and 14 PILS family genes from wild soybean were identified using the HMMER program for phylogenetic analysis (Figure 1A). The results indicated that the PILS proteins can be divided into three distinct subfamilies. Among these, GmPILS4, GmPILS5, GmPILS7, GmPILS9–GmPILS13, GmPILS15, GmPILS16, GmPILS19, GmPILS20, GmPILS26–GmPILS31, GmPILS35, GmPILS36, GmPILS38, GmPILS39, GmPILS41, and GmPILS42 formed a single evolutionary branch, suggesting a closer evolutionary relationship among these 24 genes. Another set of 31 genes, including GmPILS1–GmPILS3, GmPILS6, GmPILS8, GmPILS17, GmPILS18, GmPILS22–GmPILS25, GmPILS32–GmPILS34, GmPILS43, GmPILS44, AtPILS1, AtPILS4, AtPILS5, AtPILS7, GsPILS1–GsPILS6, and GsPILS8–GsPILS11, along with GmPILS14, formed another evolutionary branch. Additionally, GmPILS14, GmPILS21, GmPILS37, GmPILS40, AtPILS2, AtPILS3, AtPILS6, AtPILS7, GsPILS12, and GsPILS13 were grouped together and showed a closer evolutionary relationship. The results from the MEME online tool predicted that the soybean PILS gene family possesses a total of 10 conserved motifs (Motif1–Motif10) (Figure 1B). Specifically, 22 genes contained Motif1, 22 contained Motif2, 23 contained Motif3, 21 contained Motif4, 21 contained Motif5, 17 contained Motif6, 22 contained Motif7, 17 contained Motif8, 15 contained Motif9, with 9 genes containing Motif9, and 16 contained Motif10. These findings indicate that the soybean PILS gene family can be divided into three subfamilies (Group 1, Group 2, and Group 3) based on gene structure, which displayed significant differentiation among the three subfamilies.

3.2. The Predictions of Physicochemical Properties and Subcellular Localization of Soybean GmPILSs

Based on the chromosomal locations of the genes, the 44 members of the soybean PILS gene family were designated as GmPILS1–GmPILS44. The physicochemical properties of the soybean PILS gene family proteins were analyzed using the Expasy website, https://web.expasy.org/protparam/ (accessed on 1 October 2025). The results indicated that the amino acid lengths of the soybean PILS proteins range from 115 amino acids (GmPILS33) to 666 amino acids (GmPILS42), with isoelectric points ranging from 4.68 (GmPILS6) to 9.72 (GmPILS39). Among these, 34 PILS proteins are classified as stable proteins, while 10 PILS proteins are classified as unstable (GmPILS3, GmPILS6, GmPILS12, GmPILS13, GmPILS22, GmPILS24, GmPILS26, GmPILS28, GmPILS35, and GmPILS36). There is 1 hydrophilic protein (GmPILS16) and 43 hydrophobic proteins (GmPILS1–GmPILS15, GmPILS17–GmPILS44) (Table 1).
Subcellular localization predictions for the soybean PILS gene family were conducted using Cell-PLoc 2.0. The results revealed that 31 soybean PILS proteins are localized to the cell membrane (GmPILS1–GmPILS3, GmPILS5, GmPILS6, GmPILS8, GmPILS9, GmPILS14, GmPILS17–GmPILS25, GmPILS28–GmPILS30, GmPILS32–GmPILS40, GmPILS43, and GmPILS44), 9 are localized in the cytoplasm (GmPILS4, GmPILS7, GmPILS11, GmPILS12, GmPILS13, GmPILS15, GmPILS16, GmPILS31, and GmPILS41), 2 are localized in the chloroplasts (GmPILS26 and GmPILS27), GmPILS42 is localized in the nucleus, and GmPILS10 is localized in both the cell membrane and the nucleus (Table 1).

3.3. Chromosomal Distribution and Synteny Analysis of GmPILSs

The soybean PILS gene family is randomly distributed across 17 chromosomes of soybean (Figure 2A). GmPILS1 and GmPILS2 are located on chromosome 1 (Chr1), GmPILS3 and GmPILS4 are found on chromosome 3 (Chr3), GmPILS5 is located on chromosome 5 (Chr5), and GmPILS6 is on chromosome 6 (Chr6). GmPILS7–GmPILS10 are all distributed on chromosome 7 (Chr7), while GmPILS11 is located on chromosome 8 (Chr8). GmPILS12–GmPILS21 are found on chromosome 9 (Chr9), and GmPILS22 and GmPILS23 are both located on chromosome 10 (Chr10). GmPILS24 and GmPILS25 are distributed on chromosome 11 (Chr11), and GmPILS26–GmPILS28 are found on chromosome 13 (Chr13). GmPILS29 is located on chromosome 14 (Chr14), GmPILS30 and GmPILS31 are both on chromosome 15 (Chr15), and GmPILS32–GmPILS34 are located on chromosome 16 (Chr16). GmPILS35 and GmPILS36 are found on chromosome 17 (Chr17), while GmPILS37–GmPILS39 are on chromosome 18 (Chr18). GmPILS40 and GmPILS41 are located on chromosome 19 (Chr19), and GmPILS42–GmPILS44 are found on chromosome 20 (Chr20). There are no GmPILS genes present on chromosome 2 (Chr2), chromosome 4 (Chr4), or chromosome 12 (Chr12) (Figure 2A).
The results of the synteny analysis within soybean species indicate that 27 members of the soybean PILS gene family have undergone 48 duplication events. Among these, GmPILS1, GmPILS2, GmPILS12, GmPILS17, GmPILS24, GmPILS28, GmPILS30, GmPILS32, and GmPILS35 have experienced three duplication events, while GmPILS7, GmPILS11, and GmPILS16 have undergone two duplication events. The remaining genes have experienced one duplication event each (Figure 2B).

3.4. The Expression Pattern Analysis of GmPILS Genes in Soybean

Based on published soybean transcriptome data from SoyBase, https://legacy.soybase.org/ (accessed on 26 October 2025), the expression patterns of GmPILS genes were analyzed in seven soybean tissues: roots, root tips, lateral roots, stem tips, stems, leaves, and seeds. The results showed that GmPILS genes exhibited tissue-specific expression in soybean, with GmPILS18, GmPILS36, and GmPILS40 being highly expressed specifically in soybean seeds (Figure 3A). Additionally, a comparative analysis of the differential expression patterns of GmPILS genes during seed development in cultivated soybean SN14 and wild soybean ZYD06 revealed that GmPILS35, GmPILS40, GmPILS24, GmPILS7, and GmPILS17 displayed differential expression during the Seed 1 (S1) to Seed 15 (S15) stages, corresponding to the cell division and cell expansion phases of seed development (Figure 3B). GmPILS42 showed differential expression only between S1 and S5, corresponding to the cell division stage, while 23 genes exhibited differential expression during the S10 and S15 stages of seed development, corresponding to the cell expansion stage (Figure 3B). Furthermore, GmPILS6, GmPILS8, GmPILS13, GmPILS20, GmPILS30, GmPILS38, and GmPILS44 were found to be almost undetectable in soybean seeds (Figure 3B). These results indicate that most GmPILS genes are expressed in soybean seeds and that the majority are differentially expressed during the cell expansion phase of seed development, suggesting that most GmPILS genes could be involved in regulating seed development during this phase, ultimately affecting seed weight.
We further analyzed the expression patterns of GmPILS genes at the corresponding seed development stages using qRT-PCR. At the Cot stage, there was no significant difference in the expression of GmPILS42 between SN14 and ZYD06 (Figure 3D). At the EM1 stage, seven genes (GmPILS40, GmPILS24, GmPILS7, GmPILS23, GmPILS11, GmPILS35, and GmPILS14) were expressed at higher levels in ZYD06 than in SN14, whereas GmPILS4, GmPILS17, GmPILS21, GmPILS28, GmPILS36, GmPILS37, and GmPILS41 showed higher expression in SN14 than in ZYD06 (Figure 3E–R). Among these genes, only GmPILS40 and GmPILS36 exhibited significant differences in expression between SN14 and ZYD06 at the EM1 stage (Figure 3E–R). These expression pattern analyses suggest that GmPILS40 and GmPILS36 are important candidate genes for regulating seed weight and yield.

3.5. Population Genetic Analysis and Haplotype Analysis of Yield Candidate Gene GmPILS40

To further investigate the selection and domestication patterns of the yield candidate genes GmPILS40 and GmPILS36 during soybean domestication, we performed a population genetic analysis of GmPILS40 (Glyma.19G072900, or SoyZH13_19G064200) and GmPILS36 (Glyma.17G157300, or SoyZH13_19G064200) using 2898 soybean accessions from China, https://ngdc.cncb.ac.cn/soyomics/index (accessed on 28 September 2025). Nucleotide polymorphism analysis revealed that the values of θπ for GmPILS40 at the genomic location were 1.14 for Improved vs. Landrace, 0.37–0.39 for Landrace vs. Wild, and 0.43–0.48 for Improved vs. Wild, suggesting that GmPILS40 experienced a bottleneck effect during the transition from wild to landrace and improved soybeans (Figure 4A). The results of the population genetic differentiation coefficient indicated that FST values for Improved_Landrace, Landrace_Wild, and Improved_Wild were 0.0035, 0.1346–0.1355, and 0.1521–0.1525, respectively, demonstrating a moderate level of genetic differentiation, with the differentiation between improved varieties and wild soybeans being slightly higher than that between landrace and wild soybeans (Figure 4B). The analysis of Tajima’s D values for GmPILS40 showed that the improved population ranged from –1.19 to –0.95, the landrace population ranged from –1.52 to –1.48, and the wild population ranged from –1.19 to –1.17. All three populations exhibited negative Tajima’s D values, with the landrace population showing the largest absolute value, indicating a higher proportion of low-frequency alleles in all three groups, most pronounced in the landrace population.
Moreover, haplotype analysis was conducted on 547 soybean accessions for individual yield, 100-seed weight, seed length, and seed width [26] (Figure S1). The analysis revealed three main haplotypes for GmPILS40: Hap1, Hap2, and Hap3 (Figure 4D,E). However, there were no significant differences in seed weight or seed size among Hap1, Hap2, and Hap3 (Figure 4F–I). These results suggest that GmPILS40 may have been influenced by domestication selection but was not under strong selection pressure, and that this gene does not possess any superior haplotypes associated with seed weight.

3.6. Population Genetic and Haplotype Analyses of Yield Candidate Gene GmPILS36

The population genetic analysis of the yield candidate gene GmPILS36 yielded θπ values at its genomic location of approximately 0.92–1.00 for Improved vs. Landrace, 0.33–0.47 for Landrace vs. Wild, and 0.35–0.44 for Improved vs. Wild (Figure 5A). These results suggest a significant reduction in nucleotide polymorphism in the improved varieties, while the polymorphism levels in landrace and wild accessions are similar, indicating that GmPILS36 may have experienced strong selective pressure during the domestication process from landraces to improved varieties. The population genetic differentiation coefficients showed FST values of 0.11 for Improved_Landrace, 0.19–0.20 for Landrace_Wild, and 0.23–0.24 for Improved_Wild, demonstrating a moderate level of genetic differentiation, with the differentiation between improved varieties and wild soybeans being slightly higher than that between landrace and wild soybeans (Figure 5B). The analysis of Tajima’s D values for GmPILS36 revealed that the value for the improved population ranged from –0.83 to –0.73, for the landrace population from 0.15 to 0.16, and for the wild population from –0.35 to 0.19. These results indicate a pattern of low-frequency allele enrichment in the improved population, near neutrality in the landrace population, and an evolutionary pattern in the wild population that falls between the two, with the improved varieties showing the greatest deviation from neutrality (Figure 5C). However, the negative Tajima’s D in the improved population could also be attributed to soybean domestication bottlenecks, post-domestication population expansion, or breeding-mediated genetic drift, rather than selection alone.
Based on haplotype analysis using 547 soybean accessions, two main haplotypes, Hap1 and Hap2, were identified for GmPILS36 (Figure 5D,E). A natural variation site (C/T) in the coding region of GmPILS36 results in an amino acid change from S to A (Figure 5D). Genotyping at this site revealed that the TT genotype had significantly higher seed weight per plant and 100-seed weight than the CC genotype (Figure 5F,G), while there were no significant differences in seed length and width (Figure 5H,I). These results suggest that GmPILS36 experienced strong selective pressure during the domestication from landraces to improved varieties, and that the C/T variation in its coding region, which leads to the S/A amino acid difference, is significantly associated with seed weight per plant and 100-seed weight. The TT genotype was identified as a superior genetic variant for soybean yield in both years. These results indicate that GmPILS36 is involved in soybean yield domestication and that the CC/TT variation is a crucial natural variation related to soybean yield.

4. Discussion

4.1. The PILS Genes Reveal Significant Functional Differentiation Among Multiple Species

PILS proteins, which are auxin transport carriers, are widely present in plants and play a crucial role in plant growth and development. Currently, PILS proteins have been reported in various plants, including Arabidopsis [31], rice [32,33], pineapple [15], chrysanthemum [34], and sugarcane [35]. Seven PILS family genes have been identified in Arabidopsis, four in sugarcane, and eight in areca palm. A previous study used the gene sequences from Arabidopsis to search the soybean genome database, ultimately identifying 19 soybean PILS genes [36]. In this study, based on the same database, we used HMMER 3.0 software to screen for PILS-like domains and identified 44 soybean GmPILS genes. The larger family size found here may be closely related to the multiple rounds of whole-genome duplication events experienced by the soybean genome, such as ancient polyploidization, which provide a genetic basis for functional differentiation through gene duplication. From the perspective of sequence conservation, soybean and Arabidopsis PILS genes share a core PILS domain, including transmembrane transport-related motifs, indicating a conserved basic function in auxin transport. However, the predicted subcellular localization of soybean GmPILS proteins is more diverse, predominantly at the plasma membrane, with some distribution in the cytoplasm and chloroplasts, while Arabidopsis PILS proteins are primarily localized to the endoplasmic reticulum. This difference may reflect adaptive evolutionary changes in soybean in response to complex environments and organ development, such as seed enlargement, during long-term domestication. For example, GmPILS36, localized at the plasma membrane, may respond more directly to dynamic changes in auxin during seed development, whereas the endoplasmic reticulum-localized PILS proteins in Arabidopsis are more focused on regulating intracellular auxin storage and release.
Functionally, Arabidopsis PILS genes are mainly involved in fundamental processes such as seedling morphogenesis and root apex development, while soybean GmPILS genes exhibit stronger tissue specificity, including GmPILS36 and GmPILS40, which are highly expressed in seeds, suggesting that they have evolved new functions in the regulation of crop-specific agronomic traits, such as seed yield.

4.2. The GmPILS Genes Demonstrate Significant Functional Diversification in Soybean Evolution

Soybean has experienced two additional whole-genome duplication (WGD) events: one around 59 million years ago (MYA) and the other between 5 and 13 MYA [37]. Roughly 75% of soybean genes exist in multiple copies within the genome [37]. Because of functional redundancy, one gene in a duplicated pair often becomes non-functional or undergoes functional divergence following a WGD event. Duplicated genes can follow at least four evolutionary paths: pseudogenization, neofunctionalization, subfunctionalization, or retention of the original function shared between both copies [38,39,40]. Moreover, after WGDs and small-scale duplications, the balance of gene dosage can exert varying impacts on the evolutionary dynamics of subfunctionalization and non-functionalization [41]. In our study, the three subfamilies differ in their conserved motifs: Group 1 is enriched in Motifs 1/2/3, while Group 3 contains the unique Motif 9, which may correspond to functional specialization. For instance, members containing Motif 7, which is presumed to be associated with auxin binding, including GmPILS36, are more likely to be directly involved in auxin transmembrane transport, whereas members containing Motif 10 (of unknown function), including GmPILS42, may be involved in auxiliary signaling functions. Predicted subcellular localization further supports this hypothesis: the 28 GmPILS proteins localized at the plasma membrane, including GmPILS36 and GmPILS40, are likely to dominate the directional transport of auxin between cells, while members localized in the cytoplasm or chloroplasts, including GmPILS11 and GmPILS26, may participate in the intracellular distribution of auxin or in regulating processes coupled with photosynthetic metabolism.
Tissue expression pattern analyses indicate that GmPILS36, GmPILS40, and GmPILS18 are highly and specifically expressed in seeds. Their dynamic expression during seed development stages S1 to S15, especially during the cell division and expansion phases, suggests that they may influence cell proliferation and volume expansion by regulating auxin accumulation in the endosperm or cotyledons, ultimately affecting seed weight and yield. Members that are highly expressed in root and stem tissues, including GmPILS1 and GmPILS2, may be involved in root architecture formation or stem elongation, which relate to nutrient uptake and lodging resistance in soybean. Additionally, the expression of chloroplast-localized GmPILS26/27 in leaves suggests that they may influence photosynthetic efficiency by regulating auxin levels within chloroplasts. Furthermore, paralogous genes resulting from gene duplication events have diverged in their expression patterns; for example, GmPILS7 is highly expressed in roots, whereas GmPILS8 is expressed at low levels in seeds. This divergence illustrates “subfunctionalization” evolution, which avoids functional redundancy.

4.3. The Natural Variation CC/TT of GmPILS36 May Be Involved in the Domestication of Soybean Single-Plant Yield

Cultivated soybean (Glycine max) was domesticated from its wild relative Glycine soja, which exhibits higher genetic diversity. Domestication significantly reduced genetic diversity through strong selection for traits that meet human needs, such as yield-related traits, flowering time, and pod shattering. Although this process greatly improved traits such as yield and oil content in cultivated soybean, it also led to the loss of some superior characteristics and the genes regulating them; for example, wild soybean has higher protein content, greater stress resistance, and more pods per plant than cultivated soybean [42]. Therefore, fully utilizing wild soybean can broaden the genetic base of soybean and provide excellent materials for positional research and even breeding, making it a focal point in soybean genetic improvement research [43]. Here, the CC/TT variation (S/A amino acid substitution) in the coding region of GmPILS36 shows a signature of directional selection during soybean domestication. Population genetic analysis reveals that the CC genotype predominates in wild soybeans, which have low-yielding phenotypes, while the TT genotype begins to accumulate in landraces and increases significantly in improved varieties, showing a strong correlation with seed weight per plant and 100-seed weight. This pattern is consistent with the reduced polymorphism indicated by the θπ value and the significant differentiation between improved varieties and wild soybeans reflected by the FST value, suggesting that artificial selection has acted directionally at this locus, favoring retention of the yield-enhancing TT genotype.
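To make the two statistics mentioned above concrete, the following sketch computes per-site nucleotide diversity and a simple two-population FST for a single biallelic SNP. It is a simplified illustration, not the estimator or software used in this study: FST is computed as Wright's (H_T - H_S)/H_T with equal population weights, no windowing is applied, and all allele counts are hypothetical.

```python
def site_pi(alt_count: int, n_chromosomes: int) -> float:
    """Unbiased per-site nucleotide diversity for a biallelic SNP.

    pi = n/(n-1) * 2p(1-p), where p is the alternate-allele frequency
    among n sampled chromosomes.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    p = alt_count / n_chromosomes
    return 2.0 * p * (1.0 - p) * n_chromosomes / (n_chromosomes - 1)


def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's FST = (H_T - H_S) / H_T for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # allele fixed everywhere: no differentiation to measure
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical counts of the T allele (not data from this study)
    wild_t, wild_n = 2, 40
    improved_t, improved_n = 36, 40
    print("pi wild:", site_pi(wild_t, wild_n))
    print("pi improved:", site_pi(improved_t, improved_n))
    print("FST:", fst_two_pops(wild_t / wild_n, improved_t / improved_n))
```

Under this simplified definition, a locus where one allele has nearly replaced the other in improved varieties gives low within-group diversity and high FST relative to the wild group, which is the pattern the text interprets as directional selection.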
Furthermore, the TT genotype of this CC/TT variation can serve as an efficient molecular marker for rapidly screening advanced-generation breeding materials through marker-assisted selection (MAS), thereby shortening the breeding cycle. The TT genotype can also be combined with superior alleles of other yield-related genes to potentially achieve cumulative enhancements in yield traits. It remains necessary to determine, through protein structural modeling and functional validation, whether the S/A substitution is located within the auxin-binding domain or a transmembrane region of the PILS protein; this will clarify its molecular mechanism and provide a theoretical basis for precise editing of this locus. In the future, new allelic variations at this locus can be created using gene editing technologies to expand the diversity of breeding resources.
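A genotype-to-phenotype comparison of the kind underlying the CC/TT association can be sketched as a pooled-variance (Student's) two-sample t statistic. The sketch below is illustrative only: the phenotype values are hypothetical, the function names are not from this study, and converting t to a p value (which requires the t distribution) is left to a statistics package.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def student_t(group_a: Sequence[float], group_b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test.

    Assumes both groups have equal population variance, as in the
    classical Student's t-test.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled sample variance across the two genotype groups
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / std_err
    return t_stat, df


if __name__ == "__main__":
    # Hypothetical 100-seed weights (g) for accessions carrying each genotype
    cc_weights = [12.1, 13.4, 11.8, 12.9, 13.0]
    tt_weights = [17.2, 18.5, 16.9, 19.1, 18.0]
    t_stat, df = student_t(tt_weights, cc_weights)
    print(f"t = {t_stat:.3f}, df = {df}")
```

A positive t in this orientation means the TT group has the larger mean; with real data, population structure among accessions should also be considered before reading such a difference as a causal effect of the variant.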

4.4. The Functions and Molecular Mechanisms of GmPILS Genes in Soybean Seed Development Warrant Further Investigation

This study conducted a systematic bioinformatics analysis of the soybean GmPILS gene family, revealing the diversity of its members in terms of gene structure, evolutionary history, and expression patterns, thereby providing an important theoretical basis for understanding the functions of GmPILS. However, this research still has some limitations. First, the functional predictions are primarily based on sequence analysis and publicly available expression data and lack direct experimental validation, such as using gene editing techniques to clarify the specific biological functions of each GmPILS gene in auxin transport and seed development. Second, the tissue expression analysis was limited to RNA-seq results provided by databases, and further validation through qRT-PCR experiments was not performed. Third, the collinearity analysis mainly focused on intraspecific comparisons within soybean; future studies could incorporate more closely related species to construct a more refined evolutionary map.
Additionally, while this study identified the interesting phenomenon that “GmPILS36 is involved in the domestication process of soybean yield and 100-seed weight traits,” this conclusion needs to be further validated in a larger population and with phenotypic data from multiple years. The phylogenetic and molecular characterization framework established in this study lays a solid foundation for further exploration of the functions of GmPILS genes. Building on this framework, future research could proceed in several directions. First, creating GmPILS36 soybean mutants using CRISPR/Cas9 and other technologies is the most effective way to directly validate its physiological functions. Second, comprehensive expression profile analyses across multiple tissues and time points should be conducted and combined with subcellular localization studies to accurately depict the spatiotemporal expression patterns and sites of action of each gene. Third, for GmPILS36, an important candidate gene for soybean yield, it is essential to explore differences in its promoter activity and protein interaction networks to reveal the molecular mechanisms regulating soybean yield and 100-seed weight. Finally, extending the findings of this study in soybean to comparative analyses in other important crops will help uncover core genes with breeding application value, providing new genetic resources for enhancing crop yield through genetic engineering approaches.

5. Conclusions

In this study, we systematically characterized the 44-member soybean PILS gene family through analyses of phylogeny, physicochemical properties, subcellular localization, chromosomal distribution, synteny, and expression patterns, and explored the domestication patterns and functional variations of the yield candidates GmPILS40 and GmPILS36. Phylogenetic analysis divided GmPILS genes into three subfamilies with structurally differentiated conserved motifs. GmPILS genes are randomly distributed across 17 chromosomes, with 27 genes undergoing 48 duplication events, suggesting that gene duplication drives family expansion. Expression analyses showed tissue-specific expression, with GmPILS36 and GmPILS40 exhibiting seed-specific high expression and significant differential expression between cultivated soybean SN14 and wild soybean ZYD06 at the EM1 stage. Population genetic analyses revealed distinct domestication patterns. GmPILS40 experienced a bottleneck effect from wild to cultivated soybeans, with moderate differentiation and low-frequency allele enrichment but no yield-associated superior haplotypes. In contrast, GmPILS36 underwent selection during landrace-to-improved variety domestication, showing reduced polymorphism in improved varieties, moderate differentiation that was highest between improved varieties and wild soybeans, and a coding region CC/TT variation (S/A substitution) significantly associated with seed weight per plant and 100-seed weight. These findings advance understanding of the GmPILS gene family and identify GmPILS36 as an important candidate gene that warrants further functional validation and investigation of the molecular mechanisms regulating yield.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16020226/s1, Figure S1. Distribution of seed-related phenotypes for 547 soybean accessions.

Author Contributions

Conceptualization: Z.Q. and Q.C.; resources: M.Y., Z.Q. and Q.C.; methodology: S.W., J.H. and C.T.; project administration: Z.Q.; supervision: Z.Q. and Q.C.; writing, review and editing: S.W., J.H., C.T. and Z.Q.; funding acquisition: Z.Q. and Q.C.; writing, original draft: S.W., J.H. and C.T.; data curation: J.H., S.W., L.Z., M.Y. and F.C.; formal analysis: Y.Z. and X.L.; investigation: Y.Z., X.L. and H.X.; visualization: S.W., J.H. and C.T.; software: S.W. and L.Z.; validation: S.W., C.T. and J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Natural Science Foundation of Heilongjiang Province of China (grant number: ZL2024C007), and the APC was funded by Zhaoming Qi.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, R.; Estelle, M. Diversity and specificity: Auxin perception and signaling through the TIR1/AFB pathway. Curr. Opin. Plant Biol. 2014, 21, 51–58.
  2. Strader, L.C.; Zhao, Y. Auxin perception and downstream events. Curr. Opin. Plant Biol. 2016, 33, 8–14.
  3. Guo, F.; Han, N.; Xie, Y.; Fang, K.; Yang, Y.; Zhu, M.; Wang, J.; Bian, H. The miR393a/target module regulates seed germination and seedling establishment under submergence in rice (Oryza sativa L.). Plant Cell Environ. 2016, 39, 2288–2302.
  4. Cao, J.; Li, G.; Qu, D.; Li, X.; Wang, Y. Into the seed: Auxin controls seed development and grain yield. Int. J. Mol. Sci. 2020, 21, 1662.
  5. Feraru, E.; Vosolsobě, S.; Feraru, M.I.; Petrášek, J.; Kleine-Vehn, J. Evolution and structural diversification of PILS putative auxin carriers in plants. Front. Plant Sci. 2012, 3, 227.
  6. Bogaert, K.A.; Blomme, J.; Beeckman, T.; De Clerck, O. Auxin’s origin: Do PILS hold the key? Trends Plant Sci. 2022, 27, 227–236.
  7. Sauer, M.; Kleine-Vehn, J. PIN-FORMED and PIN-LIKES auxin transport facilitators. Development 2019, 146, dev168088.
  8. Zheng, Q.; Meng, X.; Fan, X.; Chen, S.; Sang, K.; Yu, J.; Zhou, Y.; Xia, X. Regulation of PILS genes by bZIP transcription factor TGA7 in tomato plant growth. Plant Sci. 2025, 352, 112359.
  9. Béziat, C.; Barbez, E.; Feraru, M.I.; Lucyshyn, D.; Kleine-Vehn, J. Light triggers PILS-dependent reduction in nuclear auxin signalling for growth transition. Nat. Plants 2017, 3, 17105.
  10. Feraru, E.; Feraru, M.I.; Barbez, E.; Waidmann, S.; Sun, L.; Gaidora, A.; Kleine-Vehn, J. PILS6 is a temperature-sensitive regulator of nuclear auxin input and organ growth in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 2019, 116, 3893–3898.
  11. Wang, Y.; Chai, C.; Valliyodan, B.; Maupin, C.; Annen, B.; Nguyen, H.T. Genome-wide analysis and expression profiling of the PIN auxin transporter gene family in soybean (Glycine max). BMC Genom. 2015, 16, 951.
  12. Hu, Y.; Liu, Y.; Lu, L.; Tao, J.J.; Cheng, T.; Jin, M.; Wang, Z.Y.; Wei, J.J.; Jiang, Z.H.; Sun, W.C.; et al. Global analysis of seed transcriptomes reveals a novel PLATZ regulator for seed size and weight control in soybean. New Phytol. 2023, 240, 2436–2454.
  13. Chen, F.; Beecher, B.S.; Morris, C.F. Physical mapping and a new variant of Puroindoline b-2 genes in wheat. Theor. Appl. Genet. 2010, 120, 745–751.
  14. Ji, X.; Du, Y.; Li, F.; Sun, H.; Zhang, J.; Li, J.; Peng, T.; Xin, Z.; Zhao, Q. The basic helix-loop-helix transcription factor, OsPIL15, regulates grain size via directly targeting a purine permease gene OsPUP7 in rice. Plant Biotechnol. J. 2019, 17, 1527–1537.
  15. Zhao, H.; Maokai, Y.; Cheng, H.; Guo, M.; Liu, Y.; Wang, L.; Chao, S.; Zhang, M.; Lai, L.; Qin, Y. Characterization of auxin transporter AUX, PIN and PILS gene families in pineapple and evaluation of expression profiles during reproductive development and under abiotic stresses. PeerJ 2021, 9, e11410.
  16. Hu, Y.; Liu, Y.; Tao, J.J.; Lu, L.; Jiang, Z.H.; Wei, J.J.; Wu, C.M.; Yin, C.C.; Li, W.; Bi, Y.D.; et al. GmJAZ3 interacts with GmRR18a and GmMYC2a to regulate seed traits in soybean. J. Integr. Plant Biol. 2023, 65, 1983–2000.
  17. Zhang, Z.; Gao, L.; Ke, M.; Gao, Z.; Tu, T.; Huang, L.; Chen, J.; Guan, Y.; Huang, X.; Chen, X. GmPIN1-mediated auxin asymmetry regulates leaf petiole angle and plant architecture in soybean. J. Integr. Plant Biol. 2022, 64, 1325–1338.
  18. Kumar, S.; Liu, Z.B.; Sanyour-Doyel, N.; Lenderts, B.; Worden, A.; Anand, A.; Cho, H.J.; Bolar, J.; Harris, C.; Huang, L.; et al. Efficient gene targeting in soybean using Ochrobactrum haywardense-mediated delivery of a marker-free donor template. Plant Physiol. 2022, 189, 585–594.
  19. Majidian, P.; Ghorbani, H.R.; Farajpour, M. Achieving agricultural sustainability through soybean production in Iran: Potential and challenges. Heliyon 2024, 10, e26389.
  20. Wei, S.; Yu, Z.; Du, F.; Cao, F.; Yang, M.; Liu, C.; Qi, Z.; Chen, Q.; Zou, J.; Wang, J. Integrated transcriptomic and proteomic characterization of a chromosome segment substitution line reveals the regulatory mechanism controlling the seed weight in soybean. Plants 2024, 13, 908.
  21. Song, S.; Wang, Z.; Ren, Y.; Sun, H. Full-length transcriptome analysis of the ABCB, PIN/PIN-LIKES, and AUX/LAX families involved in somatic embryogenesis of Lilium pumilum DC. Fisch. Int. J. Mol. Sci. 2020, 21, 453.
  22. Chen, C.; Chen, H.; Zhang, Y.; Thomas, H.R.; Frank, M.H.; He, Y.; Xia, R. TBtools: An integrative toolkit developed for interactive analyses of big biological data. Mol. Plant 2020, 13, 1194–1202.
  23. Wei, S.; Yong, B.; Jiang, H.; An, Z.; Wang, Y.; Li, B.; Yang, C.; Zhu, W.; Chen, Q.; He, C. A loss-of-function mutant allele of a glycosyl hydrolase gene has been co-opted for seed weight control during soybean domestication. J. Integr. Plant Biol. 2023, 65, 2469–2489.
  24. Hu, R.; Fan, C.; Li, H.; Zhang, Q.; Fu, Y.F. Evaluation of putative reference genes for gene expression normalization in soybean by quantitative real-time RT-PCR. BMC Mol. Biol. 2009, 10, 93.
  25. Livak, K.J.; Schmittgen, T.D. Analysis of relative gene expression data using real-time quantitative PCR and the 2−ΔΔCT method. Methods 2001, 25, 402–408.
  26. Qi, Z.; Guo, C.; Li, H.; Qiu, H.; Li, H.; Jong, C.; Yu, G.; Zhang, Y.; Hu, L.; Wu, X.; et al. Natural variation in Fatty Acid 9 is a determinant of fatty acid and protein content. Plant Biotechnol. J. 2024, 22, 759–773.
  27. Librado, P.; Rozas, J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 2009, 25, 1451–1452.
  28. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
  29. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.; Bender, D.; Maller, J.; Sklar, P.; de Bakker, P.I.; Daly, M.J.; et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575.
  30. Clement, M.; Posada, D.; Crandall, K.A. TCS: A computer program to estimate gene genealogies. Mol. Ecol. 2000, 9, 1657–1659.
  31. Barbez, E.; Kubeš, M.; Rolčík, J.; Béziat, C.; Pěnčík, A.; Wang, B.; Rosquete, M.R.; Zhu, J.; Dobrev, P.I.; Lee, Y.; et al. A novel putative auxin carrier family regulates intracellular auxin homeostasis in plants. Nature 2012, 485, 119–122.
  32. Liu, S.; Zheng, Y.; Zhao, L.; Gulam, M.; Ullah, A.; Xie, G. CALMODULIN-LIKE16 and PIN-LIKES7a cooperatively regulate rice seedling primary root elongation under chilling. Plant Physiol. 2024, 195, 1660–1680.
  33. Li, Y.; He, Y.; Liu, Z.; Qin, T.; Wang, L.; Chen, Z.; Zhang, B.; Zhang, H.; Li, H.; Liu, L.; et al. OsSPL14 acts upstream of OsPIN1b and PILS6b to modulate axillary bud outgrowth by fine-tuning auxin transport in rice. Plant J. 2022, 111, 1167–1182.
  34. Zhai, L.; Yang, L.; Xiao, X.; Jiang, J.; Guan, Z.; Fang, W.; Chen, F.; Chen, S. PIN and PILS family genes analyses in Chrysanthemum seticuspe reveal their potential functions in flower bud development and drought stress. Int. J. Biol. Macromol. 2022, 220, 67–78.
  35. Pan, J.M.; Tian, S.R.; Liang, Y.L.; Zhu, Y.L.; Zhou, D.G.; Que, Y.X.; Ling, H.; Huang, N. Identification and expression analysis of PIN-LIKES gene family in sugarcane. Acta Agron. Sin. 2023, 49, 414–425.
  36. Dong, Y.K.; Huang, D.Q.; Gao, Z.; Chen, X. Identification, expression profile of soybean PIN-Like (PILS) gene family and its function in symbiotic nitrogen fixation in root nodules. Acta Agron. Sin. 2022, 48, 353–366.
  37. Schmutz, J.; Cannon, S.B.; Schlueter, J.; Ma, J.; Mitros, T.; Nelson, W.; Hyten, D.L.; Song, Q.; Thelen, J.J.; Cheng, J.; et al. Genome sequence of the palaeopolyploid soybean. Nature 2010, 463, 178–183.
  38. Force, A.; Lynch, M.; Pickett, F.B.; Amores, A.; Yan, Y.L.; Postlethwait, J. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 1999, 151, 1531–1545.
  39. Panchy, N.; Lehti-Shiu, M.; Shiu, S.H. Evolution of gene duplication in plants. Plant Physiol. 2016, 171, 2294–2316.
  40. Fang, C.; Yang, M.; Tang, Y.; Zhang, L.; Zhao, H.; Ni, H.; Chen, Q.; Meng, F.; Jiang, J. Dynamics of cis-regulatory sequences and transcriptional divergence of duplicated genes in soybean. Proc. Natl. Acad. Sci. USA 2023, 120, e2303836120.
  41. Wilson, A.E.; Liberles, D.A. Dosage balance acts as a time-dependent selective barrier to subfunctionalization. BMC Ecol. Evol. 2023, 23, 14.
  42. Goettel, W.; Zhang, H.; Li, Y.; Qiao, Z.; Jiang, H.; Hou, D.; Song, Q.; Pantalone, V.R.; Song, B.H.; Yu, D.; et al. POWR1 is a domestication gene pleiotropically regulating seed quality and yield in soybean. Nat. Commun. 2022, 13, 3051.
  43. Zheng, H.; Hou, L.; Xie, J.; Cao, F.; Wei, R.; Yang, M.; Qi, Z.; Zhu, R.; Zhang, Z.; Xin, D.; et al. Construction of chromosome segment substitution lines and inheritance of seed-pod characteristics in wild soybean. Front. Plant Sci. 2022, 13, 869455.
Figure 1. Phylogenetic and conserved motif analyses of PILS genes. (A) Phylogenetic analysis of PILS genes in Glycine max, Glycine soja, and Arabidopsis thaliana. Glycine max sequences were taken from the ‘Williams 82’ (Wm82) reference genome and Glycine soja sequences from the ‘PI 483463’ reference genome; both were downloaded from the Phytozome database, https://phytozome-next.jgi.doe.gov/ (accessed on 21 October 2025). Arabidopsis thaliana PILS genes were downloaded from the TAIR database, https://www.arabidopsis.org/ (accessed on 23 October 2025). (B) Conserved motif analysis of soybean GmPILS genes. The sequence data were downloaded from the Phytozome database.
Figure 2. Chromosomal distribution and synteny analysis of GmPILSs. (A) Chromosomal distribution of GmPILSs on the soybean genome. The blue horizontal lines mark the locations of the genes on the chromosomes. (B) Synteny analysis of GmPILSs. The numbers in the boxes are soybean chromosome numbers. Red connecting lines link gene pairs with high similarity in sequence, structure, and function, which are likely derived from duplication of a common ancestral gene.
Figure 3. Expression patterns of GmPILS genes. (A) Expression patterns of GmPILS genes in different tissues. (B) Expression patterns of soybean GmPILS family genes at different stages of seed development in SN14 and ZYD00006 (ZYD06). Flower buds, flower buds that have not yet unfolded. Seed 1, the 1st day after fertilization. Seed 5, the 5th day after fertilization. Seed 10, the 10th day after fertilization. Seed 15, the 15th day after fertilization. (C) Seeds of SN14, R158, and ZYD06 during seed development (Cot, EM1, and EM2). Bar = 2 mm. (D–R) Relative expression of GmPILS candidate genes in SN14 and ZYD06 during seed development. Expression data were obtained by qRT-PCR, with three biological replicates per sample. Cot_SN14 or Cot_ZYD06, seeds of SN14 or ZYD06 at the cotyledon stage. EM1_SN14, seeds of SN14 at early mature stage 1. EM1_ZYD06, seeds of ZYD06 at early mature stage 1. Genes with significantly different expression between SN14 and ZYD06 seeds are shown in red font. The significance of differences in candidate gene expression between samples was calculated using a two-tailed Student’s t-test. *, significance at p < 0.05. ns, no significance.
Figure 4. Population genetic analysis and haplotype analysis of GmPILS40. (A) Nucleotide diversity and selection analyses on chromosome 19. The red dashed line represents the threshold of nucleotide polymorphism θπ, where θπ = 0.5. (B) Fixation index (FST) analysis on chromosome 19. (C) Tajima’s D analysis on chromosome 19. The orange vertical line indicates the location of GmPILS40 in the genome. Nucleotide diversity (π) and FST values were calculated in 500 kb windows with a 20 kb shift. Tajima’s D values were determined along sliding windows of 500 kb. The orange boxes mark the approximately 500 kb region containing GmPILS40. The SNP data for GmPILS40 in 2898 soybean accessions were obtained from SoyOmic, https://ngdc.cncb.ac.cn/soyomics/index (accessed on 20 October 2025). (D) Haplotype analysis of GmPILS40 in 547 soybean accessions. (E) Linkage disequilibrium analysis of GmPILS40. *, the location of the corresponding natural variant SNP. (F–I) Comparative analysis of yield-related phenotypes associated with the three major haplotypes of GmPILS40. (F) Seed weight per plant. (G) 100-seed weight. (H) Seed length. (I) Seed width. Different colors represent different haplotypes of GmPILS40. The SNP data for GmPILS40 in 547 soybean accessions were obtained from SoyCOAD [26]. SoyZH13_19G064200 (Glyma.19G072900) is the gene ID of GmPILS40 in the reference genomes ‘Zhong Huang 13’ and ‘Williams 82’, https://www.soybase.org/ (accessed on 21 October 2025). The p values for phenotype differences between haplotypes were calculated using a two-tailed Student’s t-test. ns, no significance.
Figure 5. Population genetic analysis and haplotype analysis of GmPILS36 on chromosome 17 in 2898 and 547 soybean accessions. (A) Nucleotide diversity and selection analyses of GmPILS36. The red dashed line represents the threshold of nucleotide polymorphism θπ, where θπ = 0.5. (B) Fixation index (FST) analysis of GmPILS36. (C) Tajima’s D analysis of GmPILS36. The orange vertical line indicates the location of GmPILS36 in the genome. Nucleotide diversity (π) and FST values were calculated in 500 kb windows with a 20 kb shift. Tajima’s D values were determined along sliding windows of 500 kb. The orange boxes mark the approximately 500 kb region containing GmPILS36. (D) Haplotype analysis of GmPILS36 in 547 soybean germplasm accessions. (E) Linkage disequilibrium analysis of GmPILS36. *, the location of the corresponding natural variant SNP. (F–I) Comparative analysis of yield-related phenotypes associated with the CC and TT genotypes of GmPILS36. (F) Seed weight per plant. (G) 100-seed weight. (H) Seed length. (I) Seed width. Different colors represent different haplotypes of GmPILS36. SoyZH13_17G151000 (Glyma.17G157300) is the gene ID of GmPILS36 in the reference genomes ‘Zhong Huang 13’ and ‘Williams 82’, https://www.soybase.org/ (accessed on 25 October 2025). The p values for phenotype differences between haplotypes were calculated using a two-tailed Student’s t-test. * and ***, significance at p < 0.05 and p < 0.001, respectively. ns, no significance.
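The windowing scheme described in the caption (500 kb windows shifted by 20 kb) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is not specified here. The function names `site_pi` and `windowed_pi`, the input layout of `(position, alt_count, n_chromosomes)` tuples, and the toy SNPs are all hypothetical, and the sketch assumes biallelic SNPs and that every base in a window is callable.

```python
from typing import List, Tuple


def site_pi(alt_count: int, n: int) -> float:
    """Per-site nucleotide diversity for one biallelic SNP.

    alt_count: copies of the alternative allele; n: sampled chromosomes.
    Returns the probability that two distinct sampled chromosomes differ.
    """
    if n < 2:
        return 0.0
    return 2.0 * alt_count * (n - alt_count) / (n * (n - 1))


def windowed_pi(
    snps: List[Tuple[int, int, int]],
    region_end: int,
    window: int = 500_000,  # window size from the caption
    step: int = 20_000,     # shift from the caption
) -> List[Tuple[int, int, float]]:
    """Average pi per base pair in overlapping windows (1-based, inclusive).

    snps holds (position, alt_count, n) tuples (hypothetical layout).
    Assumes all sites in a window are callable, so the divisor is the
    window length in base pairs.
    """
    last_start = max(region_end - window + 1, 1)
    results = []
    for start in range(1, last_start + 1, step):
        end = min(start + window - 1, region_end)
        total = sum(site_pi(k, n) for pos, k, n in snps if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
    return results


if __name__ == "__main__":
    # Toy data only: three SNPs typed on four chromosomes.
    toy_snps = [(10, 1, 4), (30_000, 2, 4), (520_000, 2, 4)]
    for start, end, pi in windowed_pi(toy_snps, region_end=540_000):
        print(f"{start}-{end}\tpi={pi:.3e}")
```

Because each window overlaps its neighbour by 480 kb, adjacent values are strongly correlated, so a dip around a gene appears as a smooth trough rather than a single point.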
Table 1. Prediction of physicochemical properties and subcellular localization of GmPILSs.
| Gene Name | Gene ID | No. Amino Acids | Theoretical pI | Instability Index | Grand Average of Hydropathicity | Subcellular Localization |
|---|---|---|---|---|---|---|
| GmPILS1 | Glyma.01G156200 | 415 | 5.59 | 32.13 | 0.707 | Cell membrane |
| GmPILS2 | Glyma.01G157700 | 419 | 5.1 | 37.96 | 0.593 | Cell membrane |
| GmPILS3 | Glyma.03G113600 | 424 | 7.54 | 40.39 | 0.725 | Cell membrane |
| GmPILS4 | Glyma.03G126000 | 597 | 9.08 | 34.77 | 0.243 | Cell membrane |
| GmPILS5 | Glyma.05G109800 | 362 | 8.92 | 39.9 | 0.642 | Cell membrane |
| GmPILS6 | Glyma.06G297700 | 122 | 4.68 | 40.87 | 0.698 | Cell membrane |
| GmPILS7 | Glyma.07G102500 | 605 | 8.64 | 33.98 | 0.097 | Cell membrane |
| GmPILS8 | Glyma.07G113100 | 418 | 8.71 | 33.92 | 0.61 | Cell membrane |
| GmPILS9 | Glyma.07G164600 | 556 | 6.57 | 38.09 | 0.185 | Cell membrane |
| GmPILS10 | Glyma.07G217900 | 665 | 7.72 | 37.66 | 0.159 | Cell membrane and nucleus |
| GmPILS11 | Glyma.08G054700 | 603 | 8.64 | 33.15 | 0.054 | Cytoplasm |
| GmPILS12 | Glyma.09G061800 | 443 | 9.12 | 42.48 | 0.53 | Cytoplasm |
| GmPILS13 | Glyma.09G097300 | 487 | 9.03 | 46.98 | 0.431 | Cytoplasm |
| GmPILS14 | Glyma.09G116100 | 440 | 5.3 | 36.6 | 0.644 | Cell membrane |
| GmPILS15 | Glyma.09G117900 | 634 | 7.27 | 37.09 | 0.179 | Cell membrane |
| GmPILS16 | Glyma.09G176300 | 420 | 9.19 | 38.28 | −0.057 | Cell membrane |
| GmPILS17 | Glyma.09G195600 | 414 | 8.97 | 36.89 | 0.594 | Cell membrane |
| GmPILS18 | Glyma.09G196900 | 409 | 5.5 | 36.65 | 0.631 | Cell membrane |
| GmPILS19 | Glyma.09G240500 | 358 | 9.7 | 37.82 | 0.726 | Cell membrane |
| GmPILS20 | Glyma.09G251600 | 377 | 7.53 | 30.08 | 0.683 | Cell membrane |
| GmPILS21 | Glyma.09G271100 | 414 | 8.09 | 36.57 | 0.732 | Cell membrane |
| GmPILS22 | Glyma.10G189000 | 400 | 8.82 | 40.82 | 0.716 | Cell membrane |
| GmPILS23 | Glyma.10G189100 | 313 | 8.46 | 37.58 | 0.527 | Cell membrane |
| GmPILS24 | Glyma.11G087300 | 419 | 5.1 | 41.4 | 0.594 | Cell membrane |
| GmPILS25 | Glyma.11G088600 | 415 | 5.18 | 31.16 | 0.709 | Cell membrane |
| GmPILS26 | Glyma.13G038300 | 350 | 5.96 | 41.1 | 0.247 | Chloroplast |
| GmPILS27 | Glyma.13G038400 | 126 | 9.51 | 37.92 | 1.009 | Chloroplast |
| GmPILS28 | Glyma.13G101900 | 642 | 9.1 | 42.09 | 0.12 | Cell membrane |
| GmPILS29 | Glyma.14G120900 | 531 | 8.88 | 37.09 | 0.366 | Cell membrane |
| GmPILS30 | Glyma.15G168100 | 469 | 9.21 | 36.98 | 0.186 | Cell membrane |
| GmPILS31 | Glyma.15G208600 | 492 | 8.27 | 37.63 | 0.344 | Cytoplasm |
| GmPILS32 | Glyma.16G114900 | 594 | 5.43 | 37.59 | 0.553 | Cell membrane |
| GmPILS33 | Glyma.16G115000 | 115 | 7.74 | 28.03 | 1.219 | Cell membrane |
| GmPILS34 | Glyma.16G115500 | 414 | 7.03 | 35.78 | 0.674 | Cell membrane |
| GmPILS35 | Glyma.17G057300 | 637 | 9.05 | 43.64 | 0.141 | Cell membrane |
| GmPILS36 | Glyma.17G157300 | 363 | 8.92 | 39.64 | 0.63 | Cell membrane |
| GmPILS37 | Glyma.18G218300 | 414 | 7.51 | 36.92 | 0.727 | Cell membrane |
| GmPILS38 | Glyma.18G241000 | 369 | 8.3 | 35.2 | 0.707 | Cell membrane |
| GmPILS39 | Glyma.18G255800 | 359 | 9.72 | 40.29 | 0.699 | Cell membrane |
| GmPILS40 | Glyma.19G072900 | 445 | 5.38 | 34.63 | 0.636 | Cell membrane |
| GmPILS41 | Glyma.19G128800 | 481 | 7.69 | 39.96 | 0.14 | Cytoplasm |
| GmPILS42 | Glyma.20G014300 | 666 | 7.21 | 38.62 | 0.147 | Nucleus |
| GmPILS43 | Glyma.20G201600 | 409 | 9.14 | 37.18 | 0.746 | Cell membrane |
| GmPILS44 | Glyma.20G201700 | 259 | 7.71 | 31.93 | 0.634 | Cell membrane |
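To make two columns of Table 1 concrete, the following Python sketch shows how a grand average of hydropathicity (GRAVY) can be computed and how the abstract's labels "hydrophobic" and "stable" can be read from the values. GRAVY is taken here as the mean Kyte-Doolittle hydropathy over all residues. The cutoffs (instability index below 40 read as stable, GRAVY above 0 read as hydrophobic) are the commonly used ProtParam-style conventions and are assumed, not confirmed as the authors' criteria. The helper names `gravy` and `classify` and the toy peptides are hypothetical; the three example rows are copied from Table 1.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a protein sequence.

    Raises KeyError on non-standard residue codes and ValueError on an
    empty sequence, so bad input is not silently averaged.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def classify(instability_index: float, gravy_value: float) -> tuple:
    """Label a protein using assumed conventional cutoffs (see lead-in)."""
    stability = "stable" if instability_index < 40 else "unstable"
    hydropathy = "hydrophobic" if gravy_value > 0 else "hydrophilic"
    return stability, hydropathy


if __name__ == "__main__":
    # Toy peptides (hypothetical, not GmPILS sequences).
    print(round(gravy("AILV"), 3), round(gravy("DEKR"), 3))

    # (instability index, GRAVY) pairs copied from Table 1.
    rows = {
        "GmPILS36": (39.64, 0.63),
        "GmPILS16": (38.28, -0.057),
        "GmPILS13": (46.98, 0.431),
    }
    for name, (ii, g) in rows.items():
        print(name, classify(ii, g))
```

Under these assumed cutoffs, GmPILS16 is the only entry in Table 1 with a negative GRAVY value.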

Share and Cite

MDPI and ACS Style

Wei, S.; Han, J.; Tang, C.; Zhang, L.; Yang, M.; Cao, F.; Zhao, Y.; Li, X.; Xu, H.; Qi, Z.; et al. The PIN-LIKES Auxin Transport Genes Involved in Regulating Yield in Soybean. Agronomy 2026, 16, 226. https://doi.org/10.3390/agronomy16020226

