Simple Summary
Reproductive performance is a key factor influencing sheep production efficiency. The Dorper × Hu hybrid sheep (DHS) combines the rapid growth of Dorper sheep with the high fecundity of Hu sheep and is widely used for lamb production in China. In this study, whole-genome selective sweep analysis and genome-wide association study (GWAS) were used to explore the genetic basis of high fecundity in DHS. The results revealed that DHS possesses a distinct genetic structure and high genetic diversity derived from its parental breeds. Several candidate genes related to reproduction were identified. Moreover, two SNPs—g.88680390 C>A (SLC24A2/MLLT3) and g.18197516 T>C (ABCA1)—were significantly associated with litter size. These findings provide valuable molecular markers for improving reproductive efficiency and advancing genomic breeding in meat sheep.
Abstract
Sheep are an economically important livestock species, and reproductive performance is a key trait affecting productivity. The Dorper × Hu hybrid sheep (DHS), widely bred in China, provides a valuable model for studying the genetic basis of prolificacy. This study aimed to investigate the genomic architecture and identify candidate genes associated with high litter size in DHS using whole-genome selective sweep analysis and genome-wide association study (GWAS). A total of 31 DHS individuals with complete reproductive records were sequenced and compared with publicly available genomic data from 20 Hu sheep (HUS) and 10 Dorper sheep (DPS). Population genetic structure and diversity were assessed using phylogenetic trees, principal component analysis (PCA), and ADMIXTURE analysis. To identify key genomic regions associated with litter size, we performed selective sweep analysis between the polytocous and monotocous subpopulations of DHS using multiple methods within a 50 kb sliding window framework, including FST, π ratio, XP-CLR, and XP-EHH; we also conducted GWAS. DHS exhibited a distinct genetic structure with admixed ancestry and elevated genetic diversity. Genetic diversity analysis showed that DHS retained moderate levels of heterozygosity and polymorphism, comparable to or exceeding those of its parental breeds. Comparative analysis between polytocous and monotocous DHS identified reproduction-associated genes, including MUC1, PLCB4, SIN3A, and ELAVL2, enriched in pathways such as ovarian steroidogenesis, insulin secretion, and circadian entrainment. Furthermore, the GWAS identified 140 significant loci (p < 10⁻⁵) associated with reproductive traits. From these, 10 candidate SNPs were selected for validation through single-marker association analysis in 200 DHS individuals, among which two loci, g.88680390 C>A (SLC24A2/MLLT3) and g.18197516 T>C (ABCA1), showed significant correlations with litter size.
These findings enhance our understanding of the genetic basis of prolificacy in DHS and provide valuable molecular markers for genomic selection in sheep-breeding programs.
1. Introduction
Sheep (Ovis aries) have served as an essential source of meat, wool, and milk for thousands of years. China has a long history of sheep domestication and boasts abundant resources of indigenous breeds, which are generally categorized into three main groups: Mongolian, Kazakh, and Tibetan sheep [1]. Through long-term domestication and selective breeding, these breeds have adapted to diverse ecological regions and evolved distinct desirable traits [2], such as high reproductive performance in Small-Tailed Han and Hu sheep, superior meat quality in Tan sheep, and strong adaptability to harsh environments in Tibetan sheep. To overcome the limitations of indigenous breeds and meet the growing demand for high-quality lamb, crossbreeding local breeds with highly productive exotic breeds followed by long-term selection has become a primary genetic strategy for improving sheep production traits [3]. In China, this approach has yielded several successful composite breeds, including Luxi Black Head sheep [4] and Bamei mutton sheep [5], both of which exhibit enhanced production efficiency. Among them, the Dorper × Hu hybrid sheep (DHS), derived from crossing Dorper sheep (DPS) with Hu sheep (HUS), has emerged as one of the dominant breeds for lamb meat production in China. DHS combines the rapid growth and excellent carcass traits of Dorper sheep with the high prolificacy and non-seasonal estrus of Hu sheep [6], making it a critical genetic resource for meat sheep improvement.
In recent years, the rapid advancement of whole-genome resequencing technology, coupled with the completion of the sheep reference genome [7], has significantly accelerated research into the genetic mechanisms underlying morphological and economic traits in sheep. Whole-genome resequencing has been widely utilized to investigate sheep domestication and evolution [8], as well as to unravel the genetic basis of complex traits—such as reproductive capacity, coat color, horn type, tail type, and body size [9]. The identification of genetic variants associated with these traits deepens our understanding of sheep biology and provides a valuable foundation for genomic selection in breeding programs [10].
Among various production traits, reproductive efficiency plays a pivotal role in determining productivity and economic return in sheep production systems. It is regulated by genetic, hormonal, environmental, and management factors. Litter size in particular is a complex trait with low to moderate heritability [11], controlled by both major genes and polygenes and influenced by environmental factors. Several major genes, such as BMPR1B [12], BMP15, and GDF9 [13], have been shown to significantly affect this trait. In particular, the well-characterized FecB mutation (BMPR1B: A746G/p.Q249R) markedly increases litter size, and this mutation is nearly fixed in Hu sheep [1]. The BMP15 gene plays a vital role in follicular development by promoting granulosa cell proliferation and inhibiting follicle-stimulating hormone expression. GDF9 is also essential for folliculogenesis, with several heterozygous mutations associated with increased litter size. Notably, candidate genes linked to prolificacy vary across breeds, reflecting the polygenic and breed-specific nature of litter size. Therefore, identifying fertility-related genes is critical for elucidating the genetic mechanisms underlying reproductive performance and laying a solid foundation for marker-assisted selection in sheep breeding.
Given that the molecular mechanisms and candidate genes regulating reproductive traits in DHS remain incompletely understood, this study integrated whole-genome resequencing data from 31 DHS individuals with publicly available genomic data from 20 HUS and 10 DPS. Through genetic diversity and genetic structure analyses, we systematically clarified the phylogenetic status of DHS. We then compared selection signals between the polytocous subgroup (ewes with three consecutive multiple-lamb litters) and the monotocous subgroup (ewes with three consecutive single-lamb litters) within the DHS population to identify candidate genes associated with high prolificacy, with the ultimate aim of clarifying the key genetic variants contributing to differences in reproductive performance.
2. Materials and Methods
2.1. DNA Sample and Sequence
A total of 31 Dorper × Hu hybrid sheep (DHS), all three years of age and possessing records of three consecutive lambings, were selected from Genyuan Animal Husbandry Farm in Shanxi Province, China. This included 15 individuals that consistently delivered multiple lambs (prolific/polytocous) and 16 individuals that consistently delivered single lambs (non-prolific/monotocous) across all three parities. In Supplementary Table S1, individuals denoted with a prefix of “M” correspond to monotocous sheep (non-prolific), while those starting with “S” refer to polytocous sheep (prolific). Approximately 5 mL of jugular blood was collected from each individual into EDTA-treated tubes and stored at −20 °C until DNA extraction. To assess genetic background and admixture patterns, whole-genome resequencing data from 20 HUS and 10 DPS were obtained from the Sequence Read Archive (BioProject: PRJNA681929) for comparative analysis.
2.2. Whole-Genome Data Processing
Raw genome sequence data were quality filtered using fastp (v0.23.4) [14] with parameters “-n 1 -q 5 -u 50 -l 100” to remove reads containing an excessive number of ambiguous bases, low-quality reads, or adapter contamination. Clean reads were mapped and aligned to the Ovis aries reference genome (ARS-UI_Ramb_v2.0, https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_016772045.1/, accessed on 1 May 2025) using BWA (v0.7.17) [15], and sequencing depth for each sample was calculated using SAMtools (v1.20) [16]. Variant calling was performed across all samples using the HaplotypeCaller module in GATK (v4.18) [17]. To reduce false positives, SNPs meeting any of the following criteria were filtered out: QD < 2.0, FS > 60.0, MQ < 40.0, SOR > 3.0, MQRankSum < −12.5, or ReadPosRankSum < −8.0. Additional stringent filters were applied to remove: (i) SNP clusters (more than one SNP within 5 bp), (ii) SNPs within 5 bp of indels, and (iii) genotypes with low quality (GQ < 20), which were labeled as “lowGQ”.
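The hard-filtering rules above can be illustrated with a minimal Python sketch. It is not the GATK implementation: the function names `failed_filters` and `clustered_positions` are hypothetical, annotations missing from a record are assumed to be skipped rather than failed, and "more than one SNP within 5 bp" is interpreted here as two neighbouring SNPs lying at most 5 bp apart.

```python
# Hard-filter thresholds as quoted in the text; a SNP tripping any one is removed.
HARD_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "SOR": lambda v: v > 3.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def failed_filters(annotations):
    """Return the names of the thresholds a SNP fails.

    `annotations` maps annotation names to floats. Annotations absent
    from the record are skipped rather than failed (an assumption).
    """
    return [name for name, fails in HARD_FILTERS.items()
            if name in annotations and fails(annotations[name])]


def clustered_positions(positions, max_gap=5):
    """Flag SNPs lying within `max_gap` bp of a neighbouring SNP.

    `positions` must be sorted and come from a single chromosome.
    """
    flagged = set()
    for left, right in zip(positions, positions[1:]):
        if right - left <= max_gap:
            flagged.update((left, right))
    return sorted(flagged)


print(failed_filters({"QD": 1.4, "FS": 72.0, "MQ": 58.0}))
print(clustered_positions([100, 103, 200, 300, 304, 310]))
```

A record is kept only when `failed_filters` returns an empty list and its position is not flagged by `clustered_positions`.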
2.3. Population Structure Analysis
Population genetic structure was assessed using three complementary approaches: Neighbor-Joining (NJ) tree construction, Principal Component Analysis (PCA), and ancestry inference. NJ trees were constructed in MEGA (v11.0.13) [18] using pairwise genetic distance matrices calculated by PLINK. PCA was performed using GCTA (v1.94.1) [19] based on the genomic relationship matrix calculated using the VanRaden method. ADMIXTURE (v1.30) [20] was employed to estimate individual ancestry components, with the number of ancestral populations set to K = 2 and K = 3.
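To make the genomic relationship matrix concrete, the sketch below computes it for a toy genotype table following VanRaden's first method, G = ZZ′ / (2 Σ p(1 − p)), where Z holds allele counts centred by twice the allele frequency. This is an illustration in plain Python, not the GCTA code path; the function name `vanraden_grm` and the toy data are hypothetical, missing genotypes are assumed absent, and the exact scaling GCTA applies may differ.

```python
def vanraden_grm(genotypes):
    """Genomic relationship matrix, VanRaden's first method.

    `genotypes` is a list of individuals, each a list of 0/1/2 allele
    counts with no missing values (an assumption of this sketch).
    G = Z Z' / (2 * sum_j p_j (1 - p_j)), where Z = M - 2p.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic")
    # Centre each genotype by twice the allele frequency.
    z = [[ind[j] - 2 * freqs[j] for j in range(m)] for ind in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n)] for i in range(n)]


# Three individuals by three SNPs (toy data, not from the study).
toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
grm = vanraden_grm(toy)
for row in grm:
    print([round(value, 3) for value in row])
```

The principal components used for PCA are the leading eigenvectors of this matrix; the eigendecomposition itself is omitted here to keep the sketch dependency-free.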
2.4. Genetic Diversity Analysis
Genetic diversity indices including observed heterozygosity (Ho), expected heterozygosity (He), polymorphic marker ratio (PN), minor allele frequency (MAF), and polymorphism information content (PIC) were calculated using PLINK [21]. Runs of homozygosity (ROH) were identified using the following criteria: a minimum length of 300 kb, at least 50 SNPs per ROH, and a SNP density of one per 50 kb. Up to 5 missing calls and 3 heterozygous calls were allowed per window, with a maximum gap of 300 kb between adjacent SNPs. Linkage disequilibrium (LD) decay was analyzed using PopLDdecay (v3.42) [22]. The parameters used were: MaxDist 1,000,000, MAF 0.05, Het 0.9, and Miss 0.1.
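For reference, the per-SNP indices named above follow from genotype counts by standard formulas. The sketch below is illustrative only and is not the PLINK workflow used in the study; the function names and the example counts are hypothetical, and the PIC expression is the usual biallelic form, 1 − (p² + q²) − 2p²q².

```python
def snp_diversity(n_aa, n_ab, n_bb):
    """Diversity indices for one biallelic SNP from its genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    return {
        "Ho": n_ab / n,                                   # observed heterozygosity
        "He": 2 * p * q,                                  # expected under Hardy-Weinberg
        "MAF": min(p, q),                                 # minor allele frequency
        "PIC": 1 - (p * p + q * q) - 2 * p * p * q * q,   # biallelic PIC
    }


def polymorphic_ratio(mafs):
    """PN: share of markers that are polymorphic (MAF > 0) in a population."""
    return sum(1 for maf in mafs if maf > 0) / len(mafs)


stats = snp_diversity(n_aa=25, n_ab=50, n_bb=25)
print(stats)
print(polymorphic_ratio([0.0, 0.1, 0.5, 0.0]))
```

Population-level Ho, He, MAF, and PIC are then averages of these per-SNP values across all markers.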
2.5. Selective Sweep Identification
Genomic selection signals were identified using VCFtools (v0.1.16) [23] for fixation index (FST) and nucleotide diversity (π) ratios, XP-CLR (v1.1.2) [24] for cross-population composite likelihood ratio (XP-CLR), and selscan (v1.2.0) [25] for XP-EHH. All analyses were conducted using a 50 kb sliding window with a 10 kb step. Candidate regions were defined as the top 1% windows ranked by scores from each selection method. Genes located within these candidate regions were annotated and analyzed for functional enrichment using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways via DAVID [26].
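The windowing and top-1% rule described above can be sketched as follows. This is a simplified illustration, not the output handling of VCFtools, XP-CLR, or selscan: the helper names are hypothetical, incomplete windows at the chromosome end are dropped by assumption, and a higher score is assumed to mean a stronger signal (for the π ratio, the direction depends on which group is in the numerator).

```python
def sliding_windows(chrom_length, size=50_000, step=10_000):
    """Yield half-open (start, end) windows; the partial tail is dropped."""
    start = 0
    while start + size <= chrom_length:
        yield (start, start + size)
        start += step


def top_windows(scores, fraction=0.01):
    """Return the window keys in the top `fraction`, strongest first.

    `scores` maps a window key to its statistic (higher = stronger).
    """
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


windows = list(sliding_windows(100_000))
print(len(windows), windows[0], windows[-1])

# 200 windows with made-up scores: the top 1% is the two highest.
toy_scores = {index: float(index) for index in range(200)}
print(top_windows(toy_scores))
```

Genes overlapping the returned windows would then be carried forward to annotation and enrichment analysis.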
2.6. Genome-Wide Association Study (GWAS)
A GWAS was conducted using WGS data from 31 DHS ewes (15 polytocous and 16 monotocous), all aged three years with records from three consecutive lambings. The Generalized Linear Model (GLM) implemented in GEMMA [27] was utilized to identify SNPs associated with prolificacy traits. Phenotypes were treated as a binary variable: ewes with three consecutive multiple-lamb litters were classified as polytocous (coded as 1), while those with three consecutive single-lamb litters were classified as monotocous (coded as 0). To correct for population stratification, the top ten principal components (PCs) derived from PCA were incorporated into the model.
2.7. Targeted Genotyping and Association Analysis
Based on production records, 200 DHS with three consecutive lambing events were selected for targeted SNP genotyping. Jugular blood samples (5 mL) were collected into EDTA-treated tubes for DNA extraction. Candidate nonsynonymous SNPs were identified. Genotyping was performed using the MassARRAY® platform (Agena Bioscience, San Diego, CA, USA). DNA samples were quality-controlled prior to PCR amplification of target fragments. Single-base extension reactions were carried out using specific extension primers. The resulting products were purified, spotted onto a SpectroCHIP array, and analyzed by mass spectrometry. Genotyping was achieved by distinguishing products based on molecular weight differences. All MassARRAY genotyping assays were conducted by Beijing Compass Biotechnology Co., Ltd. Detailed primer information is provided in Table S8.
For association analysis, consistent with the GLM framework used in the GWAS (Section 2.6), SNP-trait associations were also evaluated using a GLM, but adapted for the analysis of quantitative traits (litter size) with repeated measurements. The model included parity as a fixed effect to account for its contribution to litter size variation:
$$y = \mu + x_i \beta_i + Z p + e$$
where $y$ is the litter size, $\mu$ is the population mean, $x_i$ is the genotype vector for SNP $i$, $\beta_i$ is the genetic effect of SNP $i$, $p$ is the parity effect, $Z$ is the design matrix for $p$, and $e$ is the residual error.
3. Results
3.1. Genome Resequencing and Identification of Single Nucleotide Polymorphisms
Whole-genome sequencing data were generated for 31 DHS, and publicly available genomic data from 30 sheep (10 DPS and 20 HUS; PRJNA681929 [28]) were also included in the analysis. In total, the raw sequencing data amounted to 2067.57 G, of which 2023.03 G remained after quality filtering. The average Q30 was 94.26%, and the GC content was 44.82% (Supplementary Table S1). All 61 samples were aligned to the sheep reference genome, achieving an average alignment rate of 99.86% (Supplementary Table S2) and an average sequencing depth of 10.83× (Supplementary Table S3). After variant calling and filtering, an average of 12.52 million SNPs were identified per DHS individual, 12.06 million per DPS individual, and 12.39 million per HUS individual (Supplementary Table S4). Most SNPs were located in intergenic and intronic regions, while a smaller proportion were found in exonic regions, untranslated regions (UTRs), and splicing sites. The genomic distribution density of SNPs is shown in Supplementary Figure S1.
3.2. Population Genetic Structure
Based on genome data, we analyzed the genetic relationships among the three sheep populations. The phylogenetic tree, constructed using individual SNP variants, revealed that DHS, DPS, and HUS were clearly grouped into three distinct genetic clusters (Figure 1A). PCA further supported the genetic differentiation among the three populations: PC1 and PC2 explained 4.55% and 3.61% of the total genetic variation, respectively, and clearly separated the individuals into three clusters corresponding to DHS, DPS, and HUS (Figure 1B). Ancestry component analysis revealed that when K = 2, DPS and HUS each formed distinct ancestral groups, while DHS exhibited a higher proportion of DPS ancestry (Figure 1C). When K = 3, each population exhibited a unique genetic signature, allowing clearer differentiation among the three groups.
Figure 1.
Population Genetics Analyses of Samples. (A) Phylogenetic trees of three sheep populations constructed based on the neighbor-joining method; (B) PCA results for three sheep populations; (C) Analysis of ancestral composition of three sheep populations.
3.3. Population Genetic Diversity
We systematically assessed genetic diversity, polymorphism, and genome-wide patterns among the three sheep populations to investigate how hybridization has shaped the genetic background of DHS. Nucleotide diversity (π) analysis revealed that DHS exhibited diversity levels comparable to HUS and greater than DPS (Figure 2A). ROH analysis indicated that most ROHs were concentrated in the 0–1 Mb range across all populations, but DPS had significantly more ROHs in the 1–2 Mb, 2–4 Mb, and 4–8 Mb intervals (Figure 2B). The Ho, He, and MAF of DHS were 0.250, 0.249, and 0.177, respectively—values slightly higher than those observed in DPS and HUS. The PN in DHS reached 0.940, significantly higher than in DPS (0.668) and HUS (0.846), indicating that more polymorphic functional loci were retained due to its hybrid background (Figure 2C). The MAF distribution revealed that DHS had the lowest proportion of SNPs in the 0.4–0.5 interval (11.19%), a trend similar to those observed in DPS and HUS, indicating broadly comparable allele frequency patterns across the three populations (Figure 2D). LD analysis revealed that DHS exhibited lower LD levels across all genomic distances, particularly over short distances, with DPS showing the highest LD and HUS an intermediate degree (Figure 2E). These findings, consistent with the ROH patterns, suggest that hybridization in DHS increased genetic diversity, reduced linkage disequilibrium, and contributed to a broader genomic foundation conducive to future breeding improvement.
Figure 2.
Genetic Diversity Statistics of Three Sheep Breeds. (A) Whole-genome distribution of π in three populations; (B) Distribution of total ROH lengths by length category for the three populations; (C) Statistics of genetic diversity parameters of the three populations, including Ho, He, PN, and MAF; (D) Distribution of MAF in three populations; (E) Genome-wide average LD decay for the three breeds.
3.4. Selection Signatures Between the Polytocous and Monotocous Sheep
Prior to analyzing novel selection signatures for prolificacy, we first confirmed the status of the major prolificacy variant, FecB (BMPR1B c.746A>G). The prolific B allele was carried by all 31 DHS individuals, resulting in minimal polymorphism at this locus. Specifically, the 15 polytocous individuals comprised 14 B+ and 1 BB genotypes, while the 16 monotocous individuals consisted of 13 B+ and 3 BB genotypes. Because every ewe carried the B allele and genotype frequencies differed little between the two groups, FecB did not register as a significant selective sweep region in this within-population comparison. This necessitated a search for other loci contributing to litter size variation.
To uncover the genetic basis of prolificacy, we conducted selective sweep analysis between polytocous and monotocous DHS using four methods. Distinct selection signals were identified on multiple chromosomes, with FST, π ratio, XP-CLR, and XP-EHH (Figure 3A) collectively highlighting key loci potentially associated with reproductive adaptation. In the top 1% of selected genomic regions, FST, π ratio, XP-CLR, and XP-EHH identified 470, 707, 1323, and 532 candidate genes, respectively (Figure 3B; Supplementary Table S5). LOC101109111 and LOC101109377 were consistently identified by all four methods, indicating their potential key roles in regulating litter size. Additional genes, such as MUC1, MSH3, FNBP1L, TRIM46, CADM2, and CEP128, were identified by at least three methods, suggesting they may be under multiple selection pressures and contribute to the genetic regulation of prolificacy. GO and KEGG analyses (Figure 3C; Supplementary Table S6) of candidate genes from selective sweep regions revealed enrichment in reproductive-related biological processes and pathways, including follicle development, hormonal regulation, ovarian steroidogenesis, insulin secretion, and circadian rhythm, suggesting their roles in regulating prolificacy and reproductive cycles.
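As a worked check on the FecB genotype counts reported above, the B allele frequency in each group can be computed directly. The short sketch below uses only those counts; the function name `b_allele_frequency` is hypothetical.

```python
def b_allele_frequency(n_bb, n_b_plus, n_plus_plus=0):
    """Frequency of the FecB B allele from BB, B+, and ++ genotype counts."""
    n = n_bb + n_b_plus + n_plus_plus
    return (2 * n_bb + n_b_plus) / (2 * n)


polytocous = b_allele_frequency(n_bb=1, n_b_plus=14)   # 16 of 30 alleles
monotocous = b_allele_frequency(n_bb=3, n_b_plus=13)   # 19 of 32 alleles
print(f"polytocous: {polytocous:.3f}, monotocous: {monotocous:.3f}")
```

The two frequencies (about 0.53 and 0.59) are close, which is consistent with the locus showing little differentiation between the polytocous and monotocous groups.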
Figure 3.
Selection signal analysis and functional enrichment of genes associated with litter size in DHS. (A) Manhattan plots of genome-wide selection signatures between polytocous and monotocous sheep, based on FST, π ratio, XP-CLR, and XP-EHH. Different colors indicate different chromosomes. (B) Venn diagram of candidate genes identified by different selection methods. (C) GO and KEGG enrichment analysis of genes identified by at least two methods.
3.5. Genome-Wide Association Analysis of Litter Size in DHS
A GWAS was performed using the GLM to identify genetic variants associated with litter size in DHS. The results are presented in Figure 4 (Manhattan and Q–Q plots). The Q–Q plot showed that all quantiles were distributed along the expected diagonal line (slope = 1), indicating that the population structure and genetic background were well controlled and the association signals were reliable. Using a threshold of p < 10⁻⁵, the analysis yielded 140 significant SNPs, which were annotated to 83 candidate genes associated with litter size (Supplementary Tables S7 and S8). Among these, 12 SNPs met a more stringent significance level of p < 10⁻⁶. These 12 highly significant SNPs were mainly clustered on Chromosome 2 (2 loci) and Chromosome 25 (6 loci).
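A Q–Q plot that follows the diagonal is often summarized by the genomic inflation factor λ, the ratio of the observed median test statistic to its expected value under the null. The study does not report λ; the sketch below only shows how such a value could be derived from a list of GWAS p-values, assuming two-sided tests with one degree of freedom. The function name `genomic_inflation` and the toy p-values are hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided, 1-df p-values in (0, 1]."""
    std_normal = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square via the squared z-score.
    chi2 = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution (about 0.455).
    expected_median = std_normal.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# Evenly spaced p-values mimic a null distribution, so lambda is close to 1.
null_like = [(i + 0.5) / 1001 for i in range(1001)]
lam = genomic_inflation(null_like)
print(round(lam, 3))
```

Values near 1 agree with a Q–Q plot lying on the diagonal, whereas values well above 1 would point to residual stratification.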
Figure 4.
Manhattan Plot and QQ Plot of Genome-Wide Association Analysis for the prolificacy trait (polytocous vs. monotocous) of Dorper × Hu Hybrid Sheep. The phenotype represents the overall prolificacy status derived from the ewes’ reproductive records across three consecutive parities. The blue shadow in QQ Plot indicates the confidence interval.
To validate the GWAS findings and identify potential functional mutations, targeted genotyping was performed on 200 DHS exhibiting three consecutive lambing records. The 10 candidate SNPs selected for MassARRAY® genotyping were chosen based on two primary criteria: (1) their highly significant GWAS p-values, which ranged from 8.77 × 10⁻⁹ to 2.94 × 10⁻⁶, and (2) the known biological functions of their corresponding genes in reproduction and development. Specifically, genes harboring these selected SNPs included: SLC24A2 and MLLT3 (involved in cell signaling and gene transcription regulation); PAX3 and SGPP2 (related to embryonic development and lipid metabolism); ABCA1 (affects steroid hormone biosynthesis and ovarian function); and NRG3 and KCNU1 (linked to neuroendocrine and sperm maturation processes). These 10 SNPs, located on Chromosomes 2, 3, 14, 23, 25, and 26, were then genotyped using the MassARRAY® platform, which showed high detection rates ranging from 81.5% to 100% (Table 1).
Table 1.
MassARRAY® SNP Genotyping Site Information.
Subsequently, using the full dataset of 200 DHS ewes, the GLM was used to analyze the association between each SNP genotype and the number of offspring (Table 2). The results revealed that two sites, g.88680390 C>A and g.18197516 T>C, showed significant associations with litter size (p < 0.05). Specifically, ewes carrying the AA genotype at g.88680390 exhibited consistently higher average litter sizes across the first to third parities compared with other genotypes. Similarly, the TT genotype at g.18197516 was associated with larger litter sizes than the heterozygous or homozygous mutant types. These results suggest that the identified loci may have favorable allelic effects on reproductive performance in DHS and could serve as molecular markers for the selection of high-prolificacy individuals.
Table 2.
Association Analysis Between SNP Mutation Sites and Litter Size in 200 DHS using Three Consecutive Lambing Records.
4. Discussion
This study attempted to identify the genetic characteristics of DHS using whole-genome data, focusing on population genetic structure and genetic diversity. The results highlight the potential and value of DHS as a composite breed with distinct genetic features for future breeding and improvement programs.
Population genetic structure analyses supported the classification of DHS as a distinct genetic cluster. These findings indicate that DHS has developed a stable genetic foundation through hybridization and has diverged from its parental populations, rather than remaining an intermediate type. These findings align with previous studies on the Purunã composite cattle breed [29] and prolific Suffolk sheep [30], further demonstrating that systematic hybrid breeding can generate novel breeds with independent genetic structures through population recombination. This provides valuable insights for meat sheep-breeding strategies.
DHS displayed greater genetic polymorphism than both HUS and DPS. Genetic diversity indicators, including nucleotide diversity (π), Ho, He, PN, and MAF, were all elevated in DHS, suggesting a greater accumulation of genetic variation during the hybridization process [31]. These findings imply that DHS harbors enhanced genetic potential for environmental adaptation and selective breeding. They are consistent with reports of high genetic diversity and low inbreeding in Luxi Black Head sheep [4], Bamei mutton sheep [5], and Creole goats [32], which underscores the importance of hybrid breeding strategies in preserving genetic variability. ROH analysis showed that DHS predominantly contained short ROH segments, with relatively few long fragments. Combined with faster LD decay, this indicates a lower risk of inbreeding and a more relaxed population structure.
We identified overlapping candidate genes and enriched pathways in polytocous populations. LOC101109111 and LOC101109377 were consistently detected across all four selection methods. Although not extensively characterized previously, these genes may represent novel regulators of litter size. Several previously reported genes were also identified by three or more methods and appear to be under cumulative selection pressure. These include MUC1, which affects the uterine environment and embryo implantation and significantly influences litter size in pigs [33]; PLCB4, identified as a hub gene for litter size traits in sheep through meta-GWAS analysis [34]; SIN3A, which regulates chromatin remodeling during follicular development [35]; and ELAVL2, which is involved in oocyte maturation [36]. These genes are likely to play important roles in ovarian function and embryonic development. Moreover, the enrichment of pathways such as ovarian steroidogenesis [37], insulin secretion [38], Apelin signaling [39], and circadian entrainment [30] underscores the importance of coordinated endocrinological and developmental processes in prolificacy. These pathways have also been implicated in reproductive success across mammals, reinforcing the relevance of our findings.
In this study, we identified 140 significantly associated SNPs (p < 10⁻⁵) via GWAS and selected 10 highly significant candidate loci for validation. Candidate genes in the regions harboring these SNPs and their biological functions provide critical insights into the molecular regulatory mechanisms underlying the high fecundity of DHS. Among these, the most significant association signal maps to the g.88680390 C>A locus on chromosome 2, which is located in the intergenic region between the SLC24A2 and MLLT3 genes. Previous studies have shown that SLC24A2 plays an important role in the reproduction of Cameroon’s native goats [40], suggesting a conserved function in sheep reproductive regulation. As a transcriptional activator, MLLT3 can extend the transcription cycle of target genes and remodel chromatin during embryogenesis [41], and has been demonstrated to be involved in key processes of embryonic development [42]. Furthermore, significant differences in the expression levels of MLLT3 were identified between the ovarian tissues of high- and low-fecundity Chongming white goats [43], further supporting its role in regulating reproductive traits. Another key candidate locus is g.18197516 T>C, located within the ABCA1 gene region. ABCA1 plays a central role in the transmembrane transport of cholesterol, a precursor for steroid hormone synthesis. Studies have shown that ABCA1-deficient mice exhibit significantly reduced cholesterol ester levels in adrenal and ovarian tissues, leading to impaired reproductive capacity in homozygous female mice, characterized by fewer pregnancies and smaller litter sizes [44]. Additionally, the g.36609350 G>C locus on chromosome 25 is associated with the NRG3 gene, which has been identified as a candidate gene for litter size in sheep in multiple studies [44,45].
Although selective sweep analyses (e.g., FST, XP-CLR) and GWAS can both identify genomic regions associated with litter size variation, they capture distinct aspects of the underlying genetic architecture, resulting in incomplete overlap of detected signals. In this study, some loci (e.g., g.88680390 C>A) were identified by both methods, showing significant associations in GWAS while residing in regions with strong FST and XP-CLR signatures, whereas some GWAS-significant SNPs were outside major selective sweep regions. This discrepancy stems from the fact that selective sweep analyses focus on identifying genomic regions shaped by historical or recent directional selection within populations [46], while GWAS are more sensitive to SNPs with measurable additive genetic effects on traits [47].
To validate the GWAS findings, we performed targeted genotyping on a larger population. We confirmed that two key SNPs, g.88680390 C>A (located in SLC24A2/MLLT3) and g.18197516 T>C (located in ABCA1), are significantly associated with litter size. Specifically, ewes carrying the AA genotype at g.88680390 and the TT genotype at g.18197516 exhibited higher litter sizes. Consequently, we recommend prioritizing these SNPs in selection to enhance reproductive rates. While this study provides valuable genomic insights, it has several limitations. Although the validation utilized 200 DHS individuals, this sample size remains limited for a low-heritability trait such as litter size. To improve confidence in these SNPs and facilitate reliable marker-assisted selection, future work should involve genotyping in larger DHS populations and integrating functional analyses (e.g., expression quantitative trait loci, or eQTL, analysis) to elucidate the precise causal mechanisms of these markers.
5. Conclusions
In summary, this study systematically characterized the genetic structure and genetic diversity of DHS, providing comprehensive insights for its genetic breeding research. The results confirm that DHS has formed an independent and stable genetic background through systematic hybridization, with a distinct population structure and higher genetic diversity. Additionally, this study investigated selection signatures associated with litter size in DHS, identifying multiple candidate genes and pathways that may be involved in regulating reproductive performance. Furthermore, two SNP loci—g.88680390 C>A (located in the genomic region of SLC24A2/MLLT3) and g.18197516 T>C (located in the genomic region of ABCA1)—showed a significant association with litter size, providing potential molecular markers for the improvement of sheep fecundity. These results offer important references for the development of genomic selection and breeding strategies related to sheep reproductive performance.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ani15233505/s1, Table S1: Summary of Whole-Genome Resequencing Data Quality Before and After Filtering; Table S2: Summary of Whole-Genome Read Alignment Statistics to the Reference Genome; Table S3: Statistics of sample average sequencing depth and coverage ratio; Table S4: SNP location or type statistics; Table S5: Candidate genes associated with litter size in Dorper × Hu hybrid sheep identified by multiple selection signature methods and their overlaps; Table S6: GO and KEGG Enrichment Analysis of Candidate Genes Under Selection with litter size; Table S7: Results of Candidate SNP Loci Screening in Genome-Wide Association Analysis of Dorper × Hu Hybrid Sheep; Table S8: Site Primer Information Table. Figure S1: Heatmap of SNPs Density Distribution on Chromosomes.
Author Contributions
Conceptualization, L.Q. and W.L.; methodology, Z.P.; software, K.M. and K.C.; validation, L.Q., Z.P. and Q.Y.; formal analysis, Z.P.; investigation, W.W. and K.C.; resources, W.L.; data curation, S.Z.; writing—original draft preparation, L.Q. and Z.P.; writing—review and editing, K.M. and W.W.; visualization, Z.P. and S.Z.; supervision, Q.Y.; project administration, W.L.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the earmarked fund for Modern Agro-industry Technology Research System (2024CYJSTX14-06) and the Shanxi Province “Yan Yun Aries” Breeding Joint Research Project (NYGG23).
Institutional Review Board Statement
This study was reviewed and approved by the Institutional Animal Care and Use Committee of Shanxi Agricultural University (Approval number: SXAU-EAW-2023S.NR.012023361).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The SNP variants generated in this study through variant calling using GATK are available via NutCloud at the following link: https://www.jianguoyun.com/p/DbjQdZgQ_tC_DRj9pIYGIAA (accessed on 1 July 2025).
Acknowledgments
All authors thank the Department of Genetics and Breeding, College of Animal Science, Shanxi Agricultural University for providing the necessary equipment for this study.
Conflicts of Interest
The authors declare no conflicts of interest.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).