Article

Two-Stage Machine Learning-Based GWAS for Wool Traits in Central Anatolian Merino Sheep

1 Department of Animal Science, Faculty of Veterinary Medicine, Aksaray University, Aksaray 68000, Türkiye
2 Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
3 Department of Animal Science, Faculty of Agriculture, Ankara University, Ankara 06560, Türkiye
4 Department of Animal Science, Faculty of Agriculture, Erciyes University, Kayseri 38039, Türkiye
5 Department of Veterinary Microbiology & Pathology, College of Veterinary Medicine, Washington State University, Pullman, WA 99164, USA
6 Betül Ziya Eren Genome and Stem Cell Center, Erciyes University, Kayseri 38280, Türkiye
* Author to whom correspondence should be addressed.
Agriculture 2025, 15(21), 2287; https://doi.org/10.3390/agriculture15212287
Submission received: 11 September 2025 / Revised: 16 October 2025 / Accepted: 22 October 2025 / Published: 3 November 2025
(This article belongs to the Special Issue Genetic Diversity, Adaptation and Evolution of Livestock)

Abstract

Wool traits such as fiber diameter, fiber length, and greasy fleece yield are economically significant characteristics in sheep breeding programs. Traditional genome-wide association studies (GWAS) have identified relevant genomic regions but often fail to capture the non-linear and polygenic architecture underlying these traits. In this study, we implemented a two-stage machine learning (ML)-based GWAS framework to dissect the genetic basis of wool traits in Central Anatolian Merino sheep. Phenotypic records were collected from 228 animals, genotyped with the Illumina OvineSNP50 BeadChip. In the first stage, feature selection was conducted using LASSO, Ridge Regression, and Elastic Net, generating a consensus SNP panel per trait. In the second stage, association modeling with Random Forest and Support Vector Regression (SVR) identified the most predictive models (R² up to 0.86). Candidate gene annotation highlighted biologically relevant loci: MTHFD2L and EPGN (folate metabolism and keratinocyte proliferation) for fiber diameter; COL5A2, COL3A1, ITFG1, and ELMO1 (extracellular matrix integrity and actin remodeling) for staple length; and FAP, DPP4, PLCH1, and NPTX1 (extracellular matrix remodeling, proteolysis, and sebaceous gland function) for greasy fleece yield. These findings demonstrate the utility of ML-enhanced GWAS pipelines in identifying biologically meaningful markers and propose novel targets for genomic selection strategies to improve wool quality and yield in indigenous sheep populations.

1. Introduction

Sheep have played a fundamental role in human societies for millennia, valued not only for meat and milk but also for wool, a renewable and versatile raw material. Wool’s unique physical and chemical properties—elasticity, crimp, fiber diameter, and fiber length—have secured its position in the textile industry from ancient times to the present day [1,2,3,4]. Today, wool remains a globally traded commodity, contributing substantially to rural economies and sustainable fashion markets. Recent statistics indicate that annual global clean wool production ranges between 1.03 and 1.16 million tonnes [5], with the largest volumes originating from countries such as Australia, China, New Zealand, and Türkiye.
Türkiye, in particular, possesses a rich genetic reservoir of indigenous sheep breeds, shaped over centuries by diverse climatic zones and management systems. The Central Anatolian Merino, which was derived from the Akkaraman breed, stands out for its adaptability, resilience, and multipurpose productivity [6,7,8,9]. While the breed was traditionally selected for hardiness and meat yield, recent breeding programs have increasingly targeted wool quality parameters, recognizing their direct influence on market value and textile performance. These parameters, especially fiber diameter, staple length, and greasy fleece yield, are complex traits influenced by both genetic and environmental factors, including nutrition, climate, and shearing management [1,3].
Genetic improvement in wool traits has been facilitated by advances in genomic technologies, particularly genome-wide association studies (GWAS). Conventional GWAS approaches, typically employing single-locus mixed linear models, have identified numerous loci associated with wool quality traits, including genes related to keratin and keratin-associated proteins (KAPs), follicular development, and pigment pathways [1,3,10]. For example, variants within TRIM2, TLR2, and SLC45A1 have been implicated in fiber structure and immune-related follicular responses [1,10,11]. Despite these insights, classical GWAS often face limitations in capturing the full genetic architecture of complex traits, particularly when interactions, non-linear effects, and polygenic backgrounds play significant roles [12].
To overcome these constraints, machine learning (ML) methods have emerged as powerful alternatives. Techniques such as least absolute shrinkage and selection operator (LASSO), Ridge Regression, and Elastic Net provide embedded feature selection and robust handling of multicollinearity [13,14,15], while ensemble approaches like Random Forest (RF) [16] and Extreme Gradient Boosting (XGBoost) can detect intricate non-linear relationships and gene–gene interactions [17]. Support Vector Regression (SVR) further enhances prediction in high-dimensional datasets by maximizing generalization capacity. Comparative studies have demonstrated that ML-based GWAS pipelines can improve predictive accuracy, stability, and the biological relevance of selected markers over traditional models [18,19].
Building on this progress, the present study implements a two-stage, machine learning-enhanced GWAS framework to uncover the genetic architecture of fiber diameter, staple length, and greasy fleece yield in Central Anatolian Merino sheep. In the first stage, regularized regression methods (LASSO, Ridge, Elastic Net) are used for SNP pre-selection, resulting in a consensus panel of informative markers. In the second stage, predictive modeling with Random Forest (RF), XGBoost, and Support Vector Regression (SVR) identifies the most accurate model, from which top-ranked SNPs are selected for annotation and functional interpretation. By combining high-density genotyping with advanced computational tools, this approach not only enhances detection power but also yields actionable genomic markers that can inform selection strategies aimed at improving wool quality in Central Anatolian Merino populations.

2. Materials and Methods

2.1. Animals and Phenotypic Data

The study was conducted on 228 Central Anatolian Merino (CAM) sheep reared under extensive production systems in Türkiye. All animals were ewes (female sheep) of uniform age and maintained under similar management and nutritional conditions. Wool traits, including fiber diameter (FD, in µm), fiber length (FL, in mm), and greasy fleece yield (GFY, in kg), were measured when animals were approximately 18 months of age. Fiber diameter and fiber length were recorded for all 228 animals, whereas greasy fleece yield measurements were obtained for a subset of 62 animals due to logistical limitations in fleece sampling and weighing. FD and FL measurements were performed using the OFDA2000 instrument (BSC Electronics, Perth, Australia) according to industry standards. Wool samples were collected from the upper rear scapular region of each animal to ensure consistency across measurements. Before analysis, all phenotypic records were pre-adjusted for fixed effects such as age and management group, so that corrected phenotypic values were used in the GWAS models.
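The pre-adjustment step is described only briefly, so the following is a minimal sketch of one simple way such a correction can be done for a single categorical fixed effect: subtract each group's mean and re-centre on the overall mean. The function name `adjust_for_group` and the example groups are hypothetical, and the authors' actual adjustment model is not stated in the text.

```python
from collections import defaultdict


def adjust_for_group(values, groups):
    """Remove mean differences between groups, keeping the overall mean.

    Assumption for illustration: one categorical fixed effect (for example
    management group). For a single factor this equals the residual of a
    one-way fixed-effect model plus the grand mean.
    """
    grand_mean = sum(values) / len(values)
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    group_mean = {g: sum(vs) / len(vs) for g, vs in by_group.items()}
    return [v - group_mean[g] + grand_mean for v, g in zip(values, groups)]


if __name__ == "__main__":
    # Hypothetical fiber diameters (µm) from two management groups.
    print(adjust_for_group([20.0, 22.0, 24.0, 26.0], ["a", "a", "b", "b"]))
```

After adjustment, both groups share the same mean, so any remaining variation reflects within-group differences between animals.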

2.2. Genotyping and Quality Control

Genomic DNA was extracted from whole blood samples on a QIAcube HT automated device using a commercial kit (Qiagen Blood kit, Hilden, Germany). All animals were genotyped using the Illumina OvineSNP50 BeadChip (Illumina Inc., San Diego, CA, USA), which includes approximately 50,000 single-nucleotide polymorphisms (SNPs). Quality control was performed by excluding SNPs with a call rate below 90% or a minor allele frequency (MAF) below 0.05. Animals with >10% missing genotypes were also removed. Missing genotypes were imputed using the K-nearest neighbor (KNN) method implemented in the impute package in R (version 4.5.1) [20]. Genotypes were encoded as 0 (“AA”), 1 (“AB”), and 2 (“BB”) for subsequent modeling.
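The quality-control thresholds and the 0/1/2 encoding can be illustrated with a short standard-library Python sketch. It is not the authors' R code: the function names are hypothetical, missing calls are represented as `None`, and both filters are computed on the unfiltered matrix because the order of filtering is not specified in the text. KNN imputation is not covered.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # 0, 1 or 2 copies of the B allele; None = missing call

CODES = {"AA": 0, "AB": 1, "BB": 2}


def encode_genotype(call: str) -> Genotype:
    """Map an AA/AB/BB call to 0/1/2; anything else is treated as missing."""
    return CODES.get(call)


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing calls in a row (animal) or column (SNP)."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """MAF from the non-missing 0/1/2 codes of one SNP."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq_b = sum(observed) / (2 * len(observed))
    return min(freq_b, 1.0 - freq_b)


def quality_control(
    matrix: List[List[Genotype]],
    min_call_rate: float = 0.90,
    min_maf: float = 0.05,
    max_animal_missing: float = 0.10,
) -> Tuple[List[int], List[int]]:
    """Return (kept animal indices, kept SNP indices) for an animals x SNPs matrix."""
    kept_snps = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        if call_rate(column) >= min_call_rate and minor_allele_frequency(column) >= min_maf:
            kept_snps.append(j)
    kept_animals = [
        i for i, row in enumerate(matrix)
        if 1.0 - call_rate(row) <= max_animal_missing
    ]
    return kept_animals, kept_snps
```

In practice the SNP filter is often re-applied after removing low-quality animals; that refinement is left out here to keep the sketch short.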

2.3. Two-Stage Machine Learning-Based GWAS Pipeline

A two-stage genome-wide association framework was employed, integrating regularized regression and non-linear machine learning algorithms to improve feature selection and trait prediction.

2.3.1. Stage 1: Feature Selection via Regularized Regression

To reduce dimensionality and select informative SNPs, three regularization-based regression methods were applied:
  • Least Absolute Shrinkage and Selection Operator (LASSO; α = 1) [13];
  • Ridge Regression (α = 0) [14];
  • Elastic Net (α = 0.5) [15].
These models were implemented using the glmnet R package [21]. The optimal regularization parameter (λ_min) was determined via 12-fold cross-validation. For each model, the top 100 SNPs with the highest absolute regression coefficients were retained. The overlap between SNPs selected by the three models was visualized using Venn diagrams. A consensus SNP panel was created by including all SNPs shared by at least two models. The remaining positions were filled with SNPs from the best-performing model, as judged by R², to reach a final panel of 100 markers per trait.
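The consensus rule described above can be sketched as follows, assuming the coefficients of each fitted model are already available as a mapping from SNP ID to coefficient. The function names, the tie-breaking order, and the truncation applied when more than `panel_size` SNPs are shared are assumptions made for illustration; the text does not specify them.

```python
from collections import Counter
from typing import Dict, List


def top_snps(coefficients: Dict[str, float], n: int = 100) -> List[str]:
    """SNP IDs ranked by absolute coefficient, largest first."""
    return sorted(coefficients, key=lambda s: abs(coefficients[s]), reverse=True)[:n]


def consensus_panel(
    model_coefs: Dict[str, Dict[str, float]],
    best_model: str,
    panel_size: int = 100,
    min_models: int = 2,
) -> List[str]:
    """Build a panel from SNPs shared by >= min_models, then fill from best_model."""
    top = {name: top_snps(coefs, panel_size) for name, coefs in model_coefs.items()}
    votes = Counter(snp for snps in top.values() for snp in snps)

    # SNPs selected by at least `min_models` models, most widely shared first.
    shared = sorted(
        (s for s, v in votes.items() if v >= min_models),
        key=lambda s: (-votes[s], s),
    )
    panel = shared[:panel_size]  # assumption: truncate if too many are shared
    chosen = set(panel)

    # Fill the remaining positions in rank order from the best-performing model.
    for snp in top[best_model]:
        if len(panel) >= panel_size:
            break
        if snp not in chosen:
            panel.append(snp)
            chosen.add(snp)
    return panel
```

A call such as `consensus_panel({"lasso": ..., "ridge": ..., "enet": ...}, best_model="lasso")` would return up to 100 SNP IDs for one trait.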

2.3.2. Stage 2: Association Modeling with Tree-Based and Kernel Methods

The consensus SNP panel was used to train three supervised learning models:
  • Random Forest (RF) [16];
  • Extreme Gradient Boosting (XGBoost) [17];
  • Support Vector Regression (SVR) with radial basis function kernel [22].
Model performance was evaluated using a stratified 10-fold cross-validation approach with an 80/20 train–test split. Performance metrics included the coefficient of determination (R²) and root mean square error (RMSE). The model with the highest R² value for each trait was selected for downstream SNP importance scoring and candidate gene analysis.
Variable importance was derived from the best-performing model, and the top 10 SNPs for each trait were prioritized for genomic annotation.
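The two performance metrics and a basic k-fold split can be written in a few lines of standard-library Python. This is a sketch under stated assumptions: the text does not say which definition of R² was used (some toolkits report the squared correlation instead), so the version below uses 1 − SS_res/SS_tot, and the fold generator is a plain shuffled split rather than the stratified scheme mentioned above. All function names are hypothetical.

```python
import math
import random
from typing import Iterator, List, Sequence, Tuple


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, defined as 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error of the predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for a shuffled, non-stratified k-fold split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

Each fold would be used once as the held-out set, with R² and RMSE averaged over the k held-out predictions.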

2.4. Candidate Gene Annotation and Functional Enrichment

Genomic positions of the top-ranking SNPs were matched to the Illumina SNP Map, and a 200 kb window (±100 kb) was defined around each locus. Candidate gene identification was conducted using the biomaRt R package [23] referencing the Ensembl Ovis aries genome (Oar_v4.0). Genes overlapping or flanking the selected SNPs were extracted, and Ensembl Gene IDs were subjected to functional enrichment analysis using g:Profiler [24]. Enrichment was assessed across Gene Ontology Biological Process, Molecular Function, Cellular Component, KEGG, and Reactome pathways. Statistical significance was determined using Benjamini–Hochberg FDR correction (q < 0.05), and top terms were visualized using dot plots.
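Two steps in this paragraph lend themselves to a small illustration: finding genes that overlap a ±100 kb window around a SNP, and the Benjamini–Hochberg adjustment. The sketch below is standard-library Python rather than the biomaRt or g:Profiler calls used in the study; the gene record layout (`name`, `chrom`, `start`, `end`) and the function names are assumptions made for the example.

```python
from typing import Dict, List, Sequence


def genes_in_window(
    snp_chrom: str,
    snp_pos: int,
    genes: Sequence[Dict],
    flank: int = 100_000,
) -> List[str]:
    """Names of genes overlapping [snp_pos - flank, snp_pos + flank] on the same chromosome."""
    start, end = max(0, snp_pos - flank), snp_pos + flank
    return [
        g["name"] for g in genes
        if g["chrom"] == snp_chrom and g["start"] <= end and g["end"] >= start
    ]


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```

Terms with a q-value below 0.05 would then be retained, matching the threshold given in the text.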

2.5. Software and Computational Resources

All analyses were conducted in R version 4.5.1 [25], using the following packages: glmnet, randomForest, xgboost, e1071, caret, biomaRt, gprofiler2, VennDiagram, ggplot2, and openxlsx.

3. Results

3.1. Descriptive Statistics of Wool Traits

Descriptive statistics for the three evaluated wool traits—fiber diameter, fiber length, and greasy wool yield—are presented in Table 1. Fiber diameter exhibited a mean of 21.74 µm, ranging from 15.16 to 34.52 µm, with a standard deviation (SD) of 2.39 and a coefficient of variation (CV) of 11.01%, indicating moderate phenotypic variability. Fiber length demonstrated a wider range of variation, with a mean of 33.92 mm, SD of 9.97, and CV of 29.40%. Greasy wool yield, assessed in a subset of 62 animals, displayed an average of 4.04 kg (range: 2.24–6.42 kg), with a SD of 0.82 and a CV of 20.48%. These results indicate substantial variability, particularly in fiber length and greasy wool yield, which may be exploitable in breeding programs.
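For reference, the coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. Using the rounded fiber diameter values from the text as a worked example (the small difference from the reported 11.01% presumably reflects rounding of the published mean and SD):

```latex
\mathrm{CV} = \frac{\mathrm{SD}}{\bar{x}} \times 100\%,
\qquad
\mathrm{CV}_{\mathrm{FD}} \approx \frac{2.39}{21.74} \times 100\% \approx 11.0\%
```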

3.2. Feature Selection Model Performance

The performance of three regularized regression models (LASSO, Ridge, and ElasticNet) in feature selection is summarized in Table 2. For fiber diameter, both LASSO and ElasticNet yielded a high coefficient of determination (R² = 0.86), whereas Ridge showed slightly lower predictive performance (R² = 0.72). Fiber length prediction was less accurate across models, with R² values ranging from 0.17 (Ridge) to 0.29 (LASSO). In greasy wool yield, both LASSO and ElasticNet achieved moderate predictive capacity (R² = 0.32), while Ridge remained comparatively lower (R² = 0.17).
Figure 1 presents an UpSet plot illustrating the intersections among the top 100 SNPs selected by LASSO, Ridge, and ElasticNet models for each wool trait. This visualization reveals that 71 SNPs were shared across all three models for fiber diameter, whereas only 11 SNPs overlapped for greasy fleece yield. The use of trait-specific color coding enhances interpretability across model-trait combinations.
In the feature selection stage, the SNPs with the highest importance values identified by the best-performing models for each wool trait are presented in Figure 2. These bar plots illustrate the relative contribution of the top-ranked SNPs, highlighting those with the strongest predictive influence within each model–trait combination. This visualization provides a clear overview of the most informative markers that were retained for subsequent association analyses.

3.3. Association Modeling and Predictive Accuracy

In the second modeling phase, three machine learning algorithms (Random Forest, XGBoost, and Support Vector Regression (SVR)) were applied to the SNP panels derived from feature selection. Their comparative performance is detailed in Table 3. For fiber diameter, Random Forest and SVR both achieved the highest R² value (0.86), while XGBoost showed lower accuracy (R² = 0.72). SVR yielded the best predictive ability for fiber length (R² = 0.38), and Random Forest performed best for greasy wool yield (R² = 0.40), with a lower root mean square error (RMSE = 0.47) compared to other models.
The top 20 most important SNPs for each trait, as ranked by SVR or Random Forest, are presented in Figure 3. These SNPs were prioritized for genomic annotation.

3.4. Candidate Gene Annotation and Functional Insights

The functional annotation of significant SNPs associated with fiber diameter (FD), fiber length (FL), and greasy wool yield (GWY) (Table 4) revealed several biologically relevant genes implicated in structural, metabolic, and signaling pathways within the hair follicle microenvironment.
For fiber diameter, notable genes included Methylenetetrahydrofolate Dehydrogenase (NADP+ Dependent) 2 Like (MTHFD2L), involved in folate metabolism and cellular proliferation, and Epigen (EPGN), an epidermal growth factor family member that promotes keratinocyte proliferation and follicle development.
For fiber length, key genes included Integrin Alpha FG–GAP Repeat Containing 1 (ITFG1), facilitating integrin-mediated adhesion between basal keratinocytes and the extracellular matrix, thereby directing axial fiber growth, and Engulfment And Cell Motility 1 (ELMO1), a regulator of Rac1-dependent actin cytoskeleton remodeling crucial for follicle elongation. Collagen-related genes Collagen Type V Alpha 2 Chain (COL5A2) and Collagen Type III Alpha 1 Chain (COL3A1) are directly involved in extracellular matrix organization, influencing tensile strength and follicular support. The Serine Palmitoyltransferase Long Chain Base Subunit 1 (SPTLC1) gene highlights the role of sphingolipid metabolism in follicular function, while Neurotrophic Receptor Tyrosine Kinase 2 (NTRK2) suggests involvement in neuronal regulation of follicle growth.
For greasy wool yield, candidate genes spanned metabolic, structural, and regulatory functions. Phospholipase C Eta 1 (PLCH1) participates in intracellular signaling cascades, potentially influencing sebaceous gland activity. Rho Guanine Nucleotide Exchange Factor 4 (ARHGEF4) and Pleckstrin Homology Domain Containing B2 (PLEKHB2) relate to cytoskeletal and membrane dynamics. Fibroblast Activation Protein Alpha (FAP) and Dipeptidyl Peptidase 4 (DPP4) indicate extracellular matrix remodeling and proteolysis pathways, while MAX Dimerization Protein 1 (MXD1) and ETS Variant Transcription Factor 1 (ETV1) reflect transcriptional regulation influences on follicular metabolism. Additional genes such as Neuronal Pentraxin 1 (NPTX1) and Phospholipase D Family Member 5 (PLD5) may contribute to synaptic-like signaling within follicle-associated nerves.

4. Discussion

4.1. Fiber Diameter

Wool fiber diameter is a multifactorial trait shaped by extracellular matrix (ECM) remodeling, growth factor signaling, and neuro-immune crosstalk within the follicular microenvironment. The candidate genes identified in this study highlight several molecular pathways that plausibly modulate fiber thickness in sheep.
The EGFR axis regulates follicular keratinocyte proliferation (GO:0040007–growth) and differentiation (GO:0030216–keratinocyte differentiation). In our study, candidates such as AREG and EPGN, members of the EGF ligand family, are key players in maintaining the anagen phase (GO:0042633–hair cycle, anagen phase) and supporting matrix–cell cycling. In Merino sheep, EGF administration has been shown to suppress wool growth and induce follicle regression [26]; murine and human data also confirm that EGFR signaling is critical for follicle development/cycling (GO:0071333–cellular response to epidermal growth factor stimulus) [27,28]. Furthermore, the transcriptomic atlas of embryonic follicular progenitors supports the regulation of EGF family signaling in follicular lineages [29]. Collectively, this evidence suggests that the AREG/EPGN–EGFR axis is the strongest candidate pathway through which these loci could indirectly influence fiber thickness.
Candidate genes in our dataset include ZDHHC21 (DHHC21), an S-acyltransferase that mediates the palmitoylation of membrane proteins (GO:0018345–protein palmitoylation) in skin/follicle cells. Mutations or deletions in DHHC21 have been linked to alterations in hair follicle biology (GO:0001942–hair follicle development) and epidermal homeostasis in mammals [30]. This highlights the potential impact of palmitoylation-dependent receptor/signaling regulation on wool phenotypes.
The NR3C1 gene encodes the glucocorticoid receptor, a central mediator of the response to chronic stress/elevated cortisol (GO:0051384–response to glucocorticoid). In livestock, elevated glucocorticoid levels have been associated with follicle regression (GO:0042634–regulation of hair cycle) and finer fibers; these findings are consistent with reviews and experimental studies summarizing glucocorticoid-mediated regulation of skin/appendage biology [31,32]. Thus, NR3C1 may represent a molecular switch for stress-induced changes in fiber diameter.
MTHFD2L (methylenetetrahydrofolate dehydrogenase (NADP+-dependent) 2-like) is a mitochondrial enzyme that catalyzes reversible reactions within the folate-mediated one-carbon metabolic pathway, contributing to the generation of formate and one-carbon units (GO:0006730–one-carbon metabolic process) [33]. While direct evidence in sheep is lacking, substantial mammalian literature demonstrates that one-carbon metabolism underpins nucleotide synthesis, epigenetic methylation, redox regulation, and amino acid homeostasis—all vital for cellular proliferation and matrix gene regulation [34]. Importantly, methyl donor availability from one-carbon pathways influences methylation of histones and DNA, including promoters of extracellular matrix (ECM)-related genes. For example, global methylation changes have been shown to regulate collagen gene expression during tissue remodeling [35].
The presence of Neural Cell Adhesion Molecule 2 (NCAM2), associated with cell–cell adhesion, suggests a role in maintaining follicular structural integrity. Additionally, Nuclear Receptor Subfamily 3 Group C Member 1 (NR3C1, glucocorticoid receptor) indicates a potential link between hormonal regulation and fiber traits. Genes such as DEFB119–DEFB124 (beta-defensin family) may reflect immune modulation within follicles, whereas Synuclein Alpha Interacting Protein (SNCAIP) and Armadillo Like Helical Domain Containing 1 (ARMH1) point toward roles in intracellular trafficking and signaling.
Collectively, these annotations emphasize that wool trait variability is driven by an interplay of extracellular matrix composition, cytoskeletal dynamics, metabolic regulation, and neuro-hormonal signaling. Many of the identified genes have direct or indirect roles in follicle morphogenesis, elongation, and keratinocyte proliferation, supporting their candidacy for future functional validation and potential use in genomic selection strategies. Among these, MTHFD2L appears particularly relevant, as its metabolic activity may influence wool fiber diameter through modulation of extracellular matrix gene expression and collagen biosynthesis (GO:0030198–extracellular matrix organization; GO:0032964–collagen biosynthetic process). This underscores a potential mechanistic link between cellular methyl metabolism and the structural determinants of fiber morphology.

4.2. Fiber Length

In our study, the genes associated with wool fiber length point to a diverse set of molecular mechanisms underlying hair follicle biology. These functional categories include cytoskeletal organization, cell adhesion, extracellular matrix (ECM) integrity, neural niche regulation, cell cycle control, lipid metabolism, and protein trafficking.
The repeated appearance of COL5A2 (collagen type V alpha 2 chain) and COL3A1 (collagen type III alpha 1 chain) points to extracellular-matrix (ECM) control of follicle biomechanics as a primary axis for fiber elongation. Type V collagen regulates fibril nucleation and spacing, while type III collagen contributes tensile strength and dermal architecture, together shaping the mechanical milieu that supports axial outgrowth of the hair shaft (GO:0030198 extracellular matrix organization; GO:0030199 collagen fibril organization; GO:0043588 skin development). Reviews and genetic studies consistently place COL5A2/COL3A1 at the core of collagen fibrillogenesis and skin structure [36,37]. In sheep, collagens also emerge in selection-signature scans (e.g., adaptation across altitudinal ecotypes in dairy breeds), underscoring pleiotropic roles of collagen genes in skin/follicle biology [38]. In that study, the mountain ecotypes showed enrichment for pleiotropic genes involved in structural, metabolic, and neuroendocrine pathways that could indirectly affect wool growth and morphology. Another independent selection-signature study has also highlighted the evolutionary relevance of these loci. COL5A2 and COL3A1 were identified within strong selective sweep regions when comparing Iranian indigenous breeds (Baluchi, Lori-Bakhtiari, Zandi) and the Greek Chios sheep, primarily in association with prolificacy traits [39]. Given the known differences in wool staple length between these breeds, such selection signals may reflect pleiotropic effects influencing both reproductive capacity and fiber traits. The recurrence of COL5A2 and COL3A1 in these independent datasets underscores their importance as conserved selection targets with potential dual effects on adaptability and fiber structure.
NTRK2 (neurotrophic receptor tyrosine kinase 2, TrkB) implicates neurotrophin TRK receptor signaling (GO:0048011) in hair growth control. Classic skin/hair work shows BDNF/TrkB components are expressed in follicles and modulate the anagen program and perifollicular innervation state, suggesting a route by which neural cues tune matrix proliferation and hence fiber length [40,41]. BRINP1 (BMP/retinoic acid inducible neural-specific protein 1)—best known for promoting cell-cycle arrest and differentiation in neurons—may mark a broader neuro-epithelial regulatory layer near the follicle, though hair-specific evidence remains indirect [42].
ELMO1 (engulfment and cell motility protein 1) promotes Rho/Rac-dependent migration and actin cytoskeleton organization (GO:0030334; GO:0030036), processes that govern matrix–cell flux and follicle tip advancement. ELMO–DOCK complexes activate Rac1 to drive lamellipodia and collective movement, mechanisms that fit the elongation phenotype [43,44]. ITFG1 (integrin alpha FG-GAP repeat containing 1) maps to the integrin/ECM adhesion axis (GO:0007160 cell–matrix adhesion; GO:0033627 cell adhesion mediated by integrin). While hair-specific data for ITFG1 are limited, integrin-dependent adhesion is well established as essential for keratinocyte survival and stratified epithelium dynamics [45], making ITFG1 a cautious but biologically coherent candidate.
ZC3HC1 (zinc finger C3HC-type containing 1; NIPA) connects to G2/M control (GO:0000086) and ubiquitin-mediated proteolysis (GO:0006511), in part via Cyclin B1 regulation. Perturbations that delay Cyclin B1 turnover disrupt mitotic progression, plausibly throttling matrix proliferation and elongation [46,47]. AIFM2 (apoptosis-inducing factor mitochondria-associated 2; FSP1) is a key negative regulator of ferroptosis (GO:0110076), maintaining membrane antioxidant capacity via the FSP1–CoQ10 axis; protecting matrix keratinocytes from lipid peroxidation would favor prolonged anagen and sustained fiber growth [48].
Across independent lines of evidence, ECM/collagen assembly, neurotrophin signaling, actin-driven migration, and control of the matrix cell cycle and ferroptosis emerge as convergent, biologically credible pathways for fiber elongation in sheep. These axes offer testable hypotheses, for example, correlating COL5A2/COL3A1 expression with dermal stiffness, tracking TrkB-responsive transcription during anagen, or perturbing SPTLC1/ELMO1 in ovine follicle organoids to quantify effects on shaft outgrowth.

4.3. Greasy Wool Yield

The biological regulation of greasy wool yield involves multiple processes, including follicular lipid secretion, glandular activity, extracellular matrix remodeling, and metabolic control of sebaceous glands. Among the genes identified in this study, FAP (Fibroblast Activation Protein Alpha) is notable for its role in extracellular matrix degradation and tissue remodeling, functions that are relevant to the maintenance of follicular architecture and sebaceous gland positioning [49]. Changes in the ECM can influence lipid secretion dynamics, indirectly affecting wool grease deposition.
DPP4 (Dipeptidyl Peptidase 4) is a multifunctional enzyme involved in proteolytic processing and regulation of inflammatory signaling [50]. In the skin, DPP4 expression is linked to keratinocyte proliferation and wound healing responses, suggesting a potential role in follicular homeostasis and lipid metabolism. Altered DPP4 activity could influence sebaceous gland secretion profiles, thereby affecting greasy wool production.
Another gene of interest, PLCH1 (phospholipase C eta 1), participates in phosphatidylinositol signaling and lipid hydrolysis pathways, which are central to sebaceous gland secretion and follicular lipid metabolism. While our dataset focused on greasy wool yield, the molecular function of PLCH1 suggests it could also influence clean wool weight by modulating lipid release and follicular surface composition. Similar lipid metabolism-related genes have been reported to affect both greasy and clean fleece components, supporting a broader role of PLCH1 in fleece characteristics.
Taken together, these genes represent functional nodes in pathways involving extracellular matrix organization and proteolytic processing. The biological plausibility of their involvement is supported by their known expression in skin or glandular tissues and by established roles in related pathways. This suggests that variation in these loci may contribute to phenotypic differences in greasy wool yield among individuals.
However, it is important to note that greasy fleece yield was measured in only 62 animals in the present dataset, which limits the statistical power for this trait. This sample-size constraint may reduce the robustness of model predictions compared to other wool traits. Future studies incorporating larger or other independent validation populations—such as other Merino lines or native Turkish breeds—would strengthen the generalizability and confirm the reproducibility of the findings.
Moreover, while the identified genes (e.g., MTHFD2L, EPGN, COL5A2, FAP) are biologically plausible, functional validation was beyond the scope of the present work. Future studies applying RNA-Seq or qPCR expression profiling in wool follicles of animals with contrasting phenotypes will be essential to confirm the transcriptional activity and biological relevance of these candidate genes.

5. Conclusions

This study highlights the polygenic and multifactorial architecture underlying key wool traits in sheep. Fiber diameter is largely influenced by extracellular matrix composition, stress response mechanisms, and neuro-immune interactions, reflecting the structural and physiological complexity of follicular development. Fiber length is governed by cytoskeletal organization, cell cycle control, and efficient protein synthesis, emphasizing the role of proliferative and biosynthetic pathways in follicle elongation. In contrast, greasy wool yield is shaped by metabolic activity, sebaceous gland function, and hormonal regulation, pointing to the importance of lipid biosynthesis and energy homeostasis. The identified candidate genes not only align with known biological processes but also propose novel targets for functional validation. Nevertheless, the predictive models in this study were developed and tested within a single population, and future research should aim to validate these results in independent or larger datasets (e.g., different Merino lines or native Turkish sheep breeds). Such cross-population validation would provide stronger evidence of predictive robustness and enhance the applicability of the two-stage ML–GWAS framework across diverse genetic backgrounds. These insights provide a strong basis for developing genomic selection strategies that aim to improve both wool quality and quantity. Future studies integrating gene expression profiling, metabolic assays, and gene editing approaches will be essential to confirm the functional roles of these loci and translate them into effective breeding programs.

Author Contributions

Y.A. contributed to the conceptualization, methodology, funding acquisition, project administration, formal analysis, investigation, visualization, writing—original draft and review and editing. M.K. contributed to conceptualization, methodology, investigation, validation, review and editing. S.B. contributed to the conceptualization and implemented review and editing. S.T. contributed to the methodology for phenotyping. M.U.Ç. provided supervision, conceptualization, funding acquisition, resources, project administration, methodology, validation, review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

The study was funded within the scope of the Research Universities Support Program, conducted by the Turkish Council of Higher Education Presidency, grant number FBAÜ-2023-12523.

Institutional Review Board Statement

The animal study protocol was approved by the Local Ethics Committee of Erciyes University (protocol code 191, November 2022).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Acknowledgments

The animals were part of the National Community-based Small Ruminant Breeding Program. The authors therefore gratefully acknowledge the contribution of the General Directorate of Agricultural Research and Policies (Ministry of Agriculture and Forestry) of the Republic of Türkiye, which funds and runs the program.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Arzik, Y.; Kizilaslan, M.; Behrem, S.; White, S.N.; Piel, L.M.W.; Cinar, M.U. Genome-Wide Scan of Wool Production Traits in Akkaraman Sheep. Genes 2023, 14, 713. [Google Scholar] [CrossRef]
  2. Anaya, G.; Laseca, N.; Granero, A.; Ziadi, C.; Arrebola, F.; Domingo, A.; Molina, A. Genomic Characterization of Quality Wool Traits in Spanish Merino Sheep. Genes 2024, 15, 795. [Google Scholar] [CrossRef]
  3. Zhou, H.; Bai, L.; Li, S.; Li, W.; Wang, J.; Tao, J.; Hickford, J.G.H. Genetics of Wool and Cashmere Fibre: Progress, Challenges, and Future Research. Animals 2024, 14, 3228. [Google Scholar] [CrossRef]
  4. Zhang, W.; Jin, M.; Li, T.; Lu, Z.; Wang, H.; Yuan, Z.; Wei, C. Whole-Genome Resequencing Reveals Selection Signal Related to Sheep Wool Fineness. Animals 2023, 13, 2944. [Google Scholar] [CrossRef]
  5. IWTO International Wool Textile Organisation. Available online: https://iwto.org/ (accessed on 14 August 2025).
  6. Arzik, Y.; Kizilaslan, M.; Behrem, S.; Piel, L.M.W.; White, S.N.; Çınar, M.U. Exploring Genetic Factors Associated with Moniezia Spp. Tapeworm Resistance in Central Anatolian Merino Sheep via GWAS Approach. Animals 2025, 15, 812. [Google Scholar] [CrossRef]
  7. Kizilaslan, M.; Arzik, Y.; Behrem, S. Genetic Parameters for Ewe Lifetime Productivity Traits in Central Anatolian Merino Sheep. Small Rumin. Res. 2024, 233, 107235. [Google Scholar] [CrossRef]
  8. Behrem, S. Estimation of Genetic Parameters for Pre-Weaning Growth Traits in Central Anatolian Merino Sheep. Small Rumin. Res. 2021, 197, 106319. [Google Scholar] [CrossRef]
  9. Arzık, Y.; Kızılaslan, M.; Behrem, S.; Çınar, M.U. Identification of Candidate Genes Associated with Eimeria Spp. Oocyst Load in Central Anatolian Merino Sheep. Livest. Stud. 2025, 65, 33–39. [Google Scholar] [CrossRef]
  10. Yaman, Y.; Önaldi, A.T.; Doğan, Ş.; Kirbaş, M.; Behrem, S.; Kal, Y. Exploring the Polygenic Landscape of Wool Traits in Turkish Merinos through Multi-Locus GWAS Approaches: Middle Anatolian Merino. Sci. Rep. 2025, 15, 10611. [Google Scholar] [CrossRef] [PubMed]
  11. Zhao, H.; Guo, T.; Lu, Z.; Liu, J.; Zhu, S.; Qiao, G.; Han, M.; Yuan, C.; Wang, T.; Li, F.; et al. Genome-Wide Association Studies Detects Candidate Genes for Wool Traits by Re-Sequencing in Chinese Fine-Wool Sheep. BMC Genom. 2021, 22, 127. [Google Scholar] [CrossRef] [PubMed]
  12. Elgart, M.; Lyons, G.; Romero-Brufau, S.; Kurniansyah, N.; Brody, J.A.; Guo, X.; Lin, H.J.; Raffield, L.; Gao, Y.; Chen, H.; et al. Non-Linear Machine Learning Models Incorporating SNPs and PRS Improve Polygenic Prediction in Diverse Human Populations. Commun. Biol. 2022, 5, 856. [Google Scholar] [CrossRef] [PubMed]
  13. Tibshirani, R. Regression Shrinkage and Selection Via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288. [Google Scholar] [CrossRef]
  14. Hoerl, A.E.; Kennard, R.W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 1970, 12, 55–67. [Google Scholar] [CrossRef]
  15. Zou, H.; Hastie, T. Regularization and Variable Selection Via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef]
  16. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  17. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  18. Smola, A.J.; Schölkopf, B. A Tutorial on Support Vector Regression. Stat. Comput. 2004, 14, 199–222. [Google Scholar] [CrossRef]
  19. Ghavi Hossein-Zadeh, N. An Overview of Recent Technological Developments in Bovine Genomics. Vet. Anim. Sci. 2024, 25, 100382. [Google Scholar] [CrossRef]
  20. Troyanskaya, O.; Cantor, M.; Sherlock, G.; Brown, P.; Hastie, T.; Tibshirani, R.; Botstein, D.; Altman, R.B. Missing Value Estimation Methods for DNA Microarrays. Bioinformatics 2001, 17, 520–525. [Google Scholar] [CrossRef]
  21. Tay, J.K.; Narasimhan, B.; Hastie, T. Elastic Net Regularization Paths for All Generalized Linear Models. J. Stat. Softw. 2023, 106, 1–31. [Google Scholar] [CrossRef]
  22. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  23. Durinck, S.; Spellman, P.T.; Birney, E.; Huber, W. Mapping Identifiers for the Integration of Genomic Datasets with the R/Bioconductor Package BiomaRt. Nat. Protoc. 2009, 4, 1184–1191. [Google Scholar] [CrossRef] [PubMed]
  24. Raudvere, U.; Kolberg, L.; Kuzmin, I.; Arak, T.; Adler, P.; Peterson, H.; Vilo, J. G:Profiler: A Web Server for Functional Enrichment Analysis and Conversions of Gene Lists (2019 Update). Nucleic Acids Res. 2019, 47, W191–W198. [Google Scholar] [CrossRef] [PubMed]
  25. Team, R.C. R: A Language and Environment for Statistical Computing, version 4.5.1; R Foundation for Statistical Computing: Vienna, Austria, 2020.
  26. Moore, G.P.; Panaretto, B.A.; Robertson, D. Inhibition of Wool Growth in Merino Sheep Following Administration of Mouse Epidermal Growth Factor and a Derivative. Aust. J. Biol. Sci. 1982, 35, 163–172. [Google Scholar] [CrossRef] [PubMed]
  27. Schneider, M.R.; Schmidt-Ullrich, R.; Paus, R. The Hair Follicle as a Dynamic Miniorgan. Curr. Biol. 2009, 19, R132–R142. [Google Scholar] [CrossRef]
  28. McElwee, K.; Hoffmann, R. Growth Factors in Early Hair Follicle Morphogenesis. Eur. J. Dermatol. 2000, 10, 341–350. [Google Scholar]
  29. Sennett, R.; Wang, Z.; Rezza, A.; Grisanti, L.; Roitershtein, N.; Sicchio, C.; Mok, K.W.; Heitman, N.J.; Clavel, C.; Ma’ayan, A.; et al. An Integrated Transcriptome Atlas of Embryonic Hair Follicle Progenitors, Their Niche, and the Developing Skin. Dev. Cell 2015, 34, 577–591. [Google Scholar] [CrossRef]
  30. Mill, P.; Lee, A.W.S.; Fukata, Y.; Tsutsumi, R.; Fukata, M.; Keighren, M.; Porter, R.M.; McKie, L.; Smyth, I.; Jackson, I.J. Palmitoylation Regulates Epidermal Homeostasis and Hair Follicle Differentiation. PLoS Genet. 2009, 5, e1000748. [Google Scholar] [CrossRef]
  31. Ansari-Renani, H.R.; Hynd, P.I. Cortisol-Induced Follicle Shutdown Is Related to Staple Strength in Merino Sheep. Livest. Prod. Sci. 2001, 69, 279–289. [Google Scholar] [CrossRef]
  32. Narayan, E.; Sawyer, G.; Fox, D.; Smith, R.; Tilbrook, A. Interplay Between Stress and Reproduction: Novel Epigenetic Markers in Response to Shearing Patterns in Australian Merino Sheep (Ovis Aries). Front. Vet. Sci. 2022, 9, 830450. [Google Scholar] [CrossRef]
  33. Shin, M.; Bryant, J.D.; Momb, J.; Appling, D.R. Mitochondrial MTHFD2L Is a Dual Redox Cofactor-Specific Methylenetetrahydrofolate Dehydrogenase/Methenyltetrahydrofolate Cyclohydrolase Expressed in Both Adult and Embryonic Tissues. J. Biol. Chem. 2014, 289, 15507–15517. [Google Scholar] [CrossRef]
  34. Ducker, G.S.; Rabinowitz, J.D. One-Carbon Metabolism in Health and Disease. Cell Metab. 2017, 25, 27–42. [Google Scholar] [CrossRef]
  35. Ghadieh, H.E.; Kanaan, A.; Harb, F. Exploring the Complexities of 1C Metabolism: Implications in Aging and Neurodegenerative Diseases. Front. Aging Neurosci. 2024, 15, 1322419. [Google Scholar] [CrossRef]
  36. Chakravarti, S.; Enzo, E.; Rocha Monteiro de Barros, M.; Maffezzoni, M.B.R.; Pellegrini, G. Genetic Disorders of the Extracellular Matrix: From Cell and Gene Therapy to Future Applications in Regenerative Medicine. Annu. Rev. Genom. Hum. Genet. 2022, 23, 193–222. [Google Scholar] [CrossRef]
  37. Mak, K.M.; Png, C.Y.M.; Lee, D.J. Type V Collagen in Health, Disease, and Fibrosis. Anat. Rec. 2016, 299, 613–629. [Google Scholar] [CrossRef] [PubMed]
  38. Ben Jemaa, S.; Mastrangelo, S.; Carta, F.; Riggio, S.; Dimauro, C.; Persichilli, C.; Portolano, B.; Senczuk, G.; Cesarani, A. Genome-Wide Identification of Selection Signatures across Altitudinal Gradients in Dairy Sheep Breeds. Sci. Rep. 2025, 15, 29117. [Google Scholar] [CrossRef]
  39. Moradi, M.H.; Nejati-Javaremi, A.; Moradi-Shahrbabak, M.; Dodds, K.G.; McEwan, J.C. Genomic Scan of Selective Sweeps in Thin and Fat Tail Sheep Breeds for Identifying of Candidate Regions Associated with Fat Deposition. BMC Genet. 2012, 13, 10. [Google Scholar] [CrossRef]
  40. Peters, E.M.J.; Hansen, M.G.; Overall, R.W.; Nakamura, M.; Pertile, P.; Klapp, B.F.; Arck, P.C.; Paus, R. Control of Human Hair Growth by Neurotrophins: Brain-Derived Neurotrophic Factor Inhibits Hair Shaft Elongation, Induces Catagen, and Stimulates Follicular Transforming Growth Factor β2 Expression. J. Investig. Dermatol. 2005, 124, 675–685. [Google Scholar] [CrossRef] [PubMed]
  41. Botchkarev, V.A.; Botchkareva, N.V.; Welker, P.; Metz, M.; Lewin, G.R.; Subramaniam, A.; Bulfone-Paus, S.; Hagen, E.; Braun, A.; Lommatzsch, M.; et al. A New Role for Neurotrophins: Involvement of Brain-Derived Neurotrophic Factor and Neurotrophin-4 in Hair Cycle Control. FASEB J. Off. Publ. Fed. Am. Soc. Exp. Biol. 1999, 13, 395–410. [Google Scholar] [CrossRef] [PubMed]
  42. Terashima, M.; Kobayashi, M.; Motomiya, M.; Inoue, N.; Yoshida, T.; Okano, H.; Iwasaki, N.; Minami, A.; Matsuoka, I. Analysis of the Expression and Function of BRINP Family Genes during Neuronal Differentiation in Mouse Embryonic Stem Cell-Derived Neural Stem Cells. J. Neurosci. Res. 2010, 88, 1387–1393. [Google Scholar] [CrossRef]
  43. Grimsley, C.M.; Kinchen, J.M.; Tosello-Trampont, A.-C.; Brugnera, E.; Haney, L.B.; Lu, M.; Chen, Q.; Klingele, D.; Hengartner, M.O.; Ravichandran, K.S. Dock180 and ELMO1 Proteins Cooperate to Promote Evolutionarily Conserved Rac-Dependent Cell Migration. J. Biol. Chem. 2004, 279, 6087–6097. [Google Scholar] [CrossRef]
  44. Epting, D.; Slanchev, K.; Boehlke, C.; Hoff, S.; Loges, N.T.; Yasunaga, T.; Indorf, L.; Nestel, S.; Lienkamp, S.S.; Omran, H.; et al. The Rac1 Regulator ELMO Controls Basal Body Migration and Docking in Multiciliated Cells through Interaction with Ezrin. Development 2015, 142, 174–184. [Google Scholar] [CrossRef]
  45. Manohar, A.; Shome, S.G.; Lamar, J.; Stirling, L.; Iyer, V.; Pumiglia, K.; DiPersio, C.M. α3β1 Integrin Promotes Keratinocyte Cell Survival through Activation of a MEK/ERK Signaling Pathway. J. Cell Sci. 2004, 117, 4043–4054. [Google Scholar] [CrossRef] [PubMed]
  46. Chang, D.C.; Xu, N.; Luo, K.Q. Degradation of Cyclin B Is Required for the Onset of Anaphase in Mammalian Cells. J. Biol. Chem. 2003, 278, 37865–37873. [Google Scholar] [CrossRef]
  47. Gunkel, P.; Iino, H.; Krull, S.; Cordes, V.C. ZC3HC1 Is a Novel Inherent Component of the Nuclear Basket, Resident in a State of Reciprocal Dependence with TPR. Cells 2021, 10, 1937. [Google Scholar] [CrossRef] [PubMed]
  48. Doll, S.; Freitas, F.P.; Shah, R.; Aldrovandi, M.; da Silva, M.C.; Ingold, I.; Goya Grocin, A.; Xavier da Silva, T.N.; Panzilius, E.; Scheel, C.H.; et al. FSP1 Is a Glutathione-Independent Ferroptosis Suppressor. Nature 2019, 575, 693–698. [Google Scholar] [CrossRef] [PubMed]
  49. Piñeiro-Sánchez, M.L.; Goldstein, L.A.; Dodt, J.; Howard, L.; Yeh, Y.; Chen, W.-T. Identification of the 170-KDa Melanoma Membrane-Bound Gelatinase (Seprase) as a Serine Integral Membrane Protease. J. Biol. Chem. 1997, 272, 7595–7601. [Google Scholar] [CrossRef]
  50. Mentlein, R. Dipeptidyl-Peptidase IV (CD26)-Role in the Inactivation of Regulatory Peptides. Regul. Pept. 1999, 85, 9–24. [Google Scholar] [CrossRef]
Figure 1. Visualization of overlapping and unique SNPs identified by LASSO, Ridge, and ElasticNet models for each wool trait using an UpSet plot. The plot shows the number of SNPs commonly selected across models for fiber diameter (green), fiber length (purple), and greasy fleece yield (red). Vertical bars indicate the size of each intersection, and colored horizontal bars represent the total number of SNPs per model-trait combination.
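The intersection sizes that an UpSet plot displays can be computed from the per-model SNP sets. The following is a minimal sketch, assuming that each intersection counts SNPs selected by exactly that combination of models and that a "consensus" SNP is one selected by all three; neither convention is stated in the caption. The SNP identifiers and the function name `exclusive_intersections` are hypothetical and do not reproduce the study's selections.

```python
def exclusive_intersections(sets_by_model):
    """Group SNPs by the exact combination of models that selected them."""
    models = list(sets_by_model)
    all_snps = set().union(*sets_by_model.values())
    groups = {}
    for snp in all_snps:
        # The set of models whose selection contains this SNP.
        members = frozenset(m for m in models if snp in sets_by_model[m])
        groups.setdefault(members, set()).add(snp)
    return groups


# Hypothetical SNP IDs, for illustration only (not the study's results).
selected = {
    "LASSO": {"snp1", "snp2", "snp3"},
    "Ridge": {"snp2", "snp3", "snp4", "snp5"},
    "ElasticNet": {"snp1", "snp2", "snp3", "snp6"},
}

groups = exclusive_intersections(selected)

# Assumed definition of the consensus panel: selected by every model.
consensus = groups.get(frozenset(selected), set())

# Print each model combination with its intersection size, largest first.
for members, snps in sorted(groups.items(), key=lambda kv: (-len(kv[1]), sorted(kv[0]))):
    print(sorted(members), len(snps))
```

Each printed row corresponds to one vertical bar of the plot; the size of each input set corresponds to a horizontal bar.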
Figure 2. Bar plots of the top-ranked SNPs based on importance scores from LASSO, Ridge, and Elastic Net models. The color intensity of the bars represents the absolute coefficient value, with darker shades indicating higher coefficient magnitudes.
Figure 3. Bar plots showing the top 20 SNPs with the highest importance scores for each wool trait. (a) Fiber diameter and (b) staple length were analyzed using the Support Vector Regression (SVR) model, while (c) greasy fleece yield was analyzed using the Random Forest (RF) model. Within each trait, SNPs are ranked by their importance values, and bar colors represent a gradient corresponding to their relative contribution. The use of separate scales allows visualization of trait-specific variation in SNP effects, while the consistent layout facilitates comparison across traits.
Table 1. Descriptive statistics for wool traits.

| Trait | N | Mean | Min | Max | SD | CV |
|---|---|---|---|---|---|---|
| Fiber Diameter | 228 | 21.74 | 15.16 | 34.52 | 2.39 | 0.11 |
| Fiber Length | 228 | 33.92 | 15.0 | 65.00 | 9.97 | 0.29 |
| Greasy Wool Yield | 62 | 4.04 | 2.24 | 6.42 | 0.82 | 0.20 |

Min = Minimum; Max = Maximum; SD = Standard deviation; CV = Coefficient of variation.
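As a quick consistency check, the CV column can be recomputed from the reported means and standard deviations. The sketch below assumes CV is expressed as the plain ratio SD / Mean (not as a percentage), which matches the magnitudes in Table 1; the function name is illustrative.

```python
# (mean, SD) pairs copied from Table 1.
table1 = {
    "Fiber Diameter": (21.74, 2.39),
    "Fiber Length": (33.92, 9.97),
    "Greasy Wool Yield": (4.04, 0.82),
}


def coefficient_of_variation(mean, sd):
    """CV as a ratio of standard deviation to mean (assumed convention)."""
    return sd / mean


cv = {
    trait: round(coefficient_of_variation(mean, sd), 2)
    for trait, (mean, sd) in table1.items()
}
print(cv)
```

Rounded to two decimals, the recomputed ratios agree with the tabulated CV values.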
Table 2. Predictive Performance (RMSE and R²) of LASSO, Ridge, and ElasticNet Models for Wool Trait Feature Selection.

| Trait/Model | LASSO RMSE | LASSO R² | Ridge RMSE | Ridge R² | ElasticNet RMSE | ElasticNet R² |
|---|---|---|---|---|---|---|
| Fiber diameter | 0.86 | 0.86 | 1.24 | 0.72 | 0.89 | 0.86 |
| Fiber length | 8.05 | 0.29 | 8.76 | 0.17 | 8.16 | 0.27 |
| Greasy wool yield | 0.67 | 0.32 | 0.74 | 0.17 | 0.67 | 0.32 |

Note: Values in bold indicate the best model performance (highest R² and lowest RMSE) for each trait.
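For readers less familiar with the two metrics, the sketch below computes RMSE and R² on a handful of made-up observations. It assumes R² is the coefficient of determination, 1 − SS_res / SS_tot; some modeling packages instead report the squared correlation between observed and predicted values, and the tables do not state which definition was used. The function names and the data are illustrative only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted values, not data from the study.
observed = [20.0, 22.0, 24.0, 21.0, 23.0]
predicted = [20.5, 21.5, 23.0, 21.5, 23.5]

print(round(rmse(observed, predicted), 3))
print(round(r_squared(observed, predicted), 3))
```

RMSE is in the units of the trait, so it is only comparable within a row of the table, whereas R² is unitless and comparable across traits.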
Table 3. Predictive Performance (RMSE and R²) of Random Forest, XGBoost, and SVR Models for Wool Trait Association Analysis.

| Trait/Model | Random Forest RMSE | Random Forest R² | XGBoost RMSE | XGBoost R² | SVR RMSE | SVR R² |
|---|---|---|---|---|---|---|
| Fiber diameter | 0.86 | 0.86 | 1.24 | 0.72 | 0.89 | 0.86 |
| Fiber length | 9.28 | 0.21 | 9.53 | 0.17 | 8.21 | 0.38 |
| Greasy wool yield | 0.47 | 0.40 | 0.67 | 0.20 | 0.50 | 0.31 |

Note: Values in bold indicate the best model performance (highest R² and lowest RMSE) for each trait.
Table 4. Annotated SNPs Associated with Wool Traits, with Importance Scores and Candidate Genes.

| Trait | SNP Name | Chr. | Position (bp) | Importance ᵃ | Associated Genes | Distance |
|---|---|---|---|---|---|---|
| FD | s33129.1 | 9 | 88,451,403 | 0.086982427 | ENSOARG00000011475 | ~52 Kbp |
| FD | OAR2_87651563.1 | 2 | 82,493,239 | 0.084604564 | - | - |
| FD | OAR2_85886786.1 | 2 | 80,773,104 | 0.080407321 | ENSOARG00000013809, ENSOARG00000013822 | ~36 Kbp, ~73 Kbp |
| FD | s41235.1 | 6 | 88,867,135 | 0.076554375 | MTHFD2L, EPGN | Within, ~45 Kbp |
| FD | OAR1_145170621.1 | 1 | 133,964,363 | 0.074748211 | NCAM2 | ~78 Kbp |
| FD | s05062.1 | 10 | 71,358,215 | 0.074079122 | ENSOARG00000001282, ENSOARG00000001372 | ~64 Kbp, ~85 Kbp |
| FD | OAR16_20816595.1 | 16 | 18,929,809 | 0.073739797 | ENSOARG00000007198 | Within |
| FD | OAR4_21300620.1 | 4 | 20,430,233 | 0.069173084 | TMEM106B, ENSOARG00000025221, ENSOARG00000007333, ENSOARG00000007341, rpl23a | ~62 Kbp, ~10 Kbp, ~36 Kbp, ~56 Kbp, ~77 Kbp |
| FD | s56953.1 | 13 | 60,278,589 | 0.068444192 | DEFB119, DEFB123, DEFB124, REM1 | ~39 Kbp, ~58 Kbp, ~92 Kbp, ~99 Kbp |
| FD | s50275.1 | 5 | 29,091,786 | 0.064427904 | SNCAIP, ENSOARG00000000312 | ~77 Kbp, ~26 Kbp |
| FD | OAR5_56291790.1 | 5 | 51,806,023 | 0.06440768 | NR3C1 | Within |
| FD | OAR1_19174177.1 | 1 | 19,035,568 | 0.064146978 | RNF220, TMEM53, ARMH1 | ~22 Kbp, ~62 Kbp, ~95 Kbp |
| FD | s56018.1 | 3 | 18,491,689 | 0.063794119 | ASAP2 | Within |
| FD | OAR9_5634331.1 | 9 | 5,696,218 | 0.06351038 | ADGRB3 | ~487 Kbp |
| FD | s40686.1 | 6 | 114,036,998 | 0.063092884 | SH3TC1, ENSOARG00000012784, ENSOARG00000012922 | ~79 Kbp, ~18 Kbp, ~18 Kbp |
| FL | OAR14_15485140.1 | 14 | 15,189,288 | 0.04036412 | ITFG1 | Within |
| FL | OAR18_9017434.1 | 18 | 9,202,860 | 0.04578853 | ENSOARG00000022262, ENSOARG00000026425 | ~85 Kbp, ~64 Kbp |
| FL | OAR2_127374669.1 | 2 | 119,169,983 | 0.037659889 | COL5A2, COL3A1 | ~60 Kbp, ~59 Kbp |
| FL | OAR2_2498940.1 | 2 | 4,281,852 | 0.037285693 | BRINP1, ENSOARG00000025736 | ~19 Kbp, ~39 Kbp |
| FL | OAR2_27783649.1 | 2 | 26,911,390 | 0.070589085 | SPTLC1 | ~20 Kbp |
| FL | OAR2_36125921.1 | 2 | 34,836,148 | 0.041386766 | NTRK2, ENSOARG00000022677 | Within, ~58 Kbp |
| FL | OAR26_34945353.1 | 26 | 30,611,740 | 0.045379879 | ENSOARG00000026797, ENSOARG00000026798 | Within, ~93 Kbp |
| FL | OAR4_63737955.1 | 4 | 60,265,782 | 0.046813613 | ELMO1 | Within |
| FL | OAR4_99615533.1 | 4 | 93,955,947 | 0.039335397 | ZC3HC1, ENSOARG00000004950, U6, TMEM209, SSMEM1 | ~80 Kbp, Within, ~16 Kbp, ~33 Kbp, ~61 Kbp |
| FL | s09809.1 | 25 | 26,191,420 | 0.071425692 | AIFM2, TYSND1, SAR1A | ~38 Kbp, ~58 Kbp, Within |
| FL | s67646.1 | 3 | 98,031,792 | 0.047237622 | ENSOARG00000026004 | ~74 Kbp |
| GWY | OAR1_149428483.1 | 1 | 138,107,415 | 1.949006396 | - | - |
| GWY | OAR1_248210458.1 | 1 | 230,110,954 | 1.557074499 | PLCH1 | ~57 Kbp |
| GWY | OAR2_100315266.1 | 2 | 93,253,107 | 2.15690926 | ENSOARG00000008914 | ~36 Kbp |
| GWY | OAR2_121679731.1 | 2 | 113,779,061 | 2.449371931 | ARHGEF4, FAM168B, PLEKHB2 | ~48 Kbp, Within, ~44 Kbp |
| GWY | OAR2_129873469.1 | 2 | 121,449,655 | 0.826171159 | ZSWIM2, FAM171B | ~21 Kbp, ~33 Kbp |
| GWY | OAR2_133464609.1 | 2 | 125,280,477 | 2.508589292 | - | - |
| GWY | OAR2_155562989.1 | 2 | 146,559,867 | 1.660832568 | FAP, GCG, DPP4 | ~118 Kbp, ~16 Kbp, ~73 Kbp |
| GWY | OAR2_179267347.1 | 2 | 169,227,138 | 2.399497451 | - | - |
| GWY | OAR3_41046441.1 | 3 | 38,239,575 | 1.426609273 | PCBP1, ASPRV1, MXD1 | ~60 Kbp, ~47 Kbp, ~70 Kbp |
| GWY | OAR4_23258944.1 | 4 | 22,111,787 | 1.86407227 | ETV1 | ~31 Kbp |
| GWY | OAR8_94906557.1 | 8 | 88,032,424 | 1.03960191 | MPC1, RPS6KA2 | ~37 Kbp, ~18 Kbp |
| GWY | s18611.1 | 2 | 99,959,166 | 1.721905126 | - | - |
| GWY | s42578.1 | 1 | 121,550,146 | 1.816074052 | MIS18A, RPTOR | ~70 Kbp, ~306 Kbp |
| GWY | s58015.1 | 11 | 51,177,431 | 2.787717766 | NPTX1, ENDOV, RNF213 | ~23 Kbp, ~66 Kbp, ~87 Kbp |
| GWY | s64605.1 | 1 | 244,002,972 | 1.525014597 | U2SURP | Within |
| GWY | s71567.1 | 12 | 32,583,236 | 2.284989619 | PLD5 | ~10 Kbp |

ᵃ Importance scores represent weight-based estimates from SVR models. For Random Forest, feature relevance was assessed using Mean Decrease in Impurity.
Share and Cite

MDPI and ACS Style

Arzık, Y.; Kizilaslan, M.; Behrem, S.; Tütenk, S.; Çınar, M.U. Two-Stage Machine Learning-Based GWAS for Wool Traits in Central Anatolian Merino Sheep. Agriculture 2025, 15, 2287. https://doi.org/10.3390/agriculture15212287
