Article

Genetic Architecture of Fruit Color and Morphology Revealed by Image-Based Phenotyping and Genome-Wide Association Analysis in Octoploid Strawberry

1
Vegetable Research Division, National Institute of Horticultural and Herbal Science, Rural Development Administration, Jeonju 55365, Republic of Korea
2
Department of Horticulture, Chungbuk National University, Cheongju 28644, Republic of Korea
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Horticulturae 2026, 12(5), 547; https://doi.org/10.3390/horticulturae12050547
Submission received: 6 April 2026 / Revised: 26 April 2026 / Accepted: 27 April 2026 / Published: 29 April 2026

Abstract

Cultivated strawberry (Fragaria × ananassa) is an allo-octoploid for which the genetic basis of fruit appearance traits has not been comprehensively elucidated. This study investigated the genetic architecture of fruit color and morphological traits using integrated digital phenotyping and genome-wide association analysis of a core collection of diverse strawberry germplasm maintained for Korean breeding programs. A 108-accession core collection was assembled, genotyped, and phenotyped for 12 fruit quality traits. Population structure analysis identified K = 10 genetic clusters, and a Mantel test confirmed significant genotype–phenotype correspondence (r = 0.38, p < 0.001). Genome-wide association studies (GWAS) using BLINK and MLMM identified 15 significant marker–trait associations across six traits. Pleiotropic loci on chromosomes 15 (4C) and 22 (6B) were consistently associated with fruit lightness (L*) and red channel intensity (R) in both models, and the 6B locus explained approximately 18% of the phenotypic variance for each trait. Gene Ontology enrichment implicated transcriptional regulation, SUMOylation, and plastid-to-chromoplast transition, suggesting that the identified loci influenced fruit coloration through cellular regulatory mechanisms rather than direct pigment biosynthesis. These findings provide a genomic foundation for dual-trait marker-assisted selection targeting light and vividly red fruits for strawberry breeding.

1. Introduction

Cultivated strawberry (Fragaria × ananassa) is one of the most economically important fruits worldwide, with global production reaching approximately 10.7 million tons from 436,067 hectares in 2024 (FAOSTAT, 2024). China is by far the largest producer (4.1 million tons, approximately 38% of global output), followed by the United States (1.5 million tons), Egypt (881,000 tons), Mexico (696,000 tons), and Türkiye (606,000 tons) (FAOSTAT, 2024). The Republic of Korea ranks among the top ten strawberry-producing countries with approximately 169,000 tons in 2024, and strawberry represents the largest vegetable crop by production value in the country, with domestic cultivars such as ‘Seolhyang’, ‘Maehyang’, and ‘Kuemsil’ now covering more than 96% of the national market [1]. Consumer preference studies across major markets consistently identify a common core of quality traits that drive purchasing decisions, including sweetness, flavor, firmness, and fruit appearance, with bright red color, uniform size, and glossy surface being particularly important for fresh-market sales [2,3,4]. Regional and end-use differences also exist, with fresh-market and export channels placing stronger emphasis on firmness and shelf life to withstand long-distance shipping, whereas processing markets prioritize internal color stability and sugar content [4,5]. These varied market demands highlight the importance of breeding strategies that can simultaneously improve multiple fruit quality traits.
Cultivated strawberry is an allo-octoploid species (2n = 8x = 56) that originated from a complex and stepwise history of interspecific hybridization and polyploidization involving at least four diploid progenitor species. Phylogenomic evidence indicates that these ancestral genomes, derived from geographically and temporally distinct Fragaria lineages, were assembled through successive hybridization events followed by chromosome doubling, ultimately giving rise to the modern cultivated strawberry genome [6,7,8]. This genomic complexity results in high heterozygosity and extensive sub-genomic interactions that often obscure the genetic basis of key horticultural traits such as fruit color, shape, and overall quality [6,8,9].
Recent advances in genomic resources have significantly improved our ability to study the genetic architecture of complex strawberry traits. High-quality chromosome-scale genome assemblies and large-scale genotyping platforms have provided valuable frameworks for exploring genetic variation and trait-associated loci [8,9,10,11]. Despite these advances, the translation of genomic information into practical breeding applications remains challenging. In particular, identifying specific genetic variants responsible for complex phenotypes across diverse germplasm collections remains a major bottleneck in strawberry genomics and breeding [10,12].
A key prerequisite for effective genomic analyses is the establishment of structured germplasm panels that adequately represent the standing genetic variation within breeding populations. Core collections are widely used to achieve this objective by capturing the maximum genetic diversity with a minimal number of accessions, thereby improving the efficiency and statistical power of downstream analyses such as genome-wide association studies (GWAS) [13,14]. Although national germplasm repositories maintain extensive genetic diversity, breeding germplasm needs to be systematically characterized to ensure that genomic analyses reflect the population structure relevant to regional breeding programs. Evaluating locally adapted germplasm resources allows breeders to account for population stratification and develop genomic tools that are optimized for region-specific breeding targets [10,14].
Many breeding-relevant traits in cultivated strawberries, including yield, disease resistance, and fruit quality, exhibit continuous phenotypic variations that are shaped by both genetic and environmental factors. These traits are typically governed by numerous loci with small to moderate effects, and the genetic architecture is complicated by the allo-octoploid genome organization of strawberries [6,8]. Therefore, GWAS are widely used for dissecting the genetic architecture of complex traits in strawberries. Previous studies using diverse germplasm panels have successfully identified quantitative trait loci (QTLs) associated with key horticultural traits, including fruit firmness (FaPG1), pigmentation and anthocyanin regulation (FaMYB10), as well as major fruit quality components such as sugars and organic acids [10,15,16]. Collectively, these studies demonstrate the potential of GWAS to link natural genetic variation with agronomically important phenotypes.
However, the effectiveness of GWAS is strongly influenced by the quality and resolution of the phenotypic data. Conventional phenotyping approaches for fruit quality traits often rely on manual measurements or subjective visual scoring, which are labor-intensive and difficult to scale across large populations. This limitation, commonly referred to as the “phenotyping bottleneck,” has emerged as a major constraint in crop genetics and breeding [12,17,18]. This challenge is particularly pronounced in polyploid crops, such as strawberries, where homoeologous gene interactions and gene dosage effects complicate trait expression [7,19]. Therefore, insufficient phenotypic resolution can obscure the underlying genetic signals and reduce the statistical power of association analyses. To address these limitations, high-throughput image-based phenotyping approaches have recently garnered attention as powerful tools in crop research. For strawberries, digital image analysis enables the rapid and objective quantification of fruit morphological traits, including fruit length, width, area, and circularity. In addition, color descriptors derived from the RGB and Commission Internationale de l’Éclairage L*a*b* color spaces provide nondestructive proxies for fruit pigmentation and color intensity, which are closely related to anthocyanin accumulation [19,20]. Compared with conventional measurements, image-derived phenotypes offer increased precision, improved reproducibility, and greater sensitivity to continuous variations in fruit appearance. When combined with high-density SNP genotyping platforms, such phenotypic datasets substantially enhance the ability of GWAS to detect genetic effects that may otherwise be masked by phenotypic noise [9,10].
Building on these advances, the present study aimed to dissect the genetic architecture of fruit appearance traits in cultivated strawberries using an integrated genomics and digital phenotyping framework. We constructed a core collection of 108 accessions selected from the strawberry germplasm repository at the National Institute of Horticultural and Herbal Science (NIHHS), representing accessions of diverse geographic origins. An automated RGB image-based phenotyping pipeline was implemented to generate high-resolution measurements of fruit morphology and color across the core collection. These phenotypic datasets were integrated with genome-wide SNP markers to perform GWAS and identify the major loci and candidate pleiotropic genomic regions associated with fruit quality traits. By combining population-level genomics with objective digital phenotyping, this study provides new insights into the genetic control of fruit appearance, and contributes to the development of more precise and predictive breeding strategies for strawberry improvement.

2. Materials and Methods

2.1. Plant Materials and Fruit Sample Preparation

A core collection of 108 strawberry accessions was established for this study. Starting from a broader germplasm panel of 253 accessions maintained at NIHHS (Wanju-gun, Korea; 35°50′02.0″ N, 127°01′60.0″ E), elite cultivars and key parental lines relevant to Korean strawberry breeding were prioritized, followed by additional accessions selected using the ShinyCore program with a target allelic diversity coverage of 90–95%, yielding the final panel of 108 accessions. All accessions were cultivated in a high-bed system within an environmentally controlled greenhouse at the NIHHS, with 14–16 plants per accession grown under uniform conditions. Plants were grown on a commercial substrate (Hanareum A; Shinsung Mineral Co., Ltd., Seongnam-si, Republic of Korea) and supplied with a standard strawberry nutrient solution via automated fertigation. Greenhouse temperature was maintained between 7 °C (night) and 25 °C (day). Fruits were harvested during the peak production period (January–February 2025) between 08:00 and 10:00 to minimize diurnal physiological variation. For phenotypic evaluation, 5–7 marketable fruits per accession at approximately 80% surface coloration (commercial harvest stage) were sampled from the first inflorescence. Only undamaged fruits were selected, and samples were immediately transferred to the laboratory for image acquisition and physical quality assessment. Individual fruits were measured separately, and the resulting values were averaged to obtain a single mean value per accession for each trait. These accession-level mean values were used for the genome-wide association analysis.

2.2. DNA Extraction and SNP Genotyping

Genomic DNA was extracted from young leaf tissues using the cetyltrimethylammonium bromide (CTAB) method [21] with minor modifications. DNA concentration and purity were assessed by spectrophotometry (NanoDrop, Thermo Fisher Scientific, Waltham, MA, USA). Genome-wide SNP genotyping was performed using the Axiom® 50 K FanaSNP array (Thermo Fisher Scientific). Genotype calling was conducted with Axiom Analysis Suite software v5.3.0.45 (Thermo Fisher Scientific, Waltham, MA, USA) using the BRLMM-P algorithm. Samples with dish quality control (DQC) values ≥ 0.82 and call rates ≥ 97% were retained. SNP markers were filtered for minor allele frequency (MAF > 0.05), call rate (>95%), and Hardy–Weinberg equilibrium. Missing genotypes were imputed using the major allele approach in PLINK v1.9 [22]. After quality control, 29,016 high-quality SNPs were retained for genome-wide association analyses.
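The marker-level filters described above can be illustrated with a minimal NumPy sketch. It is not the Axiom Analysis Suite or PLINK workflow used in the study: the function name `filter_and_impute`, the 0/1/2 dosage coding with `NaN` for missing calls, and the reading of "major allele approach" as filling missing calls with the homozygous major genotype are all assumptions made for illustration, and the Hardy–Weinberg filter is omitted.

```python
import numpy as np

def filter_and_impute(geno, maf_min=0.05, call_rate_min=0.95):
    """Filter SNPs by call rate and MAF, then impute missing calls.

    geno: (n_samples, n_snps) array of alternative-allele dosages (0, 1, 2),
          with np.nan marking missing genotypes (assumed coding).
    Returns the filtered, imputed matrix and the boolean keep mask.
    """
    geno = np.asarray(geno, dtype=float)
    called = ~np.isnan(geno)
    n_called = called.sum(axis=0)
    call_rate = called.mean(axis=0)

    # Alternative-allele frequency among called genotypes (2 alleles per call).
    alt_freq = np.nansum(geno, axis=0) / (2 * np.maximum(n_called, 1))
    maf = np.minimum(alt_freq, 1.0 - alt_freq)

    keep = (call_rate > call_rate_min) & (maf > maf_min)
    kept = geno[:, keep].copy()

    # Assumed imputation rule: missing calls become the homozygous major genotype.
    major_hom = np.where(alt_freq[keep] > 0.5, 2.0, 0.0)
    rows, cols = np.where(np.isnan(kept))
    kept[rows, cols] = major_hom[cols]
    return kept, keep
```

Applied to a samples-by-markers matrix, the returned mask records which markers survive, so the same selection can be reused for marker metadata such as chromosome and position.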

2.3. Population Structure and Genetic Diversity Analysis

For population structure and genetic diversity analyses, a refined subset of 13,025 SNPs was selected from the 29,016 quality-filtered markers (Section 2.2) by retaining subgenome-specific markers and applying a stricter Hardy–Weinberg equilibrium threshold (p ≥ 0.0001) to minimize confounding from cross-subgenome hybridization. Population structure was inferred using STRUCTURE v2.3.4 under an admixture model with correlated allele frequencies, with three independent runs for each K (1–12; burn-in 5000; 50,000 MCMC iterations). The optimal K was determined by the Evanno ΔK method, and replicate runs at the optimal K were merged using CLUMPP. Accessions with Q < 0.60 for all clusters were classified as admixed. Shannon–Wiener diversity (H) and Nei’s unbiased gene diversity (Hexp) were calculated using the R package poppr. An unrooted weighted neighbor-joining tree was constructed in DARwin v6 based on simple Euclidean dissimilarity.
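As a rough illustration of the Evanno ΔK criterion and the Q < 0.60 admixture rule, the sketch below computes ΔK from replicate log-likelihoods and assigns accessions to clusters. The function names and the input layout are hypothetical, and taking the absolute second difference per run before averaging is an assumption about the order of operations; it does not reproduce the STRUCTURE or CLUMPP output formats.

```python
import numpy as np

def evanno_delta_k(lnp):
    """Delta K = mean(|L(K+1) - 2 L(K) + L(K-1)|) / sd(L(K)).

    lnp: dict mapping consecutive K values to a list of ln P(D) values,
         one per replicate run (same number of runs for every K).
    Returns a dict of Delta K for the interior K values.
    """
    ks = sorted(lnp)
    L = np.array([lnp[k] for k in ks], dtype=float)    # shape (n_K, n_runs)
    second = np.abs(L[2:] - 2.0 * L[1:-1] + L[:-2])    # |L''(K)| for each run
    sd = L[1:-1].std(axis=1, ddof=1)                   # spread across runs at K
    return {k: m / s for k, m, s in zip(ks[1:-1], second.mean(axis=1), sd)}

def assign_clusters(q, threshold=0.60):
    """Assign each accession to its highest-membership cluster.

    q: (n_accessions, K) membership matrix; accessions whose largest
       coefficient is below the threshold are labelled -1 (admixed).
    """
    q = np.asarray(q, dtype=float)
    labels = q.argmax(axis=1)
    labels[q.max(axis=1) < threshold] = -1
    return labels
```

With three runs per K, as used here, the standard deviation in the denominator rests on very few values, so ΔK peaks can be sensitive to a single divergent run.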

2.4. Digital Image Acquisition and Phenotyping Platform

A standardized imaging system was established using a mirrorless camera (Alpha 6400; Sony, Tokyo, Japan) mounted in a fixed top-down position inside a commercial photobox (DH-01; Daehan Mall Co., Ltd., Suwon, Republic of Korea) with integrated LED lighting. Each fruit was longitudinally bisected, with the external surface (exocarp) positioned in the upper section and the internal flesh (mesocarp) in the lower section of the imaging frame, enabling the simultaneous quantification of surface and internal quality traits. All images were acquired against a uniform matte background using ColorChecker Classic Mini (Calibrite, Grand Rapids, MI, USA) for post-acquisition color calibration.
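One common way to perform chart-based color calibration is a least-squares fit that maps the measured patch colors onto their reference values. The sketch below assumes an affine model (a 3 × 3 matrix plus an offset) and 8-bit RGB values; whether the study's pipeline included an offset term is not stated, and the function names are hypothetical.

```python
import numpy as np

def fit_color_matrix(measured, reference):
    """Fit an affine RGB correction by ordinary least squares.

    measured, reference: (n_patches, 3) arrays of mean patch RGB values.
    Returns a (4, 3) matrix M such that [R, G, B, 1] @ M approximates
    the reference color.
    """
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    design = np.hstack([measured, np.ones((measured.shape[0], 1))])
    M, *_ = np.linalg.lstsq(design, reference, rcond=None)
    return M

def apply_color_matrix(image, M):
    """Apply the fitted correction to an (H, W, 3) RGB image."""
    h, w, _ = image.shape
    flat = image.reshape(-1, 3).astype(float)
    design = np.hstack([flat, np.ones((flat.shape[0], 1))])
    corrected = design @ M
    # Clip to the 8-bit range assumed for the input image.
    return np.clip(corrected, 0, 255).reshape(h, w, 3)
```

Fitting on the 24 chart patches and applying the result to every image taken under the same lighting keeps color values comparable across imaging sessions.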
Images were processed using a custom Python (v3.13) pipeline. A color calibration matrix was derived via linear regression using a reference color checker. Background removal was performed using blue-channel thresholding, followed by calyx suppression through green-mask expansion and L* space filtering (L* ≥ 120). Pixel dimensions were converted to metric units using a pre-calibrated 5 cm spatial reference. Twelve fruit quality traits were quantified: fruit weight (FrW, g), fruit firmness (FF, N), fruit length (FL, mm), fruit width (FW, mm), fruit area (FA, mm²), fruit circularity (FC), and six colorimetric parameters from the RGB (R, G, B) and CIELAB (L*, a*, b*) color spaces. Fruit circularity (FC) was calculated from the segmented fruit contour as 4π × area/perimeter², where values approach 1 for a perfect circle. Fruit length (FL) and width (FW) were estimated as the major and minor axes, respectively, of the minimum-area rotated bounding rectangle fitted to each segmented fruit region. Fruit area (FA) was defined as the projected two-dimensional area derived from image segmentation. Fruit firmness was measured using a texture analyzer (TA1; Lloyd Instruments, Fareham, UK) with a 5 mm cylindrical probe at 2.5 mm/s, and fruit weight was recorded using a digital scale. The overall pipeline workflow and representative processing outputs are shown in Figure S1.
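The shape descriptors defined above can be sketched with NumPy and SciPy as follows. This is an illustrative reimplementation under stated assumptions, not the study's code: the function names are hypothetical, the contour is assumed to be an array of (x, y) pixel coordinates, and the rotated rectangle is found by testing the orientation of each convex-hull edge.

```python
import numpy as np
from scipy.spatial import ConvexHull

def circularity(area, perimeter):
    """4 * pi * area / perimeter**2; equals 1 for a perfect circle."""
    return 4.0 * np.pi * area / perimeter ** 2

def min_area_rect_axes(points):
    """Side lengths (major, minor) of the minimum-area rotated rectangle.

    points: (n, 2) array of contour coordinates. The optimal rectangle
    shares a side direction with one edge of the convex hull, so only
    those orientations are tested.
    """
    pts = np.asarray(points, dtype=float)
    hull = pts[ConvexHull(pts).vertices]
    edges = np.diff(np.vstack([hull, hull[:1]]), axis=0)
    angles = np.mod(np.arctan2(edges[:, 1], edges[:, 0]), np.pi / 2)
    best = None
    for theta in np.unique(angles):
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, s], [-s, c]])      # rotates points by -theta
        r = hull @ rot.T
        w, h = r.max(axis=0) - r.min(axis=0)
        if best is None or w * h < best[0]:
            best = (w * h, max(w, h), min(w, h))
    return best[1], best[2]

def px_to_mm(length_px, ref_px, ref_mm=50.0):
    """Convert pixels to millimetres using a reference of known length
    (50 mm here, matching the 5 cm spatial reference)."""
    return length_px * ref_mm / ref_px
```

Areas scale with the square of the pixel-to-millimetre factor, so an area in pixels would be multiplied by `(ref_mm / ref_px) ** 2` rather than passed through `px_to_mm` directly.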

2.5. Genome-Wide Association Study

Genome-wide association analysis was performed for all 12 traits using two complementary multilocus models, BLINK [23] and MLMM [24], as implemented in GAPIT3 [25]. Population structure was accounted for using principal components, and a genomic relationship matrix was included as a random effect to control for familial relatedness. Significant marker–trait associations were declared at a Bonferroni-corrected threshold (α = 0.05). For each significant SNP, the phenotypic variance explained (PVE) was calculated as PVE (%) = R² × 100, where R² is the coefficient of determination obtained by regressing the trait value on the SNP allele dosage (coded as 0, 1, or 2 for the alternative allele).
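The two quantities defined in this section reduce to short calculations, sketched below with hypothetical function names. The Bonferroni helper assumes the correction divides α by the number of tested markers; for a single-predictor regression, R² equals the squared Pearson correlation between dosage and trait, which is what the PVE helper uses.

```python
import numpy as np

def bonferroni_neglog10(alpha, n_tests):
    """-log10 of the Bonferroni-corrected p-value threshold alpha / n_tests."""
    return -np.log10(alpha / n_tests)

def pve_percent(dosage, trait):
    """PVE (%) = R^2 * 100 from regressing the trait on allele dosage.

    dosage: alternative-allele counts (0, 1, 2) per accession.
    trait:  accession-level mean trait values.
    """
    r = np.corrcoef(np.asarray(dosage, float), np.asarray(trait, float))[0, 1]
    return 100.0 * r ** 2
```

Under that assumption, α = 0.05 and 29,016 markers give a threshold of about 5.76 on the −log10(P) scale, slightly below the cutoff of 6.0 applied in the Results. Because this PVE is a marginal single-SNP estimate, values for markers in LD with one another are not additive.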

2.6. Software and Data Availability

Images were processed using custom Python scripts (v3.13) with NumPy, SciPy, and Matplotlib. GWAS was performed using GAPIT3 in R (v4.3). Genotype quality control and imputation were conducted in PLINK v1.9 [22].

2.7. Functional Annotation of Candidate Genes

Candidate genes were identified based on linkage disequilibrium (LD) structure derived from this study population. LD blocks surrounding each significant SNP (−log10(P) > 6) were delineated, and all genes located within these LD-defined regions were selected for downstream analyses. This approach was adopted instead of applying a fixed physical window to better reflect the local genomic architecture. Gene sequences were mapped to Arabidopsis thaliana orthologs to facilitate functional interpretation. Gene Ontology (GO) enrichment analysis was performed using DAVID v2024 [26], and enrichment results were visualized using bubble plots.
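The idea of an LD-defined region, as opposed to a fixed physical window, can be illustrated with a simple sketch. The block-delineation method and any r² cutoff used in the study are not stated, so the function `ld_region`, the dosage-correlation estimate of r², and the default cutoff of 0.2 are purely hypothetical choices.

```python
import numpy as np

def ld_region(geno, positions, lead_idx, r2_min=0.2):
    """Span of SNPs in LD with a lead SNP on one chromosome.

    geno:      (n_samples, n_snps) dosage matrix (0, 1, 2), no missing values.
    positions: physical positions (bp) of the SNPs, same order as columns.
    lead_idx:  column index of the lead (significant) SNP.
    r2_min:    hypothetical r^2 cutoff for calling a SNP linked.
    Returns (start_bp, end_bp, r2), with r2 measured against the lead SNP.
    """
    geno = np.asarray(geno, dtype=float)
    positions = np.asarray(positions)
    lead = geno[:, lead_idx]
    r2 = np.array([np.corrcoef(lead, geno[:, j])[0, 1] ** 2
                   for j in range(geno.shape[1])])
    # Monomorphic SNPs give undefined correlations; treat them as unlinked.
    linked = np.nan_to_num(r2) >= r2_min
    return positions[linked].min(), positions[linked].max(), r2
```

Genes whose coordinates overlap the returned interval would then form the candidate set. The sketch does not require linked SNPs to be contiguous, so a locus with weak or fragmented LD can yield a wide interval.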

3. Results

3.1. Genetic Diversity and Population Structure

The core collection of 108 Fragaria × ananassa accessions assembled from 253 strawberry germplasm lines comprised accessions from Korea (n = 48, 44.4%), Japan (n = 25, 23.1%), USA (n = 25, 23.1%), and Europe (n = 10, 9.3%; Table S1). Population structure analysis using STRUCTURE revealed that the optimal number of genetic clusters was K = 10, based on the highest ΔK value (ΔK = 2165.61; Figure 1a). A secondary peak was detected at K = 2 (ΔK = 815.45; Table S3). At K = 10, the 108 accessions were assigned to ten subpopulations (SP1–SP10), with 27 accessions (25.0%) classified as admixed (Q < 0.60; Figure 1b; Table S1). SP5 was the largest subpopulation (n = 32) and consisted predominantly of Korean accessions (87.5%), including ‘Seolhyang’ and ‘Maehyang’. SP1 (n = 15) comprised accessions from the USA (46.7%), Japan (26.7%), Denmark (6.7%), the Netherlands (6.7%), and Korea (6.7%). SP10 (n = 13) mainly consisted of accessions from the USA (61.5%), Israel (23.1%), and the UK (15.4%). The remaining subpopulations (SP2–SP4 and SP6–SP9) contained two to five accessions (Table S1). Genotyping using the Axiom FanaSNP 50 K array yielded 29,016 high-quality SNPs after stringent filtering. The Shannon–Wiener diversity index of the core collection was 4.69, and Nei’s unbiased gene diversity was 0.376, compared with 5.53 and 0.374, respectively, for the total germplasm (Table S2). The unrooted neighbor-joining (NJ) tree was largely consistent with the population structure inferred using STRUCTURE (Figure 1c). The Korean accessions formed a distinct clade, whereas the Japanese accessions were distributed across multiple lineages. Accessions classified as admixed by STRUCTURE were positioned at intermediate nodes in the tree.

3.2. Phenotypic Analysis

The 108 accessions displayed wide phenotypic variations across all 12 fruit quality traits (Table 1, Figure S2). Fruit weight ranged from 3.28 to 50.72 g (mean = 26.1 g, coefficient of variation (CV) = 35.37%), and firmness from 1.60 to 12.45 N (CV = 41.55%), representing the two most variable traits. Morphological traits, such as fruit area (CV = 22.38%) and width (CV = 14.41%), were comparatively stable, whereas the green and blue RGB channels exhibited higher variability (CV > 32%). The frequency distributions for most traits approximated normality (Figure 2a,b). The histogram representations were refined to reduce visual overlap and improve clarity.
Pearson’s correlation analysis revealed several significant inter-trait relationships (Figure 2c). The fruit weight and image-derived fruit area were strongly correlated (r = 0.94, p < 0.001), confirming the reliability of the image-based phenotyping pipeline. Fruit length and width were positively associated with fruit weight, whereas firmness showed a moderately positive correlation with weight (r = 0.36, p < 0.01). Size-related traits were negatively correlated with the CIELAB color values (L*, a*, and b*), indicating that larger fruits tend to exhibit darker and more saturated surface coloration. Principal component analysis showed that PC1 and PC2 accounted for 18.4% and 11.8% of the total phenotypic variance, respectively (Figure 2d). PC1 was primarily driven by morphological and size-related traits (FrW, FA, FL, and FW), whereas colorimetric parameters contributed more to PC2, suggesting that fruit size and color variation were captured by largely independent axes of phenotypic diversity.
A one-way ANOVA across the ten genetic clusters revealed significant phenotypic differentiation for most traits (p < 0.001), with the exception of fruit circularity and a* and b* color values (p > 0.05) (Table S4). Cluster 8 exhibited the highest mean values for fruit weight (46.63 g), area (1919.86 mm²), and width (41.39 mm), while Clusters 3 and 8 showed significantly greater firmness (6.47 and 6.92 N, respectively). A Mantel test between genetic distance and phenotypic Euclidean distance yielded a significant positive correlation (r = 0.38, p < 0.001; Figure S3), with pairwise phenotypic distances ranging from 0.71 to 12.18 (mean = 4.56; Table S5), indicating that the genetic stratification of the core collection corresponded to structured phenotypic differentiation across the evaluated fruit quality traits.
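For readers unfamiliar with the Mantel test, the following sketch shows a permutation-based version in NumPy. It assumes two symmetric distance matrices over the same accessions, a Pearson correlation of their upper triangles, and a one-sided test; the function name, the 999 permutations, and the seed are illustrative choices rather than the settings used in the study.

```python
import numpy as np

def mantel(d1, d2, n_perm=999, seed=0):
    """One-sided Mantel test between two symmetric distance matrices.

    Returns the observed correlation of the upper-triangle entries and
    a permutation p-value obtained by jointly permuting the rows and
    columns of the second matrix.
    """
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    iu = np.triu_indices_from(d1, k=1)
    r_obs = np.corrcoef(d1[iu], d2[iu])[0, 1]

    rng = np.random.default_rng(seed)
    n = d1.shape[0]
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        r_perm = np.corrcoef(d1[iu], d2[p][:, p][iu])[0, 1]
        count += r_perm >= r_obs
    # Add one to numerator and denominator so the p-value is never zero.
    return r_obs, (count + 1) / (n_perm + 1)
```

Permuting whole rows and columns, rather than individual distances, preserves the dependence among the pairwise distances that share an accession, which is why a standard correlation test is not used here.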

3.3. Genome-Wide Association Analysis

GWAS was conducted using the BLINK and MLMM models, with 29,016 high-quality SNPs across 108 strawberry accessions. In total, 15 significant markers were identified at a threshold of −log10(P) ≥ 6.0 (p < 1 × 10⁻⁶), spanning six traits: fruit firmness (FF), fruit width, fruit length, CIE lightness (L*), red channel value (R), and blue channel value (B) (Table 2; Figure 3 and Figures S4–S7). Q-Q plots confirmed adequate control of population stratification, with p-value distributions closely following the null expectation across all 12 traits in both models, and deviation from the diagonal confined to a small number of markers at the upper tail (Figure 3 and Figure S7). These associations were distributed across 11 chromosomes with multiple loci on chromosomes 13 (4A), 15 (4C), 22 (6B), and 8 (2D). For fruit firmness, a single significant SNP, AX-184398615, was detected on chromosome 21 (6A) by BLINK (−log10(P) = 10.68; phenotypic variance explained (PVE) = 21.7%) (Figure S4). For fruit width, three SNPs on chromosomes 3 (1C), 5 (2A), and 22 (6B) were identified by BLINK, with PVE values of 19.4%, 9.0%, and 0.1%, respectively. The association on chromosome 22 (6B, AX-184276260) was detected by BLINK only and exhibited a very low PVE; it should therefore be interpreted with caution. For fruit length, AX-184240674 on chromosome 19 (5D) was significant in both BLINK (−log10(P) = 8.40) and MLMM (−log10(P) = 7.81), explaining 12.0% of the phenotypic variance (Figure S5).
For image-based color traits, 11 significant markers were identified across L*, R, and B (Figures S6 and S7). Three SNPs were associated with L*: AX-184439390 on chromosome 13 (4A) was detected by MLMM only (−log10(P) = 7.01; PVE = 0.1%), while AX-184710565 on chromosome 15 (4C) (−log10(P) = 11.12–11.86; PVE = 5.9%) and AX-184621491 on chromosome 22 (6B) (−log10(P) = 6.42–8.19; PVE = 18.4%) were significant in both models (Table 2, Figure 3). For R, four SNPs were identified on chromosomes 8 (2D), 13 (4A), 15 (4C), and 22 (6B). Among these, AX-184710565 on chromosome 15 (4C) exhibited the strongest association (−log10(P) = 15.39 by BLINK; PVE = 5.1%), and AX-184621491 on chromosome 22 (6B) explained the greatest proportion of phenotypic variance (PVE = 18.0%) (Table 2, Figure 3 and Figure S7). For B, three SNPs were identified on chromosomes 8 (2D), 12 (3D), and 22 (6B) using BLINK, with PVE values of 20.0%, 8.9%, and 14.6%, respectively (Figure S7).
Notably, AX-184710565 on chromosome 15 (4C) and AX-184621491 on chromosome 22 (6B) were significantly associated with both L* and R in both the BLINK and MLMM models (Table 2; Figure 3), indicating the presence of pleiotropic loci that simultaneously influence multiple dimensions of fruit color. Additionally, AX-184439390 on chromosome 13 (4A) was associated with both L* (MLMM only) and R (both models), further supporting a shared genetic basis for fruit lightness and redness at this locus. The concordance between BLINK and MLMM for these shared loci strengthens confidence in these associations.

3.4. Functional Enrichment Analysis of Candidate Genes at Pleiotropic Loci

To characterize the functional properties of candidate genes at the pleiotropic loci, GO enrichment analysis was performed using genes located within LD-defined regions surrounding the lead SNPs on chromosomes 13 (4A), 15 (4C), and 22 (6B). The enriched terms were classified into Biological Process (BP), Cellular Component (CC), and Molecular Function (MF) categories and are presented in Figure 4. The 6B locus formed a well-defined LD block containing a concentrated set of candidate genes, whereas the 4A and 4C loci exhibited weaker LD structure, with SNPs not confined within a single LD block. As a result, candidate gene regions for 4A and 4C were more dispersed than those for 6B. Despite these differences, distinct functional signatures were observed across loci. The 4A locus showed enrichment patterns associated with transcriptional regulation, although statistical support remained limited. The 4C locus exhibited enrichment related to post-translational modification processes, including deSUMOylation. In contrast, the 6B locus displayed strong enrichment for plastid- and chloroplast-related components, consistent with a role in organelle-mediated regulation of fruit coloration. These results indicate that LD-informed candidate gene selection provides a refined interpretation of locus-specific functional mechanisms.
At the 4A locus (Chr. 13, AX-184439390), the enriched GO terms were primarily associated with transcriptional regulation and cellular developmental processes. BP enrichment included cell differentiation and cellular developmental processes, whereas MF terms were dominated by DNA-binding and transcription regulatory activities, indicating that candidate genes in this region may function as upstream regulators of developmental and metabolic pathways (Figure 4). The 4C locus (Chr. 15, AX-184710565) showed strong enrichment for protein modifications and post-translational regulatory processes, particularly protein deSUMOylation and ubiquitin-like protein peptidase activity. Additional enrichment in kinase-related and catalytic activities suggests that this region may regulate cellular signaling and protein turnover mechanisms (Figure 4). In contrast, the 6B locus (Chr. 22, AX-184621491/AX-184026606) contained the largest number of candidate genes and exhibited enrichment that was predominantly associated with plastid and chloroplast components, along with nucleotide-binding and metabolic processes. The enrichment of plastid-related terms suggests that genes within this region may influence plastid development and metabolic activities related to pigment accumulation and fruit coloration (Figure 4). Overall, the three loci displayed distinct functional signatures: 4A locus was associated with transcriptional regulation, 4C locus with post-translational protein modification, and the 6B locus with plastid-associated cellular functions. The complete GO enrichment results are provided in Supplementary Table S6.
Figure 4. Bubble plots display enriched GO terms for candidate genes located within 200 kb windows flanking the lead SNPs at the (a) 4A locus (Chr. 13, AX-184439390), (b) 4C locus (Chr. 15, AX-184710565), and (c) 6B locus (Chr. 22, AX-184621491/AX-184026606). The enriched terms were classified into three categories: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). The x-axis represents the −log10(p-value) of enrichment, and bubble size indicates the number of candidate genes annotated to each term. Bubble color reflects the degree of significance, with darker red indicating higher −log10(P) values. The vertical dashed red line denotes the significance threshold (−log10(P) = 1.3, corresponding to p = 0.05). GO enrichment analysis was performed using DAVID v2024 with Arabidopsis thaliana orthologs as the background.

4. Discussion

Dissecting the genetic architecture of complex fruit-quality traits in cultivated strawberries requires both representative germplasm and high-resolution phenotyping. Here, we demonstrate that a strategically constructed core collection of 108 accessions, assembled from internationally diverse germplasm held at NIHHS, captures sufficient genetic diversity to identify reproducible marker–trait associations for six fruit quality traits. High-density SNP genotyping (Axiom FanaSNP 50K) was integrated with an automated RGB image-based phenotyping pipeline to generate multidimensional measurements of fruit color and morphology. Genome-wide association analyses identified significant loci for six fruit quality traits, including pleiotropic loci affecting fruit lightness and redness, with candidate gene enrichment suggesting roles in transcriptional regulation, SUMOylation, and plastid development.

4.1. Genetic Representativeness and Phenotypic Diversity of the Core Collection

The 108-accession strawberry core collection maintained a Shannon–Wiener diversity index of 4.69 and Nei’s unbiased gene diversity of 0.376, effectively retaining the genetic diversity of the original 253-accession germplasm (5.53 and 0.374, respectively; Table S2). A ShinyCore-based selection strategy was employed to optimize core subset composition. This approach, conceptually similar to that described by Koorevaar et al. (2023) for a European strawberry core collection, effectively maximizes allelic representation while minimizing the number of retained accessions [14]. The strategic prioritization of domestic breeding lines (44.4%) further ensured alignment with region-specific breeding objectives. The optimal cluster number of K = 10 from the STRUCTURE analysis reflects the high heterozygosity and complex subgenomic architecture of the octoploid strawberry. The Mantel test revealed a significant positive correlation between genotypic and phenotypic Euclidean distances (r = 0.38, p < 0.001), indicating that genetic stratification within the core collection was associated with structured phenotypic differentiation. However, the moderate correlation coefficient suggests that genetic distance alone does not fully account for the observed phenotypic variation. This remaining variance likely reflects the combined influence of environmental effects and subgenome-specific gene expression imbalance, underscoring the complexity of genotype–phenotype relationships in allo-octoploid strawberries, as reported by Edger et al. (2019) [8].

4.2. Reliability of RGB Image-Based Phenotyping

The strong correlation between fruit weight and image-derived fruit area (r = 0.94) indicated that the automated image analysis pipeline provided a measurement accuracy comparable to that of conventional manual phenotyping [12]. The combined use of CIELAB and RGB color spaces enabled the multidimensional quantification of fruit color, and the high coefficients of variation observed for the green and blue channels (CV > 32%) suggested sensitivity to subtle differences in fruit maturity and anthocyanin accumulation [20]. Principal component analysis further revealed that PC1 was primarily associated with morphological traits, whereas PC2 was dominated by colorimetric parameters, indicating a largely independent variation between fruit size and color. This pattern was consistent with the GWAS results, where significant loci for size- and color-related traits were distributed across distinct chromosomal regions.

4.3. Multilocus Architecture and Pleiotropy of Color Traits in Octoploid Strawberry

Two loci on chromosomes 15 (4C) and 22 (6B) were significantly associated with both L* and R in the BLINK and MLMM models, providing evidence of pleiotropic genetic effects rather than coincidental colocalization. The locus on chromosome 22 (6B) alone explained 18.4% and 18.0% of the phenotypic variance of L* and R, respectively, a relatively large effect size for colorimetric traits in an allo-octoploid genome. Such effect sizes contrast with the dispersed, small-effect QTL architecture typically reported for outbred polyploid germplasm. For example, Muñoz et al. (2024) analyzed 124 accessions across 26 traits and detected 121 marker–trait associations distributed across 95 QTLs, with most loci contributing only minor phenotypic variance [10]. Similarly, using a MAGIC population, Wada et al. (2020) reported no dominant single-locus effect on colorimetric parameters [16]. The comparatively large PVE observed here likely reflects the structured composition of the core collection, which concentrates the Korean breeding germplasm spanning a defined range of fruit color phenotypes, thereby enhancing the allelic contrast necessary to resolve major-effect associations. Importantly, the colocalization of L* and R signals at the same loci suggests a shared genetic basis for fruit lightness and redness, raising the possibility that the loci on chromosomes 15 (4C) and 22 (6B) could serve as candidates for future marker development targeting both color dimensions simultaneously. A third locus on chromosome 13 (4A) was also associated with both traits, but explained only 0.1–0.3% of the variance. This pattern is consistent with a mixed genetic architecture, where a few loci with relatively large effects contribute alongside numerous minor loci acting through epistatic interactions or homoeologous gene redundancy in the polyploid genome [8,9].
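The footnote to Table 2 states that PVE was estimated as the R² of a linear regression of phenotype on SNP dosage (0/1/2). A minimal sketch of that calculation is given below, assuming a single-marker ordinary least-squares fit with no covariates; the function name and the dosage and phenotype values are hypothetical and are not data from this study.

```python
def pve_from_dosage(dosage, phenotype):
    """Percent of phenotypic variance explained: R^2 of phenotype ~ dosage, times 100."""
    n = len(dosage)
    if n != len(phenotype) or n < 3:
        raise ValueError("need matched vectors with at least three accessions")
    mean_x = sum(dosage) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, phenotype))
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic marker or constant phenotype")
    # For a one-predictor regression, R^2 equals the squared Pearson correlation.
    return 100.0 * (sxy ** 2) / (sxx * syy)


# Hypothetical example: allele dosage at one SNP and made-up L* values for ten accessions.
toy_dosage = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
toy_lightness = [70.1, 74.3, 68.9, 77.5, 80.2, 73.8, 79.0, 85.4, 81.7, 88.0]
print(f"PVE = {pve_from_dosage(toy_dosage, toy_lightness):.1f}%")
```

A single-marker R² of this kind does not adjust for population structure or for other associated markers, so it is best read as an approximate effect-size summary rather than a partitioned variance component.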
The octoploid genome of cultivated strawberry, composed of four subgenomes (A, B, C, and D) derived from distinct diploid progenitors [8,9], provides a unique context for interpreting the distribution of significant marker–trait associations. For color traits, significant SNPs were detected across all four subgenomes for redness (R) and across three subgenomes for lightness (L*), indicating that color variation in the core collection is jointly shaped by homoeologous loci of genomically distinct origins. A similar multi-subgenome pattern was observed for fruit size, whereas the firmness locus on chromosome 21 (6A) was confined to the A subgenome, consistent with the known dominance of the F. vesca-derived A subgenome in cultivated strawberry [8]. This contrast suggests that fruit firmness may be governed primarily by a dominant-subgenome locus, whereas color and size traits reflect combined contributions from homoeologous copies across subgenomes, underscoring how the polyploid genome architecture shapes the genetic basis of fruit quality traits. From a breeding perspective, these results suggest that the major loci on 4C and 6B should be prioritized in future fine-mapping and functional validation efforts, whereas allelic diversity at the minor 4A locus may contribute to the long-term selection response.
For morphological traits, the firmness locus on chromosome 21 (6A; PVE = 21.7%) co-localized with the major QTL independently reported by Muñoz et al. [10], who proposed the polygalacturonase gene FaPG1 as the candidate causal gene, and by Wada et al. [16] in a distinct MAGIC population. The fruit size loci identified on chromosomes 3 (1C), 5 (2A), and 19 (5D) were located in subgenomes distinct from those harboring color-associated loci. This pattern mirrors the largely non-overlapping genomic distribution of morphological and colorimetric QTLs reported by Wada et al. [16], and is consistent with the principal component analysis of the present dataset, in which PC1 captured morphological variation and PC2 represented color variation along orthogonal axes.
Interestingly, the two major pleiotropic loci on chromosomes 15 (4C) and 22 (6B), as well as the minor locus on chromosome 13 (4A), were distant from the canonical anthocyanin biosynthesis genes in strawberries. FaMYB10, the master regulator of fruit pigmentation, is located on chromosome 1, whereas the structural genes ANS, DFR, and CHS reside on chromosomes 5, 2, and 7, respectively, with predominant expression in the F. vesca-derived subgenome [27,28]. This genomic separation suggests that the detected loci influence fruit coloration through regulatory or cellular mechanisms, rather than through the direct control of pigment biosynthesis. Supporting this interpretation, the present dataset revealed a positive correlation between L* and R across core collection accessions. This pattern contrasts with the trajectory observed during strawberry ripening, where increasing anthocyanin accumulation progressively reduces surface lightness. Parra-Palma et al. (2020) demonstrated that L* values decreased consistently as a* values increased during ripening in four cultivars, reflecting the light-absorbing properties of the accumulated pigments [29]. Therefore, the positive L*–R relationship observed here suggests that genetic variation in the core collection does not simply reflect differences in pigment quantity, but rather differences in the intracellular optical environment that allow high redness to be achieved without the expected reduction in lightness. The observed positive covariation may indicate a potentially favorable color appearance characterized by increased redness without substantial loss of lightness.

4.4. Functional Implications of Candidate Genes at Pleiotropic Loci

GO enrichment analysis revealed distinct functional signatures at the three pleiotropic loci, collectively suggesting that these genomic regions influence the cellular framework supporting pigment expression rather than the core biosynthetic pathway. The incorporation of LD-based candidate region definition further revealed notable differences in genetic architecture among loci. The strong LD block observed at the 6B locus supports the presence of a coherent genomic region underlying fruit color variation, whereas the dispersed LD patterns at the 4A and 4C loci suggest more complex or fragmented genetic control. This contrast highlights the importance of considering LD structure in polyploid species, where recombination patterns and subgenome interactions can vary substantially across genomic regions. At the 4C locus, deSUMOylase activity (fold enrichment, FE = 123.61, p = 0.016) and protein deSUMOylation (FE = 81.65, p = 0.024) were the most strongly enriched terms, suggesting that reversible post-translational protein modification may be associated with regulatory processes at this locus. Given that SUMO-mediated modification is known to regulate transcription factor stability and activity in plant pigmentation pathways, SUMO pathway components in this region represent promising candidates for functional validation, particularly in light of the strongest association signals observed for both L* and R (−log10(P) = 11.12–15.39; PVE = 5.1–5.9%). The 6B locus showed strong enrichment for plastid- and chloroplast-related cellular components (p < 10⁻⁵), implicating genes involved in plastid remodeling during fruit ripening. The transition from chloroplasts to chromoplasts plays a central role in fruit pigmentation by coordinating pigment accumulation with structural changes in the plastids. The relatively high phenotypic variance explained for the blue channel (14.6%) at this locus is consistent with genetic variation affecting chlorophyll degradation and chromoplast differentiation, which influence blue light reflectance.
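To make the fold enrichment values above easier to interpret, the sketch below shows how such a statistic is commonly computed for one GO term, together with a plain hypergeometric upper-tail p-value. It is an illustration under stated assumptions, not the authors' analysis: DAVID reports a more conservative modified Fisher exact p-value (the EASE score), and the gene counts used here are hypothetical.

```python
from math import comb


def fold_enrichment(hits, list_size, background_hits, background_size):
    """Ratio of the term's frequency in the candidate list to its background frequency."""
    return (hits / list_size) / (background_hits / background_size)


def hypergeom_upper_tail(hits, list_size, background_hits, background_size):
    """P(X >= hits) when list_size genes are drawn without replacement from the background."""
    total = comb(background_size, list_size)
    max_hits = min(list_size, background_hits)
    favourable = sum(
        comb(background_hits, i) * comb(background_size - background_hits, list_size - i)
        for i in range(hits, max_hits + 1)
    )
    return favourable / total


# Hypothetical counts: 2 of 30 candidate genes carry a term that annotates
# 15 of 30,000 background genes.
fe = fold_enrichment(2, 30, 15, 30000)
p = hypergeom_upper_tail(2, 30, 15, 30000)
print(f"fold enrichment = {fe:.1f}, hypergeometric p = {p:.2e}")
```

With short candidate lists, very large fold enrichment values can arise from only a handful of annotated genes, so the enrichment ratio is best read together with its p-value and the underlying gene counts.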
Collectively, these enrichment patterns indicate that fruit coloration in strawberries is shaped not only by pigment biosynthesis but also by the coordinated regulation of transcriptional control, post-translational modification, and plastid structural dynamics [30,31]. Similar conclusions have been drawn from large-scale GWAS in tomatoes, where fruit color variation is associated with genes involved in plastid remodeling and light signaling, in addition to biosynthetic enzymes [32]. More broadly, studies across fruit crops have demonstrated that pigment visibility depends on processes such as vacuolar sequestration, organelle differentiation, and metabolic sink formation [33,34].
From a practical standpoint, the major-effect loci identified in this study provide priority targets for marker-assisted selection in strawberry breeding. The firmness locus on chromosome 21 (6A; PVE = 21.7%) offers a reliable marker for improving post-harvest storability, one of the most important traits in fresh-market strawberry. The pleiotropic loci on chromosomes 15 (4C) and 22 (6B), which were associated with both lightness (L*) and redness (R), allow the two color traits to be selected together using a single marker, and the 6B locus alone explained about 18% of the variance for both traits. In addition, the candidate genes at these loci appear to act through transcriptional, post-translational, and plastid-related regulatory pathways rather than the anthocyanin biosynthetic pathway itself, which widens the range of possible breeding targets and supports the use of allele mining and genomic selection in the NIHHS strawberry breeding program.

5. Conclusions

This study established a genetically representative 108-accession strawberry core collection and integrated automated RGB image-based phenotyping with genome-wide association analysis to dissect the genetic architecture of fruit appearance traits. The core collection effectively captured the allelic diversity of the original 253-accession germplasm and exhibited structured phenotypic differentiation across genetic clusters, as confirmed using the Mantel test. The GWAS identified 15 significant marker–trait associations across six fruit quality traits, with pleiotropic loci on chromosomes 15 (4C) and 22 (6B) consistently associated with fruit lightness (L*) and redness (R) in both the BLINK and MLMM models. The 6B locus alone explained approximately 18% of the phenotypic variance for each color trait, representing a relatively large single-locus effect for colorimetric traits in an allo-octoploid genome. GO enrichment analysis implicated transcriptional regulation (4A), deSUMOylation-mediated post-translational modification (4C), and chloroplast-to-chromoplast transition (6B) at the pleiotropic loci, suggesting that fruit colorimetric traits are governed by regulatory mechanisms beyond the core pigment biosynthesis pathways. Collectively, these findings demonstrate that the combination of strategic core collection design, high-resolution image-based phenotyping, and multi-model GWAS provides a robust framework for the genetic dissection of complex fruit quality traits in octoploid strawberries and offers a practical foundation for advancing precision breeding of major strawberry cultivars in Korea.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/horticulturae12050547/s1, Table S1. Population structure relationship probabilities for 108 strawberry accessions based on STRUCTURE analysis (K = 10). Table S2. Genetic diversity indices of the total germplasm (n = 253) and core collection (n = 108). Table S3. Mean and standard deviations of the logarithm of the probability of data for different numbers of subpopulations (K) tested in strawberry. Table S4. Mean phenotypic values across ten genetic clusters with one-way ANOVA results. Different lowercase letters within rows indicate significant differences at p < 0.05 (Tukey’s HSD test). Table S5. Summary of phenotypic Euclidean distance parameters based on Z-score standardized values of 12 traits. Table S6. Gene Ontology (GO) enrichment results for candidate genes associated with significant GWAS loci on subgenomes 4A, 4C, and 6B. Figure S1. Automated digital phenotyping pipeline for strawberry fruit trait extraction. (a) Workflow comprising four stages: color calibration, image compression and ROI selection, automated segmentation with calyx removal, and morphological/colorimetric trait extraction. (b) Representative images showing raw input (left) and color-calibrated output (right) for external surfaces (top) and segmented results for internal surfaces (bottom). Figure S2. Representative fruit images illustrating the phenotypic range observed in the 108-accession strawberry core collection. Each panel shows external (top row) and longitudinal section (bottom row) views of fruits from accessions exhibiting the extreme low and high values for each trait category. (a) Fruit size (weight, length, area): ‘Lassen’ (CC-059; FrW = 1.94 g, FL = 20.1 mm, FA = 210.5; left) and ‘Kingsberry’ (CC-023; FrW = 50.72 g, FL = 58.2 mm, FA = 1.734; right) representing the smallest and largest fruits, respectively. (b) Fruit firmness: ‘Aromas’ (CC-056; FF = 1.60 N; left) and ‘Selva’ (CC-060; FF = 12.45 N; right) showing the softest and firmest fruits, with longitudinal sections illustrating differences in flesh density and core structure. (c) Fruit shape based on circularity (FC): ‘Jukhyang’ (CC-097; FC = 0.62; left, elongated) and ‘Jumbo Pure Berry’ (CC-054; FC = 0.82; right, near-round). (d) Fruit color: ‘Selva’ (CC-060; L* = 65.00, R = 112.53; left, dark-pigmented) and ‘NS170309’ (CC-106; L* = 110.83, R = 173.30; right, light-pigmented), illustrating the range of both surface and internal pigmentation across the core collection. The same accession (‘Selva’) is shown in panels (b,d), reflecting its dual phenotype of high firmness and dark coloration. ColorChecker reference and ruler shown in each panel were used for color and size calibration. Figure S3. Mantel test between genetic distance and phenotypic Euclidean distance across 108 strawberry accessions. Pairwise comparisons are categorized by cluster membership: same cluster (0.0) and different cluster (1.0) (r = 0.38, p < 0.001). Figure S4. Circular Manhattan plots showing genome-wide association results for fruit weight and fruit firmness using MLMM and BLINK models. Each concentric track represents a trait, and significant SNPs (−log10P ≥ 6) are highlighted. Chromosomes are arranged according to subgenomes (A–D). Figure S5. Circular Manhattan plots of GWAS results for fruit morphological traits, including fruit circularity, area, length, and width. Results from MLMM and BLINK models are combined, with significant loci indicated by red markers. Figure S6. Circular Manhattan plots showing GWAS results for fruit color traits in the CIELAB color space (L*, a*, b*). Significant SNPs (−log10P ≥ 6) identified by MLMM and BLINK are highlighted across subgenomes. Figure S7. Circular Manhattan plots presenting GWAS results for RGB-based color traits (R, G, B channels). Significant associations detected by MLMM and BLINK models are indicated, with chromosome positions shown by subgenome. Figure S8. Global Q–Q plot comparing observed versus expected −log10(P) values across all traits and models (MLMM and BLINK). The deviation from the null expectation indicates the presence of true genetic associations with minimal inflation.

Author Contributions

Conceptualization, Methodology and Writing – Original Draft Preparation, S.K. and Y.J.J.; Visualization, Formal Analysis and Data Curation, K.H., S.K., Y.J.J., E.S.L., H.-I.A. and Y.O.; Resources, S.K. and K.H.; Supervision, D.-S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by grants from the National Institute of Horticultural and Herbal Science (Project No. PJ016681 02), Rural Development Administration, Republic of Korea, and by the 2026 RDA Fellowship Program of the National Institute of Horticultural and Herbal Science, Rural Development Administration, Republic of Korea.

Data Availability Statement

Genotype data have been deposited in the National Agricultural Biotechnology Information Center (https://nabic.rda.go.kr; accession number DL016481; accessed on 17 March 2026).

Acknowledgments

The authors thank Jeong-Il Yoo, Hyun-Mi Yoo, and Yong-Bin Kim for their dedicated support in strawberry cultivation at the National Institute of Horticultural and Herbal Science (NIHHS), and Kyung-A Lim and Jong-Won Park for their assistance with phenotypic evaluation.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

| Abbreviation | Definition |
|---|---|
| GWAS | Genome-wide association study |
| SNP | Single nucleotide polymorphism |
| RGB | Red, green, blue |
| CIELAB | Commission Internationale de l’Éclairage L*a*b* color space |
| BLINK | Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway |
| MLMM | Multilocus mixed model |
| PVE | Phenotypic variance explained |
| MAF | Minor allele frequency |
| PCA | Principal component analysis |
| NJ | Neighbor-joining |
| GO | Gene Ontology |
| DAVID | Database for Annotation, Visualization and Integrated Discovery |
| CTAB | Cetyltrimethylammonium bromide |
| DQC | Dish quality control |
| FF | Fruit firmness |
| FrW | Fruit weight |
| FA | Fruit area |
| FL | Fruit length |
| FW | Fruit width |
| FC | Fruit circularity |
| LAB | CIELAB color space |
| NIHHS | National Institute of Horticultural and Herbal Science |
| RDA | Rural Development Administration |
| QTL | Quantitative trait locus |
| LED | Light-emitting diode |
| CV | Coefficient of variation |
| SP | Subpopulation |
| BP | Biological process |
| CC | Cellular component |
| MF | Molecular function |

References

1. Yoon, H.S.; Jin, H.J.; Oh, J.Y. ‘Kuemsil’, a strawberry variety suitable for forcing culture. Korean Soc. Breed. Sci. 2020, 52, 184–189.
2. Yue, C.; Gallardo, R.K.; Luby, J.; Rihn, A.; McFerson, J.R.; McCracken, V.; Whitaker, V.M.; Finn, C.E.; Hancock, J.F.; Weebadde, C. An evaluation of US strawberry producers trait prioritization: Evidence from audience surveys. HortScience 2014, 49, 188–193.
3. Colquhoun, T.A.; Levin, L.A.; Moskowitz, H.R.; Whitaker, V.M.; Clark, D.G.; Folta, K.M. Framing the perfect strawberry: An exercise in consumer-assisted selection of fruit crops. J. Berry Res. 2012, 2, 45–61.
4. Bhat, R.; Geppert, J.; Funken, E.; Stamminger, R. Consumers perceptions and preference for strawberries – A case study from Germany. Int. J. Fruit. Sci. 2015, 15, 405–424.
5. Porter, M.; Fan, Z.; Lee, S.; Whitaker, V.M. Strawberry breeding for improved flavor. Crop Sci. 2023, 63, 1949–1963.
6. Tennessen, J.A.; Govindarajulu, R.; Ashman, T.-L.; Liston, A. Evolutionary origins and dynamics of octoploid strawberry subgenomes revealed by dense targeted capture linkage maps. Genome Biol. Evol. 2014, 6, 3295–3313.
7. Hardigan, M.A.; Feldmann, M.J.; Lorant, A.; Bird, K.A.; Famula, R.; Acharya, C.; Cole, G.; Edger, P.P.; Knapp, S.J. Genome synteny has been conserved among the octoploid progenitors of cultivated strawberry over millions of years of evolution. Front. Plant Sci. 2020, 10, 1789.
8. Edger, P.P.; Poorten, T.J.; VanBuren, R.; Hardigan, M.A.; Colle, M.; McKain, M.R.; Smith, R.D.; Teresi, S.J.; Nelson, A.D.; Wai, C.M. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 2019, 51, 541–547.
9. Hardigan, M.A.; Lorant, A.; Pincot, D.D.A.; Feldmann, M.J.; Famula, R.A.; Acharya, C.B.; Lee, S.; Verma, S.; Whitaker, V.M.; Bassil, N.; et al. Unraveling the Complex Hybrid Ancestry and Domestication History of Cultivated Strawberry. Mol. Biol. Evol. 2021, 38, 2285–2305.
10. Muñoz, P.; Roldán-Guerra, F.J.; Verma, S.; Ruiz-Velázquez, M.; Torreblanca, R.; Oiza, N.; Castillejo, C.; Sánchez-Sevilla, J.F.; Amaya, I. Genome-wide association studies in a diverse strawberry collection unveil loci controlling agronomic and fruit quality traits. Plant Genome 2024, 17, e20509.
11. Whitaker, V.M.; Knapp, S.J.; Hardigan, M.A.; Edger, P.P.; Slovin, J.P.; Bassil, N.V.; Hytönen, T.; Mackenzie, K.K.; Lee, S.; Jung, S.; et al. A roadmap for research in octoploid strawberry. Hortic. Res. 2020, 7, 33.
12. James, K.M.F.; Sargent, D.J.; Whitehouse, A.; Cielniak, G. High-throughput phenotyping for breeding targets – Current status and future directions of strawberry trait automation. Plants People Planet 2022, 4, 432–443.
13. Zurn, J.D.; Hummer, K.E.; Bassil, N.V. Exploring the diversity and genetic structure of the U.S. National Cultivated Strawberry Collection. Hortic. Res. 2022, 9, uhac125.
14. Koorevaar, T.; Willemsen, J.H.; Visser, R.G.; Arens, P.; Maliepaard, C. Construction of a strawberry breeding core collection to capture and exploit genetic variation. BMC Genom. 2023, 24, 740.
15. Fan, Z.; Tieman, D.M.; Knapp, S.J.; Zerbe, P.; Famula, R.; Barbey, C.R.; Folta, K.M.; Amadeu, R.R.; Lee, M.; Oh, Y. A multi-omics framework reveals strawberry flavor genes and their regulatory elements. New Phytol. 2022, 236, 1089–1107.
16. Wada, T.; Tsubone, M.; Mori, M.; Hirata, C.; Nagamatsu, S.; Oku, K.; Nagano, S.; Isobe, S.; Suzuki, H.; Shimomura, K. Genome-wide association study of strawberry fruit quality-related traits using a MAGIC population derived from crosses involving six strawberry cultivars. Hortic. J. 2020, 89, 553–566.
17. Furbank, R.T.; Tester, M. Phenomics–technologies to relieve the phenotyping bottleneck. Trends Plant Sci. 2011, 16, 635–644.
18. Abebe, A.M.; Kim, Y.; Kim, J.; Kim, S.L.; Baek, J. Image-based high-throughput phenotyping in horticultural crops. Plants 2023, 12, 2061.
19. Zingaretti, L.M.; Monfort, A.; Pérez-Enciso, M. Automatic fruit morphology phenome and genetic analysis: An application in the octoploid strawberry. Plant Phenomics 2021, 2021, 9812910.
20. Yoshioka, Y.; Nakayama, M.; Noguchi, Y.; Horie, H. Use of image analysis to estimate anthocyanin and UV-excited fluorescent phenolic compound levels in strawberry fruit. Breed. Sci. 2013, 63, 211–217.
21. Doyle, J. DNA protocols for plants. In Molecular Techniques in Taxonomy; Springer: Berlin/Heidelberg, Germany, 1991; pp. 283–293.
22. Chang, C.C.; Chow, C.C.; Tellier, L.C.; Vattikuti, S.; Purcell, S.M.; Lee, J.J. Second-generation PLINK: Rising to the challenge of larger and richer datasets. Gigascience 2015, 25, 7.
23. Huang, M.; Liu, X.; Zhou, Y.; Summers, R.M.; Zhang, Z. BLINK: A package for the next level of genome-wide association studies with both individuals and markers in the millions. Gigascience 2019, 8, giy154.
24. Segura, V.; Vilhjálmsson, B.J.; Platt, A.; Korte, A.; Seren, Ü.; Long, Q.; Nordborg, M. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat. Genet. 2012, 44, 825–830.
25. Wang, J.; Zhang, Z. GAPIT version 3: Boosting power and accuracy for genomic association and prediction. Genom. Proteom. Bioinform. 2021, 19, 629–640.
26. Huang, D.W.; Sherman, B.T.; Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4, 44–57.
27. Castillejo, C.; Waurich, V.; Wagner, H.; Ramos, R.; Oiza, N.; Muñoz, P.; Triviño, J.C.; Caruana, J.; Liu, Z.; Cobo, N. Allelic variation of MYB10 is the major force controlling natural variation in skin and flesh color in strawberry (Fragaria spp.) fruit. Plant Cell 2020, 32, 3723–3749.
28. Denoyes, B.; Prohaska, A.; Petit, J.; Rothan, C. Deciphering the genetic architecture of fruit color in strawberry. J. Exp. Bot. 2023, 74, 6306–6320.
29. Parra-Palma, C.; Morales-Quintana, L.; Ramos, P. Phenolic content, color development, and pigment-related gene expression: A comparative analysis in different cultivars of strawberry during the ripening process. Agronomy 2020, 10, 588.
30. Yang, J.; Chen, Y.; Xiao, Z.; Shen, H.; Li, Y.; Wang, Y. Multilevel regulation of anthocyanin-promoting R2R3-MYB transcription factors in plants. Front. Plant Sci. 2022, 13, 1008829.
31. Wang, W.; Wang, Y.; Chen, T.; Qin, G.; Tian, S. Current insights into posttranscriptional regulation of fleshy fruit ripening. Plant Physiol. 2023, 192, 1785–1798.
32. Kayikci, H.C.; Aydin, S.; Adak, A.; Dogan, A.; Sapkota, M.; Feng, Q.; Topcu, Y. Association mapping of tomato fruit quality for weight, firmness, brix, and color using GWAS. BMC Plant Biol. 2025, 26, 41.
33. Kapoor, L.; Simkin, A.J.; George Priya Doss, C.; Siva, R. Fruit ripening: Dynamics and integrated analysis of carotenoids and anthocyanins. BMC Plant Biol. 2022, 22, 27.
34. Albert, N.W.; Iorizzo, M.; Mengist, M.F.; Montanari, S.; Zalapa, J.; Maule, A.; Edger, P.P.; Yocca, A.E.; Platts, A.E.; Pucker, B. Vaccinium as a comparative system for understanding of complex flavonoid accumulation profiles and regulation in fruit. Plant Physiol. 2023, 192, 1696–1710.
Figure 1. Population structure and phylogenetic relationships among 108 strawberry accessions. (a) Delta K plot indicating the optimal number of clusters (K = 10). (b) STRUCTURE bar plot showing membership probabilities for each accession across ten clusters. (c) Unrooted neighbor-joining tree with branches color-coded according to STRUCTURE cluster assignment.
Figure 2. Phenotypic characterization of 12 fruit quality traits across 108 strawberry accessions. (a) Frequency distributions of physical and morphological traits and (b) colorimetric traits in CIELAB and RGB color spaces. Dashed lines indicate population means. (c) Pearson correlation matrix among the 12 traits. Color scale ranges from blue (negative) to red (positive); * p < 0.05, ** p < 0.01, *** p < 0.001. (d) PCA biplot showing the distribution of accessions along PC1 (18.4%) and PC2 (11.8%). Red arrows represent trait loading vectors. FrW, fruit weight; FF, firmness; FC, circularity; FA, fruit area; FL, fruit length; FW, fruit width.
Figure 3. Identification of pleiotropic loci for strawberry fruit color traits via comparative GWAS. Manhattan and Q-Q plots displaying genetic associations for (a,b) fruit lightness (L*) and (c,d) redness (R). The results were compared across two multilocus models: MLMM (a,c) and BLINK (b,d). The horizontal lines in the Manhattan plots denote the significance threshold (−log10 P ≥ 6.0). Consistent significant peaks were identified on chromosomes 13 (4A), 15 (4C), and 22 (6B) (highlighted by red vertical dashed lines) for both traits. The high concordance between the MLMM and BLINK algorithms, coupled with the overlapping signals for L* and R, strongly indicated a shared genetic basis for fruit pigmentation at these loci.
Figure 4. Gene Ontology (GO) enrichment analysis of candidate genes at three pleiotropic loci associated with fruit color traits.
Table 1. Descriptive statistics of 12 fruit quality, morphological, and colorimetric traits in 108 strawberry accessions.

| Category | Trait (Unit) | Mean | Std | Min–Max | CV ¹ (%) |
|---|---|---|---|---|---|
| Fruit Quality | Fruit Weight (g) | 26.1 | 9.23 | 3.28–50.72 | 35.37 |
| | Firmness (N) | 4.28 | 1.78 | 1.60–12.45 | 41.55 |
| Morphology | Fruit-circularity (ratio) | 0.74 | 0.03 | 0.62–0.82 | 4.5 |
| | Fruit-area (mm²) | 1452.45 | 325.18 | 450.12–2150.45 | 22.38 |
| | Fruit-length (mm) | 48.52 | 8.42 | 22.15–65.34 | 17.35 |
| | Fruit-width (mm) | 36.14 | 5.21 | 18.42–48.15 | 14.41 |
| Color | L*_LAB (Lightness) | 78.67 | 11.78 | 54.25–110.83 | 14.97 |
| | a*_LAB (Redness) | 42.15 | 8.42 | 22.31–65.48 | 19.98 |
| | b*_LAB (Yellowness) | 35.82 | 6.14 | 18.52–52.14 | 17.14 |
| | R_RGB (Red) | 185.34 | 24.15 | 110.25–245.82 | 13.03 |
| | G_RGB (Green) | 48.21 | 15.62 | 15.34–95.42 | 32.40 |
| | B_RGB (Blue) | 38.45 | 12.38 | 12.15–78.62 | 32.20 |

¹ CV (%): coefficient of variation.
Table 2. Significant markers identified by GWAS for fruit quality, morphology, and color traits in the strawberry core collection.

| Category | Trait | SNP ID | Chr. ¹ | Position (bp) | MAF | BLINK p value | BLINK −log10(P) | BLINK PVE (%) ² | MLMM p value | MLMM −log10(P) | MLMM PVE (%) ² |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fruit Quality | Fruit Firmness | AX-184398615 | 21 (6A) | 23,741,113 | 0.15 | 2.09 × 10⁻¹¹ | 10.68 | 21.70 | - | - | - |
| Morphology | Fruit Width | AX-184125270 | 5 (2A) | 325,938 | 0.46 | 1.74 × 10⁻⁸ | 7.76 | 9.00 | - | - | - |
| | | AX-184267890 | 3 (1C) | 406,143 | 0.42 | 2.09 × 10⁻⁷ | 6.68 | 19.40 | - | - | - |
| | | AX-184276260 | 22 (6B) | 14,245,674 | 0.41 | 5.65 × 10⁻⁷ | 6.25 | 0.10 | - | - | - |
| | Fruit Length | AX-184240674 | 19 (5D) | 2,737,327 | 0.32 | 3.97 × 10⁻⁹ | 8.40 | 12.0 | 1.55 × 10⁻⁸ | 7.81 | 12.0 |
| Color | L* (Lightness) | AX-184439390 | 13 (4A) | 6,882,927 | 0.36 | - | - | - | 9.73 × 10⁻⁸ | 7.01 | 0.10 |
| | | AX-184710565 | 15 (4C) | 25,194,853 | 0.38 | 7.63 × 10⁻¹² | 11.12 | 5.90 | 1.38 × 10⁻¹² | 11.86 | 5.90 |
| | | AX-184621491 | 22 (6B) | 31,037,534 | 0.31 | 6.51 × 10⁻⁹ | 8.19 | 18.4 | 3.81 × 10⁻⁷ | 6.42 | 18.4 |
| | R (Redness) | AX-184439390 | 13 (4A) | 6,882,927 | 0.36 | 1.65 × 10⁻⁹ | 8.78 | 0.30 | 3.04 × 10⁻⁸ | 7.52 | 0.30 |
| | | AX-184446739 | 8 (2D) | 7,143,912 | 0.49 | 1.61 × 10⁻⁹ | 8.79 | 1.70 | - | - | - |
| | | AX-184710565 | 15 (4C) | 25,194,853 | 0.38 | 4.05 × 10⁻¹⁶ | 15.39 | 5.10 | 3.97 × 10⁻¹⁴ | 13.40 | 5.10 |
| | | AX-184621491 | 22 (6B) | 31,037,534 | 0.31 | 2.12 × 10⁻⁹ | 8.67 | 18.0 | 8.19 × 10⁻⁸ | 7.09 | 18.0 |
| | B (Blueness) | AX-123357445 | 8 (2D) | 5,910,975 | 0.25 | 1.79 × 10⁻⁹ | 8.75 | 20.0 | - | - | - |
| | | AX-184899536 | 12 (3D) | 22,467,666 | 0.15 | 1.61 × 10⁻¹¹ | 10.79 | 8.90 | - | - | - |
| | | AX-184026606 | 22 (6B) | 31,864,127 | 0.23 | 4.58 × 10⁻¹⁶ | 15.34 | 14.60 | - | - | - |

¹ Chromosome numbers are based on the Fragaria × ananassa cv. ‘Royal Royce’ reference genome v1.0 (FaRR1 [4]). ² PVE (%): phenotypic variance explained (%), estimated by linear regression R² of phenotype on SNP dosage (0/1/2). Significance threshold: −log10(P) ≥ 6.0 (p < 1 × 10⁻⁶).
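The relationship between the p values and the −log10(P) column in Table 2 can be checked with a few lines of code. The sketch below converts three BLINK p values copied from the table and compares them with the stated threshold of 6.0; the helper names are illustrative and not part of the authors' workflow.

```python
import math

THRESHOLD = 6.0  # -log10(P) cutoff stated in the Table 2 footnote


def neg_log10(p_value):
    """Convert a p value to the -log10 scale used in Manhattan plots."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p value must lie in (0, 1]")
    return -math.log10(p_value)


def is_significant(p_value, threshold=THRESHOLD):
    """True when the association reaches the -log10(P) threshold."""
    return neg_log10(p_value) >= threshold


# BLINK p values copied from Table 2 (SNP ID, trait).
blink_p_values = {
    "AX-184398615 (fruit firmness)": 2.09e-11,
    "AX-184276260 (fruit width)": 5.65e-7,
    "AX-184710565 (R, redness)": 4.05e-16,
}

for marker, p_value in blink_p_values.items():
    score = neg_log10(p_value)
    print(f"{marker}: -log10(P) = {score:.2f}, significant = {is_significant(p_value)}")
```

The fruit width marker on chromosome 22 (6B) sits only slightly above the cutoff, which illustrates how a fixed threshold separates borderline signals from the much stronger color associations on 4C.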
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kim, S.; Jang, Y.J.; Han, K.; Lee, E.S.; Ahn, H.-I.; Oh, Y.; Kim, D.-S. Genetic Architecture of Fruit Color and Morphology Revealed by Image-Based Phenotyping and Genome-Wide Association Analysis in Octoploid Strawberry. Horticulturae 2026, 12, 547. https://doi.org/10.3390/horticulturae12050547

