Genome-Wide Association Study Detects Loci Involved in Scab Susceptibility in Japanese Apricot

Abstract: Japanese apricot (Prunus mume) is an important fruit tree in East Asia. 'Nanko', the primary cultivar of Japanese apricots, usually suffers from scab, a disease caused by Venturia carpophila. However, there have been few reports on the phenotypic variation in scab resistance/susceptibility and the underlying genetic factors. In this study, we investigated the severity of naturally occurring scab based on fruit lesions in 108 Japanese apricot accessions over four consecutive years. In each year, both resistant and susceptible accessions were observed, and significant annual correlations were detected among the ratios of diseased fruits (Rt; 0.52–0.76) and among the disease severity indices (Sv; 0.55–0.79). We also conducted a genome-wide association study (GWAS) based on exon-targeted resequencing, and significant peaks were detected in the data from 2017 and 2018. The candidate genes involved in disease resistance are located near nine single-nucleotide polymorphisms. These genes may be associated with the susceptibility of 'Nanko' lineages to scab. These findings shed light on the phenotypic and genetic profiles of scab resistance in P. mume and will assist future breeding programs with improving scab resistance.

The 'Nanko' cultivar has long faced a significant problem with a fungal disease referred to as scab, caused by Venturia carpophila (synonym: Cladosporium carpophilum), which leads to the development of black spots on the fruit surface (Figure 1) and reduces its commercial value [10]. In addition, this fungus infects and causes economic damage to other stone fruits, including peaches (Prunus persica), almonds (Prunus dulcis), and apricots (Prunus armeniaca). Therefore, chemical controls are required [11]. Notably, scab disease has not been reported in the Japanese plum (Prunus salicina) (https://www.naro.go.jp/project/results/laboratory/fruit/2002/fruit02-26.html) (accessed on 1 May 2023). However, the emergence of fungicide-resistant strains of V. carpophila has been reported in the Wakayama Prefecture [12]. Similarly, in apple trees (Malus domestica), which belong to the family Rosaceae, the occurrence of fungicide-resistant Venturia inaequalis, which causes apple scab, has been reported.
Although scab is an economic concern for Japanese apricots and other stone fruit tree crops, there have been limited reports on the phenotypic variation in scab resistance/susceptibility, underlying genetic factors, and the development of DNA markers. Previously, some PCR-based molecular markers, such as random amplified polymorphic DNA [14], simple sequence repeats [15,16], and chloroplast markers [17], have been used for cultivar identification. Recently, genome-wide single-nucleotide polymorphisms (SNPs) have become available with the development of next-generation sequencing technology [18,19]. However, the loci controlling agriculturally useful traits in Japanese apricots are largely unknown, and the currently available DNA markers are limited to the selection of self-compatibility traits, targeting the S-RNase gene [20]. The principal challenges in genetic analysis and fruit tree breeding are related to the large individual sizes and long juvenile periods of trees, which make it difficult to raise segregating populations and conduct continuous trait evaluations [21].
The main objective of this study was to investigate the genetic factors that control scab resistance and susceptibility. To achieve this goal, phenotypic observations of scab resistance and susceptibility in Japanese apricots were performed using 108 accessions maintained in the experimental orchard of the Japanese Apricot Laboratory, Wakayama Fruit Tree Experiment Station. Fruits from a single tree per accession were used to investigate naturally occurring scab in the field. Considering the annual variations, the observations were repeated over 4 years to evaluate the symptoms. A genome-wide association study (GWAS) was conducted, and candidate genomic regions for scab resistance were estimated. Here, we report nine SNPs possibly involved in severe scab susceptibility in 'Nanko' lineages. These SNPs could be utilized as selection markers, particularly in cross-breeding populations using 'Nanko' as a parent.

Phenotypic Variation of Scab Resistance
A total of 108 Japanese apricot accessions were used to evaluate scab resistance and susceptibility. These specimens consisted of 45 fruit (F), 11 small-fruit (FS), 37 ornamental (O), and 5 Taiwanese (T) accessions, as well as eight hybrids between P. mume and P. armeniaca (AM) and two hybrids between P. mume and P. salicina (SM). Accessions that fruited during the survey period were selected. Scab lesions were examined in 100 fruits (or all fruits if the number of fruits was <100) of each accession, and the fruits were categorized by the number of lesions per fruit as follows: 0, no symptoms; 1, 1-3 lesions; 2, 4-8 lesions; 4, 9-20 lesions; and 6, >20 lesions. The ratio of diseased fruits (Rt; %) and the disease severity index (Sv; considers the number of lesions per fruit) [22,23] were calculated (Table S1).
The results showed substantial variation in both the Rt and Sv among the accessions. Spearman's rank correlation coefficients were calculated for all possible combinations of years to assess the annual variation in scab resistance/susceptibility traits (Tables 1 and 2). Significant correlations were detected between years for both Rt (0.52–0.76) and Sv (0.55–0.79).
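The year-to-year comparison relies on Spearman's rank correlation. The following minimal Python sketch shows how such a coefficient can be computed from two years of trait values; it is an illustration only (the study itself used the R package "psych"), and the Sv values in the example are made up for demonstration rather than taken from Table S1.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical Sv values for five accessions in two survey years.
sv_year_a = [0.0, 5.0, 12.5, 40.0, 3.3]
sv_year_b = [1.0, 2.0, 20.0, 35.0, 8.0]
rho = spearman(sv_year_a, sv_year_b)
print(f"Spearman's rho = {rho:.2f}")
```

In practice, only accessions scored in both years would enter each pairwise comparison, because not every accession fruits every year.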

Population Structure of Japanese Apricot Accessions
Raw sequence reads of 108 accessions were quality-trimmed and then mapped onto LG1-8 of the Peach v2.0 genome [24], with repeated sequences masked. Duplicate reads were removed and SNP calling was performed. Quality filtering and removal of positions with a missing rate > 0.2 (loci with DP < 8 were considered as missing) were conducted, followed by the imputation of missing data. Subsequently, SNPs with a minor allele frequency < 0.03 were removed, and SNPs in strong linkage disequilibrium (LD) (r² > 0.5) were pruned. The resulting dataset was subjected to principal component analysis (PCA), phylogenetic tree construction, and Bayesian clustering analysis using ADMIXTURE [25].
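The site-level filters described above can be illustrated with a small Python sketch. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with a per-call read depth, which is a simplification of a VCF file; the function names and example data are hypothetical, and the imputation step that the study performed between the two filters is omitted.

```python
MAX_MISSING = 0.2  # sites with a missing rate above this are dropped
MIN_DEPTH = 8      # calls with DP below this are treated as missing
MIN_MAF = 0.03     # sites with a minor allele frequency below this are dropped


def mask_low_depth(genos, depths):
    """Set calls with insufficient read depth to None (missing)."""
    return [g if (g is not None and d >= MIN_DEPTH) else None
            for g, d in zip(genos, depths)]


def missing_rate(genos):
    """Fraction of samples without a usable genotype call."""
    return sum(g is None for g in genos) / len(genos)


def minor_allele_freq(genos):
    """Minor allele frequency computed from the called genotypes only."""
    called = [g for g in genos if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_site(genos, depths):
    """Apply the depth mask, then the missing-rate and MAF thresholds."""
    masked = mask_low_depth(genos, depths)
    return (missing_rate(masked) <= MAX_MISSING
            and minor_allele_freq(masked) >= MIN_MAF)


# Hypothetical sites: name -> (genotypes, depths) for ten samples.
sites = {
    "snp_a": ([0, 1, 2, 0, 1, 0, 0, 1, 0, 0], [10] * 10),
    "snp_b": ([0, 1, 2, 0, 1, 0, 0, 1, 0, 0], [10, 10, 10, 3, 5, 7, 10, 10, 10, 10]),
    "snp_c": ([0] * 10, [10] * 10),  # monomorphic, so MAF is 0
}
kept = [name for name, (g, d) in sites.items() if keep_site(g, d)]
print(kept)
```

Here `snp_b` is dropped because three of ten calls fall below the depth threshold (missing rate 0.3), and `snp_c` is dropped because it carries no minor allele.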
PCA, which reduces the dimensionality of large genotype datasets, revealed that the F, FS, O, AM, and SM populations were clearly separated in PC1 (Figure 3a). In PC2, the Taiwanese population (T population) was distinctly separated from the others. However, in PC3 and PC4, although some ornamental accessions (such as 'Yaetoji', 'Chinamume', and 'Jakobai') were plotted separately, the other populations showed no well-defined clustering. In the phylogenetic tree, the AM and SM populations formed a distinct cluster, whereas the other populations showed ambiguous clustering according to the usage type (F, FS, and O) (Figure 3b). Furthermore, ADMIXTURE analysis revealed that the cross-validation error was minimized at K = 4, indicating the optimal number of clusters. In the bar plot for K = 4, the AM and SM populations (red bars), as well as some ornamental accessions (e.g., 'Yaetoji', 'Chinamume', and 'Jakobai'; green bars), formed separate clusters, whereas the other populations were grouped into two major clusters (blue and orange bars; Figure 3c).
These results confirm that the AM, SM, and T populations have distinct structures compared with the other populations. Therefore, these populations were excluded from further analysis, and the remaining populations were subjected to the GWAS.
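Selecting K by the minimum cross-validation error can be sketched as follows. The example assumes that the ADMIXTURE log output has been collected into a string containing lines of the form `CV error (K=4): 0.51`; the error values below are invented for illustration and are not the values obtained in this study.

```python
import re

# Pattern for the cross-validation summary lines (assumed log format).
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text):
    """Return (K with the lowest CV error, dict of all parsed errors)."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no CV error lines found")
    return min(errors, key=errors.get), errors


# Hypothetical log excerpt; values are placeholders.
example_log = """
CV error (K=2): 0.58
CV error (K=3): 0.55
CV error (K=4): 0.51
CV error (K=5): 0.53
CV error (K=6): 0.56
"""
k_opt, cv_errors = best_k(example_log)
print(f"Lowest CV error at K = {k_opt}")
```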

Multi-Year GWAS Identifies SNPs Associated with Disease Severity
A GWAS was conducted on 93 Japanese apricot accessions, excluding the AM, SM, and T populations, based on the results of the population structure analysis. Genotype data were extracted from the SNP set, excluding positions with missing rates > 0.2 (minDP < 8), and imputed using Beagle [26]. Only the SNPs with minor allele frequencies > 0.03 were selected. The GWAS was performed using the mixed linear model method in TASSEL [27]. The ADMIXTURE results (K = 4) were used as the population structure, and the kinship matrix computed by TASSEL was used as the pedigree structure.

Overall, the results of the GWAS using the Rt data were more conservative than those using the Sv data (Figure 4). Inflation of peak detection was not observed based on the quantile-quantile (Q-Q) plots for each year, in which most of the observed p-values followed a uniform distribution (Figure 5). In 2019, the observed p-values were lower than expected, indicating lower power of the GWAS, possibly due to the insufficient sample size (Figure 5).
Significant peaks were detected for Rt in 2017 and for Sv in 2017 and 2018 (Tables S2-S4), whereas no significant peaks were detected in the data from 2016 and 2019. However, the Manhattan plots for 2016 and 2019 generally showed a shape similar to those from 2017 and 2018 (Figure 4), supporting the results obtained in 2017 and 2018. Among the significant peaks, nine SNPs were consistently detected in both 2017 and 2018, with one SNP located on chromosome 2 and eight located on chromosome 8 (Figure 4b, Tables S3 and S4). These SNPs were also detected in Rt in 2017 (Figure 4a, Table S2). Among these SNPs, Rt and Sv tended to be higher in accessions with a homozygous T genotype at SNP 23,000,599 on chromosome 2; accessions with heterozygous genotypes at SNPs 19,895,234; 19,956,845; 20,009,405; and 20,095,763 on chromosome 8; accessions with a homozygous C genotype at SNP 20,311,602; and accessions with homozygous T, G, and C genotypes at SNPs 20,392,183; 20,396,172; and 20,396,173, respectively (Figures 6 and S1-S3, Table S5).
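The supplementary tables report significant peaks at FDR < 0.1. The text does not state which FDR procedure was applied, so the sketch below assumes the widely used Benjamini-Hochberg adjustment purely for illustration; the p-values are invented.

```python
FDR_THRESHOLD = 0.1  # significance level quoted for Tables S2-S4


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-SNP p-values from one trait-year analysis.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = bh_adjust(p_values)
significant = [i for i, q in enumerate(q_values) if q < FDR_THRESHOLD]
print(significant)
```

With these placeholder values, the first four SNPs pass the threshold and the last does not.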

Candidate Genes for Scab Severity
To identify candidate genes near the nine significant GWAS peaks detected over multiple years, we first defined LD blocks around these peaks using HaploView [28]. We regarded a gene harboring a significant SNP as a candidate gene for SNPs where LD blocks could not be defined because the genotyping method used in this study targeted the genic region [18]. According to the genomic regions defined above, we obtained information for candidate genes (e.g., functional annotations for expressed protein) from the Phytozome 13 database (https://phytozome-next.jgi.doe.gov/) (accessed on 1 May 2023) based on the annotation of P. persica [24].

Discussion
In this study, we conducted multiyear trait evaluations using 108 accessions of Japanese apricots to obtain data on their resistance and susceptibility to scab disease (Table S1). The annual correlations for Rt and Sv were not very high, indicating that the scab severity of fruits may be influenced by climatic conditions, such as temperature, humidity, and precipitation. Therefore, we analyzed the Rt and Sv data separately for each year in the GWAS. A small number of accessions exhibited strong susceptibility to scab, whereas most accessions showed resistance (Figure 2). Furthermore, accessions with high Rt and Sv values (for example, 'W2', 'Tenjin', and 'Kotsubunanko') tended to be genetically related to 'Nanko' (Tables S1 and S5, Figure 3). In these accessions, the genotypic patterns of the GWAS peaks consistently detected over multiple years were highly similar (Table S5). 'Nanko' has been a major cultivar in Japan for over 50 years since its registration in 1966 and is extensively cultivated in the Wakayama Prefecture [1,5]. In modern agricultural environments where a single variety is cultivated on a large scale for an extended period, the evolution of pathogenic races specific to the variety can be enhanced [29]. It is possible that 'Nanko' (or its ancestors), perhaps originally not highly susceptible to scab, experienced the co-evolution of the pathogen V. carpophila because of its large-scale, long-term (>50 years) cultivation, possibly involving the breakdown of the resistance genes. The complete genome of V. carpophila has already been published; therefore, future research focusing on the genetic diversity of this pathogen may facilitate elucidation of the mechanisms of resistance/susceptibility to scab in Japanese apricots [11].
To the best of our knowledge, no studies have investigated the large-scale variation in scab resistance among other Prunus fruit tree species. However, in apples, the resistance/susceptibility to scab (caused by V. inaequalis) has been evaluated in 177 accessions [30]. Among the 61 cultivated apple (M. domestica) varieties, only 8 showed resistance to scab, indicating a breakdown of resistance. Notably, these eight resistant varieties were developed by introducing the resistance gene Rvi6 derived from Malus floribunda 821 through breeding. It is essential to promote resistance breeding in Japanese apricots to diversify variety groups and establish an agricultural system that is not overly dependent on 'Nanko'.
The GWAS results showed reproducibility of significant associations on chromosomes 2 and 8 (Figure 4). The SNPs consistently detected for Rt in 2017 and for Sv in 2017 and 2018 could partly explain the strong susceptibility to scab in the 'Nanko' family (Table S5). First, in 'Nanko', all SNPs except SNP 23,000,599 on chromosome 2 showed the susceptible genotype. Other accessions with the same genotype as 'Nanko' included 'Tanfun' and 'Sadayuume'. Furthermore, in 'W2', which tended to be more susceptible than 'Nanko', the susceptible genotype was observed in all nine SNPs (Table S5). Accessions having the same genotype as 'W2' included 'Tenjin' and 'Kotsubunanko'. All these accessions were in close proximity in the phylogenetic tree (Figure 3). Therefore, the significant associations identified in this study potentially harbor genomic regions that contribute to the high susceptibility to scab in the 'Nanko' family. An example is the difference in scab susceptibility among 'W2', 'Seiko', and 'Seishu', all offspring of 'Nanko'. 'Seiko', having moderate resistance to scab, was developed by crossing 'Nanko' with 'Jizoume', while 'Seishu', having strong resistance, was developed by crossing 'Nanko' with 'Kensaki' [31,32]. There was a significant variation in the resistance level, with 'Seiko' having a marginally lower disease severity than 'Nanko', while 'Seishu' showed lower disease severity than that of 'Seiko' (Table S5). This difference can be partially explained by the nine SNPs identified in this study. In 'Seiko', four SNPs on chromosome 8 (SNP 19,895,234-20,095,763) exhibited resistant genotypes, while SNP 20,311,602-20,396,173 remained the susceptible type, and 'Seishu' showed the resistance genotype in all nine SNPs (Table S5). Note that 'W2' is a full sibling of 'Seiko' having typical susceptibility to scab, as described above.
Therefore, the SNPs identified in this study may serve as selection markers for scab resistance, particularly in segregating populations derived from crosses involving 'Nanko'. However, in other lineages, there were some susceptible varieties despite having resistance genotypes in all the nine SNPs, as seen in 'Jizoume', 'Inabungo', and 'Ellching'. Other loci that were not detected in the present study may be involved in the susceptibility of these accessions to scab. A GWAS is a powerful method for estimating causal genetic loci in fruit trees without segregating populations. However, this methodology has limitations, such as the inability to detect loci with small effects and rare variants [33]. In addition, annual variations in the severity of naturally occurring scab in the field may affect the detection of the responsible loci. Further approaches, such as inoculation methods, are required to confirm these results.
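As a rough illustration of how the nine SNPs might be used as selection markers, the Python sketch below counts how many of them carry the genotype associated with higher Rt and Sv. The genotype rules follow the Results text (assuming that "homozygous T, G, and C" applies to the last three SNPs in the order listed); the example genotype profiles, including all heterozygous alleles and the non-T alleles, are placeholders rather than values from Table S5, and the count is not a validated selection criterion.

```python
# (chromosome, position) -> genotype pattern associated with higher Rt/Sv.
# "het" means any heterozygous call; "hom:X" means homozygous for allele X.
SUSCEPTIBLE_RULES = {
    (2, 23000599): "hom:T",
    (8, 19895234): "het",
    (8, 19956845): "het",
    (8, 20009405): "het",
    (8, 20095763): "het",
    (8, 20311602): "hom:C",
    (8, 20392183): "hom:T",
    (8, 20396172): "hom:G",
    (8, 20396173): "hom:C",
}


def matches_rule(genotype, rule):
    """Check one genotype such as 'A/G' against a rule such as 'het'."""
    first, second = genotype.split("/")
    if rule == "het":
        return first != second
    allele = rule.split(":")[1]
    return first == second == allele


def susceptible_count(profile):
    """Number of the nine SNPs at which a profile matches the rule."""
    return sum(matches_rule(profile[site], rule)
               for site, rule in SUSCEPTIBLE_RULES.items())


# Placeholder profile matching the rule at all nine SNPs.
profile_all = {
    (2, 23000599): "T/T",
    (8, 19895234): "A/G", (8, 19956845): "A/G",
    (8, 20009405): "A/G", (8, 20095763): "A/G",
    (8, 20311602): "C/C", (8, 20392183): "T/T",
    (8, 20396172): "G/G", (8, 20396173): "C/C",
}
# Same profile, but not homozygous T at the chromosome 2 SNP.
profile_eight = dict(profile_all)
profile_eight[(2, 23000599)] = "C/T"

print(susceptible_count(profile_all), susceptible_count(profile_eight))
```

The two placeholder profiles mirror the patterns described for 'W2' (all nine SNPs) and 'Nanko' (all except the chromosome 2 SNP), without reproducing their actual alleles.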
Several candidate genes were identified near these nine SNPs (Table 3). Some of these genes, such as the leucine-rich repeat genes Prupe.8G217900 and Prupe.8G220100, have been implicated in disease resistance in Arabidopsis and play a role in recognizing proteins secreted by pathogens [34]. The tetratricopeptide repeat protein (Prupe.8G219000) is also involved in disease resistance. Ribophorin II (RPN2) (Prupe.8G221600) is associated with susceptibility to powdery mildew [35]. Additionally, 2OG-Fe(II) oxygenase family proteins (Prupe.8G226500, Prupe.8G226600, and Prupe.8G226700) and oxidoreductases have been linked to susceptibility to downy mildew [36]. LOB domain-containing proteins (Prupe.8G227300, Prupe.8G227400, and Prupe.8G227500) are negative regulators of the susceptibility of Arabidopsis to the root-infecting fungus Fusarium oxysporum [37], and they are also involved in susceptibility to bacterial canker in citrus [38]. Finally, poly [ADP-ribose] polymerase 1 (Prupe.8G227600) recognizes pathogen-associated molecular patterns and is involved in pathogen infection responses [39]. These genes represent candidates that are likely to contribute to our understanding of the susceptibility of Japanese apricot varieties to scab. Experimental confirmation, such as the measurement of expression levels, is necessary to clarify the roles of these genes.

Conclusions
In this study, we identified nine SNPs located in or near genes that may contribute to the susceptibility of the 'Nanko' lineage to scab. The SNPs can potentially be used as selection markers for improving disease resistance in 'Nanko', one of the major varieties in Japan. Different genetic loci may be involved in scab resistance observed in other lineages. Increasing the sample size is likely to lead to the discovery of more genetic loci; however, achieving this is challenging for large fruit trees with long juvenile periods. The application of genomic prediction, as attempted in apples, Japanese pears, and citrus, may provide a promising approach for introducing minor resistance loci [40–42]. The trait data obtained in this study will be valuable for advancing the breeding of Japanese apricot varieties with improved disease resistance. Further exploration of genetic loci related to other diseases will contribute to the sustainable development of the Japanese apricot industry.

Plant Materials
One hundred and eight Japanese apricot accessions maintained at the Japanese Apricot Laboratory, Wakayama Fruit Tree Experiment Station (Minabe-cho, Hidaka-gun, Wakayama, Japan), were used in this study (Table S1). The number of accessions in each group was as follows: F, 45; FS, 11; O, 37; T, 5; AM, 8; and SM, 2. The trees were managed without fungicide application during the survey period (2016-2019).

Evaluation of Scab Resistance
Accessions that fruited during the survey period were selected. To determine the natural prevalence of scab disease, investigations were conducted on 16-17 May 2016, 1 June 2017, 1 June 2018, and 16 June 2019, in which 100 fruits per tree (or all fruits if <100) were examined. Disease severity was classified based on the number of lesions per fruit as follows: score 0, no symptoms; score 1, 1-3 lesions; score 2, 4-8 lesions; score 4, 9-20 lesions; and score 6, ≥21 lesions. The Rt (%) and Sv were calculated [22,23], and the values are listed in Table S1. Sv was calculated as follows:
$$\mathrm{Sv} = \frac{\sum (\text{score} \times \text{number of fruits with corresponding severity})}{6 \times \text{number of fruits surveyed}} \times 100 \qquad (1)$$
Spearman's rank correlation coefficients were calculated using the R package "psych".
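The scoring scheme and Equation (1) can be expressed as a short Python sketch. It assumes that Rt is the percentage of surveyed fruits with at least one lesion, which is how "ratio of diseased fruits" is read here; the function names and the lesion counts in the example are hypothetical.

```python
def lesion_score(n_lesions):
    """Map the number of lesions on one fruit to its severity score."""
    if n_lesions == 0:
        return 0
    if n_lesions <= 3:
        return 1
    if n_lesions <= 8:
        return 2
    if n_lesions <= 20:
        return 4
    return 6  # 21 or more lesions


def rt_and_sv(lesions_per_fruit):
    """Return (Rt in %, Sv) for the fruits surveyed on one tree."""
    n_fruits = len(lesions_per_fruit)
    if n_fruits == 0:
        raise ValueError("no fruits surveyed")
    scores = [lesion_score(n) for n in lesions_per_fruit]
    # Rt: share of fruits with at least one lesion (assumed definition).
    rt = 100.0 * sum(1 for s in scores if s > 0) / n_fruits
    # Sv: sum of scores relative to the maximum possible (score 6 per fruit).
    sv = 100.0 * sum(scores) / (6 * n_fruits)
    return rt, sv


# Hypothetical lesion counts for ten fruits from one tree.
example_counts = [0, 0, 0, 0, 2, 5, 10, 25, 0, 1]
rt, sv = rt_and_sv(example_counts)
print(f"Rt = {rt:.1f}%, Sv = {sv:.1f}")
```

For these ten fruits the scores are 0, 0, 0, 0, 1, 2, 4, 6, 0, and 1, so five fruits are diseased and the scores sum to 14 out of a possible 60.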

Genome-Wide Genotyping Using the Target Capture Method
Publicly available sequencing data [18] were used for other accessions. The SRA accession numbers are listed in Table S1.

Population Structure Analysis
Before analysis, loci with missing rates exceeding 20% (those with DP < 8 were considered missing) were excluded from the VCF file obtained earlier using VCFtools 0.1.16 [48]. Imputation was performed using Beagle 5.2 [26], and loci with minor allele frequency < 0.03 were excluded. Subsequently, SNP pairs with r² values > 0.5 were pruned by sliding a window of 50 SNPs with a step of 3 SNPs using PLINK v1.90b6.24 [49]. The resulting SNP set was used for the population structure analysis. PCA using PLINK, maximum likelihood phylogenetic analysis using IQTree 2.0.3 [50], and analysis using ADMIXTURE 1.3.0 [25] were conducted. In the phylogenetic analysis, ModelFinder [51] was used for phylogenetic model selection and ultrafast bootstrap approximation [52] was employed to assess clade support. In the ADMIXTURE analysis, a cross-validation error analysis was performed to determine the optimal number of clusters (K value), according to the software manual.
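The idea behind window-based LD pruning can be sketched in plain Python. This is a simplified greedy version written for illustration, not a reimplementation of PLINK: it always drops the later SNP of a correlated pair, whereas PLINK's own procedure differs in detail. Genotypes are assumed to be complete alternate-allele counts (0, 1, 2), and the example data are invented.

```python
def r_squared(a, b):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic site carries no LD information
    return cov * cov / (var_a * var_b)


def ld_prune(genotypes, window=50, step=3, threshold=0.5):
    """Return indices of SNPs kept after greedy pairwise pruning."""
    n_snps = len(genotypes)
    removed = [False] * n_snps
    start = 0
    while start < n_snps:
        end = min(start + window, n_snps)
        for i in range(start, end):
            if removed[i]:
                continue
            for j in range(i + 1, end):
                if removed[j]:
                    continue
                if r_squared(genotypes[i], genotypes[j]) > threshold:
                    removed[j] = True  # drop the later SNP of the pair
        start += step  # slide the window forward
    return [i for i in range(n_snps) if not removed[i]]


# Hypothetical genotypes: four SNPs (rows) scored in six samples.
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to SNP 0 (r² = 1)
    [2, 0, 1, 1, 0, 2],  # uncorrelated with SNP 0 (r² = 0)
    [0, 1, 2, 0, 1, 1],  # strongly correlated with SNP 0 (r² ≈ 0.79)
]
kept_indices = ld_prune(snps)
print(kept_indices)
```

The defaults mirror the parameters stated above (window of 50 SNPs, step of 3 SNPs, r² threshold of 0.5).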

GWAS
A GWAS was conducted on 93 Japanese apricot accessions based on the results of the population structure analysis, excluding the AM, SM, and T populations. Genotype data were extracted from the SNP set, excluding positions with a missing rate exceeding 0.2 (minDP < 8), and were then imputed using Beagle 5.2 [26]. SNPs with a minor allele frequency > 0.03 were further selected. GWAS was performed using the mixed linear model method in TASSEL 5.2 [27]. The optimal ADMIXTURE analysis results (K = 4) were used as the population structure. The kinship matrix computed by TASSEL was used as the pedigree structure.
To identify candidate genes in the vicinity of significant GWAS peaks detected over multiple years, we first attempted to define LD blocks around these peaks using HaploView 4.2 [28]. For SNPs in which LD blocks could not be defined, we regarded the gene harboring the significant SNP as a candidate gene because the genotyping method used in the current study targeted the genic region [18]. According to the genomic regions defined above, we obtained information on candidate genes (e.g., functional annotations for expressed protein) from the Phytozome 13 database (https://phytozome-next.jgi.doe.gov/) (accessed on 1 May 2023) based on the annotation of P. persica [24].
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/horticulturae9080872/s1. Figure S1: Box plots of trait values in 2017 for genotypes at nine SNPs detected in multiple years in GWAS; Figure S2: Box plots of trait values in 2018 for genotypes at nine SNPs detected in multiple years in GWAS; Figure S3: Box plots of trait values in 2019 for genotypes at nine SNPs detected in multiple years in GWAS; Table S1: Ratio of diseased fruits (Rt) and disease severity index (Sv) observed among 108 Japanese apricot accessions in 4 years (2016-2019); Table S2: Significant GWAS peaks (FDR < 0.1) for Rt in 2017; Table S3: Significant GWAS peaks (FDR < 0.1) for Sv in 2017; Table S4: Significant GWAS peaks (FDR < 0.1) for Sv in 2018; Table S5: Rt, Sv, and genotypes at nine significant GWAS peaks detected in multiple years.
Funding: This research was funded by JSPS KAKENHI, grant number JP18K14449, to K.N. and Wakayama Prefecture.
Data Availability Statement: Raw FASTQ reads for Prunus mume accessions sequenced in this study were deposited in the Sequence Read Archive (SRA) under accession number DRA016280. The SRA run accession numbers for all sequence reads are listed in Table S1.