Genome-Wide Association Study (GWAS) for Examining the Genomics Controlling Prickle Production in Red Raspberry (

Abstract: Red raspberry (Rubus idaeus L.) is an expanding high-value berry crop worldwide. The presence of prickles, outgrowths of epidermal tissues lacking vasculature, on the canes, petioles, and undersides of leaves complicates both field management and harvest. The utilization of cultivars with fewer prickles or prickle-free canes simplifies production. A previously generated population segregating for prickles utilizing the s locus between the prickle-free cultivar Joan J (ss) and the prickled cultivar Caroline (Ss) was analyzed to identify the genomic region associated with prickle development in red raspberry. Genotype by sequencing (GBS) was combined with a genome-wide association study (GWAS) using fixed and random model circulating probability unification (FarmCPU) to analyze 8474 single nucleotide polymorphisms (SNPs) and identify significant markers associated with the prickle-free trait. A total of four SNPs were identified on chromosome 4 that were associated with the phenotype and were located near or in annotated genes. This study demonstrates how association genetics can be used to decipher the genetic control of important horticultural traits in Rubus, and provides valuable information about the genomic region and potential genes underlying the prickle-free trait.


Introduction
Red raspberry belongs to the genus Rubus, which is a member of the family Rosaceae and subfamily Rosoideae. Rubus is one of the most diverse genera, comprising 12 subgenera, with the overall number of species estimated to be between 600 and 800 [1]. The most commonly cultivated types worldwide are red raspberry (R. idaeus L.), black raspberry (R. occidentalis L.), and various blackberry types (R. hybrid) [2,3]. Red raspberry is a globally commercialized specialty crop and is the most economically important species in the genus. It is diploid (2n = 2x = 14) with a basic set of seven chromosomes [1] and a nuclear haploid genome of approximately 280 Mb [4]. R. idaeus is highly heterozygous in its wild state due to the presence of dominant self-incompatibility alleles that limit pollen tube formation in the style [5]. Although germplasm used in breeding programs has been selected for self-compatibility, inbreeding depression can be very severe, which constrains population development for both breeding and molecular studies [6]. Recurrent mass selection is used as the breeding approach, with only limited self-pollination and backcrossing for specific traits; thus, inbred lines for developing study populations are unavailable.
A plant's epidermis plays several important roles, from protection against pathogens and predators to controlling vital exchange of gas, water and nutrients with the environment. Epidermal structures like the cuticle, hairs, waxy exudates, trichomes, spines, thorns and prickles help in the protection of plants in several ways [7]. Trichomes, simple hair-like structures, in addition to providing protection against biotic and abiotic stresses [8–10], also moderate surface temperatures and help reduce the transpiration rate by creating a barrier between the epidermis and the environment [11]. Thorns (modified branches containing vasculature), spines (modified leaves containing vasculature) and prickles (epidermal outgrowths lacking vasculature), while botanically different organs, all provide additional mechanical protection from herbivory and mechanical damage [12–16].
Additionally, the presence of glandular hairs (pubescence) has been found to be associated with resistance to some fungal diseases [17]. Raspberry cultivars with fine hairs on the canes (pubescent canes) are known to be more resistant to spur blight (Didymella applanata (Niessl) Sacc.) and cane botrytis (Botrytis cinerea) [18–20] but are more susceptible to powdery mildew (Sphaerotheca macularis), yellow rust (Phragmidium rubi-idaei) and cane spot (Elsinoe veneta) [5,21–24]. Pubescence in raspberry is determined by gene H, manifested as glandular hairs that are distinct from prickles, with the recessive allele resulting in glabrous canes in the homozygous state (genotype hh) [17]. Gene H has been mapped to linkage group (LG) 2 [25]. Moreover, besides cane pubescence, gene H is known to have other linked pleiotropic effects including a slight increase in prickle density (the term spine is used in the literature) and a decrease in prickle size [22,26]. Since both hairs and prickles are glandular epidermal outgrowths, their early development is likely inter-related [27], suggesting that gene H may be regulated similarly to the early development of Rubus prickles and affect several cell characteristics. Similarly, among several genes known to confer prickles in various Rubus species, the s locus in red raspberry is a simple dominant/recessive locus for the presence of prickles, and has yet to be genetically characterized [28,29]. Other major loci conferring the prickle-free trait in various Rubus species are STE [30], Sf [31], and SfL [32], which all confer the prickle-free trait in the dominant state and originate in blackberry germplasm not generally utilized in breeding due to the unpredictable nature of the loci and ploidy incompatibility.
In rose, the prickle-free trait is also recessive and has been mapped to a major locus on linkage group 3 (LG3) in an interspecific population with three quantitative trait loci (QTLs) on LG3, LG4 and LG1 associated with prickle density [33]. However, segregation for prickles in this population was highly skewed with the prickle-free trait deviating significantly from simple Mendelian inheritance suggesting genetic anomalies or incompatibilities in the interspecific population studied.
Despite the fact that prickles help in protection against mechanical damage, natural predators, and fungal pathogens, it is an undesirable trait for commercial production as a complicating factor in pruning, trellising, and harvest. Hence, the prickle-free trait in Rubus cultivars is one of the most sought after traits for breeders and commercial industries. Several programs have made efforts in incorporating the prickle-free phenotype in their programs, resulting in cultivars such as 'Joan J' and 'Glen Ample' red raspberry [34], and 'Natchez', 'Chester' [35], 'Apache', and 'Triple Crown' blackberry, among others. However, it is time consuming and expensive to integrate the prickle-free trait with other important traits through conventional breeding approaches. For instance, the prickle-free trait has been transferred into black raspberry (R. occidentalis L.) germplasm from related red raspberry germplasm through a multigenerational process spanning more than a decade [36] but has yet to result in a commercial quality cultivar. Additionally, cultivated blackberries are bred at the tetraploid or higher ploidy level, which makes the production of prickle-free homozygous recessive types for evaluation extremely difficult when utilizing the majority of the prickled germplasm available to breeders [14]. The development of molecular tools will expedite both the breeding process and the development of prickle-free cultivars through alternative methods.
In this study, an F1 population segregating for prickles [37] controlled by the s locus conferring the prickle-free trait in the homozygous recessive state [28,29] and segregating in a standard 1:1 Mendelian ratio was examined. As such, the trait is simple and unambiguous to score starting at the cotyledon stage and throughout the life cycle of the raspberry plant. Although a molecular marker for the gene is not necessary for selecting prickle-free canes, it can simplify breeding by eliminating the need for test crosses with prickled parental selections. Additionally, precise genomic data is needed if gene editing techniques are to be utilized to modify important existing cultivars to a prickle-free version.
Currently, the information on the underlying genomics of prickle development in R. idaeus is limited. Genome-wide association studies (GWAS) are powerful tools for mapping complex traits down to the sequence level, aiding in the identification of genes associated with important horticultural traits and biological processes [38,39]. GWAS identifies the association between loci and a particular trait by examining single nucleotide polymorphisms (SNPs) throughout an entire genome. As selected loci and SNPs are associated with a target trait, GWAS can be used for the development of useful molecular markers. GWAS identifies associated SNPs of the target trait using both genotypic and phenotypic data and various statistical methods, which can then be linked to genomic sequence data to identify gene candidates within a specific genomic region. The present study utilizes genome-wide association analysis to elucidate the genetic control of prickle development in red raspberry by identifying significant SNPs associated with the trait, linking those SNPs to a specific chromosomal region, and identifying putative candidate genes underlying the trait.

Plant Materials
A previously described population from a controlled hybridization between prickle-free 'Joan J' (ss) and prickled 'Caroline' (Ss) was phenotyped for cane prickles [37]. 'Caroline' is derived from the hybridization GE01 × 'Heritage', in which GE01 is a cross of 'Autumn Bliss' × 'Glen Moy'. 'Heritage' and 'Autumn Bliss' are homozygous for prickles and 'Glen Moy' is prickle-free. Similarly, the source of the prickle-free alleles in 'Joan J' is also 'Glen Moy' through a complex set of hybridizations [40]. Plants were grown in a growth chamber prior to transfer to a greenhouse for phenotyping and tissue collection. Leaf tissue was collected from 90 F1 progeny along with both parents (92 total) for DNA extraction.

DNA Extraction
Genomic DNA was isolated from parents and progeny using a Qiagen DNeasy 96 Plant Extraction Kit (Qiagen Inc., Valencia, CA, USA) following the manufacturer's protocol. DNA concentration and purity were assessed using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). DNA contamination and degradation were assessed on 1.0% agarose gels.

Genotype by Sequencing (GBS)
Genotype by sequencing (GBS) (Novogene, Beijing, China) was used to develop SNP markers on a population segregating for prickles. Briefly, DNA from the 92 samples was digested with MseI and EcoRI. Then, a 95-plex GBS sequencing library was prepared by ligating the digested DNA to unique barcode adapters. PCR amplification was performed according to a standard PCR protocol [41], and then all the samples were pooled and size-selected for the required fragments to complete the library construction. Sequencing was performed using an Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA).

Preprocessing
After sequencing, the raw reads were demultiplexed according to the barcode sequences and trimmed using the Illumina pipeline CASAVA v1.8.2 software. The sequences and corresponding quality information were stored in 92 separate FASTQ files. It is common for the quality of bases to decrease at the ends of Illumina reads, so the ends were trimmed at the point where the Phred quality score dropped below Q = 20 (a 0.01 probability of error). Additionally, all 5′ and 3′ stretches of ambiguous N nucleotides were trimmed. Poor quality sequence reads and reads shorter than 25 bases were discarded.
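The trimming rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the CASAVA implementation: the function names, the order of the steps, and the choice to cut at the first base below the cutoff are assumptions made for the example. The conversion itself is the standard Phred definition, P = 10^(−Q/10), so Q = 20 corresponds to an error probability of 0.01.

```python
def phred_to_error_prob(q):
    """Convert a Phred score Q to a base-call error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def trim_read(seq, quals, min_q=20, min_len=25):
    """Trim one read; return (seq, quals), or None if the read is discarded.

    Hypothetical helper: the step order and cut rule are assumptions.
    """
    # Strip ambiguous N bases from the 5' and 3' ends.
    start, end = 0, len(seq)
    while start < end and seq[start] == "N":
        start += 1
    while end > start and seq[end - 1] == "N":
        end -= 1
    seq, quals = seq[start:end], quals[start:end]

    # Cut the read at the first base whose quality falls below min_q.
    for i, q in enumerate(quals):
        if q < min_q:
            seq, quals = seq[:i], quals[:i]
            break

    # Discard reads that end up shorter than min_len bases.
    if len(seq) < min_len:
        return None
    return seq, quals


# Example: a 35-base read with N bases at both ends and uniform quality 30.
trimmed = trim_read("NN" + "ACGT" * 8 + "N", [30] * 35)
print(phred_to_error_prob(20), len(trimmed[0]))
```

A read whose quality drops early is shortened below 25 bases and returned as None, which mirrors the discard step in the text.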

Sequence Alignment, SNP Calling, and SNP Imputation
The sequencing reads were processed with the GBS Discovery Pipeline for species with a reference genome implemented in TASSEL Version 3.0 [42] and following the pipeline documentation [43]. The Burrows-Wheeler Aligner (BWA) program [44] (parameters: mem -t 4 -k 32 -M) was used to align the clean reads to an unpublished version of the red raspberry (R. idaeus) genome as the reference genome. The TASSEL 3.0 Discovery SNP Caller was implemented to align the multiple sequence tags from the same physical locations across the genome, to call SNPs at these locations across the individual samples. Quality control checks were performed by eliminating markers with minor allele frequency (MAF) < 5%, and markers with a missing rate higher than 10%. Marker data imputation was then applied using the FILLIN algorithm [45] implemented in TASSEL Version 5.0 [42]. The final GBS SNPs were plotted against their respective physical positions on each chromosome to examine their density and distribution with ggplot2 [46]. Identification of significant SNPs from the remaining 8474 SNPs was completed after filtration and imputation.
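The two marker quality-control filters (MAF < 5% removed, missing rate > 10% removed) can be illustrated with a short sketch. The genotype encoding used here (0, 1, or 2 copies of the alternate allele, with None for a missing call) and the function names are assumptions for the example; TASSEL applies its own internal representation.

```python
def missing_rate(genotypes):
    """Fraction of samples with no genotype call at this marker."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF for a diploid biallelic marker coded as alternate-allele counts."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passes_qc(genotypes, min_maf=0.05, max_missing=0.10):
    """Keep a marker only if it satisfies both thresholds from the text."""
    return (missing_rate(genotypes) <= max_missing
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical markers scored on 10 samples.
markers = {
    "well_behaved": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "too_many_missing": [1, 1, 1, 1, 1, 1, 1, 1, None, None],
    "monomorphic": [0] * 10,
}
kept = [name for name, calls in markers.items() if passes_qc(calls)]
print(kept)
```

In a backcross-like population such as ss × Ss, a fully informative marker is expected to show a minor allele frequency near 0.25, well above the 5% cutoff.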

Genome-Wide Association Analysis
To discover associations between the genome-wide GBS SNPs and the prickle phenotype in our population, a genome-wide association study (GWAS) was conducted. The general linear model (GLM) and the mixed linear model (MLM) are the most commonly used models for association analysis. A GLM accounts for population structure only, whereas an MLM accounts for both population structure and family relatedness [47,48]. Compared to GLM, the incorporation of population structure and family relatedness in MLM helps control false positives in the association tests; however, true positives can be compromised by these adjustments [49]. Both GLM and MLM are therefore prone to false negatives due to overfitting of the model, so that important associations can be missed. Fixed and random model circulating probability unification (FarmCPU) works to correct false positives while identifying true positives [49]. In this study, GLM, MLM, and FarmCPU models were compared using phenotypic data, and results were evaluated based on quantile-quantile (Q-Q) plots. The FarmCPU model proved to be the most accurate model for association analysis of the data from the raspberry population studied here.
In FarmCPU, the multiple loci linear mixed model (MLMM) is divided into two parts: a fixed effect model (FEM) and a random effect model (REM), which are used iteratively [49]. To avoid model overfitting, the REM estimates the multiple associated markers that are used to obtain kinship. The FEM tests markers one at a time, with kinship from the REM as covariates, to control false positives and false negatives. At each iteration, the p-values of the testing markers and the multiple associated markers are unified. GWAS was conducted using the genome association and prediction integrated tool (GAPIT) R package [50] in R v.3.0.2. The analysis comprised 8474 SNPs, each with a minor allele frequency greater than or equal to 0.05. A threshold of −log10(p) ≥ 5.98 was used to declare a significant association of SNPs with the prickle phenotype.

Candidate Gene Identification
SNPs at a level of −log10(p) ≥ 5.98 were considered highly significant and were characterized in silico for their genomic position and functional effect. Candidate genes surrounding significantly associated SNPs were annotated using the Blast2GO tool with BLASTp search against the non-redundant protein database [51]. Candidate genes with a possible connection to prickle, trichome, or epidermal cell growth regulation were taken into consideration.

Phenotype Descriptions
The gross morphology of prickled versus prickle-free raspberry has been previously described [37] and is demonstrated in Figure 1. There are clear morphological differences between the two types, prickled plants having a mixture of the prickle development phases from recently initiated to mature lignified prickles on their stem, petioles and leaves, and prickle-free plants having smooth epidermal surfaces with only simple trichomes. The 'Joan J' × 'Caroline' population segregated almost perfectly in the predicted Mendelian ratio (1:1) (44 prickled:46 prickle-free) for the presence/absence of prickles (Figure 2A).
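The fit of the observed 44:46 segregation to the expected 1:1 ratio can be checked with a chi-square goodness-of-fit test. The sketch below is an illustration added here rather than an analysis reported by the authors; it uses only the standard library and the closed form for the one-degree-of-freedom survival function, P(χ² > x) = erfc(√(x/2)).

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# 44 prickled : 46 prickle-free progeny, tested against 1:1 (ss x Ss).
chi2 = chi_square_gof([44, 46], [1, 1])
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, p = {p:.2f}")
```

With 45 expected in each class, the statistic is 2/45, and the large p-value gives no evidence against single-locus 1:1 segregation.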

Genome-Wide Association Analysis
A total of 8474 SNPs were used for association analysis to identify significant SNPs. The final GBS SNPs were plotted against their respective physical positions on each chromosome with ggplot2 [46] to examine their density and distribution (Figure 2B). A comparison of the FarmCPU, GLM, and MLM models was completed using the prickle data across all parents and progenies. The quantile-quantile (Q-Q) plot assesses how well a GWAS model accounts for population structure and kinship (familial relatedness). The negative logarithms of the p-values from the fitted models are plotted against their expected values under the null hypothesis of no association with the trait. The majority of the points should lie on the diagonal line, since most of the SNPs tested are not associated with the trait. Deviation from the diagonal is expected in the right tail, which suggests the association of those SNPs with the trait. The Q-Q plot of the FarmCPU model showed a sharp deviation from the expected −log10(p) distribution in the tail area, indicating the association of those SNPs with the trait and that false positives and negatives were adequately controlled. However, Q-Q plots from the MLM and GLM models did not show a sharp deviation (Figure 3). With MLM and GLM, most of the SNPs did not lie on the diagonal line, indicating that FarmCPU was a better choice than the MLM and GLM models for association testing with this data set.
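The coordinates of a Q-Q plot can be computed as follows. This is a generic sketch, not the GAPIT plotting code: the function name is hypothetical, and the expected quantiles use the plotting position (i + 0.5)/n, which is one common convention among several.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, most significant first."""
    n = len(p_values)
    # Observed values, sorted so the smallest p-value comes first.
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    # Expected values if the p-values were uniform on (0, 1) under the null.
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, observed))


# Mostly null p-values plus one strongly associated marker.
example = [0.91, 0.64, 0.47, 0.33, 0.18, 0.07, 1e-12]
for exp, obs in qq_points(example)[:2]:
    print(f"expected {exp:.2f}, observed {obs:.2f}")
```

Points for unassociated markers fall close to the diagonal (observed ≈ expected), while a truly associated marker sits far above it in the right tail, which is the pattern described for the FarmCPU model.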
Figure 2. (B) Distribution of the final GBS SNPs across the raspberry chromosomes. The x-axis represents the distance in base pairs.

GWAS using the FarmCPU model was conducted on 8474 SNPs (MAF ≥ 0.05) that segregated within this population. Four SNPs significantly associated with the prickle phenotype were identified at the level of −log10(p) ≥ 5.98 (Figure 4); together they delimit a 1604 kb physical region (33,543,882–35,148,226) on the distal portion of chromosome 4. The allelic effect for these significant SNPs ranged from −0.38 to 0.20 (Table 1). The most significant SNP, 4_34738035 (5.08 × 10⁻²³), contributed negatively to the trait, indicating that the major allele (i.e., more common) favored the prickle-free phenotype, followed by two other significant SNPs, 4_33543882 (2.80 × 10⁻¹¹) and 4_34134523 (2.22 × 10⁻¹⁸), which also contributed negatively to the trait (i.e., favored prickle-free). SNP 4_35148226 (3.61 × 10⁻¹³) contributed positively to the trait, indicating that the minor allele (i.e., less common) favored the prickle-free trait.
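The reported p-values and the size of the delimited region can be checked with a few lines of arithmetic. The sketch assumes that each marker name encodes chromosome and physical position as chromosome_position, which is consistent with the interval boundaries given in the text; the variable and function names are illustrative only.

```python
import math

THRESHOLD = 5.98  # significance cutoff on the -log10(p) scale, from the text

# p-values reported for the four associated SNPs.
snp_p_values = {
    "4_33543882": 2.80e-11,
    "4_34134523": 2.22e-18,
    "4_34738035": 5.08e-23,
    "4_35148226": 3.61e-13,
}


def neg_log10(p):
    return -math.log10(p)


def position(snp_id):
    """Physical position in bp, assuming names of the form chromosome_position."""
    return int(snp_id.split("_")[1])


significant = {snp: neg_log10(p) for snp, p in snp_p_values.items()
               if neg_log10(p) >= THRESHOLD}
positions = sorted(position(snp) for snp in significant)
span_bp = positions[-1] - positions[0]

for snp, score in sorted(significant.items(), key=lambda kv: -kv[1]):
    print(f"{snp}: -log10(p) = {score:.1f}")
print(f"region span: {span_bp} bp (about {span_bp // 1000} kb)")
```

All four markers clear the threshold by a wide margin, and the distance between the outermost markers reproduces the 1604 kb region.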

Candidate Gene Identification
Candidate genes flanking the four SNPs significantly associated with the trait were identified based on the annotation of the draft red raspberry genome (Table 2). Altogether, 98 transcripts were identified within 100 kb upstream and downstream of the associated SNPs. Among the protein-coding genes identified was the transcription factor (TF) MYB16-like (MIXTA-like R2R3-MYB family member), which regulates conical cell outgrowth and trichome initiation in diverse plant species [52]. R2R3-MYB TFs have been determined to ... Other transcripts identified in the nearby region of the SNPs included Rosa chinensis axial regulator YABBY 4, Rosa chinensis MLP-like protein 31, Rosa chinensis agamous-like MADS-box protein AGL30, Rosa chinensis ubiquitin-like-specific protease 1D, and Rosa chinensis protein trichome birefringence-like 2. The accession numbers for the genes flanking each marker are listed in Table 2.
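The ±100 kb window search can be sketched as an interval-overlap query. The gene names and coordinates below are hypothetical placeholders, since the draft genome annotation is not reproduced here, and counting any gene that overlaps the window (rather than only genes fully inside it) is an assumption of this example.

```python
FLANK_BP = 100_000  # window half-width from the text (100 kb each side)


def genes_in_window(snp_pos, genes, flank=FLANK_BP):
    """Names of genes whose [start, end] interval overlaps snp_pos +/- flank."""
    lo, hi = snp_pos - flank, snp_pos + flank
    return [name for name, start, end in genes if start <= hi and end >= lo]


# Hypothetical annotation records on chromosome 4: (name, start, end).
annotation = [
    ("gene_A", 33_440_000, 33_446_000),  # straddles the lower window edge
    ("gene_B", 33_640_000, 33_650_000),  # straddles the upper window edge
    ("gene_C", 33_700_000, 33_705_000),  # outside the window
]

hits = genes_in_window(33_543_882, annotation)
print(hits)
```

Running the same query for each of the four markers and pooling the results would give the full candidate list that is then passed to annotation.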

Discussion
This research evaluated prickle development in a segregating F1 population and the parental genotypes in red raspberry. GBS and GWAS were adapted and applied utilizing a draft red raspberry reference genome to identify candidate genes associated with prickle formation. This genetic information may be useful for future modification of important prickled Rubus cultivars into prickle-free versions utilizing gene editing techniques. GWAS has the advantage over traditional QTL mapping in that variations can be mapped down to the nucleotide level. In this study, four highly significant SNPs were identified on the distal portion of chromosome 4 that delimited a 1604 kb region spanning physical positions 33,543,882 to 35,148,226.
The significant SNP 4_35148226 contributed positively to the trait, while all the remaining significant SNPs contributed negatively. Positive and negative contributions do not refer to the percentage of the genetic variation caused by the SNPs. Rather, the sign identifies whether the common or the uncommon allele is associated with the phenotype of interest, i.e., prickle-free in this case. A positive value indicates that the minor allele (i.e., the less common allele) was associated with the prickle-free phenotype (the favorable or sought-after phenotype), while a negative value indicates that the major allele (i.e., the more common allele) was associated with the prickle-free phenotype. Alleles from either the major class or the minor class were considered favorable if they were associated with the prickle-free phenotype.
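The sign convention described above can be written as a small lookup. The function name is hypothetical, and the dictionary records only the direction of each effect as stated in the text (+1 or −1), not the estimated effect sizes from Table 1.

```python
def allele_favoring_prickle_free(effect):
    """Map the sign of an allelic effect to the allele class it favors."""
    if effect > 0:
        return "minor"  # less common allele associated with prickle-free
    if effect < 0:
        return "major"  # more common allele associated with prickle-free
    return "neither"


# Direction of effect only, as reported for the four significant SNPs.
effect_signs = {
    "4_33543882": -1,
    "4_34134523": -1,
    "4_34738035": -1,
    "4_35148226": +1,
}

favoring = {snp: allele_favoring_prickle_free(sign)
            for snp, sign in effect_signs.items()}
print(favoring)
```

The same function applies to the endpoints of the reported range: an effect of −0.38 points to the major allele and an effect of 0.20 to the minor allele.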
Understanding of the genetic and physiological control of prickle development in Rubus at the molecular level is limited. A study that constructed a genetic linkage map using a population derived from the prickle-free 'Glen Moy' and the homozygous prickled 'Latham' (the terms spine-free and spiny were used in the study) measured varying prickle density among the progeny [54]. This trait mapped to LG2. A QTL for prickle density was previously mapped to LG6, together with three QTL for fungal resistance that overlapped the same region [25].
Another study constructed a genetic linkage map and performed QTL mapping of prickle density and other traits using a (R. parvifolius × 'Tulameen') × 'Qualicum' population [55]. Both parents exhibited prickled canes; however, prickle density in the progeny ranged from highly prickled canes to prickle-free canes. The study identified two QTL linked to prickle density: one at the distal end of linkage group 4 (LG4), accounting for approximately 84% of the variation with a high LOD score, and a second on LG6, accounting for less than 10% of the variation. The segregation pattern followed a single gene model with 3:1 segregation, indicating that both parents were heterozygous at the prickle locus. The same QTL on LG4 and LG6 were detected when scores for prickle-free individuals were removed and only prickle density was examined, accounting for 46.3% and 16.7% of the variation, respectively. When the trait was scored as qualitative (0 as prickle-free and 1 as prickled on canes), the trait mapped to the QTL region on the distal end of LG4 and was designated gene s.
In the present study, a significant locus associated with the trait mapped to the distal end of LG4 (chromosome 4), which corroborates the previous study [55]. A total of 98 transcripts within 100 kb upstream and downstream of the four associated SNPs were identified (Table 2). Among these transcripts, there were a few that are known to be directly or indirectly associated with trichome initiation or early stage development. There were 16 genes within 100 kb upstream and downstream of marker 4_33543882, notably Rosa chinensis transcription factor MYB16-like (Accession: XP_024196462.1) and Rosa chinensis agamous-like MADS-box protein AGL30 (Accession: XP_011463061.1) (Table 2), which are potential candidate genes associated with prickle development. The MYB16-like (MIXTA-like R2R3-MYB) transcription factor is of particular interest, as transcriptome analysis showed it to be significantly downregulated in prickle-free plants [56].
In cucumber, the MIXTA-like homolog CsMYB6 has been determined to regulate epidermal cell differentiation, cuticle wax biosynthesis, and trichome morphogenesis [61–63]. Moreover, a study conducted to identify and characterize the genome-wide R2R3-MYB family in three species in the Rosaceae family, Malus domestica Borkh., Prunus persica (L.) Batsch, and Fragaria vesca L., identified 44 functional subgroups with seven unique to the Rosaceae family [64]. In the study, functional analysis of the TFs was performed based on the clustering of R2R3-MYB genes of Arabidopsis, which identified the Arabidopsis transcription factor AtMYB5 in subgroup 16 and AtMYB106/NOK and AtMYB16/MIXTA in subgroup 5. AtMYB5 is known to regulate trichome morphogenesis and mucilage synthesis [65]. AtMYB106/NOK and AtMYB16/MIXTA are known to participate in trichome development [58]. Therefore, a MIXTA-like R2R3-MYB family member could be one of the key regulators of prickle development.
Similarly, MADS-box genes are key members of the regulatory networks behind multiple developmental pathways. These genes regulate the networks involved in plant responses to stress and the plant developmental plasticity response to seasonal fluctuations [66]. A transcriptome study on Solanum viarum Dunal prickles concluded that the development-related MADS-box transcription factors play a role in prickle development in addition to R2R3-MYB, REM, and DRL1 [67].
Finally, a total of 31 genes within 100 kb upstream and downstream of marker 4_35148226 were identified. Among these 31 genes was Rosa chinensis protein trichome birefringence-like 2 (Accession: XP_024197787.1) (Table 2), which is a putative candidate gene for prickle development in Rubus. Cellulose is one of the major components of the plant cell wall and has diverse functions [68]. The cell shape and development patterns are mainly determined by the cellulose microfibrils [69]. A secondary cell wall, which is the major component in wood and plant fibers, is deposited once the plant cells stop expanding [70]. This secondary cell wall is mainly composed of crystalline cellulose, microfibrils, lignin, and non-cellulosic polysaccharides [69,71,72]. The plant cellulose synthase (CESA) genes function as cellulose synthases and mutants in primary CESA genes exhibit reduced cellulose levels and dwarf growth phenotypes [73–75]. The gene that controls a trait referred to as TRICHOME BIREFRINGENCE (TBR) is also an important component required for secondary cell wall cellulose synthesis [76]. Wild type Arabidopsis trichomes display strong birefringence under polarized light whereas the tbr mutant has severely reduced crystalline cellulose content in trichomes and lacks such birefringence [76]. TBR belongs to the TRICHOME BIREFRINGENCE-Like (TBL) gene family. Members of the TBL protein family influence the resistance to pathogens, tolerance to freezing, and synthesis of cellulose on the secondary cell wall [77]. In trichome differentiation, the gene TBR has a fundamental function in the cellulose content, but additionally regulates the density of trichomes on the epidermal surface [76,77]. Therefore, Rosa chinensis protein trichome birefringence-like 2 could play a potential role in prickle development in Rubus.

Conclusions
Prickle-free cultivars facilitate fruit harvesting, cultivation, and management and, as such, are highly desirable. However, development of new prickle-free cultivars through conventional techniques is time consuming and difficult due to the limited prickle-free germplasm and the complex nature of other desirable traits, especially fruit quality traits. The development of prickle-free versions of economically important cultivars through gene editing techniques will provide desirable new planting material and expand the available germplasm for future breeding efforts. An understanding of the genetic and molecular processes behind prickle development is necessary to facilitate this process. Genotype by sequencing (GBS) combined with a genome-wide association study (GWAS) identified four SNPs on chromosome 4 associated with the phenotype. Transcripts within ±100 kb of the four associated SNPs were identified, which provided potential candidate genes for prickle production in red raspberry including MYB16-like (MIXTA-like R2R3-MYB), MADS-box protein AGL30, and trichome birefringence-like 2 protein. Specific sequence information from these targets could then be used with gene editing techniques to identify the gene responsible for prickle development and to advance the development of prickle-free cultivars. With the expanding market for fresh raspberries and blackberries, prickle-free cultivars can improve the efficiency of production practices, thus reducing labor costs required for plant management and harvest.
Author Contributions: A.K. conducted the research in this manuscript as partial requirement for her PhD degree program at Cornell University under the advisement of C.A.W. who contributed to the project design, data analysis and interpretation, and editorial guidance. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.