A Novel Pinkish-White Flower Color Variant Is Caused by a New Allele of Flower Color Gene W1 in Wild Soybean ( Glycine

Abstract: The enzyme flavonoid 3′,5′-hydroxylase (F3′5′H) plays an important role in producing anthocyanin pigments in soybean. Loss of function of the W1 locus encoding F3′5′H always produces white flowers. However, few color variations have been reported in wild soybean. In the present study, we isolated a new color variant of wild soybean accession (IT261811) with pinkish-white flowers. We found that the flower’s pinkish-white color is caused by w1-s3, a single recessive allele of W1. The SNP detected in the mutant caused an amino acid substitution (A304S) in a highly conserved SRS4 domain of F3′5′H proteins. On the basis of the results of the protein variation effect analyzer (PROVEAN) tool, we suggest that this mutation may lead to hypofunctional F3′5′H activity rather than non-functional activity, which thereby results in its pinkish-white color.


Introduction
In plants, flavonoid 3′,5′-hydroxylase (F3′5′H) is one of the key enzymes responsible for the blue and/or purple coloration in flower petals [1]. F3′5′H, together with dihydroflavonol 4-reductase (DFR), generally synthesizes delphinidin-based anthocyanin pigments through the flavonoid biosynthesis pathway [2]. However, F3′5′H enzymes are absent in several ornamental plants, such as rose (Rosa hybrida) and carnation (Dianthus caryophyllus). These ornamental plants contain only cyanidin and/or pelargonidin pigments, and therefore have pink, yellow, and red as their natural colors but not purple or blue [3]. F3′5′H is a cytochrome P450 that hydroxylates naringenin or dihydrokaempferol at the 3′ and 5′ positions of the B ring to synthesize the delphinidin-based anthocyanin pigments [4,5]. Loss-of-function mutations in the F3′5′H gene impair the production of delphinidin-based anthocyanins and shift flower color from blue to pink in several ornamental plants, for instance, petunia (Petunia hybrida) and gentian (Gentiana scabra) [6][7][8]. In leguminous crops, such as pea (Pisum sativum), lack of functional F3′5′H enzyme results in rose-pink flower petals [3]. In contrast, in soybean (G. max), variations in the F3′5′H gene produce white flowers rather than color variants such as the pink flowers observed in other plant species [3]. In addition, in a W1 allelic background, the DFR-encoding genes, namely, W3 (DFR1) and W4 (DFR2), are epistatic to each other. Double mutations in these DFRs, i.e., w3 w4, cause near-white flowers in soybean [9].
In soybean, the W1 locus encoding F3′5′H produces purple and white flowers with its dominant and recessive alleles, respectively [10]. The white flower color observed in the common soybean cultivar "Williams 82" is caused by a 65-bp insertion and a 12-bp deletion in the F3′5′H coding region, which result in premature termination of translation [11,12]. We previously reported that F3′5′H sequences from 99 landraces with white flowers were identical to that of the "Williams 82" cultivar [13]. However, most wild soybean (G. soja) accessions have purple flowers and show little color variation, except for a few variants that have been reported in the past two decades [14]. A white-flowered wild soybean accession, PI 424008C, was isolated from the progenies of the purple-flowered PI 424008A. Genetic analysis showed that the white flower color was caused by a mutation in the W1 locus [15].
In another study, we isolated two G. soja accessions (CW13381 and CW12700) with white flowers [16]. Genomic analysis of the W1 gene of CW13381 revealed the presence of an indel (≈90-bp AT-repeat) in the second intron, whereas the CW12700 mutant had a unique single-nucleotide substitution that resulted in an amino acid change (N300K) in substrate recognition site (SRS) 4 of F3′5′H [16]. Another wild soybean accession (B00146) was found as a single plant with purple and white variegated flowers (B00146-m) [17]. From the progeny of B00146-m, lines with white (B00146-w) and purple (B00146-r) flowers were developed. The w1-m allele of B00146-m carried an insertion of the Tgs1 transposon (CACTA family) in the first exon. Taken together, the loss of function of F3′5′H in soybean always halts anthocyanin production, consequently resulting in white flowers [16]. Apart from the aforementioned white color variants, a light-purple-colored G. soja variant (B09121) has been reported with a new w1-lp allele [18]. A unique single-base substitution at nucleotide position 653 of the w1-lp mutant led to an amino acid change (V210M). Flavonoid analysis showed that the w1-lp mutant had only small amounts of the major anthocyanins commonly detected in purple flowers. However, there was no difference in transcript level between the w1-lp and W1 alleles. On the basis of their results, the authors suggested that an SNP mutation in the F3′5′H gene may lead to reduced F3′5′H enzymatic activity [18].
In this study, we isolated a new color variation of a wild soybean accession (IT261811) with pinkish-white flowers. The objective of the present study was to determine the genetic basis of the new wild soybean variant with pinkish-white flower color and its allelic component that influences anthocyanin biosynthesis.

Plant Material
A wild soybean accession (IT261811) with pinkish-white flowers was obtained from the National Agrobiodiversity Center, Korea. Another wild soybean accession (IT182932) with purple flowers was used as the wild-type accession in the present study (Figure 1). F2 individuals derived from the cross between IT261811 and IT182932 were used for the segregation analysis. The F2 progenies and the two parental lines were grown in the experimental fields of Kyungpook National University (Gunwi, 36°07′ N, 128°38′ E, Korea).

RT-PCR and Sequence Analysis
RT-PCR analysis was performed using the first-strand cDNA method to determine the transcript levels of F3′5′H (W1) and DFR2 (W4). The soybean Actin 1 gene (GmActin; Glyma.19G000900.1) was used as a loading control [19]. The PCR reactions were performed using primer pairs for the respective genes described in Sundaramoorthy et al. [16]. Exons and introns of W1 (Glyma.13G072100) and W4 (Glyma.17G252200) were amplified using the primer pairs previously described in Park et al. [20]. PCR products were sequenced (SolGent, Daejeon, Korea) using the same primer pairs used in the aforementioned amplification procedure.

Multiple Alignment Analysis of W1 Proteins
The F3′5′H protein sequences from 14 different plant species were obtained from the National Center for Biotechnology Information Conserved Domains Database (NCBI-CDD, https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml; accessed on 10 January 2021) and used to perform multiple sequence alignment using the ClustalW analysis tool (http://www.genome.jp/tools-bin/clustalw; accessed on 10 January 2021).

Prediction by Protein Variation Effect Analyzer (PROVEAN)
To estimate the impact of non-synonymous SNPs causing amino acid sequence changes in the W1 proteins, we used the online server PROVEAN (http://provean.jcvi.org/; accessed on 12 January 2021). Each amino acid substitution was given as input and the PROVEAN score was calculated [21,22].

SNP-Based Genetic Analysis
To study segregation, a cleaved amplified polymorphic sequence (CAPS) marker was developed to detect the single-base substitution in IT261811. PCR amplification was performed according to the procedure previously described by Park et al. [20]. Digestion of the PCR-amplified products was performed using DdeI (Enzynomics, Daejeon, Korea).
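The logic of the CAPS assay can be illustrated in silico. The sketch below scans a sequence for the DdeI recognition site (CTNAG, as given in the Results) and returns the fragment lengths expected after digestion, assuming cleavage after the first C of the site. The two short sequences are hypothetical placeholders, not the actual W1 amplicon, and the function name is illustrative.

```python
import re

# DdeI recognition site CTNAG (N = any base); assumed cut position: C^TNAG.
# A lookahead is used so that overlapping sites would also be reported.
DDEI_SITE = re.compile(r"(?=CT[ACGT]AG)")

def ddei_fragments(seq):
    """Return fragment lengths after a complete DdeI digest of seq."""
    seq = seq.upper()
    cuts = [m.start() + 1 for m in DDEI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Hypothetical 17-nt stand-ins differing by a single G-to-T substitution.
wild_type = "ATGGTTCGCAGTTGACC"   # no CTNAG site: stays uncut
mutant    = "ATGGTTCTCAGTTGACC"   # G-to-T creates CTCAG: cut into two pieces

print(ddei_fragments(wild_type))  # [17]
print(ddei_fragments(mutant))     # [7, 10]
```

Because CTNAG reads the same on both strands, scanning one strand is sufficient. In a heterozygous plant, both the uncut and the cut band patterns would be expected on the gel.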

Genetic Analysis of New Pinkish-White Flower Variant of Wild Soybean
In the present study, we identified a pinkish-white flower variant (IT261811) among the wild soybean (G. soja) accessions (Figure 1). A total of 124 F2 individuals derived from the cross between IT261811 and IT182932 segregated into 95 plants with purple flowers and 29 plants with pinkish-white flowers (Table 1). The segregation fitted a 3:1 ratio, suggesting that a single recessive gene controls the pinkish-white mutant phenotype in IT261811.
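The fit to a 3:1 ratio can be checked with a chi-square goodness-of-fit test. The following is a minimal sketch using only the phenotype counts reported above (95 purple, 29 pinkish-white); the function names are illustrative and the p-value formula is valid only for one degree of freedom.

```python
import math

def chi_square_gof(observed, ratio):
    """Chi-square statistic and expected counts for an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

def p_value_df1(chi2):
    """Upper-tail p-value of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))

# Purple : pinkish-white counts from the F2 population (n = 124)
chi2, expected = chi_square_gof([95, 29], [3, 1])
p = p_value_df1(chi2)

print(expected)          # [93.0, 31.0]
print(round(chi2, 3))    # 0.172
print(p > 0.05)          # True: no significant deviation from 3:1
```

With expected counts of 93 and 31, the statistic is about 0.17, so the observed segregation does not deviate significantly from the single-recessive-gene model.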

Molecular Analysis of the w1-s3 Variant
In soybean (G. max), the W3 and W4 loci encoding DFR enzymes epistatically interact with each other in a W1 genotypic background [23]. However, in G. soja, the W3 locus is not involved in the determination of flower colors [24]. Taking this into consideration, we conducted RT-PCR for both the F3′5′H (W1) and DFR2 (W4) genes to analyze alterations in gene expression. PCR products for W4 were amplified with a size of 1175 bp. For W1 expression, the 5′ (W1-U) and 3′ (W1-L) half regions of the F3′5′H gene were amplified with sizes of 331 bp and 558 bp, respectively (Figure 2A). Both the W1 and W4 genes from IT261811 showed no significant difference in expression levels compared to those of wild-type IT182932, indicating that the mutant IT261811 had normal W1 and W4 expression (Figure 2A). We analyzed the genomic sequences of F3′5′H (W1) and DFR2 (W4) to determine the involvement of W1 and W4 in the allelic variation of the mutant, IT261811. First, we analyzed the genomic sequences of DFR2 (nucleotide positions −4 to +3416) from IT182932 and IT261811, and the results showed no polymorphism between them. Next, the nucleotide sequence of F3′5′H (nucleotide positions −64 to +4534) from the mutant IT261811 showed a single-nucleotide substitution (G to T) in the third exon at nucleotide position +3763 (NCBI GenBank accession number: MW298105; Figure 2B), resulting in an amino acid substitution (A304S) relative to the corresponding sequence of the wild-type IT182932 (NCBI GenBank accession number: KX077984). The new mutant allele was designated as w1-s3.
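The link between a G-to-T substitution and an alanine-to-serine change follows from the standard genetic code. The sketch below enumerates every single G-to-T change in each of the four alanine codons; the actual codon at position 304 is not stated in the text, so all four are tested, and the helper name is illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def g_to_t_outcomes(codon):
    """List (position, mutant codon, amino acid) for each single G-to-T change."""
    outcomes = []
    for pos, base in enumerate(codon):
        if base == "G":
            mutant = codon[:pos] + "T" + codon[pos + 1:]
            outcomes.append((pos + 1, mutant, CODON_TABLE[mutant]))
    return outcomes

ala_codons = [c for c, aa in CODON_TABLE.items() if aa == "A"]
for codon in ala_codons:
    print(codon, g_to_t_outcomes(codon))
```

For every alanine codon (GCN), replacing the first-position G with T gives a serine codon (TCN), whereas a third-position G-to-T change in GCG is synonymous. Under the standard code, the reported A304S change is therefore consistent with a substitution at the first codon position.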
Resequencing data of soybean accessions from China, Korea, and the USA are publicly available through the NCBI [25][26][27]. Using SNP and INDEL data on the W1 locus from 775 resequenced accessions on SoyKB (http://soykb.org; accessed on 15 January 2021), we found no occurrence of an SNP and/or INDEL on chromosome 13 at position 17,316,282 (Wm82.a2.v1), where w1-s3 shows a single-nucleotide substitution. This prompted us to perform amino acid sequence alignment of F3′5′H proteins from 14 different plant species from the NCBI-CDD database (Figure 2C) to determine the effect of the single amino acid substitution in the w1-s3 allele on the functionality of the F3′5′H protein. The results showed that the amino acid change (Ser for Ala at position 304) in the w1-s3 allele was located at a highly conserved position of the SRS4 domain. SRS4 is one of the six functional SRS domains in F3′5′H enzymes and plays an important role in substrate-binding specificity [28,29].
In our previous study, a white-flowered EMS-induced mutant, PE1837 (w1-p1), showed a single amino acid substitution (A304T) at the same position as that of the mutant IT261811 (w1-s3) [16]. In the same study, we speculated, on the basis of the results of a 3-D prediction tool, that the hydroxyl group of T304 in w1-p1 may have inhibited the proper binding of the flavone substrate, thereby leading to the loss of function of the F3′5′H protein [16]. Thus, the amino acid substitution (A304S) identified in w1-s3 may also result in functional variation of F3′5′H proteins.
Recently, prediction software tools have been widely used to identify the deleterious or neutral effect of SNPs in candidate genes on the basis of the biochemical severity of the amino acid substitution [30]. We used the Protein Variation Effect Analyzer (PROVEAN, http://provean.jcvi.org/seq_submit.php; accessed on 12 January 2021) software tool to predict whether an amino acid substitution has any impact on the functional activity of F3′5′H (Figure 2D) [21]. The PROVEAN tool sets the threshold at −2.5 by default. If the predicted score of the protein variant is ≤−2.5, the variation is considered to have a "deleterious" effect. Scores above the threshold indicate that the variant has a "neutral" effect [21]. We used the mutants IT261811 (w1-s3) and PE1837 (w1-p1) for predicting the effect of the amino acid substitution on F3′5′H protein function, along with the previously reported light-purple flower-bearing wild soybean mutant B09121 (w1-lp), whose F3′5′H protein was described as hypofunctional due to an alteration in one of its amino acid residues [18]. The results showed that the substitutions in all three mutants, w1-s3, w1-p1, and w1-lp, were predicted to be deleterious, with PROVEAN scores of −2.692, −3.550, and −2.673, respectively. However, the scores of w1-s3 and w1-lp were similar to each other and close to the cutoff score, suggesting that the w1-s3 mutant protein resembles that of w1-lp and is more likely a hypofunctional F3′5′H than a completely nonfunctional one.
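The comparison against the PROVEAN cutoff can be made explicit with a few lines of Python. This is only an illustration of the decision rule described above, using the scores reported in the text (with the w1-lp value taken as −2.673); the helper names are hypothetical, and the distance from the cutoff is shown as a simple descriptive margin, not a measure defined by PROVEAN.

```python
PROVEAN_CUTOFF = -2.5  # default threshold stated in the text

# Scores as reported for the three W1 variants
scores = {
    "w1-s3 (A304S)": -2.692,
    "w1-p1 (A304T)": -3.550,
    "w1-lp (V210M)": -2.673,
}

def classify(score, cutoff=PROVEAN_CUTOFF):
    """Apply the rule: score <= cutoff is 'deleterious', otherwise 'neutral'."""
    return "deleterious" if score <= cutoff else "neutral"

def margin(score, cutoff=PROVEAN_CUTOFF):
    """How far below the cutoff a score lies (descriptive only)."""
    return round(cutoff - score, 3)

for allele, score in scores.items():
    print(f"{allele}: {score:.3f} -> {classify(score)}, margin {margin(score)}")
```

All three variants fall on the deleterious side of the cutoff, but w1-s3 and w1-lp lie within about 0.2 units of it, whereas the white-flowered w1-p1 lies more than 1 unit below. This is the pattern on which the hypofunctional interpretation rests.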

Co-Segregation of the W1 Polymorphism with Flower Color Phenotype
We conducted single-marker analysis, the simplest mapping approach for detecting an association between a marker and a phenotype (here, pinkish-white flower color). The SNP (G to T) on which the CAPS marker is based generates a DdeI site (CTNAG) in the PCR-amplified product from the mutant parent, IT261811 (Figure 3A). The CAPS marker co-segregated with flower color in the F2 individuals derived from the cross between IT261811 and IT182932 (Figure 3B). The single-marker analysis showed that the W1 gene was highly associated with the pinkish-white flower color in this study (n = 124, p < 0.0001, R² = 1). The results also showed that genotype segregation fitted a 1:2:1 ratio (Table 1), indicating that the w1-s3 allele is recessive to W1. On the basis of the tight co-segregation between w1-s3 and pinkish-white flowers, we concluded that the new w1-s3 allele, in a recessive w1 allelic background, produces the pinkish-white flowers of the mutant IT261811.

Funding: This study was supported by the Kyungpook National University Research Fund, 2020.