Development of Multiple Nucleotide Polymorphism Molecular Markers for Enoki Mushroom (Flammulina filiformis) Cultivars Identification

The enoki mushroom (Flammulina filiformis) is one of the most important and popular edible mushrooms commercially in China. However, traditional mushroom cultivar identification is challenging due to poor accuracy, heavy workloads, and low reproducibility. To overcome this challenge, we developed a method for identifying F. filiformis strains using multiple nucleotide polymorphism sequencing (MNP-seq). This involved screening 179 universal MNP markers based on whole-genome sequencing data, constructing an MNP sequence library, and performing multiplex PCR amplification and high-sequencing. We further screened 69 core MNP markers and used them to build a neighbor-joining (NJ) phylogenetic tree of 232 cultivated and wild strains. Our analysis showed that all cultivars could be accurately separated by computing genetic similarity values and that the cultivars could be separated into 22 distinct evolutionary pedigrees. The specific value of genetic similarity can be used as the standard to distinguish F. filiformis cultivars, however, it needs to be comprehensively defined by the additional phenotype and biological characteristics of those strains in the future work.


Introduction
Flammulina is a genus of edible mushrooms that belongs to the phylum Basidiomycota and the family Physalacriaceae. There are about 20 Flammulina species that have been described [1]. Some of the well-known species of Flammulina include F. filiformis, F. populicola, F. hispida, and F. tabacina. Flammulina filiformis (Z.W. Ge, X.B. Liu & Zhu L. Yang) P.M. Wang, Y.C. Dai, E. Horak & Zhu L. Yang, also known as enoki mushrooms in western and winter mushrooms or golden needling mushrooms in China, is one of the most important and popular edible mushrooms available commercially [2]. It is widely cultivated and consumed because of its nourishing qualities and desirable taste [3]. Enoki mushrooms grow naturally on Chinese hackberry tree stumps. There is a clear morphological difference between naturally grown and domesticated strains. Wild enoki mushrooms have yellowish to brown basidiocarps, while cultivated strains have white, thin, and slender stems [4]. Enoki mushrooms were first cultivated in China during the eighth century and then spread to Japan. It has been cultivated on wood logs under semi-wild conditions for over 300 years. The use of bottle cultivation technology in enoki cultivation has become increasingly popular in recent years. Up until the 1990s, Japan dominated the world's enoki mushroom production. Since then, China has replaced Japan as the world's largest producer. Currently, most of China's enoki production units are fully mechanized, with an annual production capacity of 2.4 million tons [3]. Previously, F. filiformis from eastern Asia was named as repetitive SSRs can induce DNA polymerase slippage during polymerase chain reaction, introducing erroneous SSR alleles to analysis [25,26].
In this study, we utilized the MNP marker method to investigate the mushroomforming fungus F. filiformis. To achieve this, we generated 179 MNP markers based on 232 genomic sequences of this species and subsequently identified 69 core MNP marker sequences that were utilized for phylogenetic analysis to reveal their evolutionary relationships. Additionally, we devised a streamlined approach for identifying F. filiformis cultivars by computing genetic similarities between different cultivars and lineages. It is important to note that although we only utilized the MNP-Seq method for F. filiformis in this study, the tool has broader applications and can also be used to analyze other mushroom species.

Materials and Methods
2.1. F. filiformis Strains, DNA Extraction, and Whole-Genome Sequencing F. filiformis strains collected in the study are shown in Tables 1 and 2. In total, 232 strains of F. filiformis strains were collected in this study, including 157 cultivars and 75 wild strains. To obtain a comprehensive understanding of the genetic variation present within the F. filiformis, we selected cultivars from various countries, although a significant proportion of the strains were sourced from China. The mycelia were grown in solid Potato dextrose agar (PDA) medium at 25 • C until they reached full growth. Genomic DNA was extracted from mycelia using the CTAB method [27]. A Nanodrop and 1.0% agarose gel electrophoresis were used to assess the concentration and integrity of the DNA solution. Whole-genome sequencing libraries were prepared using NexteraXT reagents (Illumina). The Illumina Novoseq platform from Novogene was then used for sequencing the DNA samples. Briefly, approximately 2 µg of DNA from each sample was used for fragmentation by Biorupter (high power: (15 s, on/90 s, off), six cycles) and end preparation by NEXT flex TM End-Repair. After PCR amplification (10 cycles), the library was purified using AMPure beads. Qubit was used to evaluate the quality and quantity of each library. The sequencing statistics of the samples are summarized in Table 1.   Sequence artifacts, including reads containing adapter contamination, low-quality nucleotides, and unrecognizable nucleotide (N), undoubtedly set the barrier for the subsequent reliable bioinformatics analysis. Hence, quality control is an essential step to guarantee meaningful downstream analysis. Fastp (version 0.19.7) [28] was used to perform basic statistics on the quality of the raw reads. The steps of data processing were as follows: (1) Discard a paired-read if either one read contains adapter contamination; (2) Discard a paired-read if more than 10% of bases are uncertain in either one read; (3) Discard a paired-read if the proportion of low quality (Phred quality < 5) bases is over 50% in either one read. In total, 163 whole-genome resequencing data of F. filiformis were analyzed. The sequencing data were mapped to the F. filiformis reference genome (accession number: AQHU01) with BWA (version 0.7.17-r1188) [29] with the parameters "bwa mem -t 8 -R". SNPs were then identified with samtools [30]. A sliding window of 130 base pairs was used to scan all SNP-containing genome segments with an increment of 10 bp. The discriminative power (DP) of a window was defined as t/c(N,2), where c(N,2) was the number of variety pairs among N varieties used and t was the number of the teams, each of which had at least two dispersed SNPs within the window. The windows with DP > 0.4 were chosen for multiplex PCR primer design and synthesis at BGI Genomics.

Library Construction and MNP Sequencing
All primers were diluted to 100 µM, and then 5 µL of each primer was pipetted into the primer mix pool. The multi-PCR reaction system consisted of 12 µL Template DNA, 5 µL Primer Mix, 5 µL 10 × Multi HotStart Buffer, 4 µL Super Pure dNTPs, 1 µL Multi HotStart DNA Polymerase, and 27 µL ddH 2 O. The total volume of each reaction mixture was 50 µL. The PCR reactions were performed as follows: 95 • C for 15 min; followed by 25 cycles at 94 • C for 30 s 58 • C for 90 s, and 72 • C for 60 s; followed by elongation at 72 • C for 10 min; and finally cooling to 4 • C. After the reaction, the PCR products were purified using the paramagnetic particle method. A total amount of 1.5 µg DNA per sample was used as input material for library construction. Sequencing libraries were generated using NEBNext Ultra DNA Library Prep Kit for Illumina (NEB in Ipswich, MA, USA, E7370L) following the manufacturer's recommendations, and indexes were added to attribute sequences to each sample. Briefly, the DNA samples were end-polished, A-tailed, and ligated with the full-length adapter for Illumina sequencing. Subsequently, the DNA products were purified by AMPure XP system (Beckman Coulter Life Sciences in Beverly, MA, USA), and size distribution was analyzed by Agilent 5400 system (Agilent Technologies in Santa Clara, CA, USA) and quantified by qPCR (1.5 nM). Qualified libraries were mixed at equal mass (100 ng) and sequenced by the Illumina Novoseq platform from Novogene. The sequencing data volume for each strain was set at 1000 M.

Core MNP Markers and Pedigree Determination
We chose the core MNP markers from all MNP markers with a 100% amplification rate for all strains tested. With the amplified sequences of these core markers, a phylogenetic tree was constructed using the NJ method and ITOL [31], which further differentiates the pedigree of all commercial cultivars.

Genetic Similarity (GS) Calculation
We mapped each sample's multiplex PCR sequencing and whole genome sequencing results to the reference genome Fv6-3 (AQHU01) with BWA (version 0.7.17-r1188) [29] with the parameters "bwa mem -t 8 -R" and the consensus sequence was obtained. All MNP sequences from all the samples were extracted based on the location information of the core MNP markers. We evaluated pairwise comparisons of the core MNP sequences from all samples. Each of the paired samples with identical sequences at the same MNP locus was supposed to have the same genotype. We calculated the genetic similarity (GS) between two strains according to the formula: the number of identical MNP sequences between two strains divided by the number of core MNP sequences.

Genome Resequencing, Screening of Universal MNP Markers, and MNP-Seq
We sequenced 163 F. filiformis strains and obtained about 5 Gb of clean data per sample (Table 1). We assessed the quality of the sequencing data by mapping reads to the F. filiformis reference genome Fv6-3 (NCBI accession no. AQHU01). The mean genome coverage was 90.5%, the mean depth was 133.0, and the average mapping rate of reads was 85.8%. Since the data was sufficient, we used this sequencing data to screen for MNP markers.
First, we selected 179 universal MNP markers based on genomic screening (see Method). Then, we designed and synthesized primers and performed multiplex PCR amplification and sequencing for 69 strains. The mean clean data per strain was 1.3 Gb, the mean Q20 value was 97.7%, and the mean Q30 value was 93.1% (Table S1). In sum, 9583 markers were detected, with an average sequencing coverage of 21,951-fold per strain ( Table 2).

MNP Markers Evaluation
MNP markers were detected in a range of 119 to 162, with an average of 138.9 markers per strain. The distribution of MNP markers detected in each strain is presented in Table 2 and Figure 1. To verify the consistency of the MNP-seq data, twenty-five strains were randomly selected for both MNP-seq and whole genome sequencing, and the MNP markers from both data sets were compared. The comparison revealed that all the MNP markers detected by MNP-seq in each strain were covered by the whole genome sequencing data, indicating a 100% reproducibility rate.

Construction of Phylogenetic Relationship Using Core MNP Sequences
We successfully detected 69 MNP markers in all strains from 179 universal MNP markers, and these markers were chosen as core MNP markers. We constructed an NJ phylogenetic tree using these core MNP sequences from 232 F. filiformis strains (Figure 2), including 69 MNP-seq data and 163 whole genome sequencing data. All cultivars could be recognized as one of 22 lineages (Figure 2).

Construction of Phylogenetic Relationship Using Core MNP Sequences
We successfully detected 69 MNP markers in all strains from 179 universal MNP markers, and these markers were chosen as core MNP markers. We constructed an NJ phylogenetic tree using these core MNP sequences from 232 F. filiformis strains (Figure 2), including 69 MNP-seq data and 163 whole genome sequencing data. All cultivars could be recognized as one of 22 lineages (Figure 2).

Construction of Phylogenetic Relationship Using Core MNP Sequences
We successfully detected 69 MNP markers in all strains from 179 universal MNP markers, and these markers were chosen as core MNP markers. We constructed an NJ phylogenetic tree using these core MNP sequences from 232 F. filiformis strains (Figure 2), including 69 MNP-seq data and 163 whole genome sequencing data. All cultivars could be recognized as one of 22 lineages (Figure 2).

Genetic Similarity Values between Different Cultivars and Lineages
The GS values were computed between each pair of cultivars and all pairwise lineages. There are potentially 22 different pedigrees that can be used to distinguish all F. filiformis cultivars (named G1-G22, respectively, in Figure 3 and Table S2). Within the same pedigrees, the GS values for various cultivars were all more than 60%. Pedigrees 1, 2, and 19 showed the highest (mean GS > 91%), while Pedigrees 10, 20, and 22 had the lowest genetic diversity values (mean GS < 72%). The minimum genetic similarity values between pedigrees (Table S3) and cultivars showed that these pedigrees could be distinguished by a GS value of less than or equal to 60%, and the GS value between strains with the range of 60-98.6%, they could be identified as different cultivars but in the same pedigree.

Genetic Similarity Values between Different Cultivars and Lineages
The GS values were computed between each pair of cultivars and all pairwise lineages. There are potentially 22 different pedigrees that can be used to distinguish all F. filiformis cultivars (named G1-G22, respectively, in Figure 3 and Table S2). Within the same pedigrees, the GS values for various cultivars were all more than 60%. Pedigrees 1, 2, and 19 showed the highest (mean GS > 91%), while Pedigrees 10, 20, and 22 had the lowest genetic diversity values (mean GS < 72%). The minimum genetic similarity values between pedigrees (Table S3) and cultivars showed that these pedigrees could be distinguished by a GS value of less than or equal to 60%, and the GS value between strains with the range of 60-98.6%, they could be identified as different cultivars but in the same pedigree.

Discussion
Flammulina filiformis is one of the most widely cultivated mushrooms in the world on a large commercial scale. It was reported that the first cultivar in China was domesticated from a wild strain isolated from Fujian Province in 1974 [32]. In 1983, Fujian breeders introduced the first white strain from Japan [21,32]. Four years later, in 1987, F21, another strain with white and slender stem characters, was introduced in China. This pattern indicates that the white strains in China were probably originally introduced from Japan, and the yellow strains may have been domesticated directly from the wild strains [21,33].
In this study, we found that most white cultivars were from Pedigree 1, Pedigree 2, and Pedigree 3, and they have close evolutionary relationships, especially for Pedigree 1 and Pedigree 2. In contrast, the yellow cultivars were often clustered with wild strains, which is consistent with previous studies that the yellow cultivars were directly domesticated from wild strains isolated from China or hybridized between white and yellow strains [21]. For example, the yellow cultivar G130Y from Pedigree 11 was clustered with the wild strain HB171, cultivar YRW1513 from Pedigree 16 was clustered with the wild strain HNY6, and cultivars SDY2114, F629, and CJ631 from Pedigree 22 were clustered with the wild strain HL1703. Generally, the white cultivars and the wild strains are separated clearly in our phylogenetic tree using core MNP markers. The high genetic diversity in wild populations (Figure 2) suggests that a large gene pool in nature is available for mushroom breeding, which is consistent with the previous study of F. filiformis using SSR markers in China [21]. Additionally, strains from the same region or country were assigned to different pedigrees, indicating that the genetic distances are not correlated with geographic origins.

Discussion
Flammulina filiformis is one of the most widely cultivated mushrooms in the world on a large commercial scale. It was reported that the first cultivar in China was domesticated from a wild strain isolated from Fujian Province in 1974 [32]. In 1983, Fujian breeders introduced the first white strain from Japan [21,32]. Four years later, in 1987, F21, another strain with white and slender stem characters, was introduced in China. This pattern indicates that the white strains in China were probably originally introduced from Japan, and the yellow strains may have been domesticated directly from the wild strains [21,33].
In this study, we found that most white cultivars were from Pedigree 1, Pedigree 2, and Pedigree 3, and they have close evolutionary relationships, especially for Pedigree 1 and Pedigree 2. In contrast, the yellow cultivars were often clustered with wild strains, which is consistent with previous studies that the yellow cultivars were directly domesticated from wild strains isolated from China or hybridized between white and yellow strains [21]. For example, the yellow cultivar G130Y from Pedigree 11 was clustered with the wild strain HB171, cultivar YRW1513 from Pedigree 16 was clustered with the wild strain HNY6, and cultivars SDY2114, F629, and CJ631 from Pedigree 22 were clustered with the wild strain HL1703. Generally, the white cultivars and the wild strains are separated clearly in our phylogenetic tree using core MNP markers. The high genetic diversity in wild populations (Figure 2) suggests that a large gene pool in nature is available for mushroom breeding, which is consistent with the previous study of F. filiformis using SSR markers in China [21]. Additionally, strains from the same region or country were assigned to different pedigrees, indicating that the genetic distances are not correlated with geographic origins.
Interestingly, we found that many white cultivars grown in different factories belong to the same pedigree. For example, strains YH217 from Youhong, T8-4 from Kangrui, and DJ1401 from Xuerong are from Pedigree 1; strains TS816 from Zhongxing and E3202 from Gangrongtai are from Pedigree 2. White cultivars grown in different factories sharing the same ancestry might indicate that they were originally introduced from the same strain [20]. It is also possible because they have been intentionally bred to have similar traits, such as color, disease resistance, or yield potential. In this case, breeders might use the same parent or closely related strains to develop different cultivars with similar traits. In addition, GS values of 100% for some cultivars in this study indicate that they share the same genetic origin, but they were given distinct names: cultivars XR2111 and WB210 from Pedigree 1 are the same; cultivars 531 and 6B25 from Pedigree 2 are the same. Further cultivation experiments will be required to determine whether they are the same cultivars.
The efficiency of MNP-Seq was attributed to multiplex PCR, high-throughput sequencing, and bioinformatics analysis. A single PCR reaction enriched thousands of marker loci by multiplex amplification in the first step [17]. Combining deep sequencing and bioinformatics analysis with MNP-Seq software, we could genotype more than 1000 MNP markers for F. filiformis in only one day. In our experience, MNP-seq has many advantages over other methods. It requires less starting DNA to amplify the MNP markers using mixed MNP marker primers. The high-throughput sequencing-based detection of MNP markers overcomes the uncertainty of SSR amplification length displayed on gel electrophoresis. MNP markers are often sequenced thousands of times, improving reproducibility and accuracy. Compared with whole-genome sequencing-based SNP markers, MNP-seq requires less experimental and data analysis time. Due to the high reproducibility and accuracy of MNP-seq, no replicate was required for MNP genotype determination [17].

Conclusions
The identification of different strains of enoki mushroom by MNP molecular markers is a systematic work. In this study, we have established the relevant MNP sequences library by using 69 pairs of primers, built the phylogenetic trees, and calculated the pair genetic similarity values of all strains. The results showed that most strains could be distinguished well by the phylogenetic topology and different genetic similarity values. The specific value of genetic similarity can be used as the standard to distinguish F. filiformis cultivars, however, it needs to be comprehensively defined by the additional phenotype and biological characteristics of those strains in the future work.
The development of MNP molecular markers is a promising approach for accurately identifying enoki mushrooms. MNP markers are based on variations in the DNA sequence, which can be used to distinguish different strains of mushrooms. The use of MNP markers has several advantages over traditional methods, including high accuracy, reproducibility, and ease of use once the identification system was established. However, there are some limitations to this approach. One limitation is the need for specialized equipment and bioinformatic expertise to develop those MNP markers. Another limitation is the availability of reference sequences for different strains of enoki mushrooms. More research is needed to expand the database of reference sequences and to develop a standardized protocol for MNP marker analysis. Future studies should focus on optimizing the MNP-seq method for enoki mushroom identification and developing a user-friendly tool for mushroom growers and researchers. This would enable accurate and rapid identification of different strains of enoki mushrooms, which could have important implications for their breeding, cultivation, and commercialization. Additionally, further investigation into the genetic diversity of enoki mushrooms would help to better understand the molecular basis of this species and its potential for future breeding programs.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jof9030330/s1, Table S1: the basic information of MNP marker library; Table S2: the GS values of pairwise comparison between 22 pedigrees and all cultivars; Table S3: