Reference-Based Restriction-Site-Associated DNA Sequencing Data Are Useful for Species Delineation in a Recently Diverged Asexually Reproducing Species Complex (Parmeliaceae, Ascomycota)

Cryptic species are common in lichen-forming fungi and have been reported from different genera in the most speciose family, Parmeliaceae. Herein, we address species delimitation in a group of mainly asexually reproducing Parmelina species. The morphologically distinct P. pastillifera was previously found nested within a morphologically circumscribed P. tiliacea based on several loci. However, these studies demonstrated a relatively high genetic diversity within P. tiliacea sensu lato. Here, we revisit the species delimitation in the group by analyzing single-nucleotide polymorphisms (SNPs) through genome-wide assessment using Restriction-Site-Associated sequencing and population genomic methods. Our data support previous studies and provide further insight into the phylogenetic relationships of the four clades found within the complex. Based on the evidence suggesting a lack of gene flow among the clades, we recognize the four clades as distinct species, P. pastillifera and P. tiliacea sensu stricto, and two new species, P. clandestina sp. nov. and P. mediterranea sp. nov.

Reference-based restriction-site-associated DNA sequencing (RADseq), which generates data from thousands of loci across the genome, has been shown to be a successful and cost-effective tool for species delimitation in lichenized fungi that outperforms multi-locus approaches [25][26][27][28][29].
The genus Parmelina, as currently circumscribed, contains nine species [30] occurring in the Northern Hemisphere, mainly in Western Europe, the Mediterranean, and Western North America.In contrast, the sister genus Myelochroa is most diverse in eastern Asia [31].
The two genera are estimated to have split during the Eocene.Diversification in Parmelina was estimated to have occurred during the Miocene, with the ancestor of the genus probably occurring in the Turanian region and Europe or only Europe.The genus is characterized by broad, subirregular lobes with a smooth upper surface, cylindrical conidia, a white medulla, and the upper part of the inner exciple being carbonized.The latter can be seen as an amphithecial ring in a superficial view of the ascomata.The closest relative, Myelochroa, is mainly distinguished from Parmelina based on chemical characters [30,32].
The species delimitation in the genus is surprisingly complex for a relatively small genus.The sexually reproducing Parmelina quercina complex was shown to consist of several distinct lineages, with the Australasian clade now classified as a distantly related genus, Austroparmelina [33,34].In addition, a lineage that is only distantly related to P. tiliacea but morphologically very similar was discovered and subsequently recognized as P. cryptotiliacea [9].In contrast, the morphologically distinct P. pastillifera [35] was found nested within P. tiliacea [9].These species reproduce mainly asexually by vegetative propagules, i.e., isidia.A subsequent study on the genetic diversity of P. tiliacea using three loci revealed high genetic diversity within three clusters with uneven but overlapping distributional ranges.All three clusters were shown to be present in the Canary Islands, consistent with the hypothesis that this area is a refugium for the group [36].In another study, the nesting of P. pastillifera within P. tiliacea was interpreted as a case of speciation by split-off or budding [37], in which the origin of a new taxon does not affect the existence of the parental taxon [38].
Given the genetic diversity observed in P. tiliacea, which reproduces mainly asexually (cylindrical isidia), and the uncertainty of the distinction of this species from P. pastillifera (also asexually reproducing, isidia button-like), herein we revisit the species delimitation in the group using genome-wide assessment of single-nucleotide polymorphisms (SNPs) produced by RAD sequencing.In addition, we applied population genomic methods to measure the degree of genomic divergence and infer the levels of co-ancestry for each lineage found in our analysis.

Specimen Sampling
Parmelina samples collected in Armenia, Austria, Slovenia, France, Germany, Iran, Italy, Morocco, Norway, Portugal, Spain, Sweden, Switzerland, Tunisia, Turkey, and the United Kingdom between 2009 and 2017 were used in this study.For this study, a total of 86 representative specimens of P. pastillifera and P. tiliacea were selected, together with 5 P. carporrhizans and 4 P. atricha specimens (Table S1).Specimens were identified based on morphological characteristics [36,38].A reference sequence of Parmelia sp. was downloaded from GenBank (GCA_018257885) [39] to filter for lichen fungal loci of metagenomic RAD sequences.

DNA Extraction and RAD Library Preparation
DNA extraction and RADseq libraries were constructed and sequenced at the University of Wisconsin-Madison Biotechnology Center.RADseq libraries were prepared as described in [40] using the restriction enzyme ApeKI [25] and sequenced on an NO-VAseq6000Illumina Inc. (San Diego, CA, USA).The resulting RADseq data were obtained in FASTQ format.

RADseq Assembly
RADseq data were processed in ipyrad v.0.9.90 [41] using the bioinformatics servers at The Grainer Bioinformatics Center, Field Museum, as previously described [27].We used the reference-based approach in ipyrad to filter for mycobiont loci, which mapped the metagenomic reads of the lichen symbiosis to a reference fungal genome of Parmelia sp.(GCA_018257885).We changed the parameter file in ipyrad to "gbs" and ploidy to haploid ("1").We used a default minimum coverage of 4. Samples for which no clusters passed filtering of ipyrad were removed from the analyses.

Phylogenomic Analyses
We used the SNP data from the ipyrad output files, consisting of a matrix containing only variable sites, to conduct a maximum likelihood phylogenetic reconstruction with RAxML v8.2.11 [42].The GTRCATX model was used with an ascertainment bias correction (--asc-corr=lewis).For each analysis, 100 bootstrap replicates were calculated using the fast-bootstrapping option implemented in RAxML [26,43].The phylogenetic tree was midpoint-rooted and visualized using FigTree v1.4.3.Samples with extraordinarily long branches indicating a high sequencing error rate were removed, and RAxML was rerun with a reduced sample set.Our code for both the phylogenomic and population genomic analyses are included in Supplemental File S1.

Analysis of Population Structure
Differences in the population structure were calculated with a reduced dataset of 81 samples, which excluded samples of P. carporrhizans and P. atricha.A vcf output file with all variant SNPs was filtered for all SNPs with a minor allele frequency (MAF) greater than or equal to 0.05 and excluding sites on the basis of the proportion of 0.5 missing data (--max-missing 0.5) using vcftools v.0.1.15[44].This filtered vcf file was converted to a genlight object using the R package "vcfR".Then, the genlight object was converted to a genind object from the R package adegenet v2.1.10 [45,46].Subsequently, the genind object was appended with additional information settings for haploid genomes and the population memberships.The genind object and all the information enclosed were used for population genetics analysis executed in R.
We assessed the degree to which the populations are subdivided by estimating Nei's Gst [47][48][49] and Hedrick's G'st [48][49][50] indices.Nei's Gst is a good measure when the mutation rate is small relative to the migration rate.Hedrick's G'st standardizes the genetic differentiation measure and fits data with high mutation rates.Both indices were calculated as a population pairwise comparison and performed using the R package mmod v1.3.3 [51].
We chose a nonparametric approach with a Discriminant Analysis of Principal Components (DAPC) to evaluate the genetic structure of P. pastillifera and the P. tiliacea populations.DAPC performs a PCA (Principal Components Analysis) transformation of data, and then a DA (Discriminant Analysis) to separate groups.DAPC is implemented in the adegenet v2.1.10package in R and was executed using the proportion of variance (95%) explained by the first 60 principal components.In addition, DAPC predicted the group members' probability for each sample and displayed it in a STRUCTURE-like plot.
We used fineRADstructure [41] to estimate recently shared ancestry by patterns of genomic similarity between individuals.First, the pyRAD allele output file was converted into a fineRAD structure file using the finerad input.pyscript implemented in fineRADstructure tools.The dataset was reduced to contain only a minimum number of samples in a locus of four (--minsample 4).Subsequently, RADpainter and fineSTRUCTURE scripts from FineRADstructure were used to measure the population structure.A co-ancestry matrix for a haploid dataset (-p1) was generated using RADpainter, and individuals were assigned to populations using the fineSTRUCTURE Markov chain Monte Carlo (MCMC) clustering algorithm with the following arguments: -x 100,000, -z 100,000, and -y 1000.fineSTRUCTURE was also used to run a simple tree-building algorithm on the data of the co-ancestry matrix following the arguments -m T and -x 10,000.A visualization of the co-ancestry matrix, the MCMC output, and the coalescence tree were plotted out in R.
An Analysis of Molecular Variance (AMOVA) [52] was performed to calculate the proportion of genomic variance by differences within and among clades using the R package poppr [53].

Assembly of RAD Sequencing
After the ipyrad assembly, filtering, and processing of all raw sequences and the reconstruction of an initial phylogenetic tree with RAxML, a final genomic dataset was considered that included 90 samples of the 95 initially processed.One sample was removed because in this sample, no clusters passed filtering in ipyrad (Pa_16634), and four samples were removed either because of contamination or misidentification that was indicated by long branches in the phylogenetic tree (Pt_17495, Pt_17340, Pt_17257, Pt_17294).The total number of filtered loci was 5189, with an average of 979.333 loci per sample (SD = 623.254),and a matrix comprised 11,895 columns (SNPs) with a missing site percentage of 79.98.For the population structure analyses, a resulting alignment matrix of 81 samples of Parmelina pastillifera, P. clandestina, P. mediterranea, and P. tiliacea s. str.comprised 8097 columns (SNPs) with a missing site percentage of 79.61, a total number of filtered loci of 5383, an average of 1013.160loci per sample (SD = 620.025),and an average sequencing depth of 10.799 per SNP (SD = 5.742).

Phylogenomic Analyses
We identified six well-supported clades after conducting a phylogenetic analysis of 90 Parmelina samples using RADseq data (Figure 1a).Two clades comprised samples of Parmelina carporrhizans and P. atricha, respectively.Another clade consisted of samples of P. pastillifera, while the samples of P. tiliacea were clustered into three separate clades, recognized below as P. tiliacea, P. clandestina, and P. mediterranea.The clade here identified as P. tiliacea s. str.includes the epitypus specimen of the species (MAF-Lich 16485).The analysis also revealed three misidentified specimens."Pt_174881_P.tiliacea_Italy" was clustered within the P. pastillifera clade.After reviewing this specimen, we could identify the common P. pastillifera character of button-like isidia (Figure 1b).In addition, the specimens "Pp_19501_P.pastillifera_Portugal" and "Pp_16537_P.pastillifera_Portugal" clustered within the P. clandestina clade.After reviewing these specimens, we could identify cylindrical isidia, which are commonly observed in the taxa of the P. tiliacea sensu lato clades (Figure 1c).
Previous studies on the species delimitation of the Parmelina pastillifera-tiliacea complex using nuclear ITS and mitochondrial LSU rDNA showed that P. pastillifera is nested within P. tiliacea [9,35].A later multi-locus study, including the nuclear EF1-α marker, confirmed the genetic diversity of P. tiliacea sensu lato [36].The nesting of P. pastillifera within P. tiliacea sensu lato was interpreted as a case of speciation by split-off [37,38].However, studies based on multi-locus markers were insufficient to resolve the relationship of this group.Using reference-based RAD sequencing as a reduced genome representation method, we sequenced thousands of loci over the genome [25,28].Unlike single-marker approaches, with RADseq, we obtained sufficient sequenced data to reconstruct a robust topology of this species complex: a clear separation of the morphologically distinct P. pastillifera and three cryptic lineages in P. tiliacea.

Analysis of Population Structure
For the population structure analyses of the Parmelina pastillifera-tiliacea complex, we reduced the RAD dataset to the specimens of the Parmelina pastillifera, P. clandestina, P. mediterranea, and P. tiliacea s. str.clades.The initial SNP matrix of 8097 columns (SNPs) was filtered for SNPs with an MAF greater than 0.05 and excluding sites on the basis of the proportion of 0.5 missing data (--max-missing 0.5).
Nei's Gst and Hedrick's G'st were calculated to assess the genetic differentiation of the four species (Table 1).The differentiation measures for Nei's Gst/Hedrick's G'st were 0.73/0.91 between P. clandestina and P. mediterranea, 0.78/0.94 between P. clandestina and P. tiliacea s. str., and 0.77/0.93 between P. mediterranea and P. tiliacea s. str.As most G'st indices tend towards 1, the four species have isolated genomes.Previous studies on the species delimitation of the Parmelina pastillifera-tiliacea complex using nuclear ITS and mitochondrial LSU rDNA showed that P. pastillifera is nested within P. tiliacea [9,35].A later multi-locus study, including the nuclear EF1-α marker, confirmed the genetic diversity of P. tiliacea sensu lato [36].The nesting of P. pastillifera within P. tiliacea sensu lato was interpreted as a case of speciation by split-off [37,38].However, studies based on multi-locus markers were insufficient to resolve the relationship of this group.Using reference-based RAD sequencing as a reduced genome representation method, we sequenced thousands of loci over the genome [25,28].Unlike single-marker approaches, with RADseq, we obtained sufficient sequenced data to reconstruct a robust topology of this species complex: a clear separation of the morphologically distinct P. pastillifera and three cryptic lineages in P. tiliacea.In addition, the DAPC revealed genomic separation among the samples of the four clades (Figure 2a).Furthermore, the DAPC showed a clear distinction between the four species, as evidenced by the group members' probability, which indicated a 100% probability of each sample belonging to its respective clade (Figure 2b).The fineRADstructure analysis revealed that the four species correspond to the four clades identified in the phylogenetic tree.The clustering indicates a higher shared co-ancestry within each species than among them (Figure 3).Also, in the P. mediterranea cluster, some samples showed a very high level of co-ancestry: Pt_7197 and Pt_7198 (dark blue), were collected at the same locality in Spain, and Pt_17256 and Pt_17242 (small black blocks), which were collected in the Canary Islands but not at the same locality.The samples also clustered in the phylogenetic tree (Figure 1).This high level of co-ancestry could indicate that these samples of P. mediterranea are indeed clones, which is likely when they were collected at the same locality, showing a potential case of long-distance dispersal in the Canary Islands.The AMOVA results indicate that around ~95% of the genomic variance is due to clade variation (Table 2), solidifying a delineation of the four species of the Parmelina pastillifera-tiliacea complex.All population genomic methods confirmed a high degree of genomic divergence among these clades of the complex and supported the interpretation of these clades as distinct lineages.Consequently, we recognize four species in the complex, two of which-P.mediterranea and P. clandestina-we describe as new species.
Cryptic species are common in lichen-forming fungi.RADseq, which has successfully resolved other morphologically challenging lichen groups [25][26][27][28], showed similar success The AMOVA results indicate that around ~95% of the genomic variance is due to clade variation (Table 2), solidifying a delineation of the four species of the Parmelina pastillifera-tiliacea complex.

AMOVA Components of Covariance %
Variations between samples 94.906133 Variations within samples 5.093867

Total variations 100
Phi-samples-total = 0.9490613 All population genomic methods confirmed a high degree of genomic divergence among these clades of the complex and supported the interpretation of these clades as distinct lineages.Consequently, we recognize four species in the complex, two of which-P.mediterranea and P. clandestina-we describe as new species.
Cryptic species common in lichen-forming fungi.RADseq, which has successfully resolved other morphologically challenging lichen groups [25][26][27][28], showed similar success in this study.Notes: Parmelina mediterranea is a phenotypically cryptic species and difficult to recognize in the field.In the phylogenetic tree (Figure 1), it is clustered in the P. tiliacea-P.pastillifera clade with uncertain phylogenetic relationships.It is sympatric with the phenotypically similar species P. tiliacea s. str.and P. clandestina.This new species can only be identified with genetic data.Thus, we recommend using the term "Parmelina tiliacea aggregate" for field studies.
Type: MOROCCO.Ifrane, Foret Sidi, 33 Etymology: the epithet refers to the enigmatic and difficult-to-detect properties of the new species.
Distribution: Europe, so far observed in Germany, Italy, Morocco, Portugal, Slovenia, Spain, Turkey, and the UK.
Notes: The new species is morphologically cryptic and difficult to recognize in the field.However, in the phylogenetic tree (Figure 1), it forms a strongly supported sister

Figure 1 .
Figure 1.Phylogenetic tree inferred from Parmelina pastillifera-tiliacea complex RADseq data.(a) Maximum-likelihood phylogenetic reconstruction of Parmelina pastillifera-tiliacea complex based on concatenated DNA sequences of 5189 loci.Bootstrap values > 75 are indicated on main branches.Taxon labels include the sample's identification before this and country of collection.(b,c) Taxa in bold and pictures highlight three misidentified specimens.

Figure 1 .
Figure 1.Phylogenetic tree inferred from Parmelina pastillifera-tiliacea complex RADseq data.(a) Maximum-likelihood phylogenetic reconstruction of Parmelina pastillifera-tiliacea complex based on concatenated DNA sequences of 5189 loci.Bootstrap values > 75 are indicated on main branches.Taxon labels include the sample's identification before this and country of collection.(b,c) Taxa in bold and pictures highlight three misidentified specimens.

Figure 2 .
Figure 2. Discriminant Analysis of Principal Components (DAPC) of samples of the Parmelina pastillifera-tiliacea complex.(a) Scatterplot for discriminant functions.Individuals and groups are represented by dots and inertia ellipses colored as in Figure 1.The bottom-left inset graph shows the cumulative variance explained by PCA eigenvalues; dark-gray bars indicate the first 60 PCs retained.The bottom-right inset graph of the linear Discriminant Analysis (DA) eigenvalues displays the proportion of genetic variation explained by each discriminant function; dark-gray bars highlight the first two discriminant functions shown in the main scatterplot; (b) barplot with assigned membership probabilities.Each bar represents an individual.The colors correspond to the ones used in Figure 1.

Figure 2 .
Figure 2. Discriminant Analysis of Principal Components (DAPC) of samples of the Parmelina pastillifera-tiliacea complex.(a) Scatterplot for discriminant functions.Individuals and groups are represented by dots and inertia ellipses colored as in Figure 1.The bottom-left inset graph shows the cumulative variance explained by PCA eigenvalues; dark-gray bars indicate the first 60 PCs retained.The bottom-right inset graph of the linear Discriminant Analysis (DA) eigenvalues displays the proportion of genetic variation explained by each discriminant function; dark-gray bars highlight the first two discriminant functions shown in the main scatterplot; (b) barplot with assigned membership probabilities.Each bar represents an individual.The colors correspond to the ones used in Figure 1.

J 13 Figure 3 .
Figure 3. Clustered fineRADstructure co-ancestry matrix of samples of the Parmelina pastillifera-tiliacea complex.The top tree shows the population structure of the samples according to the co-ancestry matrix.Four major clades corresponding to P. mediterranea, P. clandestina, P. tiliacea s. str., and P. pastillifera.The four orange-red diagonal blocks in the co-ancestry matrix indicate that samples within the four species share more co-ancestry with each other than among species.Small black and dark-blue blocks in the P. mediterranea clade indicate closely related samples.The first sample pair (Pt_17256 and Pt_17242) were collected at distinct localities in the Canary Islands and the second sample pair (Pt_7197 and Pt_7198), potentially clones, were collected in Spain at the same locality.

Figure 3 .
Figure 3. Clustered fineRADstructure co-ancestry matrix of samples of the Parmelina pastillifera-tiliacea complex.The top tree shows the population structure of the samples according to the co-ancestry matrix.Four major clades corresponding to P. mediterranea, P. clandestina, P. tiliacea s. str., and P. pastillifera.The four orange-red diagonal blocks in the co-ancestry matrix indicate that samples within the four species share more co-ancestry with each other than among species.Small black and dark-blue blocks in the P. mediterranea clade indicate closely related samples.The first sample pair (Pt_17256 and Pt_17242) were collected at distinct localities in the Canary Islands and the second sample pair (Pt_7197 and Pt_7198), potentially clones, were collected in Spain at the same locality.

Table 1 .
Pairwise average values of Nei's Gst and Hendrick's G'st.