Taxonomy Complexity of Some Tyrrhenian Endemic Limonium Species Belonging to L. multiforme Group (Plumbaginaceae): New Insights from Molecular and Morphometric Analyses

The delimitation of Limonium taxa is highly complicated due to hybridization, polyploidy, and apomixis. Many “microspecies” were described and aggregated into groups, most of which are still poorly known from both molecular and morphological points of view. The aim of this study is to investigate four endemic species from the Tyrrhenian coast of central Italy and the Ponziane Archipelago belonging to the L. multiforme group (L. amynclaeum, L. circaei, L. pandatariae, and L. pontium) by means of molecular and morphometric analyses. Molecular data by sequencing ITS and three plastid markers and morphometric data highlight new information about the taxonomy of these taxa so as to reduce them into a single specific entity. In fact, the better taxonomic choice is to consider the populations studied as part of a single species, i.e., Limonium pontium. Three subspecies are recognized, i.e., subsp. pontium [= L. circaei = L. amynclaeum; from Circeo to Gianola localities (excluding Terracina) and from islands Ponza, Palmarola, Zannone, and Santo Stefano], subsp. pandatariae comb. et stat. nov. (from island of Ventotene), and subsp. terracinense subsp. nov. (from Terracina).


Introduction
Limonium Mill. is the largest genus of the family Plumbaginaceae Juss., comprising more than 600 species which naturally occur in all the continents except Antarctica [1]. The Mediterranean Basin is one of the centers of diversity of this genus and most of the currently accepted species are concentrated there [2]. Members of Limonium are important constituents of halophytic communities growing in coastal areas in rocky and sandy places as well as in salt marshes (e.g., [2][3][4]), having a significant role in the biodiversity of these places [5].
The delimitation of species within the genus Limonium, which is crucial for their conservation, is, however, complicated due to several phenomena, i.e., hybridization, polyploidy, and apomixis, at least in some species groups [4][5][6][7][8][9][10]. According to some authors (e.g., [2]), these evolutionary processes would occur in the typical habitat in which Limonium grow (especially in insular and peninsular areas) because these places are often geographically and/or ecologically isolated, leading the segregation of many microspecies. In addition, a particular reproductive strategy occurs in Limonium, as first noted by Baker [11] who demonstrated that the heterostyly pollen/stigma dimorphism is linked to the occurrence of apomixis, that is: the sexual species are characterized by having a dimorphic self-incompatibility system, whereas the agamospermous species are almost always monomorphic. Finally, as a consequence of these segregation events, the intricate

Plant Material
Field surveys were carried out during the period 2011-2018, along the coast of the Lazio region and in all the islands of the Ponziane Archipelago (central Italy) (Table 1 and Figure 1). All the specimens collected are deposited in the Herbarium RO [32]. In addition, type specimens of the four species were also analyzed: L. amynclaeum   [27].

Molecular Analysis
The molecular regions chosen for this study are located both in the biparental nuclear DNA (nrDNA) and in the uniparental plastid DNA (cpDNA). The plastid genome of most plants is maternally inherited, reflecting gene flow by seeds (e.g., L. carolinianum (Walter) Britton analyzed in Corriveau and Coleman [33]); the chosen molecular regions include two introns (petD and trnL (UAA) ) and two intergenic spacers (IGSs) (petB-petD and trnL (UAA) -trnF (GAA) ), which have already been employed in Limonium phylogeography and phylogenies [4,10,23,34,35]. In the nuclear genome, the chosen sequences were the variable fragments of the internal transcribed spacers of ribosomal genes (ITS1 and ITS2, also including the 5.8S gene), which have already been used to study the evolution history of Limonium [4,10,[35][36][37]. The same populations were also analyzed using a morphometric approach.

Genomic DNA Isolation
In total, 80 individuals were analyzed for a total of ten populations (Table 1). Each population was represented by eight individuals and total genomic DNA was extracted from dried leaves with a GeneAll Exgene Plant SV kit (GeneAll Biotechnology, Seoul, Korea) according to the manufacturer's instructions. Recalcitrant samples were extracted using a modified CTAB 2X procedure [38]. DNA quality was checked via 1% agarose electrophoresis with SafeView Nucleic Acid Stain (Applied Biological Materials Inc., Richmond, VA, USA) and visualized using the UVIdoc HD5 gel documentation system (UVITEC, Cambridge). DNA concentration was estimated using a Qubit 3 Fluorometer (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA).
The volume of each amplification reaction was 20 µL, using ca. 2-4 ng of template DNA, 0.25 µM of each primer, and Phire Plant Direct PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. For the recalcitrant or ambiguous samples, high-fidelity Kodaq 2X MasterMix (Applied Biological Materials, Richmond, VA, USA) was used. All amplicons (>500 bp) were purified by PEG 8000 precipitation (PEG 15%, 2.5mM NaCl), with two washes with 80% ethanol, and resuspended in 10 µL of nuclease-free molecular biology grade water (Ambion, Thermo Fisher Scientific, Waltham, MA, USA). Approximately, 7 ng of purified amplicons were sequenced in a volume of 5 µL using 0.5 µL of primer 6.4 * The base position corresponds to the genotype N4 sequence as reference. -, deletion. Table 3. Haplotypes detected with the plastid markers (petB-petD IGS + petD intron and trnL (UAA) intron + trnL (UAA) -trnF (GAA) IGS) in the Limonium populations and type specimens. See Figure 4 for the distribution map.
The base position corresponds to the haplotype A sequence as reference.
ITS amplicons with multiple peaks within the sequence were cloned using the Clone-JET PCR Cloning Kit (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. Transformation was performed using StrataClone SoloPack Competent Cells (Agilent Technologies, Santa Clara, CA, USA). The bacteria were cultured in LB medium at 37 • C for 30 min and subsequently transferred to LB agar plates containing 100 ug/mL ampicillin ON. Thirty-two randomly selected clones from each transformation were amplified using the corresponding PCR primers and ten amplicons were sequenced.

Data Analysis
Sequences datasets from nrDNA (ITS) and cpDNA (petB-petD IGS+petD intron and trnL (UAA) intron + trnL (UAA) -trnF (GAA) IGS) were analyzed separately. The sequences were aligned with Clustal W [43] as implemented in the BioEdit ver. 7.2.5 software package [44] and checked manually. A geographic map was generated to assess the distribution and frequency both genotypes (nrDNA) and haplotypes (cpDNA) present in the ten Limonium populations in this study.
NrDNA sequences. To check the phylogenetic collocation of the Limonium accessions in this study, a Bayesian inference (BI) framework was performed with MrBayes ver. 3.2.6 software [45] and using the dataset of Malekmohammadi et al. [4]  Plumbago auriculata Lam. was used as an outgroup as also reported in Malekmohammadi et al. [4]. The most likely substitution model for the ITS marker was computed by using the jModeltest ver. 2.1.10 software [47] and the GTR + G + I model was the better model according to the Akaike information criterion (AIC). Two runs of four Markov chains (three hot, one cold) were performed for 10,000,000 generations, sampling every 1000 generations, and discarding the first 20% as burn-in. Convergence diagnostics were also checked with Tracer ver. 1. 7.1 software [48]. A maximum likelihood (ML) tree search was performed using RaxML-NG via its web server portal (https://raxml-ng.vital-it.ch/#/; (accessed on 18 April 2022) [49]). Bootstrap analyses were carried out with an automatic number of replicates with a bootstopping cutoff of 0. 03.
CpDNA sequences. A preliminary phylogenetic analysis (BI and ML inference) was performed on combined plastid matrix (i.e., petB-D+ trnL (UAA) -F (GAA) regions) with literature data to evaluate the haplotypes' relationship. We used the same alignment data file employed in Malekmohammadi et al. [4], which was kindly provided by M. Malekmohammadi (Supplementary Materials File S2). From this reference dataset, the plastid sequences that had more than 100 contiguous basepairs missing were discarded from the analysis  [4]. Plumbago auriculata was used as an outgroup as also reported in Malekmohammadi et al. [4]. The most likely substitution model for both markers was GTR + I and the same BI and ML settings used in nuclear data were used for these analyses.
If our haplotypes' relationships were not evident in phylogeny inference due to a low non-discriminating variability among them, the assessment of haplotypes' relationships was conducted using TCS version 1.21 software [50] with the optimality criterion of maximum parsimony [51] and treating the gaps as a fifth state. Mononucleotide repeats (polyN) from the cpDNA sequences were included in the analysis, if fixed in the populations sampled (i.e., no variation among individuals/population). Indels were considered as a single mutation event and were therefore coded as single positions in the final alignment. TCS was run with a default parsimony connection limit of 95% among Limonium sample data and a connection limit to 45 steps using the outgroup. Limonium pruinosum (L.) Chaz. was used as an outgroup according to both Malekmohammadi et al. [4] and our preliminary phylogeny (GenBank accession numbers petB-petD IGS+ petD intron, MF083795; trnL (UAA) intron+trnL (UAA) -trnF (GAA) , MF083894).

Morphometric Analysis
Ten populations (Figure 1), 25 individuals for each one, plus the holotypes of the four species considered in the research were investigated (Table 1). Twenty-two quantitative characters (3 discrete, 19 continuous (Table 4); see Figure 2 for the details of the inflorescence) were measured in 254 individuals (a total of 5742 measurements) using a Zeiss GXS stereomicroscope. The data matrix (individuals × variables) was processed using the software package NCSS 2007. The variability of the characters was examined by principal component analysis (PCA), discriminant analysis (DA), and box plots. DA was performed using the first six components derived from PCA, which explain about the 70% of the total variability. The use of component scores (each linearly independent by construction) allowed an unbiased discriminant model both solving the indeterminacy due to multicollinearity of the independent variables and providing a more reliable prediction for the smaller number of involved variables [52][53][54]. As a supervised technique, we performed the DA on groups classified both as species and as localities. The matrix of actual/predicted groups was analyzed by comparing the values among these groups, especially regarding the diagonal, whose values reveal the matching of actual and predicted observations for each group. The value of correct classification reported in the results is the classification accuracy achieved by the actual discriminant functions over what is expected. A multivariate analysis of variance (MANOVA) was also performed to test the significance of differences between response (dependent) variables (morphological characters) and factor variables (=groups, i.e., taxa). Relevant literature (including protologues; [28,29]) was also analyzed.

Molecular Analyses
Nuclear data. The ITS of our dataset is from 620 to 622 characters. Several specimens belonging to SS (4), AR (7), ST (4), TE (6) populations, and two type specimens (L. circaei and L. pandatariae) presented ITSs with sequences with double peaks caused by indels and two single different nucleotides (Table 2 and Figure 3). After cloning and sequencing, in total four single ITS genotypes were identified (blue-N1 (620 bp), green-N2 (621 bp), pink-N3 (620 bp), and brown-N4 (622 bp); Table 2) which determined three "mixed patterns" (orange-H1, white-H2, and yellow-H3) as shown in Figure 3. In the rest of the dataset, two single ITS genotypes were observed (N1 and N2), of which N1 is present with a greater frequency in all populations (98.8 vs. 19%, respectively) as a single genotype or as a mixed pattern; in fact, only one specimen from the population AR does not present the N1 genotype ( Figure 3). The H3 mixed pattern was exclusive to the TE population as was the presence of the single genotype N2 in the AR population. In this population, the H1 mixed pattern was also present with higher frequency. The H3 mixed pattern was exclusive to the TE population as well as the H2 mixed pattern which was observed only in the type specimens of L. circaei (Figure 3).  Table 1; colored symbols correspond to the genotypes (N1-N3) shown in Table 2.
After the BLASTn analyses, the identity of the single ITS genotypes was N1 = 100% (query cover 95%) with L. circaei The length of the ITS sequence alignment with the 100 accessions from the literature was 740 characters, 450 of which were variable and 354 of which were parsimony informative (Supplementary Materials File S1). Considering only our single genotypes, their alignment had 13 variable sites and one parsimony informative site. According to the ML and BI phylogenetic analyses (Figure 3), our four genotypes fell into the "L. graecum clade" which corresponds to node F of Figure 3 of Malekmohammadi et al. [4]. The N1 genotype was sister to the N3 genotype, and these genotypes are related to the N2 genotype. All genotypes fell into the group of accessions of the taxa afferent to the L. multiforme group (i.e., N1 with L. circaei with a sequence identity of 100%, N2 was sister to L. cumanum and L. remotispiculum, and N3 with two accessions of L. multiforme, of which one shares a 100% sequence identity). The N4 genotype was present in a different unresolved group consisting of collapsed taxa not belonging to the L. multiforme group. All our genotypes' accessions are well supported according to BI while, on the contrary, in ML analyses, only the N4 genotype had maximum support ( Figure 3).
Plastid data. Excluding the primer and about the first 25 bases due to reading by the automated sequencer, the length of the trnL (UAA) intron together with trnL (UAA) -trnF (GAA) IGS was between 897 and 898 bp because of a variable polyA in the trnL (UAA) intron (Table 3). No SNPs were detected except for polyA which was A5 in western island populations (codes IF, SS, ZA; Figure 4) and A6 in the rest of the dataset. By BLASTn analyses, this marker (excluding polyA) was not able to discriminate among the Limonium taxa in the study, and more than 20 species from the GenBank database share 100% identity with our sequences (e.g., L. circaei, L. bonifaciense, and L. saracinatum R.Artelari).  Table 1; colored symbols correspond to the haplotypes shown in Table 3.
The petB-petD IGS plus petD intron was 868 bp in length and three variable sites that were not parsimony informative (one in the IGS and two in the intron) were observed determining four haplotypes (Table 3). By BLASTn analyses, these haplotypes present a sequence identity of 99.77-99.88% (q.c. 100%) from two to three Limonium taxa present in the GenBank database (R = 99.77% L. connivens Erben, L. cumanum, and L. multiforme; G = 99.77%, L. connivens and L. multiforme; B = 99.77%, L. connivens and L. multiforme; A = 99.88%, L. connivens and L. multiforme). By combining the sequences of all plastid markers, four haplotypes were always observed with a geographical distribution as previously observed (orange-A, blue-B, green-G, and red-R; Table 3 and Figure 4). Two haplotypes were exclusive to island populations (red-R and blue-B for western and eastern island populations, respectively). The green (G) haplotype was present in all continental populations except for L. circaei type and was very rare in island populations where it was observed in the two type specimens (L. pandatariae and L. pontium) and in one sample of the AR population; a unique haplotype (orange-A) is present for the type of L. circaei as also observed for ITS data (Figures 3 and 4).
The length of plastid sequence alignment matrix with 94 accessions from Malekmohammadi et al. [4] was 1898 characters (474 of which were variable and 275 of which were parsimony informative; Supplementary Materials File S2). According to the ML and BI phylogeny inference, our plastid accessions are always placed in the "L. graecum clade" of Figure 2 (node F) in Malekmohammadi et al. [4]. In our analyses, they formed a supported single group (99/1; bootstrap/posterior probability, respectively) sister to L. virgatum and successively to several collapsed accessions as L. cumanum/L. multiforme/L. connivens/ L. gueneri/L. cumanum/L. gibertii (Supplementary Materials File S3). Considering the TCS analyses, the ancestral haplotype was A (orange in Figure 4) which corresponds to the type specimen of L. circaei from which the other three haplotypes derive independently and presents a geographical correlation as previously reported (Figure 4).

Morphometric Analysis
The PCA of the 22 morphological characters analyzed (Table 4) shows that the cumulative percentage of eigenvalues for the first seven axes is 10.17%, with a higher contribution (more than 10%) given by the first three components (23.24%, 13.83%, and 9.82%, respectively). The examination of the combined graphs among pairs of these seven components shows three separated groups along the first and second components ( Figure 5). These groups correspond to the populations from (1) TE (Terracina), (2) AR (Ventotene), and (3) a large group including the remaining localities. The highest contributions to axes were given by the following characters: length of the leaves, number of fertile and sterile branches, average angle between branches, length of the ludicule, number of flowers, length of the outer, median, and inner bracts, width of the median and outer bract, length of limb, and width of the corolla lobes. The DA shows different results depending on the use of the species names or the localities to classify the groups: (1) When we classified the populations using the localities' names (= ten groups; see Table 1), DA predicted four groups ( Figure 6) based on the first two discriminant functions which explain 72.2% of the total variation (eigenvalues: 40.5% (1st function) and 31.7% (2nd function)). These four groups, partially overlapped, correspond to the following localities: A) AR (Ventotene), B) CV (Sperlonga), C) TE (Terracina), and D) a large group including the remaining localities. The matrix of actual/predicted groups displays high percentages along the diagonal (whose values reveal the matching of actual and predicted observations for each group) for CV (100%), TE (96%), and AR (82%), whereas low or very low percentages characterized the diagonal values of the other groups. The value of correct classification is low (61.8%).
(2) When we classified the populations using the names of the species (= four groups, L. amynclaeum, L. circaei, L. pandatariae, and L. pontium), DA predicted two partially overlapped groups (Figure 7) based on the first two discriminant functions which explain 84.5% of the total variation (eigenvalues: 64.9% (1st function) and 19.6% (2nd function)). These two groups correspond to (A) L. pantadariae (one locality, AR (Ventotene); see Table 1) and (B) a group comprising all the individuals identified as L. amynclaeum, L. circaei, and L. pontium and collected in the remaining nine localities (coast of Lazio and the islands of the Ponziane archipelago except Ventotene; see Table 1). In fact, the matrix of actual/predicted groups displays a percentage of the diagonal value of L. pandatariae of 56%. The value of correct classification is low (57.5%).
Furthermore, we performed the DA using the three groups generated from the PCA (L. pandatariae from AR (Ventotene), Limonium sp. from TE (Terracina), and L. pontium (including L. amynclaeum and L. circaei) from all other localities). The result is that these three groups are statistically well supported, based on the two discriminant functions which explain 100% of the total variation (eigenvalues: 52.1% (1st function), and 47.9% (2nd function)) ( Figure 8). The value of correct classification is high (86.4%).   Finally, the box plots, made using the characters derived from PCA that are able to discriminate the various groups (length of the leaves, number of fertile and sterile branches, average angle between branches, length of the ludicule, number of flowers per spikelet, length of the outer, middle, and inner bracts, width of the middle and inner bract, length of limb, and width of the calyx lobes), confirm the separation of the following three groups: AR (Ventotene), TE (Terracina), and all other localities together (Figure 9). The results of the MANOVA show significant differences at both species and population levels. Probability level is less than 0.000001 for all the statistical tests considered (Wilks' lambda, Hotelling-Lawley trace, Pillai's trace, and Roy's largest root). F-ratios are high, ranging from F = 16.35 to 17.97 (Table 5).

Discussion
According to the molecular data, the information obtained has been useful to give another additional reading key to the morphometry approach on these taxa belonging to the L. multiforme group. Analyzing the ITS (nrDNA), which is biparental inheritance, the presence of mixed genotypes suggests that sexuality events have probably been or still are recurrent in these taxa as supposed in Brullo and Guarino [27]. The presence of the N1 genotype ( Figure 3) was always observed in all the individuals examined (except one individual of the AR population) which had either single or mixed genotypes. The other single genotypes observed (Figure 3) are close to taxa belonging to the L. multiforme group (i.e., see phylogenetic position of N1, N2, and N3 genotypes, Figure 3). This confirms how this group is a set of related taxa. For example, the population from the island of Ventotene (code AR and locus classicus of L. pandatariae, Table 1) presents ITS genotypes related to L. circaei, L. cumanum, and L. remotispiculum (Figure 3), whereas the population from the Torre Paola locality (code TP and locus classicus of L. circaei, Table 1) is related to L. circaei and L. multiforme (Figure 3). On the other hand, the population from Terracina (code TE, Table 1), in addition to having the ubiquitous N1 genotype, presents a very different genotype that has a phylogenetic affinity to species that do not belong to the L. multiforme group (Figure 3).
Concerning the analyses of uniparental plastid markers (cpDNA) (Figure 4), it has been possible to deduce both (1) a geographical correlation of the distribution of the haplotypes which all derive from an ancestral form like the type of L. circaei; and (2) that several colonization events from the mainland (Tyrrhenian coast) and neighboring islands (Ponziane Archipelago) have occurred over time among Limonium populations (e.g., see green haplotype (code G) in Figure 4). In fact, the long seed dispersion by wind, birds, and water (along the coast) of Limonium is present [34,55] and the fruits can float for a long time in seawater without destruction of their germination power as observed in L. vulgare Mill. by Koutstaal et al. [55] and L. ramosissimum (Poir.) Maire subsp. provinciale (Pignatti) Pignatti [56]. So, if with a molecular approach we have given a broader interpretation key by also integrating the data with the published phylogenies [4,10,[35][36][37], we are reasonably in agreement with what is reported in Brullo and Guarino [27] for the taxa belonging to the L. multiforme group. In fact, the taxa analyzed would not show apomixis (e.g., for shared molecular markers, Figures 3 and 4), they might have been (or are) interfertile (e.g., due to the presence of different shared ITS genotypes, Figure 3), and the geographical isolation (e.g., see haplotypes distribution, Figure 4) has been and still is of fundamental importance to determine the variability that has been detected from the morphological point of view.
Morphometric data obtained highlight that three groups are statistically well supported, and they are correlated with geographical distribution areas, i.e., populations from (1) Ventotene (the southmost island of the Ponziane Archipelago), (2) Terracina, and (3)  All things considered, we here propose to consider all populations studied as part of a single species whose name is to be Limonium pontium due to the nomenclatural priority over the other available names (Art. 11 Finally, the Sperlonga population (previously named as L. amynclaeum) can be distinguished from the morphological point of view but there is no molecular support. Consequently, we here synonymize L. amynclaeum with L. pontium subsp. pontium.