Molecular and Morphological Data Improve the Classification of Plantagineae (Lamiales)

The tribe Plantagineae (Lamiales) is a group of plants with worldwide distribution, notorious for its complicated taxonomy and still unresolved natural history. We describe the result of a broadly sampled phylogenetic study of tribe. The expanded sampling dataset is based on the trnL-F spacer, rbcL, and ITS2 markers across all three included genera (Aragoa, Littorella and Plantago) and makes this the most comprehensive study to date. The other dataset uses five markers and provides remarkably good resolution throughout the tree, including support for all of the major clades. In addition to the molecular phylogeny, a morphology database of 114 binary characters was assembled to provide comparison with the molecular phylogeny and to develop a means to assign species not sampled in the molecular analysis to their most closely related species that were sampled. Based on the molecular phylogeny and the assignment algorithm to place unsampled species, a key to sections is presented, and a revised classification of the tribe is provided. We also include the description of new species from North America.


Introduction
The tribe Plantagineae (Lamiales) consists of three diverse genera of plants: the worldwide distributed plantains (Plantago), the small aquatic Littorella, and a genus of northern Andean shrubs (Aragoa). This group is notorious for its complicated taxonomy, the difficulty of species identification, and an evolutionary trend of morphological reduction and simplification [1][2][3][4].
Plantains (ribworts), Plantago L., grow almost everywhere in the world, except for Antarctica and tropical wet forests [2,5]. Morphologically, they bear an unusual combination of characters [6]: sympetalous 4-merous non-showy flowers developing into circumscissile capsule-like fruit (pyxidium), and monocot-like leaves with arcuate or parallel venation, usually borne in a rosette. Even in the eighteenth century, when only sixteen species were described in Plantago [7], botanists mentioned the remarkable similarity between species, and many collections today still hold a significant amount of incorrectly determined specimens. Plantains exhibit reduction of both vegetative and generative characters. A few species have a branched stem and "dicotyledonous" leaves. Recent research [8] demonstrated how flower reduction could occur within Antirrhinum-Plantago lineage, in which evolution towards anemophily resulted in significant morphological convergence with graminoid monocots. Plantago likely underwent a rapid evolution [9] and relatively recent diversification [10]. Plantago includes both species with broad distribution and local endemics or near endemics. On many remote ocean islands, for example, there are unique species of Plantago [11,12]. Several new Plantago species have recently been described, and most of them are rare and narrowly distributed (e.g., [13][14][15]). All species). Data from 86 species have been taken from public databases. In all, we were able to increase the amount of available information three-fold (four-fold in Aragoa). Due to common problems with voucher identification [37], we always used our own identification and, if there was a choice, preferred to use our own samples.
In this paper, we followed the "appropriate citation of taxonomy" (ACT) principle, that is to "to ensure that common elements of taxonomic papers, generally considered de facto citations by taxonomists but not by [citation indexes], are presented in a format that is considered a valid citation by [citation indexes]" ( [38]). Thus, we cited in the References names of the most important supra-species groups ( [39]).

DNA Extraction and Sequencing
DNA extraction performed using multiple standard protocols, but soon after the start of the project (2011), we decided to exclusively use the NUCLEOSPIN Plant II Kit (MACHEREY-NAGEL GmbH & Co., KG, Düren, Germany) which appeared to provide a good trade-off between efficiency and simplicity. We improved this protocol in several respects, e.g., increased the lysis time to 30 min and used a thermomixer on slow rotation speed (350 rpm) instead of a water bath. To assess DNA quality, we used Nanodrop 1000 Spectrophotometer (Thermo Scientific, Wilmington, DE, USA), which estimates concentration and purity (the 260/280 nm ratio of absorbance) of samples. Typically, 1.4 ratio was enough to guarantee PCR amplification of smaller markers; the average ratio was 1.7. Lower DNA quality was typically obtained from samples collected in humid environments. Especially low was the quality of our Aragoa samples. We speculate that this might be due to the use of drying cabinets; indeed, when we dried our own Aragoa samples in room conditions, they yielded high-quality DNA.
Short DNA markers are best to amplify for herbarium samples [40]; therefore, our first choices were the nuclear ITS2 and plastid trnL-F spacer and rbcL genes. We amplified them following recommendations and protocols from Barcoding of Life ( [40]). Several samples were sequenced with the direct help of Barcoding of Life ("SAPNA" project); they also provided us with sequences of mitochondrial COI and plastid matK markers. ITS sequences were always checked against the GenBank database to exclude potential contamination with DNA from other plant groups or fungi.
Our PCR reaction mixture for most samples had a total volume of 20 µL which contained 5.2 µL of PCR Master Mix (components mostly from Platinum DNA Taq Polymerase, Thermo Fisher Scientific, Waltham, Massachusetts, USA) supplied with Platinum DNA Taq Polymerase), 1 µL of 10 µM forward and reverse primers, 2 µL of DNA solution from the extraction and 10.8 µL of MQ purified water (obtained from a Barnstead GenPure Pro system, Thermo Scientific, Langenselbold, Germany) in the TBT-PAR water mix ( [41]). The latter was developed to improve amplification from the herbarium samples. Thermal cycler programs for most samples were 94 • C for 5 min, then 35 cycles of 94 • C for 1 min; 50-52 • C (depending on the primer) for 1 min, 72 • C for 2 min, and finally 72 • C for 10 min. PCR products were sent for purification and sequencing to Functional Biosciences, Inc. (Madison, Wisconsin); and sequenced there in both directions using a standard Sanger-based protocol. Sequences were obtained, assembled, and edited using Sequencher™ 4.5 (Genes Codes Corporation, Ann Arbor, MI, USA).
We used several multivariate methods to assess phylogenetic trees: for example, to visualize the general structure of phylogenetic trees in the compact way, we employed multidimensional scaling on cophetetic distances between tree tips and created "density surfaces" (density of points was estimated with 2D binned kernel).
Since all single marker trees were concordant, we used the super-matrix approach. Our first dataset included multiple markers available from public databases and our sequencing, which is plastid rbcL, trnL-F, matK, and also nuclear ITS2 and mitochondrial COI. We call this dataset "broad" since it is relatively rich in data but has a limited sampling along the species dimension (87 entries and 4188 bp, including 656/497/561/1565/909 bp in COI, ITS2, rbcL, trnL-F and matK, respectively). The second dataset was made with the most extensive species coverage but included only ITS2 and trnL-F data, which is originated mostly from our sequencing. Below, we designate it as "tall" (273 entries, including some subspecies and forms and 2062 bp length, including the same ITS2 and trnL-F fragment lengths as above). The amount of missing data in these two datasets was 23% and 18%, respectively.

Morphology
Ripeline is also capable of using morphological characters, and we employed the updated and expanded morphological dataset from [2] to make the morphological dataset. To convert this dataset into digital, we applied optical character recognition, cleaned the results, converted tables into spreadsheets and merged them. We also added characters of seed sculpture ( [53,54]) to the characters used in [2], and expanded the dataset with species absent in the last work. In total, our morphological matrix has 114 characters; all character states are binary. To make a combined molecular and morphological matrix, we multiplied [55] the morphological dataset several times in order to achieve the approximate parity between of molecular and morphological characters.
We measured seven additional morphometric characters of Plantago (petiole, leaf, spike, and scape lengths, maximal leaf width, presence of taproot, and presence of gaps in the inflorescence) on 405 herbarium samples, the same samples which were used in DNA extraction.
Using these three datasets (morphological, morphometric and DNA), we were able to perform a broad spectrum of statistical analyses, for example, Procrustes analysis of the correspondence between molecular and morphological information ( [56,57]) and nearest neighbor machine learning ( [58]). First method helped us to understand the comparative roles of molecular and morphological characters, second allowed to place approximately the under-studied taxa. To help with the construction of dichotomous keys for identification of Plantago sub-groups, we employed recursive partitioning ( [59]), a machine learning technique which creates the classification trees in a way similar to identification keys ( [32,60]). To find patterns in the character distribution, we compared leaf shapes (from morphometric dataset) by subgenera, sections, and macro-regions and used chi-square tests to assess results. To determine the significance of morphological characters in respect to the molecular phylogenetic tree ("molecular weight"), we used the average or maximum Spearman correlation between morphological matrices and phylogenetic trees based either on a "tall" dataset or molecular-morphological dataset.
All datasets, scripts, and phylogenic trees used in the preparation of this publication are available from the first author's Open Repository here: http://ashipunov.info/ shipunov/open/plantago.zip (accessed on 17 October 2021). Ripeline is available on Github: https://github.com/ashipunov/Ripeline (accessed on 17 October 2021). We encourage readers to reproduce our results and develop our methods further. All sequences were deposited into GenBank.

Plantagineae in General
In total, we generated 296, 15 and 158 new sequences of ITS, rbcL and trnL-F, respectively.
All trees based on "broad" and "tall" datasets returned the highly supported (about 100% bootstrap and 100 BPP) topology (Aragoa, (Littorella, Plantago)) (  CADM test for the congruence between trnL and ITS parts of "tall" supermatrix returned the high Kendall concordance value (W = 0.85147994), the null hypothesis of incongruence was rejected with p-value 0.00009999 (Chi-squared = 63690.69985174). CADM tests for the congruence between COI, ITS and rbcL parts of "broad" supermatrix also returned the high Kendall concordance values (W = 0.60137286 and 0.71073409, respectively), the null hypotheses of incongruence were both rejected with p-values 0.00109989 (Chi-squared = 4602.90790653) and 0.00009999 (Chi-squared = 5439.95871762), respectively.  . Density surface of the cophenetic space, based on the MB trees from the "tall" dataset. This surface is the result of multidimensional scaling of the cophenetic distances between tree tips; density of points estimated with 2D binned kernel. "Ridges" reflect areas with the highest density and correspond well with three major subgeneric divisions of Plantago. Note the three-fold structure of the "phylogenetic surface": tallest corresponds with Plantago subg. Plantago, close and behind it, is Plantago subg. Coronopus whereas Plantago subgg. Psyllium and Albicans are the rightmost.

Aragoa
Generally, the support of subclades is low in Aragoa ( Figure 2). The most stable is a placement of Aragoa lucidula S.F.Blake as a sister to all other studied Aragoa species. Within the rest of the subtree, the majority of species make one clade with subdivisions mostly without high support. Morphologically outstanding (with relatively broad leaves) A. dugandii Romero forms a clade with A. lycopodioides Benth. and A. occidentalis (all branches here are longer than in other parts of Aragoa tree). The other unusual species (with long, yellow corolla tubes), A. perez-arbelaeziana Romero, forms a clade with A. romeroi Fern.Alonso (Figures 4 and 5A).

Littorella
Three (two in the "broad" dataset) species of Littorella make a stable group where European L. uniflora (L.) Asch. is sister to American L. americana Fernald and L. australis Griseb. ex Benth. & Hook.f. (Figures 4 and 5A).

Plantago subg. Plantago
Trees originating from the "broad" dataset contain many highly supported clades whereas "tall" trees have reliable support only for some terminal clades (Figures 2, 4 and 5B-D).
The less stable but relatively consistent group forms around polymorphic P. asiatica L. from mainland China and Japan, including P. schneideri Pilg., P. centralis Pilg., and P. cavaleriei H.Lév. We found typical P. asiatica on Hawaii Island, thus extending the range of this East Asian species to the mid-Pacific. However, all "P. asiatica" from mainland USA are either P. major or P. rugelii Decne. ( [62,63]).
Plantago hakusanensis Koidz. is a Japanese alpine endemic species with a distinct morphology. On our trees, it is branched closely to P. asiatica. In PE herbarium, we discovered the Yunnan sample labeled as "Plantago zhongdainensia" (nomen nudum), which morphologically might be considered similar to both P. hakusanensis and P. asiatica. Unfortunately, DNA data is not available from this sample. Proximal to P. asiatica is also morphologically distinct P. hasskarlii Decne. from Java mountains. Another species from Southeast Asia, P. incisa Hassk., groups outside of P. asiatica clade(s).
An unusual sample "Plantago sp. Hupeh1" was collected from China. It has morphological similarities with P. densiflora J.Z.Liu (synonymized with P. asiatica in the "Flora of China," [64]), but has 4-6 large black seeds and large fruits which is not in agreement with P. densiflora protologue. On our trees, it groups with P. depressa Willd. and allies (e.g., P. komarovii Pavlov and P. camtschatica Link). Besides, on "broad" trees, P. depressa robustly groups together with P. macrocarpa Cham. & Schltdl.; this grouping is also present on "tall" trees with less support.
American P. eriopoda Torr., P. rugelii, P. sparsiflora Michx., and P. tweedyi A.Gray are robustly supported as a clade on "broad" trees. Two samples collected in Chihuahua desert (BRIT) from northern Mexico also belong to this clade. They cluster together with P. eriopoda and P. tweedyi but morphologically are very distinct (see below).
Morphologically, Plantago rugelii is similar to P. major, but they do not group on any tree. Plantago major in turn, does not branch closely to morphologically similar P. asiatica, which was pointed out in [4]. Instead, P. major s.l. groups with P. japonica Fr. & Sav., P. cornuti Gouan, P. gentianoides Sibth. & Sm. and P. griffithii Decne., albeit with low support. Sequences from polyspermous form of P. major [65] described as P. uliginosa F.W.Schmidt, are identical to the typical P. major. Plantago griffithii, which is frequently considered a form of P. gentianoides, groups with the P. gentianoides s.str. on our trees, but this grouping is unstable.
Plantago pachyphylla A.Gray and P. hawaiensis (A.Gray) Pilg. (both from Hawaii) group together, and also with P. aundensis P.Royen from New Guinea. Alpine form of P. pachyhylla from Kauai (labeled in HUH as "Plantago nubicola Tessene," nomen nudum) clusters outside of P. hawaiensis + P. pachyphylla from Hawaii Island.
Two New Zealand species, P. triandra Berggr. and P. unibracteata Rahn, always cluster together outside of the rest of Plantago subg. Plantago.
Our phylogenic trees, especially from the "tall" dataset, also provide the primary ground for the placement of little-studied or previously molecularly not studied forms, for example, for P. laxiflora Decne. This South African species is morphologically unusual for the region and groups outside of African species. Other African and Madagascan species, i.e., P. africana Verdc., P. longissima Decne., P. palmata Hook.f., P. remota Lam. and P. tanalensis Baker, tend to group together with low support.
Most of Plantago sect. Virginica ([66]) species do not group with high support. However, we note that Andean P. oreades Decne. always branches outside of the P. australis Lam. group. Another Peruvian form from this section was listed by Knud Rahn (MO herbarium note) as a possible new species; on our trees, it groups with different members of the section, including South American P. tomentosa Lam. The second "unknown" from Peru, Plantago sect. Virginica sample from NY with a long stem (unusual in Plantago subg. Plantago) frequently groups with P. tenuipala (Rahn) Rahn from Columbia.
Plantago firma Kunze ex Walp. was typically considered as strictly Chilean species, but we have found its samples collected in Peru (USM). All Chilean and Peruvian P. firma samples robustly group together, and then with another species, bipolarly distributed P. truncata Cham. & Schltdl.
Most of Knud Rahn's Oliganthos species (this group includes P. barbata G.Forst., P. correae Rahn, P. pulvinata Speg., P. sempervivoides Dusén, and P. uniglumis Wallr. ex Walp.) were sequenced for the first time. On our "tall" trees, the group does not have high support but clusters together with P. moorei Rahn, P. tehuelcha Speg. and P. fernandezia Bertero ex Barnéoud, all from Southern Cone and surrounding islands.
Most of the Australian species form a low-supported but relatively stable grade; here we were able to place some under-researched species: P. antarctica Decne., P. depauperata Merr. & L.M.Perry, P. drummondii Decne., P. gunnii Hook.f., P. polita Craven (New Guinea) and P. turrifera B.G.Briggs & al.

Plantago subg. Coronopus
All topologies supported the subdivision of sects. Maritima and Coronopus (Figures 4 and 5B). Within Plantago sect. Maritima, we were able to place with confidence the rare Central Asian P. eocoronopus Pilg. (as a sister to the whole group) and North African P. rhizoxylon Emb. We detected the presence of the "true" P. maritima in South Africa (PRE herbarium); these samples are molecularly not different from the rest of P. maritima.
Macaronesian P. asphodeloides Svent. is the sister to other species from Plantago sect. Coronopus, and North African P. crypsoides Boiss. is sister to Mediterranean P. serraria L.

Plantago subgg. Bougueria, Albicans and Psyllium
Within this highly supported group (character abbreviations are explained in the Support Table S3, Figures 4 and 5D), P. nubicola (Decne.) Rahn (which sometimes regarded as a separate genus Bougueria) always as sister to all other species where the following topology is prevalent: (Psyllium s.str., ("American clade," "Plantago ciliata clade," "Mediterranean clade")); these we will describe in detail below.
Psyllium forms a robust, relatively long branch that split between mostly annual species with non-linear bracts (e.g., P. squarrosa Murray) and mostly perennial, woody species with narrow bracts (e.g., P. arborescens Poir.).
"Plantago ciliata clade" on "tall" trees is sister to "American clade", whereas on "broad" trees P. ciliata Desf. is sister to "Mediterranean clade" (with lower support). This clade includes P. ciliata and two successfully sampled species from the Plantago sect. Hymenopsyllium, P. cretica L. and P. bellardii All.
"American clade" is, in essence, Rahn's Plantago sect. Gnaphaloides. Plantago erecta E.Morris is variably at the base of this group, and P. helleri Small branches close to the southern P. nivea Kunth. The rest of North American species form a stable clade (which therefore roughly corresponds with Rahn's ser. Gnaphaloides), where P. aristata Michx. and P. argyrea E.Morris form a subgroup.

Morphological and Combined Analyses
Procrustes analysis allows for the embedding of two multivariate datasets ( [56,57]) and related statistical tests. Our molecular and morphological datasets are significantly correlated (Procrustes correlation = 0.7748, with significance = 0.001 based on 999 permutations), but individual placements are variably shifted ( Figure 6).  Even after intensive sampling, there are species which still lack the molecular information. With k-nearest neighbor machine learning (Figure 7), we obtained the section placements of these species. More than half of them placed with relatively high (>90 BPP) confidence (Table S4).
Chi-squared tests returned significant p-values (0.0005, 0.016, and 0.0055 respectively) and typically large effect sizes (corrected Cramér's V 0.39, 0.34 and 0.26 respectively, see also Figure 8) when comparing leaf shapes (morphometric dataset) by subgenera, sections, and macro-regions. The presence of a gapped spike was significantly different in comparisons between sections (p-value 0.0004 and 0.48 corrected Cramér's V). At the same time, the relative sizes of the stalk and spike were not significant in all comparisons.
Among morphometric characters ( Figure 9A), most significant in respect to the molecular phylogenetic tree, was the presence of a taproot, and then the length of leaves ( Figure 9B). The top 10 binary morphological characters ( Figure 9C) were: seed surface type 4 (with elongated ridges: [67]), long corolla (> 3 mm) lobes, opposite leaves, presence of pedicel, truncated base of leaf blade, presence of glandular hairs, elongated stem, antrorse hairs on the stalk, and presence of non-glandular hairs with the strongly refracted walls.    We used recursive partitioning ( [59]) to construct classification trees of the group ( Figure 10A,B). To find as many useful characters as possible, we ran this analysis three times, excluding characters used in the previous run. The resulting recursive classification trees employed 20, 20, and 19 characters (out of 115) and had 25.3%, 35.5%, and 48.1% misclassification errors, respectively. With morphometric characters, the resulting tree used all seven characters and returned a 75.5% misclassification error.    Table S3.

Plantagineae in General
The inclusion of the COI marker might in theory alter the outcome of our analyses because there is an evidence [68] of horizontal transfer in COI analyses of other plant groups. In case of our single-marker COI tree ( Figure S1), congruence tests suggest that it is not discordant with other markers.

Relationships of Plantago, Aragoa and Litorella within the tribe
There is unequivocal support for the Aragoa (Littorella (Plantago)) structure of the group phylogeny (Figures 1, 2 and 4). This structure is concordant with the current understanding of the evolution of the tribe, including the evolution of flower symmetry ( [8]).
Littorella, with three distinct species lineages, is robustly supported as a sister group of Plantago. Our data (Figures 4 and 5A) agree with a view of morphologically and ecologically outstanding Littorella as a separate generic lineage ( [21]).

Species Relationships within Aragoa
Bello et al. [23] included only three species (plus two hybrids) of Aragoa and did not resolve their relationships. Here for the first time, a significant part of the Aragoa diversity was reviewed using molecular data (Figures 4 and 5A). In general, there is some support for groupings previously obtained based solely on morphology ( [1]).
However, some morphologically unusual species like A. dugandii with relatively broad leaves, and A. perez-arbelaeziana with long tubular flowers do not make separate clades but are clustered together with more "typical" species (A. lycopodioides and A. romeroi, respectively, both from the main clade). The relatively broad, patent leaves and the simple inflorescence of A. dugandii, together with relatively large flowers, might be therefore interpreted as an adaptation to environments where moisture loss is not so critical as for many other species of the genus. As for A. perez-arbelaeziana, the presence of long, pendulous, yellowish corollas, unusual in the genus, can be interpreted as a result of recent adaptive radiation to a specific type of pollinator (hummingbirds: Fernández personal comm./observation) which does not entail the significant modifications in vegetative structures.
The machine learning placement of three unsampled Aragoa species resulted in the selection of possible candidate neighbors (Table S4) from the same main Aragoa clade.

Subgeneric Relationships within Plantago
In general, there are no principal conflicts with the most recent taxonomic studies of the genus based on morphology summarized in [2]. However, there are numerous differences worth emphasizing.
All our analyses reproduce the ((Coronopus, Plantago), (Bougueria, (Psyllium, Albicans))) backbone (Figures 4 and 5B-D); this is also visible on the phylogenetic density map (Figure 3). Our results correspond well with [3,4,12]. In addition, we provide strong support for the separate Psyllium and Albicans clades, which we treat here as subgenera based on the molecular and morphological evidence.

Plantago subg. Plantago
In general, our trees in this part ( Figure 5B,C) do not provide a clear, well-resolved picture as in plastome studies ( [4]). Our results are generally well in line with previous studies but comparing with [4,12,33,69,70] we were able to obtain information about many species previously not studied molecularly, and also about species which were not included in plastome research [4]. In several cases, the position of species in the specific section was "inferred based on the accumulated knowledge." Many such species are now assigned to sections (Table S4), and this is reflected in our working classification of Plantagineae (Table S2).
For example, it is mentioned in [4] that ". . . based on our phylogeny it is impossible to infer the position of the five unsampled American species . . . P. barbata, P. correae, P. pulvinata, P. sempervivoides, and P. uniglumis . . . ". Our current trees ( Figure 5C) place these species (except P. correae, for which we have no data) in one clade, which also includes P. rigida, P. tubulosa, P. moorei, P. tehuelcha, and P. fernandezia, the placement of which is well justified geographically (Table S5). Morphologically unusual P. sempervivoides is sister to the remainder of the group. Two Plantago sect. Carpophorae species are sister to P. fernandezia + P. barbata. This group likely has an Andean origin, and P. fernandezia might, therefore, have arrived at Juan Fernández from South America ( [12,71]). The positions of P. fernandezia, P. tehuelcha, and Plantago sect. Carpophorae species are different on the plastome trees ( [4]), but these trees have low support exactly in these parts.
Another complicated group was not resolved entirely, but we provide several insights for the placements and phylogeny in general of species around polymorphic and widespread P. asiatica ( [69,70,72]). Most of these forms cluster together (Figures 4 and 5C) and separately from the P. major clade; thus, morphological similarity here does not justify taxonomic closeness. Japanese endemic Plantago hakusanensis is either within this group or sister to the rest of this group; the same is true for P. hasskarlii. Relations of P. hakusanensis and P. hasskarlii mandate more in-depth research.
Molecular data from Plantago japonica, together with ecology and morphology, suggests that this species is likely distinct from P. major ( [70,72]).
Samples from the Hubei province of China ("Plantago sp. Hupeh1") appear similar to Plantago asiatica but branch near P. komarovii + P. camtschatica clade ( Figure 5B). As the allopolyploid origin of tetra-or hexaploid P. asiatica and P. rugelii was suggested in [70], we cannot exclude the possibility of the allopolyploid origin of these Hubei samples (with ITS kept from the Plantago sect. Pacifica parent). We believe that a thorough study of Chinese, Korean, and Japanese Plantago subg. Plantago species (and specifically species from sect. Pacifica) is required before reaching any robust conclusions. In the light of [70], the recent historical origin of P. rugelii (which branches closely to P. sparsiflora on our trees but morphologically hardly distinguishable from P. major) might also be justified.
There are seven endemic Plantago species described from New Guinea (P. aundensis, P. depauperata, P. montisdicksonii P.Royen, P. papuana P.Royen, P. polita, P. stenophylla Merr. & L.M.Perry and P. trichophora Merr. & L.M.Perry), but DNA extraction from available samples failed in 90% of cases. Nevertheless, we were able to place four of these species ( Figure 5B): P. aundensis with Hawaiian P. pachyphylla s.l. and P. hawaiensis (the third Hawaiian species, P. princeps does not hold a stable position on our trees); P. depauperata and P. polita with Australian P. muelleri Pilg.; and P. trichophora with Australian P. gaudichaudii Barnéoud. Still, much more sampling is needed, and Hawaiian species also deserve a closer look ( [4]).
Morphologically unusual samples ( Figure 11) from Chihuahua (Mexico) are superficially similar to P. gaudichaudii from Australia and P. remota from South Africa, but we believe that this is a clear example of morphological convergence. On our trees, these samples belong to Plantago sect. Pacifica clade and branch together with Midwestern P. eriopoda and P. tweedyi from the Rocky Mountains region ( Figure 5B). We believe that these samples deserve to be described as a new species of Plantago (see below).
The same Pacifica clade includes on "broad" trees Plantago macrocarpa, the Northern Pacific seashore species. On the "tall" trees ( Figure 5B), the placement of this species is not stable.
Plantago krascheninnikovii is superficially similar to the inland forms of P. maritima and therefore treated as a member of Plantago subg. Coronopus ( [73]). Known populations of this rare Urals species do not typically form ripe seeds ( [53]), thus preventing the correct placement on the base of morphology. Here we confirm for the first time its similarity with other Lamprosantha species, for example, Eastern European P. schwartzenbergiana ( Figure 5C). The southern tetraploid forms similar to Plantago media but with thick, erect, grayish leaves ( [53,73]), often regarded as P. urvillei Opiz or P. media subsp. stepposa (Kuprian.) Soó are distant from P. media s.str. (Figure 5C), thus necessitating the separation of this taxon. However, the diversity of P. media s.l. is still far from being fully understood ( [74]), and more research is needed to draw robust taxonomic conclusions. Here should be noted that P. media subsp. stepposa must not be mixed with the similarly looking shade, mesophytic plants of P. media ( [53]).
Plants from Öland described as P. minor Fr. are morphologically distinct and geographically isolated from other species of this section. However, common garden experiments ( [75]) led to the conclusion that they are conspecific with P. tenuiflora. On our trees, the sample from Öland is proximal to P. polysperma. We believe that more research on Öland plantains is needed to understand the taxonomic status of these forms better.
Andean Plantago oreades Decne. with distinct morphology (narrow leaves, long inflorescences, broad bracts, 1-3 seeded fruit, thick roots) was nevertheless included into broadly understood P. australis ( [2]). On our trees ( Figure 5C), it almost always separate from the other P. australis samples. Therefore, we propose here to re-establish this species.

Plantago subg. Coronopus
Our trees (Figures 4 and 5B) provide one of the most comprehensive phylogenies for the Plantago subg. Coronopus, and are in concordance with the recent work of [32]. In addition, we were able to place several species which have not been the subject of molecular studies before. The most interesting are positions of Canarian P. asphodeloides Svent as sister to the rest of species from Plantago sect. Coronopus, and P. eocoronopus Pilg. as sister to the rest of the Plantago sect. Maritima. The latter species is the rare Afghan plant, practically absent in collections. Pilger ([30]) guessed this species to be ancestral for the section, and now we can support this view on the basis of both molecular markers and morphology ( [61]). All our "P. schrenkii" C. Koch samples from the Arctic are identical to P. maritima ( [76]).

Plantago subgg. Bougueria, Albicans and Psyllium
Generally, this part of our trees (Figures 4 and 5D) is the most congruent with the classification [2]. Our data is also congruent with the most complete (until now) sampling of the group ( [3]) and provides robust support for many sub-groupings, which is reflected in our working classification (Table S2).
Among other results, we found that likely extinct P. johnstonii ( [14]) branches close to the coastal annual P. limensis and therefore belongs not to ser. Brasilienses but to ser. Hispiduleae. It is possible then that the perennial life form of the former species is the result of adaptation to the "lomas" microclimate.
The most recent review of the Plantago sect. Lancifolia ( [77]) is in agreement with our results but also provides new insights for the classification of this Mediterranean taxon. More research is needed to understand the relations of rare endemic species in this group.

Integration of Molecular and Morphological Data
With Procrustes analysis (Figure 6), we found that the overall "picture of diversity" is retained between morphological and molecular approaches. In other words, the correspondence between these analyses is high. This in turn allows for the k-nearest neighbor (Figure 7) placement of several species (Table S4) which might be otherwise incertae sedis in our working classification (Table S2).
Within sections and subgenera, we found several repetitive morphometric patterns ( Figure 8) which are present in each taxonomic group of particular level ("refrains": [78]); this is additional evidence that morphometric characters should work better within sections ( [32]). We also found several morphological and morphometric characters most correlated with molecular phylogenies (Figure 9). The morphometric characters are especially interesting because the analysis was performed on the same samples and not on higher units like species descriptions. Notable is the importance of the presence of taproot, which is another argument for collecting plantains always with carefully preserved underground parts. Among the binary morphological characters, attention should be paid on the research of seed surface characters ( [67]), highly correlated with molecular data ( Figure 9C).

Identification Keys
Production of identification keys is a complicated task in plantains. These keys must take into account the high variability and overlapping of the most distinctive characters used in Plantago taxonomy ( [4]). Therefore, it might be desirable to employ here results of machine learning techniques such as recursive partitioning. Partitioning trees ( Figure 10A,B) are able to identify the most informative characters and distinguish groups with minimal errors. We used these partitioning trees as a basis of the prototypic dichotomous key to Plantago sections. This prototypic key might serve as a framework for the future development of even more comprehensive keys: 1a. Ovary with 1-3 ovules and a rudiment of an upper compartment on the adaxial side of the placenta. Corolla lobes longer than 1 mm.  Notes:-Very little is known about habitat, ecology and phenology of this species. Collected specimens grow on alkaline soils, in grass semi-desert on the relatively high altitudes (about 1 km above sea level). Flowering time is likely the second half of summer (August). As only two specimens are known, more observations and collections are needed. This species is definitely rare and must be protected, and we hope that describing this new species, we will improve its chances to survive the Anthropocene.
Supplementary Materials: The following are available online https://www.mdpi.com/article/10.3 390/plants10112299/s1, Table S1: Vouchers, Table S2: Working classification of Plantagineae, Table S3: Morphological characters, Table S4: Machine learned placements of the molecularly under-sampled species of Plantago, Table S5: Geographical origin of Plantago species, Figure S1: The single-marker Plantago phylogenetic tree based on COI sequences; this tree is similar to other trees suggesting (together with CADM tests) the absence of HGT.