Apomictic Mountain Whitebeam (Sorbus austriaca, Rosaceae) Comprises Several Genetically and Morphologically Divergent Lineages

Simple Summary The genus Sorbus (whitebeams, rowans, and service trees) encompasses forest trees and shrubs characterised by exceptional diversity resulting from the interplay of polyploidisation, hybridization, and apomixis. The spatiotemporal processes driving Sorbus diversification remain poorly understood. This research aims to provide insights into the evolution and diversification patterns of mountain whitebeam (S. austriaca) covering most of its range in the mountains of Central and South-eastern Europe. Our molecular and morphometric data revealed pronounced cryptic diversity within the S. austriaca complex; it is composed of different lineages, that likely originated via multiple allopolyploidisations accompanied by apomixes, and these lineages exhibit different distribution patterns. Our results are particularly valuable from a biodiversity conservation perspective due to the continuing generation of novel diversity in sympatric populations of the parental taxa. Such derived diversity requires process-oriented conservation plans and measures. Abstract The interplay of polyploidisation, hybridization, and apomixis contributed to the exceptional diversity of Sorbus (Rosaceae), giving rise to a mosaic of genetic and morphological entities. The Sorbus austriaca species complex from the mountains of Central and South-eastern Europe represents an allopolyploid apomictic system of populations that originated following hybridisation between S. aria and S. aucuparia. However, the mode and frequency of such allopolyploidisations and the relationships among different, morphologically more or less similar populations that have often been described as different taxa remain largely unexplored. We used amplified fragment length polymorphism (AFLP) fingerprinting, plastid DNA sequencing, and analyses of nuclear microsatellites, along with multivariate morphometrics and ploidy data, to disentangle the relationships among populations within this intricate complex. Our results revealed a mosaic of genetic lineages—many of which have not been taxonomically recognised—that originated via multiple allopolyploidisations. The clonal structure within and among populations was then maintained via apomixis. Our results thus support previous findings that hybridisation, polyploidization, and apomixis are the main drivers of Sorbus diversification in Europe.


Introduction
The evolutionary importance of polyploidisation in flowering plants as a key mechanism shaping plant diversity is widely acknowledged [1][2][3][4]. Whole-genome multiplication via auto-or allopolyploidy can reshuffle genome structure, alter gene expression, induce phenotypic and physiological changes, and provide adaptive potential to polyploid plants [5,6]. Polyploidisation is sometimes connected to the breakdown of self-tionary entities-which we consider premature-but rather discuss the taxonomic implications of our results.

Plant Material
Leaf material from 37 localities was collected and silica-dried for molecular analyses and herbarized for morphological analyses ( Figure 1A, Table S1); localities 22,27,36, and 37 were sampled only for molecular analyses. The taxa were identified using Flora Europaea [46], Euro + Med Plantbase [52], and national floras [18,[53][54][55][56]. Voucher specimens are kept at the National Museum of Bosnia and Herzegovina (SARA).  Table S1. AFLP clusters within Sorbus subgen. Soraria are colour-coded and labelled with the letters a-l; in (A) their presence in each sampled locality is indicated (the parental taxa S. aria and S. aucuparia are in white).  Table S1. AFLP clusters within Sorbus subgen. Soraria are colour-coded and labelled with the letters a-l; in (A) their presence in each sampled locality is indicated (the parental taxa S. aria and S. aucuparia are in white). The NeighborNet diagram is supplemented with bootstrap values > 85% derived from a Neighbourjoining analysis ( Figure S1); multilocus genotypes (MG's) derived from nuclear microsatellites; and ploidy data obtained by flow cytometry; dashed lines denote the plastid haplotype affiliation according to the plastid trnT-trnF phylogenetic analysis. Note that only a subset of individuals was sequenced for plastid DNA variation.

Amplified Fragment Length Polymorphism (AFLP)
One to 13 individuals per locality, totaling 172 individuals from 37 sampled localities, were included in the AFLP analysis (Table S1). A modified CTAB-procedure [57] was used for extraction of total genomic DNA from c. 20 mg of silica-dried leaf material. The AFLP protocol was followed [58], with the modifications described in [39]. We used the following primer combinations for the selective PCR (fluorescent dye in brackets): EcoRI (6-FAM)-ACA/MseI-CAC, EcoRI (VIC)-AAG/MseI-CTG, and EcoRI (NED)-ACC/MseI-CAG (MseI-and EcoRI-primers: Sigma-Aldrich). Reproducibility was tested using fifteen replicated samples. Electropherograms were analysed with Peak Scanner 1.0 (Applied Biosystems, Waltham, MA, USA) with default peak detection parameters. The minimum fluorescent threshold was set at 50 relative fluorescence units (RFU). RawGeno 2.0 [59], a package for R [60], was used for automated data scoring with the following settings: 75-500 bp scoring range, 50 RFU minimum intensity, bin width 1.0-1.5. Fragments with reproducibility lower than 80% based on sample-replicate comparisons were excluded. A neighbour-joining analysis based on Nei-Li genetic distances [61] was conducted and bootstrapped (2000 pseudo-replicates) with TREECON 1.3 b [62]. A NeighborNet was produced from a matrix of uncorrected P distances using SplitsTree 4.12 [63]. A principal coordinate analysis (PCoA) based on Jaccard distances was conducted using PAST 2.15 [64].

Analysis of Nuclear Microsatellites
The amplification of six nuclear microsatellite-specific loci (CH01F02, MSS5, MSS13, MSS16, D11, and H10) was successfully performed for 171 individuals from 37 sampled localities (Table S1), following Robertson et al. [33,34]. An ABI PRISM 310 Genetic Analyzer (Applied Biosystems) was used for electrophoretic separation of the PCR products. Alleles were sized relative to the internal size standard TAMRA 500 (Applied Biosystems). Electropherograms were analysed using GeneMapper (Applied Biosystems). To study the genetic diversity, we determined the multilocus genotype (MG) for each individual on the basis of microsatellite alleles for each of the six loci using the software GenoType 1.2 [65]. Assignment of individuals to a particular clone was completed using the algorithm of Meirmans and Van Tienderen [65] according to the calculation of a genetic distance matrix and a threshold value (set to 2 after testing different thresholds as recommended) under the stepwise mutation model option. Clonal diversity was presented as the sum of the total number of multilocus genotypes (Ng), effective number of genotypes (Eff ) and genotypic diversity (Div), calculated with GenoDive 1.2 [65]. Relationships among multilocus genotypes were visualised by principal coordinate analysis (PCoA) based on Jaccard distances using PAST 3.17 [64]. The maximum number of alleles per locus was used to infer the ploidy level.
Maximum parsimony (MP) and MP bootstrap (MPB) analyses of the concatenated plastid sequences (including gap codes) were performed using PAUP 4.0b10 [70]. The most parsimonious trees were searched for heuristically with 100 replicates of random sequence addition, TBR swapping, and MulTrees on. All characters were equally weighted and unsorted. The data set was bootstrapped using full heuristics with 1000 replicates, TBR branch swapping, MulTrees option off, and a random addition sequence with five replicates. Bayesian analyses of the same dataset were performed using MrBayes 3.2.1 [71], applying the HKY85 substitution model proposed by the Akaike information criterion implemented in MrAIC.pl 1.4 [72]. The alignment was partitioned into nucleotide and indel data sets, and the latter was treated as morphological data according to the model  [73]. Values for all parameters, such as the shape of the gamma distribution, were estimated during the analyses. The settings for the Metropolis-coupled Markov chain Monte Carlo process included four runs with four chains each (three heated ones using the default heating scheme), running simultaneously for 10,000,000 generations each, sampling trees every 1000th generation using default priors. The posterior probabilities (PP) of the phylogeny and its branches were determined from the combined set of trees, discarding the first 1001 trees of each run as burn-in.

Genome Size Estimation
Genome size estimation followed the protocol of Hajrudinović et al. [41], using flow cytometry. Briefly, fresh leaves of 45 individuals from 12 populations (Table S1) were co-chopped with a razor blade with fresh leaves of the internal standard Medicago truncatula Gaertn. cv. R108-1 (0.98 pg [74]) in 600 mL of cold Gif nuclear buffer. The suspension was filtered through a 50 µm nylon mesh (CellTrics, Partec), and RNAse (Roche) was added to 25 U mL −1 . The nuclei were stained with propidium iodide (Sigma-Aldrich) with a final concentration of 50 mg mL −1 and incubated on ice for ca. 15 min prior to analysis. The fluorescence of~3000 nuclei was recorded for each sample using a Partec CyFlow SL3 (Partec, Münster, Germany) 532 nm laser cytometer or CyFlow Ploidy Analyser (Sysmex Europe SE) 532 nm laser. The 2C DNA values were obtained, and DNA ploidy levels [75] were inferred by comparison with the 2C DNA values of individuals of known chromosome counts, i.e., 2n = 2x = 34 for diploids and 2n = 4x = 68 for tetraploids [76].
For the purpose of reproduction mode identification, we conducted flow cytometric seed screening (FCSS) on 40 seeds from 10 locations (see Results), following Hajrudinović et al. [41]. Seeds were collected from previously cytotyped mother individuals (Table S1). Furthermore, to test for self-compatibility, the inflorescences of three S. austriaca trees from locality 14 were covered with pollination bags before anthesis. Seven seeds from pollination bags were collected and analysed. Only well-formed seeds, cleaned, shortly dried at room temperature, and kept in paper bags at 4 • C prior to analysis were used. Each seed was analysed separately. Endosperm ploidy was calculated using the inferred monoploid genome size of the embryo. Following Hajrudinović et al. [41], DNA ploidies of embryo and endosperm were compared with distinguish between sexual and apomictic origin of each seed.

Morphometric Analyses
Morphometric measurements were performed on 96 individuals from 24 localities (Table S1) of simple-leaved populations belonging to S. aria × S. austriaca, S. austriaca, S. mougeotii and S. pekarovae (all from the subgenus Soraria). Leaves of S. anglica, S. cuneifolia, and S. pauca, distributed outside the range of S. austriaca, were not available for measurements. The measurements included leaf characters that were previously shown to be informative [18,23,24,27,77]. The following 18 quantitative leaf characters were measured: lamina length (LLEAV), petiole length (LPET), length of the first, second, and third tooth (1SEINL, 2SEINL, 3SEINL), length of the first, second, and third nerve (1NERV, 2NERV, 3NERV), angle between the first, second, and third nerve compared with the primary nerve (1NANG, 2NANG, 3NANG), lamina width (WLEAV), distance of the leaf base to the line of maximal leaf width (MXWLEAV), leaf width 1 cm beneath the leaf apex (1ALEAV), leaf width 1 cm above the leaf base (1BLEAV), number of secondary nerve pairs (NNER), ratio of lamina length and lamina width (LLEAV/WLEAV) and ratio of lamina width and distance of the leaf base to the line of maximal leaf width (WLEAV/MXLEAV). The measurements were completed by hand, using millimeter paper and a digital calliper. The arithmetic means of three to five measurements per leaf character for each individual (from different mid-leaves of short sterile shots) were used for statistical analyses. Multivariate principal component analysis (PCA), canonical discriminant analysis (CDA), and classificatory discriminant analysis (DA) were performed for two data matrices. The first matrix encompassed data of all samples (96 individuals), while S. pekarovae, S. mougeotii and three putative back-crossed individuals S. aria × S. austriaca (AFLP group l in Figure 1B) were excluded from the second matrix (84 individuals of the S. austriaca lineages, see Results). Two canonical discriminant analyses (CDA 1 and CDA 2) followed by two classificatory discriminant analyses (DA 1 and DA 2) were performed for the two data matrices [78]. Prior to the analyses, the data matrix was standardised due to the different measurement units used. PCA based on the correlation matrix characters of the first matrix was aimed at displaying a general pattern of variation and relationships among individuals/populations/taxa. CDA based on Mahalanobis' distances was used to analyse the morphology of a priori defined groups using the 12 AFLP groups with simple leaves marked as a-l in Figure 1B. The obtained results were validated using DA. The validation criterion in the identification of morphological groups was >70% of a posteriori correctly classified cases into the a priori defined groups in CDA [79]. Classificatory DA was performed using a leave-one-out cross-validation (jackknifing) procedure. Both analyses were performed using PAST 3.14 [64]. Furthermore, basic descriptive statistical parameters were calculated for the analysed taxa/genetic groups: the arithmetic mean (µ), standard deviation (SD), value range (Min-Max) and coefficient of variation (CV%).

AFLP Fingerprinting
We obtained 435 high-quality and reproducible AFLP fragments from 172 individuals. The initial average error rate was 4.1%. The neighbour-joining analysis inferred 15 clusters with high bootstrap support (BS > 85%; Figure S1) that were also divergent in the Neighbor-Net ( Figure 1B). The two most divergent clusters contained the parental species, S. aucuparia and S. aria. Whereas the former cluster was genetically more uniform, separated by a long split from all other samples, the latter cluster was more diverse. All other 13 clusters that were positioned intermediate between the parental taxa corresponded to Sorbus subgen. Soraria. Populations treated as S. austriaca were included in nine clusters, which included single populations scattered in the Northeastern Limestone Alps (Austria; clusters b and e, the latter including population 24 sampled ten kilometres away from the locus classicus of S. austriaca), the central Balkan Peninsula (Bosnia and Herzegovina, g; Serbia, d), and the Western Carpathians (Slovakia, h, i). Cluster j included two populations from the Southern Carpathians (Romania), cluster c included five populations from the Northern and Southern Limestone Alps (Austria, Slovenia) and the central Balkan Peninsula (Bosnia and Herzegovina), and cluster f included nine populations from the western Balkan Peninsula (Dinaric Mountains; Bosnia and Herzegovina, Croatia, Kosovo, and Serbia). Three populations of S. mougeotii from the Western Alps (Switzerland) and the Northern Limestone Alps (Austria) were included in cluster a. Likewise, the single population of S. pekarovae from the Western Carpathians (Slovakia) formed its own cluster k, whereas the single populations of S. anglica and S. cuneifolia from England and Wales, respectively, were included in cluster w. Finally, two populations from the central Balkan Peninsula (Kosovo and Serbia) and the single individual of S. pauca from the eastern Sudetes (Czech Republic) were genetically closest to S. aria and were included in cluster l; we assume these individuals are hybrids between S. aria and S. austriaca.

Nuclear Microsatellites
A total of 25 clonal multilocus genotypes (MGs) were found within 110 accessions of Sorbus subgen. Soraria ( Figure 2, Table 1). Twenty-four out of 28 localities contained at least one MG shared by a different number of individuals within a locality, whereas localities 22 (Czech Republic), 33, and 34 (Switzerland) contained only one sampled individual, and locality 7 (Kosovo) contained three unique genotypes (Table S2). Nineteen populations with >1 sampled individuals were completely clonal (Ng = 1; Table S2). The effective number of genotypes (Eff ) was higher than 1 in six populations due to the presence of unique genotypes conferring higher values of genotypic diversity (Div) in those populations. Nineteen MGs were limited to single populations, and MG 2 was present in geographically  (Table 1). However, several MGs occurred at multiple localities (MGs 2, 5, 12, 22, 23, and 24; Table 1). The most widespread clonal genotype (MG 5, Table 1) occurred at seven localities in the Dinaric Mountains (localities 7, 11, 13, 14, 15, 17, and 18; Table 1).
The relationships among MGs ( Figure 2) were generally consistent with the AFLP data ( Figure 1B). The two most divergent Soraria MG clusters along the first PCoA axis thus corresponded to AFLP clusters c and f, including populations from the Balkan Peninsula and the Eastern Alps. The populations of most other AFLP clusters were scattered between these two main MG clusters; clusters a and w were most divergent from cluster c along the second PCoA axis. The AFLP cluster l including putative hybrids S. austriaca × S. aria was most diverse. Individuals from locality 14 (Bosnia and Herzegovina) had the most heterogeneous genotypes. Individuals were included in the two most divergent clusters, in accordance with the AFLP results, where these individuals were included in clusters c and f. The populations from other areas were more uniform, however, without a clear geographic pattern. There were also several monoclonal populations, namely in locations 19, 20, and 21 from the Western Carpathians, 23 and 24 from the Eastern Alps, and 2 from the Central Balkan Peninsula. On the other hand, the populations from locations 3 and 4 from the Southern Carpathians and 31, 33, and 34 from the Alps all shared a single clone. In addition, the allele composition of the parental species, S. aucuparia and S. aria, is given in Table S3.  Table 1). Twenty-four out of 28 localities contained at least one MG shared by a different number of individuals within a locality, whereas localities 22 (Czech Republic), 33, and 34 (Switzerland) contained only one sampled individual, and locality 7 (Kosovo) contained three unique genotypes (Table S2). Nineteen populations with > 1 sampled individuals were completely clonal (Ng = 1; Table S2). The effective number of genotypes (Eff) was higher than 1 in six populations due to the presence of unique genotypes conferring higher values of genotypic diversity (Div) in those populations. Nineteen MGs were limited to single populations, and MG 2 was present in geographically close populations (Table 1). However, several MGs occurred at multiple localities (MGs 2, 5, 12, 22, 23, and 24; Table 1). The most widespread clonal genotype (MG 5, Table 1) occurred at seven localities in the Dinaric Mountains (localities 7, 11, 13, 14, 15, 17, and 18; Table 1). Locality numbers correspond to Figure S1A and Table 1 and colours and letters to the AFLP clusters in Figure 1B. Multilocus genotypes belonging to the same AFLP cluster are connected with dashed lines. The numbers in parentheses denote the number of clones per locality.  Locality numbers correspond to Figure S1A and Table 1 and colours and letters to the AFLP clusters in Figure 1B. Multilocus genotypes belonging to the same AFLP cluster are connected with dashed lines. The numbers in parentheses denote the number of clones per locality. N-Number of individuals belonging to each multilocus genotype; N per locality-Number of individuals (in brackets) bearing a particular multilocus genotype per each locality (in bold), numbered as in Figure 1; Xo-Maximum number of alleles per locus; Xe-Expected ploidy level-obtained by flow cytometry for at least one individual per population from each MG group a or from published results for the same sampled locality [80]

Plastid trnT-trnF Phylogenetic Relationships
The trnT-trnF alignment of the concatenated trnT-trnL intergenic spacer and the trnL-trnF partial sequence was 1943 bp long. The shortest sequences (1766 bp) were those of S.  14). Eight substitutions, of which two were outapomorphic, contributed to variability within the S. aria lineage. Twenty-three characters were parsimony-informative, and Bayesian and parsimony analyses of the trnT-trnF sequences resulted in congruent phylogenies (Figure 3) that reflected the variation outlined above.

Genome Size, Ploidy Level and Reproduction Mode
Flow cytometry of 45 individuals resulted in holoploid absolute genome sizes (GS, 2C value) corresponding to two ploidy levels, namely diploid and tetraploid (Figure 4).

Genome Size, Ploidy Level and Reproduction Mode
Flow cytometry of 45 individuals resulted in holoploid absolute genome sizes (G 2C value) corresponding to two ploidy levels, namely diploid and tetraploid (Figure 4)  Figure 1B) and locality numbers followed by individual numbe to Figure 1A and Table S1.  (Table 2). Table 2. Flow cytometric results for nuclei from leaves and seeds of Sorbus taxa accompanied w the deduced origin of seeds / reproduction mode.

Locality
No.   Figure 1B) and locality numbers followed by individual numbers to Figure 1A and Table S1.  (Table 2). Table 2. Flow cytometric results for nuclei from leaves and seeds of Sorbus taxa accompanied with the deduced origin of seeds/reproduction mode.

Locality
No.  * Asterisks mark the number of seeds from inflorescences isolated in pollination bagss.

Affiliation in AFLP NNet
Flow cytometric seed screening of 40 seeds resulted in three different embryo:endosperm profiles, namely 2x:3x, 4x:10x and 4x:12x (Table 2). Diploid S. aria and S. aucuparia mother trees yielded seeds with 2x embryo and 3x endosperm, which represents a regular sexual profile. On the other hand, all seeds of analysed tetraploid S. austriaca, S. aria and S. aria × S. austriaca mother trees were of apomictic origin with 4x embryos and 12x endosperms, or, in one seed, 10x endosperm (Table 2). Moreover, several inflorescences of the three S. austriaca individuals from locality 14 that were covered with pollination bags to check for self-compatibility, yielded fruits. All seeds developed in pollination bags were of apomictic origin with 4x embryos and 12x endosperms ( Table 2).

Morphology
Principal Component Analysis generated four significant principal components (two are displayed, Table S4). They accounted for 83.1% of the total variance (PC1 = 40.2%, PC2 = 19.6%, PC3 = 14.1%, and PC4 = 9.1%), with moderate correlation to the majority of corresponding morphological traits (Table S4). The PCA ordination diagram ( Figure S2) showed a pattern in which most AFLP clusters overlapped among the four clusters (c, d, f, and k). The CDA 1 diagram of all samples ( Figure S3A) showed that the overlapping AFLP clusters c (Dinaric Mountains, Southern Limestone Alps, Northern Limestone Alps), h, i, and j (all Carpathians) were morphologically divergent from the overlapping clusters e (Northern Limestone Alps), f and g (both Dinaric Mountains), and l (Dinaric Mountains, Bohemia), whereas the partly overlapping Alpine clusters a (Western Alps) and b (Northern Limestone Alps) were intermediate along DF1 (explaining 41.5% of variation). Along DF2 (explaining 20.6% of the variation), the populations from AFLP clusters d (Carpathians-Balkan Mountains) and k (Carpathians) were clearly divergent, whereas all other AFLP clusters were intermediate. Finally, along DF3 (11.3%; Figure S3B), some AFLP clusters that were overlapping along DF1 and DF2 were divergent, namely clusters a, b, e, and j. The characters that contributed mostly to the discrimination along DF1 were LLEAV/WLEAV, NNERV, 1NERV and 2NERV, those along DF2 were WLEAV, LLEAV/WLEAV, WLEAV, and those along DF3 1NANG, 2NANG. The classification matrix with the Jackknifed procedure for the CDA 1 dataset resulted in 82% of correctly classified individuals (Table S5).
The CDA 2 resulted in clearer morphological discrimination among the AFLP clusters of the S. austriaca lineages ( Figure 5). Along with DF1 (45.2%) and DF2 (18.0%; Figure 5A), AFLP clusters c, d, and j were clearly divergent. On the other hand, f, e, and g, as well as b, h, and i, overlapped. Along DF3 (16.0%; Figure 5B) clusters e, h, j and j were divergent, whereas c, d, and f, as well as f and g, overlapped. The characters that contributed most to the discrimination along DF1 were LLEAV/WLEAV, 1NERV, NNER, and 2NERV; those along DF2 were WLEAV, LLEAV/WLEAV, NNERV, and 3NER; and those along DF3 were WLEAV/MXLEAV. The classification matrix with the Jackknifed procedure for the CDA 2 dataset resulted in 88% of correctly classified individuals (Table S6). Basic descriptive statistical parameters of measured leaf characters are given in Table S7.

Discussion
Our integrative approach combining AFLP fingerprinting, plastid DNA sequences, nuclear microsatellites, ploidy-level estimation, and morphometric analyses inferred intricate patterns of diversification within the S. austriaca complex. The genetic data revealed a clear divergence among (groups of) populations included in different allotetraploid apomictic lineages that originated in various parts of South-Eastern and Central Europe. Our results thus support previous findings that hybridisation, polyploidization, and apomixis are the main drivers of Sorbus diversification, at least in Europe [33,34,36,82].

Multiple Origins of S. austriaca Lineages
Our AFLP and nuclear microsatellite data suggest independent origins of the different S. austriaca lineages in different parts of the Alps, Dinaric Mountains, and the Carpathians, likely as a consequence of polytopic hybridisation between the parental diploid

Discussion
Our integrative approach combining AFLP fingerprinting, plastid DNA sequences, nuclear microsatellites, ploidy-level estimation, and morphometric analyses inferred intricate patterns of diversification within the S. austriaca complex. The genetic data revealed a clear divergence among (groups of) populations included in different allotetraploid apomictic lineages that originated in various parts of South-Eastern and Central Europe. Our results thus support previous findings that hybridisation, polyploidization, and apomixis are the main drivers of Sorbus diversification, at least in Europe [33,34,36,82].

Multiple Origins of S. austriaca Lineages
Our AFLP and nuclear microsatellite data suggest independent origins of the different S. austriaca lineages in different parts of the Alps, Dinaric Mountains, and the Carpathians, likely as a consequence of polytopic hybridisation between the parental diploid species S. aria and S. aucuparia ( Figures 1B and 2, Table 1). Hybridisation was followed by polyploidisation, as all individuals of S. austriaca for which we established the ploidy, were tetraploid ( Figures 1B and 4, Tables 1 and 2). This pattern, along with the restriction of many lineages to single (lineages b, d, e, g, h, i, and k) or a few (lineages a, j, and w) localities, some of them in areas that were strongly glaciated during the Last Glacial Maximum (LGM) [83,84], may imply their recent origin. On the other hand, the geographically widespread lineages c and f included multiple populations and shared related multiclonal MGs differing in one or two alleles per locus ( Figure 1A,B; Table 1). Cluster f includes populations solely from the Dinaric Mountains, which were much less affected by the Pleistocene glaciations than the Alps [85,86]. Cluster c also includes populations from this area and the southern margins of the Eastern Alps, which suggests that they might have originated earlier and thus had more time to disperse [87]. Their disjunct distributions could be a result of multiple dispersals and establishments in isolated localities, but also of a previously more continuous range disrupted by the climatic changes during the Pleistocene.
An important factor in the range expansion of apomicts is self-compatibility, a reproductive trait of the mating system of S. austriaca as determined in population 14 (Table 2). Self-compatibility most likely facilitated the range expansion of certain clonal genotypes, more specifically those from clusters c and f ( Figure 1B). Due to self-compatibility, apomictic clones can establish populations via single individuals that use their own pollen for the required endosperm fertilisation to produce functional seeds [7]. In this way, apomictic genotypes promote range expansions to remote areas, where they can function as pioneer colonisers of new habitats [15]. The fleshy fruits of Sorbus, such as those of other Malinae (e.g., Crataegus), are adapted to dispersal by vertebrates, mainly birds, often over relatively long distances [16,88].
A certain level of genetic differentiation among the populations in the Dinaric cluster f (Figure 1) suggests their persistence in disjoint localities over a longer period rather than their recent dispersal. Such a differentiation is surprising, as it is not expected in an obligate apomictic such as S. austriaca [37,41]. Different mechanisms can facilitate genetic divergence even among obligate apomictic populations, namely residual sexuality [89], accumulation of mutations and chromosome rearrangements [12,90], recombination during restitutional meiosis [91], transposon activity [92], and heritable epigenetic variation [93]. On the other hand, in the Dinaric-Alpine cluster c, the level of diversification appears to be lower, suggesting more recent divergence of some populations. This is consistent with the extensive glaciation of the Alps during the LGM [94], rendering postglacial colonisation of the Alpine populations from the South likely. Nuclear microsatellite analysis confirmed that diploids from South-eastern Europe (Albania and Montenegro) and Central Europe (Slovenia) were involved in the formation of the Central European tetraploid populations of S. aria [87], which further supports the hypothesis of postglacial colonisation of the Alps from the South. In the same line, the Balkan Peninsula served as an important Pleistocene refugium and a source for post-glacial colonisation of Central Europe for other trees, e.g., alder buckthorn (Frangula alnus L. [95]), European beech (Fagus sylvatica L. [96]), and hornbeam (Carpinus betulus L. [97]).
The co-occurrence of the widespread genetic lineage f and the stenoendemic lineages c and g at the same localities in the central Dinaric Mountains (14 and 15, Figure 1) is an additional line of evidence for recurrent allopolyploidisations in sympatric populations of parental diploid sexuals. Weak reproductive barriers between S. aucuparia and S. aria facilitate gene flow and continuously generate novel hybrid derivatives, which are often polyploid [32]. Along the same line, the co-occurrence of AFLP group f in S. austriaca and S. aria at locality 7 in the south-eastern Dinaric Mountains could have led to hybridisation and, consequently, the origin of a novel genetic entity (i.e., cluster l).
Also in our dataset, AFLP cluster l that was in the NeighborNet closest to S. aria ( Figure 1B), shared its haplotype with S. aria (Figure 3). Due to their intermediate position between S. aria and the S. austriaca complex in the NeighborNet, we suggest that these populations represent backcrosses of the S. austriaca complex with S. aria as the maternal parent (S. aria × S. austriaca). Alternatively, direct hybridisation of S. aria with S. aucuparia as a pollen donor could be plausible, but we consider this less probable because such a scenario would result in an intermediate leaf morphology (semipinnate leaves [33]). To our best knowledge, S. aucuparia served as the maternal parent to all allopolyploid derivatives originating from hybridisation with S. aucuparia.
The individuals of cluster l in our study, including S. pauca, all shared a S. aria haplotype and were tetraploids with simple leaves ( Figure 1B, Table 2). Whereas localities 6 and 7 are separated by only 27 km, and the clustering of their individuals is thus not surprising, the close relationship of the single analysed individual of S. pauca from a locality more than 970 km away, is unexpected. Sorbus pauca was recently described as a hybridogenous tetraploid apomictic endemic of the eastern Sudetes that most probably arose from two hybridisation events [25], but this hypothesis as well as the formal description of this new species were exclusively based on morphology and cytometry. It is likely that multiple recent hybridisations between genetically similar parents at different localities led to highly similar hybrid genotypes. Alternatively, cluster l has a single origin and is more common in intermediate areas, for instance the Carpathians.
The origin of Sorbus tetraploids may follow two different pathways. One involves an initial cross of a diploid sexual and a tetraploid apomict, leading to apomictic triploid offspring. Subsequently, unreduced triploid eggs are fertilised by reduced pollen of the parental diploid to produce new tetraploids [33,37,41]. Alternatively, crossing the reduced megagametophyte (n = 2x) of a sexual tetraploid with the reduced pollen (n = 2x) of an apomictic tetraploid can produce the same tetraploid offspring. In the case of S. austriaca, the first scenario would include fertilisation of a reduced diploid S. aucuparia megagametophyte (n = x) with reduced pollen of tetraploid S. aria (n = 2x) to produce triploid offspring; tetraploid S. aria is namely widespread in the Balkan Peninsula [39,41]. In the next step, the unreduced megagametophyte of such a triploid is fertilised by reduced pollen of S. aucuparia (n = x) to produce a tetraploid. Facultative sexuality of apomictic allopolyploids could allow backcrossing with parental species [101]. The second scenario would include fertilisation of a reduced allotetraploid megagametophyte (such as S. bosniaca with the same parental combination [39]) with reduced pollen of tetraploid S. aria. The problem with the latter scenario is that there is no evidence for sexuality in S. bosniaca. The third scenario presumes a direct cross via fertilisation of an unreduced S. aucuparia megagametophyte (n = 2x) with reduced pollen of tetraploid S. aria (n = 2x).
Therefore, different scenarios for the formation of Sorbus polyploids are plausible, at least in the Dinaric Mountains, where different cytotypes and taxa often coexist [39,41]. Despite recent methodological progress, reconstructing the origin of allopolyploids is still challenging due to their recurrent formation, recombination among homeologous chromosomes, different epigenetic expression, genome restructuring, or extinction of parental lineage [5,6,102,103]. It is particularly difficult to trace subgenomes of the S. aria group within allopolyploid complexes due to the enormous genetic and cytotypic variability of the members of this group [82].
In Sorbus, particularly ploidy data are considered important because in this genus, polyploidy confers apomictic reproduction, which is the key criterion in addition to morphology [18]. Apomictic reproduction per se does not necessarily imply uniform and distinctive morphology but may result in poorly differentiated individuals/populations arising from the same or related combinations of parental taxa/cytotypes, as shown for S. aria from Central Bavaria [82]. Our results show that genetically similar populations also tend to cluster together morphologically, regardless of their geographic origin (clusters c and f, Figures 1B, S2 and S3).
Our study reveals cryptic diversity within the S. austriaca complex. High congruence of molecular and morphological data demonstrates that the sampled populations of S. austriaca in fact form multiple evolutionary entities. Interestingly, both the level of genetic ( Figure 1B) and morphological ( Figures S2 and S3) divergence among described species of subgenus Soraria, namely S. anglica and S. cuneifolia (cluster w), S. mougeotii (a), and S. pekarovae (k), is similar to the divergence among different lineages of S. austriaca (clusters b-g). The S. austriaca clusters are differentiated from S. pekarovae and S. mougeotii based on leaf morphology ( Figure S3), although some morphological overlap between S. mougeotii and S. austriaca (cluster b, population 23) is evident.
The strongest morphological differentiation is seen among S. austriaca lineages from the Balkan Peninsula (clusters f and d) and the Balkan-Alpine cluster c ( Figure 5A). The Balkan populations have genetic affinities with both the Northern and Southern Limestone Alps (cluster c, Figures 1A and 5), which is also reflected in similar morphology. On the other hand, the pronounced genetic and morphological divergence of the Carpathian lineages of S. austriaca (the Southern Carpathian cluster j and the Western Carpathian clusters i and h) is in line with their geographical isolation relative to lineages from the Alps and the Dinaric Mountains. Their distinctiveness may be explained by a potentially different parental combination, i.e., a non-sampled paternal lineage of genetically variable S. aria [45,82].
Allopolyploidisation likely occurs polytopically, leading to the evolution of independent populations [81], and natural selection and chromosome rearrangements can then result in the formation of similar morphological forms [110]. On the other hand, epigenetic mechanisms coupled with polyploidy can produce different phenotypes regardless of the similarity of the genomic compositions of allopolyploids [111]. In our case, the morphological integrity of the S. austriaca lineages and their distinctiveness across the sampled area are likely maintained via apomixis and reproductive isolation. Even if some of the genetic clusters could be associated with existing species (e.g., cluster c could pertain to the recently described S. lippertiana [40]), the others likely represent undescribed cryptic microspecies. We here refrain from recognising the uncovered cryptic diversity as distinct taxonomic entities; additional studies, including a denser sampling and more detailed morphological analyses, are needed to taxonomically resolve this intricate complex.

Conclusions
Our molecular data revealed pronounced cryptic diversity within the S. austriaca complex; it is actually composed of different lineages, which likely originated at different time horizons and exhibit different distribution patterns. These data highlight the importance of genetic analyses on the one hand and including samples from a broader geographic area when taking taxonomic decisions and describing new species in Sorbus on the other hand. Apart from that, our results are particularly valuable from a biodiversity conservation perspective, because in the genus Sorbus, the interaction of hybridisation, polyploidy, and apomixis represents a powerful mechanism generating novel diversity in sympatric populations of the parental taxa. Traditional conservation efforts, based on clearly defined boundaries of taxonomic entities in order to specify appropriate action plans and measures, have limitations in preserving the diversity of taxonomic complex groups [112], such as the genus Sorbus, whose dynamic evolution requires a different approach. Therefore, a new conservation concept based on evolutionary processes was proposed (Process-Based Species Action Plan [113]). The goal of this concept is the conservation of processes that generate diversity, i.e., the preservation or increase of the number of individuals in a potential interaction. Studies such as the present one are timely, as they are setting the stage for such process-oriented biodiversity conservation.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/biology12030380/s1, Figure S1, Neighbor-Joining analysis of AFLP data for Sorbus subgen. Soraria, S. subgen. Aria and S. subgen. Sorbus. Localities' numbers and groups' colours follow the other Tables and Figures; Figure S2, PCA ordination of 18 morphological traits for all Soraria accessions. Colours and letters correspond to AFLP clusters in Figure 1; Figure S3, Canonical discriminant analysis (A: DF1 vs. DF2; B: DF1 vs. DF3) of 12 predefined groups (corresponding to AFLP clusters) based on individual plants and 18 morphological characters. Colours and letters correspond to AFLP clusters in Figure 1; Table S1, Geographic origin and number of individuals of Sorbus populations included in the analyses of AFLP, nuclear microsatellite, plastid DNA sequencing, flow cytometric and morphometric data, respectively; Table S2, Indices of population clonal diversity based on nuclear microsatellite data for the studied populations of Sorbus subgen. Soraria; Table S3, Alelle composition in six nuclear microsatellite loci for the presumed parental taxa of Sorbus subgen. Soraria; Table S4, Results of principal component analysis (PCA, see Figure  S2) and canonical discriminant analysis (CDA, see Figure S3) of Soraria individuals based on the morphological characters of leaves; Table S5, Classification matrix with Jackknife procedure (82% of cases correctly classified) for the dataset of analysed Sorbus individuals based on morphometric leaf measurements (groups are defined according to AFLP clusters, Figure 1B); Table S6, Classification matrix with Jackknife procedure (88% of cases correctly classified) for the dataset of analysed Sorbus austriaca individuals based on morphometric leaf measurements (groups are defined according to AFLP clusters, Figure 1B); Table S7, Descriptive statistics of leaf morphometrics for Sorbus subgen. Soraria individuals presented for AFLP groups.