Evidence for Plio-Pleistocene Duck Mussel Refugia in the Azov Sea River Basins

: Freshwater mussels (Bivalvia: Unionoida) play an important role in freshwater habitats as ecosystem engineers of the water environment. Duck mussel Anodonta anatina is widely distributed throughout Europe, Siberia, and Western and Central Asia, which makes it a convenient object for biogeographic studies. In this study, we analyzed the divergence of A. anatina populations and discovered a separate genetic lineage distributed in rivers of the Azov Sea basin. This was confirmed by the high genetic distances between this group and previously defined populations, and by the position of this clade in the Bayesian phylogeny calibrated by an external substitution rate. Based on our approximate Bayesian computation (ABC) analysis, biogeographic scenarios of A. anatina dispersal in Europe and Northern, Western, and Central Asia over the Neogene– Quaternary were simulated. The haplogroup’s isolation in the rivers of the Azov Sea basin most likely occurred in the Late Pliocene that was probably facilitated by rearrangement of freshwater basins boundaries in the Ponto-Caspian Region. Population genetic indices show the stability of this group, which allowed it to exist in the river basins of the region for a long time. The discovery of a long-term refugium in the rivers of the Azov Sea led to a better understanding of freshwater fauna evolution in the Neogene–Quaternary and highlighted the importance of conservation of these freshwater animals in the region as a source of unique genetic diversity. and D.M.P. collected samples. A.V.K. and A.A.T. designed and carried out molecular analyses. A.A.T., A.A.L. and A.V.K. performed phylogenetic modeling and population genetic analysis. M.Y.G. the maps. A.A.T. the with from A.V.K., I.N.B. M.V.V.


Introduction
Under the influence of climate changes at various stages of the Late Cenozoic in Europe, there existed refugia in which the fauna survived and evolved in isolation without interaction with neighboring regions. The identification of such regions allows reconstructing biogeographic history of particular taxa in detail. Such studies also help to improve the knowledge on geological and climatic events of the Neogene-Quaternary (23.03 Ma-11.70 Ka).
Isolated lineages of freshwater animals were identified in several European regions on the basis of molecular genetic data. The Italian, Iberian, and Balkan Peninsulas became refugia for continental fauna during cold climate events in various geological periods [1][2][3][4].
In this respect, the eastern part of Europe has been poorly studied. The Ponto-Caspian Region was subject to significant climatic fluctuations and the sea level changes during the Neogene-Quaternary. Approximately 4.15 million years ago, according to a comprehensive reconstruction of paleoecological conditions [5], there was a transition from hyper-saline marine to fluviolacustrine freshwater environments associated with the progressive filling of the basin [6,7]. This environmental change approximately coincided in time with the Pliocene thermal optimum [8].
As a result of climatic changes and the transition to a freshwater environment in water bodies, various species of aquatic animals became distributed in this territory. This process was accompanied by emergence of isolated populations. For example, genetic studies of Barbus fish (Teleostei: Cyprinidae: Barbus) in Europe made it possible to identify the Ponto-Caspian subclade within this genus and corresponding refugium for freshwater fauna [9]. Connections between parts of the Ponto-Caspian basin were also studied on the basis of the phylogeny of marine invertebrates, i.e., amphipods [10]. The time split between the Caspian and Black Sea lineages of Pontogammarus maeoticus (Sovinskij, 1894) (3.83-4.31 Ma) was estimated using genetic data.
In our study, we add new evidence for the hypothesis that postulated the existence of the Ponto-Caspian Pliocene-Pleistocene refugium based on a study of the freshwater mussel species Anodonta anatina (Linnaeus, 1758) (Bivalvia: Unionidae). This freshwater mussel seems to represent one of the most appropriate models for identifying freshwater refugia and reconstructing connections between ancient freshwater basins. This species is widespread in fresh water bodies of the temperate zone of Eurasia, and also inhabits its subarctic part [11]. The mitochondrial genome of A. anatina is characterized by a low nucleotide substitution rate, and therefore can be used for reconstruction on the scale of several million years [12]. The results presented by Froufe et al. [13] indicated the existence of three genetic lineages of this species determined on the basis of mitochondrial DNA cytochrome c oxidase subunit I (COI) gene sequences.
In this work, we present the results of a broad-scale phylogeographic study of Anodonta anatina throughout Europe, and Northern, Western, and Central Asia to reconstruct the demographic history of its populations in the Pliocene and Pleistocene epochs under the influence of climatic and paleoecological changes.

Sample Collection, DNA Extraction, PCR and Sequencing
Samples of freshwater mussels were collected by hand from different water bodies in the following regions: Northeastern Europe (rivers of the Barents, White and Baltic Sea basins); Southeastern and Eastern Europe (rivers of the Black, Azov and Caspian Sea basins); and Northwestern, Southwestern, and Central Asia (rivers of the Mediterranean, Kara, and Laptev Sea basins) (Table S1). Sample locations from three divergent haplogroups of Anodonta anatina (IBER, ITAL, and AZOV) and samples from the Eurasian haplogroup (excluding samples from Siberia) are shown in Figure 1. Soft tissue snips for DNA analyses were preserved in 96% ethanol immediately after collection. The specimens were deposited in the collection of the Russian Museum of Biodiversity Hotspots (RMBH) of the Federal Center for Integrated Arctic Research, the Ural Branch of the Russian Academy of Sciences (FCIARtic).  (Table  S1). Pink color indicates basins of following rivers: Kuban, Don, Kagalnik, Kirpili, Yeya, Chelbas, Beisug. The map was created using ESRI ArcGIS 10 software (https://www.esri.com/arcgis); the topographic base of the map was created with Natural Earth Free Vector and Raster Map Data (https://www.naturalearthdata.com) and HydroSHEDS (https://www.hydrosheds.org) (Map: Mikhail Yu. Gofarov).

Phylogenetic Analyses and Divergence Time Estimates
The alignment of the COI sequences was performed directly using the ClustalW algorithm [25]. For phylogenetic analyses, each of the 324 aligned sequences of Anodonta anatina was trimmed, leaving a 590 bp fragment. Then, identical COI sequences were removed from the dataset using online FASTA sequence toolbox FaBox v.1.5 (http://users-birc.au.dk/palle/php/fabox) [26], leaving a total of 85 unique haplotype sequences (including the four outgroup taxa). For phylogenetic analyses, we used the COI dataset with unique haplotypes. The best models of sequence evolution for each partition were as suggested on the basis of the corrected Akaike Information Criterion (AICc) of MEGA7 [27]. We used an external COI mutation rate of 2.65 × 10 −9 substitutions/site/year, which was based on two vicariate Unio taxa separated by the Messinian salinity crisis as a paleogeographic event [28]. This estimation corresponded well with our own records. Hypothetical divergence times were estimated in BEAST v. 2.1.3 [29,30] on the basis of this evolutionary rate using a lognormal relaxedclock algorithm with the Yule speciation process as the tree prior [31,32]. Calculations were performed at the San Diego Supercomputer Center through the CIPRES Science Gateway [33]. The HKY model was applied to each partition instead of the GTR model because prior and posterior ESS values under the GTR model were always recorded <100. This indicated that the GTR model was likely overly complex for our data.
Two replicate searches were conducted, each with 10 million generations. The trees were sampled every 1000 th generation. The log files were checked visually with Tracer v. 1.6 for an assessment of the convergence of the MCMC chains and the effective sample size of parameters [34].
All ESS values were recorded as >200; posterior distributions were similar to prior distributions. The resulting tree files from two independent analyses were joined with LogCombiner v. 2.1.3. The first 10% of the trees were discarded as an appropriate burn-in. The maximum-clade-credibility tree was reconstructed using TreeAnnotator v. 2.1.2 [15].

Demographic History and Molecular Dating Analysis
Biogeographic scenarios were compiled on the basis of phylogenetic studies of the freshwater mussel genus Anodonta. Froufe et al. [13] identified the following haplogroups of Anodonta anatina on the basis of COI gene sequences: (1) Iberian Peninsula without the Ebro river basin; (2) continental Europe; and (3) Italian Peninsula with the Ebro river basin. Subsequently, this scheme was supplemented by data on the divergence of Anodonta sp. in the rivers of Southwestern Europe [2]. Our COI sequence data for A. anatina from the rivers of the Azov-Prikubanskaya Lowland (Kuban, Don, Kagalnik, Kirpili, Yeya, Chelbas, Beisug riverine basins) indicated the presence of a separate subclade within this species. Thus, biogeographic scenarios used for demographic history modeling included these four lineages.
Scenario Sc1 suggests simultaneous separation of southern populations due to a paleogeographic event at time t3 from the continental Eurasian population. According to this scenario, the four groups of Anodonta anatina were separated. According to Froufe et al. [2,13] and our COI gene sequences (based on uncorrected p-distances), we proposed two more scenarios suggesting divergence of populations in the southern regions of Europe from the rest of the European populations, i.e., Scenarios Sc2 and Sc3. The Sc2 scenario suggests initial separation of the Azov-Prikubanskaya Lowland population, and then that of the Italian Peninsula and the existence of A. anatina refugium in the Iberian Peninsula at time t3. In its turn, scenario Sc3 assumes initial separation of the Italian + Ebro subclade from the continental population, and the existence of ancient genetic group of A. anatina there. Further, according to this scenario, there was a separation of populations of the Azov-Prikubanskaya Lowland, and then those of the Iberian Peninsula and Northern Eurasia ( Figure 2).  Table 1. These demographic scenarios were simulated for comparison using an ABC approach with DIYABC v. 2.1.0 software [35]. The primary sequence dataset included four samples: (1) Table 1. The HKY was the best evolutionary model for our sequence dataset [36].
A total of 3 × 10 6 simulated datasets were calculated. Pre-evaluations of model prior combinations in ABC inference revealed that the prior settings were correctly assigned ( Figure S1). Posterior probabilities of the three biogeographical scenarios were calculated using direct and logistic approaches. On this basis, the scenario with the highest value of posterior probability was determined. Times of divergence between lineages in a supported scenario were calculated using estimation of posterior distributions of parameters with DIYABC v. 2.1.0 [35].
Population genetic diversity indices (haplotype and nucleotide diversity), Tajima's D test, and Fu's F-test statistics, and mismatch distribution analysis under a spatial expansion model were calculated using Arlequin v. 3.5.1.2 software to estimate the demographic histories of the sampled populations [37]. For this analysis, the same groups as for the ABC procedure were used.
The mean genetic divergence between those subclades (uncorrected p-distances) varied from 2.35% to 3.41%. The mean genetic distances between ITAL and IBER, EUR, and AZOV were more than 3.0%, while the mean p-distances between EUR and AZOV, IBER and AZOV, IBER and EUR were 2.35-2.56% (Table 2). The time-calibrated phylogeny showed that the ITAL lineage was likely separated from the other subclades 11.4 Ma ago, while the AZOV lineage diverged 9.9 Ma ago. The split between IBER and EUR subclades was placed 8.4 Ma ago.

Historical Demography and Divergence Time Estimates
The IBER and AZOV lineages differed from the other subclades by a high haplotype diversity (mean Hd values from 0.873 to 0.912) and a high nucleotide diversity (π > 0.8%). The EUR and ITAL lineages had lower values of these population parameters (mean Hd ranges from 0.725 to 0.767, π < 0.5%). The Fu's FS and Tajima's D tests indicated no significant deviation from mutation-drift equilibrium for ITAL and AZOV lineages, whereas the IBER lineage had a statistically significant value for Fu's FS test. Additionally, EUR lineage had significant negative values of both statistics, revealing possible historic demographic expansion. Mismatch distribution analysis of the ITAL lineage resulted in multimodal distribution with three peaks at 0, 5, and 8 bp. Samples from the three other lineages showed bimodal distribution with peaks at 1 and 9 bp (IBER and AZOV), and at 1 and 7 bp (EUR) (Figure 4). The EUR lineage revealed the lowest values of parameter τ, which reflects a time since population expansion, while all the other subclades returned much larger momentestimator values (Table 3).  Results of the ABC simulation showed that Scenario Sc3 was identified as the most likely scenario with the highest posterior probability specified by logistic approach as 0.94 ( Figure 5 and Table 4).  The posterior predictive error rate, which was calculated over 1000 datasets using the direct approach, was 0.170. Type I error for the choice of Scenario Sc3, in accordance with the direct approach, was 0.102. Type II errors for the choice of Scenario Sc3, according to the direct approach for 1000 datasets in favor of Scenarios Sc1 and Sc2, were calculated as 0.045 and 0.198, respectively.
Modeling results obtained under Scenario Sc3 revealed an order of isolation of Anodonta anatina southern refugia in Europe (Table 5). In particular, this scenario placed the split between ITAL and the continental population in the Messinian stage of the Miocene (mean age = 6.11 Ma; 95% CI: 3.52-9.01 Ma). The divergence of the AZOV lineage most likely was in the Late Pliocene (mean age = 3.61 Ma; 95% CI: 1.85-5.73 Ma), and the split between EUR and IBER lineages was placed in the Middle Pleistocene (mean age = 1.50 Ma; 95% CI: 0.769-2.49 Ma). Our model predicted a relatively slow substitution rate in A. anatina lineages as follows: mean rate = 2.66 × 10 −9 s/s/y; 95% CI: 2.61 × 10 −9 -2.70 × 10 −9 s/s/y).

Discussion
Using Bayesian phylogenetic analyses, we confirmed the existence of four Anodonta anatina lineages revealed in previous studies [13,18,38]. The EUR lineage is widespread throughout Europe and Western, Northern, and Central Asia. Earlier, Froufe et al. [13] described the distribution of this subclade in Eastern and Central Europe. Klishko et al. [18] revealed that samples from Lake Baikal and Ukraine also belong to this subclade. In our novel phylogeny, this lineage included samples from the vast territory of Eurasia, in which freshwater fauna diversification occurred relatively recently, mostly during the Pleistocene [39]. We revealed that the EUR lineage inhabits rivers of the White and Barents Sea basins (Northern Dvina, Pechora, Onega, Indiga, Keret, Kuloy), and the Taz, Yenisei, Lena, Urals, Ob, Upper and Lower Volga, and Dnieper river drainages. The IBER and ITAL lineages formed two separate clades in our and earlier phylogenies [2,13].
Based on a large COI dataset for Anodonta anatina from rivers of the Azov Sea basin, we identified another intraspecific lineage of this species, the level of genetic divergence of which is similar to that of the Iberian and Italian subclades [13]. Froufe et al. [13] proposed that these peninsulas were served as glacial refugia for A. anatina because of high genetic distance values between these subclades and EUR lineage. The mean genetic distances between the IBER, ITAL, and EUR lineages and the AZOV lineages varied from 2.35% to 3.08%, which indicated the long-term isolation of freshwater basins in the Azov-Prikubanskaya Lowland and A. anatina populations inhabiting those water bodies. Previously, it has been shown that several other aquatic taxa, e.g., the freshwater fish genus Barbus and the amphipod genus Pontogammarus, had survived in the Ponto-Caspian refugia during the Neogene-Quaternary epoch [9,10].
The low genetic diversity of the ITAL lineage (Hd = 0.725 ± 0.058, π = 0.477 ± 0.285) may be explained by the fact that a significant part of specimens in this sample originated from three river basins, i.e., Reno (N = 12), Po (N = 11), and Ebro (N = 7). In addition, such values of these parameters may indicate an ancient genetic lineage that is not currently dispersing. This assumption was also confirmed by phylogenetic data. The high values of Hd (0.912 ± 0.022) and π (0.858 ± 0.469) for the IBER lineage are probably associated with long-term evolutionary processes occurring in this subclade in isolated river basins located in different parts of the Iberian Peninsula. The Anodonta anatina lineage from rivers of the Azov Sea basin has high haplotype and nucleotide diversity (Hd = 0.873 ± 0.035, π = 0.824 ± 0.452), which can indicate an ecological plasticity of this group and its longterm isolation (existence in a stable refugium). Fu's Fs and Tajima's D values for the IBER, ITAL, and AZOV lineages were not statistically significant, indicating that there is currently no expansion of these subclades. A. anatina samples belonging to the EUR lineage have low genetic diversity (Hd = 0.767 ± 0.028, π = 0.477 ± 0.280). The results of Fu's Fs (Fs = -21.078, p <0.02) and Tajima's D (D = -1.840, p <0.05) neutrality test indicated that this subclade is expanding its range throughout Eurasia during the Pleistocene. This lineage was found in numerous river basins of the continent and occupied substantially larger area compared with IBER, ITAL, and AZOV lineages.
Modeling the demographic history of these four A. anatina lineages from freshwater basins of Europe, and Western, Northern, and Central Asia made it possible to estimate the age of their divergence. On the basis of the ABC modeling results, we can hypothesize that divergence of A. anatina lineages with respect to refugia in Southern Europe was determined by changes in the sea level and rearrangements of regional hydrological networks during climate change periods. Almost complete drying of the western part of the Mediterranean Basin at the end of the Miocene during the Messinian salinity crisis could be a possible cause of direct connections between freshwater basins of the Iberian and Italian peninsulas, and subsequent gene exchange between A. anatina lineages. At this time, according to the simulation results, the ITAL lineage was separated, including isolated populations of Anodonta anatina in the Italian Peninsula and Ebro River basin (mean age = 6.11 Ma; 95% CI: 3.52-9.01 Ma).
Available paleogeographic data on changes in the Ponto-Caspian basin boundaries during the Pliocene indicate a shift from marine to freshwater environment in the Azov-Prikubanskaya Lowland [5][6][7]. At the same time, spreading of freshwater and brackish water animals happened by changing environmental conditions. As a consequence, isolated populations of these animals evolved, in particular among amphipods, freshwater fishes [9,10], and freshwater mussels (this study). Our phylogeographic study revealed the presence of a separate, well-diverged subclade of Anodonta anatina in Southeastern Europe. This subclade contains samples from rivers of the Azov Sea drainage, i.e., the Kuban, Don, Kagalnik, Kirpili, Yeya, Chelbas, and Beisug river basins. We can assume that isolated freshwater basins inhabited by A. anatina have existed in this part of the Ponto-Caspian Region in the Pliocene. According to the ABC modeling, isolation of AZOV lineage occurred in the mid-Pliocene (mean age = 3.61 Ma, 95% CI: 1.85-5.73 Ma). We found high values of the diversity of haplotypes (Hd) and nucleotides (π) within the AZOV subclade (Table 3), and a high level of genetic distance between this group and other populations of A. anatina ( Table 2).
The IBER lineage, determined on the basis of COI gene sequences [13], contains samples from rivers of the Atlantic Ocean basin in Iberia (South-Central, North-Western, and South-Western subregions). According to the demographic history modeling, split between the IBER and EUR lineages occurred 1.5 Ma ago in the Early Pleistocene (Calabrian).
During transition from the Early to Middle Pleistocene (~1 Ma), critical paleoclimatic changes occurred in the Ebro River basin when cycles of climate fluctuations were established (100 Ka ago) [40]. Subsequent fluvial evolution was characterized by a major entrenchment of fluvial valleys and staircase-terrace formation associated with stronger stadial/interstadial oscillations. This shift from the Early Pleistocene basins to Middle-Late Pleistocene river valleys is consistent with river evolution models described for other regions of Central and Northwestern Europe. Thus, sea-level fluctuations and geomorphological transformations of the Ebro River basin in the Early-Middle Pleistocene apparently contributed to the isolation of freshwater mussel populations (e.g., A. anatina) inhabiting river basins of the Iberian Peninsula.
Timing of the splits between Anodonta anatina lineages using the BEAST approach yielded results exceeding the time calculated under ABC simulation, and this difference increased with numerical values of age. At the same time, in both analyses, we used the time-scale calibration by an external calibration rate for the Unionidae [28]. This discrepancy between estimates of divergence ages using different approaches was noted by Feher et al. [41], who referred to the need to improve timeline calibration in calculations. This could be because BEAST does not consider the possibility of accelerated evolution and does not evaluate population parameters that are used to reconstruct population demographic history using ABC [36].

Conclusion
Here, we report the discovery of a separate mitochondrial DNA (COI) lineage of Anodonta anatina, which is restricted to rivers of the Azov Sea basin. This record highlights that the Azov Sea Region could have served as an ancient refugium for freshwater fauna. We showed that A. anatina lineages in Eurasia were largely diversified since the Late Miocene (Tortonian), with the Azov lineage separated from other subclades in the Late Pliocene ca. 3.6 Ma ago. Our novel phylogenetic and population demographic data improved modern knowledge on dispersal and refugial patterns of freshwater mussels in Eurasia. Finally, our data revealed that the Azov Sea Region can be considered a high priority area for future phylogeographic research and conservation efforts using freshwater animal groups as models.
Supplementary Materials: The following are available online at www.mdpi.com/1424-2818/12/3/118/s1, Figure  S1: Test results of biogeographical scenarios Sc1, Sc2 and Sc3 concerning origin of the Anodonta anatina populations under an ABC framework using the COI gene sequences. Model checking to measure a mismatch between the parameters of posterior combination and observed data sets in Scenarios Sc1 (a), Sc2 (b) and Sc3 (c). A description of basic assumptions with prior settings for each scenario is presented in Table 1. Table S1: List of sequences used in this study, including the species, the location and NCBI's GenBank accession numbers.