Integrative Taxonomy within Eremias multiocellata Complex (Sauria, Lacertidae) from the Western Part of Range: Evidence from Historical DNA

The Kokshaal racerunner, Eremias kokshaaliensis Eremchenko et Panfilov, 1999, together with other central Asian racerunner species, is included in the Eremias multiocellata complex. In the present work, for the first time, the results of the analysis of historical mitochondrial DNA (barcode) are presented and the taxonomic status and preliminary phylogenetic relationships within the complex are specified. We present, for the first time, the results of the molecular analysis using historical DNA recovered from specimens of several species of this complex (paratypes of the Kokshaal racerunner and historical collections of the Kashgar racerunner E. buechneri from Kashgaria) using DNA barcoding.


Introduction
Modern phylogenetic methods allow a look into the past with the extraction of DNA from specimens collected a century and a half ago. Collections at the Zoological Institute in St. Petersburg and Moscow State University have herpetological samples stored in ethanol that have never been exposed to formalin that go back to the 1880s. This study accesses this unique resource to answer taxonomic questions in the lizard genus Eremias. Currently, the genus Eremias includes 40 species [1] of predominantly arid sand, steppe, and desert lizards distributed from Northern China, Korea, central and southwestern Asia to southeastern Europe [2][3][4]. The central Asian racerunners of the Eremias multiocellata-E. przewalskii complex belong to the taxonomically most complicated group of the genus Eremias [5][6][7][8], which includes 8-9 central  . The genetic variability and phylogenetic relationships between E. multiocellata, E. przewalskii and allied taxa remain insufficiently studied. Further progress in this area is hampered by complicated taxonomy and a lack of verified identifications and comparisons with type material. Until recently, the multi-ocellated racerunner, Eremias multiocellata, was considered as a single species with several subspecies, only one of which, E. m. yarkandensis, was recognized before 1992 in the territory of the former USSR [2,5]. Long before that, an interesting population of racerunners was reported from the Sary-Dzhaz syrts of Kyrgyzstan [9], which was later assigned to the Kashgar racerunner E. buechneri [10]. This population, In the molecular genetic analysis, seven specimens of lizards of this complex (OK624437-OK624443) were used (Table 1; Figure 1).  . Numbers in square correspond to location of the specimens used both in morphological and molecular genetic analysis: solid square; this study: dashed square [14]. Red square corresponds to specimens used in this study; pink square [27]; Figure 1 and Appendix II. Numbers in circle correspond to location of the specimens used in morphological analysis only.
MtDNA sequencing and analyses. Tissue samples of Eremias sp. were taken from seven specimens preserved in ethanol including five specimens of type series from collections of ZISP and BSI.
Molecular analysis (DNA extraction and PCR) was carried out in the ZMMU Laboratory of Historical DNA, equipped exclusively for the work with museum DNA specimens, where no previous work on fresh tissues had been performed. DNA was extracted and purified using the Qiagen QIAamp DNA MiniKit (Qiagen, Hilden, Germany) with some changes: overnight lysis step at 56 °C and two steps of elution with AE buffer, each with 50 μL of buffer and 5 min of incubation at room temperature. We amplified a fragment of the COI (cytochrome oxidase I subunit) gene of mitochondrial DNA. DNA was highly degraded, so only short fragments (100-200 bp) were obtained using a combination . Numbers in square correspond to location of the specimens used both in morphological and molecular genetic analysis: solid square; this study: dashed square [14]. Red square corresponds to specimens used in this study; pink square ( [27], Figure 1 and Appendix II). Numbers in circle correspond to location of the specimens used in morphological analysis only.
MtDNA sequencing and analyses. Tissue samples of Eremias sp. were taken from seven specimens preserved in ethanol including five specimens of type series from collections of ZISP and BSI.
Molecular analysis (DNA extraction and PCR) was carried out in the ZMMU Laboratory of Historical DNA, equipped exclusively for the work with museum DNA specimens, where no previous work on fresh tissues had been performed. DNA was extracted and purified using the Qiagen QIAamp DNA MiniKit (Qiagen, Hilden, Germany) with some changes: overnight lysis step at 56 • C and two steps of elution with AE buffer, each with 50 µL of buffer and 5 min of incubation at room temperature. We amplified a fragment of the COI (cytochrome oxidase I subunit) gene of mitochondrial DNA. DNA was highly degraded, so only short fragments (100-200 bp) were obtained using a combination of internal primers designed for this study. Five primer pairs for five overlapping fragments were manually developed using Bioedit [28,29] and an alignment of Eremias multiocellata-complex sequences from GenBank (Table 2). Amplification was performed in 22 µL reaction volumes containing 2 µL DNA, 4 µL of Evrogen HS-Screen mix, and 1 µL of each primer (10 pmol/µL). All stages of the extraction process included a blank as a negative control run in parallel. PCR products were visualized using a 1% agarose gel. PCR products were sequenced at the Evrogen laboratory using an ABI PRISM 3500xl sequencer with BigDye Terminator Chemistry v. 3.1 (Applied Biosystems, Foster City, CA, USA) using PCR primers.
Short fragments were first aligned using Seqman v5.06 software [27], assembled into longer ones and then aligned using BioEdit Sequence Alignment Editor 7.1.3.0 [30] with default parameters. Subsequently, the alignment including all complete sequences of studied samples was reviewed and manually revised if necessary. Obtained consensus sequences of Eremias ssp. were deposited in GenBank under the following accession numbers: OK624437-OK624443. Other sequences of Eremias were downloaded from GenBank [14,27] (Table 1). Two sequences of E. multiocellata (KY366658 and KY366636) and Darevskia rudis (MN613693) were included as outgroups. The final alignment used for the phylogenetic analyses comprised 676 bp for 52 ingroup sequences of Eremias spp. Mean inter-and intraspecific uncorrected genetic p-distances and sequence characteristics were calculated using MEGA 6.1.
Phylogenetic trees were reconstructed by Bayesian inference criteria (BI), by maximumlikelihood (ML) and maximum-parsimony (MP) methods. The optimal partitioning schemes for Bayesian inference analysis were identified using PartitionFinder software [31] using greedy search algorithm under BIC criterion: HKY + I+G for all positions together. Bayesian inference (BI) was performed using Mr Bayes v3.1.2 software [32] with two simultaneous runs, each with four chains, for 5 million generations. We checked the convergence of the runs and that the effective sample sizes (ESSs) were all above 200 by exploring the likelihood plots using TRACER v1.5 software [33]. The initial 10% of trees were discarded as burn-in. Confidence in tree topology was assessed by posterior probability (PP) [34]. The ML trees were generated using IQ tree software [35] using ultrafast bootstrap = 10,000 [36], the model was selected using ModelFinder software [37]: TPM2u + F+G4. The unweighted MP analysis was conducted in PAUP 4.0b10 software [38] with 1000 bootstrap replications. Additionally, we calculated decay indices for parsimonious tree using PRAP2 [39] under non-ratchet parameters.

Sequence Characteristics
We obtained from 329 to 672 bp of each historical specimen of Eremias. Of the 676 aligned sites, 540 were found to be conserved, 130 variable and 110 parsimonyinformative; the transition-transversion bias was estimated at 4.11 (all data given for ingroups only). The nucleotide frequencies on the light strand were 24.5% (A), 28.5% (T/U), 29.2% (C), and 17.8% (G). Uncorrected p-distances are provided in Table 3. Between-group distances vary from 1.46% to 9.57%, and within-group distances vary from 0 to 1.41%. A histogram showing the distribution of pairwise genetic p-distances demonstrates two gaps: at 3.0 and 6-7%, respectively (Figure 1).

Molecular Phylogeny
The extensive type series (  The "-" sign shows that the node exists on the Bayesian tree, but the topology is different in the specified analysis. Numbers below the nodes represent decay indices for parsimony tree. Red labels highlight sequences from historical specimens.

Discussion
Our results showed that the studied samples of Eremias kokschaaliensis are divided into two groups. One group of E. kokschaaliensis A (paratype BSI no. R-000589, paratype no. ZISP R-8289, and one specimen no. R-3302 ZMMU from type territory) form a clade with Eremias specimens from Aksu (China) with high support (PP/BS/MP = 1/0.99/93); E. kokshaaliensis specimen no. R-8289 ZISP has a basal position within this group (PP/BS/MP = 0.98/0.88/52). The uncorrected within-group p-distance within this clade is 1.41%.
Three specimens of the type series E. kokshaaliensis, which are paratypes of E. kokschaaliensis (nos. ZISP-8277-1, ZISP 8277-2, ZISP 8277-3), form the second group: E. kokschaaliensis B (PP/BS/MP = 0.89/87/-). The resolution of their position is too low, and it would be impossible to assign them to any form until more molecular data have been obtained and studied.
E. kokshaaliensis from the type series of group A is easily recognizable by the characteristic of the mottled pattern of the dorsal surface of the body of ethanol-preserved specimens (Figures 3 and 4). As for the type specimens belonging to group B, they differ in a darker dorsolateral stripe. There may be slight differences in the number of scales around the midbody and number of subdigital lamellae on the fourth toe of the hindlimb, which is higher in E. kokshaaliensis from the type series of group A. It should be noted that these preliminary conclusions are not based on statistically reliable data, due to the high variability of the pholidosis characteristics and a limited number of specimens in the sample of E. kokshaaliensis group B. New material from the type locality of the E. kokshaaliensis group B is required to study the morphological variability and differences of these cryptic forms in more detail. The "-" sign shows that the node exists on the Bayesian tree, but the topology is different in the specified analysis. Numbers below the nodes represent decay indices for parsimony tree. Red labels highlight sequences from historical specimens.

Discussion
Our results showed that the studied samples of Eremias kokschaaliensis are divided into two groups. One group of E. kokschaaliensis A (paratype BSI no. R-000589, paratype no. ZISP R-8289, and one specimen no. R-3302 ZMMU from type territory) form a clade with Eremias specimens from Aksu (China) with high support (PP/BS/MP = 1/0.99/93); E. kokshaaliensis specimen no. R-8289 ZISP has a basal position within this group (PP/BS/MP = 0.98/0.88/52). The uncorrected within-group p-distance within this clade is 1.41%.
Three specimens of the type series E. kokshaaliensis, which are paratypes of E. kokschaaliensis (nos. ZISP-8277-1, ZISP 8277-2, ZISP 8277-3), form the second group: E. kokschaaliensis B (PP/BS/MP = 0.89/87/-). The resolution of their position is too low, and it would be impossible to assign them to any form until more molecular data have been obtained and studied.
E. kokshaaliensis from the type series of group A is easily recognizable by the characteristic of the mottled pattern of the dorsal surface of the body of ethanol-preserved specimens (Figures 3 and 4). As for the type specimens belonging to group B, they differ in a darker dorsolateral stripe. There may be slight differences in the number of scales around the midbody and number of subdigital lamellae on the fourth toe of the hindlimb, which is higher in E. kokshaaliensis from the type series of group A. It should be noted that these preliminary conclusions are not based on statistically reliable data, due to the high variability of the pholidosis characteristics and a limited number of specimens in the sample of E. kokshaaliensis group B. New material from the type locality of the E. kokshaaliensis group B is required to study the morphological variability and differences of these cryptic forms in more detail. The E. kokshaaliensis group A from the Sary-Dhaz River basin occurs mainly in rocky habitats with desert vegetation, along the rocky banks of rivers and slopes of the southern, southeastern and southwestern expositions. In the area of the village of Enylchek (2500 m above sea level), it occupies sandy terraces near the river and rough-stony lower slopes of the mountains. Here they are quite common, and in one hour of the tour, there can be from 18 to 30 individuals [13]. There is no available information about the habitat of E. kokshaaliensis group B from Kara-Teke.
It is currently not possible to discuss the phylogenetic relationships of E. kokshaaliensis and E. buechneri without newly collected specimens from Kashgaria and the results of DNA analysis.
Our DNA analysis confirmed the species designation of E. kokschaaliensis Eremchenko et Panfilov, 1999 and E. buechneri Bedriaga, 1907, which were previously described only on the basis of external morphology and partially on osteological and other characteristics. An important result is the revealing of the heterogeneity of the E. kokshaaliensis type series. The two phylogenetic groupings are geographically separated by about 300 km (Figure 1). The type of territory of the genotyped of E. kokshaaliensis group A coincides with the type of territory of the species, since the holotype: IBC BR 000580 originates from the same locality as the paratype BSI R000589: Kyrgyzstan, Settlement Sary-Dhaz. Thus, the phylogenetic line A corresponds to the species name given in the description, and the taxonomic status of the phylogenetic line B needs to be confirmed by new materials from the Kara Teke. The E. kokshaaliensis group A from the Sary-Dhaz River basin occurs mainly in rocky habitats with desert vegetation, along the rocky banks of rivers and slopes of the southern, southeastern and southwestern expositions. In the area of the village of Enylchek (2500 m above sea level), it occupies sandy terraces near the river and rough-stony lower slopes of the mountains. Here they are quite common, and in one hour of the tour, there can be from 18 to 30 individuals [13]. There is no available information about the habitat of E. kokshaaliensis group B from Kara-Teke.
It is currently not possible to discuss the phylogenetic relationships of E. kokshaaliensis and E. buechneri without newly collected specimens from Kashgaria and the results of DNA analysis.
Our DNA analysis confirmed the species designation of E. kokschaaliensis Eremchenko et Panfilov, 1999 and E. buechneri Bedriaga, 1907, which were previously described only on the basis of external morphology and partially on osteological and other characteristics. An important result is the revealing of the heterogeneity of the E. kokshaaliensis type series. The two phylogenetic groupings are geographically separated by about 300 km (Figure 1). The type of territory of the genotyped of E. kokshaaliensis group A coincides with the type of territory of the species, since the holotype: IBC BR 000580 originates from the same locality as the paratype BSI R000589: Kyrgyzstan, Settlement Sary-Dhaz. Thus, the phylogenetic line A corresponds to the species name given in the description, and the taxonomic status of the phylogenetic line B needs to be confirmed by new materials from the Kara Teke.

Conclusions
An important result of this study is our success in achieving the possibility of the molecular identification of historical specimens that were collected in 19th century (1880, 1881 and 1891). This novel DNA sequencing of historical specimens (ZISP) of E. kokschaaliensis reveals the heterogeneity of the type series and sheds light on the possibility of cryptic species within this group.