Not the Cryptic Species: Diversity of Hipposideros gentilis (Chiroptera: Hipposideridae) in Indochina

: We present here the result of phylogenetic analysis for Vietnamese Hipposideros gentilis specimens using 7 nuclear genes and one mitochondrial gene. The complex distribution of divergent mitochondrial DNA lineages contradicts, at least in part, nuclear and morphological data. The most likely explanation for this discordance is the historical hybridization between ancestral populations of H. gentilis and H. rotalis / H. khaokhouayensis . Our data supports the species status of H. gentilis , while only partially corroborating its previously proposed subspecies delimitation. We suggest the lowland forest populations from south Vietnam may correspond to their own subspecies. At the same time, the close phylogenetic relationship and morphological similarity of mountain forms from south and central Vietnam to the north Vietnamese populations make doubtful the subspecies status of H. gentilis sinensis .


Introduction
Hipposideros is the core genus to Hipposideridae (Old World leaf-nosed bats), a bat taxon prominent in Southeast Asia. It is also one of the largest mammalian genera, retaining a highly complicated taxonomy even after having lost two of its former species groups, namely "commersoni" and "cyclops" [1,2]. Within the genus, Hipposideros bicolor is a group that comprises the most forms. Many of them are similar in morphology, and their taxonomic placement is often uncertain. In the last decade, significant efforts were dedicated to solving this puzzle (e.g., [3][4][5]).
One of the common Southeast Asian species of leaf-nosed bats from the "bicolor" group was widely referred to as Hipposideros pomona by Andersen in 1918 [6][7][8]. Morphologically, this form was once treated as a subspecies or group of subspecies within H. bicolor (Temminck, 1834) [9]. Thereafter H. pomona was considered a distinct species, including the forms gentilis (Andersen, 1918) and sinensis (Andersen, 1918) as partial synonyms [10]. Paracoelops megalotis (Dorst, 1947) was initially described as a separate species and genus from central Vietnam [11]. However, a review of the single type specimen concluded that it belongs to H. pomona [12]. Molecular studies have revealed that several geographically limited haplogroups are present in H. pomona populations. Distances between these haplogroups are high enough to assume species-level divergence. Moreover, the monophyly of H. pomona was questioned [13,14]. In the Indochina peninsula, four main lineages have been discovered, namely northern, central, and two in the south. One of the southern clades takes a sister position to the lineage which includes H. khaokhouayensis (Guillen-Servent, Francis, 2006) and H. rotalis (Francis, Kock, Habersetzer, 1999) [13]. Some of these haplogroups were treated as subspecies in an earlier study of H. pomona genetic variability in China [5]. The clades SM and SY were assigned to the subspecies sinensis and gentilis, respectively, with no suggestions for the "central" clade (SH). The southern haplogroups were not discussed in that paper.
On the other hand, the shape of baculum (os penis) is very similar in different genetic lineages of H. pomona [15] (see also [16]). The baculum is greatly reduced and stick-like, about 0.5 mm in length. It is strikingly different from that of H. bicolor, H. cineraceus (Blyth, 1853) and other related species [4]. At the same time H. pomona s. str. from southwest India is remarkably distinct from all other populations in bacular and cranial morphology [17]. Hipposideros gentilis is the next senior name, and in this case should be accepted as valid for all populations under current study.
Similarity in baculum shape, as well as the absence of obvious differences between populations in the skull morphometry and the structure of the nasal leaves (orig., unpubl.), cast a shadow of doubt upon the putative species status of the H. gentilis haplogroups. Here, we apply a multi-locus analysis of nuclear genes to address this subject.

Analyzed Specimens
Our research is based on the Zoological Museum of Moscow State University (ZMMU) collections. Tissue samples (muscle fragments) were taken from specimens preserved in ethanol. In total, 14 specimens of H. gentilis (H. "pomona") were used, together with a few other species of the genus Hipposideros. Aselliscus stoliczkanus was taken as outgroup. A full list of specimens and obtained sequences is provided in Tables 1 and 2. Also, additional sequences of cytb and four nuclear genes were downloaded from GenBank (see Table 2 and Appendix A).

Morphometric Analysis.
To understand how the observed genetic diversity is reflected in morphology, we performed a morphometric analysis using cranial and dental measurements.
A total of 102 specimens of the Hipposideros pomona/gentilis complex (dry or alcohol preserved skins with extracted skulls) were examined for morphometric comparison. We excluded several specimens due to missing measurements. Only 99 were included in the final analysis. The full list of these specimens is provided in Appendix C. Localities are shown in Figure 1 [18]). Localities of morphologically studied material are marked with full circles; localities for original genetic samples (cytb and nuclear genes) are marked with empty circles. Type localities of named forms are numbered as follows: 1-gentilis Andersen, 2-sinensis Andersen, 3-megalotis Dorst. Cranial and dental measurements were made under stereo microscope with digital calipers, rounding to the nearest 0.01 mm. The following cranial measurements were taken: greatest skull length (TL), condylo-canine length (CCL), skull width at the mastoid (MW), brain case width above the mastoids (BCW), occiput height (height from the lower margins of the occipital condyles to the highest point just above them; OH), zygomatic width (ZW), width of the postorbital constriction (POC), rostral width at the level of the anterorbital foramina (RW), length of rostrum in front of the anterorbital foramen (RL), width across the upper canines (CC), width across posterior upper molars (MM), length of the upper tooth row (CM), length of the upper molariform row, distance from P 4 to the posterior molar (PM), longitudinal length of the upper canine (C), width of nasal opening (NO), lower tooth row length (cm), articular length of the mandible (MdL), and height of the mandible (MdH).
To assess the variation pattern of quantitative characters, Principal Component (PC) and Discriminant Function (DF) analyses were performed for the 20 craniodental measurements, using STATISTICA for Windows version 9.0 (StatSoft, currently TIBCO Software Inc., Palo Alto, CA, USA). DF analysis was used to calculate squared Mahalanobis distances between groups and the significance of inter-group differences. The training set for calculating the squared Mahalanobis distances for the DF analysis included six samples, namely "Myanmar", "Central Thailand", "NW Indochina", "NE Indochina and SE China", "C Indochina" and "S Indochina". The third and fourth training sets were selected based roughly on the two haplogroup (SY and SM) ranges as they are described in [5]. Indian material, due to low specimens' availability, was not treated as a training set, but was included in the analysis as "unidentified".
We were not able to genotype all specimens used in the morphometric study. However, all geographic samples from Indochina used in both the PS and DF analyzes include specimens genotyped for one or both mitochondrial genes and for nuclear genes (see Appendix C and appropriate Figures).

Molecular Analysis
Genomic DNA was extracted from ethanol-preserved muscle tissue by standard method of phenol-chloroform deproteinization [19]. One mitochondrial gene cytochrome b (cytb; 1137 bp) and fragments of seven nuclear genes (ABHD11, 460 bp; ACOX2, 597 bp; COPS, 744 bp; RAG2, 1035 bp; ROGDI2, 509 bp; SORBS2, 569 bp; and THY, 565 bp) were sequenced. Primers were taken from previously published papers [20][21][22][23]. The thermal profile amplification reaction included 35 cycles and was set up as follows: denaturation for 30 s at 94 • C, annealing for 1 min (at 55 • C for cytb and 60 • C for the nuclear genes), and elongation for 1 min at 72 • C. Predenaturation lasted 3 min at 94 • C, and the final elongation was 6 min at 72 • C. Automatic sequencing was performed on an ABI PRISM 3500xl sequencer (Applied Biosystems (ABI), Thermo Fisher Scientific, Waltham, Massachusetts, United States) using ABI PRISM ® BigDyeTM Terminator v. 3.1 reagent kits (Applied Biosystems, United States) in the laboratory of Eurogen Joint Stok Company (Moscow, Russia). To obtain additional mitochondrial topology, a total of 117 specimens, including 95 sequences of H. gentilis and seven other Asian Hipposideros species were analyzed for patterns of genetic divergence in the COI region. These data were taken from publicly accessible projects housed by the Barcode of Life Data System (BOLD, Guelph, Ontario, Canada) [24]. A list of BOLD Process ID numbers is provided in Appendix B.
Obtained sequences were aligned by the MAFFT v7.307 algorithm [25], then trimmed and corrected manually, after which heterozygous sites were marked by IUPAC ambiguity symbols for tree construction and phased in DNA 6.12.03 [26] for network building. Indels were replaced by "N." The reconstruction of phylogenetic trees was carried out using maximum likelihood (ML) in IQ-TREE v2.0.6 [27,28]. The built-in algorithm was applied to select optimal models and partitioning scheme [29]. Bayesian analysis (BA) was also performed using the MrBayes 3.2.6 program [30][31][32]. Partitioning schemes and models for BA were determined in Partition-Finder v. 2.1.1 [33] using a BIC criterion with a greedy search method. The initial partition scheme included separate partitions for each gene. For mitochondrial genes, codon positions were used as the initial partitioning scheme.
Supertree was computed using the MRP method (matrix representation with parsimony) [34][35][36]. Using the alignment of individual nuclear genes, bootstrap samples of 1000 ML trees were constructed. The resulting topologies were combined using the maximum parsimony method, implemented in the phytools v0.7.70 [37] package in the R v 3.5.2 [38] environment. The reliability of the topology was calculated by applying a bootstrap procedure to total tree sample.

Pattern of Variation in the mtDNA Marker
Phylogeny reconstructed from cytb sequences mostly corroborates previously published COI gene data [13]. Basal nodes and relationships between different species groups have low support. The monophyly of the species groups generally is well supported. Topologies of Bayes and ML trees didn't contradict each other at highly supported nodes (Figure 2). Hipposideros gentilis is non-monophyletic by this gene and is divided into two wellsupported clades (Figure 2, Figure S1; Table 3, Table S1). The first clade is by itself paraphyletic with respect to the H. khaokhouayensis/rotalis lineage and includes samples from the southern lowland forests of Vietnam (Dong Nai, Tuy Phong, and Bihn Chau provinces) and from the southern part of the Vietnamese central highlands (Gia Lai province). Within the second clade of H. gentilis, specimens from Vietnam, Laos and China are represented. The Haplogroup from the northern half of central Vietnam forms a sister clade to the animals from Laos, Hainan (SH clade in [5]) and one specimen from Yunnan province, China. Animals from north Vietnam (Gulf of Tonkin) are grouped with those from Guangdong (SM clade). Such relations somewhat contradict the results of [5] and suggest a possible overlap in distribution of haplogroups. Both the central Vietnam and Gulf of Tonkin clades are monophyletic with high support.

Pattern of Variation in nuDNA Markers
Analyses of individual nuclear genes mainly demonstrate insufficient phylogenetic signal ( Figure S2, Tables S2, S3). Three of seven genes (RAG2, ROGDI, THY) yielded monophyletic clades for the analyzed samples of H. gentilis. However, only the THY gene had a good bootstrap support value for this branch (87%). It is unclear which of the two species (H. ater, H. cineraceus) is closest to H. gentilis. The other three genes (ABHD11, ACOX2, SORBS2) did not support H. gentilis monophyly, and we placed H. khaokhouayensis, H. bicolor or H. cineraceus respectively within this clade. Not enough sequences were obtained for the COPS gene to draw any meaningful conclusions.
Low diversity levels and dataset incompleteness precluded reliable distance computation, although the substitution count between geographic samples of H. gentilis is generally lower than those between species in the "bicolor" group. We also found heterozygous SNPs in sites which bear differences between the three Vietnamese H. gentilis lineages.
These data definitely contradict to the results obtained from mitochondrial genes and generally indicate closer relationships of H. gentilis populations. However, the tree built with the MRP algorithm on the combined data for all the seven nuclear genes supports the monophyly of H. gentilis and provides well-resolved topology within its clade (Figure 3). The position of this clade relative to H. bicolor, H. ater and H. khaokhouayensis is unresolved, and H. cineraceus is a sister branch to all of the above. Contrary to mitochondrial data, animals from the south Vietnam lowlands form their own branch, which is the most basal within the H. gentilis radiation.

Patterns of Morphological Variation
The morphometric data are moderately factorized: the total contribution of the first four factors to the total variance is about 70%, which means relatively low overall variation of cranial proportions in H. pomona s. lato. In the space of the first two factors, most geographic samples overlap highly, except for the Indian specimens, representing H. pomona s.str. (Figure 4). This agrees with the species-level dissimilarity between H. pomona and H. gentilis. The third factor greatly reduces the overlap between the samples from northern Indochina (both western and eastern) and the other analyzed samples. Then, the fourth factor separates the sample from southern Indochina from all the others ( Figure 5). The most significant contribution to the first PC was made by CM, PM, C and cm variables (i.e., by the tooth row length); by MW, BCW, and CCL (skull width and overall size) to the second; and by CC and, to a lesser extent, other measurements related to skull width and mandible size (Table 4) to the fourth.  The results of the discriminant function analysis clearly show the high significance of the differences among all the training sets, with one exception: animals from western and eastern north Indochina are separated insignificantly, suggesting their taxonomic equality ( Figure 6; Table 5). Surprisingly, the specimens set from Myanmar turned out to be one of the most distinct. The training set from southern Indochina is well-separated from all the rest by the values of the first two canonical variables. It shows some degree of similarity with the set from central Thailand and overlaps slightly with it, but not the others. The specimens from the Dalat Plateau and Gia Lai province, which are similar in mitochondrial sequences to the south Vietnamese ones, show similarities with different training sets (more often with central Indochina), but not with southern Indochina.  Figure 6. Bivariate scatter plot for the first and second canonical variances of a Discriminant Function analysis based on 18 cranial and dental measurements of six training sets of Hipposideros pomona s. lato. All Indian specimens were included as "undetermined." Specimens marked by closed diamonds were genotyped with at least one mitochondrial gene.

Discussion
Taxonomical issues regarding the status and delimitation of morphologically similar tropical forms are always quite complicated and cannot be solved by considering one particular trait alone. According to our results, a simultaneous analysis of a few (here seven) nuclear genes can provide a highly supported and reliable tree topology that cannot be obtained from any of the markers separately. The resulting phylogenetic relationships within the H. gentilis clade and between this species and other congenerics differs substantially from the previously known mtDNA tree topology. The first remarkable feature of the mitochondrial phylogeny is paraphyletic position of the different H. gentilis lineages, which suggests the assignment of species status to at least some of them. However, our nuclear data definitely support the monophyly of H. gentilis, which diminishes the likelihood of finding cryptic taxa within this complex. Such discordance may imply historical introgression events.
Specimens from the mountain and lowland regions of southern Vietnam arranged themselves into distinct, but related, haplogroups (also related to the H. khaokhouayensis/rotalis lineage) in the mtDNA analysis. However, the nuDNA data is more likely to group animals from different parts of the Vietnamese mountain areas (Dalat plateau, Gialai-Kontum plateaus, and the northern part of central Vietnam) and from northern Vietnam together, suggesting a common origin of these populations. At the same time, the southern lowland branch has full support and is a sister clade to all other H. gentilis specimens. Given such topology discordance, we may suppose that the H. gentilis populations of southern Vietnam once obtained mtDNA from the common ancestor of the H. khaokhouayensis/rotalis lineage (with which they have no direct contact in present). It is difficult to pinpoint the exact time frame of these introgressive events, but they might have happened approximately on the late Pliocene and early Pleistocene boundary (based on the estimated divergence time between H. gentilis and H. khaokhouayensis: [1]). Noteworthily, the H. gentilis population co-occurring with H. khaokhouayensis in sympatry on Catba island does not show any traces of such introgression. One may suggest other scenarios leading to the discordance between mitochondrial and nuclear data, for example, ancestral polymorphism, but they seem less likely to us than introgression. The verification of one hypothesis or another requires more material than we currently have, and, probably the employment of additional markers (e.g., microsatellites).
Hipposideros gentilis was described from the Thayet (Thayetmyo) district of central Myanmar (Burma) [9,38]. Since it was for long time considered a subspecies of H. pomona, all specimens from Myanmar, northeast India, and Thailand were allocated to this form [4,39]. The occurrence of the form gentilis in Yunnan and northern Laos was also presumed, mainly from the molecular studies [5].
The form sinensis was described from Fujian as a subspecies of H. gentilis [4,40]. According to a published study [5] and our original results, this name could be applied to animals from southeastern China and, at least partially, to this species in north Vietnam (e.g., Gulf of Tonkin). Both lineages are quite close to each other, which also agrees with their morphological similarity. It should be noted, however, that the genetically studied material presented in [5] was collected in northeastern Myanmar, far from the terra typica of H. gentilis. Specimens which belong to the lines SY and SM (sensu Zhao et al. [5]) are morphologically similar to each other, at the same time the morphologically distinct series from the vicinity of Mandalay demonstrates some difference in skull morphometry ( Table 3). The type territory of H. gentilis is located even further south (Figure 1). Thus, the interpretation of the SY line as gentilis s. str. can be questioned. Perhaps the entire "northern" clade (from south China, north Vietnam, Laos, and the adjoining parts of Myanmar) should be designated as H. g. sinensis. The relationship between the later and the nominotypical form requires further clarification.
Animals from central Vietnam (SH clade in [5]) could be regarded as a distinct subspecies, though their distribution limits and relationships with the northern mitochondrial lineages require clarification through more extensive material. The only other named form, undoubtedly associated with H. gentilis, is megalotis, which is described from the vicinity of Vinh [11,12]. However, it cannot be used as a valid for this putative subspecies since it represents a junior homonym to the African Hipposideros megalotis (Heuglin, 1861).
We can conclude from our data that Hipposideros gentilis from the lowland forests of southern Vietnam (and probably from the adjacent areas of Cambodia) is quite a distinct form, both genetically and morphologically. It could be, according to nuclear markers, a sister group for all other H. gentilis populations. Meanwhile, no named forms are associated with the "H. pomona" (H. gentilis in current interpretation) populations from the southern part of the species range [41]. This suggests that the "southern" H. gentilis is an undescribed taxon. Its status is not entirely clear, given the common origin and probable past hybridization with other genetic lineages attributed to H. gentilis; and it requires further, more comprehensive, study. Therefore, here we refrain from its formal description.
On the whole, we can conclude that H. gentilis is, contrary to the previous opinion, a monophyletic taxon, presumably divided into two or three subspecies. Its morphological diversity is to some extent in agreement with the obtained genetic lineages, but the decision regarding the taxonomic rank of these lineages can vary greatly depending on the dataset used (mitochondrial or nuclear markers).

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/ 10.3390/d13050218/s1, Table S1: Pairwise distances calculated in cytb gene analysis, Figure S1: Phylogenetic relationships of selected Hipposideros species inferred from cytb sequences, Figure S2: Median-joining networks showing the relationships among the alleles of the individual nuclear genes of Hipposideros gentilis and some related forms, Table S2: Correspondence between specimen IDs and haplotypes as they designated on Figure S1, Table S3: Correspondence between nodes of the networks on Figure S1 and additional haplotypes, belonging to the same nodes. Funding: Study of collections was supported by the Russian Foundation for Basic . The whole study was performed in line with the stated theme of scientific work of the ZMMU ("Taxonomic and chorological analysis of the animal world, as a ground for study and conservation of the biological diversity", 121032300105-0).

Informed Consent Statement: Not applicable.
Data Availability Statement: Genetic data: data available in a publicly accessible repository: Gen-Bank and BOLD (see numbers in Table 2 and Appendix B); morphometric data: data available on request from the corresponding author due to privacy reasons.