Species Differentiation of Chinese Mollitrichosiphum (Aphididae: Greenideinae) Driven by Geographical Isolation and Host Plant Acquirement

The impact of both the uplift of the Qinghai-Tibetan Plateau (QTP) and the separation of the Taiwan and Hainan Islands on the evolution of the fauna and flora in adjacent regions has been a topic of considerable interest. Mollitrichosiphum is a polyphagous insect group with a wide range of host plants (14 families) and distributions restricted to Southeast Asia. Based on the mitochondrial Cytochrome C Oxidase Subunit I (COI) and Cytochrome b (Cytb) genes, the nuclear elongation factor-1α (EF-1α) gene, and the detailed distribution and host plant data, we investigated the species differentiation modes of the Chinese Mollitrichosiphum species. Phylogenetic analyses supported the monophyly of Mollitrichosiphum. The divergence time of Mollitrichosiphum tenuicorpus (c. 11.0 mya (million years ago)), Mollitrichosiphum nandii and Mollitrichosiphum montanum (c. 10.6 mya) was within the time frame of the uplift of the QTP. Additionally, basal species mainly fed on Fagaceae, while species that fed on multiple plants diverged considerably later. Ancestral state reconstruction suggests that Fagaceae may be the first acquired host, and the acquisition of new hosts and the expansion of host range may have promoted species differentiation within this genus. Overall, it can be concluded that geographical isolation and the expansion of the host plant range may be the main factors driving species differentiation of Mollitrichosiphum.


Introduction
Southern China is characterized by a complex topography and represents the major biodiversity hotspots [1]. This region includes several mountain chains (i.e., the Wuyi, Nanling and Hengduan Mountains) and two islands (Taiwan and Hainan). The uplift of the Qinghai-Tibetan Plateau (QTP), which strongly influenced the environment of the plateau and its surrounding areas, as well as the climate in Asia and across the globe, was one of the most important events in the Earth's geological history [2][3][4]. The recent violent uplifts of the QTP during the Quaternary initiated the formation of mountain and valley glaciers in many areas [2,5,6], both the Taiwan and Hainan Islands were repeatedly connected and separated with the mainland following the glacial and interglacial cycle over the past million years [7]. It has been suggested that these events greatly influenced the biogeography of the fauna and flora in this region and the surrounding areas [7][8][9].
Studies based on distribution data have indicated that the Hengduan Mountains, north of the Sichuan and Yunnan Province, and the Hainan and Taiwan Islands are diversity centers for birds, amphibians, insects and plants [10][11][12][13][14][15]. These studies, which are based on distribution records, described the primary spatial diversity patterns in southern China. However, research based on phylogeny and a time-calibrated framework is required to understand the evolutionary history of the species diversity in this region. To date, some phylogeographical studies have been performed in southern China, especially focused on plants [16][17][18], mammals [19], birds [20][21][22] and reptiles [23]. However, few of these reports discussed the biogeographical patterns beyond the species level and across a wide geographical range.
The aphid genus Mollitrichosiphum (Aphididae, Greenideinae) was described by Suenaga in 1934, based the description on the numerous transverse ridges on the hind tibiae, six elongated antennal segments and a pointed ultimate rostral segment. Mollitrichosiphum contains 18 known species, based on the different lengths and distributions of the setae on the antenna and the shape of the radial sector. This genus is divided into two subgenera, namely, Mollitrichosiphum and Metatrichosiphon [24,25]. The hosts of this genus belong to the plant families Fagaceae and Betulaceae, as well as to 12 other families ( ). These patterns provide the opportunity to determine the relative effects of geographical isolation caused by both the uplift of the QTP and the separation of the islands on the differentiation of these species.
Previous studies of this genus were alpha-taxonomic, and species descriptions based on morphology [25][26][27][28][29]. Zhang et al. [30] primarily investigated the phylogenetic relationships of seven Chinese Mollitrichosiphum species based on the mitochondrial Cytochrome C Oxidase Subunit I (COI) and Cytochrome b (Cytb) genes. Their results indicated that the widespread species, M. tenuicorpus, may consist of three cryptic species, which are isolated by the Tibetan Plateau and Lancang River. However, these results require further investigation with additional data to elucidate the relationship between the species of this genus and the effects of host plants on species divergence.
Based on the geographic distribution and host plant use data of Mollitrichosiphum, we hypothesized that both geographical isolation and host plant diversity may have played important roles in species differentiation of Mollitrichosiphum in southern China. To test this hypothesis, we used extensive sampling and combined mitochondrial (COI, Cytb) and nuclear (elongation factor-1α, EF-1α) data to reconstruct the phylogenetic relationships of the Chinese Mollitrichosiphum species. We also estimated the divergence times of Mollitrichosiphum using a Bayesian approach to evaluate the potential relationship between species divergence, geological events and host plant evolution.

Phylogenetic Analyses
We used mitochondrial gene COI, Cytb, and nuclear gene EF-1α to reconstruct the phylogenetic relationships of Mollitrichosiphum. All sequences were submitted to GenBank (see Table 2 for accession numbers). The detailed characters of the individual sequences and the combined dataset are listed in Table 3. The monophyly of the sampled Mollitrichosiphum species was supported by all phylogenetic analyses. The results based on the combined dataset of COI and Cytb were similar to that of a previous study [30] (data not shown here). Phylogenetic trees based on EF-1α sequences and the combined dataset of the EF-1α and mitochondrial genes are shown in Figure 1. In the phylogenetic analyses based on the EF-1α gene, four representatives of Thelaxinae, Cervaphidini and Greenideini were used to root the trees. Maximum parsimony (MP), maximum likelihood (ML) and Bayesian (BI) analyses yielded congruent trees, and the tree obtained from the ML analysis is shown ( Figure 1A). The monophyly of the genus Mollitichosiphum was highly supported (MP/ML/BI: 82/93/1.00). Three samples of Greenideini, Cervaphidini and Thelaxinae were used as outgroups in the Bayesian and ML analyses of the combined dataset. The combined dataset also supported the monophyly of this genus (ML/BI: 80/1.00) and of most species ( Figure 1B In the phylogenetic analyses based on the combined dataset, the samples of M. tenuicorpus were subdivided into three major clades ( Figure 1B), which was also supported by the results obtained with EF-1α to some extent ( Figure 1A).

Divergence Time and Geographical Distribution
The divergence times were estimated based on the mitochondrial and nuclear data separately, and the mean age estimates of most nodes were largely consistent. The molecular dating results indicate that clade A within M. tenuicorpus diverged at approximately 11.0 mya (95% Highest Posterior Density, HPD: 5.5-16.4 mya) (Figure 2), while clades B and C diverged at 8.   Figure 1A. Pie charts indicate the relative likelihoods at respective nodes. Terminal taxa are color-coded for the state of host use. The black part in the pie represents probability lower than 5%.

Ancestral State Reconstruction
According to collection records of specimens and published reliable literature [25,27,31], we summarized host plants of other 18 species in Mollitrichosiphum. Except for the host plant of M. kazirangi still unknown, the hosts of all other species related to 14 families were listed in Table 1

Phylogenetic Relationships
The monophyly of the genus Mollitrichosiphum was supported by phylogenetic analyses based on different approaches and genes ( Figure 1). The phylogenetic trees of the combined dataset supported the division of the two subgenera (Mollitrichosiphum and Metatrichosiphon) based on morphological characters [24,25]. The molecular phylogenetic relationships between the species were also supported by morphological characters, such as the ultimate rostral segments of M. luchuanum, M. rhusae and M. nigrum, which are at least twice the length of the second hind tarsal segment, while the ultimate rostral segment of M. nandii and M. montanum are not more than twice the length of the second hind tarsal segment.
The phylogenetic placement of M. nigrofasciatum with respect to the other species remains undetermined, as the results based on different datasets were inconclusive. A possible explanation for the incongruent patterns of M. nigrofasciatum is incomplete lineage sorting during successive speciation events. However, more data and further study are needed to verify the real reasons that contribute to these results. Thus, additional markers are required for future studies to reveal the position of M. nigrofasciatum. Analysis of the EF-1α gene yielded a paraphyletic clade that included M. luchuanum and M. rhusae, while the combined datasets resulted in independent clades of the two species. This finding may be explained by the differential effectiveness of the genes. The nuclear EF-1α gene is usually used as a marker to investigate the phylogenetic relationship of aphids above the genus level, due to its slow rate of evolution, whereas the mitochondrial genes are more suitable for revealing the relationships at intra-and interspecies levels [32]. M. luchuanum and M. rhusae may have diverged recently, therefore, only mitochondrial genes may provide phylogenetic signals.

Divergence Times and Biogeography
Southern China, which appears to be a center of diversity for Mollitrichosiphum, has experienced great topographical and climatic changes during the past million years. Estimated divergence times of this genus provide a reasonable temporal framework to reconstruct historical patterns of species divergence and distribution. Despite a degree of controversy regarding the exact timing of the uplift of the Tibetan Plateau [33], geological estimates date the period of rapid uplift unequivocally within the Late Tertiary/mid-Miocene [34], and episodes of uplift probably continued throughout the Late Pliocene (c. 3 mya) and into the Quaternary [5,35]. Clade A within M. tenuicorpus (Figure 2) is only currently distributed throughout the QTP, for which the divergence time was at approximately 11.0 mya, thereby falling within the time frame of the mid-Miocene uplift of the QTP. The clade including M. nandii and M. montanum was restricted to the Hengduan Mountains, which are located in the southeastern region of the QTP, and diverged from other species at approximately 10.6 mya (HPD: 6.2-15.1 mya). This result indicates that the distribution of these two species was probably also influenced by the uplift of the QTP.
The divergence of clades B and C within M. tenuicorpus was estimated at approximately 8.2 mya (Figure 2). These two clades are distributed east or west side of the Lancang River [30], which concurs with the long-held notion that the "Mekong-Salween Divide" (MSD) acted as a significant historical and geographical barrier to species dispersal [18,36,37]. The rivers located at the southeast margin of the Tibetan Plateau, the Dadu River, Mekong (Lancang River), Salween (Nu River), and Tsangpo-Brahmaputra (Yalu-Tsangpo River), were once tributaries of a single southward flowing river system, the Paleo-Honghe (Red River). The uplift of the Tibetan Plateau dramatically changed the characteristics of the major river drainages [38]. Time scales of formation of the MSD have rarely been estimated [18], thereby providing little insight into whether there is a direct link between the divergence of the B and C clades with the formation of the MSD. However, these historical events may have isolated the populations of M. tenuicorpus and stimulated differentiation among the populations.
M. rhusae and Mollitrichosiphum sp. are restrictedly distributed in the Hainan Island and Taiwan Island, respectively. The MRCAs of these two species are estimated at approximately 3.9 mya and 15.8 mya, respectively. Based on the results of geological studies, Hainan Island was first separated from the mainland of China during the Early Pleistocene (c. 2 mya) [7], and Taiwan Island was separated approximately 5 mya [39]. The divergence times show that the MRCAs of M. rhusae and Mollitrichosiphum sp. originated much earlier than the separation of the two islands. These results indicate that the geographical isolation of the two islands was probably not the primary factor triggering the differentiation of the two species. However, the isolation may have limited the subsequent species dispersal between the islands and mainland and thereby promoted the present distribution pattern. Although both islands were connected to the mainland when the sea level retreated during the Pleistocene, it has been suggested that the flora covering the emerged areas during the glaciations were poorly developed [40] and not suitable for aphid species dependent on woody host plants.
Fossils of Mollitrichosiphum have been found exclusively in Europe, while all extant species of this genus are currently restricted to Southeast Asia [25,41]. Peñalver et al. [42] demonstrated that the geographic ranges of Greenideinae were markedly wider during the Miocene and comprised the northern coasts of the Tethys Sea. During the Lower and Middle Miocene, the southern region of Europe was characterized by a hot subtropical climate. The terrestrial plant communities of Europe were dominated by needle-leaved forests, Cupressaceae species, and some broadleaf forest species [41]. Later geological events, such as the uplift of the Tibetan Plateau, affected both the climatic conditions of southern Europe and the composition of the flora [43]. During this period, a suitable climate and host plants for Mollitrichosiphum disappeared [44,45]. However, the climate in Southeast Asia remained unchanged. This may have determined the current distribution pattern of Mollitrichosiphum in southeastern Asia, including southern China.

Species Differentiation of Chinese Mollitrichosiphum
Speciation processes are generally driven by the advent of barriers to gene flow between interbreeding populations [46,47]. The life history of many phytophagous insects often depends on specific host plants. Differences in host plant preference can result in reproductive isolation [48]. For example, studies of the pea aphid Acyrthosiphon pisum demonstrated that species differentiation occurred among different populations that fed on pea, clover and alfalfa [49,50].
Phylogenetic analyses show that Mollitrichosiphum sp. and M. nigrofasciatum are two basal clades in the subgenus Metatrichosiphon; together with M. tenuicorpus, these three species mainly feed on Fagaceae. The divergence times of these species were 15.8 mya, 10.6 mya and 18.4 mya, respectively (Figure 2), which are much earlier than the other species. In contrast, the host plants of a younger clade (with an age of 7.7 mya) including M. nigrum, M. rhusae and M. luchuanum are distinct (Table 1), and each of the three aphid species has hosts belonging to at least two plant families. Similarly, the sister species M. nandii and M. montanum with divergence times of 3.5 mya and 1.5 mya, are associated with host plants belonging to three and four families, respectively. It seems that these species have acquired a wider range of host plants during their evolutionary history.
The phylogenetic pattern of host association shows that basal species much more exclusively feed on Fagaceae, which is a common host plant for most species. Results of ancestral state reconstruction also supported the fact that Fagaceae may be the ancestral host plant of Mollitrichosiphum. Notably for the subgenus Metatrichosiphon, the acquisition of new hosts and the expansion of the host range might be the main factor driving species differentiation. Of the species in the subgenus Mollitrichosiphum, M. tenuicorpus is widely distributed in Southeast Asia and southern China, while the other three species are restrictedly distributed in Nepal, the northeastern region of India (Sikkim), and the southern edge of the QTP [25,27,31]. The divergence of the three clades within M. tenuicorpus ( Figure 2) may be due to the combined effects of geographical isolation (as discussed above) and the differentiation of host plants (clade A on Betulaceae, clade B on Sabiaceae and Fagaceae, and clade C on Fagaceae). However, it seems that geographical isolation probably represents the key factor promoting the species divergence of the subgenus Mollitrichosiphum.

Taxa Sampling
We sampled the Chinese Mollitrichosiphum species except for four species with very restricted distributions (M. glaucae, Hong Kong; M. niitakaensis, and M. taiwanum, Taiwan; M. yamabiwae, Taiwan, Hong Kong and Fujian). Mollitrichosiphum sp. is an undescribed species found in Taiwan Island (unpublished data), and can be distinguished from all other species of Mollitrichosiphum by an enlarged distal half of siphunculi. Specimens for slide-mounting were stored in 75% ethanol for morphological examination. Samples for molecular experiments were stored in 95% ethanol. All the samples were preserved at −30 °C. On the basis of current knowledge of the phylogenetic relationships within Aphididae [51,52], outgroups were chosen from Cervaphis van der Goot (Cervaphidini), Eutrichosiphum Essig & Kuwana (Greenideini), Greenidea Schouteden (Greenideini) and Kurisakia Takahashi (Thelaxinae), respectively. All voucher specimens were deposited in the National Zoological Museum of China (NZMC), Institute of Zoology, Chinese Academy of Sciences (Beijing). Collection information for all samples, including collection localities, host plants and collection dates are listed in Table 2.

Sequence Alignment and Phylogenetic Analyses
The sequences were assembled using Seqman II (DNASTAR) and assessed manually. Multiple alignments were performed using ClustalX 1.81 [57]. Nucleotide composition, conserved sites, variable sites, and parsimony informative sites were calculated using MEGA 4 [58]. For the EF-1α gene, we used the coding regions in the phylogenetic analyses. The intronic sequences were removed using the GT-AG rule and by comparing the sequences with the cDNA sequence of Mollitrichosiphum montanum (GenBank accession number: DQ493826). The combined dataset of the COI, Cytb and EF-1α genes was evaluated with the partition homogeneity test implemented with PAUP* 4.10b [59] using random taxon addition (10 replicates), tree bisection-reconnection branch swapping, and heuristic searches with 100 random repartitions of the data.
Phylogenetic analyses were conducted in PAUP* 4.10b and MrBayes 3.1.2 [60]. Maximum parsimony (MP) analysis was conducted for all datasets independently, EF-1α and the combined datasets of COI + Cytb and COI + Cytb + EF-1α, using a heuristic search with 1000 random sequence repetitions and tree-bisection-reconnection (TBR) branch swapping. Consensus trees (50% majority rule) were obtained if more than one equally maximum parsimonious tree was found. The reliability of the MP trees was tested using the bootstrap approach [61] with 1000 pseudoreplicates using the heuristic search strategy and 100 random additions of sequences in each pseudoreplicate. Modeltest 3.7 [62] was used to determine the best-fit nucleotide substitution model for maximum likelihood (ML) analyses. The best models selected according to the Akaike Information Criterion (AIC) [63] for the combined dataset of mitochondrial genes and EF-1α, the combined mitochondrial dataset, and EF-1α were GTR + I + G, GTR + G and GTR + G, respectively. ML analysis was performed using PHYML [64], and nodal support among branches was evaluated by bootstrap analysis. Bayesian analysis was performed separately on the EF-1α gene and the combined dataset of three genes using the models selected by Modeltest 3.7. The combined dataset was partitioned into mitochondrial versus nuclear sequences and with separate models. Four chains were run starting from a random tree and proceeded for one million Markov chain Monte Carlo generations with tree sampling every 100 generations. Four independent runs were conducted to verify the results. The first 2500 trees (25% of the total) were discarded as burn-in. Next, 50% majority-rule consensus trees were generated from the remaining trees. The examination of the log-likelihood scores and the average standard deviation of the split frequencies suggested that the burn-in periods were long enough for chains to become stationary.
Results of the partition-homogeneity test indicated that nuclear versus mitochondrial alignments were incongruent (P = 0.01). However, because each gene provided similar tree topology, we still performed phylogenetic analyses based on the combined dataset. For Bayesian analysis, partitions were made for each gene separately using their own parameters derived from Modeltest.

Divergence Time Estimation and Ancestral State Reconstruction
To date, the only fossil species of this genus was Mollitrichosiphum rubusensis [41], which was found in Europe and dates back to 18-19 mya (million years ago). This time was used as a calibration point representing the minimum age of the most recent common ancestors (MRCAs) of this genus in our study. The divergence times among species were estimated using BEAST v1.6.1 [65] under a relaxed molecular clock and uncorrelated lognormal model. The GTR model of nucleotide substitution with gamma rate heterogeneity among sites and a proportion of invariant sites and Yule speciation process were used.
After determining the optimal parameters over several preliminary runs, multiple analyses were run for 10 million generations, and 25% of samples were removed as burn-in. The results were visualized in Tracer v1.5 [66], where the stationarity was assessed by ensuring that the Effective Sample Size (ESS) values were high for most parameters. The results proved repeatable over multiple independent runs, and thus a chronogram was reconstructed based on the results from a representative single run and was subsequently visualized using the program Fig Tree v1.3.1 [67].
Host plants of the genus Mollitrichosiphum were reconstructed according to a Bayesian criterion, using RASP 2.0 (Reconstruct Ancestal State in Phylogenies) software to implements Bayesian Binary MCMC (BBM). BBM offers a statistical procedure for inferring states, including geographic distributions, at ancestral nodes using a full hierarchical Bayesian approach [68]. For reconstruction of the ancestral state of host plants, each possible host plant of the species was coded based on the previous plants they fed on.

Conclusions
The monophyly of Mollitrichosiphum and most inter-species phylogenetic relationships were supported by molecular data as well as morphological characters. The sister relationships of M. nandii and M. montanum, M. luchuanum, M. rhusae with M. nigrum were supported in all analyses. Based on the divergence times, distribution and host plant information of the species, we speculate that geographical isolation due to the uplift of the Qinghai-Tibetan Plateau and the separation of the Hainan and Taiwan islands, as well as the expansion of host plant range may be the main factors driving differentiation of Chinese Mollitrichosiphum species and shaping their current distribution patterns.