Mitochondrial Gene Sequence (COI) Reveals the Genetic Structure and Demographic History of Lymantria dispar (Lepidoptera: Erebidae: Lymantriinae) in and around China

The gypsy moth, Lymantria dispar, is among the most destructive quarantine pests of forests. Here, we reconstructed the genetic structure and determined the population differentiation of gypsy moths across its distribution range at different times. This information could be used to both improve the prevention and detection of gypsy moths in the field. Using 31 newly designed species-specific primers targeting fragments of 216–1102 bp, we identified 103 full-length cytochrome oxidase subunit I (COI) gene sequences from eight fresh samples and 95 L. dispar specimens collected between 1955 and 1996, mainly in China. Combining 103 full-length COI gene sequences with 146 COI gene sequences from Genbank or DNA barcode libraries, we analyzed the genetic differentiation, gene flow and haplotypes between gypsy moth populations in order to reflect the genetic structure and population dynamics of gypsy moths. We discovered 25 previously unknown haplotypes from old gypsy moth specimens. We found that the genetic diversity among gypsy moth populations (collected in the same region at different time points) was relatively high. Furthermore, the genetic structure of Chinese geographical populations (Heilongjiang, Liaoning, Beijing) in different years was distinct. Our results suggested that some gypsy moths in China showed the genetic affinity with European gypsy moths (a sub-species of gypsy moths found mainly in Europe).

Over recent years, increased levels of international trade and tourism have facilitated the inadvertent dispersal of gypsy moths beyond their native ranges. However, introducing L. dispar to areas where it is not already present causes unpredictable effects upon the economic benefits and ecological stability of forest ecosystems [20]. For quarantine purposes, more and more studies are trying to investigate the genetic structure and demographic history of L. dispar over a wide geographic distribution by using various methods, such as allozyme detection, amplified fragment length polymorphism, restriction fragment length polymorphism, sequence-based analysis and microsatellites [21][22][23][24][25][26][27][28].
In the first such attempt, by Bogdanowicz et al. [22], three mitochondrial genes (COI, COII, NDI) (cytochrome oxidase subunit I, cytochrome oxidase subunit II, NADH dehydrogenase subunit I) were investigated in order to perform cluster analysis for gypsy moths from Japan, Europe, Tunisia and North America. This strategy revealed four populations: Okinawa in Japan; Hokkaido in Japan; Honshu and Kyushu in Japan and the Asian continent; Europe, Tunisia and North America. Later, Keena et al. [14] separated 46 gypsy moth strains into four groups (North American, European, Siberian and Asian) based on two restriction sites for cytochrome oxidase subunit I (COI), the nuclear FS1 marker and four microsatellite loci.
More recently, Wu et al. [29] identified four genetic clusters within L. dispar by analyzing microsatellite loci and the mitochondrial DNA sequences of 1738 gypsy moths, with three clusters corresponding to the three known subspecies (L. d. asiatica, L. d. dispar, L. d. japonica); the North American populations formed a distinct fourth cluster. Based on the DNA barcode sequences of gypsy moth specimens sourced from 16 sites in China, Japan, Russia, Europe and the USA, Chen et al. [24] divided 288 specimens into two groups: an Asian group (individuals from China, the Far East of Russia and Japan) and a European group (individuals from Siberian Russia, Greece, Lithuania and the USA). Furthermore, using microsatellite loci and mitochondrial genes, the population genetic structure and demographic history of 20 L. dispar populations from Asia were analyzed by Kang et al. [30], and revealed three groups: Group 1 (Korean inland region and adjacent areas); Group 2 (Korean southern region) and Group 3 (Hokkaido region). Research on population genetic structure provides insight for enhancing monitoring and exclusion programmes [24,29].
In most studies, however, only recently formed population genetic structures have been revealed; consequently, we know little about the precise history of gypsy moth populations in the field. In light of this, further investigations are required to identify the historical genetic structure and demographic history of L. dispar, a subspecies that is particularly prone to taxonomic confusion. For example, some individuals of L. dispar asiatica that were collected from the southern coastal area of Korea reportedly had morphological characteristics similar to those of L. d. japonica [31]. Chen et al. [24] further suggested that gypsy moths from Guizhou, plus one from Liaoning, exhibited genetic affinity with the Siberian population, and might even be a new cryptic subspecies of L. dispar [24].
To resolve this taxonomic confusion between gypsy moth subspecies, a demographic history of L. dispar populations based on intensive sampling was deemed necessary. To this end, we investigated the genetic structure of gypsy moth specimens (collected over the past 10 years or even decades ago) in several regions of the world, by using their COI (cytochrome oxidase subunit I) sequences. We also used several dried and formalin-fixed specimens, collected between 1955 and 1996, instead of fresh samples. Worldwide, museums and other natural history collections, house millions of animal and plant specimens, making these institutions a treasure trove of biological diversity [32]. The combination of fresh and museum specimens has created an incredible potential for advanced studies of evolutionary change [33][34][35]. For example, utilizing thess types of material, researchers recently revealed some remarkable instances of phenotypic or genotypic change over short timescales (years) in response to strong selective pressures [36,37]. Among the many opportunities provided by combining genomic and morphological data from the same specimens, most earlier studies have focused upon the DNA fragment size of biological collections, species delimitations, thus facilitating the deduction of phylogenetic associations and the construction of DNA barcode sequence libraries for DNA-based identifications [38][39][40][41][42][43][44][45][46][47][48][49][50].
Building upon this foundation, in the present study, we have attempted to link molecular data from museum specimens, as well as recently collected fresh samples in order to assess the genetic associations of these specimens. In short, this study made use of a large number of specimens which had been collected over the past 53 years (1955-2008). Importantly, these included many specimens taken at the earliest known time of gypsy moth distribution in China, as well as specimens collected from the USA in 1996. All COI gene sequences of gypsy moth specimens were amplified and spliced; along with the COI gene sequences of gypsy moths deposited in GenBank or DNA barcode libraries, we hoped to enhance our understanding of gypsy moth dynamics. For instance, our aim was to use this robust dataset to characterize the variation in the genetic structure of populations over different years and genetic differentiation in populations across different years. We particularly focused upon the mitochondrial cytochrome oxidase subunit I (COI) gene, which has been successfully used in the past to investigate phylogenetic relationships in the three L. dispar sub-species and widely used to study the genetic structure and genetic differentiation of the gypsy moth populations [24,29,30].

Samples
In 2017, we obtained fresh larval samples from the Insect Virus Research Centre of the Chinese Academy of Forestry. Several gypsy moth egg masses were collected in Liaoning Province in 2008 (Table 1), and when hatched, reared on an artificial diet in a greenhouse under controlled conditions (temperature: 24-26 • C; photoperiod: 14-h-light:10-h-dark and a relative humidity of 60%-80%). Samples were stored in 100% ethanol and kept at -20 • C to await extraction. A total of 75 dried gypsy moth specimens, and 20 formalin-fixed larval specimens, were analyzed ( Table 1). All of the specimens collected between 1955 and 1996 from 10 different locations had been preserved at ambient temperature, without dehumidification, in the Specimen Banks of Forest Insects at the Research Institute of Forest Ecology, Environment and Protection (Chinese Academy of Forestry, China). Adult specimens were dried naturally before setting them with an insect pin, whereas larvae were preserved in formalin. Most of the formalin-fixed specimens were kept together in a single large jar. A few specimens had been stored in cotton-stoppered glass vials, with several vials kept together submerged in formalin within a larger jar. Figure 1 shows a map of sample collection points.  Table 1 and Table S2. The  Table 1 and Table S2. The collection locations of HLJ56 and HLJ79 populations are the same. The collection locations of LN79 and LN08 populations are the same. BJ79, BJ81 and BJ82 populations are collected in the same area.

DNA Extraction
All laboratory equipment and work areas were first cleaned with 75% ethanol to limit potential contamination before extraction. We also used new consumables and reagents to prevent possible cross-contamination of DNA extraction from other species. All specimens were briefly treated to remove surface contamination before DNA extraction. For fresh samples, larval samples were first washed with sterile distilled water, and part of their abdomen was ground in liquid nitrogen. Sterilized brushes were used to remove bristles and dust from the old dried specimens prior to DNA extraction. All dried specimens were also washed in sterile distilled water to remove surface contamination, and then air-dried. For dried specimens, part of their abdomen was ground in liquid nitrogen, without causing any damage to other external morphological features. The ground tissue (40 mg) of both fresh samples and dried specimens was then used for DNA extractions. The tissue was then lysed, and the lysate purified with an E.Z.N.A.™ Insect DNA kit (Solarbio, Shanghai, China) in accordance with the manufacturer's instructions, but with a slight modification, i.e., lysate was incubated at 60 • C for 2 h.
For formalin-fixed larval specimens, another DNA isolation protocol was required. Approximately 60 mg of tissue from each specimen was cut into pieces and centrifuged at 10,000 g for 2 min. The supernatant was then discarded. Tissues were then rinsed overnight in xylene at room temperature (10-25 • C), centrifuged at 10,000 g for 1 min and the supernatant was discarded. Next, all tissue samples were treated with a gradient of ethanol concentrations (65%, 70%, 75%, 80%, 85%, 90%, 95% and 100%) for 2 hours, and centrifuged at 5000 g for 10 min to gradually break protein-DNA cross-linkages caused by the long-term exposure to formalin. Lastly, the specimens were left to dry overnight at room temperature (10-25 • C). After completing this drying process, the E.Z.N.A.™ Insect DNA kit was used in accordance with the manufacturer's instructions, but with a slight modification: lysate was incubated at 60 • C for 2 h. Extracted DNA was stored in an elution buffer (EB) at -20 • C. Samples from different years were individually extracted, to eliminate any cross-contamination between DNA extracts. After extractions, a DS-11 Spectrophotometer (DeNovix) was used to determine the DNA concentration and absorbance ratio (A260/A280) of each sample.
Generally, the expected ratios for extracted DNA samples from insects should range from 1.7 to 2.0 according to the manufacturer's protocol for the E.Z.N.A.™ Insect DNA kit.

Primer Design, PCR Amplification and Sequencing
The sizes of extracted DNA fragments were determined by automatic capillary electrophoresis, with an observed fragment distribution of 21-900 bp across all specimens. Based upon these results, and full COI sequences from L. dispar (FJ617240, KY798442, KY923059-KY923067), 31 new species-specific primers (Table S1) were designed using the Primer v5.0 tool. The size of target fragments ranged from 216 to 1102 bp. Full COI gene sequences were recovered via the combination of various overlapping amplicons.
All PCR amplifications were performed directly on total DNA extracts, in a final volume of 25 µL, consisting of 12.  Table S1), 1 min at 72 • C; and with a final extension step of 5 min at 72 • C. Amplifications were performed using a Gene-Explorer Touch Gene Amplifier (GE9612T-S, Hangzhou Bio-Gener Technology Co, Ltd). Thereafter, 3 µL of each PCR product was resolved by 1.0% agarose gel electrophoresis to determine whether the amplification had succeeded; and the remaining reaction volume was used for Sanger sequencing. The DNA sequencing was performed using bidirectional reads in a semi-automated sequencing process based on magnetic beads (TSINGKE Biological Technology, Beijing, China).
All sequences for a given specimen were first aligned separately in BioEdit v7.2 software [51] and then connected to generate its COI sequence. To verify the authenticity of COI sequences obtained from old specimens, we crosschecked the new sequences with established reference sequences (FJ617240, KY798442, KY923059-KY923067).

Gypsy Moth Mitochondrial COI Sequence Data
The COI gene sequences of gypsy moths, from various countries and different sites, were previously amplified and uploaded into Genbank or DNA barcode libraries (Bold Systems V4). Such sequences can provide crucial information for defining genetic structure and demographic history in terms of its area of origin and adjacent areas [24,29,30]. In the present study, we obtained the COI gene sequences of gypsy moth specimens collected during different periods. Combining this information with the COI gene sequences of gypsy moths published in Genbank or the DNA barcode library (Bold Systems V4), we identified the genetic structure of gypsy moths in specific periods and discussed associated changes in genetic structure at different times. Therefore, in order to reveal the historical genetic structure of gypsy moths, in addition to the specific gene sequences obtained in this study, we also downloaded the following gene sequences from Genbank or the DNA barcode library (Bold Systems V4): (1) 11 full-length COI gene sequences from gypsy moths from around the world (FJ617240, KY798442, KY923059~KY923067), combined with the COI gene sequences of gypsy moths obtained in this study, both of which may perhaps reveal previously undetected haplotypes; (2) 86 COI gene sequences from gypsy moths in China (GMCF001-14~GMCF074-14, HM775684~HM775692, HM775605, HM775607, HM775608) combined with the COI gene sequences of gypsy moths obtained in the present study, which could similarly reveal previously undetected haplotypes of the Chinese gypsy moth populations; (3) 63 COI gene sequences collected from Japan, Russia, the USA, Greece, Lithuania and different regions of China in the 1990s (GMCF075-14, GMCF085-14~GMCF134-14, HM775684~HM775692, HM775605, HM775607, HM775608), which revealed the genetic structure and population differentiation of the gypsy moth population in the 1990s; and (4), recently published COI gene sequence from Chinese gypsy moth populations (GMCF001-14~GMCF074-14). It was hoped that these sequences (including COI gene sequences of the gypsy moth specimens) could reveal temporal changes in the genetic structure of the gypsy moth population. The information of all the downloaded gene sequences is shown in Table S2, and the collection locations for all sequences are marked in Figure 1.
Compared with the recently published datasets from fresh samples, the gene sequences used in this study provide additional valuable information relating to species history. However, this heterochroneous dataset, which features population-level information spanning 53 years (1955-2008), intrinsically violates the assumption of constant sampling time; hence, heterochrony had to be explicitly avoided for in further analyses. To remove such heterochrony-driven bias, the COI gene sequences obtained during the same time period were correspondingly assigned to the same group according to the time of collection. For instance, 28 COI gene sequences from the HN55, HLJ56, NM56 and DX57 populations were used to analyze the genetic structure among four geographic populations in the 1950s. Similarly, 18 COI gene sequences from three regions (Heilongjiang, Liaoning and Beijing) were used to assess genetic structure among these three populations in the 1970s. Combining this data with COI gene sequences from the 1990s and 2010s from Heilongjiang, Liaoning and Beijing, enabled us to analyze changes in genetic structure across different years. Furthermore, we had hoped that DNA barcode sequences from the same region, but from different years, may well provide information relating to the population dynamics of gypsy moths in that region. For example, using this model, we analyzed the COI sequences for the Beijing populations over several years (1979,1981,1982,1987,1992, 2011).

Analysis of Genetic Structure in Populations of Gypsy Moths
Genetic differentiation coefficients (F ST ) and gene flow (Nm) are two key indices that reflect genetic structure. The gene differentiation coefficient is often used to reveal variation in allele frequency among different populations, along with the degree of genetic differences between disparate species or populations, which is an important parameter that expresses the evolutionary rate of colonies. As gene flow can maintain genetic similarity between populations through the migration of gametes, individuals, or even entire populations, this represents a key factor in the homogenization of population genetic structure. Generally, the species populations with high gene flow have a lower degree of genetic differentiation, whereas populations with low gene flow will exhibit greater genetic differentiation. According to gene flow studies, genetic drift becomes the main factor driving genetic differentiation among populations when Nm < 1. In contrast, when Nm > 4, gene exchange between populations is sufficient to counteract the effects of genetic drift, thereby preventing genetic differentiation between populations [52]. Hence, pairwise comparisons of F ST values and Nm values between L. dispar populations were calculated using the Kimura two-parameter model in Arlequin v3.0 (alpha level of significance = 0.05, significance level obtained from n = 1000 permutations) [53] based on its COI gene sequences.
The median-joining haplotype network can be useful for bothconveying population structures and estimating the genealogical relationships among various haplotypes. For this reason, a median-joining network was also derived based on the COI gene sequences of gypsy moths, using the program NETWORK v4.6.1.3 [54]. Moreover, for a specific species, its evolutionary potential and adaptability to the environment are closely related to genetic diversity of the populations. Those species endowed with higher genetic diversity are more able to tolerate and survive environmental stresses; hence, values for the number of nucleotide differences (K), nucleotide divergence (Pi) and haplotype diversity (Hd) were computed with DnaSP software [55].

Phylogenetic Analyses among Populations of Gypsy Moths
Phylogenetic trees were constructed using the neighbor-joining (NJ) phylogenetic approach in MEGA v7.0 software [56], using the COI sequence from Lymantria albescens (Genbank ID: NC035627) serving as an outer group. The best evolutionary model was chosen, and its phylogeny was tested by the bootstrap method. Rates of change among lineages were presumed to be gamma distributed (G), and the bootstrap of each branch of the phylogenetic tree was subjected to n = 1000 replications.

Assessing the Utility of Newly-Designed Primer Pairs
For dried specimens, a total of 3-12 µg DNA was normally extracted from a 40 mg tissue sample, with an absorbance ratio (260/280) of 1.8-2.0. DNA extracts from formalin-fixed specimens exhibited a DNA concentration of 18.7-28.3 ng/µL; here 260/280 ratios mostly ranged from 1.8 to 2.0. These results indicated that DNA extracts for the majority of samples were pure and could be sequenced. To do this, 31 newly designed primer pairs were tested with fresh samples obtained in 2018; 30 of these successfully generated COI sequences and only Lco31, targeting fragments of 1102 bp, failed to yield a DNA sequence. This showed that the designed primers were reasonably reliable for further experimentation. Not all primers amplified the target band in all specimens, but for all specimens, we were able to assemble the complete COI gene sequence from the amplified gene sequences. In all, a total of 103 full-length COI gene sequences were recovered by assembling fragments with overlapping sequences. All DNA sequences are listed in Suplementary Materials.

New Haplotypes from Old Gypsy Moth Specimens
In a parsimonious network built with 103 full-length COI gene sequences obtained from old specimens and 11 published COI gene sequences, we discovered 25 new haplotypes ( Figure 2). Comparing these haplotypes among regional populations in the 1950s, 1970s or 1990s, revealed that they were geographically distinct ( Figure 2). Moreover, regional populations from the same location, but collected at different times, also showed clearly distinctive haplotypes ( Figure 2). Fourteen previously unknown haplotypes for Chinese gypsy moth populations were revealed (Figure 3), based on the 95 DNA barcode sequences obtained in this study and those 86 that had already been published. In the median-joining network, one high-frequency haplotype (H10, 67 samples) was connected by low-frequency haplotypes ( Figure 3). We also identified five shared haplotypes between the COI gene sequences obtained in our study and previously published data. Notably, a shared haplotype existed between the SC93 (Sichuan in 1993), LN11 (Liaoning in 2011) and GZ (Guizhou in 2011) populations. Chen et al. [24] indicated that the mitochondrial COI genotype of LN11 and GZ populations was similar to that of the Siberian gypsy moth.

286
The detailed information of the haplotypes distribution is shown in Table 2.   Table 2.

293
MV nodes indicate that inferred haplotypes were not found in the sampled populations. The detailed 294 information of the haplotypes distribution is shown in Table 3. Median-joining haplotype network for DNA barcode sequences of L. dispar. Previously published haplotypes are shown in black type, while those haplotypes obtained from the old museum specimens are in red. Each link between haplotypes represents one mutational difference. MV nodes indicate that inferred haplotypes were not found in the sampled populations. The detailed information of the haplotypes distribution is shown in Table 3.  Note: The number of populations included in each haplotype is given in brackets.

Genetic Differentiation in Gypsy Moth Populations across Different Years
Specimens from the four years 1956, 1979, 1992 and 2012 were obtained to evaluate the variation in COI sequence over time and to provide insight into gypsy moth dynamics in Heilongjiang, China. We first analyzed the genetic diversity among Heilongjiang populations, and found that the mean number of nucleotide differences (K), haplotype diversity (Hd) and nucleotide diversity (Pi) were 1.0593, 0.6719 and 0.0016, respectively. In addition, four haplotypes were identified in the Heilongjiang population ( Figure 4A). A shared haplotype was revealed between the HLJ92 and HLJ12 populations; however, none could be found in the HLJ56, HLJ79, HLJ92 and HLJ12 populations ( Figure 4A). Genetic differentiation between the HLJ92 (collected in 1992) and HLJ12 populations (collected in 2012) was similar (p > 0.05), whereas it differed significantly among the other populations (p < 0.01, Table S3). The number of samples included in each haplotype is given in brackets. Each link between haplotypes represents one mutational difference. MV nodes indicate that inferred haplotypes were not found in the sampled populations. The detailed information of population codes is shown in Table S2.
Analysis of genetic diversity among the Liaoning populations from the four years 1979, 1992, 2008 and 2011 revealed that mean K, Hd and Pi values were 0.9048, 0.5022 and 0.0014, respectively. Additionally, four haplotypes were identified for the Liaoning populations ( Figure 4B) but no shared haplotype was identified between the LN79 population and other populations. Lastly, genetic differentiation between LN11 population and LN92, LN08 populations was not significant (Table S4). For three Hebei populations (1964,1992 and 2012), mean K, Hd and Pi values were 1.3856, 0.6744 and 0.0021, respectively, with five haplotypes discovered ( Figure 4C). Two haplotypes found in the HB64 population (collected in 1964) were previously unknown. With regards to genetic differentiation, F ST values between the HB92 population and the HB64, HB12 populations were similar (p > 0.05, Table S5), while those of the HB64 population differed significantly from the HB12 population (p < 0.05, Table S5).
DNA barcode sequences of Beijing populations corresponding to the six years 1979,1981,1982,1987,1992 and 2011 were analyzed; mean K, Hd and Pi values were 1.9073, 0.8281 and 0.0029, respectively. Overall, there was relatively high nucleotide polymorphism and genetic diversity in the Beijing populations over different years, for which seven haplotypes were identified ( Figure 4D). The four haplotypes found in old specimens were previously unknown. Notably, there was a shared haplotype between the BJ87 (collected in 1987) and BJ92 populations (collected in 1992), and both populations showed similar levels of genetic differentiation (F ST, p > 0.05); however, the other populations (except for the BJ87 and BJ92 populations) showed significantly genetic differentiation (p < 0.01.; Table S6). Moreover, the genetic differentiation (F ST(average) = 0.9293) between the BJ79 population (collected in 1979) and other populations was the greatest, but the lowest (F ST (average) = 0.6078) was observed between the BJ87 population and other populations (Table S6).

Variation in Population Structure for Three Geographical Populations over Different Years
For the gypsy moth populations in Liaoning and Beijing, China, we obtained DNA barcode sequences from moth samples collected in 1979, 1992 and 2011. In addition, we obtained DNA barcode sequences of the Heilongjiang population sampled in 1979, 1992 and 2012. Using these gene sequences, we analyzed the differences in genetic diversity in three gypsy moth populations over three years (1979, 1992, 2011 or 2012). Although the collection times of the three populations were not consistent, the differences in gene sequence were almost negligible because of the small difference between the two collection times  Table S8) and shared one haplotype ( Figure 5B). For 2011, the three populations exhibited means of 0.5496, 0.9610, 0.5584 and 0.0015, respectively; the HLJ12 and LN11 populations showed not significantly difference in terms of F ST (p > 0.05, Table S9), with one shared haplotype between two populations ( Figure 5C).     Table S2.

Population Structure of the Four Geographical Populations in the 1950s
We investigated population structure in Hunan, Heilongjiang, Inner Mongolia and Greater Hinggan, based upon F ST and Nm values, which are shown in Table S10. In the 1950s, these four geographical populations showed relatively high genetic differentiation, with values ranging from 0.8847 to 0.9356, and highly significant differences (p < 0.01) ( Table S10). The Nm values among the four populations were <1 (0.0344-0.0652) (Table S10), which was consistent with their high levels of differentiation. In addition, eight haplotypes were identified for these populations in the 1950s ( Figure 6).     Table S11. Two cluster heat maps were also generated to reflect the degree of 393 genetic differentiation and gene flow between populations (Figures 7, 8). In the cluster heat . The haplotype of a population is represented by the same color. The size of the fan area corresponds to the proportion of each population with the same haplotype. The number of samples included in each haplotype is given in brackets. Each link between haplotypes represents one mutational difference. MV nodes indicate that inferred haplotypes were not found in the sampled populations. The detailed information of population codes is shown in Table S2.

Population Structure for Geographical Populations in the 1990s
A total of 85 DNA barcode sequences, including 62 COI gene sequences downloaded from GenBank and Bold Systems V4, from 13 regions in the 1990s, were obtained in this study. With these, the genetic structure and demographic history of gypsy moths in China, and adjacent areas, were revealed. Firstly, we calculated pairwise genetic differentiation coefficients (F ST ) and gene flow (Nm) between all populations; values for the 1990s are shown in Table S11. Two cluster heat maps were also generated to reflect the degree of genetic differentiation and gene flow between populations (Figure 7, Figure 8). In the cluster heat map, the SC93, CT94, KL97, KK92 and JK94 populations were clustered together and the degree of genetic differentiation among those populations was relatively low. Furthermore, the HLJ92, BJ92, USA96, HB92, SD92, LN92, HS96 and PK92 populations were clustered together and the degree of genetic differentiation between populations was also relatively low (Figure 7). In the heat map of gene flow, we showed that the gene flow was strong between the SD92, USA96, BJ92 populations and the HLJ92, PK92, HB92 and HS96 populations (Figure 8).

404
The degree of genetic differentiation between populations is positively correlated with the z-value.  We identified 17 haplotypes for gypsy moths in the 1990s (Table 4). In this network, two major  In a cluster heat map, all F ST values are standardized according to the following formula, i.e.,

427
The detailed information of the haplotypes distribution is shown in Table 4.   Table 4.

Discussion
In previous studies, a large proportion of research focusing on the analysis of haplotypes from several geographic populations of gypsy moth has relied on fresh samples. In the present study, we analyzed the haplotypes of several gypsy moth populations by using museum specimens collected in different eras. We first supplemented the COI gene sequences of the gypsy moth specimens from more than a decade, or even several decades. These data would be used to supplement existing data for future evolutionary research of gypsy moths. In addition, our findings provide information and insight into the historical haplotypes of different geographic populations of gypsy moth, which revealed many previously unknown gypsy moth haplotypes. The abundance of haplotypes may be related to the strong adaptability of gypsy moths to changing environments. We also found that haplotypes of the gypsy moth populations in different periods were different; this may be related to genetic flow between different populations or the genetic diversity of gypsy moths under environmental pressure. In short, the extensive variation in haplotype observed in this study may complicate efforts to control and detect gypsy moths.
Herein, we provide the first report on the gypsy moth dynamics in Heilongjiang, Liaoning, Hebei, and Beijing based on DNA barcode sequences from different years . We found that nucleotide polymorphism and genetic diversity among these four populations had, at different time points, remained relatively high, indicating that substantial genetic variability exists among these populations collected in the same region at different time points. Probably some of this variability relates to environmental changes that occurred between 1956 and 2012. For example, elevated atmospheric CO 2 and O 3 concentrations, rising temperatures, and an uneven distribution of rainfall has been found to profoundly impact upon the composition of insect communities and the interactions between host plants and pest insects, and their natural enemies [57]. To successfully adapt to different environmental conditions, it argues that the long-term accumulation of genetic variation has led to significant genetic differentiation among populations, such that more genetically diverse species are more likely to persist under environmental stresses.
Furthermore, our results clearly showed that genetic diversity in Beijing populations was higher than that in the Liaoning population and the Heilongjiang population. From this, we inferred that the Beijing populations may be more adaptable to different ecological environments and had a strong flight characteristic. Therefore, we should pay more attention to the prevention and control of the gypsy moth populations with high adaptability and strong flight ability. Such information could be useful in detecting and preventing of gypsy moth outbreaks, for which it may be necessary to tailor monitoring and control programs in different regions based on their historical information derived from these. Furthermore, the degree of genetic differentiation between populations seems to be inconsistent with the time gaps between the sample collections in our study. For instance, genetic differentiation (F ST(average) = 0.9732) between the HLJ79 population and other Heilongjiang populations was greater than that between the HLJ56 population and other Heilongjiang populations (Table S3). We inferred from these findings that gypsy moth populations carrying similar genetic information to the HLJ56 population re-invaded the Heilongjiang region in 1992 or 2012; in other words, these populations experienced genetic exchanges with the Heilongjiang populations in 1992 or 2012; thus, compared with the HLJ79 population, the genetic differentiation between the HLJ56 population and other Heilongjiang populations was relatively low.
A number of studies have revealed the genetic structure of gypsy moths in some areas [24,30], and provided guidance for the quarantine of moths between different regions and countries. In our study, we found that the population structure in different years was distinct among all three geographical populations (Heilongjiang, Liaoning and the Beijing populations). Genetic differentiation was greatest in 1979, but much lower in 1992 and 2011 (2012). Correspondingly, the median-joining haplotype network in different years was also distinct. Since we obtained gypsy moth specimens collected before 1979, we supposed that moths invaded these three regions sometime before 1979. Then, following infestation, genetic variation increased such that the gypsy moth population could adapt to the local environment.
The environment differs across these three regions; as such, the genetic variation of the gypsy moth population is also different, such that the genetic differentiation of the three populations was higher in 1979. Doubtless, ongoing transportation, trade, and the development of tourism after this date were conducive to the dispersion of moths among the three regions. Genetic differentiation of the three populations had gradually decreased by 1992 and 2011 (2012). In addition, population structures between different geographical locations, changed at different times, a finding perhaps related to the long-range flight ability of adult females.
The genetic structure of gypsy moth populations from China, Japan, Russia, USA, Greece and Lithuania in the 1990s was also revealed in this study. We showed that the genetic distances of Chinese gypsy moth populations (HLJ92, BJ92, HB92, SD92 and LN92) were relatively close, probably as a consequence of their close geographical distance. This indicated that the gypsy moth populations were likely to spread between adjacent areas, a finding which also directly relates to the strong flight ability of the Asian moths. In addition, based on the genetic differentiation and gene flow between populations, we suspected that increasing levels of international trade and tourism in the 1990s facilitated the overseas dispersal of gypsy moth populations from Honshu (Japan), Primorsky Krai (Russia) and China.
Chen et al. [24] indicated that the mitochondrial COI genotype of two Chinese populations (Guizhou and Liaoning) was similar to that of the Siberian gypsy moths, though not identical. Our study re-visits the issue of whether there is genetic affinity between some of the gypsy moths from China and those found in Siberia [24]. In a neighbor-joining (NJ) phylogenetic tree constructed for all haplotypes in the 1990s, we found that all haplotypes from the SC93 population, and of European origin, were clustered into one branch. In another median-joining network built with DNA barcode sequences from China, a shared haplotype existed between the Sichuan and Guizhou populations. Two factors may have contributed to the genetic affinity between some of the gypsy moths from China and those found in Siberia. On the one hand, European gypsy moths may have been introduced into China. On the other hand, Chinese gypsy moths may have been introduced into Europe, and mixed with those already there so went undetected. Even so, we found that the Liaoning moths from 1979 and 1992 showed no genetic affinity with the Siberian samples, in contrast to recent work by Chen et al. (2015). From this, we believed that the spread of gypsy moths between Siberian and Liaoning did not occur earlier than the late 1970s.
Lastly, we found that one haplotype was shared by both the USA96 population and the Asian gypsy moth populations; furthermore, in the phylogenetic tree, these two populations were clustered into one as a single group. Consequently, two factors may have contributed to the genetic affinity between some of the gypsy moths from USA and those found in China. On the one hand, American gypsy moths may have been introduced into China, and mixed with those already there so went undetected. On the other hand, Chinese gypsy moths may have been introduced into USA. This matter could be readily resolved with more molecular data obtained by testing additional samples from these two latter regions. Furthermore, since mtDNA sequences only reflect the movements of females, datasets consisting of both mitochondrial and nuclear sequences are essential to reliably distinguish subspecies of this moth.
Several studies have revealed the genetic structure of gypsy moths, but such analyses have primarily sought to determine the lineage of the gypsy moth in its native range and adjacent areas [24,29,30]. Female gypsy moths from Asia are known for their strong flight ability [13][14][15][16][17]. Moreover, human activities can facilitate moth dispersal to different countries or sites. The female adult will lay eggs on the surface of containers, outdoor furniture, wood, cars, etc., and eggs of gypsy moths in diapause can remain viable for several months without hatching. Human activities provide an opportunity for the transport of egg masses attached to the cargo to a new habitat. For these two reasons, the genetic structure of gypsy moth populations in many parts of its occupied range may have changed quickly. Undoubtedly, the specimens used here were effective in revealing changes in the genetic structure of gypsy moth populations. In order to reveal changes in the genetic structure of additional gypsy moth populations, we encourage other researchers to collect and examine a wider range of molecular data. We only used the COI gene sequences, which, when considered alone, are insufficient to reveal further information relating to gypsy moth history, and clearly, in future studies, a wider range of samples and molecular markers are required to reveal additional features of the history and dynamics of this important pest.

Conclusions
Our analyses of mtDNA COI data in gypsy moth specimens from disparate locations revealed previously unknown genetic relationships in populations across space and time. Gypsy moth populations (collected in the same region at different time points) showed relatively high nucleotide polymorphism and genetic diversity. The Beijing gypsy moth population may be more conducive to produce genetic variation at different times based on our results of both nucleotide polymorphism and genetic diversity. We also found that the genetic structure among several Chinese geographical populations in different years was distinct. In addition, the development of international trade and tourism has facilitated genetic flow between gypsy moth populations in parts of China, Japan and Russia. Last but not least, our results suggested that some gypsy moths in China showed the genetic affinity with European gypsy moth (a sub-species of gypsy moth found mainly in Europe).

Supplementary Materials:
The following are available online at http://www.mdpi.com/2075-4450/10/5/146/s1, Text 1: The COI gene sequence obtained in this study, Table S1: All primers used in this study, Table S2: Detailed information of COI gene sequences,  Table S7: Pairwise F ST (below the diagonal) and Nm (above the diagonal) values among gypsy moth populations in 1979, Table S8: Pairwise F ST (below the diagonal) and Nm (above the diagonal) values among gypsy moth populations in 1992, Table S9: Pairwise F ST (below the diagonal) and Nm (above the diagonal) values among gypsy moth populations in 2011, Table S10: Pairwise F ST (below the diagonal) and Nm (above the diagonal) values among different populations in the 1950s, Table S11: Pairwise F ST (below the diagonal) and Nm (above the diagonal) values among different populations in the 1990s.