Gut Bacterial Communities of Dendroctonus valens and D. mexicanus (Curculionidae: Scolytinae): A Metagenomic Analysis across Different Geographical Locations in Mexico

Dendroctonus bark beetles are a worldwide significant pest of conifers. This genus comprises 20 species found in North and Central America, and Eurasia. Several studies have documented the microbiota associated with these bark beetles, but little is known regarding how the gut bacterial communities change across host range distribution. We use pyrosequencing to characterize the gut bacterial communities associated with six populations of Dendroctonus valens and D. mexicanus each across Mexico, determine the core bacteriome of both insects and infer the metabolic pathways of these communities with Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) to evaluate whether these routes are conserved across geographical locations. Our results show that the β-diversity with UniFrac unweighted varies among locations of both bark beetles mainly due to absence/presence of some rare taxa. No association is found between the pairwise phylogenetic distance of bacterial communities and geographic distance. A strict intraspecific core bacteriome is determined for each bark beetle species, but these cores are different in composition and abundance. However, both bark beetles share the interspecific core bacteriome recorded previously for the Dendroctonus genus consisting of Enterobacter, Pantoea, Providencia, Pseudomonas, Rahnella, and Serratia. The predictions of metabolic pathways are the same in the different localities, suggesting that they are conserved through the geographical locations.


Bacterial Community Diversity
The operational taxonomic units (OTUs) richness, evenness, and overall diversity of the bacteria were similar among localities in both bark beetles, except for the Michoacan locality in Dendroctonus valens. An average of 127 and 84.5 of the observed OTUs was recorded across locations for D. valens and D. mexicanus, respectively. The higher number of OTUs was detected in Morelos I (177) and Oaxaca I (102) for D. valens and D. mexicanus, respectively ( Table 1).
The values of species richness (Chao1) were not statistically different among the locations of  Table  1).

Bacterial Community Diversity
The operational taxonomic units (OTUs) richness, evenness, and overall diversity of the bacteria were similar among localities in both bark beetles, except for the Michoacan locality in Dendroctonus valens. An average of 127 and 84.5 of the observed OTUs was recorded across locations for D. valens and D. mexicanus, respectively. The higher number of OTUs was detected in Morelos I (177) and Oaxaca I (102) for D. valens and D. mexicanus, respectively ( Table 1).
The values of species richness (Chao1) were not statistically different among the locations of Dendroctonus mexicanus (F Welch test:  Table 1).

Prediction Roles of Gut Bacterial Communities
Based on 79,349 and 92,660 reads for Dendroctonus valens and D. mexicanus, respectively, the predictive analyzes of the microbiota in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways at level 2 ( Figure 6) revealed a total of 195 metabolic pathways for gut bacterial communities for the former species and 203 for the latter. The Nearest Sequenced Taxon Index (NSTI) values varied from 0.020 to 0.031 (0.023 ± 0.003 standard deviation, SD) in D. valens locations, whereas these varied from 0.018 to 0.13 (0.064 ± 0.041 SD) in D. mexicanus locations, indicating accurate metabolic predictions of the bacterial metagenome of both bark beetles. The heatmap, based on the whole gut bacterial communities of each bark beetle, revealed the presence of genes putatively important for amino acid, carbohydrate, and vitamin metabolism ( Figure 6). Lastly, the correlation between the pairwise phylogenetic distance (wU and uwU distances) of bacterial communities and the pairwise geographical distance of locations of each bark beetle using the Mantel test were not significant at the geographical space analyzed ( RMA r s = 0.07-0.11, p s > 0.05, Table S3).

Prediction Roles of Gut Bacterial Communities
Based on 79,349 and 92,660 reads for Dendroctonus valens and D. mexicanus, respectively, the predictive analyzes of the microbiota in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways at level 2 ( Figure 6) revealed a total of 195 metabolic pathways for gut bacterial communities for the former species and 203 for the latter. The Nearest Sequenced Taxon Index (NSTI) values varied from 0.020 to 0.031 (0.023 ± 0.003 standard deviation, SD) in D. valens locations, whereas these varied from 0.018 to 0.13 (0.064 ± 0.041 SD) in D. mexicanus locations, indicating accurate metabolic predictions of the bacterial metagenome of both bark beetles. The heatmap, based on the whole gut bacterial communities of each bark beetle, revealed the presence of genes putatively important for amino acid, carbohydrate, and vitamin metabolism ( Figure 6). The ANOSIM test did not show statistically significant differences among metabolic pathways of bacterial communities of D. valens (R = −0.18, p > 0.05) and D. mexicanus (R = −0.23, p > 0.05) across the geographical locations.

Discussion
One of the poorly studied aspects about the bacterial communities in insects, including bark beetles, is whether their structure (i.e., the composition and abundance of their members) and function (metabolic pathways) change across geographic space [41,42]. Our findings show that the gut bacterial community structure of bark beetles Dendroctonus valens and D. mexicanus is apparently stable across the geographical locations, as suggested by the values of richness and diversity indices that did not show significant differences among localities, except the Michoacan location in D. valens, whose geographical distance to its nearest pair (Jalisco) is 160.2 km and to its furthest pair (Oaxaca) the 608.2 km (Figure 1).
These findings agree with those reported for the microbial communities in the pine weevil Hylobious abietis, which are similar at lower taxonomic levels (family and genus) across locations in Europe [32], but they were different from those reported for Dendroctonus valens from different geographical sites in the United States of America [23]. Whereas it is difficult to recognize the factors The ANOSIM test did not show statistically significant differences among metabolic pathways of bacterial communities of D. valens (R = −0.18, p > 0.05) and D. mexicanus (R = −0.23, p > 0.05) across the geographical locations.

Discussion
One of the poorly studied aspects about the bacterial communities in insects, including bark beetles, is whether their structure (i.e., the composition and abundance of their members) and function (metabolic pathways) change across geographic space [41,42]. Our findings show that the gut bacterial community structure of bark beetles Dendroctonus valens and D. mexicanus is apparently stable across the geographical locations, as suggested by the values of richness and diversity indices that did not show significant differences among localities, except the Michoacan location in D. valens, whose geographical distance to its nearest pair (Jalisco) is 160.2 km and to its furthest pair (Oaxaca) the 608.2 km (Figure 1).
These findings agree with those reported for the microbial communities in the pine weevil Hylobious abietis, which are similar at lower taxonomic levels (family and genus) across locations in Europe [32], but they were different from those reported for Dendroctonus valens from different geographical sites in the United States of America [23]. Whereas it is difficult to recognize the factors that can drive these differences in D. valens, some of them may be the number of insects sampled, the anatomical site analyzed (whole insects, tree host, gut, exoskeleton, mycangium), the methodology employed (molecular cloning, DGGE or Next Generation Sequencing (NGS) methodologies), and the geographical scale studied.
The Venn diagram shows that the Dendroctonus species have an intraspecific core bacteriome different in genera number ( Figure 4) and relative abundance ( Figure 2). However, despite these differences, they share a minimal number of genera, among which stand out Enterobacter, Pantoea, Providencia, Pseudomonas, Rahnella, and Serratia (Figure 4). This small group of bacterial genera is the same as those reported for the Dendroctonus species [28], indicating that the intraspecific core bacteriome of these bark beetles, and perhaps of each Dendroctonus species, is higher than the interspecific core bacteriome reported for Dendroctonus. Based on these results, we hypothesize that each Dendroctonus species may have a specific core bacteriome, which can include all or almost all the members of the interspecific core bacteriome known to the Dendroctonus genus [28]. This closed group may support the basic metabolic functions of these bark beetles independent of the number of localities sampled and geographical space analyzed, as reported in other studies [32].
Our findings regarding β-diversity with unweighted UniFrac show that bacterial communities of both bark beetles vary geographically. The PCoA (uwU) and heatmap (Figures 3 and 5b,d) show that this variation in both bark beetles is a result of the presence or absence of rare members with low relative abundance (<1% reads), as well as due to some dominant genera. In fact, some members of the intraspecific core bacteriome of these bark beetles are not necessarily the most abundant within the community, and their relative abundance varies among locations. This same pattern in the variation of β-diversity, given by the less-frequent members, has been observed in other studies carried out in other scolytines [3,[26][27][28]32].
Our Mantel test results indicate that the spatial variation of gut bacterial communities in both bark beetles is not associated with the geographic distance, suggesting that the dominant genera (>1% relative abundance), including the strict core bacteriome members of both bark beetles, are present in all the geographical localities analyzed. Our findings agree with those reported for the pine weevil Hylobius abietis [32] but not with those reported for Dendroctonus valens at a wider geographical space [23]. This difference observed in D. valens with respect to geographical distance may be explained by the different geographical distances analyzed and the statistical coverage of the techniques used. In this study, we used NGS technologies and the maximum geographical distance between the sites was 1175 km, but Adams et al. [23] used the DGGE technique, and the maximum distance between their localities was 2500 km.
The factors that influence the structure to the bacterial communities should be determined in further studies. As suggested for other insects [43][44][45] and some Dendroctonus species [21,31,46], it is possible that geographical distance is not an important factor that influences these differences, particularly considering that pine-associated endophytic bacterial communities are highly similar and independent of their geographical location [38,47,48].
In addition, it is known that the gut is a micro-environmentally heterogeneous habitat and hence limiting for many bacteria, where the prevalence or not of specific bacteria within this system is not random [49]. In the case of bark beetles' guts, different ecological and demographic factors (e.g., interactions, competition, population growth, resource availability) may determine the presence and/or dominance of bacterial groups and mutualistic relationships between them, at least between members of the bacteriome. Moreover, when the physiological conditions in the gut of bark beetles change, the pathogenic or commensal capacities of some bacteria may be expressed in this system [50]. Independent of the absence or presence of members at low frequency, the metabolic pathways inference analyzes with PICRUSt suggest that the basic biochemical functions of the microbiota and their contribution to the host biology are apparently conserved. In fact, whereas a considerable number of functional metabolic pathways were predicted for the gut bacterial communities of both the Dendroctonus species, the most important metabolic pathways were those related to nutrition and detoxification. Of these, carbohydrate metabolism, cofactors, and vitamins, as well as the biodegradation and metabolism of terpenoids and xenobiotic compounds were the most frequent. These functions are fundamental because these bark beetles feed on a substrate rich in organic compounds, non-essential amino acids, and structural carbohydrates, such as hemicellulose and cellulose, that are not easily degraded by insects. Additionally, throughout their entire life cycle, bark beetles are in contact with plant defensive compounds, some highly toxic, such as limonene and pinene, that can cause cellular damage and kill the Dendroctonus species [51][52][53]. Lastly, future studies using metatranscriptomics are necessary to confirm these inferences based on the 16S rRNA gene because to the best of our knowledge, there are no associative studies between the taxonomy assignments using this gene and specific functional pathways or genes in the Dendroctonus species, further, the results generated from this research using next-generation sequencing could expand knowledge for the control and management of these bark beetles.

Insect Collection, Dissection and DNA Extraction
Emerged adults of D. valens and D. mexicanus were directly collected from naturally infested pine trees in six distinct geographical locations in Mexico during 2017 ( Figure 1, Table S1). The largest geographical distance between localities was 1175 km (Durango-Oaxaca sites), and the shortest distance was 160 km (Morelos-Puebla sites). Insects were removed from their galleries using fine sterile forceps, placed in sterile polycarbonate containers with wet paper to avoid insect desiccation, stored at 4 • C for their transport, and processed immediately after arriving at the laboratory.
Two sets of 30 insects each were taken in each locality, to integrate two biological replicates for each Dendroctonus species. The insects were dissected under sterile conditions as described by Briones-Roblero et al. [9]. The last washing water was inoculated in Petri dishes with trypticase soy agar (TSA, BD, Difco, Sparks, MD, USA) inoculated with the last rinsing water and by negative Polymerase Chain Reaction (PCR) amplification of the same water to assess the efficiency of the disinfestation. The plates were incubated at 28 • C for 48-72 h. Each set of 30 guts was processed independently for DNA extraction with DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's protocol.

Bacterial 16S rRNA Polymerase Chain Reaction Amplification and Pyrosequencing
The V1-V3 region of the 16S rRNA gene was amplified using universal primers 8F and 556R [54],
The high-quality sequences were grouped in OTUs through the open-reference picking method at a 97% similarity threshold [57] using Uclust v 1.2.22 [58]. Chimeric sequences were identified with Chimera Slayer v 1.39.5.0 [59] and then removed from the data set. The most abundant read of each OTU was selected as a representative sequence and then aligned with the Greengenes core sequences set (available online: http://greengenes.lbl.gov/) using PyNast [60].
The taxonomic assignment from phylum to genus for representative sequences was done at a ≥ 0.8 confidence score with the Ribosomal Database Project (RDP) Classifier (available online: https://rdp. cme.msu.edu/index.jsp) to know its taxonomic identity [61]. The taxonomic assignment was manually corroborated by comparing with the closest matched sequences on three databases (RDP, GenBank, and Greengenes), the cutoff being from 97% to 100% at the genus level. A Maximum-Likelihood (ML) phylogenetic inference analysis was performed in PhyML (available online: http://atgc.lirmm.fr/ phyml/), using only representative sequences at the genus level. To this analysis, these sequences were aligned in Clustal X v 2.0.10 (available online: http://www.clustal.org/clustal2/) [62] and trimmed at their 5 and 3 ends in SeaView v 4 (available online: ftp://pbil.univ-lyon1.fr/pub/ mol_phylogeny/seaview/archive/). The best-fit model of nucleotide substitution was selected with JModeltest v 2.1.7 (available online: http://darwin.uvigo.es/our-software/) [63] based on the Akaike Information Criterion (AIC). The reliability of each node was estimated via a bootstrap analysis after 1000 pseudoreplicates. Finally, the phylogeny and heatmap of relative abundances were matched using Interactive Tree of Life (iTol) (available online: https://itol.embl.de/) [64].

Bacterial Diversity Analysis
As the amount of sequences analyzed could affect the diversity analysis, we used rarefied data running the multiple rarefaction script implemented in QIIME v 1.9 (for each sample, we calculated the number of OTUs expected to be observed for 100 reads). To determine the probability that a randomly selected amplicon from a sample was previously sequenced, the Good's coverage was calculated as an estimator of sampling completeness [65]. Estimators of richness (Chao1), diversity (Simpson and Shannon), and phylogenetic diversity (PD) were used to determine the α-diversity in bacterial communities for each bark beetle species and biological replicates [66,67]. The normality and homogeneity of the variances of these estimators were tested with a Shapiro-Wilkinson test and F test [68]. As the diversity indices did not meet the assumptions of equal variances, those values were compared using Welch's F test for ANOVAs and its respective post hoc multiple paired comparisons using Dunn's test [69].
The β-diversity of gut bacterial communities among locations was estimated using unweighted (considers only phylogenetic richness) and weighted (considers both relative abundance and phylogenetic richness) Fast UniFrac distances [70], as well as the Bray-Curtis dissimilarity index [71]. The Monte Carlo method was performed to test the statistically significant differences among the bacterial communities using both Fast UniFrac distances [70], while the Adonis test was used to evaluate the differences with Bray-Curtis distances. A Principal Coordinate Analysis (PCoA) using unweighted and weighted UniFrac distances was performed in NTSYS-PC v 2.02 (Exeter Software, Setauket, New York, NY, USA) to explore multidimensional patterns of diversity variation of bacterial communities among locations [72].

Core Bacteriome
The strict core bacteriome of these bark beetles was determined according to Hernández-García et al. [28]. We, only include those bacterial genera present in all localities and their replicates, whose relative frequencies were >1%. To identify unique and shared OTUs among localities of both the Dendroctonus species, a Venn diagram was generated in Bioinformatics and Evolutionary Genomics (available online: http://bioinformatics.psb.ugent.be/webtools/Venn/).
To determine the variation of bacterial communities of each bark beetle with respect to geographical distance, a nonparametric Mantel test was used to determine the correlation between the matrix of pairwise geographic distance among localities and pairwise phylogenetic distance matrix, using both weighted and unweighted UniFrac [73]. The randomized matrices were generated by row and column permutation. The randomization was performed 5000 times to establish a 95% confidence interval [73]. However, given that the estimation and measurement of the correlation test between the geographic and phylogenetic distance matrices of bacterial communities were not free of errors, it was necessary to estimate the slope and y-intercept to find the best straight line that described this association in a confinable way. To do this, we used the reduced major axis (RMA) method and 95% confidence intervals for elements of the line that were calculated using a bootstrap test as implemented on PAST v 3.20 (available online: https://folk.uio.no/ohammer/past/) [69].

Predictive Functional Analysis Based on Metagenomic 16S rRNA Surveys
The predicted functional profiles of bacterial communities associated with the gut of D. valens and D. mexicanus of each locality were inferred with the bioinformatic tool PICRUSt [74]. The sequences from each library were demultiplexed and used to generate the OTU table in BIOM format, following a closed-reference method. Given that the ability of PICRUSt to estimate a metabolic profile relies on a set of known sequenced genomes, the OTU table was normalized and used to make a comparison with sequenced bacterial genomes deposited in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [75] at hierarchical levels 2 and 3.
The obtained tables with the counts of predicted genes per sample were cleaned according to the following criteria: Removal of categories unrelated to bacterial physiology/metabolism (similar to human diseases) and removal of gene family categories with a count equal to zero. The functional predictions were plotted at the hierarchical levels 2 and 3 of KEGG. To evaluate the prediction accuracy of samples, we calculated the Nearest Sequenced Taxon Index (NSTI). The NSTI scores summarize the extent to which microorganisms in a sample are related to sequences' genomes, and they represent the average branch length that separates each OTU from a reference bacterial genome, weighting their relative abundance in each sample [74]. Lastly, to compare whether the metabolic pathways among the gut bacterial communities of different locations of both the Dendroctonus species were different, we performed an analysis of similarity (ANOSIM) using a Gower index with 10,000 permutations in PAST v 3.20 [69].

Data Accessibility
The pyrosequencing-derived 16S rRNA gene sequence datasets were submitted to the NCBI database, under accession number of Sequence Read Archive (SRA) SRP158467.