The gastro-intestinal canal is heavily colonized, with over 700 bacterial species or unnamed phylotypes in the oral cavity alone [1]. The bacteria form niche-specific ecosystems with patterns of bacterial cohabitation shaped by factors such as transmission, host receptor availability, pH, access to nutrients, oxygen, and symbiosis or communication with nearby species [2]. Generally, these ecosystems are in equilibrium, but shifts may occur due to medical or lifestyle changes [4].
Many studies report associations between host factors, including saliva quality, pH regulation, and genetics, and single bacteria or panels of bacteria [7], but few studies have evaluated the associations between host traits and the oral microbiota in an untargeted manner [10]. Carbohydrate intake, especially high sugar intake, is reported to correlate with the enrichment of acidogenic and acid-tolerant caries-associated species. As an example, frequent glucose pulses (low pH) on a tooth-mimicking biofilm with nine bacterial species enriched species that thrive at a low pH, whereas others were relatively reduced [13]. This shift mirrors the tooth biofilm dysbiosis in dental caries [14]. Sucrose exposure in vivo and assessments of targeted species support these experimental findings [15]. However, associations between sugar intake and untargeted characterization of the oral microbiota remain sparsely studied [4]. That genetic variation plays a fundamental role in individual differences in food preference, and thereby food selection, has been described in studies targeting candidate genes and in genome-wide association studies (GWAS), as comprehensively reviewed [18]. However, non-genetic and non-lifestyle-linked factors have also been indicated to influence food habits; for example, the oral and gut microbiota have been suggested to modulate taste perception and eating behaviours [20]. We, and others, have shown that the intake of and preference for sweet foods are associated with polymorphisms in sweet and bitter taste receptor encoding genes, such as TAS1R2, TAS1R3, but also glucose transporter genes (SCL2) and the gustducin-encoding (GNAT3) gene [24].
The aim of the study was to explore the global saliva microbiota structure, identify groups of subjects defined by similar oral microbiota profiles in Swedish young adults, and search for potentially associated lifestyle and host factors.
2. Materials and Methods
2.1. Study Subjects
Teenagers and young adults attending one public dental health care clinic in the city of Umeå, Sweden were recruited consecutively as they attended the dentist’s office for their regular dental health control. Those who had received antibiotic treatment in the preceding 6 months, had a systemic disease or were taking regular medication, or were unable to communicate in Swedish or English, were not approached. In total, 176 participants in the age range 17–21 years were eligible and consented to participate.
The project received ethical approval by the Swedish Ethical Review Authority (Dnr 2012-111-31M) with an addendum (Dnr 2015-389-32M), and it adhered to the Helsinki Declaration and the General Data Protection Regulation (GDPR). The project is reported in accordance with Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for cohort studies.
2.2. Saliva Sampling, Bacteria Culturing and DNA Extraction
Approximately 3 mL of chewing-stimulated saliva was collected into ice-chilled test tubes. Colony-forming units (CFUs) of mutans streptococci (Streptococcus mutans and Streptococcus sobrinus) and lactobacilli were assessed per mL of fresh saliva by cultivation of 100 μL on mitis salivarius sucrose agar supplemented with 0.2 U of bacitracin (MSB) and lactobacillus selective (LBS) agar, respectively. The plates were incubated at 37°C in 5% CO2 for 48 h. The remaining saliva was stored at −80°C.
Genomic DNA was extracted from saliva samples, positive and negative controls using the GenElute™ Bacterial Genomic DNA Kit (Sigma-Aldrich, St. Louis, MO, USA). Briefly, the samples were centrifuged for 5 min at 13,000 rpm, lysed in buffer with lysozyme and mutanolysin, treated with RNase and Proteinase K, and purified and washed. All DNA extractions were done at the laboratory at the Dental School, Umeå University, Sweden by the same person and with kits from the same batch. The quality of the extracted DNA was estimated using NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific, Uppsala, Sweden) and the quantity by the Qubit 4 Fluorometer (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA). The mean yield from 200 μL saliva was 32 ng/μL (range 6–92 ng/μL) and the ratio of the absorbances at 260 and 280 nm was 1.8 or higher. Mixtures of known mock bacteria were used as positive controls and sterile water as negative control.
2.3. 16S rRNA Gene Amplicon Generation and Sequencing
Bacterial 16S rRNA gene amplicons were generated from the variable regions v1-v3 (27F-forward AGAGTTTGATCATGGCTCAG and 530R-reverse GTATTACCGCGGCTGCTG primers), and v3-v4 (357F-forward TACGGGAGGCAGCAG and 800R-reverse CCAGGGTATCTAATCC primers) from saliva and mock community DNA. Library preparation, sequencing on the Illumina MiSeq platform, and sequence demultiplexing were done at Eurofins Genomics (Ebersberg, Germany) according to their standard protocols. The company provided demultiplexed FASTQ files, which were imported to QIIME2 [26], and DADA2 was used for denoising, paired-end read merging, chimeric sequence removal [27], and the identification of 100% identical amplicon sequence variants (ASVs) per sample [28]. Default parameters in DADA2 were used with left trimming of 13 bp for both forward and reverse reads, and right trimming at 230 bp for the reverse reads and 268 bp for the forward reads. ASVs with >1 read were blasted against the extended Human Oral Microbiome Database (eHOMD, http://www.homd.org) for taxonomic annotation [29]. In the eHOMD blast, only ASVs with at least 98.5% identity with a named species or unnamed phylotype in eHOMD were retained, and those with the same HMT number were aggregated. The negative control contained <50 sequences and the positive control mock species were correctly identified for representative sequences with 25 or more reads. Therefore, all comparisons were based on taxa with at least 25 reads. For simplicity, all taxa are referred to as species in the text. Sequencing failed for one saliva sample, leaving 175 samples in the final analyses.
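The retention and aggregation rules above can be sketched in a few lines of Python. This is only an illustration: the input layout (one best hit per ASV, given as an HMT identifier and a percent identity), the function name, and the choice to apply the 25-read threshold to the aggregated per-taxon total are assumptions, not details taken from the study pipeline.

```python
from collections import defaultdict

MIN_IDENTITY = 98.5  # percent identity threshold stated in the text
MIN_READS = 25       # read threshold stated in the text


def aggregate_by_hmt(blast_hits, asv_counts):
    """Sum ASV read counts per HMT number for hits with >= 98.5% identity.

    blast_hits: dict asv_id -> (hmt_id, percent_identity)  (hypothetical layout)
    asv_counts: dict asv_id -> read count
    Returns a dict hmt_id -> summed reads, keeping taxa with >= 25 reads
    (assumption: the threshold is applied after aggregation).
    """
    totals = defaultdict(int)
    for asv_id, reads in asv_counts.items():
        hit = blast_hits.get(asv_id)
        if hit is None:
            continue  # ASV without any eHOMD hit is dropped
        hmt_id, identity = hit
        if identity >= MIN_IDENTITY:
            totals[hmt_id] += reads
    return {hmt: n for hmt, n in totals.items() if n >= MIN_READS}


# Illustrative input; the HMT labels and numbers are made up for the example.
hits = {
    "asv1": ("HMT-A", 99.2),
    "asv2": ("HMT-A", 98.5),
    "asv3": ("HMT-B", 97.0),   # below the identity threshold
    "asv4": ("HMT-C", 100.0),  # retained hit, but too few reads
}
counts = {"asv1": 20, "asv2": 10, "asv3": 500, "asv4": 5, "asv5": 40}
taxa = aggregate_by_hmt(hits, counts)
```

Here two ASVs that map to the same HMT number are merged into one taxon, while a high-abundance ASV below the identity threshold is discarded.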
2.4. Diet Recording
The participants reported their dietary intake in a food frequency questionnaire (FFQ, http://www.matval.se). The FFQ is a semi-quantitative questionnaire, with questions on 93 food items/food aggregates selected to represent the habitual intake in Sweden, and includes questions on alcoholic beverages. Participants were asked to report their typical intake in the last year. Intakes were reported on an increasing, nine-level scale, from never to four or more times a day. Portion sizes were estimated from photographs showing four portion sizes of staple foods (potatoes, rice, and pasta), meat/fish and vegetables, or standard food weights, such as for an egg or apple. Sucrose intake was estimated from nine questions, i.e., (i) fruit soup with or without a thickening agent, (ii) buns and biscuits, (iii) cookies and cakes, (iv) marmalade and jam, (v) ice cream, (vi) sodas, (vii) syrups, (viii) sugar, (ix) sweets including candies and chocolate. For sugar, the sum of mono- and disaccharides (excluding lactose) from fruits, berries, vegetables, juices and honey was added. The FFQ included one question on how often the respondent ate or drank sweet products without sugar, i.e., with a sugar substitute.
A Healthy Diet Score that reflects healthy eating habits was calculated as previously described [30]. Briefly, frequency of intake per day was calculated for eight food/beverage groups. Favourable food groups included fish, fruits (except juices), vegetables (except potatoes) and whole grains. Unfavourable food/beverage groups included red or processed meats, desserts and sweets, sugar-sweetened beverages and fried potatoes. Intake frequencies were ranked within each sex in ascending quartile ranks for favourable foods/beverage groups, and in descending quartile ranks for unfavourable foods/beverage groups. The sum of all quartile ranks represents the Healthy Diet Score, with a minimum of zero and a maximum of 24, and with higher ranks indicating healthier food and beverage choices.
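As a rough illustration of the scoring scheme, the sketch below assigns quartile ranks 0 to 3 within each sex, reverses the ranks for unfavourable groups, and sums over the eight groups, which gives the stated range of 0 to 24. The field names, the tie handling and the exact quartile cut rule are assumptions; the published score [30] may define them differently.

```python
FAVOURABLE = ("fish", "fruits", "vegetables", "whole_grains")
UNFAVOURABLE = (
    "red_processed_meat",
    "desserts_sweets",
    "sugar_sweetened_beverages",
    "fried_potatoes",
)


def quartile_rank(value, reference):
    """Return 0-3 based on the share of reference values strictly below `value`."""
    below = sum(1 for v in reference if v < value)
    return min(3, (4 * below) // len(reference))


def healthy_diet_scores(participants):
    """participants: list of dicts with 'sex' and a daily frequency per food group.

    Ranks are computed within each sex; unfavourable groups are reverse-scored.
    """
    scores = []
    for person in participants:
        same_sex = [p for p in participants if p["sex"] == person["sex"]]
        total = 0
        for group in FAVOURABLE:
            total += quartile_rank(person[group], [p[group] for p in same_sex])
        for group in UNFAVOURABLE:
            total += 3 - quartile_rank(person[group], [p[group] for p in same_sex])
        scores.append(total)
    return scores


# Four hypothetical participants of the same sex with strictly ordered intakes.
demo = []
for i in range(4):
    person = {"sex": "F"}
    person.update({g: i + 1 for g in FAVOURABLE})    # more favourable food
    person.update({g: 4 - i for g in UNFAVOURABLE})  # less unfavourable food
    demo.append(person)
demo_scores = healthy_diet_scores(demo)
```

In this toy data the participant with the lowest favourable and highest unfavourable intakes scores 0, and the opposite pattern scores 24.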
The relative validity of FFQ-derived intakes has been estimated against 24 h dietary records and/or biological markers [31]. For sucrose, reliability, measured as correlations between registrations done one year apart, was 0.80 for men and 0.75 for women, and the relative validity against 10 repeated 24 h recalls was 0.69 and 0.62 for sweet foods for men and women, respectively [31].
To reduce potential recording bias, energy-providing nutrients were expressed as energy-standardized values (E%), and 11 participants with unrealistic reported energy intakes were excluded from analyses involving diet assessments. This was based on food intake level (FIL) scores calculated to estimate energy intake relative to minimal energy needs [35].
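The two quantities in this paragraph are simple ratios, sketched below. The energy factor of 4 kcal/g for carbohydrate is the general Atwater factor, FIL is taken here as reported energy intake divided by an estimated basal metabolic rate, and the plausibility cut-offs are purely hypothetical placeholders; the cut-offs and the basal metabolic rate equation used in the study are not given in the text.

```python
KCAL_PER_G_CARBOHYDRATE = 4.0  # general Atwater factor, assumed here for sucrose


def energy_percent(grams, total_kcal, kcal_per_g=KCAL_PER_G_CARBOHYDRATE):
    """Share of total energy intake (E%) contributed by a nutrient."""
    return 100.0 * grams * kcal_per_g / total_kcal


def food_intake_level(reported_kcal, bmr_kcal):
    """FIL: reported energy intake relative to estimated basal metabolic rate."""
    return reported_kcal / bmr_kcal


def is_plausible(reported_kcal, bmr_kcal, lower=1.0, upper=3.0):
    """Flag plausible reporters; `lower` and `upper` are hypothetical cut-offs."""
    return lower <= food_intake_level(reported_kcal, bmr_kcal) <= upper


sucrose_e_percent = energy_percent(grams=50, total_kcal=2000)
fil = food_intake_level(reported_kcal=2400, bmr_kcal=1600)
```

Expressing sucrose as E% rather than grams makes intakes comparable between participants who differ in total reported energy.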
2.5. Recording of Medical and Other Lifestyle Conditions
Information on health status, oral hygiene, tobacco use, alcohol intake and most recent antibiotic exposure was obtained from a questionnaire. Dental caries was scored from visual and radiographic examinations in the dentist’s office with optimal lighting. Tooth surfaces that were sound according to ICDAS [36], score 0, or had caries in the enamel (e) according to ICDAS scores 1 and 2, or had caries in the enamel with a localized breakdown with or without dentine involvement (D) according to ICDAS score ≥3, or with a filling (F), were recorded. The total numbers of decayed and filled tooth surfaces (DeFS) were calculated. The M component was not considered because tooth loss occurred for orthodontic reasons or severe hypomineralization in this study group.
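The surface classification and the DeFS sum can be expressed compactly as follows. This is a sketch under assumptions: each surface is represented as an ICDAS score plus a filled flag, and a filled surface is counted once as F regardless of its ICDAS score, which the text does not specify.

```python
def classify_surface(icdas_score, filled=False):
    """Map one tooth surface to 'sound', 'e', 'D' or 'F'.

    Assumption: a filling takes precedence over the ICDAS score.
    """
    if filled:
        return "F"
    if icdas_score == 0:
        return "sound"
    if icdas_score in (1, 2):
        return "e"  # caries confined to the enamel
    return "D"      # ICDAS score >= 3


def defs_index(surfaces):
    """Count decayed (e or D) and filled surfaces; missing teeth are ignored."""
    labels = [classify_surface(score, filled) for score, filled in surfaces]
    return sum(1 for label in labels if label in ("e", "D", "F"))


# Hypothetical participant: (ICDAS score, filled?) per examined surface.
example_surfaces = [(0, False), (1, False), (2, False), (4, False), (0, True), (3, True)]
example_defs = defs_index(example_surfaces)
```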
2.6. Genotyping of Single Nucleotide Polymorphism in Taste Associated Genes
Genotyping of 121 Single Nucleotide Polymorphisms (SNPs) in the TAS1R1 taste-associated genes was performed at SciLife, Uppsala as described previously [24]. One SNP marker received a call rate of 0%. None of the remaining 120 SNPs deviated from Hardy–Weinberg equilibrium (p > 0.001), and they had an average call rate per sample of 99.8% and overall call rate of 99.8%. Genotyping data are uploaded at figshare (https://figshare.com/s/e292568e15c601e67a03).
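For readers unfamiliar with the Hardy–Weinberg check, one common way to test a biallelic SNP is a chi-square goodness-of-fit test with one degree of freedom, sketched below. The text does not state which test the genotyping facility applied, so this is an assumed, generic approach rather than the method used in the study; the genotype counts are invented.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions for one SNP.

    Returns (chi2, p_value) given observed genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2_ok, p_ok = hwe_chi_square(25, 50, 25)    # exactly in equilibrium
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)   # no heterozygotes at all
```

A SNP would be flagged as deviating when its p-value falls below the chosen threshold (0.001 in the text).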
2.7. Prediction of Functional Potential from the 16S rRNA Gene Information
Obtained representative ASVs were used to search for the potential molecular functions of the saliva microbiome using the 16S rRNA gene as marker gene, Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt2) [37] and the Molecular Functions by orthology annotation (Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology database, KO, https://www.genome.jp/kegg/kegg1.html). The steps included (i) creating a closed reference feature table in QIIME2 using the trained Greengenes dataset gg-13-8-99-nb-classifier.qza (Greengenes, http://greengenes.lbl.gov), (ii) qiime diversity core-metrics analysis in QIIME2, and (iii) export of pathway abundances and the feature table for down-stream analyses in KO and multivariate modelling. Group separation was tested by permutational multivariate analysis of variance (PERMANOVA) on Euclidean distances, with 9999 permutations and Bonferroni-corrected p-values. Follow-up functional enrichment analyses were done using the STRING database (version 10.5, https://string-db.org/). The same procedure was also done for eHOMD-defined species.
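To make the group-separation test concrete, the following is a minimal pure-Python sketch of a one-way PERMANOVA on Euclidean distances with a Bonferroni correction. It is not the QIIME2 implementation used in the study; the function names, the toy data and the seed are illustrative assumptions, and the example uses fewer permutations than the 9999 stated above only to keep it fast.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pseudo_f(samples, labels):
    """Pseudo-F statistic from pairwise squared distances (one-way design)."""
    n = len(samples)
    groups = sorted(set(labels))
    ss_total = sum(
        _sq_dist(samples[i], samples[j]) for i in range(n) for j in range(i + 1, n)
    ) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        pair_sum = sum(
            _sq_dist(samples[i], samples[j]) for k, i in enumerate(idx) for j in idx[k + 1:]
        )
        ss_within += pair_sum / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(samples, labels, permutations=9999, seed=0):
    """Return (observed pseudo-F, permutation p-value) by shuffling group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(samples, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(samples, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Two clearly separated hypothetical groups of five samples each.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
group_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
f_stat, p_perm = permanova(group_a + group_b, ["A"] * 5 + ["B"] * 5, permutations=999)
```

With several pairwise group comparisons, the permutation p-values would then be passed through `bonferroni` before interpretation.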
2.8. Data Handling and Statistical Analyses
Unsupervised hierarchical clustering (Ward’s method) was used to classify the 175 participants by the presence or absence of ASVs and by the presence or absence of species from eHOMD identification. ASV counts were standardized to the level of the sample with the fewest reads after DADA2 filtering (38,293 reads), and to the lowest per-sample abundance of eHOMD taxa, and transformed by the inverse hyperbolic sine transformation, which approximates a log transformation but is also defined for zero values, which are prevalent for many ASVs and some species.
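The preprocessing steps named here can be illustrated with standard-library Python. The sketch assumes that standardization to the smallest library size is done by random subsampling without replacement, which the text does not state explicitly; the function names and the toy counts are illustrative only.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample a count vector to `depth` reads without replacement."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    drawn = random.Random(seed).sample(pool, depth)
    out = [0] * len(counts)
    for i in drawn:
        out[i] += 1
    return out


def ihs(count):
    """Inverse hyperbolic sine: ln(x + sqrt(x^2 + 1)); defined at zero."""
    return math.asinh(count)


def presence_absence(counts):
    """Dichotomize counts into 1 (present) or 0 (absent)."""
    return [1 if c > 0 else 0 for c in counts]


sample_counts = [5, 0, 3, 2]           # hypothetical reads per ASV in one sample
rarefied = rarefy(sample_counts, depth=6)
transformed = [ihs(c) for c in rarefied]
binary = presence_absence(sample_counts)
```

For large counts the transformation behaves like ln(2x), while a zero count maps to exactly zero, so no pseudocount is needed.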
Continuous phenotypic variables were presented as means with 95% confidence interval (CI) limits and, where stated, adjusted for sex, age and body mass index (BMI) using generalized linear modelling. Differences were tested with non-parametric tests. For discrete measures, the percentages in groups were estimated and proportion differences tested with the chi-square test. SPSS version 25 (IBM Corporation, Armonk, NY, USA) was used for these analyses. All tests were controlled for multiple testing by the Benjamini and Hochberg procedure, and those with adjusted p-values <0.05, corresponding to a false discovery rate (FDR) of 5%, are presented.
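The Benjamini and Hochberg adjustment itself is short enough to show in full. The sketch below computes adjusted p-values in the usual step-up form; the analyses in the study were run in SPSS, so this is a generic illustration with invented p-values rather than the code that was used.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw_p = [0.01, 0.04, 0.03, 0.20]  # hypothetical raw p-values
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

In this example only the smallest raw p-value remains below 0.05 after adjustment.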
Alpha- and beta-diversities with associated PERMANOVA tests and ASV proportions for bar charts were calculated in QIIME 2.
Multivariate modelling was performed by partial least-square regressions (PLS) (SIMCA P+ version 15.0, Sartorius Stedim Data Analytics AB, Malmö, Sweden). PLS identifies directions in an X-swarm that characterize X well and are related to Y. The software scales all variables to unit variance, and performs a K-fold cross-validation in which 1/7th of the data are systematically held out, a model is fitted to the remaining data, and the held-out data are predicted from that model (Q2-values). The results are displayed in scatter plots illustrating the separation of observations, and loading column plots displaying the mean correlation coefficient, with 95% CI, between each predictor and the outcome variable. CIs that do not include zero are considered statistically significant.
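The scaling and cross-validation bookkeeping around the PLS model can be sketched without the model itself. Below, Q2 is computed in its simplest overall form, 1 minus the ratio of the predictive residual sum of squares to the total sum of squares, and observations are assigned to seven cross-validation groups in rotation. Both choices are assumptions for illustration: SIMCA reports Q2 per component and cumulatively, and its exact group assignment is not described in the text.

```python
import statistics


def unit_variance_scale(values):
    """Centre a variable and scale it to unit (sample) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def cv_groups(n_observations, k=7):
    """Assign observations to k cross-validation groups in rotation."""
    return [i % k for i in range(n_observations)]


def q2(observed, predicted_cv):
    """Q2 = 1 - PRESS / SS_total, using predictions for held-out observations."""
    mean_y = sum(observed) / len(observed)
    press = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted_cv))
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - press / ss_total


scaled = unit_variance_scale([1, 2, 3])
groups = cv_groups(8)
perfect = q2([1, 2, 3], [1, 2, 3])
no_better_than_mean = q2([1, 2, 3], [2, 2, 2])
```

A Q2 near 1 indicates that held-out observations are predicted well, whereas a value near 0 means the model predicts no better than the mean of the outcome.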
The linear discriminant analysis effect size (LEfSe) method [42] was used to estimate the effect size of taxa. Species that were shared between groups were identified in a Venn diagram [43].
This study identified groups of people defined by similar oral microbiota profiles, targeted as dichotomized amplicon sequence variants (ASVs) or named species and unnamed phylotypes, and then explored lifestyle and host factors which were associated with these groups. The most striking difference between groups classified by unsupervised cluster analysis was for reported sucrose intake, but associations were also found for total sugar intake, saliva flow rate and allelic variations in two taste-perception-associated genes. The group who reported the highest sucrose intake had the lowest species richness, but the microbiota in clusters with a higher sucrose intake were either defined by a panel of species with no clear association with carbohydrate metabolic pathways or by a microbiota enriched for acidogenic and acid-tolerant species and carbohydrate degradation metabolic pathways. The latter included several species that have been suggested to be associated with caries.
The prevailing ecological plaque hypothesis [44] describes an ecological shift (collapse) towards the enrichment of acidogenic and acid-tolerant species in low pH environments, such as after sugar consumption [13]. The present findings are consistent with this hypothesis, as several such species were found to be enriched in two of the three cluster groups with the highest sucrose intake. Among these were species in Actinomyces, Bifidobacterium and S. wiggsiae, S. mutans, and S. sobrinus with documented relevance for dental caries in small children or adults [14]. In fact, although not statistically assured, these two cluster groups tended to have the highest mean caries experience. Conversely, the study identified a third cluster group with similar sugar intake that was not characterized by enrichment of the most acidogenic and acid-tolerant species, but rather by a large number of taxa including several species in Capnocytophaga, Leptotrichia, Prevotella. Thus, the effects of sugar on the salivary microbiota are potentially modified by host-related factors such as innate immunity peptides and buffer functions, but there are other possible explanations, for example, that the microbiota response to sugar is regulated by inter-bacteria communication or that there was differential measurement error in the reporting of sucrose intake in the different cluster groups. Overall, the present findings are well in line with our previous finding in the same population, where S. wiggsiae, S. mutans, B. longum, Leptotrichia sp. HMT498, and Selenomonas spp. in tooth biofilm samples discriminated caries-affected from caries-free adolescents [48], and support previous experimental findings in vitro [13] and in animals [49]. A direct comparison with the results from Anderson et al. 2018 [4] may not be appropriate as they used a different 16S rRNA segment and lower similarity requirement for taxa determination.
The strength of the present study is the comparably large sample and that the study group likely represents the underlying population with minimal selection bias, since the attendance rate at the public dental clinics is very high, as care is free in the targeted age group, and since the participants, who were enrolled consecutively, agreed to participate. The major weakness relates to the inherent difficulties in measuring diet using questionnaires [50]. Monitoring sugar intake may be even more challenging in the dentist's office, since patients are aware that sugar is bad for the teeth. This may be a source of systematic measurement error, which may be manifested in the low self-reported sugar intake and the limited variation compared to other studies in the country [51]. To reduce the impact of this error, energy adjustment, exclusion of participants who reported the most implausible energy intakes, and robust statistical methods based on ranking rather than reported consumption were used; nevertheless, a nullifying bias from the under-reporting of sucrose intake cannot be completely excluded.
Factors influential for being classified in a cluster group were assessed using multivariate PLS regression, which has the advantage of accepting correlated variables. These models were strong, but it should be noted that this is partly an effect of the fact that the cluster groups were formed by cluster analysis of bacterial taxa. However, since clustering was unsupervised, the association with sucrose intake and which specific bacteria were present in each cluster is not a consequence of clustering per se and does not prevent the comparison of phenotypic characteristics of people in the cluster groups or species that were influential for being in the groups.
An alternative model to the idea that sugar intake and, indirectly, gene variations drive oral microbiota transformation would be that the oral microbiota per se influences the taste phenotype of the host and, accordingly, the individual’s food selection, as previously suggested [20]. In the present study, sucrose intake was the strongest explanatory diet factor for microbiota clustering, which is supported by several in vitro and a few in vivo studies [4]. We suggest that the variant oral microbiota ecologies (here, the clusters) are a consequence of low pH due to sugar intake [42]. Thus, we suggest that the clinical implication of the present findings is that even a moderate difference in sugar intake may shift the oral microbiota towards a more cariogenic composition, which may be reverted by sugar restriction. However, we cannot fully exclude that the alternative hypothesis is of relevance too.
We, and others, have reported that genetic variants located in or near genes related to taste perception and preference were associated with diet preference and caries [33], but studies have not evaluated variations in such genes in relation to the concerted oral microbiome. In this study, allelic variations in the TAS1R1 genes were significantly associated with the pattern of oral bacteria at the group level. The TAS1R1 gene is involved in bitter taste sensations with known associations with sweet food preferences [53], whereas the GNAT3 gene, encoding the gustducin alpha-3 chain protein, can be involved in sweet, bitter or umami taste sensation depending on heterodimeric formations with various TAS proteins [54]. Besides taste perception, the gustducin protein functions as a sugar sensor in the gut with suggested effects on sugar absorption and metabolic syndrome [55]. Thus, hypothetically, the GNAT3 gene may affect the oral microbiota via dietary habits or secondary effects from diet-related metabolic disorders [56].
The present study also found that, under the present conditions, sequencing of the v1-v3 amplicons of the 16S rRNA gene yielded significantly fewer sequences and poorer recognition of the mock species than the v3-v4 amplicons. This is in line with previous reports [57] and stresses the importance of taking the sequencing protocol into consideration when data are compared from different studies.