The Problem with ‘Microbiome’

The term “microbiome” is currently applied predominantly to assemblages of organisms with 16S rRNA genes. In this context, “microbiome” is a misnomer that has been conferred a wide-ranging primacy over terms for community members lacking such genes, e.g., mycobiome, eukaryome, and virome, yet these are also important subsets of microbial communities. Widespread convenient and affordable 16S rRNA sequencing pipelines have accelerated continued use of such a “microbiome”, but at what intellectual and practical costs? Here we show that the use of “microbiome” in ribosomal gene-based studies has been egregiously misapplied, and discuss potential impacts. We argue that the current focus of “microbiome” research, predominantly on only ‘bacteria’, presents a dangerous narrowing of scope which encourages dismissal and even ignorance of other organisms’ contributions to microbial diversity, sensu stricto, and as etiologic agents; we put this in context by discussing cases in both marine microbial diversity and the role of pathogens in global amphibian decline. Fortunately, the solution is simple. We must use descriptive nouns that strictly reflect the outcomes attainable by the methods used. “Microbiome”, as a descriptive noun, should only be used when diversity in the three recognized domains is explored.

The term 'microbiome' has become a global buzzword, appearing in over 69,000 published papers (Table 1), penetrating popular culture, and even appearing in commercials for products that purport to maintain "microbiome" health. One would thus expect everyone who knows the term to agree on what it refers to, and how to use it. We show here, however, that neither of these is the case. "Microbiome" is typically defined as the diversity of microbes, i.e., bacteria, fungi, 'protists', and viruses in a habitat, such as those in or on the human body. While the ontology and history of use of the term have been well described [1,2], our concern is that the way the term is typically used today may mislead both specialists and non-specialists alike, especially about the nature and extent of microbial diversity. For example, the importance of Bacteria species such as Escherichia coli in the human gut was described well over a century ago [3], and we have also long known of bacterial and non-bacterial contributions in the guts and health of other organisms, and to global biogeochemical cycles [4]. However, terms that previously described these communities, such as 'gastrointestinal microorganisms', 'microbiota', or 'commensal', have been colloquially replaced by the term 'microbiome', a word that implies a whole, but in reality refers to only a part of the microbial diversity that is surely present [5][6][7][8][9][10][11].
The advent of next-generation sequencing (NGS) saw "microbiome" take on a somewhat different meaning, describing "…the collective genomes of members of a microflora", "microbiota", or "metagenome" [12,13]. Fast-forward to today, and the popularity and convenience of NGS pipelines for sequencing of 16S rRNA gene fragments, plus the low cost per sequenced base, permit rapid surveys of what is still termed the "microbiome", a misnomer given these genes occur only in Archaea and Bacteria (Figure 1, Table 2). Thus, the ascendance of "microbiome" to describe collections only of Archaea and Bacteria to the exclusion of all other microbial diversity has been enabled by both technology and the term's polysemy [14]. It is perplexing, however, that while the technology can contemporaneously and easily define the diversity of eukaryotes, or explore a sub-set such as of fungi, no such Eucarya component is usually targeted [15,16].
Wolfe [17] described resistance to establishing the Archaea on the basis of their 16S rRNA genes thus: "the vast inertia invested in the morphological approach to taxonomy and phylogeny approach […] makes changing the course of such a huge ship very difficult." It is ironic that our field now runs a similar risk, namely that its current course, based on a misused term that refers to work focused on the same gene, may also become difficult to change. Now is the time for a course correction. In science, such misuse is particularly problematic because narrowing our scope of possibilities may promote sampling biases introduced by the limited choice of target. We thus sought to quantify the scale and scope of the problem, and further explored if this is just a semantic issue, or if relaxed language is driving our science. We then considered what may be the consequences.
We conducted a literature survey of PubMed for the term "microbiome" in conjunction with any of the genetic markers "16S", "18S", or "ITS", joined by the logical "and", resulting in a total of 15,703 studies through 2019, the last year with complete data (Table 2). Of these, 8072 records have abstracts: "16S" appears in 99% of the abstracts, while "18S" and "ITS" each appear in only 2%. Even fewer used multiple markers. We randomly sampled 5% of the 8172 "16S" abstracts in order to read and classify them into broad subject areas.
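The tallying and sampling steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used for the survey: it assumes the abstracts have already been retrieved as plain strings, and the function names (`markers_in`, `marker_shares`, `sample_for_reading`), the whole-word matching rule, and the fixed random seed are all hypothetical choices made for the example.

```python
import random
import re

# Marker names searched for in each abstract (taken from the survey description).
MARKERS = ("16S", "18S", "ITS")


def markers_in(abstract):
    """Return the set of markers that appear as whole words in one abstract.

    Matching is case-sensitive so that the ordinary word "its" is not
    counted as the ITS marker. Variants such as "ITS2" are not matched.
    """
    return {m for m in MARKERS if re.search(rf"\b{m}\b", abstract)}


def marker_shares(abstracts):
    """Fraction of abstracts mentioning each marker.

    An abstract that mentions several markers counts once for each,
    so the shares may sum to more than 1.
    """
    counts = {m: 0 for m in MARKERS}
    for abstract in abstracts:
        for m in markers_in(abstract):
            counts[m] += 1
    total = len(abstracts)
    return {m: (counts[m] / total if total else 0.0) for m in MARKERS}


def sample_for_reading(abstracts, fraction=0.05, seed=0):
    """Draw a reproducible random subset (at least one item) for manual classification."""
    if not abstracts:
        return []
    k = max(1, round(len(abstracts) * fraction))
    return random.Random(seed).sample(abstracts, k)
```

Fixing the seed is a design choice for the sketch only; it makes the subset chosen for manual reading reproducible, which helps if the classification is later audited.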
We found an exponential increase in microbiome studies starting around 2006, due almost exclusively to studies targeting 16S rRNA gene fragments (Figure 1) to the exclusion of either 18S or ITS (Table 2). Most studies (54%) could be classified as "biomedical", those pertaining to human health or disease models. Many of these did not recognize the discordance of their usage with the more comprehensive meaning of microbiome, for example: "Thus, understanding the diversity of bacteria, termed the microbiome, in these open lesions is important for proper treatment" (emphasis added; the quote, intentionally anonymous, is merely an example of an all too common misuse). Even studies targeting the skin or the vagina, areas of the human body known to host fungal pathogens, failed to include any markers for eukaryotes. In probably the most high-profile example, the National Institutes of Health (NIH) Human Microbiome Project's (HMP) website describes characterizing the microbiome with only one marker (https://www.hmpdacc.org/hmp/micro_analysis/microbiome_analyses.php, "16S sequencing and analysis" tab, accessed on 16 February 2021): "16S rRNA sequencing was performed to characterize the complexity of microbial communities at each body site, and to begin to ask whether there is a core healthy microbiome".
We note that in terms of methodology, the HMP in phase two is expanding into metagenomics to capture a broader array of targets, but as of this writing, the mischaracterization of the description of the diversity captured by 16S rRNA sampling remains. Is it any wonder, then, that other studies follow suit?
Perhaps more surprisingly, 33% of the studies characterizing the microbiome through only 16S rRNA gene fragments could be assigned to the fields of ecology, evolution, or environmental monitoring, specialist fields whose practitioners one might expect to be best versed in microbial diversity and its assessment. The remaining studies pertained to agriculture, livestock, or human food production (8%), and methods development and other fields (4%). Projects that sequence hypervariable regions of 16S rRNA genes clearly do not determine taxonomic diversity, sensu stricto, in the microbial community, but rather only that in two domains, the Archaea and the Bacteria [18][19][20][21][22][23][24][25].
Gene surveys have revolutionized how microbial community diversity is assessed [26,27]. However, encouraging a bacteriocentric focus on the study of microbial communities misses opportunities. For example, by including eukaryotes, Brown et al. [28] revealed the highest marine microbial Eucarya richness estimates detected to that date; that many Eucarya sequences appeared related to putative parasitic organisms (and not grazers, as previously thought) suggested that what we believed of how microbial Eucarya participate in carbon flux through the microbial loop might require revision. This raises the question, "How many eukaryotes, both known and unknown, plus their relationships with other organisms, both microbial and not, and their contributions to everything from disease to biogeochemical cycles, have gone unnoticed in microbiome surveys that use only the 16S rRNA gene as their marker of choice?" Missed opportunity is unfortunate, but it is particularly so when sampling remote or extreme environments which require extraordinary effort to sample, all while the analytical tools are so readily available.
What are the consequences of the current bacteriocentric view of the "microbiome"? One of our greatest concerns is that it overlooks the role of the Eucarya in etiology, ecology, and of course, community diversity [28][29][30][31][32][33]. Consider the global decline of amphibians. In the 1990s, researchers realized the scope of the problem was not isolated species facing extinction, but a catastrophic die-off of the world's frog species [34]. Much research focused on environmental factors including habitat loss, climate change, pollution, invasive species, predators, and pathogens [35]. Within the realm of pathogens, bacterial agents were considered beginning in the 1970s [35][36][37][38], but it was not until 1998 that a major causative agent of mortality was identified as a chytrid fungus [39], even though members of this phylum are ubiquitous and cosmopolitan in soil and water. Subsequently, evidence of the disease was found in museum specimens dating back to the 1930s, leaving herpetologists to wonder if they had unwittingly contaminated remote areas with this pathogenic strain on their boots! Identifying the causes of wildlife decline is exceedingly difficult, as many ultimate and proximate factors surely interact to decrease fitness [39]. Thus, it is imperative that we not unconsciously limit the scope of our investigations. If there is any hope of discovering causative disease agents, broader genomic surveys including all three domains are required, as it is extremely difficult to find what you are not looking for.
How will continuing to use only surveys of bacteria to describe microbiomes affect or even direct the perceptions of the public, students, and researchers? Specific descriptive nouns exist for non-bacteria components of a community, such as virome for viruses and mycobiome for fungi, but the numbers of papers in which such terms appear annually pale in comparison to those describing only 'bacteria' as the microbiome, or vice versa [40,41] (Figure 1, Table 1).
It is time for the field to employ descriptive nouns that reflect the outcomes attainable by the methods used, since semantic clarity in reporting methods and findings would permit a better understanding of microbial communities in any setting. Given some communities have "become the units of evolutionary and ecological study," researchers focusing on microbial eukaryotes beyond only fungi should consider using only the term 'eukaryome' [13,15,40,41]. 'Bacteriome' and 'archaeome' are intuitive choices to define the Bacteria and Archaea components, respectively, in 'microbiome' studies. We note, however, that while the term bacteriome already defines a specialized organ in some insects, context should make the intended usage clear [42][43][44][45][46]. When a study is inclusive of all domains of the microbiome, subsets can be referred to as, e.g., "the bacterial component of the microbiome". However, when only a single domain is studied, we recommend the use of a specific term, e.g., "bacteriome".
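As a rough illustration of this naming recommendation, the rule can be written as a small lookup from the markers a study sequenced to the terms its results can support. This is a sketch under stated assumptions, not a standard: the table `MARKER_COVERAGE` and the function `suggested_terms` are hypothetical, the mapping of ITS to 'mycobiome' reflects its common use as a fungal marker, and treating fungal-only coverage as insufficient to claim the full Eucarya domain is our simplification.

```python
# Hypothetical mapping from a marker to the community components it can resolve.
# 16S rRNA genes occur only in Bacteria and Archaea; 18S targets eukaryotes;
# ITS is assumed here to resolve fungi only.
MARKER_COVERAGE = {
    "16S": ("bacteriome", "archaeome"),
    "18S": ("eukaryome",),
    "ITS": ("mycobiome",),
}

# Components that must all be covered before "microbiome" is used
# (one per recognized domain).
ALL_DOMAINS = {"bacteriome", "archaeome", "eukaryome"}


def suggested_terms(markers):
    """Return the descriptive nouns supported by the markers a study used.

    "microbiome" is returned only when all three domains are covered;
    otherwise the specific component terms are listed alphabetically.
    """
    covered = set()
    for marker in markers:
        covered.update(MARKER_COVERAGE[marker])
    if ALL_DOMAINS <= covered:
        return ["microbiome"]
    return sorted(covered)
```

Under these assumptions, a 16S-only survey yields `suggested_terms(["16S"])`, i.e. 'archaeome' and 'bacteriome', while adding an 18S survey is what licenses the broader term.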
Given that many authors have argued for rigor in use of terms in classification that describe biologically meaningful groups [18,46,47], it is surprising that "microbiome" is used so egregiously today. Use of a biologically accurate terminology avoids being positively misleading, and may drive more rapid research progress in understanding the contributions of microbes in all facets of life.