Review

Challenges in the Identification of Environmental Bacterial Isolates from a Pharmaceutical Industry Facility by 16S rRNA Gene Sequences

by Juliana Nunes Ramos 1,*, Luciana Veloso da Costa 1, Verônica Viana Vieira 2 and Marcelo Luiz Lima Brandão 1
1 Institute of Technology in Immunobiologicals, Oswaldo Cruz Foundation, Rio de Janeiro 21040-900, Brazil
2 Interdisciplinary Laboratory of Medical Research, Oswaldo Cruz Foundation, Rio de Janeiro 21040-900, Brazil
* Author to whom correspondence should be addressed.
Submission received: 15 May 2025 / Revised: 23 June 2025 / Accepted: 25 June 2025 / Published: 7 July 2025

Abstract

Microbial contamination is a critical challenge for the pharmaceutical industry, especially in thermosensitive sterile products, and can compromise their quality and safety. The accurate identification of microorganisms is essential to trace sources of contamination and adopt corrective measures. Although MALDI-TOF MS technology has revolutionized this process, its database limitations necessitate the use of complementary methods, such as sequencing 16S rRNA genes, housekeeping genes, and, in some cases, the entire genome. Advances in sequencing have expanded genomic taxonomy, increasing the accuracy of bacterial identification. The integration of these approaches significantly improves the reliability of identification, overcoming the limitations of isolated methods.

1. Background

Microbial contamination is one of the biggest obstacles facing the pharmaceutical industry, especially for heat-sensitive sterile products such as immunobiologicals that cannot be terminally sterilized [1,2]. Data from 2012 to 2019 revealed that over 50% of all drug product recalls registered by the U.S. Food and Drug Administration (FDA) were linked to microbiological issues [3,4]. This high rate of recalls suggests an insufficient understanding of contamination risks and a flawed implementation of relevant control programs. Substantial evidence, such as warning letters, alert notifications, and reported failures, indicates a direct correlation between the level of environmental control and the final product quality [4,5,6]. Microbial contamination can alter the physical and chemical properties of pharmaceutical products and excipients, affecting product quality and consumer safety. Therefore, the performance of microbiological testing is essential for the quality control of these products [2,7]. Good Manufacturing Practices (GMPs) must be followed to minimize the risk of microbial contamination in pharmaceutical production environments, ensuring that biological products meet, among other parameters, acceptable limits for microorganisms [8].
Several groups of bacteria may be present as contaminants in clean areas, but aerobic endospore-forming bacteria have been described as one of the most important groups isolated from these environments. Their spores are resistant to temperature variations and to the sanitizers used in industry, which allows them to persist in the environment for long periods [7,8]. Identifying the microorganisms detected is essential to investigate the source of contamination and, consequently, to take preventive and corrective measures [2,9,10]. Alternative microbiological methods (AMMs), which have lower detection limits and do not require culture media, have been developed to detect cultivable and non-cultivable microorganisms in intermediate and final products. Examples include Raman spectroscopy [11,12], flow cytometry, solid-phase cytometry, and bioluminescence methods based on adenosine triphosphate (ATP) [12,13], among others [14,15].
For sterility test and media failures, the identification of microbial contamination at the species level is mandatory [14,16]. For bioburden analysis, pharmaceutical industries must identify the contamination when it reaches the specification limit, generally 10 colony-forming units (CFU)/100 mL [17]. However, pharmaceutical industries must determine their own contamination control strategies and decide when identification is necessary, according to the sample and the step of the production chain [10,18]. According to Annex 1 (Manufacture of Sterile Medicinal Products) from the European Medicines Agency (EMA), microorganisms found in Grade A and B areas must be identified at the species level. The guide also recommends the identification of endospore-forming bacteria when found in Grade C and D areas [10].
Rapid microbial identification results are not merely a matter of efficiency; they are a cornerstone of effective decision-making, thorough deviation investigations, and the timely implementation of robust corrective and preventive actions. The compendial sterility test has a 14-day incubation time and is often the time-limiting step in the assessment and release process of pharmaceutical products [19]. Rapid results therefore allow manufacturers to make informed decisions promptly. This includes determining the scope of a potential contamination event, assessing the risk to batches and processes, and deciding on the necessary steps, such as product holds or recalls [19,20].
A bacterial species is characterized as a group of strains, including the type strain, that share more than 70% similarity in DNA-DNA hybridization (DDH), a maximum of 2% G+C span, values above 98.7% in 16S rRNA gene sequence identity, and distinct chemotaxonomic and phenotypic characteristics [21]. Several identification methods are used in clinical or industrial laboratories to identify bacterial species, which are described in more detail below.
Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry (MALDI-TOF MS) has been widely used to identify microorganisms contaminating pharmaceutical production environments because it offers several advantages over other identification methods, particularly in terms of speed, specificity, and sensitivity [7,8,22]. However, if the MALDI-TOF MS database fails to identify a bacterial isolate, it is necessary to sequence the 16S rRNA gene and some housekeeping genes according to the suspected species or genera, or even the whole genome, to perform genomic taxonomy analyses [7,22,23]. The biggest challenge arises when the strain is a potential new species and the genes and/or genomes of all the closest species in the genus have not been deposited in the databases, so that similarities cannot be compared and the phylogenetic tree cannot be reconstructed.
This review discusses the difficulties encountered in identifying environmental bacterial isolates from pharmaceutical facilities using MALDI-TOF MS technology, 16S rRNA and housekeeping gene sequence analyses, which often require genomic taxonomy analyses using genome sequencing data.

2. Phenotypic Identification

Phenotypic identification using commercial biochemical systems can be performed before the bacterial isolate is submitted for identification by MALDI-TOF MS or gene sequencing. The biochemical identification of environmental isolates using commercial biochemical systems such as the API® system (bioMérieux, Craponne, France) and VITEK® 2 Compact System (bioMérieux, Craponne, France) is still used in the pharmaceutical industry [1,24,25].
The API® system consists of a series of biochemical tests based on the fermentation of sugars (carbohydrates), the assimilation of other carbon sources, and the production of unique enzymes and metabolites. These tests are used to identify Gram-positive and Gram-negative bacteria and yeasts. The type of kit is selected based on colony morphology and staining results. The API profiles obtained are identified using the APIWEBTM database [26,27]. The VITEK® 2 system is used for microbial identification and antimicrobial susceptibility testing. It consists of a card filling/sealing system, an incubator/reader, coupled to a computer. Microbial identification is performed using cards containing dehydrated biochemical substrates that require no additional reagents. As with the API® system, card selection is based on colony morphology and staining results. VITEK® identification includes microbial species of clinical and industrial importance [28]. Both systems have potential limitations, such as the following: (i) they may have difficulty in determining phenotypic variation between strains; (ii) some may show different results in repeated tests; (iii) there may be limited databases; and (iv) small changes in test performance can give false results. In addition, non-fermenting bacteria can be problematic due to their phenotypic variations and slower growth rates [29].
Studies using both identification systems are scarce in the literature. Ligozzi et al. compared the identification of clinically relevant Gram-positive cocci by the VITEK® 2 system with identification by the API® Staph and API® 20 Strep systems [30]. However, it was not reported whether the isolates were previously identified by gene sequencing. Vidal carried out the phenotypic characterization of 25 isolates of Gram-positive cocci from sterile pharmaceutical products and controlled environments using the API® and VITEK® 2 systems, comparing them with the sequencing of the 16S rRNA and housekeeping genes. The API® Staph and VITEK® 2 systems correctly identified the bacterial genus of 69% and 68% of the isolates, respectively [31]. Ramos compared the phenotypic identification of 60 isolates of irregular Gram-positive rods from intravenous sites by the API® Coryne and VITEK® 2 systems with sequences of the 16S rRNA and rpoB genes. Phenotypic characterization by the API® Coryne and VITEK® 2 systems correctly identified 14.28% and 55.36% of the isolates at the species level, respectively [32].
For microorganisms isolated from the environment and from pharmaceutical products, some studies show the need for molecular methods to complete the identification of different bacterial groups. It is important to note that, although phenotypic methods often fail to provide a complete identification, the results of biochemical assays can be useful in differentiating some species [1,25]. A flowchart with the identification steps of the environmental bacterial isolates described in this study is shown in Figure 1, and the advantages and limitations of each identification method are shown in Table 1.

3. MALDI-TOF MS

MALDI-TOF MS is considered a high-throughput technology, based on the acquisition of unique molecular signatures that are representative of a wide range of proteins and can often distinguish between two closely related species. In most cases, microorganisms must first be cultured in microbiological media. There is no ideal standardized protocol for sample preparation and culture conditions for bacterial profiling. Previously, it was shown that culture conditions affected microbial physiology and protein expression profiles, but did not influence identification by MALDI-TOF MS [34,35]. However, Popovic et al. demonstrated that different culture types and conditions, sample preparation, and matrix solutions play a role in MALDI-TOF MS identification at the species level [34]. The system database contains reference spectra of microorganisms, which are compared with the mass spectrum of the microorganism to be identified to find the closest match. The available commercial systems, such as the MALDI Biotyper (Bruker Daltonics, France) and the Vitek MS (bioMérieux, France), require different sample preparations and use different databases and algorithms [36,37].
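The idea of comparing a query spectrum with reference spectra to find the closest match can be illustrated with a deliberately simplified sketch. It does not reproduce the proprietary scoring algorithms of the MALDI Biotyper or the Vitek MS; the function names, the 500 ppm tolerance, and the peak lists are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def peak_match_score(query: List[float], reference: List[float],
                     tol_ppm: float = 500.0) -> float:
    """Fraction of reference peaks (m/z) that have a query peak within tol_ppm."""
    if not reference:
        return 0.0
    matched = 0
    for ref_mz in reference:
        tolerance = ref_mz * tol_ppm / 1e6  # absolute m/z window for this peak
        if any(abs(q_mz - ref_mz) <= tolerance for q_mz in query):
            matched += 1
    return matched / len(reference)


def best_match(query: List[float],
               library: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the library entry with the highest peak-match score."""
    if not library:
        raise ValueError("the reference library is empty")
    scores = {name: peak_match_score(query, peaks)
              for name, peaks in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical peak lists (m/z values), not real spectra.
library = {
    "species A": [3000.0, 4500.0, 6200.0, 9100.0],
    "species B": [3100.0, 4800.0, 6200.0, 8700.0],
}
query = [3000.5, 4500.9, 6201.0, 8000.0]
name, score = best_match(query, library)
```

A low best score in such a scheme would correspond to the situation discussed in this review, in which no reference spectrum for the species exists in the database and sequencing is needed.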
This method has undoubtedly revolutionized the microbial identification system, reducing the time and cost of identification. Accuracy and speed in reporting results are crucial for the batch release of pharmaceutical products. It is important to note that the MALDI-TOF MS databases were initially created using mainly clinically relevant strains for species identification. The application of the technology to the identification of bacterial isolates from other sources, such as soil, industry, and freshwater, requires new databases with relevant corresponding species. Due to its rapid identification method, the technology attracted the interest of several specialist groups who expanded the MALDI-TOF MS databases. Although the cost per identification is low, the initial cost of the MALDI-TOF equipment is high, as is the maintenance of the databases [35]. Species not detected by MALDI-TOF MS, but identified by 16S rRNA gene sequences, were added to the database to study microbial communities from poorly understood locations, making this technology more widely used beyond clinical laboratories [38]. Costa et al. [7] added 24 Bacillus spp. strains and related genera from the pharmaceutical industry to the MALDI-TOF MS database, after a careful genotypic characterization of the strains using 16S rRNA and rpoB gene sequences. Miranda et al. [25] reported for the first time the isolation of six Sutcliffiella horikoshii strains from an immunobiological pharmaceutical facility that were not identified by MALDI-TOF MS. After gene sequencing, the MALDI-TOF MS database was expanded, and the strains were correctly identified as S. horikoshii by MALDI-TOF MS. In 2017, an aerobic, spore-forming Gram-positive rod isolated from an air monitoring sample from an immunobiological production unit was not identified by MALDI-TOF MS. After physiological and genotypic characterization and biochemical tests, the species Bacillus lumedeiriae was described, and the MALDI-TOF MS database was expanded with the spectral profiles of the proposed new species [23].
It is possible to increase the identification capacity of MALDI-TOF MS by including new spectral profiles in the database [18]. For this purpose, the bacterial isolate must first be identified at the genus level by an analysis of the 16S ribosomal gene sequence, since this gene has limitations in differentiating species. Sequences of genes encoding highly conserved proteins, called housekeeping genes, can be used in conjunction with 16S rRNA gene sequences to try to reach the species level [7,33]. Although MALDI-TOF MS accurately identifies the most common clinical organisms, it has limitations for identifying environmental bacteria and distinguishing closely related species due to the intrinsic similarity between these microorganisms. In these situations, identification is often only possible at the group, complex, or genus level. When differentiation at the species level is relevant, supplementary tests, such as genetic sequencing, are necessary. Database updates, the creation of user-specific libraries, and complementary methods such as sequencing are effective strategies for overcoming these limitations [39,40]. It is important to mention that there is no public database of environmental bacterial isolates for comparing spectra obtained with MALDI-TOF MS, which makes bacterial identification and the exchange of information between researchers and institutions difficult. For highly pathogenic bacteria (biosafety level 3), such as Bacillus anthracis, Yersinia pestis, Burkholderia mallei, Burkholderia pseudomallei and Francisella tularensis, as well as their related species, the Robert Koch Institute (Germany) has developed a database of MALDI-TOF MS mass spectra, which serves as a reference for diagnosing these bacteria using microbial identification software. The spectra are available in a zip file containing the original mass spectra in the data format used by Bruker Daltonics [41].

4. Analysis of the 16S Ribosomal Gene Sequences

The sequencing of the 16S ribosomal gene is widely used in bacterial identification [42]. The 16S rRNA gene is approximately 1500 base pairs (bp) in size, although some species may have shorter or longer sequences, and it encodes the RNA component of the small subunit (30S) of the prokaryotic ribosome [43]. The gene does not encode a protein; its product has a structural role and is crucial for protein synthesis. Although rare, the horizontal transfer of the 16S rRNA gene can also occur, but only at the intragenus or intraspecific level [44,45]. The 16S rRNA gene is considered an important molecular marker: it is present in all members of Bacteria and Archaea, is highly conserved, and evolves slowly, which makes it a widely used target for phylogenetic studies of bacteria. Its sequences are therefore widely used for taxonomic classification at the genus level. The presence of multiple copies with sequence differences in the bacterial genome and the low polymorphism of the gene are limitations that must be considered for classification at the species level [46]. The 16S rRNA gene contains conserved regions interspersed with nine hypervariable regions, designated V1 to V9, that are unevenly distributed and vary in length, position, and taxonomic discrimination. Such variation is conducive to inferring phylogenetic relationships between phyla and is also used in comparisons between taxa of interest [44,45,47].
Although considered the gold standard for bacterial identification, the amplification and sequencing of the 16S rRNA gene is still costly and impractical in the routine of some laboratories [48]. For more accurate molecular identification, the complete sequencing of the gene (~1500 bp) is required [46]. The 16S rRNA gene sequences should be compared against available databases using tools such as the Basic Local Alignment Search Tool (BLAST, https://blast.ncbi.nlm.nih.gov/Blast.cgi) or EzBioCloud (https://www.ezbiocloud.net/). If a new bacterial species is suspected, it is important to limit the comparison of similarities to the sequences of type strains [49]. Previously, to be considered members of the same species, bacterial isolates had to share more than 97% similarity in the 16S rRNA gene, a value based on its relationship with 70% DDH, which is considered the gold standard method for delimiting bacterial species [50]. According to Stackebrandt & Ebers (2006) [33], the similarity criterion for the 16S rRNA gene should be greater than 98.7%. Identity values in the 16S rRNA gene sequence below 95% with the phylogenetically closest species with a validated name may indicate that the isolate is representative of a new genus [51]. Although they provide valuable phylogenetic information, 16S ribosomal gene sequences are not always useful for distinguishing closely related species due to their highly conserved nature. When several species of the same genus share >98.7% identity in the 16S gene sequence, the sequencing of other genes, called essential or constitutive genes, is recommended [23,43].
Bacillus and related genera are among the most common groups of bacteria found in pharmaceutical industry environments, and 16S rRNA gene sequencing is not accurate enough to identify these organisms at the species level [7,8]. This also occurs with other groups of bacteria found in pharmaceutical industries, such as Acinetobacter spp., which are commonly found in water samples [52]; Staphylococcus spp., which are found in environmental monitoring and bioburden analysis, generally due to operator contamination [18,53,54]; and the Burkholderia cepacia complex, a significant contaminant of water used in the pharmaceutical industry. Recalls of products contaminated by species of this complex can result in the loss of customers, production batches, and equipment and even harm the health of patients or the community [55,56].
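The 16S rRNA identity thresholds cited above (95% and 98.7%) can be expressed as a simple decision rule. The following is a minimal sketch, not a validated identification procedure: the function name and the wording of the returned labels are assumptions, and the identity value is expected to come from a comparison against the closest type strain.

```python
def interpret_16s_identity(identity_pct: float,
                           species_cutoff: float = 98.7,
                           genus_cutoff: float = 95.0) -> str:
    """Interpret 16S rRNA identity (%) to the closest type strain.

    Cutoffs follow the values cited in the text; the labels are illustrative.
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if identity_pct < genus_cutoff:
        # Below 95%: the isolate may represent a new genus.
        return "possible new genus"
    if identity_pct <= species_cutoff:
        # Between 95% and 98.7%: likely a different species of the genus.
        return "possible new species"
    # Above 98.7%: 16S alone cannot resolve the species.
    return "species unresolved; sequence housekeeping genes"


results = [interpret_16s_identity(v) for v in (93.2, 97.5, 99.6)]
```

The last branch reflects the limitation discussed in this section: above 98.7% identity, several species may be equally close, so housekeeping genes or genome data are needed.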

5. Analysis of the Housekeeping Gene Sequences

Housekeeping genes, also called essential genes, encode proteins that are essential for maintaining cellular function. Examples include the recA gene, which encodes a recombinase protein; rpoB, which encodes the beta subunit of RNA polymerase; and gyrB, which encodes the B subunit of the DNA gyrase protein. Housekeeping genes are highly conserved but accumulate mutations more rapidly than rRNA genes, making them useful for differentiating bacteria at the species level, since the 16S rRNA gene shows high genetic homology within certain genera and may not be useful alone for distinguishing closely related species [42,45,51,57,58]. Bacillus and related genera are among the main bacterial groups isolated from pharmaceutical manufacturing environments [7]. Some Bacillus species have highly similar 16S rRNA gene sequences. For the newly described species B. lumedeiriae, isolated from an air monitoring sample from a Brazilian immunobiological production facility, genes such as rpoB and gyrB were used to help identify the species [7,23]. The genus Burkholderia has been recognized as a contaminant in water used in the pharmaceutical and hospital sectors, as well as being present in medicines and cosmetics [55]. The sequencing of the recA gene has significantly improved the identification of Burkholderia species [58]. In another study aimed at developing electrospun nanofibers as a vaginal release system for probiotics, strains of Lactobacillus gasseri, Lactobacillus crispatus, and Lactobacillus jensenii were identified by PCR amplification of partial sequences of the chaperonin (cpn60), GTP-binding protein (lepA), and transketolase subunit A genes, respectively [59].
The analysis of the sequences of a certain number of housekeeping genes, called multilocus sequence analysis (MLSA), incorporates similarity values to differentiate species, and is considered a phylogenetic tool to support and clarify the delimitation of bacterial species with a higher resolution than studies based on 16S rRNA genes. The gene sequences are used to construct a phylogenetic tree to infer phylogenies [60,61]. MLSA is based on multilocus sequence typing (MLST), a method for typing pathogenic bacteria for epidemiological and population genetic purposes, first described by Maiden et al., 1998 [61,62].
The choice of genes for each taxon analyzed is critical to the reliability of the analysis. Housekeeping genes should be considered because they are more stable with respect to rapid genetic change and are present in all species of a genus. It is also recommended that they be single-copy genes distributed throughout the entire genome. The ad hoc Committee recommends the use of at least five housekeeping genes for the re-evaluation of the species definition in bacteriology, although most studies use seven genes. Some species require the use of more genes for better differentiation [61,63,64]. Before amplifying and sequencing the selected housekeeping genes for comparison and construction of a phylogenetic tree, it is very important to consult the different sequence databases, such as the National Center for Biotechnology Information (NCBI, https://www.ncbi.nlm.nih.gov/), the European Bioinformatics Institute of the European Molecular Biology Laboratory (EMBL-EBI, https://www.ebi.ac.uk/), the DNA Data Bank of Japan (DDBJ, https://www.ddbj.nig.ac.jp/index-e.html), the Bacterial Diversity Metadatabase (BacDive, https://bacdive.dsmz.de/), and the Joint Genome Institute (JGI, https://genome.jgi.doe.gov/portal/), among others. If the sequences of the housekeeping genes of interest or the complete genome of the closest type strain are not available, an alternative is to purchase the type strain from a culture collection, such as the American Type Culture Collection (ATCC) or the Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ), and sequence it in-house. Some sequences may be in the possession of researchers who have not yet published their data, and contacting these researchers may also be an alternative.
Gene sequencing for MLSA involves time-consuming and laborious steps. With the advancement of high-throughput sequencing and bioinformatics tools in recent years, MLSA can be performed in silico due to the substantial increase of whole-genome data in public databases, allowing gene sequences to be extracted directly from genomes [64].
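As a rough illustration of the in silico MLSA workflow, the sketch below concatenates already-aligned housekeeping gene sequences per strain and computes a pairwise identity. It assumes the loci have been extracted from the genomes and aligned beforehand with an external tool; the function names and the toy sequences are hypothetical, and real analyses would normally go on to build a phylogenetic tree.

```python
from typing import Dict, List


def concatenate_loci(alignments: Dict[str, Dict[str, str]],
                     loci: List[str]) -> Dict[str, str]:
    """Concatenate aligned loci for every strain present in all loci.

    alignments[locus][strain] holds the aligned sequence of that locus.
    """
    if not loci:
        raise ValueError("at least one locus is required")
    shared = set.intersection(*(set(alignments[locus]) for locus in loci))
    return {strain: "".join(alignments[locus][strain] for locus in loci)
            for strain in sorted(shared)}


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        matches += base_a == base_b
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments; real loci are hundreds of bases long.
alignments = {
    "rpoB": {"isolate": "ATGCA", "type_strain": "ATGCT"},
    "gyrB": {"isolate": "GG-CC", "type_strain": "GGACC"},
}
concatenated = concatenate_loci(alignments, ["rpoB", "gyrB"])
identity = pairwise_identity(concatenated["isolate"], concatenated["type_strain"])
```

Skipping gap columns is one possible convention; other tools count gaps as mismatches, which lowers the reported identity.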

6. Genomic Taxonomy Tools

Improvements in DNA sequencing technologies have resulted in a significant increase in the amount of genomic data generated, combined with a reduction in the cost of sequencing [60]. Whole-genome sequencing (WGS) data and the development of bioinformatics tools have allowed the establishment of taxonomic schemes based on evolutionary information contained in genome sequences, such as digital DNA-DNA hybridization (dDDH), Average Amino Acid Identity (AAI), supertrees, and others [65]. Such taxonomic schemes, which are described in more detail below, can be used not only to describe a new bacterial species but also to confirm the identification of a bacterial isolate. This is especially useful for environmental isolates belonging to the genus Bacillus and/or related genera, which include a large number of described species and for which genotypic characterization by sequencing the 16S rRNA and housekeeping genes is not always possible.
Genomic taxonomy is based on an integrated comparative genomics approach whose main goal is to extract taxonomic information from genomes that can be used to provide a solid framework for identifying and classifying prokaryotic species and even populations. The tools mentioned above have led to a new understanding of genetic relationships that the 16S rRNA gene can only approximate [21,65,66].
DDH indirectly measures the degree of genetic similarity between two genomes, one of which is the genome of the type strain, and has been the “gold standard” for bacterial species delimitation [67]. In brief, heated DNA strands are dissociated and immediately reassociated, and hybridization occurs. The degree of relationship between the two genomes is verified, and the two genomes are considered to belong to the same species if the DDH value is greater than 70% [60,64]. However, this value is not sufficient to distinguish, for example, the species Rickettsia rickettsii, Rickettsia conorii, Rickettsia sibirica, and Rickettsia montanensis; in other words, the DDH limit used is not applicable to all genera [51]. Few laboratories in the world perform this methodology because it is a laborious, slow, and expensive technique that requires specialized personnel. Another disadvantage is that the results may vary depending on the protocol used, which can lead to experimental errors, and a comparison of results obtained by different methodologies is not recommended [50,51,64].
With the advent of high-throughput DNA sequencing and the various genomes deposited in public repositories, dDDH was then proposed [60,68]. Some authors suggested replacing DDH as the “gold standard” in prokaryotic taxonomy by pairwise genomic sequence-derived similarity [68,69]. The analysis of dDDH consists of the local alignment between two genomes, and intergenomic correspondences are generated, which are later used to calculate the distance matrix, whose values are analogous to DDH (>70%) [64]. The Genome-to-Genome Distance Calculator (GGDC) is one of the most popular online tools for calculating in silico DDH values, provided free of charge by the German bacterial collection DSMZ. The GGDC allows a comparison of bacterial genomes to determine their similarity on the same scale as DDH, aiding in the identification and classification of bacterial species. Moreover, the GGDC reports the difference in G+C content, which can be used for species delineation [69,70].
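The difference in G+C content mentioned above is straightforward to compute once two genome sequences are available. The sketch below is a minimal illustration with hypothetical function names; it ignores ambiguous bases such as N, which is an assumption rather than a documented GGDC behavior.

```python
def gc_content(sequence: str) -> float:
    """G+C content (%) of a nucleotide sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def gc_difference(genome_a: str, genome_b: str) -> float:
    """Absolute difference in G+C content (percentage points)."""
    return abs(gc_content(genome_a) - gc_content(genome_b))


# Toy sequences standing in for concatenated genome contigs.
difference = gc_difference("GGCCAATT", "GGCCGGAT")
```

The resulting value can then be compared with the G+C span accepted for strains of one species, such as the 2% maximum quoted in the species definition in the Background section.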
Over the years, there has been an increasing use of the Average Nucleotide Identity (ANI) tool for the classification and identification of bacterial species [71,72], which shows a strong correlation with DDH [73]. ANI was proposed in 2005 and consisted of the local alignment between two genomes, calculating the average identity of the nucleotides of the shared open reading frames (ORFs) instead of the whole genome [69]. In 2007, a comparison between two whole genomes was implemented, allowing ANI results to be directly compared with DDH [74]. Two prokaryotic genomes can be considered to belong to the same species if they share an ANI value ≥ 96% [51]. Furthermore, ANI shows a strong correlation with 16S rRNA gene sequence similarity [75]. Although it is a robust tool for bacterial species delimitation, ANI has been shown to have low resolution at higher taxonomic levels. Therefore, other metrics should be used, such as AAI [76,77], another widely used tool for bacterial species classification and identification. AAI is based on the conserved protein-coding genes shared by a pair of genomes, determined by a pairwise comparison of whole genomes using the BLAST algorithm. Briefly, all protein-coding genes in one genome are searched against all protein-coding genes in the other genome, and the average amino acid identity of all conserved genes between the pair is calculated as a measure of their genetic relatedness. A value of ~95% to 96% has been established for species delimitation, with a strong correlation to the similarity of the 16S rRNA gene sequence [64,65,78]. AAI has also been shown to be a very useful metric for genus delimitation (>60–65%) [75]. Based on the discontinuous distribution of AAI values, several studies have developed boundaries to delimit genera, such as Chryseobacterium [79], Prochlorococcus [80] and Lactobacillus [81].
Complex phylogenetic relationships between different taxa can be captured by genome-based phylogenies, or phylogenomics. In 2019, the online platform Type Strain Genome Server (TYGS) was developed by the German collection DSMZ for the classification and identification of microorganisms based on their genomes, combining genomic data with robust phylogenetic approaches to provide accurate and reproducible classification. The TYGS contains a comprehensive database that is constantly updated and revised to ensure accuracy and reliability and is maintained by the Leibniz Institute/DSMZ. This server allows the analysis of dDDH indices and the construction of phylogenies based on the 16S rRNA gene and the Genome BLAST Distance Phylogeny (GBDP) method, using as a reference the closest phylogenetic matches previously identified by Mash genomic distances and 16S rRNA gene data. In addition, the TYGS is integrated with a database that powers the List of Prokaryotic Names with Standing in Nomenclature (LPSN), providing information on the most recent updates in the nomenclature and taxonomic literature. The platform also performs a preliminary species-level classification to assist in the identification of possible new species [82,83].
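The species and genus thresholds discussed in this section (dDDH > 70%, ANI ≥ 96%, AAI > 60–65%) can be combined into a simple interpretation step once the values have been obtained from tools such as the GGDC. The sketch below is only an illustration: the class, the function, and the returned labels are hypothetical, and the AAI genus cutoff is set to the lower bound of the cited range as an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenomeComparison:
    """Pairwise metrics (%) between an isolate and a type-strain genome."""
    ani: Optional[float] = None
    ddh: Optional[float] = None  # digital DDH estimate
    aai: Optional[float] = None


def interpret(comparison: GenomeComparison,
              ani_species: float = 96.0,
              ddh_species: float = 70.0,
              aai_genus: float = 60.0) -> str:
    """Apply the thresholds cited in the text; labels are illustrative."""
    votes = []
    if comparison.ani is not None:
        votes.append(comparison.ani >= ani_species)
    if comparison.ddh is not None:
        votes.append(comparison.ddh > ddh_species)

    if not votes and comparison.aai is None:
        return "insufficient data"
    if votes and all(votes):
        return "same species"
    if votes and any(votes):
        return "conflicting species-level evidence"
    # Species-level metrics are absent or all below their cutoffs.
    if comparison.aai is None:
        return "different species"
    return "same genus" if comparison.aai > aai_genus else "different genus"


verdict = interpret(GenomeComparison(ani=97.5, ddh=75.0))
```

Because the DDH limit is not applicable to all genera, as noted above for Rickettsia, such cutoffs are better exposed as parameters than hard-coded.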
Another computational tool that supports the phylogenomic classification of bacteria based on genomic data is the Genome Taxonomy Database Toolkit (GTDB-Tk), which compares genomes to the Genome Taxonomy Database (GTDB) taxonomy using a pre-calculated phylogenomic tree and metrics such as ANI and Relative Evolutionary Divergence (RED). The GTDB-Tk allows the assignment of taxa to new genomes, facilitating the identification of new species and the reclassification of organisms within a standardized phylogenomic taxonomy. The GTDB-Tk is an independent tool linked to the GTDB that inserts the genome into a previously calculated multigene-based phylogeny (MBP) phylogenomic tree, calculates RED indices, and performs species assignment based on ANI, when viable. This process allows the identification of new taxa, both at the species level and in broader taxonomic categories. The GTDB and backbone trees are updated regularly, undergoing, for example, a major annual revision [82,84].
Another approach worth mentioning is Ribosomal Multilocus Sequence Typing (rMLST, https://pubmlst.org/species-id), which was proposed to overcome the limitations of current methods for bacterial typing and phylogenetic reconstruction by using 53 ribosomal genes (rps genes) that are present in all bacteria and distributed along bacterial chromosomes. They also encode highly conserved ribosomal proteins. This technique provides greater phylogenetic resolution than traditional methods, such as using the 16S rRNA gene, and allows bacteria to be accurately classified at multiple taxonomic levels, including the domain, phylum, class, order, family, genus, and species. The rMLST database is an extensible, web-accessible database containing complete genomic data from thousands of bacterial isolates, enabling the rapid and computationally efficient identification of the phylogenetic position of any bacterial sequence at multiple taxonomic levels [85].

7. Conclusions and Future Directions

Many laboratories in the pharmaceutical industry use MALDI-TOF MS for bacterial identification because of its rapid results and low cost per analysis, which are essential for the batch release of pharmaceutical products. However, the equipment requires a high initial investment, and keeping the database up to date adds further costs. Here, we suggest some methods for the identification of environmental bacteria, from MALDI-TOF MS to genomic taxonomy analysis (Figure 1), and summarize the advantages and disadvantages of each method (Table 1). MALDI-TOF MS databases initially focused on clinically relevant strains, but their application has expanded to environmental and industrial samples, which requires new reference spectra. Species that MALDI-TOF MS could not identify have been added to the database after identification by 16S rRNA and housekeeping gene sequencing, and adding further spectral profiles continues to improve identification.
Comparison of 16S rRNA gene sequences is the most commonly used method to assess the phylogenetic position of a prokaryote. The gene is widely used for bacterial identification because it is universally present in Bacteria and Archaea, highly conserved, and slow to evolve, which makes it an essential marker for phylogenetic studies. However, nucleotide variation among the multiple rRNA operons of a single genome, and the possibility that 16S rRNA genes were acquired by horizontal gene transfer, can distort relationships between taxa in phylogenetic trees. Moreover, because of its high conservation, the 16S rRNA gene does not always distinguish closely related species; its phylogeny is robust at the genus level and above. The analysis therefore needs to be complemented by the sequencing of housekeeping genes, which have been used in MLSA as an alternative for a more accurate identification of microorganisms.
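As a minimal sketch of how 16S rRNA gene similarity is typically expressed, the Python function below computes the percent identity between two sequences that have already been aligned, skipping any column that contains a gap. The default cutoff of 98.7% is an assumption taken from the range proposed by Stackebrandt and Ebers [33]; the function names and the gap-handling convention are ours, and other tools may count gaps differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def needs_further_analysis(identity: float, cutoff: float = 98.7) -> bool:
    """True when 16S identity is too high to separate two species reliably.

    The cutoff is an assumed value; adjust it to the criterion in use.
    """
    return identity >= cutoff
```

An identity at or above the cutoff does not prove that two isolates belong to the same species; it only indicates that the 16S rRNA gene cannot separate them and that housekeeping genes or genome-based metrics are needed.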
Advances in DNA sequencing technologies have increased the availability of genomic data and reduced their cost, allowing the creation of taxonomic schemes based on the in silico analysis of genomes. Genomic taxonomy uses metrics such as dDDH, ANI, and AAI to identify and classify bacterial species. The traditional DDH method, previously considered the gold standard for species delimitation, has been replaced by computational methods such as the GGDC, which calculates the genomic similarity between organisms. ANI, the average nucleotide identity between two genomes, is widely used to delimit species at values ≥96%, while AAI, based on the identity of conserved proteins, helps to delimit genera (>60–65%). In addition, phylogenomic tools such as the TYGS and GTDB-Tk provide robust approaches for microbial classification, using continuously updated databases and phylogenetic analyses based on multiple genes. Beyond the bioinformatic tools described above, we also suggest the rMLST approach, which is supported by a web-accessible database hosted by PubMLST. The approach analyzes 53 highly conserved ribosomal genes to identify and classify a wide range of bacterial species and can be applied to poorly characterized or undescribed species. The rMLST database is compatible with whole-genome sequencing data and metagenomes, facilitating the analysis of complex microbial communities.
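The thresholds mentioned above can be combined into a simple decision rule. The following Python sketch, given ANI and AAI values already computed by an external tool, applies the ≥96% ANI species criterion and treats the 60–65% AAI range as an ambiguous zone around the genus boundary. The function name, the returned labels and the handling of the ambiguous zone are illustrative assumptions, not a published algorithm.

```python
def interpret_pair(ani: float, aai: float,
                   ani_species: float = 96.0,
                   aai_low: float = 60.0,
                   aai_high: float = 65.0) -> str:
    """Interpret pairwise ANI and AAI values (both in percent).

    Cutoffs default to the values cited in the text; the 60-65% AAI range
    is reported as ambiguous rather than forced into one category.
    """
    if ani >= ani_species:
        return "same species"
    if aai > aai_high:
        return "same genus, different species"
    if aai > aai_low:
        return "genus boundary zone"
    return "different genera"
```

A result in the boundary zone, or a conflict between metrics, is a signal to consult phylogenomic tools such as the TYGS or GTDB-Tk before assigning a name.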
Approaches that combine multiple analyses and metrics are strongly encouraged as they provide greater accuracy in bacterial identification and overcome the limitations of 16S rRNA gene sequencing in distinguishing closely related species.

Author Contributions

Conceptualization, M.L.L.B.; formal analysis, J.N.R., L.V.d.C., V.V.V. and M.L.L.B.; data curation, L.V.d.C. and M.L.L.B.; writing (original draft preparation), J.N.R.; writing (review and editing), J.N.R., L.V.d.C., V.V.V. and M.L.L.B.; supervision, M.L.L.B. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro—FAPERJ (E-26/200.546/2025). The funding body played no role in the design of the study and in writing the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AAI: Average Amino Acid Identity
ANI: Average Nucleotide Identity
ATCC: American Type Culture Collection
BacDive: Bacterial Diversity Metadatabase
BLAST: Basic Local Alignment Search Tool
bp: Base pairs
CFU: Colony-Forming Unit
DDBJ: DNA Data Bank of Japan
dDDH: Digital DNA-DNA Hybridization
DDH: DNA-DNA Hybridization
DSMZ: Deutsche Sammlung von Mikroorganismen und Zellkulturen
EMA: European Medicines Agency
EMBL-EBI: European Molecular Biology Laboratory, European Bioinformatics Institute
FDA: Food and Drug Administration
G+C: Guanine and cytosine
GBDP: Genome BLAST Distance Phylogeny
GGDC: Genome-to-Genome Distance Calculator
GMPs: Good Manufacturing Practices
GTDB: Genome Taxonomy Database
GTDB-Tk: Genome Taxonomy Database Toolkit
JGI: Joint Genome Institute
LPSN: List of Prokaryotic Names with Standing in Nomenclature
MALDI-TOF MS: Matrix-Assisted Laser Desorption Ionization–Time of Flight/Mass Spectrometry
MBP: Multigene-Based Phylogenies
MLSA: Multilocus Sequence Analysis
MLST: Multilocus Sequence Typing
NCBI: National Center for Biotechnology Information
ORFs: Open Reading Frames
PCR: Polymerase Chain Reaction
RED: Relative Evolutionary Divergence
rMLST: Ribosomal Multilocus Sequence Typing
rRNA: Ribosomal RNA
TYGS: Type Strain Genome Server
WGS: Whole-Genome Sequencing

References

  1. da Costa, L.V.; da Silva Lage de Miranda, R.V.; da Fonseca, E.L.; Gonçalves, N.P.; dos Reis, C.M.F.; Frazão, A.M.; Cruz, F.V.; Brandão, M.L.L.; Ramos, J.N.; Vieira, V.V. Assessment of VITEK® 2, MALDI-TOF MS and Full Gene 16S rRNA Sequencing for Aerobic Endospore-Forming Bacteria Isolated from a Pharmaceutical Facility. J. Microbiol. Methods 2022, 194, 106419.
  2. United States Pharmacopeial Convention. The United States Pharmacopeia, 43rd ed.; United States Pharmacopeial Convention: Rockville, MD, USA, 2021.
  3. Santos, A.M.C.; Doria, M.S.; Meirinhos-Soares, L.; Almeida, A.J.; Menezes, J.C. A QRM Discussion of Microbial Contamination of Non-Sterile Drug Products, Using FDA and EMA Warning Letters Recorded between 2008 and 2016. PDA J. Pharm. Sci. Technol. 2018, 72, 62–72.
  4. Jimenez, L. Microbial Diversity in Pharmaceutical Product Recalls and Environments. PDA J. Pharm. Sci. Technol. 2007, 61, 383–399.
  5. Sandle, T. Review of FDA Warning Letters for Microbial Bioburden Issues (2001–2011). Pharma Times 2012, 44, 29–30.
  6. Jain, S.K.; Jain, R.K. Review of FDA Warning Letters to Pharmaceuticals: Cause and Effect Analysis. Res. J. Pharm. Technol. 2018, 11, 3219.
  7. da Costa, L.V.; da Silva Lage de Miranda, R.V.; dos Reis, C.M.F.; de Andrade, J.M.; Cruz, F.V.; Frazão, A.M.; da Fonseca, E.L.; Ramos, J.N.; Brandão, M.L.L.; Vieira, V.V. MALDI-TOF MS Database Expansion for Identification of Bacillus and Related Genera Isolated from a Pharmaceutical Facility. J. Microbiol. Methods 2022, 203, 106625.
  8. Caldeira, N.G.S.; de Souza, M.L.S.; da Silva Lage de Miranda, R.V.; da Costa, L.V.; Forsythe, S.J.; Zahner, V.; Brandão, M.L.L. Characterization by MALDI-TOF MS and 16S rRNA Gene Sequencing of Aerobic Endospore-Forming Bacteria Isolated from Pharmaceutical Facility in Rio de Janeiro, Brazil. Microorganisms 2024, 12, 724.
  9. Food and Drug Administration (FDA). Guidance for Industry Sterile Drug Products Produced by Aseptic Processing – Current Good Manufacturing Practice; FDA: Silver Spring, MD, USA, 2004.
  10. European Medicines Agency. The Rules Governing Medicinal Products in the European Union. In European Union Guidelines for Good Manufacturing Practice for Medicinal Products for Human and Veterinary Use; Annex 1: Manufacture of Sterile Medicinal Products; European Medicines Agency: Amsterdam, The Netherlands, 2022.
  11. Maruthamuthu, M.K.; Raffiee, A.H.; De Oliveira, D.M.; Ardekani, A.M.; Verma, M.S. Raman Spectra-Based Deep Learning: A Tool to Identify Microbial Contamination. Microbiologyopen 2020, 9, e1122.
  12. Masucci, E.M.; Hauschild, J.E.; Gisler, H.M.; Lester, E.M.; Balss, K.M. Raman Spectroscopy as an Alternative Rapid Microbial Bioburden Test Method for Continuous, Automated Detection of Contamination in Biopharmaceutical Drug Substance Manufacturing. J. Appl. Microbiol. 2024, 135, lxae188.
  13. Guest, M.; Pickard, B.; Smith, B.; Drinkwater, S. The Use of Amplified ATP Bioluminescence for Rapid Sterility Testing of Drug Product Formulations. PDA J. Pharm. Sci. Technol. 2023, 77, 402–411.
  14. Brazilian Health Regulatory Agency. Brazilian Pharmacopeia, 7th ed.; Brazilian Health Regulatory Agency: Brasilia, Brazil, 2024; Volume I.
  15. European Directorate for the Quality of Medicines & HealthCare (EDQM). European Pharmacopoeia, 11th ed.; Council of Europe: Strasbourg, France, 2023.
  16. United States Pharmacopeial Convention. The United States Pharmacopeia, 1st ed.; United States Pharmacopeial Convention Inc: Rockville, MD, USA, 2024.
  17. European Medicines Agency. Guideline on the Sterilisation of the Medicinal Product, Active Substance, Excipient and Primary Container; EMA/CHMP/CVMP/QWP/850374/2015; European Medicines Agency: Amsterdam, The Netherlands, 2019; pp. 1–25.
  18. Mattoso, J.M.V.; Costa, L.V.; Vale, B.A.; Reis, C.M.F.; Andrade, J.M.; Braga, L.M.P.S.; Conceição, G.M.S.; Costa, P.B.M.; Silva, I.B.; Rodrigues, L.A.P.; et al. Quantitative and Qualitative Evaluation of Microorganism Profile Identified in Bioburden Analysis in a Biopharmaceutical Facility in Brazil: Criteria for Classification and Management of Results. PDA J. Pharm. Sci. Technol. 2024, 79, 125–156.
  19. Deutschmann, S.; Paul, M.; Claassen-Willemse, M.; van den Berg, J.; IJzerman-Boon, P.; Grunert da Fonseca, V.; Brunbech, E.; Johnson, L.; Knutsen, C.; Plourde, L.; et al. Rapid Sterility Test Systems in the Pharmaceutical Industry: Applying a Structured Approach to Their Evaluation, Validation and Global Implementation. PDA J. Pharm. Sci. Technol. 2023, 77, 211–235.
  20. Stamatoski, B.; Ilievska, M.; Babunovska, H.; Sekulovski, N.; Panov, S. Optimized Genotyping Method for Identification of Bacterial Contaminants in Pharmaceutical Industry. Acta Pharm. 2016, 66, 289–295.
  21. Thompson, C.C.; Vidal, L.; Salazar, V.; Swings, J.; Thompson, F.L. Microbial Genomic Taxonomy. In Trends in the Systematics of Bacteria and Fungi; CABI: Wallingford, UK, 2021; pp. 168–178.
  22. Moreira, F.M.; de Araujo Pereira, P.; da Silva Lage de Miranda, R.V.; dos Reis, C.M.F.; da Silva Braga, L.M.P.; de Andrade, J.M.; do Nascimento, L.G.; Mattoso, J.M.V.; Forsythe, S.J.; da Costa, L.V.; et al. Evaluation of MALDI-TOF MS, Sequencing of D2 LSU rRNA and Internal Transcribed Spacer Regions (ITS) for the Identification of Filamentous Fungi Isolated from a Pharmaceutical Facility. J. Pharm. Biomed. Anal. 2023, 234, 115531.
  23. da Costa, L.V.; Ramos, J.N.; de Sousa Albuquerque, L.; da Silva Lage de Miranda, R.V.; Valadão, T.B.; Veras, J.F.C.; Vieira, E.M.D.; Forsythe, S.; Brandão, M.L.L.; Vieira, V.V. Bacillus lumedeiriae sp. nov., a Gram-Positive, Spore-Forming Rod Isolated from a Pharmaceutical Facility Production Environment and Added to the MALDI Biotyper® Database. Microorganisms 2024, 12, 2507.
  24. Obasi, A.; Nwachukwu, S.; Ugoji, E.; Kohler, C.; Göhler, A.; Balau, V.; Pfeifer, Y.; Steinmetz, I. Extended-Spectrum β-Lactamase-Producing Klebsiella pneumoniae from Pharmaceutical Wastewaters in South-Western Nigeria. Microb. Drug Resist. 2017, 23, 1013–1018.
  25. da Silva Lage de Miranda, R.V.; da Costa, L.V.; de Sousa Albuquerque, L.; dos Reis, C.M.F.; da Silva Braga, L.M.P.; de Andrade, J.M.; Ramos, J.N.; Mattoso, J.M.V.; Forsythe, S.J.; Brandão, M.L.L. Identification of Sutcliffiella horikoshii Strains in an Immunobiological Pharmaceutical Industry Facility. Lett. Appl. Microbiol. 2023, 76, ovad056.
  26. Sala-Comorera, L.; Vilaró, C.; Galofré, B.; Blanch, A.R.; García-Aljaro, C. Use of Matrix-Assisted Laser Desorption/Ionization–Time of Flight (MALDI–TOF) Mass Spectrometry for Bacterial Monitoring in Routine Analysis at a Drinking Water Treatment Plant. Int. J. Hyg. Environ. Health 2016, 219, 577–584.
  27. Vithanage, N.R.; Yeager, T.R.; Jadhav, S.R.; Palombo, E.A.; Datta, N. Comparison of Identification Systems for Psychrotrophic Bacteria Isolated from Raw Bovine Milk. Int. J. Food Microbiol. 2014, 189, 26–38.
  28. Biomérieux. VITEK® 2. Fully Integrated Identification and Antimicrobial Susceptibility Testing; Biomérieux: Marcy-l’Étoile, France, 2025.
  29. Bosshard, P.P.; Zbinden, R.; Abels, S.; Böddinghaus, B.; Altwegg, M.; Böttger, E.C. 16S rRNA Gene Sequencing versus the API 20 NE System and the VITEK 2 ID-GNB Card for Identification of Nonfermenting Gram-Negative Bacteria in the Clinical Laboratory. J. Clin. Microbiol. 2006, 44, 1359–1366.
  30. Ligozzi, M.; Bernini, C.; Bonora, M.G.; De Fatima, M.; Zuliani, J.; Fontana, R. Evaluation of the VITEK 2 System for Identification and Antimicrobial Susceptibility Testing of Medically Relevant Gram-Positive Cocci. J. Clin. Microbiol. 2002, 40, 1681–1686.
  31. Vidal, L. Caracterização de Cocos Gram Positivos Provenientes de Análises Microbiológicas de Produtos Farmacêuticos Estéreis Realizadas No INCQS/FIOCRUZ; Oswaldo Cruz Foundation: Rio de Janeiro, Brazil, 2013.
  32. Ramos, J.N. Caracterização de Estirpes Sugestivas de Corinebactérias Isolados de Sítios Intravenosos; Oswaldo Cruz Foundation: Rio de Janeiro, Brazil, 2014.
  33. Stackebrandt, E.; Ebers, J. Taxonomic Parameters Revisited: Tarnished Gold Standards. Microb. Today 2006, 33, 152. Available online: https://www.scienceopen.com/document?vid=0cf4b084-5683-4ef4-a80c-c0df44a135dc (accessed on 17 March 2025).
  34. Topić Popović, N.; Kazazić, S.P.; Bojanić, K.; Strunjak-Perović, I.; Čož-Rakovac, R. Sample Preparation and Culture Condition Effects on MALDI-TOF MS Identification of Bacteria: A Review. Mass Spectrom. Rev. 2023, 42, 1589–1603.
  35. Singhal, N.; Kumar, M.; Kanaujia, P.K.; Virdi, J.S. MALDI-TOF Mass Spectrometry: An Emerging Technology for Microbial Identification and Diagnosis. Front. Microbiol. 2015, 6, 791.
  36. Seuylemezian, A.; Aronson, H.S.; Tan, J.; Lin, M.; Schubert, W.; Vaishampayan, P. Development of a Custom MALDI-TOF MS Database for Species-Level Identification of Bacterial Isolates Collected From Spacecraft and Associated Surfaces. Front. Microbiol. 2018, 9, 780.
  37. Zasada, A.A.; Mosiej, E. Contemporary Microbiology and Identification of Corynebacteria spp. Causing Infections in Human. Lett. Appl. Microbiol. 2018, 66, 472–483.
  38. Shah, H.N.; Shah, A.J.; Belgacem, O.; Ward, M.; Dekio, I.; Selami, L.; Duncan, L.; Bruce, K.; Xu, Z.; Mkrtchyan, H.V.; et al. MALDI-TOF MS and Currently Related Proteomic Technologies in Reconciling Bacterial Systematics. In Trends in the Systematics of Bacteria and Fungi; CABI: Wallingford, UK, 2021; pp. 93–118.
  39. Ashfaq, M.Y.; Da’na, D.A.; Al-Ghouti, M.A. Application of MALDI-TOF MS for Identification of Environmental Bacteria: A Review. J. Environ. Manag. 2022, 305, 114359.
  40. Rychert, J. Benefits and Limitations of MALDI-TOF Mass Spectrometry for the Identification of Microorganisms. J. Infect. Epidemiol. 2019, 2, 1–5.
  41. Lasch, P.; Stämmler, M.; Schneider, A. A MALDI-TOF Mass Spectrometry Database for Identification and Classification of Highly Pathogenic Microorganisms from the Robert Koch-Institute (RKI); Zenodo: Geneva, Switzerland, 2016.
  42. Caamaño-Antelo, S.; Fernández-No, I.C.; Böhme, K.; Ezzat-Alnakip, M.; Quintela-Baluja, M.; Barros-Velázquez, J.; Calo-Mata, P. Genetic Discrimination of Foodborne Pathogenic and Spoilage Bacillus spp. Based on Three Housekeeping Genes. Food Microbiol. 2015, 46, 288–298.
  43. Rajendhran, J.; Gunasekaran, P. Microbial Phylogeny and Diversity: Small Subunit Ribosomal RNA Sequence Analysis and Beyond. Microbiol. Res. 2011, 166, 99–110.
  44. Church, D.L.; Cerutti, L.; Gürtler, A.; Griener, T.; Zelazny, A.; Emler, S. Performance and Application of 16S rRNA Gene Cycle Sequencing for Routine Identification of Bacteria in the Clinical Microbiology Laboratory. Clin. Microbiol. Rev. 2020, 33, e00053-19.
  45. Madigan, M.; Martinko, J.; Bender, K.; Buckley, D.; Stahl, D. Brock Biology of Microorganisms, 14th ed.; Benjamin Cummings: San Francisco, CA, USA, 2015.
  46. Mahato, N.K.; Gupta, V.; Singh, P.; Kumari, R.; Verma, H.; Tripathi, C.; Rani, P.; Sharma, A.; Singhvi, N.; Sood, U.; et al. Microbial Taxonomy in the Era of OMICS: Application of DNA Sequences, Computational Tools and Techniques. Antonie Leeuwenhoek 2017, 110, 1357–1371.
  47. D’Amore, R.; Ijaz, U.Z.; Schirmer, M.; Kenny, J.G.; Gregory, R.; Darby, A.C.; Shakya, M.; Podar, M.; Quince, C.; Hall, N. A Comprehensive Benchmarking Study of Protocols and Sequencing Platforms for 16S rRNA Community Profiling. BMC Genom. 2016, 17, 55.
  48. Rodrigues, N.M.B.; Bronzato, G.F.; Santiago, G.S.; Botelho, L.A.B.; Moreira, B.M.; da Silva Coelho, I.; de Souza, M.M.S.; de Mattos de Oliveira Coelho, S. The Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry (MALDI-TOF MS) Identification versus Biochemical Tests: A Study with Enterobacteria from a Dairy Cattle Environment. Braz. J. Microbiol. 2016, 48, 132–138.
  49. Stackebrandt, E.; Mondotte, J.A.; Fazio, L.L.; Jetten, M. Authors Need to Be Prudent When Assigning Names to Microbial Isolates. Curr. Microbiol. 2021, 78, 4005–4008.
  50. Tindall, B.J.; Rosselló-Móra, R.; Busse, H.J.; Ludwig, W.; Kämpfer, P. Notes on the Characterization of Prokaryote Strains for Taxonomic Purposes. Int. J. Syst. Evol. Microbiol. 2010, 60 Pt 1, 249–266.
  51. Sentausa, E.; Fournier, P.E. Advantages and Limitations of Genomics in Prokaryotic Taxonomy. Clin. Microbiol. Infect. 2013, 19, 790–795.
  52. Vasconcellos, L.; Silva, S.V.; da Costa, L.V.; da Silva Lage de Miranda, R.V.; dos Reis, C.M.F.; da Silva Braga, L.M.P.; Silva, C.; Conceição, G.; Mattoso, J.; Silva, I.B.; et al. Phenotypical and Molecular Characterization of Acinetobacter spp. Isolated from a Pharmaceutical Facility. Lett. Appl. Microbiol. 2023, 76, ovad101.
  53. de Almeida do Vale, B.; Costa de Lima, J.; de Souza, P.A.; da Silva Laje de Miranda, R.V.; Brandao, M.L.L.; da Costa, L.V.; Toma, H.K. Characterization of Staphylococcus hominis Strains Isolated in an Immunobiological Pharmaceutical Unit. In Congresso Brasileiro de Microbiologia; Sociedade Brasileira de Microbiologia: Foz do Iguaçu, Brazil, 2023.
  54. Loreiro, J.M.P.; Guimarães, R.C.C.; Valadao, T.B.; Miranda, R.V.S.L.; Andrade, J.M.; Costa, L.V.; Brandao, M.L.L. Application of Fourier-Transform Infrared Spectroscopy (FT-IR) for Staphylococcus epidermidis Typing as a Tool for Contamination Control Strategy in a Pharmaceutical Industry Facility. PDA J. Pharm. Sci. Technol. 2024, 78, 761–762.
  55. Bazani, V.B.; da Silva, A.C.F.; de Pádua Silva, K.; Müller, K.C. Contaminação de Produtos Farmacêuticos Pelo Complexo Burkholderia cepacia e Seus Possíveis Impactos Na Saúde e Na Indústria: Uma Revisão Bibliográfica. Res. Soc. Dev. 2024, 13, e10313245032.
  56. Santana, G.; Aguiar, A.; Sales, F.; Miranda, R.; Valadão, T.; Costa, L.; Brandão, M. Polyphasic Characterization of Burkholderia cepacia Complex Strains Isolated from a Pharmaceutical Industry Facility. In Proceedings of the 8th International Symposium on Immunobiologicals, Rio de Janeiro, Brazil, 8–10 May 2024.
  57. Vlach, J.; Javůrková, B.; Karamonová, L.; Blažková, M.; Fukal, L. Novel PCR-RFLP System Based on rpoB Gene for Differentiation of Cronobacter Species. Food Microbiol. 2017, 62, 1–8.
  58. Payne, G.W.; Vandamme, P.; Morgan, S.H.; LiPuma, J.J.; Coenye, T.; Weightman, A.J.; Jones, T.H.; Mahenthiralingam, E. Development of a recA Gene-Based Identification Approach for the Entire Burkholderia Genus. Appl. Environ. Microbiol. 2005, 71, 3917–3927.
  59. Stojanov, S.; Kristl, J.; Zupančič, Š.; Berlec, A. Influence of Excipient Composition on Survival of Vaginal Lactobacilli in Electrospun Nanofibers. Pharmaceutics 2022, 14, 1155.
  60. Chun, J.; Rainey, F.A. Integrating Genomics into the Taxonomy and Systematics of the Bacteria and Archaea. Int. J. Syst. Evol. Microbiol. 2014, 64 Pt 2, 316–324.
  61. Glaeser, S.P.; Kämpfer, P. Multilocus Sequence Analysis (MLSA) in Prokaryotic Taxonomy. Syst. Appl. Microbiol. 2015, 38, 237–245.
  62. Maiden, M.C.J.; Bygraves, J.A.; Feil, E.; Morelli, G.; Russell, J.E.; Urwin, R.; Zhang, Q.; Zhou, J.; Zurth, K.; Caugant, D.A.; et al. Multilocus Sequence Typing: A Portable Approach to the Identification of Clones within Populations of Pathogenic Microorganisms. Proc. Natl. Acad. Sci. USA 1998, 95, 3140–3145.
  63. Stackebrandt, E.; Frederiksen, W.; Garrity, G.M.; Grimont, P.A.D.; Kämpfer, P.; Maiden, M.C.J.; Nesme, X.; Rosselló-Mora, R.; Swings, J.; Trüper, H.G.; et al. Report of the Ad Hoc Committee for the Re-Evaluation of the Species Definition in Bacteriology. Int. J. Syst. Evol. Microbiol. 2002, 52, 1043–1047.
  64. Hayashi Sant’Anna, F.; Bach, E.; Porto, R.Z.; Guella, F.; Hayashi Sant’Anna, E.; Passaglia, L.M.P. Genomic Metrics Made Easy: What to Do and Where to Go in the New Era of Bacterial Taxonomy. Crit. Rev. Microbiol. 2019, 45, 182–200.
  65. Thompson, C.C.; Chimetto, L.; Edwards, R.A.; Swings, J.; Stackebrandt, E.; Thompson, F.L. Microbial Genomic Taxonomy. BMC Genom. 2013, 14, 913.
  66. Land, M.; Hauser, L.; Jun, S.R.; Nookaew, I.; Leuze, M.R.; Ahn, T.H.; Karpinets, T.; Lund, O.; Kora, G.; Wassenaar, T.; et al. Insights from 20 Years of Bacterial Genome Sequencing. Funct. Integr. Genom. 2015, 15, 141–161.
  67. Wayne, L.G.; Brenner, D.J.; Colwell, R.R.; Grimont, P.A.D.; Kandler, O.; Krichevsky, M.I.; Moore, L.H.; Moore, W.E.C.; Murray, R.G.E.; Stackebrandt, E.; et al. Report of the Ad Hoc Committee on Reconciliation of Approaches to Bacterial Systematics. Int. J. Syst. Evol. Microbiol. 1987, 37, 463–464.
  68. Chun, J.; Oren, A.; Ventosa, A.; Christensen, H.; Arahal, D.R.; da Costa, M.S.; Rooney, A.P.; Yi, H.; Xu, X.W.; De Meyer, S.; et al. Proposed Minimal Standards for the Use of Genome Data for the Taxonomy of Prokaryotes. Int. J. Syst. Evol. Microbiol. 2018, 68, 461–466.
  69. Gosselin, S.; Fullmer, M.S.; Feng, Y.; Gogarten, J.P. Improving Phylogenies Based on Average Nucleotide Identity, Incorporating Saturation Correction and Nonparametric Bootstrap Support. Syst. Biol. 2022, 71, 396–409.
  70. Meier-Kolthoff, J.P.; Carbasse, J.S.; Peinado-Olarte, R.L.; Göker, M. TYGS and LPSN: A Database Tandem for Fast and Reliable Genome-Based Classification and Nomenclature of Prokaryotes. Nucleic Acids Res. 2022, 50, D801–D807.
  71. Wang, J.; Ran, Q.; Du, X.; Wu, S.; Wang, J.; Sheng, D.; Chen, Q.; Du, Z.; Li, Y.-Z. Two New Polyangium Species, P. aurulentum sp. nov. and P. jinanense sp. nov., Isolated from a Soil Sample. Syst. Appl. Microbiol. 2021, 44, 126274.
  72. Cuny, H.; Offret, C.; Boukerb, A.M.; Parizadeh, L.; Lesouhaitier, O.; Le Chevalier, P.; Jégou, C.; Bazire, A.; Brillet, B.; Fleury, Y. Pseudoalteromonas ostreae sp. nov., a New Bacterial Species Harboured by the Flat Oyster Ostrea edulis. Int. J. Syst. Evol. Microbiol. 2021, 71, 005070.
  73. Colston, S.M.; Fullmer, M.S.; Beka, L.; Lamy, B.; Peter Gogarten, J.; Graf, J. Bioinformatic Genome Comparisons for Taxonomic and Phylogenetic Assignments Using Aeromonas as a Test Case. mBio 2014, 5, e02136.
  74. Goris, J.; Konstantinidis, K.T.; Klappenbach, J.A.; Coenye, T.; Vandamme, P.; Tiedje, J.M. DNA-DNA Hybridization Values and Their Relationship to Whole-Genome Sequence Similarities. Int. J. Syst. Evol. Microbiol. 2007, 57 Pt 1, 81–91.
  75. Konstantinidis, K.T.; Tiedje, J.M. Genomic Insights That Advance the Species Definition for Prokaryotes. Proc. Natl. Acad. Sci. USA 2005, 102, 2567–2572.
  76. Qin, Q.-L.; Xie, B.-B.; Zhang, X.-Y.; Chen, X.-L.; Zhou, B.-C.; Zhou, J.; Oren, A.; Zhang, Y.-Z. A Proposed Genus Boundary for the Prokaryotes Based on Genomic Insights. J. Bacteriol. 2014, 196, 2210–2215.
  77. Kim, D.; Park, S.; Chun, J. Introducing EzAAI: A Pipeline for High Throughput Calculations of Prokaryotic Average Amino Acid Identity. J. Microbiol. 2021, 59, 476–480.
  78. Rodriguez-R, L.M.; Konstantinidis, K.T. Bypassing Cultivation To Identify Bacterial Species: Culture-Independent Genomic Approaches Identify Credibly Distinct Clusters, Avoid Cultivation Bias, and Provide True Insights into Microbial Species. Microbe Mag. 2014, 9, 111–118.
  79. Nicholson, A.C.; Gulvik, C.A.; Whitney, A.M.; Humrighouse, B.W.; Bell, M.E.; Holmes, B.; Steigerwalt, A.G.; Villarma, A.; Sheth, M.; Batra, D.; et al. Division of the Genus Chryseobacterium: Observation of Discontinuities in Amino Acid Identity Values, a Possible Consequence of Major Extinction Events, Guides Transfer of Nine Species to the Genus Epilithonimonas, Eleven Species to the Genus Kaistella, and Three Species to the Genus Halpernia Gen. Nov., with Description of Kaistella daneshvariae sp. nov. and Epilithonimonas vandammei sp. nov. Derived from Clinical Specimens. Int. J. Syst. Evol. Microbiol. 2020, 70, 4432–4450.
  80. Walter, J.M.; Coutinho, F.H.; Dutilh, B.E.; Swings, J.; Thompson, F.L.; Thompson, C.C. Ecogenomics and Taxonomy of Cyanobacteria Phylum. Front. Microbiol. 2017, 8, 2132.
  81. Zheng, J.; Wittouck, S.; Salvetti, E.; Franz, C.M.A.P.; Harris, H.M.B.; Mattarelli, P.; O’toole, P.W.; Pot, B.; Vandamme, P.; Walter, J.; et al. A Taxonomic Note on the Genus Lactobacillus: Description of 23 Novel Genera, Emended Description of the Genus Lactobacillus Beijerinck 1901, and Union of Lactobacillaceae and Leuconostocaceae. Int. J. Syst. Evol. Microbiol. 2020, 70, 2782–2858.
  82. Riesco, R.; Trujillo, M.E. Update on the Proposed Minimal Standards for the Use of Genome Data for the Taxonomy of Prokaryotes. Int. J. Syst. Evol. Microbiol. 2024, 74, 006300.
  83. Meier-Kolthoff, J.P.; Göker, M. TYGS Is an Automated High-Throughput Platform for State-of-the-Art Genome-Based Taxonomy. Nat. Commun. 2019, 10, 2182.
  84. Parks, D.H.; Chuvochina, M.; Rinke, C.; Mussig, A.J.; Chaumeil, P.A.; Hugenholtz, P. GTDB: An Ongoing Census of Bacterial and Archaeal Diversity through a Phylogenetically Consistent, Rank Normalized and Complete Genome-Based Taxonomy. Nucleic Acids Res. 2022, 50, D785–D794.
  85. Jolley, K.A.; Bliss, C.M.; Bennett, J.S.; Bratcher, H.B.; Brehony, C.; Colles, F.M.; Wimalarathna, H.; Harrison, O.B.; Sheppard, S.K.; Cody, A.J.; et al. Ribosomal Multilocus Sequence Typing: Universal Characterization of Bacteria from Domain to Strain. Microbiology 2012, 158 Pt 4, 1005–1015.
Figure 1. Steps of identification of environmental bacterial isolates, from MALDI-TOF MS to genomic taxonomy analyses. AAI, average amino acid identity; ANI, average nucleotide identity; dDDH, digital DNA-DNA hybridization; GTDB-Tk, Genome Taxonomy Database Toolkit; MALDI-TOF MS, Matrix-Assisted Laser Desorption Ionization–Time of Flight/Mass Spectrometry; rMLST, Ribosomal Multilocus Sequence Typing; TYGS, Type Strain Genome Server; ↑, above. * According to Stackebrandt and Ebers (2006) [33].
Table 1. Advantages and limitations of each identification method presented in this study.
| Identification method | Advantages | Limitations |
| --- | --- | --- |
| Phenotypic (API®, VITEK® 2) | Fast and easy to use; Applicable to clinical and environmental bacteria; Determines the biochemical profile of isolate analyzed; Automation or semi-automatic available | Database originally clinical (limitations for environmental bacteria); Low resolution for closely related species; Results may vary; Limited to existing databases; Difficulty with non-fermenting and environmental bacteria; May not identify new species without updating the database |
| MALDI-TOF MS | Very fast; Low cost per analysis; High accuracy for many species; Allows database expansion | Database originally clinical (limitations for environmental bacteria); May not identify new species without updating the database; High initial cost of equipment; Requires comparison with well-characterized spectra |
| 16S rRNA gene sequencing | Widely used; High conservation allows identification at genus level; Gold standard for general classification | Low resolution between closely related species; There may be multiple copies in genome (intragenomic variability); Analysis can be expensive and slow in routine use |
| Housekeeping gene sequencing and multilocus sequencing analysis (MLSA) | Higher resolving power than 16S gene; Differentiates closely related species; Supports construction of robust phylogenetic trees | Laborious process; Depends on appropriate choice of genes; Sequences are not always available in databases |
| Comparative genomics (general) | Greater taxonomy accuracy; Based on complete genome; Defines new species and genera | Requires complete genomic sequencing data; High initial cost; Complexity of analysis and need for bioinformatics |
| Comparative genomics: dDDH | Good correlation with traditional DDH; Free tool (e.g., GGDC) | Requires high-quality genomic data |
| Comparative genomics: ANI | Easy interpretation; Strong correlation with DDH; Useful for delimiting species (≥96%) | Low resolution for categories above species; Requires direct genomic comparison |
| Comparative genomics: AAI | Good tool for delimiting genera (≥60–65%); Correlation with evolutionary relationships | Requires annotated genomes |
| Comparative genomics: TYGS | Automated and updated platform; Compares with recognized type strains | Dependence on strain databases |
| Comparative genomics: GTDB-Tk | Classifies based on global phylogenetic tree; Constant updates | Dependence on strain databases |
| Comparative genomics: rMLST | Universal applicability; High resolution; Robustness against genetic recombination; Public database | Dependence of genomic data; Variability between loci; Need for continuous curation |
AAI, average amino acid identity; ANI, average nucleotide identity; DDH, DNA-DNA hybridization; dDDH, digital DNA-DNA hybridization; GGDC, Genome-to-Genome Distance Calculator; GTDB-Tk, Genome Taxonomy Database Toolkit; MALDI-TOF MS, Matrix-Assisted Laser Desorption Ionization–Time of Flight/Mass Spectrometry; MLSA, multilocus sequence analysis; rMLST, Ribosomal Multilocus Sequence Typing; TYGS, Type Strain Genome Server.
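
The ANI and AAI cut-offs listed in Table 1 can be illustrated with a minimal Python sketch. It assumes that ANI and AAI percentages have already been computed against a type-strain genome by an external tool. The names `GenomeComparison` and `interpret` are hypothetical, and treating the thresholds as hard cut-offs is a simplification rather than a recommended decision rule.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from Table 1; using them as hard cut-offs is an assumption.
ANI_SPECIES_MIN = 96.0  # ANI (%) useful for delimiting species (>=96%)
AAI_GENUS_LOW = 60.0    # lower bound of the AAI (%) range for genera (>=60-65%)
AAI_GENUS_HIGH = 65.0   # upper bound of that range


@dataclass
class GenomeComparison:
    """Hypothetical record: one isolate genome compared with one type-strain genome."""
    reference: str
    ani: Optional[float] = None  # average nucleotide identity, in percent
    aai: Optional[float] = None  # average amino acid identity, in percent


def interpret(cmp: GenomeComparison) -> str:
    """Return a coarse label for the comparison based only on the Table 1 thresholds."""
    if cmp.ani is not None and cmp.ani >= ANI_SPECIES_MIN:
        return "same species"
    if cmp.aai is not None:
        if cmp.aai >= AAI_GENUS_HIGH:
            return "same genus"
        if cmp.aai >= AAI_GENUS_LOW:
            return "borderline genus"
    return "unresolved"


if __name__ == "__main__":
    # Illustrative values only, not data from this review.
    examples = [
        GenomeComparison("type strain A", ani=97.2, aai=98.0),
        GenomeComparison("type strain B", ani=88.0, aai=70.5),
        GenomeComparison("type strain C", ani=80.0, aai=62.0),
        GenomeComparison("type strain D", ani=75.0, aai=55.0),
    ]
    for e in examples:
        print(e.reference, "->", interpret(e))
```

In practice, a label such as "borderline genus" or "unresolved" would prompt the complementary analyses listed in Table 1 (for example dDDH, TYGS, or GTDB-Tk) rather than a final identification.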

Nunes Ramos, J.; Veloso da Costa, L.; Viana Vieira, V.; Lima Brandão, M.L. Challenges in the Identification of Environmental Bacterial Isolates from a Pharmaceutical Industry Facility by 16S rRNA Gene Sequences. DNA 2025, 5, 33. https://doi.org/10.3390/dna5030033