Preclinical Safety Assessment of Bacillus subtilis BS50 for Probiotic and Food Applications

Despite the commercial rise of probiotics containing Bacillaceae spp., it remains important to assess the safety of each strain before clinical testing. Herein, we performed preclinical analyses to address the safety of Bacillus subtilis BS50. Using in silico analyses, we screened the 4.15 Mbp BS50 genome for genes encoding known Bacillus toxins, secondary metabolites, virulence factors, and antibiotic resistance. We also assessed the effects of BS50 lysates on the viability and permeability of cultured human intestinal epithelial cells (Caco-2). We found that the BS50 genome does not encode any known Bacillus toxins. The BS50 genome contains several gene clusters involved in the biosynthesis of secondary metabolites, but many of these antimicrobial metabolites (e.g., fengycin) are common to Bacillus spp. and may even confer health benefits related to gut microbiota health. BS50 was susceptible to seven of eight commonly prescribed antibiotics, and no antibiotic resistance genes were flanked by the complete mobile genetic elements that could enable horizontal transfer. In cell culture, BS50 cell lysates neither diminished Caco-2 viability nor altered monolayer permeability. Altogether, BS50 exhibits a robust preclinical safety profile commensurate with commercial probiotic strains and likely poses no significant health risk to humans.


Introduction
Bacillus subtilis is a Gram-positive bacterium with a long history of use in molecular biology, industry, medicine, and fermented foods [1,2]. Bacillus strains are particularly useful for their ability to produce and secrete enzymes in large quantities and for their amenability to genetic manipulation. In the past two decades, many strains of Bacillus spp. have been used as human probiotics and as direct-fed microbials for animal health. Probiotics are live microorganisms that, when administered in adequate amounts, confer a health benefit on the host [3]. Probiotics may provide health benefits such as supporting digestion, gastrointestinal (GI) health, immune health, beneficial resident gut microbes, and mood and stress response [4][5][6][7][8]. Some of the most commonly used probiotic strains include members of the Lactobacillaceae family (Bacillota phylum, formerly known as Firmicutes), including the Lactiplantibacillus, Lacticaseibacillus, and Lactobacillus genera. Common probiotic strains also include Bacillus spp. and Weizmannia coagulans (formerly Bacillus coagulans) strains from the Bacillaceae family of the Bacillota phylum and Bifidobacterium spp. from the Actinomycetota (formerly Actinobacteria) phylum.
Bacillaceae species are well-suited for probiotic applications because they can be manufactured as spores that persist without refrigeration and resist the acidic and high bile salt conditions that occur throughout the GI tract of humans and monogastric animals [9]. Bacillus subtilis (or B. subtilis), in particular, has a history of safe consumption across the globe. B. subtilis has been used in traditional fermented foods of many East Asian cultures for centuries, including the use of B. subtilis subsp. natto for commercial production of libraries using SQK-LSK109 chemistry, and Native Barcode Extension packs EXP-NBD104 and EXP-NBD114 from Oxford Nanopore Technologies (Oxford, UK). All necessary cleanup steps were carried out using CleanNA magnetic beads for next-generation sequencing (CleanNA, Waddinxveen, Netherlands). Genome sequencing took place on MinION flow cells (FLO-MIN106D) over 48-72 h (Oxford Nanopore Technologies, Oxford, UK). The full genome was assembled with Flye [68] using default settings. The BS50 genome comprises a single, circular contig 4,150,844 bp in length. No plasmids were detected. The BS50 genome has a GC content of 43.7%.

BS50 Taxonomic Classification via Multilocus Sequence Typing
Using BLAST+ command-line software [69], the nucleotide BLAST (BLASTn) algorithm [70] was used to identify nucleotide sequences in the BS50 genome and 20 other Bacillus genomes that aligned with six genes from the genome of B. subtilis subspecies subtilis strain 168, one of the longest-standing and most extensively studied strains of B. subtilis (derived from the Marburg type strain) [71,72]: rpoB (GeneID: 936335), purH (GeneID: 936053), gyrA (GeneID: 940002), groEL (GeneID: 938045), polC (GeneID: 939620), and 16S rRNA (GeneID: 936895). These genes are standard "housekeeping" genes for Bacillus spp. and are commonly used for phylogenetic analysis of Bacillus spp. [73]. For each strain, the sequences aligning to these six genes were then concatenated into a single nucleotide sequence (~19,616 nt). The strains used for comparison were selected because they had a complete genome available in the NCBI (National Center for Biotechnology Information) database or because they were currently used in probiotic supplements or food (i.e., MB40, BEST195, and DE111).
Multiple sequence alignment of the concatenated sequences for each Bacillus strain was performed using MAFFT [74] (accessed 10 June 2021). The multiple sequence alignment file produced by MAFFT was then input into MEGA X [75] for phylogenetic tree construction. The evolutionary history was inferred using the Maximum Likelihood method and the Tamura-Nei model [76]. The data were bootstrapped 50 times. The tree with the highest log likelihood (−30362.06) was chosen. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Joining and BioNJ algorithms to a matrix of pairwise distances estimated using the Tamura-Nei model and then selecting the topology with the superior log-likelihood value. This analysis involved 21 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. There were a total of 15,093 positions in the final dataset.
In order to further characterize the sequence identities between the whole genomes of BS50 and 20 other B. subtilis strains, pairwise BLASTn alignments between BS50 and each Bacillus strain were performed via the NCBI website (accessed 25 January 2022) by uploading BS50 as the query and the other Bacillus genome as the subject. Default settings were used.

BLASTn Screen for Known Bacillus Toxins
A BLASTn search was completed via the NCBI website (accessed 2 June 2021) to determine the presence or absence of toxin genes commonly associated with the Bacillus genus. The genes that were screened are listed in Table 1. In addition, two positive control genes were used: the B. subtilis gene encoding the glutamyl-tRNA(Gln) amidotransferase subunit (gatA) and the B. cereus gene encoding methionyl-tRNA synthetase (metG). These genes were used as queries against the B. subtilis BS50 genome as the subject sequence to demonstrate that the BLASTn algorithm was able to generate a match both within and across species when one existed. Each toxin gene DNA sequence was identified using the NCBI Gene or NCBI Nucleotide databases. The sequence for the B. cereus cereulide gene cluster (cesHPTABCD) was obtained from the 270 kb plasmid pCER270 sequence (NC_010924.1, location: 15094 to 38668) [77,78]. Finally, each toxin gene DNA sequence was used as a query against the BS50 genome as the subject sequence. All BLASTn alignments were run using default parameters.

BLASTx Screen for Known Bacillus Toxins
A translated nucleotide BLAST (BLASTx) search was completed via the NCBI website (accessed 2 June 2021) to determine the presence or absence of coding sequences that are homologous to toxins commonly associated with the Bacillus genus. Protein sequences corresponding to the control and toxin genes included in the BLASTn analysis were identified in the NCBI Protein database (http://www.ncbi.nlm.nih.gov/protein, accessed on 4 June 2021). These protein sequences were used as subjects against the translated B. subtilis BS50 genome as the query. All BLASTx alignments were run using default parameters.

In Silico PCR Amplification of BS50 for Bacillus Toxins
In silico PCR amplification was accessed online (4 June 2021) to search the B. subtilis BS50 genome for toxins via gene primer matches [79]. Ten sets of sequence primers for Bacillus toxin DNA amplification [80][81][82] were used to complete the virtual PCR (Supplemental Table S2). The following parameters were used to closely mimic an actual PCR run: two mismatches allowed, no mismatch allowed in the last nucleotide of the 3′ end, and a maximum band length of 10,000 nucleotides. As a positive control for the primers, the same set of primers was screened against the B. cereus genome, generating matches in all cases. As a control for the virtual PCR protocol, primers for 16S rRNA were used to show that the program would find a match when one was present.
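The matching rules described above can be illustrated with a minimal Python sketch. This is not the code of the online tool that was used; it only assumes the three stated parameters (at most two mismatches per primer, no mismatch at the 3′ terminal nucleotide, and a maximum band length of 10,000 nt). The function names, the toy primers, and the toy template are hypothetical, and the sketch treats the sequence as linear, so amplicons spanning the origin of a circular genome would be missed.

```python
# Hypothetical sketch of the virtual PCR matching rules; not the online tool's code.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement; unknown symbols are mapped to N."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def primer_sites(strand, primer, max_mismatches=2):
    """Start indices on `strand` where `primer` (written 5'->3') can anneal.

    Up to `max_mismatches` are tolerated, but the base at the primer's
    3' end must match exactly.
    """
    strand, primer = strand.upper(), primer.upper()
    n = len(primer)
    sites = []
    for i in range(len(strand) - n + 1):
        window = strand[i:i + n]
        if window[-1] != primer[-1]:  # 3' terminal nucleotide must match
            continue
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            sites.append(i)
    return sites


def virtual_pcr(genome, forward, reverse, max_mismatches=2, max_band=10_000):
    """Return sorted (start, end, band_length) tuples on the forward strand."""
    genome = genome.upper()
    length = len(genome)
    fwd_starts = primer_sites(genome, forward, max_mismatches)
    # The reverse primer is searched on the opposite strand, then its
    # position is converted back to forward-strand coordinates.
    rev_sites = primer_sites(reverse_complement(genome), reverse, max_mismatches)
    rev_ends = [length - j for j in rev_sites]
    min_band = len(forward) + len(reverse)
    products = []
    for start in fwd_starts:
        for end in rev_ends:
            band = end - start
            if min_band <= band <= max_band:
                products.append((start, end, band))
    return sorted(products)


# Toy example (hypothetical sequences, not the primers in Supplemental Table S2)
forward = "ATGCCGTA"
reverse = "GGATCCAA"
template = "TTTT" + forward + "ACGTACGTAC" + reverse_complement(reverse) + "GGGG"
products = virtual_pcr(template, forward, reverse)
```

In this toy template, the expected amplicon runs from the first base of the forward primer to the last base of the reverse primer site, giving a 26 nt band.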

Secondary Metabolite Screen via AntiSMASH
To determine if BS50 has the capacity to produce secondary metabolites, the BS50 genome was submitted to the online database antiSMASH bacterial version 6.0.1 (accessed 18 January 2022) [83]. Default settings were used; detection strictness was set to relaxed, and the features KnownClusterBlast, ActiveSiteFinder, RREFinder, and SubClusterBlast were turned on.

Secreted Protein via SignalP 6.0 Analysis
To determine if the BS50 genome encodes secreted proteins, it was uploaded onto the online server PATRIC [84] and annotated and translated via the RAST Tool Kit (RASTtk) [85] (accessed 26 March 2021). The translated amino acid sequences from the annotated BS50 genome were then analyzed for the presence of secreted proteins using the online SignalP 6.0 database [86] by setting the organism as "other" and setting model mode to "fast". SignalP 6.0 was accessed on 18 January 2022. SignalP utilizes a machine learning model that predicts the presence of signal peptide motifs (i.e., Sec/SPI, Sec/SPII, Sec/SPIII, Tat/SPI, Tat/SPII) and the location of their cleavage sites [86].

Virulence Factor Screen via VFDB
To assess if the BS50 genome encodes for virulence factors (VF) or proteins involved in VF synthesis, the virulence factor database (VFDB) [87] was accessed online (17 January 2022), and the "full dataset" of VF-associated protein sequences was downloaded. The "full dataset" includes 1381 amino acid sequences for both verified and predicted VF-associated proteins from 954 medically relevant bacterial strains, whereas the "core dataset" only includes sequences of experimentally verified VF-associated proteins. The full dataset includes 36 VF-associated proteins from 164 strains and eight species of Bacillus, including proteins related to adherence (e.g., BslA), antiphagocytosis (e.g., capsule), iron acquisition, enzymes (e.g., InhA), regulation (e.g., AtxA), secretion systems (e.g., T7SS), and toxins (e.g., ALO, anthrax toxin, cereulide, certhrax, CytK, HBL, and Nhe) [87]. Since the dataset was primarily curated from medically relevant Bacillus strains, VF detection in BS50 was potentially limited. Using the BLASTx algorithm [70] with local BLAST+ command-line software [69], the BS50 genome was translated and screened against the VF dataset. Hits with <20% coverage were excluded from analysis, and multiple hits aligned to the same region of the BS50 genome were screened for the hit with the highest bit score.
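The two filtering steps described above (excluding hits with <20% coverage, then keeping the highest bit score among hits aligned to the same genome region) can be sketched as follows. This is an illustrative Python sketch, not the script used in the study; the record fields (`vf`, `start`, `end`, `coverage`, `bitscore`) and the example hits are hypothetical, and "same region" is assumed here to mean overlapping genome coordinates.

```python
# Hypothetical sketch of the hit-filtering logic; field names are assumptions.
def filter_hits(hits, min_coverage=20.0):
    """Drop low-coverage hits, then keep the best hit per overlapping region."""
    kept = [h for h in hits if h["coverage"] >= min_coverage]
    # Sort by the leftmost genome coordinate (hits may lie on either strand).
    kept.sort(key=lambda h: min(h["start"], h["end"]))
    best = []
    region_end = None
    for hit in kept:
        lo, hi = sorted((hit["start"], hit["end"]))
        if region_end is not None and lo <= region_end:
            # Overlaps the current region: keep only the highest bit score.
            region_end = max(region_end, hi)
            if hit["bitscore"] > best[-1]["bitscore"]:
                best[-1] = hit
        else:
            best.append(hit)
            region_end = hi
    return best


# Hypothetical example hits (placeholders, not results from this study)
example_hits = [
    {"vf": "VF_A", "start": 100, "end": 900, "coverage": 95.0, "bitscore": 400.0},
    {"vf": "VF_B", "start": 150, "end": 880, "coverage": 90.0, "bitscore": 250.0},
    {"vf": "VF_C", "start": 5000, "end": 4000, "coverage": 80.0, "bitscore": 300.0},
    {"vf": "VF_D", "start": 7000, "end": 7100, "coverage": 12.0, "bitscore": 40.0},
]
selected = [h["vf"] for h in filter_hits(example_hits)]
```

Here VF_D is removed by the coverage filter, and VF_B is discarded because it overlaps VF_A with a lower bit score.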

Antimicrobial Resistance Gene and Mobile Genetic Element Screen
The BS50 genome was screened for antibiotic resistance factors using the Resistance Gene Identifier (RGI), which is part of the Comprehensive Antibiotic Resistance Database (CARD) [88,89]. RGI is a web-based platform that utilizes BLAST to predict complete "resistomes" from genomic and metagenomic data. The BS50 genome sequence was submitted to the RGI CARD webserver (accessed 24 April 2021) using the following criteria: Perfect, Strict, complete genes only, 95% identity nudge used. Identity nudge allows any loose hit with at least 95% identity to be scored as a strict hit.
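The identity nudge rule can be stated compactly in code. The following Python sketch only restates the rule as described above (a loose hit with at least 95% identity is scored as strict); it is not RGI's implementation, and the function name and example values are hypothetical.

```python
# Hypothetical restatement of the 95% identity nudge; not RGI source code.
def apply_identity_nudge(category, percent_identity, threshold=95.0):
    """Return the reported category after applying the identity nudge."""
    if category == "Loose" and percent_identity >= threshold:
        return "Strict"
    return category


# Placeholder hits as (category, percent identity) pairs
example = [("Perfect", 100.0), ("Strict", 97.2), ("Loose", 98.1), ("Loose", 61.4)]
nudged = [apply_identity_nudge(cat, ident) for cat, ident in example]
```

Under this rule, perfect and strict hits are unchanged, and only loose hits at or above the threshold are promoted.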
To screen the BS50 genome for mobile genetic elements (MGE), the "A CLAssification of Mobile genetic Elements" (ACLAME) [90] database, version 0.4, was downloaded (1 June 2021) and aligned against the BS50 genome using the BLASTn [70] command with local BLAST+ software [69] under default parameters. The database contains 125,190 nucleotide sequences of predicted MGEs from prophages, viruses, and bacterial plasmids. The BS50 genome was also screened for known insertion sequences using the online program ISfinder [91] (accessed 1 June 2021), which utilizes the BLASTn algorithm [70] to search for nucleotide sequences that match insertion sequences.
To assess if MGEs or insertion sequences present within the BS50 genome could play a role in antibiotic resistance gene transfer, the loci of these sequences were manually compared to the loci of antibiotic resistance genes. Mobile genetic elements and insertion sequences that were not within 5 kb of the loci of antibiotic resistance genes were not considered to play a role in antibiotic resistance gene transfer [92].
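The comparison of loci was performed manually in this study; the Python sketch below shows one way the same 5 kb proximity rule could be expressed. It assumes that "within 5 kb" refers to the gap between the nearest edges of the two features on a linear coordinate system, which is an interpretation rather than a stated definition. The function names and all coordinates are hypothetical.

```python
# Hypothetical sketch of the 5 kb proximity rule; coordinates are placeholders.
def gap_between(a_start, a_end, b_start, b_end):
    """Distance in bp between the nearest edges of two features (0 if they overlap)."""
    a_lo, a_hi = sorted((a_start, a_end))
    b_lo, b_hi = sorted((b_start, b_end))
    if a_hi < b_lo:
        return b_lo - a_hi
    if b_hi < a_lo:
        return a_lo - b_hi
    return 0


def mges_near_resistance_genes(mges, resistance_genes, window=5000):
    """Return (mge, gene, gap) for every pair lying within `window` bp."""
    pairs = []
    for mge_name, mge_start, mge_end in mges:
        for gene_name, gene_start, gene_end in resistance_genes:
            gap = gap_between(mge_start, mge_end, gene_start, gene_end)
            if gap <= window:
                pairs.append((mge_name, gene_name, gap))
    return pairs


# Placeholder loci as (name, start, end)
mges = [("mge_1", 10_000, 10_600), ("mge_2", 50_000, 51_000)]
resistance_genes = [("arg_1", 12_241, 13_400)]
near = mges_near_resistance_genes(mges, resistance_genes)
```

Only pairs returned by this check would be carried forward for closer inspection of alignment identity and coverage.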

Antibiotic Minimum Inhibitory Concentration (MIC) Evaluation of BS50
MIC evaluation of BS50 against eight commonly prescribed antibiotics (i.e., chloramphenicol, clindamycin, erythromycin, gentamicin, kanamycin, streptomycin, oxytetracycline, and vancomycin) was completed by BioSciences (Bozeman, MT, USA; report number 2105336-202). The MIC of each antibiotic was determined based upon the methodology described in Clinical and Laboratory Standards Institute (CLSI) Document M07, 11th edition [93]. BS50 cells (3.93 × 10⁶ CFU/mL per well) were exposed to 10 different dilutions of each antibiotic in sterile nutrient broth. Following an appropriate incubation period, the MIC of each antibiotic was determined visually and documented. Enterococcus faecalis (ATCC 29212) and Staphylococcus aureus (ATCC 29213) (2.96 × 10⁶ and 8.25 × 10⁵ CFU/mL per well, respectively) were tested in tandem with BS50 to verify the methodology performed in this study, and they exhibited MICs within the CLSI quality control range. BS50 was deemed susceptible or resistant to particular antibiotics based on specific MIC thresholds established by the European Food Safety Authority (EFSA) for Bacillus strains [94,95].

Blood Hemolysis Assay
BS50 was streaked onto sheep blood agar plates to assess its ability to lyse blood cells. After incubation overnight, the agar was inspected for alpha- or beta-hemolysis. Alpha-hemolysis, or incomplete hemolysis, is indicated by a discolored, darkened, or green medium color after test culture growth. Beta-hemolysis, or complete hemolysis, is indicated by a clear and colorless medium after growth. An indiscernible change in the color of the agar indicates that no hemolysis occurred (i.e., gamma-hemolysis).

Caco-2 Cell Viability Assay
The effects of BS50 cell lysate on Caco-2 cell viability were tested at Charles River Laboratories (Bristol, UK). Caco-2 cells are an immortalized epithelial cell line of human colorectal adenocarcinoma cells. To generate the cell lysate, BS50 cells were harvested from overnight bacterial cultures and washed. The cells were lysed via enzymatic and mechanical bead-based processes. The final lysate was filtered through a 0.2 µm filter to remove any remaining cells. The final sterile-filtered lysate was plated on TSA to ensure it was free of viable cells. A "blank" sample was used as a process control sample for the lysate production method. The blank sample was sterile, uninoculated media that was treated exactly as the lysates were, including all spins, washes, lysing, and filtering steps. To perform the assay, Caco-2 cells were harvested, counted, and plated into 96-well flat-bottomed plates at 1 × 10⁴ cells/well in 100 µL volumes and left to adhere overnight at 37 °C, 5% CO₂ in a humidified chamber. Cells were treated with BS50 lysate and incubated for an additional 48 h. Controls included cells that were left untreated and cells that were fully lysed at the time of treatment. Cell treatments were done in technical triplicate. Caco-2 cell viability was assayed using a CellTiter-Glo® intracellular ATP quantification assay (Promega, Madison, WI, USA), alongside an ATP standard curve as per the manufacturer's guidelines. Luminescence was quantified using a GloMax® Plate reader (Promega). Levels of intracellular ATP in test conditions were quantified using the standard curve. ATP concentrations were tested for statistical significance using the Kruskal-Wallis test followed by a post-hoc Dunn's test with Bonferroni correction for multiple testing in R Studio (Version 4.0.5). p-values less than 0.05 were considered significant.

Caco-2 Cell Transepithelial Electrical Resistance (TEER) Assay
The TEER assay was used to determine the effect of BS50 on gut barrier permeability (Charles River Laboratories, Portishead, UK). To generate a Caco-2 monolayer, Caco-2 cells were seeded on Transwell inserts and cultured for 14 days. At day 14, the polarized Caco-2 monolayers were treated with a 1:5 dilution of BS50 lysate, sterile media process control, or lipopolysaccharide (LPS) control and left to incubate for 48 h. There was also a non-treatment control. TEER was measured before treatment (0 h), and at 2, 4, 6, 24, and 48 h after treatment. The TEER assays were performed twice on separate dates, with separate cell lysate preparations. Since the starting TEER values (ohm/cm²) at 0 h varied across treatments and trials, the TEER fold-changes were calculated relative to 0 h. Fold-change data from both trials were then combined and statistically analyzed as duplicates via the Kruskal-Wallis test, followed by a post-hoc Dunn's test with Bonferroni correction for multiple testing in R Studio (Version 4.0.5). p-values less than 0.05 were considered significant.
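The normalization step can be illustrated with a short Python sketch. It assumes only what is described above: each TEER reading is divided by the same well's 0 h reading, and the fold-changes from the two trials are pooled per time point before statistical testing. The function names and the TEER values are hypothetical placeholders, and the Kruskal-Wallis and Dunn's tests (run in R in this study) are not reproduced here.

```python
# Hypothetical sketch of the fold-change normalization; values are placeholders.
def fold_changes(readings):
    """Map each time point (h) to its TEER divided by the 0 h TEER."""
    baseline = readings[0]
    if baseline <= 0:
        raise ValueError("0 h TEER must be positive")
    return {hour: value / baseline for hour, value in readings.items()}


def pool_trials(*trials):
    """Combine per-trial fold-changes into one list of values per time point."""
    pooled = {}
    for trial in trials:
        for hour, fold in fold_changes(trial).items():
            pooled.setdefault(hour, []).append(fold)
    return pooled


# Placeholder TEER readings for one treatment in two independent trials
trial_1 = {0: 400.0, 24: 440.0, 48: 200.0}
trial_2 = {0: 500.0, 24: 500.0, 48: 250.0}
pooled = pool_trials(trial_1, trial_2)
```

Normalizing to 0 h removes the difference in starting TEER between wells and trials, so the pooled values at each time point are directly comparable across treatments.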

Taxonomic Classification of BS50
To confirm that BS50 is taxonomically a Bacillus subtilis strain, a phylogenetic tree of BS50 and 20 Bacillus strains was generated using concatenated ~20,000 nt sequences containing six "housekeeping" genes (i.e., rpoB, purH, gyrA, groEL, polC, 16S rRNA) [73]. The phylogenetic tree shows that BS50 aligns closely with other common B. subtilis strains, including the B. subtilis type strain 168 and B. subtilis MB40, a commercial probiotic strain (Figure 1). BS50 also closely aligns with commercial strains previously classified as Bacillus subtilis subsp., such as B. inaquosorum DE111. Pairwise whole genome alignments were performed between BS50 and the other Bacillus strains using BLASTn (Supplemental Table S1). Bacterial genomes sharing at least 95% average nucleotide identity are generally accepted as belonging to the same species [96,97]. The BS50 genome has 98.5% sequence identity to B. subtilis MB40 and 99.0% identity to B. subtilis subsp. natto BEST195, a B. subtilis strain commonly found in Japanese fermented natto beans (Supplemental Table S1). These data further support the classification of BS50 as a bona fide B. subtilis strain.

BLASTn Screen for Known Bacillus Toxins
To screen the BS50 genome for toxin-encoding genes, the nucleotide sequences of known Bacillus toxins were aligned against the BS50 genome using BLASTn. The control genes, gatA and metG, yielded positive matches of 98% identity with 100% sequence coverage and 71% identity with 95% sequence coverage, respectively. The metG gene from B. cereus was used as a control for cross-species sequence matches to ensure that BLASTn could identify matches within BS50 when a gene from a different species was used as the input. Because B. subtilis and B. cereus are different species, a high identity is not expected. Thus, 71% identity with 95% sequence coverage satisfies its use as a control gene for cross-species matches (Table 1). No significant similarities were found between the query toxin sequences and the BS50 genome. The identified matches, including HblA, entFM, cytK, and NheA, B, C from B. cereus and NheA, B, C from B. mycoides, were only partial matches that covered less than 25% of the toxin gene sequences.

BLASTx Screen for Known Bacillus Toxins
To further assess whether BS50 can produce known toxins, the translated BS50 genome was aligned against the amino acid sequences of known Bacillus toxins using BLASTx.
The control proteins, GatA and MetG, yielded positive matches of 100% identity and 74.16% identity, respectively. Because B. subtilis and B. cereus are different species, a high identity was not expected, and thus, a 74.16% identity further satisfies its use as a control gene for cross-species matches (Table 2). No significant similarities were found between the query toxin protein sequences and the translated BS50 genome. The alignment between the translated BS50 genome and EntFM from B. cereus exhibited only 52.21% identity over a span of 113 amino acids. The EntFM protein sequence is 426 amino acids long, and the alignment only covered 26.5% of the EntFM protein sequence, which is insufficient coverage to conclude that BS50 produces the EntFM protein. The BS50 genome was translated and compared to the seven proteins encoded by the B. cereus cereulide gene cluster cesHPTABCD. There were matches between the BS50 genome and the protein sequences of CesA, CesB, CesC, CesH, CesP, and CesT, all of which were less than 40% identical. CesH aligned at a locus of the BS50 genome that was roughly 1.3 Mb upstream of the other cereulide biosynthesis protein alignments. There were no significant matches with CesD (Table 2).
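The EntFM coverage figure follows directly from the two lengths reported above, as the brief Python check below shows. The function name is hypothetical; the inputs (113 aligned amino acids, 426 amino acids in EntFM) are taken from the text.

```python
# Worked check of the EntFM coverage reported in the text.
def percent_coverage(aligned_length, subject_length):
    """Percentage of the subject sequence spanned by the alignment."""
    return 100.0 * aligned_length / subject_length


entfm_coverage = percent_coverage(113, 426)
print(round(entfm_coverage, 1))  # 26.5
```

Coverage is computed against the full-length toxin protein rather than the alignment span, which is why a match with moderate identity can still be judged insufficient.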

In Silico PCR Amplification of BS50 for Bacillus Toxins
Virtual PCR only yielded matches using the positive control 16S rRNA and spoIVA primers. None of the 11 queried toxin genes were detected in the BS50 genome using virtual PCR (Supplemental Table S2).

Secondary Metabolite Screen via AntiSMASH
To determine if BS50 has the ability to produce secondary metabolites, the BS50 genome was screened for secondary metabolite biosynthetic gene clusters using the online tool, antiSMASH [83]. Ten unique secondary metabolites (two terpene hits) were predicted in the BS50 genome (Table 3).

Secreted Protein via SignalP 6.0 Analysis
To determine if the BS50 genome encodes for secreted proteins, the translated BS50 genome was analyzed for the presence of secreted proteins using the online SignalP 6.0 database [86]. As a result, 151 proteins were predicted with a greater than 50% likelihood to have Sec/SPI motifs, 93 proteins were expected to have Sec/SPII motifs, four proteins were predicted to have Tat/SPI motifs, and three proteins were predicted to have Sec/SPIII motifs.

Virulence Factor Screen via VFDB
To assess if the BS50 genome encodes virulence factors (VF), the translated BS50 genome was aligned against the virulence factor database (VFDB) [87] using BLASTx. There were 12 hits for VF-associated proteins in the BS50 genome (Table 4).

Antibiotic Resistance Gene Analysis
The online tool RGI was used to screen the BS50 genome for antibiotic resistance genes. RGI identified one perfect, three strict, and 275 loose hits. Of the 275 loose hits, only 12 hits had at least 95% identity and were nudged to strict hits (Table 5). Based on the presence of a gene with roughly 98% identity to aadK, an aminoglycoside 6-adenylyltransferase that is part of the ANT6 gene family, BS50 is predicted to be resistant to streptomycin. BS50 is also predicted to be resistant to the macrolides spiramycin and telithromycin due to the presence of mph(K), a macrolide phosphotransferase. Additionally, BS50 is predicted to be resistant to tetracycline due to the presence of a tetracycline efflux pump (Tet(L)). In total, there are 16 potential resistance gene hits, including aadK, mph(K), and tet(45), but only seven hits cover more than 90% of the reference gene sequence.

Insertion Sequences and Mobile Genetic Element Analysis
To assess if the antibiotic resistance genes present within the BS50 genome have the ability to be horizontally transferred to other bacteria, the BS50 genome was screened for insertion sequences using ISfinder and for other mobile genetic elements using the ACLAME database (version 0.4). ISfinder found no matches between the BS50 genome and known insertion sequences with coverages greater than 15%. There were 122 unique loci in the BS50 genome that aligned with known mobile genetic element sequences from the ACLAME database with greater than 50% coverage, e-values less than 1.3 × 10⁻¹¹, and bit scores greater than 65. To assess if these putative mobile genetic elements could play a role in antibiotic resistance gene transfer, the loci of sequences in the BS50 genome matching mobile genetic elements were then compared to the loci of antibiotic resistance genes identified via RGI. Out of the 122 loci that aligned to mobile genetic elements from the ACLAME database, one was found within 5 kb of an antibiotic resistance gene. The nucleotide sequence for the cupin domain-containing protein (NC_006322.1 (1,461,102 to 1,461,695)) was detected 1641 bp upstream of the blt-encoding gene (start position: 3,686,740; stop position: 3,687,924). However, the nucleotide sequence for the cupin domain-containing protein only aligned to the BS50 genome with 80.3% similarity and 67% coverage, and the 174 nt of the 5′ region did not align.

MIC Evaluation of BS50 against Eight Antibiotics
BS50 sensitivity to eight medically relevant antibiotics, including chloramphenicol, clindamycin, erythromycin, gentamicin, kanamycin, streptomycin, oxytetracycline, and vancomycin was determined by MIC methods [93]. BS50 was susceptible to seven of eight antibiotics and exhibited resistance against streptomycin (Table 6).

Blood Hemolysis Assay
To characterize any potential hemolytic activity, BS50 cells were streaked onto sheep blood agar plates and incubated overnight. The agar displayed a greenish hue surrounding the streaks where BS50 colonies grew, indicating that BS50 exhibits alpha-hemolysis.

Caco-2 Cell Viability Assay
Caco-2 cells were treated with BS50 lysate to test for deleterious effects on cell viability. While there was a significant difference in ATP concentrations between the cell lysis control and the untreated control (p = 0.014), the cells exposed to BS50 lysate showed similar ATP concentrations to the untreated control (p = 0.423) (Figure 2). Similarly, there was no significant difference in ATP concentrations between the untreated Caco-2 cell control and the blank sample, nor between the BS50 treatment and blank sample treatment.

Caco-2 Cell TEER Assay
TEER assays were performed to determine the effect of BS50 on gut barrier permeability (Figure 3). Due to variations in the initial TEER measurements across wells, fold-changes relative to 0 h from both trials were combined into one data set for statistical analysis. There were no significant differences in TEER fold-change values between the untreated control, blank process control, and cells treated with BS50 lysate at both 24 h and 48 h post-treatment (p > 0.2), whereas the LPS control lowered TEER compared to all other treatments at 24 h (p < 0.006).

Figure 3 (A,B). TEER was measured before treatment (0 h) and 2, 4, 6, 24, and 48 h after treatment. Square, untreated Caco-2 cells; diamond, "blank" lysate processing control; circle, BS50 lysate treatment; triangle, LPS treatment (TEER reduction control). Data are shown as two separate trials without replication within each trial (n = 1). Values on the y-axis are plotted on a logarithmic scale.

Discussion
Spore-forming bacteria, particularly several Bacillaceae strains, are increasingly used in dietary supplements, food, and beverages due to their resistance to high temperatures and stability during manufacture, storage, and transportation [98]. Furthermore, the European Food Safety Authority (EFSA) has identified 17 Bacillaceae spp. with Qualified Presumption of Safety (QPS) status, including B. subtilis, B. amyloliquefaciens, B. licheniformis, W. coagulans, and P. megaterium, which are used as probiotics for humans and animals [54]. Regardless of the established safety of numerous Bacillaceae species, it is important to assess the safety of each individual strain, as reflected in the QPS qualifications that strains are required to meet (e.g., lack of acquired antimicrobial resistance, lack of cytotoxicity). We show here that B. subtilis strain BS50 exhibits a robust preclinical safety profile. BS50 is a unique B. subtilis strain with at least 98% sequence similarity to commercial probiotic strains such as B. subtilis subsp. natto and B. subtilis MB40 (Supplemental Table S1).
Some Bacillaceae spp., such as B. anthracis, B. cereus, and B. thuringiensis, are pathogenic in humans and animals [55][56][57][58][59]. B. cereus produces the emetic toxin cereulide, the enterotoxins haemolysin BL (Hbl) and non-hemolytic enterotoxin (Nhe), and cytotoxin K (CytK) [60,61]. Strains of other species, such as B. subtilis, B. mojavensis, B. pumilus, and B. fusiformis, can also produce cytotoxic and emetic toxins [99][100][101]. To address whether BS50 is capable of producing toxins, we utilized BLASTn and BLASTx to screen the BS50 genome against the nucleotide and amino acid sequences of known Bacillus toxins, including the Bacillus cereus cereulide gene cluster (cesHPTABCD, a 24-kb gene cluster belonging to the 270 kb plasmid pCER270) (Tables 1 and 2). There were matches between the translated BS50 genome and the protein sequences of CesH, CesP, CesT, CesA, CesB, and CesC, but they had less than 40% sequence identity, and they did not contiguously align throughout the genome. Further, while most of these matches aligned with greater than 90% coverage, CesA and CesB aligned with less than 65% coverage, and there were no significant matches with CesD (Table 2). Given the absence of CesD in the BS50 genome, the non-contiguous alignment, and the low sequence identity and/or coverage to CesH, CesP, CesT, CesA, CesB, and CesC, there is sufficient evidence to conclude that BS50 does not contain a functioning cereulide synthase cluster.
The BS50 genome was also screened in silico for virulence factors and secondary metabolites. The BS50 genome contains secondary metabolite biosynthetic gene clusters and encodes several proteins that are associated with virulence in pathogenic organisms. However, the products encoded by these genes are not innately toxic. Unlike primary metabolites, secondary metabolites are non-essential small organic molecules that may contribute to evolutionary fitness over time, such as by improving survival against competing organisms in the same niche [102]. For example, a few secondary metabolites (e.g., bacillibactin and fengycin) that are synthesized by non-ribosomal peptide synthetases (NRPSs) were predicted to be produced by BS50. Bacillibactin is a catecholate siderophore encoded by the dhb operon (as detected in Table 4) and is involved in the chelation and utilization of ferric iron [103,104].
Because it binds and removes iron, bacillibactin has been proposed as a treatment for Parkinson's disease, in which patients exhibit an accumulation of iron in the substantia nigra of the brain [105]. In silico analysis also predicted that BS50 produces fengycin, an established antimicrobial in preclinical studies and a suggested bioactive in a clinical observational study: the presence of fecal Bacillus spp. was correlated with reduced fecal occurrence of the pathogen Staphylococcus aureus in a rural Thai population [106]. Preclinical experiments suggest that fengycin production by B. subtilis is required to exert this pathogen exclusion effect [106]. Genes for the biosynthesis of two antibiotics, bacilysin and bacillaene, were also detected in the BS50 genome. Bacilysin is a non-ribosomally synthesized dipeptide antibiotic that inhibits Gram-negative foodborne pathogens [107][108][109]. Bacillaene is a polyene antibiotic that can accelerate biofilm formation and has activity against a broad spectrum of bacteria, including S. aureus and E. coli [110][111][112][113]. Bacillaene inhibits bacterial protein synthesis but cannot inhibit eukaryotic protein synthesis. The BS50 genome also encodes proteins involved in capsular polyglutamate synthesis and transport (i.e., CapA, CapB, and CapC). Polyglutamate can enhance the pathogenesis of B. anthracis and S. epidermidis by enabling evasion of the innate immune response [114,115]. Interestingly, poly-γ-glutamic acid isolated from a novel B. sonorensis strain has been shown to inhibit S. aureus and E. coli growth [116].
Most of the secondary metabolites and VF-associated proteins that were detected in the BS50 genome are also widely present throughout many Bacillus genomes [102]. As reported in [102], surfactin, plipastatin/fengycin, bacillibactin, bacillaene, and bacilysin are produced by 99%, 97%, 99%, 77%, and 93% of the B. subtilis strains tested, respectively. Subtilosin A is also produced by several B. subtilis strains, including Strain 22a, a wild strain of B. subtilis isolated from a fermented soybean product [117,118]. In a study of isolates from a mushroom substrate, all four B. subtilis strains produced subtilomycin, whereas none of the other isolated species (including Lactococcus lactis, B. licheniformis, and B. sonorensis) did [119]. As mentioned above, BS50 encodes genes involved in the biosynthesis of polyglutamate (Table 4). Polyglutamate is produced by many commensal Bacillus strains and is found in several Bacillus-fermented foods, including natto [120]. In a study examining polyglutamate synthesis in fermented foods, 4.7%, 1.8%, and 3.0% of the Bacillus-like strains isolated from Cheongkukjang, Doenjang, and Kochujang samples, respectively, produced polyglutamate [121]. Because the metabolites and virulence factors predicted to be synthesized by B. subtilis BS50 are also produced by other strains of B. subtilis, these properties should be considered intrinsic.
BS50 was also screened for the presence of antibiotic resistance genes and for susceptibility to antibiotics. The emergence of multidrug-resistant pathogens is a major global health concern, and overuse of antibiotics has contributed to a greater incidence of antibiotic-resistant pathogens [62,122,123]. Additionally, antibiotic resistance genes present in plasmids, transposons, and integrons can be transferred from one bacterium to another via horizontal gene transfer [63][64][65][66][67]. The GI tracts of humans and animals contain complex and diverse microbial communities that may contribute to the transfer of antibiotic resistance genes from commensal organisms to potentially pathogenic bacteria [124]. BS50 was predicted to encode 16 antibiotic resistance genes that can provide resistance against multiple types of antibiotics, including fluoroquinolones, aminoglycosides, macrolides, lincosamides, tetracyclines, phenicols, nucleoside antibiotics, and peptide antibiotics (Table 5). BS50 was then tested in vitro for susceptibility/resistance against a comprehensive suite of medically relevant antibiotics as established by EFSA guidelines [94,95]. These in vitro susceptibility tests determined that BS50 was resistant to the aminoglycoside streptomycin and susceptible to one phenicol antibiotic, two macrolides/lincosamides, two aminoglycosides, one glycopeptide, and oxytetracycline (Table 6).
Streptomycin resistance is widespread throughout Bacillus species and is most likely a part of their intrinsic genetic makeup rather than a resistance acquired from transferable genetic elements [125]. Regarding antibiotic resistance gene transfer, no plasmids were detected during BS50 genome assembly. While 122 regions of the BS50 genome aligned with mobile genetic elements from the ACLAME database, only one mobile genetic element was within five kb of any antibiotic resistance gene detected via CARD. The mobile genetic element, a cupin-domain-containing protein, was detected 1641 bp upstream of the blt gene, which confers resistance against fluoroquinolone antibiotics and acridine dyes. However, 174 nt of the 5′ region of the sequence encoding the cupin-domain-containing protein did not align to the BS50 genome, suggesting that this gene is non-functional and/or truncated. Thus, BS50 is at low risk of transferring antibiotic resistance genes to human gut-resident bacteria.
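The proximity check described above (flagging any mobile genetic element that lies within five kb of an antibiotic resistance gene) amounts to an interval-distance calculation, sketched below in Python. The function names, the coordinate convention (gap taken as the difference between the nearest end coordinates), and all coordinates in the example are hypothetical; the example coordinates were simply chosen so that the gap equals the 1641 bp reported in the text, and they are not actual BS50 genome positions.

```python
def gap_bp(a_start, a_end, b_start, b_end):
    """Gap between two genomic intervals, as a difference of coordinates.

    Returns 0 if the intervals overlap. The convention (nearest end
    coordinates subtracted) is an assumption made for this sketch.
    """
    if a_end < b_start:
        return b_start - a_end
    if b_end < a_start:
        return a_start - b_end
    return 0

def mges_near_args(mges, args, window=5000):
    """Return (mge, arg, gap) for each MGE/ARG pair whose gap is <= window bp."""
    pairs = []
    for mge_name, m_start, m_end in mges:
        for arg_name, g_start, g_end in args:
            d = gap_bp(m_start, m_end, g_start, g_end)
            if d <= window:
                pairs.append((mge_name, arg_name, d))
    return pairs

# Hypothetical coordinates for illustration only.
mges = [
    ("cupin_domain_protein", 100_000, 100_500),
    ("distant_element", 900_000, 901_200),
]
args = [("blt", 102_141, 103_300)]

print(mges_near_args(mges, args))
```

A flagged pair is only a starting point for review: as in the text, the flagged element would still need to be checked for completeness before it is treated as a credible vehicle for horizontal transfer.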
Of note, the BS50 genome encodes a hemolysin, a putative membrane hydrolase (hlyIII) (Table 4). Therefore, BS50 was streaked onto sheep blood agar plates to assess its ability to lyse blood cells, and it was determined that BS50 exhibits incomplete hemolysis. Hemolytic activity has been detected in several Bacillus strains isolated from commercially available probiotics [126]. While this may present a safety concern if BS50 comes into contact with the bloodstream, the likelihood of an oral probiotic translocating through the intestinal barrier into the bloodstream is small, and translocation has only been reported at very low rates in hospitalized patients [127]. Nonetheless, to address potential concerns with gut barrier integrity and translocation, ATP viability and TEER tests were performed on human colon-derived Caco-2 epithelial cells. We established that BS50 lysates did not negatively affect Caco-2 cell viability or monolayer permeability. Maintenance of Caco-2 cell viability and monolayer barrier integrity during BS50 lysate exposure, together with the in silico safety profile, suggests that BS50 will not be toxic to enterocytes in the human intestine or affect gut barrier integrity. A clinical trial in healthy adults has been initiated to better understand the safety and tolerability of BS50 in humans (ClinicalTrials.gov identifier: NCT04655352; last accessed on 18 April 2022).

Conclusions
Based on the results from in silico and in vitro analyses, BS50 is expected to be safe for human consumption. A clinical trial is being conducted to support the safe use of this strain by humans at anticipated rates of consumption from use in food or dietary supplements.