Negativibacillus massiliensis gen. nov., sp. nov., a New Bacterial Genus Isolated from a Human Left Colon Sample

Abstract: A new genus, a member of the Ruminococcaceae family, was isolated from the left colon of a healthy woman. Strain Marseille-P3213 was a non-motile, spore-forming, Gram-stain negative, rod-shaped bacterium. This strictly anaerobic species reached optimal growth after an incubation of 72 h at 37 °C. The 16S rRNA gene sequence of this strain shared a 93.52% similarity level with Harryflintia acetispora strain V20-281a, its closest phylogenetic neighbor with standing in the nomenclature. Its genome had a size of 2.87 Mb, with a 45.81% G + C content. We hereby propose the creation of Negativibacillus massiliensis strain P3213T as the 43rd genus within the Ruminococcaceae family.


Introduction
The bacterial diversity of the gut microbiota has a tremendous impact on physiological functions and disease susceptibility. Therefore, studies regarding its role and diversity are of great importance for human health [1,2], and they have increased exponentially over the last two decades [3]. These studies were mostly conducted using culture-independent approaches [1]. However, the last decade has witnessed a turning point in the study of gut microbiota diversity with a rebirth of culture methods [2,4], in particular high-throughput culture methods such as culturomics [2,5]. The culturomics method was developed for the exploration of gut microbiota diversity, and is based on the multiplication of culture conditions with a variety of physicochemical parameters, such as the culture medium, atmosphere, temperature and pH [5]. This technique is complementary to metagenomics studies [2,5], and has increased the diversity of the cultured human intestinal microbiota, and therefore the repertoire of bacterial species known in humans [6].
In 2016, as a part of a culturomics study focused on the modifications of human gut microbiota diversity along the different anatomical sites of the gastrointestinal tract [7], a new member of the Ruminococcaceae family was isolated. The family Ruminococcaceae (formerly known as Clostridium cluster III [8]) was first coined in 2010, and presently consists of 41 genera which are all strictly anaerobic [9]. These bacterial species are morphologically diverse, including cocci and bacilli, as well as intermediate forms [8]. This family, of which the members share the ability to break down cellulose [10], includes numerous commensals of the human gut [11].
In this study, we present a complete description of this new member of the Ruminococcaceae family according to the concept of taxonogenomics. This innovative concept uses a combination of phenotypic, proteomic and genomic characteristics [12,13] to classify and describe new bacterial species.

Sample Collection
The left colon sample was collected from a seventy-six-year-old woman who had a body mass index of 26 kg/m². This woman was hospitalized for a colonoscopy and fibroscopy in order to assess the condition of her esophagus and observe colonic polyps. The samples were immediately cultured without prior storage. Written consent was obtained from the patient for this study, which was validated by the ethics committee of the Institut Fédératif de Recherche IFR48 under number 09-022.

Bacterial Strain and Identification
The bacterial diversity of this sample was studied using the standardized culturomics technique consisting of 18 culture conditions [14].
For each of the 18 culture conditions, the samples were incubated in a liquid medium for enrichment. Subsequently, at different timepoints (incubation days 1, 3, 7, 10, 15, 21 and 30), this enriched culture was diluted and seeded on 5% sheep-blood-enriched Columbia agar (COS, bioMérieux, Craponne, France). The colonies obtained after 48 to 72 h of incubation were then subcultured on COS plates. Each colony was identified using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) [15]. Each colony was tested in duplicate; the obtained spectra were matched against a database consisting of the protein profiles of numerous species. A colony was considered to be identified if the score obtained was higher than 1.9. Between 1.7 and 1.9, the colony was considered identified at the genus level. If the score was lower than 1.7, the 16S rRNA gene was sequenced using the fD1-rP2 primers, as previously described [16], on a 3130-XL sequencer (Applied Biosystems, Saint Aubin, France). The obtained spectra were then added to the in-lab database, which is publicly available at https://www.mediterranee-infection.com/acces-ressources/base-de-donnees/urms-data-base (accessed on 25 December 2020). The obtained sequences were matched against the NCBI database using BLASTn. As defined by Stackebrandt and Ebers [17], the 16S rRNA similarity thresholds used to define a new species and a new genus were 98.65% and 95%, respectively.
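The score-based triage described above can be summarized in a few lines of code. The following is a minimal sketch, not the authors' software: the function name `maldi_tof_call` and the returned labels are hypothetical, and the handling of scores exactly equal to 1.7 or 1.9 (treated here as genus-level) is an assumption, since the text only specifies "higher than 1.9", "between 1.7 and 1.9" and "lower than 1.7".

```python
def maldi_tof_call(score: float) -> str:
    """Map a MALDI-TOF MS match score to the follow-up described in the text.

    Boundary handling (exactly 1.7 or 1.9 -> "genus") is an assumption.
    """
    if score > 1.9:
        # Score higher than 1.9: the colony is considered identified.
        return "identified"
    if score >= 1.7:
        # Score between 1.7 and 1.9: identification at the genus level only.
        return "genus"
    # Score lower than 1.7: no reliable match, so the 16S rRNA gene is sequenced.
    return "sequence_16S"


if __name__ == "__main__":
    # Illustrative scores only; these are not measurements from the study.
    for s in (2.2, 1.8, 1.5):
        print(s, "->", maldi_tof_call(s))
```

A colony such as strain P3213, which scored under 1.7, would fall into the last branch and proceed to 16S rRNA gene sequencing.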

Morphologic and Biochemical Characteristics, and Antibiotic Susceptibility
Phenotypic characteristics such as Gram staining (bioMérieux, Craponne, France), oxidase (Becton Dickinson, Arcueil, France), catalase (bioMérieux, Craponne, France), motility and sporulation were determined according to the manufacturer's instructions. Motility was determined by observing fresh bacterial colonies using an optical microscope at ×100 magnification. In order to test the spore-forming ability of strain P3213, a thermal shock was used: a suspension of 10⁸ CFU/mL was heated in a dry bath for 20 min at 80 °C, and 50 µL of this bacterial suspension was then inoculated on COS plates.
Morphologic observations were also carried out by performing negative staining. Formvar-coated grids were deposited on a 40 µL drop of bacterial suspension and incubated at 37 °C for 30 min. This was followed by a 10 s incubation on 1% ammonium molybdate. The grids were dried on blotting paper and observed with a Tecnai G20 transmission electron microscope (FEI Company, Limeil Brevannes, France).
In order to determine the metabolic features of strain P3213, three API strips were used. The API 50CH strip allowed the evaluation of the capacity of the studied strain to metabolize carbohydrates and their derivatives, such as heterosides, polyalcohols and uronic acids. The enzymatic capacity was evaluated using an API ZYM strip, while an API 20A strip was used to complete the metabolic profile of the strain. All of the strips were used according to the manufacturer's instructions (bioMérieux, Craponne, France). The antibiotic susceptibility was determined using the disk diffusion method according to the 2015 recommendations of the European Committee on Antimicrobial Susceptibility Testing [18].

Fatty Acid Methyl Ester (FAME) Analysis by GC/MS
The cellular fatty acid methyl ester (FAME) analysis was performed by GC/MS in duplicate, with approximately 40 mg bacterial biomass per tube harvested from several culture plates. The fatty acid methyl esters were prepared as described by Sasser and colleagues [19]. The GC/MS analyses were carried out as described before [20]. Briefly, the fatty acid methyl esters were separated using an Elite 5-MS column and monitored by mass spectrometry (Clarus 500-SQ 8 S, Perkin Elmer, Courtaboeuf, France). The obtained spectra were matched with the Standard Reference Database 1A (NIST, Gaithersburg, MD, USA) and the FAMEs mass spectral database (Wiley, Chichester, UK) using MS Search 2.0.

Genome Sequencing
For the genomic DNA (gDNA) extraction, strain P3213 was cultured on three COS plates at 37 °C anaerobically. The plates were harvested and resuspended in 400 of Tris-EDTA (TE) buffer. The chemical lysis was carried out by adding 1 mL TE buffer; subsequently, a 30 min incubation at 37 °C with 2.5 µg/µL lysozyme (Sigma Aldrich, Saint-Quentin Fallavier, France) was performed, followed by an overnight incubation at 37 °C with 20 µg/µL proteinase K (Euromedex, Souffelweyersheim, France). The gDNA was purified using three consecutive phenol-chloroform extractions with an ethanol precipitation performed overnight at −20 °C. After centrifugation, the DNA was resuspended in 160 µL elution buffer.
The gDNA was quantified using a Qubit assay with a high sensitivity kit (Life Technologies, Carlsbad, CA, USA) to a concentration of 88.3 ng/µL, and then sequenced using the MiSeq Technology (Illumina Inc., San Diego, CA, USA) with the mate pair strategy. After barcoding the gDNA using the Nextera Mate Pair sample prep kit (Illumina, Evry, France) in order to mix it with gDNA from 11 other projects, 1.5 µg was used to prepare the mate pair library following the Nextera mate pair Illumina guide. The gDNA was simultaneously fragmented and labeled using a mate pair junction adapter. The fragmentation pattern was checked using an Agilent 2100 BioAnalyzer (Agilent Technologies Inc., Santa Clara, CA, USA) with a DNA 7500 labchip. DNA fragments ranging in size from 1.5 kb up to 11 kb were obtained, with an average size of 6.639 kb. The labeled fragments were circularized before mechanical shearing into small fragments, with an optimal size of 843 bp, on the Covaris device S2 in T6 tubes (Covaris, Woburn, MA, USA). Using a High Sensitivity Bioanalyzer LabChip (Agilent Technologies Inc., Santa Clara, CA, USA), the library profile was visualized and then normalized at 2 nM, and pooled.
After the DNA denaturation and dilution at 15 pM, the pool of libraries was loaded onto the instrument, along with the flow cell for the automated cluster generation and sequencing run.
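As a small worked example of the quantities reported above, the volume of eluate needed to provide the 1.5 µg library input can be derived from the measured concentration. This is a sketch only: the helper name `volume_for_mass` is hypothetical, and the total-yield estimate assumes that the 88.3 ng/µL reading applies to the whole 160 µL eluate, which the text does not state explicitly.

```python
def volume_for_mass(mass_ng: float, conc_ng_per_ul: float) -> float:
    """Return the volume (µL) that contains `mass_ng` of DNA at the given concentration."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / conc_ng_per_ul


CONC_NG_PER_UL = 88.3      # Qubit concentration reported in the text
ELUTION_VOLUME_UL = 160.0  # elution volume reported in the text
LIBRARY_INPUT_NG = 1500.0  # 1.5 µg used for the mate pair library

# Volume of eluate carrying the library input (about 17 µL).
input_volume_ul = volume_for_mass(LIBRARY_INPUT_NG, CONC_NG_PER_UL)

# Assumed total yield if the whole eluate is at the measured concentration.
total_yield_ng = CONC_NG_PER_UL * ELUTION_VOLUME_UL

if __name__ == "__main__":
    print(f"input volume: {input_volume_ul:.2f} µL")
    print(f"estimated total yield: {total_yield_ng / 1000:.1f} µg")
```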

Genome Assembly, Annotation and Comparison
The 1,423,574 paired reads were trimmed and assembled into eight scaffolds using SPAdes 3.9.0. Contigs shorter than 800 bp, as well as those with a coverage of less than 25% of the average coverage, were removed. The Open Reading Frames (ORFs) were predicted using Prodigal [21] with the default parameters. The protein-coding genes were predicted by matching against the NR database using BLASTP with an E-value of 10⁻³ (10⁻⁵ for sequences shorter than 80 amino acids), a coverage of 0.7 and an identity percentage of 30%. The predicted protein-coding genes were then matched against the Clusters of Orthologous Groups (COG) database using BLASTP (E-value 10⁻³, coverage 0.7 and identity 30%) in order to infer the functional abilities of the described organism. The transfer RNA (tRNA) genes were predicted using the tRNAScanSE tool [22], whereas the ribosomal RNAs (rRNA) were predicted using RNAmmer [23]. The lipoprotein signal peptides and the number of transmembrane helices were predicted using Phobius [24]. Genes were considered to be ORFans if no hits were obtained using BLASTP (E-value smaller than 10⁻³ for ORFs longer than 80 aa, or smaller than 10⁻⁵ for ORFs shorter than 80 aa).
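The contig filter and the length-dependent BLASTP cutoff described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the `Contig` class and both function names are hypothetical, the "average coverage" is taken as the unweighted mean over contigs (a length-weighted mean would be an equally plausible reading), and contigs exactly at a threshold are kept.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Contig:
    name: str
    length_bp: int
    coverage: float


def filter_contigs(contigs: List[Contig],
                   min_length_bp: int = 800,
                   min_cov_fraction: float = 0.25) -> List[Contig]:
    """Drop contigs below 800 bp or below 25% of the average coverage."""
    if not contigs:
        return []
    # Assumption: unweighted mean coverage across all contigs.
    avg_cov = mean(c.coverage for c in contigs)
    return [
        c for c in contigs
        if c.length_bp >= min_length_bp and c.coverage >= min_cov_fraction * avg_cov
    ]


def blastp_evalue_cutoff(protein_length_aa: int) -> float:
    """E-value threshold from the text: 1e-5 below 80 aa, otherwise 1e-3."""
    return 1e-5 if protein_length_aa < 80 else 1e-3


if __name__ == "__main__":
    # Toy data for illustration only; not contigs from strain P3213.
    toy = [
        Contig("a", 5000, 100.0),
        Contig("b", 500, 100.0),   # too short
        Contig("c", 3000, 10.0),   # coverage below 25% of the mean (75.0)
        Contig("d", 2000, 90.0),
    ]
    print([c.name for c in filter_contigs(toy)])
```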
For the genome comparison, the genomes were selected from the 16S rRNA phylogenetic tree using Phylopattern, an XEGEN software [25]. The sequences retrieved from the NCBI FTP site included the complete genome sequence, the proteome (all gene sequences encoding proteins in a genome) and the ORFeome (all gene sequences encoding orphan proteins in a genome). The proteomes were analyzed using proteinOrtho [26]. Moreover, the distribution into functional classes of the predicted genes according to the clusters of orthologous groups of proteins was performed as described above. The genomic similarity between the compared genomes was evaluated using two parameters: digital DNA-DNA Hybridization (dDDH), a parameter highly correlated with DDH [27,28], and OrthoANI [29]. OrthoANI, a similarity score consisting of the mean value of nucleotide similarity between two compared genomes determined using the OAT software, was computed for each pair of genomes. The dDDH was determined using the Type Strain Genome Server TYGS (https://tygs.dsmz.de/, accessed on 25 December 2020) [30] and interpreted with the d4 formula, as recommended. The Multi-Agent software system DAGOBAH [31], including Figenix [32] libraries, was used to perform the annotation and comparison.

Strain Identification
Strain P3213 was isolated after 14 days of preincubation in an anaerobic blood culture bottle supplemented with 5 mL sheep blood and 5 mL sterile rumen fluid, followed by a 72 h incubation in an anaerobic atmosphere at 37 °C on COS plates.
A score under 1.7 was obtained for strain P3213 after the MALDI-TOF MS analysis. Therefore, the 16S rRNA gene was sequenced. The sequence, which is available under accession number LT598596 (Marseille-P3213), exhibited a 93.52% identity with Harryflintia acetispora strain V20-281a (GenBank accession no. KU999999), the phylogenetically closest species with a validly published name, as shown in the phylogenetic tree in Figure 1. The reference spectra for Negativibacillus massiliensis (Figure 2) were added to the MALDI-TOF database.
The GenBank accession numbers for the 16S rRNA gene are indicated in parentheses. The sequences were aligned, and the phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA7 software. The numbers at the nodes are the percentages of the bootstrap values obtained by repeating the analysis 1000 times, in order to generate a majority consensus tree. Catabacter hongkongensis was used as an outgroup. The scale bar represents a 1% nucleotide sequence divergence.

Optimal Growth Conditions
Growth was observed after 48 h at 37 °C and 45 °C, but not at 25, 28 or 56 °C. This strain is strictly anaerobic, as no growth was observed aerobically or microaerobically. The optimal growth was obtained at 37 °C after 48 h. Strain P3213 was able to grow at all of the tested pH values, ranging from 5 to 8, with an optimum pH of 7.5. Conversely, no growth was observed at the NaCl concentrations tested. However, strain P3213 was able to grow with the NaCl concentration of 10 g/L contained in the modified Columbia agar medium used to assess its pH tolerance.

Genome Properties
The genome of strain P3213 is 2,876,881 bp long, with a 45.41% GC content. It consists of eight scaffolds (composed of eight contigs). Of the 2779 predicted genes, 2716 were protein-coding genes, and 63 were RNAs (five copies of 5S rRNA, one copy of 16S rRNA, one copy of 23S rRNA, and 56 tRNA genes). A total of 2453 genes (90.32%) were assigned a putative function (by COGs or by NR BLAST). The comparison with the Clusters of Orthologous Groups (COGs) database allowed us to assign a function to 1827 predicted proteins (67.27%), with transcription [K] (172 proteins, 8.56%) and translation [I] (159 proteins, 7.91%) being the most represented functions. Moreover, 9.66% of the proteins, i.e., 194, were assigned to an unknown function [S]. In total, 21 genes were identified as ORFans (0.77%). The remaining 148 genes (5.45%) were annotated as hypothetical proteins (Figure 5, Table 2).
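The gene counts above can be cross-checked with simple arithmetic. The sketch below is illustrative only (the variable and function names are hypothetical) and assumes that the whole-genome percentages are expressed relative to the 2716 protein-coding genes, which is the denominator that reproduces the reported values.

```python
PROTEIN_CODING = 2716
RNA_GENES = {"5S rRNA": 5, "16S rRNA": 1, "23S rRNA": 1, "tRNA": 56}


def pct(count: int, total: int = PROTEIN_CODING) -> float:
    """Percentage of `total`, rounded to two decimals as in the text."""
    return round(100 * count / total, 2)


# Protein-coding genes plus RNA genes should give the total predicted genes.
total_genes = PROTEIN_CODING + sum(RNA_GENES.values())

summary = {
    "putative function": pct(2453),
    "COG-assigned": pct(1827),
    "ORFans": pct(21),
    "hypothetical proteins": pct(148),
}

if __name__ == "__main__":
    print("total predicted genes:", total_genes)
    for label, value in summary.items():
        print(f"{label}: {value}%")
```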

Conclusions
Strain Marseille-P3213T exhibits a 16S rRNA sequence divergence of more than 5% with its phylogenetically closest genus with standing in the nomenclature. In addition, the highest OrthoANI and dDDH values observed were well under 95% and 70%, respectively, with the closest species with a validly published name. We consequently suggest the creation of a new genus Negativibacillus gen. nov., type species Negativibacillus massiliensis sp. nov., type strain Marseille-P3213T, within the Ruminococcaceae family.
The genome of strain P3213 is 2,876,881 bp long, with a 45.41% GC content, and is accessible under FTRU00000000 in the GenBank collection. The 16S rRNA sequence is also accessible in the GenBank collection under accession number NR_147378. The type strain Marseille-P3213 (=CSUR P3213 = DSM 103594) was isolated from the left colon of a French woman.
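The decision thresholds used in this study (16S rRNA similarity of 98.65% for species and 95% for genus; OrthoANI of 95% and dDDH of 70% for species) can be expressed as a short sketch. The function names and labels are hypothetical, requiring both genomic criteria together is an assumption made for illustration, and the genomic values in the usage example are placeholders rather than results from this study.

```python
SPECIES_16S_PCT = 98.65   # 16S rRNA similarity threshold for a new species
GENUS_16S_PCT = 95.0      # 16S rRNA similarity threshold for a new genus
ORTHOANI_SPECIES_PCT = 95.0
DDDH_SPECIES_PCT = 70.0


def rank_from_16s(similarity_pct: float) -> str:
    """Suggest a taxonomic novelty level from the best 16S rRNA similarity."""
    if similarity_pct < GENUS_16S_PCT:
        return "putative new genus"
    if similarity_pct < SPECIES_16S_PCT:
        return "putative new species"
    return "known species"


def same_species_by_genome(orthoani_pct: float, dddh_pct: float) -> bool:
    """Assumption: both genomic criteria must be met to call the same species."""
    return orthoani_pct >= ORTHOANI_SPECIES_PCT and dddh_pct >= DDDH_SPECIES_PCT


# 93.52% is the similarity reported for strain P3213 against its closest neighbor.
p3213_similarity = 93.52
p3213_divergence = round(100 - p3213_similarity, 2)

if __name__ == "__main__":
    print(rank_from_16s(p3213_similarity), f"(divergence {p3213_divergence}%)")
    # Placeholder genomic values, for illustration only.
    print(same_species_by_genome(orthoani_pct=80.0, dddh_pct=25.0))
```

With a similarity of 93.52%, the divergence is 6.48%, which is above the 5% genus-level cutoff and consistent with the proposal of a new genus.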

Nucleotide Sequence Accession Number
The 16S rRNA gene and genome sequences were deposited in GenBank under accession numbers NR_147378 and FTRU00000000, respectively.

Deposit in Culture Collections
Strain Marseille-P3213T was deposited in the Collection de Souches de l'Unité des Rickettsies under the number CSUR P3213, and in the DSMZ collection under the number DSM 103594.

Supplementary Materials: The following are available online at https://www.mdpi.com/2036-7481/12/1/4/s1, Figure S1: Phylogenetic tree based on the genomic sequences of closely related species with available genomes, Table S1: List of type species of validly published genera within the Ruminococcaceae family and their 16S similarity with Negativibacillus massiliensis.