Pan-Genome Analyses of the Genus Cohnella and Proposal of the Novel Species Cohnella silvisoli sp. nov., Isolated from Forest Soil

Two strains, designated NL03-T5T and NL03-T5-1, were isolated from a soil sample collected from the Nanling National Forests, Guangdong Province, PR China. Both strains were Gram-stain-positive, aerobic, rod-shaped and lophotrichously flagellated. Strain NL03-T5T could secrete extracellular mucus whereas NL03-T5-1 could not. Phylogenetic analysis based on 16S rRNA gene sequences revealed that the two strains belong to the genus Cohnella and were most closely related to Cohnella lupini LMG 27416T (95.9% and 96.1% similarity); both showed 94.0% similarity with Cohnella arctica NRRL B-59459T. The two strains shared 99.8% 16S rRNA gene sequence similarity. The draft genome size of strain NL03-T5T was 7.44 Mbp with a DNA G+C content of 49.2 mol%. The average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) values between NL03-T5T and NL03-T5-1 were 99.98% and 100%, respectively, indicating that the two strains belong to the same species. Additionally, the ANI and dDDH values between NL03-T5T and C. lupini LMG 27416T were 76.1% and 20.4%, respectively. The major cellular fatty acids of strain NL03-T5T included anteiso-C15:0 and iso-C16:0. The major polar lipid was diphosphatidylglycerol (DPG) and the predominant respiratory quinone was menaquinone-7 (MK-7). Based on phylogenetic analysis, phenotypic and chemotaxonomic characterization, genomic DNA G+C content, and ANI and dDDH values, strains NL03-T5T and NL03-T5-1 represent a novel species in the genus Cohnella, for which the name Cohnella silvisoli sp. nov. is proposed. The type strain is NL03-T5T (=GDMCC 1.2294T = JCM 34999T). Furthermore, comparative genomics revealed that the genus Cohnella has an open pan-genome. The pan-genome of 29 Cohnella strains contained 41,356 gene families, and the number of strain-specific genes ranged from 6 to 1649. These results may explain, at the genetic level, the good adaptability of Cohnella strains to different habitats.


Introduction
The genus Cohnella, which lies within the family Paenibacillaceae of the order Bacillales, was first proposed by Kämpfer et al. with the description of Cohnella thermotolerans as the type species [1]. The genus currently comprises 45 species with valid names listed in the LPSN database (https://www.bacterio.net/-allnamesac.html, accessed on 1 August 2023). Cells are Gram-stain-positive, spore-forming or non-spore-forming, aerobic or facultatively anaerobic, motile or non-motile rods. The major respiratory quinone is menaquinone-7 (MK-7), the predominant polar lipids include diphosphatidylglycerol (DPG) and phosphatidylethanolamine (PE), and the main fatty acids include iso-C16:0, anteiso-C15:0 and C16:0 [2]. Species of the genus Cohnella are widely distributed in various environments such as soil [3,4], rhizosphere or root nodules [5], water [6], green algae [7], animal faeces [8], Siberian permafrost [9] and preserved vegetables [10]. Cohnella species are thought to play an important role in recycling plant biomass within soil, with multiple members of the genus possessing genes for the degradation of chitin, xylan, hemicellulose and cellulose [11-13]. During an investigation of bacterial diversity and novelty in soils of the Nanling National Forests, Guangdong Province, PR China, two strains, designated NL03-T5T and NL03-T5-1, were isolated and described as a novel species of the genus Cohnella using a polyphasic taxonomic approach including phylogenetic analysis, physiological and biochemical characterization, and genomic analysis. Furthermore, comparative genomics of Cohnella strains was used to define the pan-genome, core genome, and unique genes, and to assess genetic diversity in order to understand the adaptability of these strains to different habitats.

Strain Isolation and Cultivation
Strains NL03-T5T and NL03-T5-1 were isolated from a soil sample (24°56′16″ N; 113°00′09″ E) collected on 6 October 2020 from the Nanling National Forests, Guangdong Province, China. The two strains were obtained using a standard dilution and plating method. More specifically, 1 g of air-dried soil was added to 9 mL of sterile physiological saline. The mixture was shaken at 160 rpm for 2 h at 30 °C. Ten-fold serial dilutions (10⁻², 10⁻³, 10⁻⁴ and 10⁻⁵) were then prepared and spread separately onto Reasoner's 2A (R2A; Haibo, Qingdao, China) medium. After cultivation at 30 °C for a week, colonies were picked and pure aerobic cultures were obtained by repeated subculture of cells from the edge of the colony. After purification, the strains were stored at −80 °C as suspensions containing 25% glycerol.

16S rRNA Gene Sequence and Phylogenetic Analysis
Genomic DNA of strains NL03-T5T and NL03-T5-1 was extracted from fresh cells grown on R2A agar using a HiPure Bacterial DNA kit (Magen Biotech Co., Ltd., Guangzhou, China) following the manufacturer's instructions. The 16S rRNA genes of NL03-T5T and NL03-T5-1 were amplified from the extracted genomic DNA with the universal primers 27F and 1492R [14]. PCR products were sequenced by Majorbio, China. The 16S rRNA gene sequences were compared against the EzBioCloud (https://eztaxon-e.ezbiocloud.net/) and GenBank (www.ncbi.nlm.nih.gov) databases (accessed on 1 August 2023). The 16S rRNA gene sequences were submitted to the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov/genome, accessed on 1 August 2023) under the accession numbers MZ955418 and OQ913505. Phylogenetic trees based on 16S rRNA genes were reconstructed with the maximum-likelihood (ML) [15], neighbor-joining (NJ) [16] and minimum-evolution (ME) [17] methods using MEGA 7.0 software [18]. The topology of each phylogenetic tree was evaluated with 1000 bootstrap replications, and evolutionary distances were calculated using Kimura's two-parameter model [19]. Based on the phylogenetic analysis, the type strains Cohnella abietis HS21T, Cohnella lupini LMG 27416T, and Cohnella arctica NRRL B-59459T, obtained from the Belgian Co-ordinated Collections of Micro-organisms and the Agricultural Research Service Culture Collection, were used as experimental control strains and cultured under optimum conditions.
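To make the distance model concrete, the following is a minimal sketch of the Kimura two-parameter distance for one pair of aligned sequences. It is not the MEGA implementation; the function name `k2p_distance` and the choice to skip gapped or ambiguous sites (pairwise deletion) are assumptions made for illustration. With P and Q the proportions of transitional and transversional differences, the distance is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)].

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an illustrative choice, not necessarily the setting used in the study).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

For closely related sequences, such as 16S rRNA genes sharing about 96% similarity, the corrected distance stays close to the raw proportion of differing sites; the correction grows as divergence increases.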

Morphological, Physiological, and Biochemical Characteristics
The morphological features of strains NL03-T5T and NL03-T5-1 were observed using a light microscope (DM6/MC190, Leica, Wetzlar, Germany) and a transmission electron microscope (H7650, Hitachi, Tokyo, Japan) after growth on R2A agar for 4 d at 30 °C. […] of CaCl2·2H2O, 1 L of distilled water, pH 7.2). Growth at different temperatures (10, 15, 20, 25, 30, 35, 37, 40 … °C) was tested on R2A agar for 3-4 d. Sodium chloride tolerance was tested at 0, 0.1, 0.2, 0.5, 1.0 and 1.5%, and pH tolerance was tested from pH 5.0 to 9.5 at intervals of 0.5 pH units according to the method described in [20] on R2A medium for 3-4 d. The Gram-staining reaction was performed using a bioMérieux Gram stain kit (bioMérieux, Tokyo, Japan) according to the manufacturer's instructions. Oxidase activity was tested using oxidase test strips [1% (w/v) tetramethyl-p-phenylenediamine, HKM], and catalase activity was determined by bubble production after mixing cells with 3% H2O2. Gliding motility was checked by observing the edges of colonies formed on 1:6-diluted R2A and by using the hanging drop technique as described by Bernardet et al. (2002) [21]. Hydrolysis of casein (1%, w/v), CM-cellulose (1%, w/v), Tween 20 (1%, w/v) and Tween 40 (1%, w/v) was examined as described by Son et al. [22]. Other physiological properties were examined using API 20NE and API ZYM kits (bioMérieux) according to the manufacturer's instructions.

Chemotaxonomic Properties
For the analysis of cellular fatty acids, polar lipids, and respiratory quinones, cells of strain NL03-T5T and its related species were harvested from R2A agar after incubation for 4 d at 30 °C. Fatty acid methyl esters were extracted using the Sherlock Microbial Identification System (MIDI) protocol version 6.1 and analyzed by gas chromatography (model 7890A, Hewlett Packard, Palo Alto, CA, USA) as previously described [23]. The polar lipids were extracted and determined according to the protocol of Tindall et al. [24]. Respiratory quinones were extracted and purified using the method of Minnikin et al. [25] and analyzed using HPLC (UltiMate 3000, Dionex, Thermo Fisher Scientific, Waltham, MA, USA).

Genome Sequencing, Annotation, and Pan-Genomic Analysis
The genomic DNA of strains NL03-T5T and NL03-T5-1 was sequenced on the Illumina HiSeq platform at Shanghai Majorbio Bio-Pharm Technology Co., Ltd. (Shanghai, China). Sequencing reads were assembled into contigs and scaffolds using SPAdes version 3.11.1 with default parameters [26], and the sequences were submitted to the NCBI database under the accession numbers JAIOAP000000000 and JASKHM000000000. Genomic annotation was performed using the Rapid Annotation using Subsystem Technology (RAST) pipeline (http://rast.nmpdr.org/, accessed on 1 August 2023) with default parameters. The genes encoding carbohydrate-active enzymes (CAZymes) were identified using the dbCAN2 meta server (http://cys.bios.niu.edu/dbCAN2, accessed on 1 August 2023) with HMMER annotation (E-value < 1 × 10⁻¹⁵, coverage > 0.35), and biosynthetic gene clusters (BGCs) were annotated using antiSMASH bacterial version 6.1.1 (https://antismash.secondarymetabolites.org, accessed on 1 August 2023) with default parameters. All 27 reference genomes of the genus Cohnella were downloaded from the NCBI database. A pan-genome analysis was performed using the bacterial pan-genome analysis tool (BPGA) pipeline [27] with default parameters. Orthologous genes were identified with the USEARCH algorithm using a threshold of 0.5. Core, accessory, and unique genes were functionally annotated using eggNOG-mapper v2 [28]. The pan-genome data were visualized using the ImageGP online platform (https://www.bic.ac.cn/ImageGP/, accessed on 1 August 2023).
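The CAZyme cutoffs above (E-value < 1 × 10⁻¹⁵, coverage > 0.35) amount to a simple filter over HMMER hits. The sketch below illustrates that filter on made-up records; the `HmmHit` type, the gene identifiers, and the example values are hypothetical and do not reproduce the dbCAN2 output format or any result from this study.

```python
import re
from collections import Counter
from typing import NamedTuple


class HmmHit(NamedTuple):
    """One hypothetical HMMER hit against a CAZyme family model."""
    gene_id: str
    family: str      # e.g. "GH13", "CBM50"
    e_value: float
    coverage: float  # fraction of the HMM covered by the alignment


def filter_hits(hits, max_e_value=1e-15, min_coverage=0.35):
    """Keep hits that pass both cutoffs stated in the text."""
    return [h for h in hits
            if h.e_value < max_e_value and h.coverage > min_coverage]


def count_by_class(hits):
    """Count hits per CAZyme class, taken as the letter prefix of the family."""
    counts = Counter()
    for h in hits:
        match = re.match(r"[A-Za-z]+", h.family)
        counts[match.group(0) if match else "unknown"] += 1
    return counts


# Illustrative records only; not data from the NL03-T5T genome.
example_hits = [
    HmmHit("gene_0001", "GH13", 1e-40, 0.90),
    HmmHit("gene_0002", "CBM50", 1e-20, 0.50),
    HmmHit("gene_0003", "GT2", 1e-10, 0.80),  # fails the E-value cutoff
    HmmHit("gene_0004", "GH5", 1e-30, 0.20),  # fails the coverage cutoff
]
kept = filter_hits(example_hits)
class_counts = count_by_class(kept)
```

Grouping the retained hits by class prefix is one way to arrive at summaries such as the relative abundance of glycoside hydrolases (GHs) and carbohydrate-binding modules (CBMs).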

Phylogenetic Analysis Based on 16S rRNA Genes and Genomic Sequences
Strains NL03-T5T and NL03-T5-1 showed 99.8% 16S rRNA gene sequence similarity. In the EzBioCloud and NCBI databases, the two strains were closely related to species of the genus Cohnella and showed the highest similarities with C. lupini LMG 27416T (95.9% and 96.1%); both exhibited 95.3% and 94.0% similarity with C. abietis HS21T and C. arctica NRRL B-59459T, respectively. The 16S rRNA gene phylogenetic trees based on the ML, NJ, and ME methods (Figures 1a, S1 and S2) all showed that strains NL03-T5T and NL03-T5-1 formed an independent cluster with C. lupini LMG 27416T and C. arctica NRRL B-59459T. Furthermore, the phylogenomic tree indicated that strains NL03-T5T and NL03-T5-1 formed an independent cluster with C. abietis HS21T (Figure 1b). Therefore, we selected C. lupini LMG 27416T, C. abietis HS21T, and C. arctica NRRL B-59459T as reference type strains for taxonomic studies.

Physiological Characterization
The cells of strains NL03-T5T and NL03-T5-1 were Gram-stain-positive, aerobic, lophotrichously flagellated, rod-shaped, 1.2-2.0 µm long and 0.4-0.6 µm in diameter after incubation on R2A agar for 4 d at 30 °C (Figure 2c-f). Colonies were white-cream colored, and strain NL03-T5T could secrete extracellular mucus whereas NL03-T5-1 could not (Figure 2a,b). The two strains grew on R2A, but not on NA, TSA, Mac or MD1 agar. Physiological analyses indicated that strains NL03-T5T and NL03-T5-1 were able to grow at 10-37 °C and pH 5.0-8.5 and could tolerate 0.5% (w/v) NaCl (Table 1). Strains NL03-T5T and NL03-T5-1 were negative for oxidase activity whereas their closely related species C. abietis HS21T, C. lupini LMG 27416T, and C. arctica NRRL B-59459T were positive. In addition, strain NL03-T5T was positive for hydrolysis of Tween 20 and for α-mannosidase activity whereas its related species C. lupini LMG 27416T and C. arctica NRRL B-59459T were negative. Additional differences between strain NL03-T5T and its closely related species are shown in Table 1.

Chemotaxonomic Analysis
The cellular fatty acid compositions of strain NL03-T5T and the reference strains C. abietis HS21T, C. lupini LMG 27416T, and C. arctica NRRL B-59459T are given in Table 2.

Genomic Characteristics and OGRI Values
The draft genome size of strain NL03-T5T was 7.44 Mbp with 43 contigs and an N50 value of 310,309 bp. The draft genome size of strain NL03-T5-1 was also 7.44 Mbp, with 41 contigs and an N50 value of 310,325 bp. The genomic DNA G+C contents of the two strains were both 49.2 mol%, which is lower than those of C. lupini LMG 27416T and C. arctica NRRL B-59459T (50.7 and 50.3 mol%) and higher than that of C. abietis HS21T (44.8 mol%) (Table 1). The distribution of genes of strain NL03-T5T into RAST functional categories revealed that the highest percentages of genes were assigned to carbohydrates (18.5%), amino acids and derivatives (17.3%), protein metabolism (9.3%) and cofactors, vitamins, prosthetic groups, and pigments (8.9%) (Table S1). In addition, strains NL03-T5T and NL03-T5-1 differed by 20 genes in the functional category of amino acids and derivatives, which probably accounts for the phenotypic differences between the two strains (Table S1). The antiSMASH tool identified four complete BGCs and four BGCs located on contig edges that might be fragments of BGCs. The four complete BGCs comprise a resorcinol, a terpene, a RiPP-like and a phosphonate cluster. The dbCAN2 analysis of the NL03-T5T genome predicted 513 CAZymes distributed across 114 different CAZyme families, with glycoside hydrolases (GHs) and carbohydrate-binding modules (CBMs) constituting the most abundant classes (Figure S4). Additionally, the ANI and dDDH values between strains NL03-T5T and NL03-T5-1 were 99.98% and 100%, respectively, indicating that the two strains belong to the same species. The ANI values between NL03-T5T and its closely related species C. abietis HS21T and C. lupini LMG 27416T were 75.7% and 76.0%, respectively. The corresponding dDDH values were both 20.4%. These values are lower than the threshold values of 95-96% (ANI) and 70% (dDDH) for species discrimination [33], indicating that strains NL03-T5T and NL03-T5-1 represent a novel species.
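The species-level reasoning above can be written as a small check against the stated cutoffs. This is only a sketch: the function name is hypothetical, and taking 95% (the lower end of the 95-96% range) as the ANI cutoff is an assumption made for illustration. The ANI and dDDH values are those reported in the text.

```python
# Cutoffs quoted in the text; 95.0 is the lower bound of the 95-96% ANI range
# and is an assumption for this sketch.
ANI_SPECIES_CUTOFF = 95.0
DDDH_SPECIES_CUTOFF = 70.0


def same_species(ani: float, dddh: float) -> bool:
    """Return True if both genome-relatedness indices reach the species cutoffs."""
    return ani >= ANI_SPECIES_CUTOFF and dddh >= DDDH_SPECIES_CUTOFF


# (ANI %, dDDH %) of strain NL03-T5T against each comparison genome,
# as reported in the text.
comparisons = {
    "NL03-T5-1": (99.98, 100.0),
    "C. abietis HS21T": (75.7, 20.4),
    "C. lupini LMG 27416T": (76.0, 20.4),
}

results = {name: same_species(ani, dddh)
           for name, (ani, dddh) in comparisons.items()}

for name, is_same in results.items():
    verdict = "same species" if is_same else "different species"
    print(f"NL03-T5T vs {name}: {verdict}")
```

Only the comparison with NL03-T5-1 passes both cutoffs, which matches the conclusion that the two isolates are conspecific but distinct from their closest relatives.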

Pan-Genome Analysis of the Genus Cohnella
The pan-genome of the 29 Cohnella strains comprised 41,356 gene families. Core genes were present in all 29 genomes, accessory genes were present in 2-28 genomes, and unique genes were present in only one genome. The numbers of core genes, accessory genes, and unique genes were 492 (1.2%), 19,166 (46.3%), and 21,698 (52.5%), respectively (Figure 3a). The number of strain-specific genes ranged from 6 to 1649 (Figure 3b), suggesting substantial differences among the genomes of the 29 Cohnella strains. The size of the pan-genome increased with the number of genomes, whereas the core-genome size decreased as genomes were added (Figure 3c). The curves of pan-genome and core-genome sizes indicated an open pan-genome for the genus Cohnella, which was supported by the value of parameter b (0.567477, between zero and one) in the power-law regression function. The new gene distribution and gene family distribution of the 29 Cohnella strains are shown in Figure 3d and 3e, respectively. Functional characterization of core, accessory, and unique genes was conducted using COG annotation. As shown in Figure 3f, many core, accessory, and unique genes were assigned to category "S", indicating that their functions remain to be studied. Apart from poorly characterized categories, the largest proportion of core genes belonged to the category "translation, ribosomal structure and biogenesis (J)", followed by "amino acid transport and metabolism (E)". In contrast, the highest percentage of accessory genes and unique genes was related to carbohydrate transport and metabolism (G), followed by transcription (K). These results may explain, at the genetic level, the good adaptability of Cohnella strains to different habitats through gene gains or losses during frequent evolutionary changes.
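The definitions and the power-law criterion above can be illustrated with a short sketch. It is not the BPGA implementation: the function names are hypothetical, the presence/absence input format is an assumption, and the fit is an ordinary least-squares regression in log-log space, which is a simplification of whatever fitting procedure the pipeline applies. The sketch assumes a curve of the form pan-genome size = a · n^b, where n is the number of genomes and 0 < b < 1 is read as an open pan-genome.

```python
import math


def classify_gene_families(families, n_genomes):
    """Count core, accessory, and unique gene families.

    `families` maps a family identifier to the set of genomes that carry it.
    Core: present in all genomes; unique: present in exactly one genome;
    accessory: everything in between.
    """
    counts = {"core": 0, "accessory": 0, "unique": 0}
    for genomes in families.values():
        k = len(genomes)
        if k == 0:
            continue  # ignore empty entries
        if k == n_genomes:
            counts["core"] += 1
        elif k == 1:
            counts["unique"] += 1
        else:
            counts["accessory"] += 1
    return counts


def fit_power_law(n_values, sizes):
    """Least-squares fit of size = a * n**b on log-transformed data."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def is_open_pan_genome(b):
    """An exponent strictly between zero and one is read as an open pan-genome."""
    return 0.0 < b < 1.0


# Proportions implied by the counts reported in the text.
reported = {"core": 492, "accessory": 19166, "unique": 21698}
total = sum(reported.values())
percentages = {k: round(100 * v / total, 1) for k, v in reported.items()}
```

With the reported counts, `total` equals the 41,356 gene families of the pan-genome, and the rounded percentages reproduce the 1.2%, 46.3%, and 52.5% given above; the reported b of 0.567477 falls in the open range.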

Discussion
In this study, we isolated two bacterial strains, designated NL03-T5T and NL03-T5-1, from a soil sample collected from the Nanling National Forests, PR China. The two strains showed 99.8% 16S rRNA gene sequence similarity, and the ANI and dDDH values between them were 99.98% and 100%, respectively. These results indicate that the two strains belong to the same species. However, strain NL03-T5T could secrete extracellular mucus whereas NL03-T5-1 could not (Figure 2a,b). Furthermore, the distribution of genes into RAST functional categories revealed 20 gene differences between the two strains in the category of amino acids and derivatives (Table S1). These gene differences probably account for the phenotypic differences. Based on phylogenetic analysis, genomic DNA G+C content, ANI and dDDH values, physiological characterization and chemotaxonomic analysis, strain NL03-T5T was identified as representing a novel species in the genus Cohnella, for which the name Cohnella silvisoli sp. nov. is proposed.
Strains of the genus Cohnella are widely distributed in different environments. This can be explained, to a certain extent, by the ability of Cohnella to acclimatize to many environments. Comparative genomics indicated that the genus Cohnella has an open pan-genome and exhibits broad genetic diversity. In general, an open pan-genome is predominant in bacteria that are susceptible to horizontal gene transfer (HGT) [34]. The pan-genome of 29 Cohnella strains contained 41,356 gene families, and the numbers of core genes, accessory genes, and unique genes were 492, 19,166, and 21,698, respectively (Figure 3a,c). Tettelin et al. (2008) have illustrated that the core genome is essential for the basic lifestyle of bacteria, whereas the accessory genome and unique genes provide characteristics such as species diversity and environmental adaptability [35]. Therefore, we preliminarily inferred that the genomic differences among the Cohnella strains might be associated with the environments they colonize.
The type strain is NL03-T5T (=GDMCC 1.2294T = JCM 34999T), which was isolated from a soil sample collected from the Nanling National Forests, Guangdong Province, PR China. The genomic DNA G+C content of the type strain is 49.2 mol%.

Figure 1. Phylogenetic relationships among species of the genus Cohnella. (a) Maximum-likelihood phylogenetic tree based on 16S rRNA gene sequences generated using MEGA 7.0 software. Bootstrap values (expressed as percentages of 1000 replications) > 50% are shown at nodes. Bar, 0.01 substitutions per nucleotide position. (b) The UBCG phylogenetic tree based on 92 up-to-date bacterial core gene sequences, constructed using the ML algorithm. Bar, 0.05 substitutions per nucleotide position. GenBank accession numbers are shown in parentheses.


Microorganisms 2023, 11

Figure 3. The pan-genome analysis of the genus Cohnella. (a) The numbers and proportions of core genes, accessory genes, and unique genes. (b) Flower plot showing the core genome and the strain-specific genes of the Cohnella strains. (c) Boxplots of the pan-genome (blue) and core genome (red) of the 29 analyzed genomes. (d) Number of new genes represented within the numbers of Cohnella genomes. (e) Number of gene families represented within the numbers of Cohnella genomes. (f) The proportions of COG functional categories of core genes, accessory genes, and unique genes. J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination, and repair; D, cell cycle control, cell division, and chromosome partitioning; V, defense mechanisms; T, signal transduction mechanisms; M, cell wall/membrane/envelope biogenesis; N, cell motility; U, intracellular trafficking, secretion, and vesicular transport; O, posttranslational modification, pro…

Table 1. Differential phenotypic characteristics of strains NL03-T5T, NL03-T5-1 and their closely related species of the genus Cohnella. All data are from this study unless indicated otherwise. +, positive; −, negative; w, weakly positive reaction; ND, no data available.