Genome Sequencing of five Lacticaseibacillus Strains and Analysis of Type I and II Toxin-Antitoxin System Distribution

The analysis of bacterial genomes is a potent tool to investigate the distribution of specific traits related to the ability of surviving in particular environments. Among the traits associated with the adaptation to hostile conditions, toxin–antitoxin (TA) systems have recently gained attention in lactic acid bacteria. In this work, genome sequences of Lacticaseibacillus strains of dairy origin were compared, focusing on the distribution of type I TA systems homologous to Lpt/RNAII and of the most common type II TA systems. A high number of TA systems have been identified spread in all the analyzed strains, with type I TA systems mainly located on plasmid DNA. The type II TA systems identified in these strains highlight the diversity of encoded toxins and antitoxins and their organization. This study opens future perspectives on the use of genomic data as a resource for the study of TA systems distribution and prevalence in microorganisms of industrial relevance.


Introduction
Toxin-antitoxin (TA) systems are widely distributed in bacterial genomes and are involved in various cellular processes, including adaptive response to environmental conditions, by regulating cell growth and/or death [1,2]. TA systems generally consist of a toxin that interferes with essential cellular functions, and a cognate antitoxin, capable of neutralizing the toxic effect. TA systems are classified according to the nature of the antitoxin and its mode of inhibition of toxin activity [3]. In particular, type I TA systems are composed of a toxic peptide and a non-coding RNA antitoxin able to inhibit toxin expression by interacting with toxin encoding mRNA. In type II TA systems, both toxin and antitoxin are proteins that can interact forming an inactive complex. Type I TA systems located in plasmid DNA are involved in plasmid maintenance, while the role of their homologs identified in chromosomal DNA has not been clarified. Type II TA systems, usually identified in chromosomal DNA, represent bacterial stress response mechanisms. Currently, the increasing availability of genome sequences has allowed us to grasp more about the complexity of bacterial lifestyle and adaptation features and has expanded the possibilities to identify and describe new TA systems. Attention has been given to the distribution and prevalence of TA systems in various food-associated microorganisms. Recently, genome-wide screening allowed the identification of novel TA systems in the food-borne pathogens Staphylococcus aureus [4] and Listeria monocytogenes [5]. TA systems are widespread also in the genomes of strains relevant for industrial productions, such as L. rhamnosus (strains 1019, 1473) and L. paracasei (strains 2333, 4186 and 2306), belonging to the University of Parma Culture Collection (UPCC), were isolated from dairy matrices. Bacterial strains were maintained as frozen stocks (−80 • C) in Man Rogosa Sharpe (MRS) medium (Oxoid, Milan, Italy) supplemented with glycerol 15% (w/v). The cultures were propagated three times with a 2% (v/v) inoculum in 6 mL MRS in sterile glass tubes and incubated in anaerobiosis (AnaeroGen, Oxoid, Milan, Italy) at 37 • C for 15 h.

Nucleic Acid Extraction, NGS Library Preparation and Sequencing
Total DNA extraction was performed on 1 mL of overnight cultures prepared as described above using DNeasy Blood & Tissue kit (Qiagen, Milan, Italy), following manufacturer's instructions for lysis of Gram+ bacterial cells. Plasmid DNA extraction was performed on 1 mL overnight cultures with Plasmid Mini Prep kit (Fisher Molecular Biology, Rome), following the manufacturer's instructions. An amount of 1 ng DNA from each sample was used to prepare Next Generation Sequencing libraries with the Illumina Nextera XT library prep kit. Quality control of the libraries was performed with Agilent Bioanalyzer HS DNA kit and quantitation with Qubit DNA HS kit. Approximately 4 Million 2x150 bp PE reads were generated for each sample in Illumina MiSeq Sequencer.

Genome Assembly, Annotation and Bioinformatics Analysis
Illumina sequence de novo assembly was performed using a Unicycler [21], and the quality of the assembled genomes was evaluated by the use of QUAST software [22]. The presence of rearrangements in the newly sequenced genomes was evaluated by com-Microorganisms 2021, 9,648 3 of 15 parison with reference strains isolated from dairy products, i.e., L. rhamnosus LC705 for L. rhamnosus 1019 and 1473 strains, and L. paracasei JCM8310 for L. paracasei 2333, 4186 and 2306. A circular representation of the sequence rearrangements was drawn with Circos software [23]. Annotation of the assembled genomes was performed by RAST 2.0 server (https://rast.nmpdr.org/rast.cgi, accessed on 14 February 2019).
Manual annotation of type I TA systems homologous to Lpt/RNAII [17] was performed by a tBLASTn search using the Lpt amino acid sequence as a query. The prediction of putative promoter and transcription terminator sequences was carried out with the BPROM (http://www.softberry.com, accessed on 1 March 2021) and with the "Arnold finding terminators" web services (http://rna.igmors.u-psud.fr/toolbox/arnold/, accessed on 1 March 2021), respectively.
Type II TA systems were initially annotated by RAST software. In addition, a manual BLASTn search by using dinJ/yafQ and yefM/yoeB identified in Ferrari et al. [19] or mazEF and phd/doc sequences deposited in the databases led to the identification of unannotated sequences.

Genomic Features of the Lacticaseibacillus Isolates
Assembly and annotation were performed on the genomes of the five Lacticaseibacillus isolates. Details about genome size, number of contigs and features for each strain are reported in Table 1. The genome of L. paracasei 4186 is the smallest one (2,922,588 bp), while the strain L. paracasei 2306 shows the largest genome (3,240,492 bp). The number of annotated coding sequences ranges from 3232 in L. rhamnosus 1019 to 3915 in L. paracasei 2333. In the genomes of the newly sequenced microorganisms, 26 contigs can be attributed to plasmids, according to the analyses performed by using dedicated software and data available in public repositories (Table 2). For each strain, a number of plasmids ranging between three (L. paracasei 4186) and eight (L. paracasei 2333) was identified. The largest plasmid, p1019-1 containing 27,278 bp, was identified in L. rhamnosus 1019, while the smallest one, p2333-8, 2076 bp long, was found in L. paracasei 2333. Of the 318 coding sequences located on the retrieved plasmids, only 43 could be associated to known functions (Table S2). Most of the newly detected genes are related to plasmid maintenance functions, but also genes involved in amino acid, sugar and nucleotide metabolism, or related to antimicrobial resistance have been identified. Among these traits, plasmid p1019-1 from strain L. rhamnosus 1019 possesses a gene cluster, encoding the enzymes CysE2 and CysK2, respectively, a serine O-acetyltransferase and a cysteine synthase, and for the enzyme Ctl1, a cysteine-S-conjugate beta-lyase ( Figure S1). This cluster is conserved in strains L. paracasei Microorganisms 2021, 9, 648 4 of 15 2306 and L. paracasei 4186, with amino acid sequence identities above 99%. Interestingly, the sequences coding for the enzymes CysK2 and Ctl1 are extremely conserved also in strains Lactobacillus delbrueckii sp. bulgaricus American Type Culture Collection (ATCC) 11842 and Streptococcus thermophilus CNRZ1066 (percentage of identity ≥ 97%). Conversely, a greater divergence is observed for CysE2 coding sequence, which is absent in L. delbrueckii sp. bulgaricus ATCC 11842, while S. thermophilus CNRZ1066 gene shares an identity of 59% with Lacticaseibacillus sequences ( Figure S1).  Each newly sequenced genome was compared with a strain of the same species isolated from a dairy product and available in the public database, that was chosen as a reference sequence. This comparison highlights different levels of sequence rearrangements ( Figure 1). In particular, comparing L. rhamnosus genomes with that of the strain LC705 used as a reference sequence, we can observe only a few rearrangements for strain 1019 but several structural variations for strain 1473. In the case of L. paracasei genomes, numerous structural variations are observed when aligned with the selected reference strain, L. paracasei JCM8130.  Figure 1. Circular representations of the chromosomes of Lacticaseibacillus isolates. The outer grey circle represents the 177 genome of the reference strain used for the alignment, the green circle represents the contigs aligned with the reference 178 sequence, the red circle represents structural contig rearrangements compared to the reference sequence. Genome scale is 179 in Kbp. Below each circular representation the RefSeq accession number for each type strain is reported (see Table S1

183
A pan-genome reconstruction was carried out, comparing the 5 isolates described in 184 this study with 15 reference strains available in public databases (Table S1). Reconstruc-185 tion of phylogenetic relationship according to core gene alignment shows that strains be-186 longing to the L. rhamnosus species form a distinct clade, comprising L. rhamnosus 1019 187 and 1473 strains ( Figure 2a). Strains belonging to the species L. casei and L. paracasei form 188 Figure 1. Circular representations of the chromosomes of Lacticaseibacillus isolates. The outer grey circle represents the genome of the reference strain used for the alignment, the green circle represents the contigs aligned with the reference sequence, the red circle represents structural contig rearrangements compared to the reference sequence. Genome scale is in Kbp. Below each circular representation the RefSeq accession number for each type strain is reported (see Table S1 for details). A pan-genome reconstruction was carried out, comparing the 5 isolates described in this study with 15 reference strains available in public databases (Table S1). Reconstruction of phylogenetic relationship according to core gene alignment shows that strains belonging to the L. rhamnosus species form a distinct clade, comprising L. rhamnosus 1019 and 1473 strains ( Figure 2a). Strains belonging to the species L. casei and L. paracasei form a single group, except for the strains L. casei ATCC 393 and L. casei LC5, that form a separate, small clade. The strains L. paracasei 2333 and L. paracasei 4186 appear to be closely related, while strain L. paracasei 2306 has a greater distance from these two strains ( Figure 2a). The latter strain was previously reported as belonging to the species L. casei [28], but genome-scale comparison allowed correct attribution of strain 2306 to the species L. paracasei. Pangenome analysis reported in Figure 2b was conducted on a set of 10,020 genes for all the 20 Lacticaseibacillus strains. Of these, 822 are shared by at least 19 strains, representing the core genome. Of the remaining genes, 4421 are shared by 3 to 18 strains, while 4777 genes were unique or shared only by two strains (Figure 2b). comparison was made among them, to highlight common and unique genes ( Figure 2c). 199 A total of 862 genes shared among all the analyzed strains were identified, mostly coding 200 for metabolic traits such as carbohydrates utilization, amino acid and protein metabolism 201 (data not shown). This result is in line with data provided in the broader Lacticaseibacillus 202 comparison ( Figure 2b). The highest number of shared distinctive genes (1384)

Type I TA Systems Distribution and Structure
Type I TA systems were predicted in the sequenced genomes by a tBlastn search using the aminoacid sequences of Lpt toxin and its homologs identified in Folli et al. [17] as queries.
Overall, 14 sequences encoding peptides homologous to Lpt were predicted spread in all the five analyzed strains, and most of the sequences are located in contigs putatively identified as plasmids, with plasmid p1473-1 encoding for two different peptides. In particular, four distinct putative toxins homologous to Lpt were found in both L. rhamnosus strains, three sequences in L. paracasei 2333, two sequences in L. paracasei 2306 and only one in L. paracasei 4186 (Table 3). Table 3. Sequence encoding for Lpt toxins and its homologs in the Lacticaseibacillus strains. The corresponding plasmid contig is reported, when applicable. n.a., not applicable.
The multiple alignment of the corresponding amino acid sequences highlighted three distinct sequences, named Lptlike, Lptlike1 and Lptlike2, in addition to Lpt peptide (Table  3 and Figure 3). A single copy of Lpt peptide was identified in L. paracasei 2333 and 4186 and in L. rhamnosus 1019, while two distinct sequences were found in the strain 1473 of L. rhamnosus. In addition, four identical Lptlike, with 68% of sequence identity with Lpt, were identified in L. rhamnosus 1473 and 1019 (two sequences) and in L. paracasei 2306, while four identical Lptlike1, with 65% of sequence identity with Lpt, were found in L. rhamnosus 1019 and 1473, in L. paracasei 2306 and in L. paracasei 2333. Both these putative toxin peptides have been previously identified in Lacticaseibacillus plasmid sequences deposited in the DNA database [18]. Interestingly, an additional 30 amino acid peptides with a sequence identity of 43% with Lpt were recognized in L. paracasei 2333 (Lptlike2). The comparison of the amino acid sequences of the different peptides highlights a conserved hydrophobic stretch of nine amino acids, which could locate inside the cellular membrane based on the structural prediction of the Lpt toxin [27].

Type I TA Systems Distribution and Structure 215
Type I TA systems were predicted in the sequenced genomes by a tBlastn search us-216 ing the aminoacid sequences of Lpt toxin and its homologs identified in Folli et al. [17] as 217 queries. 218 Overall, 14 sequences encoding peptides homologous to Lpt were predicted spread 219 in all the five analyzed strains, and most of the sequences are located in contigs putatively 220 identified as plasmids, with plasmid p1473-1 encoding for two different peptides. In par-221 ticular, four distinct putative toxins homologous to Lpt were found in both L. rhamnosus 222 strains, three sequences in L. paracasei 2333, two sequences in L. paracasei 2306 and only 223 one in L. paracasei 4186 (Table 3). 224

227
The multiple alignment of the corresponding amino acid sequences highlighted three 228 distinct sequences, named Lptlike, Lptlike1 and Lptlike2, in addition to Lpt peptide (Table  229 3 and Figure 3). To evaluate the functionality of these putative toxins, nucleotide sequences located upstream and downstream of their coding sequences were analyzed to identify promoter and terminator regions capable of controlling toxin-encoding and antitoxin RNA transcription. Figure S2 shows that these DNA regions are highly conserved among all the strains. Nevertheless, in the strain L. rhamnosus 1019 it was not possible to complete this analysis for a copy of Lptlike and for Lptlike1 because the contigs containing the toxin encoding regions do not encompass the entire TA loci. In particular, Lptlike1 contig interrupted  Figure S2), started 1 nt before the -10 sequence of the toxin promoter. In all the other putative TA systems, promoters have been correctly identified, with the exception of Lpt-Lr1473_1 of L. rhamnosus 1473 and Lptlike2 of L. paracasei 2333. In particular, in L. rhamnosus 1473 it was not possible to predict functional promoters capable of controlling Lpt RNA and antitoxin RNA synthesis, suggesting that the TA system is inactive. In L. paracasei 2333, a putative functional promoter is located upstream of the Lptlike2 coding sequence while no promoter sequences were identified on the complementary strand able to control antitoxin synthesis, suggesting that the toxicity of the peptide has been lost or that it could be controlled by a different mechanism of inhibition. An additional feature typical of type I TA systems are the direct repeat sequences located on toxin-encoding and antitoxin RNAs. These complementary sequences are involved in the RNA-RNA interaction inducing inhibition of toxin activity. In all the systems potentially active, these regions have been identified on both RNAs and are indicated in Figure S2. A graphical summary of the identified Lpt/RNAII systems is reported in Figure 4.
To evaluate the functionality of these putative toxins, nucleotide sequences located 245 upstream and downstream of their coding sequences were analyzed to identify promoter 246 and terminator regions capable of controlling toxin-encoding and antitoxin RNA tran-247 scription. Figure S2 shows that these DNA regions are highly conserved among all the 248 strains. Nevertheless, in the strain L. rhamnosus 1019 it was not possible to complete this 249 analysis for a copy of Lptlike and for Lptlike1 because the contigs containing the toxin 250 encoding regions do not encompass the entire TA loci. In particular, Lptlike1 contig inter-251 rupted 48 nt after the toxin stop codon, while the contig encompassing Lptlike (named 252 Lptlike Lr1019_1 in Figure S2), started 1 nt before the -10 sequence of the toxin promoter. 253 In all the other putative TA systems, promoters have been correctly identified, with the 254 exception of Lpt-Lr1473_1 of L. rhamnosus 1473 and Lptlike2 of L. paracasei 2333. In partic-255 ular, in L.rhamnosus 1473 it was not possible to predict functional promoters capable of 256 controlling Lpt RNA and antitoxin RNA synthesis, suggesting that the TA system is inac-257 tive. In L. paracasei 2333, a putative functional promoter is located upstream of the Lptlike2 258 coding sequence while no promoter sequences were identified on the complementary 259 strand able to control antitoxin synthesis, suggesting that the toxicity of the peptide has 260 been lost or that it could be controlled by a different mechanism of inhibition. An addi-261 tional feature typical of type I TA systems are the direct repeat sequences located on toxin-262 encoding and antitoxin RNAs. These complementary sequences are involved in the RNA-263 RNA interaction inducing inhibition of toxin activity. In all the systems potentially active, 264 these regions have been identified on both RNAs and are indicated in Figure S2. A graph-265 ical summary of the identified Lpt/RNAII systems is reported in Figure 4.

Type II TA Systems Distribution and Characteristics 279
Automatic annotation of the newly sequenced genomes by using the RAST server 280 allowed us to find toxin and antitoxin sequences attributed to type II TA systems. A 281 BLAST manual search by using the sequences described in the Materials and Methods 282  (Table S1). By using the tBlastn program, one Lpt and two Lptlike2 were identified, while no Lptlike and Lptlike1 were found. The identified Lpt sequence is located on the plasmid pLBPC-2 of L. paracasei JCM 8130 and it was previously described in Folli et al. [17]. Lptlike2 sequences were found in L. paracasei JCM 8130 and in L. paracasei ATCC334 and the alignment of the nucleotide sequences encompassing their coding regions is shown in Figure S3 in comparison with Lptlike2 of L.paracasei 2333. Interestingly, it was not possible to predict convergently transcripted RNA molecules able to act as antitoxins in any of these sequences.

Type II TA Systems Distribution and Characteristics
Automatic annotation of the newly sequenced genomes by using the RAST server allowed us to find toxin and antitoxin sequences attributed to type II TA systems. A BLAST manual search by using the sequences described in the Materials and Methods section led to the identification of additional sequences related to type II TA systems (Figures 5 and 6). It is interesting to note that each isolate encodes several toxin and antitoxin sequences but, not all are organized in toxin/antitoxin pairs, therefore some sequences are here categorized as orphans ( Figure 5). The most represented TA systems are DinJ/YafQ and MazEF, classified as ribosome-dependent and ribosome-independent RNA interferase, which are spread in all the analyzed strains in single or multiple copies. In addition to these systems, complete or incomplete YefM/YoeB and Phd/Doc systems were identified in different strains (Figures 5 and 6).

310
From a functional point of view, it is interesting to note that the truncated form of 311 YafQ (YafQ_rh1473 and YafQ_rh1019) is inactive [19] and that the orphan YafQ proteins 312 (YafQ_pa2333_1, YafQ_pa2306_3, YafQ_pa4186_1 and YafQ_rh1473_2) are characterized 313 by two mutations (D61/A and D67/N or S) (Figure 6a) involving two out of the four cata-314 lytic residues identified in E. coli, suggesting their inactivation. Data reported in the liter-315 ature on E. coli YafQ have indeed shown that the mutant YafQ D61/A maintains 50% of 316 the enzymatic activity, while the replacement of aspartate 67 with alanine leads to the loss 317 of toxicity [29]. 318 About the DinJ/YafQ TA system, eight complete homologs were identified spread in all the strains, while orphan toxins were found in all the analyzed genomes except L. rhamnosus 1019 (Figures 5 and 6). The alignment of all the identified YafQ sequences is reported in Figure 6a. The four orphan toxins (YafQ_pa2333_1, YafQ_pa2306_3, YafQ_pa4186_1 and YafQ_rh1473_2) share high sequence identity ranging from 72 to 100% but are quite dissimilar from the other identified YafQ proteins (sequence identity from 14 to 27%) (Figure S4c). Among YafQ proteins belonging to complete systems, four toxins (YafQ_pa4186, YafQ_pa2333, YafQ_pa2306_1, YafQ_rh1473_1) share high sequence identity (from 76 to 98%) ( Figure S4a), while two YafQ homologs found in L. paracasei 2306 (YafQ_pa2306 and YafQ_pa2306_2), show sequence identity ranging from 11% to 38% with the other YafQ toxins. In addition, the truncated YafQ form, lacking the amino-terminal end previously characterized by Ferrari et al. [19], was found in both the L. rhamnosus genomes (YafQ_rh1473 and YafQ_rh1019) ( Figure S4b).
From a functional point of view, it is interesting to note that the truncated form of YafQ (YafQ_rh1473 and YafQ_rh1019) is inactive [19] and that the orphan YafQ proteins (YafQ_pa2333_1, YafQ_pa2306_3, YafQ_pa4186_1 and YafQ_rh1473_2) are characterized by two mutations (D61/A and D67/N or S) (Figure 6a) involving two out of the four catalytic residues identified in E. coli, suggesting their inactivation. Data reported in the literature on E. coli YafQ have indeed shown that the mutant YafQ D61/A maintains 50% of the enzymatic activity, while the replacement of aspartate 67 with alanine leads to the loss of toxicity [29].
Comparing E.coli YafQ catalytic residues (H50, H63, H87, D61, D67 and F91) with those found in the YafQ proteins belonging to complete systems of Lacticaseibacillus strains, it is possible to note that the three histidines are highly conserved with the only exception of YafQ_pa2306_1, in which H63 has been substituted with a tyrosine. D61 is conserved or replaced by an E residue except in YafQ_rh1473_1 in which is substituted with a K, while D67 is replaced by a G in YafQ_pa2333 (Figure 6a). This latter mutation induces the loss of toxic activity as reported in Ferrari et al. [19].

324
Comparing E.coli YafQ catalytic residues (H50, H63, H87, D61, D67 and F91) with 325 those found in the YafQ proteins belonging to complete systems of Lacticaseibacillus 326 strains, it is possible to note that the three histidines are highly conserved with the only 327 exception of YafQ_pa2306_1, in which H63 has been substituted with a tyrosine. D61 is 328 The identified DinJ antitoxins share similar trends in terms of amino acid composition. DinJ_pa4186, DinJ_pa2333, DinJ_pa2306_1, DinJ_rh1473_1 share sequence identity in the range of 74%-100%, while DinJ_pa2306_2 and DinJ_pa2306 appear more dissimilar from the other DinJ homologous sequences (sequence identity 4-24%) ( Figure S5a).
In the case of MazEF, a single copy was identified in each strain. The identified MazF toxins share a high sequence identity, ranging from 98% to 100%, and show the conservation of both the catalytic residues involved in the RNA cleavage (Figure 6b) [30]. MazE antitoxins present more variability among the strains, in a species-dependent way, and show 86% of sequence identity between L. paracasei and L. rhamnosus ( Figure S5b).
In the case of YefM/YoeB, the antitoxin YefM was identified in all strains except for L. paracasei 2333, while only two YoeB toxins were found in the two L. rhamnosus strains ( Figure 5). These YoeB proteins show a sequence identity of 97% and contain both the catalytic residues identified in E. coli (Figure 6c) [31]. The YefM antitoxins have sequence identities ranging from 35% of the truncated YefM_pa2306 sequence, lacking the aminoterminal end, up to 98% ( Figure S5c).
Finally, two complete Phd/Doc systems and four orphan antitoxins Phd were identified. In the Phd/Doc pairs of L. paracasei strains 2333 and 4186, toxins have identical amino acid sequences and show the typical catalytic motif HXFX(D/N)(A/G)NKR (Figure 6d), while their cognate antitoxins (Phd_pa2333 and Phd_pa4186) share a sequence identity of 98% ( Figure S5d). The orphan antitoxin Phd_pa2333_1 is shown as incomplete because it was not possible to identify its N-terminal aminoacid sequence due to its location at the beginning of the contig. Among the other orphan Phd sequences, Phd_rh1473 and Phd_rh1019_1 share an identity of 95%, while Phd_rh1019 is very dissimilar, showing a sequence identity of 8% and 9% with Phd_rh1473 and Phd_rh1019_1, respectively ( Figure S5d).
A preliminary search aimed to identify type II TA systems in the strains used as references in the genome analysis (Table S1) led to the identification of several complete systems. MazEF and DinJ/YafQ were the most spread among the analyzed strains with one copy of MazEF for each strains and five copies of DinJ/YafQ found in L. casei ATCC393 and in the strains BD-II, BL23, JCM8130 and W56 of L.paracasei. In addition, three complete YefM/YoeB were identified in L. rhamnosus (ATCC 8530, Lc 705 and LOCK908 strains) and one complete Phd/Doc system was found in the strains JCM8130 of L. paracasei.

Discussion
Comparative analysis of microbial genomes represents an unprecedented tool for phylogenetic analysis and allows the provision of insights on strain-level diversity and niche adaptation [7]. The genus Lacticaseibacillus was recently proposed to include the species formerly known as L. casei, L. paracasei and L. rhamnosus, which form a clade of closely phylogenetically related microorganisms with relevant application in the food industry [9,11]. Bacteria from this genus are isolated from a variety of habitats, among which are dairy products, where they can contribute to the development of cheese aromatic properties [32]. Specific traits related to the adaptation to this environment have been identified in some strains [13].
Isolates of the Lacticaseibacillus group are often reported as nomadic lactobacilli, as a reference to their frequency of isolation from a wide variety of niches [7]. In the perspective of a nomadic lifestyle, plasmids can carry several genes important for the growth and adaptation of microorganisms to varying environmental conditions, as well as technological traits useful for the selection of strains for industrial applications [33]. Indeed, a comparative analysis of plasmids from the broad Lactobacillus genus corroborated the association between plasmids and functional traits involved in niche adaptation, highlighting the enrichment of genes associated with transposition, and defense systems such as restriction-modification enzymes and toxin-antitoxin systems [34].
Plasmid analysis performed in this study has revealed a small cluster of genes involved in the metabolism of the sulfur-containing amino acids cysteine and methionine, encoded in plasmid p1019-1 found in L. rhamnosus 1019. Furthermore, the same cluster is present also in the genomes of the strains L. paracasei 2306 and L. paracasei 2333. Utilization of sulfurcontaining amino acids is linked to the production of key flavor compounds, representing a trait of interest for the selection of strains for the food industry [35]. The cluster identified in this study was recently described in a paper, which investigated the transcriptional regulation of cysteine and methionine metabolism in L. paracasei FAM18149, reporting the overexpression of this operon in a chemically defined media omitting cysteine or methionine [36]. A previous work reported the presence of genes involved in methionine and cysteine metabolism in the plasmid pLC1 from L. rhamnosus Lc705, suggesting the occurrence of plasmid-mediated transfer from species like S. thermophilus, L. helveticus and L. delbrueckii sp. bulgaricus [37]. It is noteworthy that all the Lacticaseibacillus strains included in the present work were isolated from dairy products, which is a common source of isolation for bacteria from this genus, highlighting their potential implication in the development of sensory attributes of dairy products [38,39].
A typical trait associated to the adaptation to hostile environments is the presence of toxin-antitoxin systems in bacterial genomes. In this work, we investigated the distribution of the main type II TA systems and of Lpt/RNAII homologs, a type I TA system previously identified in plasmid DNA of Lacticaseibacillus strains. The distribution of type I TA systems in the analyzed strains has shown their frequent location on plasmid DNA, as well as in multiple copies in agreement with the proposed role in the post-segregation killing phenomena [3]. Three toxin peptides, named Lptlike, Lptlike1 and Ltlike2, in addition to Lpt have been predicted but in vivo studies should be carried out to verify their toxicity; also, expression studies conducted on Lacticaseibacillus strains could provide more insight into the activity of these TA systems. Interestingly, the presence of type I TA systems loci on plasmid carrying genes involved in the expression of favorable properties, such as the production of volatile compounds, can promote the maintenance of the plasmid and the consequent preservation of the positive character. At the same time, most of the protein-coding sequences of the plasmids carrying type I TA systems are of unknown function; thus, future research could aim at their functional characterization.
The prevalence of type II TA systems is in agreement with the distribution of these systems in the Lactobacillus sequences deposited in the public databases, as reported in Ferrari et al. [19]. Different orphan YafQ toxins and orphan Phd/YefM antitoxins have been also identified. Interestingly, all the YafQ toxins not associated with the cognate antitoxins DinJ show the replacement of catalytic residues essential for their activity. On the contrary, the orphan antitoxins often share a high sequence identity with similar antitoxins associated with the cognate toxins suggesting that they could play a role in cross-regulation mechanisms [40]. The wide distribution observed for type II TA systems in these bacterial strains isolated from dairy products suggests an important role in the adaptation of these strains to environmental stresses, such as scarcity of nutrients typical of cheese. Recent studies have actually shown that the expression of DinJ/YafQ TA systems can be related to various environmental stresses, including those encountered in food production [16,41].
The strains sequenced in this work were previously characterized for their potential application as adjunct starters in dairy products [42], or for use in the production of fermented fruit juices [8]. It is noteworthy that strain L. paracasei 2306, that showed the highest number of identified type II TA systems, shows a phenotype of reduced growth in vegetable substrates, but retains metabolic activity associated with increased volatile compounds and modification of the polyphenolic profile [8]. Despite the fact that no association can be made between the presence and activation of TA systems, transcriptomic analysis could provide information on the activity of TA systems in different environmental conditions useful for finding interesting correlations with specific traits of interest for various industrial applications.
Although the biological functions of many TA modules and their role in the adaptation of microorganisms to various substrates have not been completely clarified, genome analysis can provide useful insights. By comparing the genomes of Lacticaseibacillus isolates from dairy matrices, it is possible to identify traits that are involved in the adaptation to the environment of cheese or in the development of the sensory profile of the final product. Moreover, the knowledge of TA systems in industrially relevant microorganisms is still limited, opening the possibility to shed light on the mechanisms that lead to the assembly of food microbiota.
Supplementary Materials: The following are available online at https://www.mdpi.com/2076-2 607/9/3/648/s1, Figure S1: Cysteine and methionine genetic cluster identified in p1019-1 from L.rhamnosus 1019, Figure S2: Multiple alignment of TA systems homologous to Lpt/RNAII identified in Lacticaseibacillus strains, Figure S3: Multiple alignment of Lptlike2/RNAII TA systems identified in Lacticaseibacillus strains used as references compared with Lptlike2_pa2333. Figure S4: Multiple alignments of YafQ proteins from Lacticaseibacillus strains compared to E. coli YafQ. Figure S5: Multiple alignments of antitoxin sequences belonging to type II TA systems; Table S1: Accession numbers of the 15 Lacticaseibacillus type strains publicly available genomes used in this work, Table S2: Functional annotation of plasmid-encoded features.