Thermophilin 13: In Silico Analysis Provides New Insight in Genes Involved in Bacteriocin Production

Bacteriocins are a large family of ribosomally synthesised proteinaceous toxins that are produced by bacteria and archaea and have antimicrobial activity against species closely related to the producer strain. Antimicrobial proteinaceous compounds are associated with a wide range of applications, including pathogen inhibition in food and medical use. Among the lactic acid bacteria (LAB) commonly used in fresh and fermented food preservation, Streptococcus thermophilus is well known for its importance as a starter culture for yoghurt and cheese. Previous studies described the bacteriocin thermophilin 13 exclusively in S. thermophilus SFi13 and the genes encoding its production as an operon consisting of two genes (thmA and thmB). However, the majority of bacteriocins possess a complex production system, which involves several genes encoding dedicated proteins with relatively specific functions. Until now, little attention has been paid to the genes involved in the synthesis, regulation and expression of thermophilin 13. The aim of the present study was to use in silico gene mining to investigate the presence of a regulation system involved in thermophilin 13 production. The results revealed a dedicated putative bacteriocin gene cluster (PBGC), which shows high similarity to class IIb bacteriocin genes. This newly revealed PBGC, which was also found in various strains of Streptococcus thermophilus, provides new insight into the mechanisms implicated in the production of thermophilin 13.


Introduction
Streptococcus thermophilus is a nonpathogenic lactic acid bacterium commonly isolated from bovine mammary tissue and raw milk. It produces lactic acid, exopolysaccharides (EPS) and several organoleptic compounds from the fermentation of lactose and galactose, and is a well-known starter culture used in the production of yoghurt and cheese [1]. The species has GRAS (Generally Recognized As Safe) status from the FDA (Food and Drug Administration) and QPS (Qualified Presumption of Safety) status from the EFSA (European Food Safety Authority). Several strains of S. thermophilus produce bacteriocins, which are small, ribosomally synthesized peptides with narrow- or broad-spectrum antimicrobial activity [2]. Examples include thermophilin A (strain ST134) [3], thermophilin T (strain ACA-DC 0040) [4], thermophilin 110 (strain 580) [5] and thermophilin 1277 (strain SBT1277) [6]. Two other unnamed bacteriocins were reported in S. thermophilus strain 81 and S. thermophilus strain 580, but little is known about their peptide sequences and the genes encoding them [7,8].
Expression of bacteriocin genes is usually subject to external induction factors (IFs) regulation. The gene encoding the pre-peptide is normally located in the same operon

Genome Sequences
The thermophilin 13 operon sequence (U93029.1), 960 bp in length and listed in the National Center for Biotechnology Information (NCBI, Bethesda, MD, USA; https://www.ncbi.nlm.nih.gov/; accessed on 2 October 2022) nucleotide database, was used to conduct a similarity search using the NCBI Basic Local Alignment Search Tool (BLAST) [14]. Similarities to the thermophilin 13 operon were determined using NCBI Sequence Viewer [15]. The complete genome sequences of all bacterial strains containing a 960-bp region with 100% identity to the thermophilin 13 operon (U93029.1) sequence were downloaded from the NCBI database.
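The selection criterion described above (a full-length, 960-bp match at 100% identity) can be illustrated with a short filter over BLAST results. The sketch below is only an illustration: it assumes the hits were exported in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which the study does not state, and the subject names and rows in `example` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    subject: str      # subject sequence identifier (column 2)
    identity: float   # percent identity (column 3)
    length: int       # alignment length in bp (column 4)


def full_length_identical_hits(lines: Iterable[str], query_len: int = 960) -> List[Hit]:
    """Keep hits that cover the whole query at 100% identity."""
    hits = []
    for line in lines:
        # Skip blank lines and comment lines (present in outfmt 7 style output).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        hit = Hit(fields[1], float(fields[2]), int(fields[3]))
        if hit.identity == 100.0 and hit.length >= query_len:
            hits.append(hit)
    return hits


# Hypothetical rows; only the query accession comes from the text.
example = [
    "U93029.1\tsubject_A\t100.000\t960\t0\t0\t1\t960\t5001\t5960\t0.0\t1773",
    "U93029.1\tsubject_B\t98.542\t960\t14\t0\t1\t960\t120\t1079\t0.0\t1696",
    "U93029.1\tsubject_C\t100.000\t412\t0\t0\t1\t412\t77\t488\t0.0\t761",
]
print([h.subject for h in full_length_identical_hits(example)])
```

Only the first hypothetical subject passes, because the second is not identical and the third covers only part of the query.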
Amino acid sequences with predicted ORFs (open reading frames) were compared against the non-redundant protein database using Blastp version 2.9.0+ (protein-protein BLAST) [18]. Using the Jukes-Cantor model, a Nearest-Neighbor-Interchange (NNI) tree with 1000 Bootstraps was constructed, including all bacteriocin gene sequences provided by antiSMASH and BAGEL4 hosted on the NCBI website. Analyses were conducted using the MEGA 11 software (Version 11.0.11) platform [19].
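For readers unfamiliar with the Jukes-Cantor model, the distance between two aligned nucleotide sequences is d = -(3/4) ln(1 - (4/3) p), where p is the proportion of differing sites. The following minimal sketch computes this correction for a pair of pre-aligned sequences; it is not the MEGA 11 implementation, and the function names and the choice to skip gapped columns are assumptions made for illustration.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored (an assumption
    of this sketch; other gap treatments exist).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a: str, b: str) -> float:
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4/3 * p)."""
    p = p_distance(a, b)
    if p >= 0.75:
        # The correction is undefined once the sequences look saturated.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy sequences differing at one site in ten (p = 0.1).
print(round(jukes_cantor("ACGTACGTAC", "ACGTACGTAT"), 4))
```

The corrected distance is always at least as large as p, because the model accounts for multiple substitutions at the same site.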

Results
Currently, no complete genome sequence of S. thermophilus SFi13 is available on the NCBI database. In a study by Comelli et al. (2002) and in the deposited patent USOO7491386B2, the authors described and evaluated bacterial strains with potential properties as oral probiotics, useful for the prevention of dental caries. According to the Nestlé Culture Collection (NCC), they also affirmed that strain S. thermophilus SFi13 was reclassified as S. thermophilus NCC 2008 [26,27].
lated to the accession codes for genomes valid for the NCBI database; the same strains are reported as Streptococcus thermophilus CIRM-BIA1048 and Streptococcus thermophilus CIRM-BIA1049 in citations [32,33]. *** GenBank codes PEBN01000000.1 and PEBN01000052.1 both refer to the Streptococcus macedonicus 19AS strain in the NCBI database.
Data obtained using BAGEL4 and antiSMASH version 5.0 confirmed the distribution of thermophilin 13 BGC (biosynthetic gene cluster) in all 11 strains of S. thermophilus. All strains showed the same area of interest (AOI), with some variation in nucleotide sequences. Strains Streptococcus thermophilus B59671 (CP022547.1) and Streptococcus macedonicus 19AS (PEBN00000000.1) were the most diverse based on AOI. Figure 1 shows a cladogram tree that is derived from the multiple sequence alignments of AOIs identified from the in silico study. Known bacteriocin loci were detected, e.g., the lantibiotic salivaricin 9 operon in the genome of S. thermophilus NCTC 12958 and Streptococcus thermophilus ATCC 19258 and thermophilin 110 operon in S. thermophilus B59671 [5]. The entire locus of Salivaricin 9 was fully characterized from S. salivarius strain JIM8780, and it was shown to consist of eight genes, having the following putative functions: sivK, sensor kinase; sivR, response regulator; sivA, Sal9 precursor peptide; sivM, lantibiotic modification enzyme; sivT, ABC transporter involved in the export of Sal9 and concomitant cleavage of its leader peptide; and sivFEG, encoding lantibiotic self-immunity [39]. The broad-spectrum bacteriocin thermophilin 110 is encoded within the blp gene cluster. Furthermore, thermophilin 110 was reported to inhibit the growth of Listeria monocytogenes, Streptococcus mutans, Streptococcus pyogenes and Propionibacterium acnes.
Manual curation and annotation were performed to compare the differences between the ORFs predicted by the bacteriocin mining tools. Comparisons of AOIs indicated that the gene loci in the thermophilin 13 operon are organised into eight genes/ORFs encoding proteins related to bacteriocin production, plus the two thermophilin 13 structural genes. These were consistent for all strains and include a response regulator (RR), sensor histidine protein kinase (HPK), quorum-sensing system pheromone BlpC, ABC-transporter, bacteriocin accessory protein, thiol-disulfide oxidoreductases, CAAX protease and genes thmA and thmB. Despite the similarity in translation, three operon patterns were observed using the Operon-mapper web server. These variations, including all strains analysed in the present study, are schematically visualised in Figure 2.
A similar regulation and secretion system was observed for thermophilin 13 in groups 1, 2 and 3 (Figure 2). However, an additional nucleotide sequence (mobile element zone) was detected in group 1 (Figure 2). No variations in gene transcription upstream and downstream of this area were observed. In this regard, the insertion element, which is present in all sequenced BGCs of cluster 1, requires further investigation to assess possible interference with thermophilin 13 production due to the presence of transposases. A third ORF (ORFC), encoded by the U93029.1 operon, was reported by Marciset et al. (1997) [9] and was found in all BGC groups shown in Figure 2. Structure models of the poration complex formed by thermophilin 13 describe ThmA as enhancing the ThmB peptide, with maximal antimicrobial activity at equimolar concentrations. However, the peptide ThmA alone showed antibacterial activity against S. thermophilus, Clostridium botulinum, Listeria monocytogenes and Bacillus cereus.
In this regard, the presence of GxxxG motifs, or of the GxxxG-like motifs AxxxA and SxxxS in place of the GxxxG motif, together with a high helical content, has been related to the ability of two-peptide bacteriocins to form membrane-penetrating helix-helix structures; the resulting dimer complex is thought to account for the increased antimicrobial action [40]. This dual peptide interaction was described in several class IIb bacteriocins, including thermophilin 13, as reported by Oppegård et al. (2008) and Nissen-Meyer et al. (2010) [41,42]. However, this aspect requires further investigation due to multiple GxxxG motifs, located in positions 21 GxxxG 28 for ThmB peptides, respectively, as shown in Figure 3.
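Locating GxxxG and GxxxG-like (AxxxA, SxxxS) motifs in a peptide is a simple pattern search. The sketch below shows one way to list every occurrence, including overlapping ones, with 1-based positions. The function name and the toy sequence are hypothetical; the toy sequence is not ThmA or ThmB, and the pattern is only the generic "small residue, three residues, same small residue" definition given above.

```python
import re
from typing import List, Tuple

# A lookahead is used so that overlapping motifs are all reported.
# Group 2 captures the small residue (G, A or S); \2 requires the same
# residue again four positions later.
MOTIF = re.compile(r"(?=(([GAS])[A-Z]{3}\2))")


def find_small_xxx_small(seq: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, motif) for every GxxxG, AxxxA or SxxxS."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


toy = "MKGLIAGSAVGAAKS"  # hypothetical peptide, for illustration only
for position, motif in find_small_xxx_small(toy):
    print(position, motif)
```

Applied to real ThmA and ThmB sequences, such a listing would give the motif coordinates to compare against Figure 3.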

Discussion
Most bacteriocin operons include genes involved in the post-translational modification and/or secretion of these peptides [12]. On that basis, the present study examined the thermophilin 13 operon (U93029.1) described by Marciset et al. (1997) [9], which appears to lack the regulatory genes involved in bacteriocin synthesis.
In silico analysis is an excellent predictor of "bacteriocin-associated driver genes" within genomes, adding information on the mechanisms related to the production of a specific bacteriocin. Starting from genomic or amino acid sequences, the main advantages of these methods are a significant reduction in time compared with traditional screening methods and, consequently, in the costs associated with laboratory materials. Antimicrobial genome-mining tools have been closing the gap between a large number of predicted biosynthetic gene clusters (BGCs) encoding bacteriocins, including ribosomally synthesized, post-translationally modified peptides (RiPPs), as well as polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS) [43]. However, the presence of bacteriocin genes in a strain is not always directly related to an effective translation into biological antimicrobial activity [44].
Furthermore, we confirm the presence of ORFC in all strains; however, no database entry associates this peptide with bacteriocin production.
Similarly to other bacteriocin gene clusters, response regulators (RR) of the LytR/AlgR family were predicted [45]. These regulators exert their function by binding to promoters and initiating transcription after phosphorylation of Asp residues, promoting bacteriocin production and autoactivating their respective operons [46]. LytR regulatory systems represent the most abundant type of transcriptional regulator in prokaryotes, acting as either activators or repressors of single genes or operons, including those involved in virulence, metabolism, quorum sensing, motility and bacteriocin production [45,47]. Furthermore, histidine kinases and response regulators mediate the actual response regarding bacteriocin production through a two-component signal-transducing system [48-51].
The peptide MTKHRTSLTAFTELSPSELHRISGGDWWDWMKYFPSKQAIDSNKHKLG is present in all groups. When the potential role of this peptide in the bacteriocin biosynthetic gene cluster was examined, the prediction showed affinity to the quorum-sensing pheromone BlpC (PF03047 HMM), also annotated as a ComC/BlpC family leader-containing pheromone/bacteriocin. Interestingly, these peptides differ in sequence but are reported in several quorum-sensing-regulated bacteriocin systems in S. thermophilus, where they stimulate the production of bacteriocin-like peptides (BLP) by acting as a signal peptide that activates bacteriocin synthesis through a three-component regulatory system consisting of a peptide pheromone, a membrane-associated histidine protein kinase, and response regulators [52,53]. Plantaricins A, E/F and J/K of L. plantarum C11, sakacin A of L. sakei Lb706EF and sakacin P of L. sakei LTH673102 are the best examples of class II bacteriocins regulated by a three-component regulatory system, in which an inducing peptide (an indicator of cell density) is sensed by the corresponding HPK, resulting in the activation of the RR [54].
A dedicated bacteriocin ABC-transporter, including a peptidase C39 motif and predicted to be a bacteriocin/lantibiotic transporter based on conserved domains (COG227400), was observed, together with a bacteriocin accessory protein generally associated with transport [55,56]. ABC-transporter proteins involved in class II bacteriocin maturation and secretion carry a proteolytic peptidase C39 domain at their N-termini. The peptidase C39 domain cleaves a double-glycine (GG) motif-containing signal peptide from substrates before secretion, acting in association with an ATP-binding cassette component located in the same protein [57,58]. Differences in ABC transporter sequences in Group 2 were detected in the C39 motif. Interestingly, an independent protein containing C39 peptidase domains, in terms of amino acid sequences, is present in subgroup 2.2. This protein conformation is termed a C39 peptidase-like domain (CLD); the role of CLDs is not yet completely understood, and they appear to be degenerate, with no proteolytic activity [59,60]. In most cases, the family C39 endopeptidase is the least conserved component of the bifunctional transporter protein, with a dedicated catalytic function in the secretion of the antimicrobial peptide of interest [61].
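The double-glycine processing step can be illustrated on the BlpC-like peptide quoted in this Discussion. The sketch below splits a precursor immediately after its first GG pair. This is a deliberately naive heuristic: real double-glycine leaders are recognised through a longer consensus, the helper name is hypothetical, and the split shown is a prediction under that assumption, not an experimentally confirmed cleavage site.

```python
from typing import Optional, Tuple


def split_at_double_glycine(precursor: str) -> Optional[Tuple[str, str]]:
    """Split a precursor after its first 'GG' pair.

    Returns (leader, mature) or None if no GG pair is present.
    Assumption of this sketch: the first GG marks the cleavage site.
    """
    idx = precursor.upper().find("GG")
    if idx == -1:
        return None
    cut = idx + 2  # cleavage occurs after the second glycine
    return precursor[:cut], precursor[cut:]


# BlpC-like peptide sequence reported in the text.
blpc_like = "MTKHRTSLTAFTELSPSELHRISGGDWWDWMKYFPSKQAIDSNKHKLG"
leader, mature = split_at_double_glycine(blpc_like)
print("leader:", leader)
print("mature:", mature)
```

Under this heuristic the predicted leader ends in ...RISGG and the predicted mature pheromone begins with DWWDW.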
Thiol-disulfide oxidoreductases (TDORs) in Gram-positive bacteria play an essential role in forming disulfide bonds, allowing correct folding of class II bacteriocins through the R-S-S-R bond of the CXXC catalytic site, which results in disulfide-bonded cysteines [62,63]. In this regard, only thmA has two cysteine residues, at positions 6 and 53 of the amino acid backbone. Methionine and single cysteine residues are also vulnerable to oxidation, but disulfide bridge formation with this conformation has never been reported in bacteriocins. However, in this protein, the LPxTG motif, recognised by membrane-anchored transpeptidases that cleave proteins between the threonine (Thr) and the glycine (Gly), is conserved.
Interestingly, ThmA and ThmB peptides lack Thr residues; this is in accordance with Marciset et al. (1997), who observed that the oxidation of methionines to methionine sulfoxides at positions Met 10, Met 54 and/or Met 57 of ThmA seems to be the only possible explanation for the proposed poration complexes (AB)n (i.e., thermophilin 13) [9].
CAAX metalloproteases (bacteriocin-processing enzymes) detected in bacteriocin loci, including the Abi genes downstream of the bacteriocin structural genes, are likely involved in self-immunity. The conserved motifs involved in the immunity function confer a high degree of cross-resistance against each other's bacteriocins, suggesting the recognition of a common receptor. An example of this mechanism was found in Latilactobacillus sakei 23K [64]. Furthermore, the bacteriocin-like gene sak23Kalphabeta showed antimicrobial activity when expressed in a heterologous host, and the associated Abi gene sak23Ki conferred immunity against the related bacteriocin [65,66]. Genes encoding the production, secretion, regulation, and immunity of thermophilin 13 are similar to gene sequences reported for class II bacteriocins from S. thermophilus strains LMG18311, CNRZ1066, and LMD-9 [67]. However, in S. thermophilus B59671, belonging to Group 3, TDOR and CAAX protease are replaced with a CRISPR/Cas system. Prior studies have noted the importance of quorum-sensing induction peptides encoded by the different blp gene clusters found in S. thermophilus strains ST109, LMD-9, ST106, LMG18311, CNRZ1066, ND03, JIM8232, MN-ZLW002 and B59671 due to their homology to a bacteriocin-like peptide (blp) gene cluster in S. pneumoniae [28,51,68,69]. In relation to this aspect, the strains S. thermophilus B59671, ST106, ST109 and LMD-9 have been shown to produce a broad spectrum of bacteriocins encoded within a bacteriocin-like peptide (blp) gene cluster. The thermophilin 13 operon is also present in strains LMD-9 and B59671, but it must not be confused with the bacteriocin-like peptide (blp) gene clusters mentioned above, which are homologous to those of S. pneumoniae. In this regard, strains LMD-9 and B59671 could be multiple-bacteriocin producers, and the role of environmental factors and medium composition in their bacteriocin production should be evaluated.
Bacteriocin production is an energy-consuming process involving a cascade of genetic mechanisms, and the organisation of bacteriocin loci varies greatly. In many strains, quorum-sensing (QS) circuits modulate various physiological responses, including the production of antimicrobial compounds [70,71]. However, as noted by Walsh et al. (2015), in silico screens can be limited by their dependence on similarity to previously described bacteriocins [72]. Further work is required to confirm whether operon variation between strains influences the production of thermophilin 13. In summary, these results highlight that the production of the peptides ThmA and ThmB is strongly related to the PBGC, which is not limited to the thmA and thmB genes. Therefore, it can be assumed that the QS system regulates the expression of thermophilin 13 in several S. thermophilus strains. It should be noted that no other investigations of this bacteriocin have been reported since 1997. The evidence gathered in this study therefore provides further insight into the mechanism of production and regulation of thermophilin 13, twenty-five years after its first description. All the reported strains are used in industrial applications, and their technological properties have already been proven, opening a new line of research that needs further investigation. In light of the urgent need for new weapons to counteract pathogens without the use of antibiotics, identifying the most suitable thermophilin 13 producer strain, in terms of bacteriocin production and applicability in food manufacturing, is relevant.

Conclusions
The significance of antimicrobial peptides (AMPs) is growing in various fields, including their use as bioprotective agents. Many challenges regarding bacteriocins remain open, such as structural multiplicity, different modes of action, different classes, and the high cost of production. Furthermore, even for bacteriocins already applied as preservative agents, the major issues are finding strategies to maximise their rate of production and developing more effective purification steps from the bacterial supernatant, which are currently long and complicated.
The large number of genomes available in public repositories offers novel approaches for identifying new bacteriocin genes and gene clusters [54]. Screening of putative bacteriocin gene clusters provides a deeper understanding of how these peptides are regulated. Genome mining indicates that the thermophilin 13 operon is present in several strains, grouped into three clusters on the basis of the organization of the eight genes involved in the biosynthesis of this bacteriocin. As Marciset et al. (1997) suggested in their conclusion, thermophilin 13 shows peculiar characteristics in its mode of action and may share functional properties with lantibiotics. Our results also indicate that the thermophilin 13 two-component peptide system belongs to class IIb and has its own gene cluster, composed of a response regulator (RR), sensor histidine protein kinase (HPK), quorum-sensing system pheromone BlpC, ABC-transporter, bacteriocin accessory protein, thiol-disulfide oxidoreductases, CAAX protease and the genes thmA and thmB. However, the predictions obtained from the present research, and from in silico studies in general, must not be accepted as conclusive evidence for bacteriocin production, and we do not claim that all strains included in this study can produce thermophilin 13 in vitro and/or in vivo. Nevertheless, the information obtained in this study sheds some light on the possible involvement of quorum sensing in the regulation and secretion of thermophilin 13, as already reported for Streptococcus thermophilus strains carrying a bacteriocin-like peptide (blp) gene cluster. This is a solid starting point for further investigation into a topic that has not been explored since 1997, which provides