The Complete Genome of Probiotic Lactobacillus sakei Derived from Plateau Yak Feces

Probiotic bacteria are receiving increased attention because of their potential benefits to their hosts. Plateau yaks are resistant to disease and stress, which may be related to the probiotics they harbor. To uncover the potential functional genes of yak probiotics, we sequenced the whole genome of Lactobacillus sakei (L. sakei). The L. sakei genome was 1.99 Mbp in length, with 1943 protein-coding genes, 21 rRNA, 65 tRNA, and 1 tmRNA. Three plasmids, carrying 88 protein-coding genes, were found in this bacterium. EggNOG annotation showed that the L. sakei genes mainly belonged to categories J (translation, ribosomal structure, and biogenesis), L (replication, recombination, and repair), G (carbohydrate transport and metabolism), and K (transcription). GO annotation showed that most of the L. sakei genes were related to cellular processes, metabolic processes, biological regulation, localization, response to stimulus, and organization or biogenesis of cellular components. CAZy annotation identified 123 CAZys in the L. sakei genome, including glycosyl transferases and glycoside hydrolases. Our results reveal the genome characteristics of L. sakei and may give insight into the future use of this probiotic bacterium for its functional benefits.


Introduction
Probiotics are live microbes that confer health benefits on their host by supporting a healthy gut microbiota, which plays an important role in the digestive system and in alleviating symptoms of disease [1,2]. Lactobacillus sakei (L. sakei) is a representative probiotic that is commonly isolated from fermented foods [3]. This nonpathogenic lactic acid bacterium has been found to prevent the growth
Genes 2020, 11, 1527

Library Construction and Sequencing
Targeted DNA fragments were extracted and purified using a BluePippin automatic nucleic acid gel-cutting instrument (Sage Science, Beverly, MA, USA). To construct the DNA library, DNA samples were fragmented, end-repaired, purified, ligated with barcodes, purified again, and ligated to a sequencing adapter using NBD103 and NBD114 kits (Oxford Nanopore Technologies, Oxford, UK). The library concentration was measured using an Invitrogen Qubit 3.0 (Thermo Fisher Scientific, Wuhan, China). Sequencing was carried out on a MinION Nanopore sequencer (Oxford Nanopore Technologies, Personalbio, Shanghai, China) and an Illumina HiSeq™ 2000 sequencer (Personalbio, Shanghai, China).

Sequencing Data and Quality Control
After the raw Nanopore and Illumina sequence data were obtained, quality control was performed using the quality score (Q-Score) and the base content distribution. The base Q-Score is a commonly used parameter that represents the reliability and accuracy of a base call. The base content distribution was analyzed to detect the separation of AT and GC. Reads containing adapters and low-quality reads (proportion of N > 10%, or more than half of the bases in a read with Q-Score ≤ 10) were removed to ensure high quality and accuracy in the subsequent analysis.
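The two low-quality criteria above can be sketched in Python. This is only an illustration of the stated thresholds, not the pipeline actually used in the study: the function name `passes_quality_filter` and its parameters are hypothetical, adapter removal is not shown, and rejecting empty or malformed reads is an added assumption.

```python
from typing import Sequence


def passes_quality_filter(
    bases: str,
    phred_scores: Sequence[int],
    max_n_fraction: float = 0.10,    # "proportion of N > 10%" is rejected
    low_q_threshold: int = 10,       # bases with Q-Score <= 10 count as low quality
    max_low_q_fraction: float = 0.5, # "over half of a read" is rejected
) -> bool:
    """Return True if a read survives both filters described in the text."""
    # Assumption: empty reads, or reads whose quality string does not match
    # the sequence length, are treated as failures.
    if len(bases) == 0 or len(bases) != len(phred_scores):
        return False

    # Filter 1: proportion of ambiguous bases (N).
    n_fraction = bases.upper().count("N") / len(bases)
    if n_fraction > max_n_fraction:
        return False

    # Filter 2: proportion of bases at or below the low-quality threshold.
    low_q_count = sum(1 for q in phred_scores if q <= low_q_threshold)
    if low_q_count / len(bases) > max_low_q_fraction:
        return False

    return True


# Toy reads (made up for illustration, not data from the study).
good_read = passes_quality_filter("ACGTACGTAC", [30] * 10)
too_many_n = passes_quality_filter("ACNNACGTAC", [30] * 10)
low_quality = passes_quality_filter("ACGTACGTAC", [8] * 6 + [30] * 4)
```

In this toy example the first read is kept, the second is rejected because 2 of 10 bases are N, and the third is rejected because 6 of 10 bases have a Q-Score of 10 or lower.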

Plasmid Analysis
The L. sakei plasmids and their protein-coding genes were annotated by BLAST searches against databases similar to those used for the genome analysis.

Genome and Plasmid Comparison
To determine the genomic structure and plasmids of L. sakei, the genome sequence data were compared with the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome/?term=Lactobacillus+sakei).

Raw Data Deposit
The sequences of the L. sakei genome and plasmids were submitted to the NCBI database under the following accession numbers: chromosome (CP064817) and plasmids (MW265923, MW265924, MW265925).

Sequencing Information and Quality Control Results
A total of 113,051 filtered reads (96.29%), comprising 1,671,594,841 bp, were obtained through Nanopore sequencing (Table 1). Among these, fewer than 5000 were low-quality reads (Figure 1A). Illumina sequencing showed that all read error rates were less than 0.00035 (Figure 1B). The distributions of A/T/C/G content were all nearly horizontal, except in the first few reads (presented as peaks) (Figure 1C), because of the preference for random primer sequences in the initial reads. Overall, clean reads accounted for 99.79% of the L. sakei reads (7,074,292/7,089,390), with a total of 1,061,058,974 bp (Table 2, Figure 1D). The Q20 and Q30 values were 97.95% and 93.42%, respectively (Table 2), indicating the high quality and accuracy of the L. sakei genome sequence reads.
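As a quick arithmetic check on these figures, the short Python sketch below recomputes the clean-read percentage and converts Q-Scores to error probabilities. It assumes the standard Phred definition, Q = -10 log10(P), which the text does not state explicitly. The function and variable names are hypothetical, and the mean Nanopore read length is a derived value that the text does not report.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability.

    Assumes the standard Phred relation Q = -10 * log10(P).
    """
    return 10 ** (-q / 10)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Illumina read counts as reported in the text (Table 2).
clean_reads = 7_074_292
raw_reads = 7_089_390
clean_percent = percent(clean_reads, raw_reads)

# Nanopore totals as reported in the text (Table 1).
nanopore_reads = 113_051
nanopore_bases = 1_671_594_841
mean_nanopore_read_length = nanopore_bases / nanopore_reads  # derived, not reported

# Error probabilities implied by the Q20 and Q30 thresholds.
q20_error = phred_to_error_probability(20)
q30_error = phred_to_error_probability(30)

print(f"Clean reads: {clean_percent:.2f}%")
print(f"Mean Nanopore read length: {mean_nanopore_read_length:.0f} bp")
print(f"Q20 error probability: {q20_error:.3f}; Q30: {q30_error:.4f}")
```

Under the Phred assumption, Q20 corresponds to an error probability of 1 in 100 per base and Q30 to 1 in 1000, so the reported values mean that 97.95% of bases have an estimated error probability of at most 1% and 93.42% of at most 0.1%.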

Genome Assembly, Annotation and Phylogenetic Tree
A total of 1,989,121 bp (1.99 Mbp) were sequenced from the circular genome of L. sakei, with a GC content of 41.1%. The sequencing depth was 533×, which helps to reduce false positives and sequencing errors. Prodigal-2.6.2 (https://github.com/hyattpd/prodigal/wiki) was used for gene prediction; the total gene length was 1,776,396 bp, with an average gene length of 875 bp (Table 3). The GC content of the L. sakei genes was 41%, and the intergenic region length was 212,724 bp. Gene density and Gene/Genome (%) were 1045 and 89%, respectively. In total, L. sakei had 1943 protein-coding genes, 21 rRNA (equal numbers of 5S rRNA, 16S rRNA, and 23S rRNA), 65 tRNA, and 1 tmRNA. A total of 103 repeated sequences were observed, of which 49 were located in rRNA, 45 were simple repeats, and 9 were low-complexity repeats. The L. sakei genome map was drawn using CGView (Figure 2) [32]. The 16S rRNA phylogenetic tree showed that the L. sakei yak isolate is most closely related to L. sakei (MK640920.1) (Figure 3). The whole genome-based phylogenetic tree showed that the L. sakei yak isolate is most closely related to L. sakei (NZ_CP059697) (Figure S1).
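The summary statistics in this paragraph can be cross-checked with a few lines of Python. The sketch below uses only numbers reported in the text. The text does not say how sequencing depth was calculated, so the formula here (Illumina clean bases divided by chromosome length, ignoring the plasmids) is an assumption, and the variable names are hypothetical.

```python
# Values reported in the text (Tables 2 and 3).
genome_length_bp = 1_989_121
total_gene_length_bp = 1_776_396
average_gene_length_bp = 875
illumina_clean_bases = 1_061_058_974

# Annotated feature counts reported in the text.
protein_coding, rrna, trna, tmrna = 1943, 21, 65, 1

# Gene/Genome (%): share of the chromosome covered by predicted genes.
coding_percent = 100 * total_gene_length_bp / genome_length_bp

# Number of genes implied by total gene length and average gene length.
implied_gene_count = total_gene_length_bp / average_gene_length_bp

# Assumption: depth = total clean bases / genome length (chromosome only).
sequencing_depth = illumina_clean_bases / genome_length_bp

print(f"Gene/Genome: {coding_percent:.1f}%")
print(f"Implied gene count: {implied_gene_count:.0f}")
print(f"Annotated features: {protein_coding + rrna + trna + tmrna}")
print(f"Estimated depth: {sequencing_depth:.0f}x")
```

Under these assumptions the coding fraction rounds to 89%, the implied gene count rounds to 2030 (equal to 1943 + 21 + 65 + 1), and the estimated depth rounds to 533×, in line with the reported values.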

Functional Analysis
EggNOG annotation showed that the L. sakei genes with known functions mainly belonged to categories J (translation, ribosomal structure, and biogenesis), L (replication, recombination, and repair), G (carbohydrate transport and metabolism), and K (transcription) (Figure 4). GO annotation showed that most of the L. sakei genes are related to biological processes, metabolic processes, biological regulation, localization, response to stimuli, and cellular component organization or biogenesis. In terms of function, most genes contribute to cell structural functions, including

Discussion
In the current study, we sequenced the genome of L. sakei using both the Nanopore and Illumina sequencing platforms. The genome size, GC content, and number of coding proteins (1943) of the yak L. sakei isolate were compared with those of previously reported strains (Table 4) [3,18,19,38–40]. The number of rRNA genes (22) of L. sakei was higher than in most of the previously reported isolates (9–21). Interestingly, most of the previously reported L. sakei strains had 0–2 plasmids, with lengths ranging from 1526 bp to 84,581 bp, while the present yak isolate had three plasmids, with lengths of 4034–74,183 bp (Table 5).
A total of 49 repeated RNA genes were found in the L. sakei genome, indicating high redundancy, which may promote the bacterium's ability to flourish in a complex microbial environment (Figure 2) [40]. In the KEGG annotation, we found that the L. sakei genome was rich in energy metabolism-related genes (Figure 7). Among the 72 "ko" pathways, carbohydrate pathways such as glycolysis (ko00010), pentose phosphate (ko00030), pentose and glucuronate interconversions (ko00040), fructose and mannose metabolism (ko00051), galactose metabolism (ko00052), starch and sucrose metabolism (ko00500), amino sugar and nucleotide sugar metabolism (ko00520), and pyruvate metabolism (ko00620) were found. Lipid pathways, such as fatty acid biosynthesis (ko00061), fatty acid degradation (ko00071), glycerolipid metabolism (ko00561), glycerophospholipid metabolism (ko00564), glyoxylate and dicarboxylate metabolism (ko00630), propanoate metabolism (ko00640), butanoate metabolism (ko00650), and fatty acid metabolism (ko01212), were also identified. ATP-related oxidative phosphorylation (ko00190) and the phosphotransferase system (PTS) (ko02060), together with the essential carbohydrate and lipid pathways, may play an important role in promoting host energy storage and metabolism, especially in harsh conditions. The PTS comprises two cytoplasmic phosphoryl proteins and sugar-specific enzyme II complexes, which have many regulatory functions in metabolism (carbon, nitrogen, and phosphate), transport (chemotaxis and potassium), and pathogen virulence [41,42]. Bacteria use the PTS to transport carbohydrates into the cell, which contributes to metabolism [43]. These results add to the evidence for specific adaptation characteristics of the L. sakei yak strain.
Interestingly, besides the β-lactam resistance (ko01501) and cationic antimicrobial peptide resistance (ko01530) pathways, antibacterial pathways such as streptomycin (ko00521) and Staphylococcus aureus infection (ko05150) may help protect yaks from infection by specific pathogens, as Staphylococcus aureus can modulate its sensitivity to cationic antimicrobial peptides to decrease the immune response (https://www.kegg.jp/dbget-bin/www_bget?ko05150) [44]. Our previous study also confirmed that this L. sakei strain was not only resistant to commonly used antibiotics, allowing it to survive during the treatment of Staphylococcus aureus infection, but also had strong antibacterial activity against Staphylococcus aureus in vitro [5]. Other pathways, such as longevity regulating pathway-worm (https://www.kegg.jp/dbget-bin/www_bget?ko04922), glucagon signaling pathway (https://www.kegg.jp/dbget-bin/www_bget?ko04922), and central carbon metabolism in cancer (https://www.kegg.jp/dbget-bin/www_bget?ko05230), are related to glucose metabolism and mitochondrial respiration, which may indicate the probiotic characteristics of this bacterium. However, further study is needed to evaluate these pathways.
The CAZys encoded by the gut microbiota genome contribute significantly to host carbohydrate breakdown, which has been found to be indispensable in human nutrition [45]. CAZy annotation showed that 35 and 38 genes in the L. sakei genome encode glycosyl transferases (GTs) and glycoside hydrolases (GHs), respectively (Figure 6). The number of key GT enzyme genes in L. sakei was in line with that in L. pentosus (29), while the number of GH genes was considerably higher than in L. pentosus (10) [45]. These GHs are necessary for glycosidic bond hydrolysis. Their abundance in the yak strain may be beneficial for carbohydrate metabolism.

Conclusions
This study provides an overview of the genome and reports on unique genetic properties of probiotic L. sakei isolated from yak feces. Genomic analysis can determine the properties of this strain and may guide the application of new biotechnology to key strains in the industry. However, further studies of the molecular and physiological properties of released metabolic products are required in order to understand the nature of the association of these probiotics with their metabolites. We hope that future research will improve understanding of the biology of these metabolites and provide new insights into the interactions between metabolites and bacteria.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/11/12/1527/s1, Figure S1: Phylogenetic tree of the L. sakei yak isolate and reference strains. The tree was constructed using MEGA 6.0 software by the neighbor-joining method, based on whole genome sequences with 1000 bootstrap replications.