Genome Analysis of the Marine Bacterium Labrenzia sp. Strain 011, a Potential Protective Agent of Mollusks

The marine bacterium Labrenzia sp. strain 011 was isolated from the coastal sediment of Kronsgaard, Germany. Labrenzia species are suggested to be protective agents of mollusks. Labrenzia sp. strain 011 produces specialized metabolites that showed activity against a range of microorganisms, including strong inhibitory effects against Pseudoroseovarius crassostreae DSM 16950 (genus Roseovarius), the causative agent of oyster disease. The genome of Labrenzia sp. strain 011 was sequenced and assembled into 65 contigs; it has a size of 5.1 Mbp and a G+C content of 61.6%. A comparative genome analysis defined Labrenzia sp. strain 011 as a distinct new species within the genus Labrenzia, with 44% of its genome contributing to the Labrenzia core genome. The genomic data provided here are expected to contribute to a deeper understanding of the mollusk-protective role of Labrenzia spp.
Dataset: This whole-genome shotgun project has been deposited at DDBJ/ENA/GenBank under the accession no. QCYM00000000. The version described in this paper is the first version, QCYM01000000 (https://www.ncbi.nlm.nih.gov/nuccore/QCYM01000000).
Dataset License: CC0 (databases of molecular data on the NCBI Web site include examples such as nucleotide sequences (GenBank), protein sequences, macromolecular structures, molecular variation, gene expression, and mapping data. They are designed to provide and encourage access within the scientific community to sources of current and comprehensive information. Therefore, NCBI itself places no restrictions on the use or distribution of the data contained therein).

showed activity against the oyster pathogen Roseovarius crassostreae [1]. R. crassostreae has an adverse effect on natural oyster populations and on oyster farming operations [5]. In addition, strains of the genus Labrenzia that produce compounds with antimicrobial activity were found in association with soft corals and the marine sponge Erylus discophorus [6,7]. Moreover, an analysis of the available genome of Labrenzia sp. strain EL143 revealed many genes linked to a symbiotic relationship with sessile hosts, genes that can be linked to resistance mechanisms against antibiotics and toxic compounds, and genes corresponding to a strong dehalogenation potential [8]. The latter can be regarded as a requirement for filter-feeding organisms, which are exposed to halogenated substances in their environment and might use bacterial symbionts with dehalogenase activity for detoxification and nutrition [9]. These reports reflect the importance of Labrenzia species and their potential for the protection of marine bivalves and for biotechnological applications. Therefore, the genome of Labrenzia sp. strain 011 will enable the identification of biosynthetic gene clusters corresponding to protective compounds. The data shown here can be useful for research groups working on natural product discovery, by enabling further genome-mining approaches.

Data Description
The draft genome sequence of Labrenzia sp. strain 011 consists of 65 contigs (>1000 bp) with a total length of 5,102,962 bp and a G+C content of 61.6%. A total of 4812 coding sequences (CDSs) were predicted (this number includes proteins annotated as hypothetical), of which 2280 CDSs (48%) were categorized in 473 different subsystems with identified functional roles.
A phylogenetic tree of all Labrenzia strains with available genomes, based on the core genome alignment, revealed Labrenzia sp. strain OB1 and L. marina DSM 17023, which were isolated from coastal seawater in La Jolla, CA, USA, and South Korea, respectively, as the strains most closely related to Labrenzia sp. strain 011 (Figure 1).
To obtain further insight into the degree of similarity between the analyzed genomes, the numbers of core genes and singletons were determined. A total of 2131 CDSs contributed to the core genome of the Labrenzia strains, equivalent to ~44% of the Labrenzia sp. strain 011 genome (Figure 2A). To identify the actual core genome of a species, it is possible to use an approximate approach by extrapolating the number of core genes for an infinite number of genomes [10]. Using this methodology, it was calculated that the core genome will converge to around 2113 CDSs, based on a decay function (2929.005 × exp(−x/3.229) + 2112.783, see Figure 2B). The pan genome increases with every additional Labrenzia strain, indicating an open pan genome of Labrenzia (Heaps' law function: 5736.13 × x^0.462, see Figure 2C).
The average nucleotide identity (ANI) values between Labrenzia sp. strain 011 and all other analyzed Labrenzia strains ranged from 73.55% to 84.85% in the pairwise sequence comparisons (Figure 3). This places the strain in only distant relation to the other strains, as ANI values below 80–85% must be regarded as distantly related [11]. The in silico DNA-DNA hybridization (isDDH) values between Labrenzia sp. strain 011 and the other Labrenzia strains ranged from 22.7% to 33.1%, with the highest values obtained for Labrenzia sp. strain OB1 and L. marina DSM 17023, verifying the phylogenetic relationship between these two and strain 011. Furthermore, differences in the G+C content between Labrenzia sp. strain 011 and the other Labrenzia strains were between 1.32% and 5.38%, which supports the species delineation (Table 1). Because none of these comparisons meets the in silico thresholds for assignment to the same species (ANI ≥ 96%, isDDH ≥ 70%, and difference in G+C content of ≤ 1%) [11–13], Labrenzia sp. strain 011 is defined as a distinct new species of the genus Labrenzia (Figure 3, Table 1). In contrast, CP4, UBA4493, C1B70, and C1B10 seem to be strains closely related to L. aggregata RMAR6, with ANI values between 97% and 100% (Figure 3).
The genome of Labrenzia sp. strain 011 carries genes related to nitrogen metabolism and denitrification (56 CDSs), polyhydroxybutyrate metabolism (32 CDSs), and many genes related to stress response, for example, heat and cold shock (169 CDSs) (Figure 4). Labrenzia sp. strain 011 belongs to the family Rhodobacteraceae, which is a sister family of the Rhizobiales. The latter fix nitrogen in plant roots [14]. These data may explain the denitrification ability of the oyster microbiome, which is dominated by Rhodobacteraceae [15]. The bacteria of this family are surface colonizers and are known for the production of compounds with antibacterial activity, which inhibit the growth of other bacteria, thereby shaping the microbiome [15,16].
In total, 11 biosynthetic gene clusters (BGCs) were identified (5.3% of the genome), including one type I polyketide synthase, one terpene, one bacteriocin, four fatty acid, and four saccharide BGCs (Figure 5). Additionally, 23 putative gene clusters were identified using the ClusterFinder algorithm (3.8% of the genome), including three BGCs for cyclopropane fatty acid synthases (Figure 5).
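The fitted core and pan genome functions and the species-level thresholds quoted in this section can be evaluated in a few lines of Python. This is a minimal sketch, not the analysis code behind the figures. The constants are copied from the text; the function names, the rule that all three thresholds must hold at once, and the idea of combining the most favourable reported values into one hypothetical comparison are assumptions made for illustration.

```python
import math

# Fitted constants as quoted in the text (Figure 2B and Figure 2C).
CORE_AMPLITUDE = 2929.005
CORE_DECAY = 3.229
CORE_ASYMPTOTE = 2112.783
PAN_SCALE = 5736.13
PAN_EXPONENT = 0.462

# Same-species thresholds as quoted in the text.
ANI_MIN = 96.0       # percent
ISDDH_MIN = 70.0     # percent
GC_DIFF_MAX = 1.0    # percentage points


def core_genome_size(n_genomes: float) -> float:
    """Exponential decay fit: expected core CDS count for n genomes."""
    return CORE_AMPLITUDE * math.exp(-n_genomes / CORE_DECAY) + CORE_ASYMPTOTE


def pan_genome_size(n_genomes: float) -> float:
    """Power-law (Heaps' law) fit: expected pan genome CDS count for n genomes."""
    return PAN_SCALE * n_genomes ** PAN_EXPONENT


def is_open_pan_genome(exponent: float) -> bool:
    """With this power-law form, a positive exponent means continued growth."""
    return exponent > 0


def same_species(ani: float, isddh: float, gc_diff: float) -> bool:
    """Assumption of this sketch: all three thresholds must be met together."""
    return ani >= ANI_MIN and isddh >= ISDDH_MIN and gc_diff <= GC_DIFF_MAX


if __name__ == "__main__":
    for n in (1, 5, 10, 50):
        print(n, round(core_genome_size(n)), round(pan_genome_size(n)))
    print("open pan genome:", is_open_pan_genome(PAN_EXPONENT))
    # Most favourable values reported for strain 011 (highest ANI, highest
    # isDDH, smallest G+C difference), combined here only for illustration.
    print("same species:", same_species(84.85, 33.1, 1.32))
```

As the number of genomes grows, the exponential term vanishes and the core genome estimate approaches the constant 2112.783, which is where the figure of around 2113 CDSs comes from. Even the most favourable values reported for strain 011 fail every threshold, which is consistent with its placement as a distinct species.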

Sequencing and Assembly
The marine bacterium Labrenzia sp. strain 011 was isolated from sediment from the coastal area of Kronsgaard, Germany. Its colonies appear creamy yellow on Difco™ marine agar 2216 (Table 2). The genomic DNA isolation of Labrenzia sp. strain 011 was performed as described before [17]. In brief, a one-week culture in marine broth liquid medium was used to harvest the cell pellets. The DNA was then isolated from these pellets using the GenElute™ Bacterial Genomic DNA Kit (Sigma-Aldrich). Illumina shotgun paired-end sequencing libraries were generated and sequenced on a MiSeq instrument (Illumina, San Diego, CA, USA). Quality filtering using Trimmomatic version 0.36 (6) resulted in 495,158 paired-end reads for Labrenzia sp. strain 011. The paired-end reads were assembled using the SPAdes assembler v3.10, yielding initial sequence scaffolds [18]. Scaffolds smaller than 1 kb were filtered out, and 65 contigs remained, as determined with QUAST [19]. The genome completeness was estimated using CheckM [20] with the genus-level marker genes, resulting in a value of 83.2%.
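The length filter and the summary statistics reported for the assembly (contig count, total length, G+C content) can be illustrated with a short, dependency-free Python sketch. This is not the SPAdes or QUAST workflow used for the genome; the function names are hypothetical, the toy FASTA input is invented, and treating "smaller than 1 kb" as a cutoff that keeps sequences of at least 1000 bp is an assumption.

```python
from typing import Dict, Iterator, Tuple

MIN_CONTIG_LENGTH = 1000  # 1 kb cutoff stated in the text


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(contigs: Dict[str, str],
                   min_length: int = MIN_CONTIG_LENGTH) -> Dict[str, str]:
    """Keep only sequences of at least min_length bases."""
    return {n: s for n, s in contigs.items() if len(s) >= min_length}


def assembly_summary(contigs: Dict[str, str]) -> Dict[str, float]:
    """Contig count, total length, and G+C percentage over A/C/G/T bases."""
    total = sum(len(s) for s in contigs.values())
    gc = sum(s.count("G") + s.count("C") for s in contigs.values())
    acgt = gc + sum(s.count("A") + s.count("T") for s in contigs.values())
    return {
        "contigs": len(contigs),
        "total_bp": total,
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }


if __name__ == "__main__":
    # Invented toy input: one 1200 bp sequence and one 40 bp sequence.
    demo = (">long example\n" + "GC" * 400 + "\n" + "AT" * 200 + "\n"
            ">short\n" + "ACGT" * 10 + "\n")
    kept = filter_contigs(dict(parse_fasta(demo)))
    print(assembly_summary(kept))
```

Ambiguous bases such as N are counted in the total length but excluded from the G+C denominator; that choice is a design decision of this sketch, and other tools may handle it differently.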

Figure 1. Phylogenetic tree of selected Labrenzia strains with available genomes. The tree was built from a core of 2131 genes per genome. The geographic origins of the strains are given in parentheses. The tree was calculated with 100 iterations. All branches have 100/100 bootstrap support, except the branch between L. aggregata RMAR6 and Labrenzia sp. UBA4493/Labrenzia sp. CP4, which is 61/100.

Figure 2. (A) Core vs. pan genome plot of the genomes. (B) Core genome development plot. (C) Pan genome development plot.

Figure 3. Average nucleotide identity (ANI) heat map of the selected Labrenzia strains.

Figure 4. Subsystem category distribution and feature counts in the genome of Labrenzia sp. strain 011.

Figure 5. Distribution of the biosynthetic gene clusters (BGCs) in the genome of Labrenzia sp. strain 011. In total, BGC regions covering 463,048 bp (equal to 9.1% of the genome) were identified. The identified regions and their percentages of the total are given.
