The Complete Genome of a Novel Typical Species Thiocapsa bogorovii and Analysis of Its Central Metabolic Pathways

The purple sulfur bacterium Thiocapsa roseopersicina BBS is interesting from both fundamental and practical points of view. It possesses a thermostable HydSL hydrogenase, which is involved in the reaction of reversible hydrogen activation and a unique reaction of sulfur reduction to hydrogen sulfide. It is a very promising enzyme for enzymatic hydrogenase electrodes. There are speculations that HydSL hydrogenase of purple bacteria is closely related to sulfur metabolism, but confirmation is required, and this needs the full genome sequence. Here, we sequenced and assembled the complete genome of this bacterium. Analysis of the whole genome through an integrative approach that comprised estimating the Average Nucleotide Identity (ANI) and digital DNA-DNA hybridization (DDH) parameters allowed for validation of the systematic position of T. roseopersicina BBS as T. bogorovii BBS. For the first time, we have assembled the whole genome of this typical strain of a new bacterial species and carried out its functional description against another purple sulfur bacterium, Allochromatium vinosum DSM 180T. We refined the automatic annotation of the whole genome of T. bogorovii BBS and localized the genomic positions of several studied genes, including those involved in sulfur metabolism and genes encoding the enzymes required for the TCA and glyoxylate cycles and other central metabolic pathways. Eleven additional genes encoding proteins involved in pigment biosynthesis were found.


Introduction
The purple sulfur bacterium T. roseopersicina BBS belongs to the Chromatiaceae family from the class Gammaproteobacteria. This strain was isolated from the White Sea Estuary in 1969 [1]. Bacterial cells are spherical with a diameter of 1.0-1.5 microns. The color of the culture depends on the growth conditions and the composition of growth media and may range from pinkish red to light pink in cases of accumulating molecular sulfur. The bacteria are Gram negative. Despite the absence of flagella, the bacteria are capable of spontaneous impulse movements in wet microscopic samples. The culture grows in the pH range of 6.0-8.0 (optimum pH is 7.0-7.5) and in the presence of NaCl (up to 5%, the optimum content being 1-2%). T. roseopersicina BBS is a B12 auxotroph. It reproduces by binary fission. The culture accumulates polyphosphate granules, poly-β-butyrate, and polysaccharides as storage compounds. It can grow under chemolithotrophic conditions [2]. Over the past years, the metabolism of hydrogen, sulfur, and carbon as well as the biosynthesis of the photosynthetic pigments in these bacteria have been extensively studied at both biochemical and physiological levels [3][4][5][6][7][8]. Special attention has been paid to the hydrogenases of this species, their structure, maturation, and cellular functions as well as the genes encoding the hydrogenases and the enzymes responsible for their synthesis [9][10][11][12][13][14]. However, for future research and practical application of this bacterium, the full genome sequence is important.
The aim of this work was to assemble the whole genome of Thiocapsa sp. BBS; give a short description of it; analyze the genetic potential of this bacterium in sulfur metabolism, comparing it with the most studied purple sulfur bacterium, Allochromatium vinosum, for research into the role of HydSL hydrogenase in sulfur metabolism; present the genetic basis for the autotrophic growth of the bacterium; check for the presence of the citric acid cycle; and validate the systematic position of the studied bacterium.

Genome Sequencing, Assembly, and Annotation
Genomic DNA was isolated from the biomass of a fresh culture (a colony) of T. roseopersicina BBS grown under anaerobic photoheterotrophic conditions on modified Pfennig's medium in the presence of 0.2% sodium acetate using the Monarch® HMW DNA Extraction Kit (T3050L, NEB). Sequencing was performed using the MinION sequencer with an R9.4.1 flow cell (Oxford Nanopore Technologies [ONT]) in the facilities of the State Research Center for Applied Microbiology and Biotechnology (SRCAMB, Obolensk, Russia). The library was prepared using the Rapid Barcoding Kit (cat.# SQK-RBK00). Guppy version 3.2.4 software was used for base calling, which yielded a total of 338.5 Mb distributed in 179,494 reads. Reads were filtered based on the quality metric (Q > 10).
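The quality filter (Q > 10) can be illustrated with a minimal Python sketch. It assumes that the metric is the mean Phred quality of a read, derived from the averaged per-base error probability, and that qualities use the standard Phred+33 encoding; the text does not state how the metric was computed, and the function names are hypothetical.

```python
import math


def mean_read_quality(quality_string: str, offset: int = 33) -> float:
    """Mean Phred quality of one read, averaged in error-probability space.

    Assumption: Phred+33 encoded quality string, as in standard FASTQ files.
    """
    if not quality_string:
        return 0.0
    # Convert each quality character to its error probability.
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    # Convert the mean error probability back to the Phred scale.
    return -10 * math.log10(mean_error)


def filter_fastq_records(records, min_quality: float = 10.0):
    """Yield (name, sequence, quality) tuples whose mean quality exceeds the cutoff."""
    for name, sequence, quality in records:
        if mean_read_quality(quality) > min_quality:
            yield name, sequence, quality
```

Averaging error probabilities rather than raw Q values is a deliberate choice in this sketch: a few very poor bases then lower the read score more strongly than an arithmetic mean of Q values would.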
Quality control of Illumina reads was carried out using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc, accessed on 1 June 2023). Illumina and Nanopore reads were used for hybrid assembly with SPAdes version 3.15.2 [15]. Nanopore reads were assembled into contigs using the Flye assembler version 2.6 [16]. SPAdes contigs were then combined into replicons using Flye data as a reference. Illumina reads were used to correct Nanopore errors using Bowtie2 version 2.3.5.1 [17] and Pilon version 1.23 [18] software. Default settings were used for all software. Circularization of the ends of a chromosome was confirmed by overlapping ends as well as by visualization in the Tablet program [19].
Data were submitted to the GenBank database under the following accession numbers: BioProject-PRJNA224116, BioSample-SAMN23799607, and GenBank-CP089309.1. The assembled genome was annotated using Prokka [20] and RAST [21]. The functions of some proteins were checked manually using BLAST. The phylogenetic tree was constructed by the neighbor-joining method using the REALPHY service [22]. Genome sequences of Thiocapsa strains required for constructing the phylogenetic tree were taken from the WGS database (https://www.ncbi.nlm.nih.gov/Traces/wgs/?view=wgs, accessed on 11 October 2021). The circular map was created using DNAPlotter [23].
The ANI value was calculated using the EzBioCloud service [24]. The DDH parameter was calculated using the Genome-to-Genome Distance Calculator 2.1 service [25]. We used the Kyoto Encyclopedia of Genes and Genomes (KEGG) [26] to perform functional annotation using the blastp module.
Metabolic pathways at the level of enzymes and their genes in T. roseopersicina BBS were analyzed using information from the KEGG Pathway database for prokaryotes, as well as records for substrates and products of biochemical reactions in KEGG Compounds. Searching for homologous genes in the T. roseopersicina BBS genome was performed using tBLASTn, which compares an amino acid query against the six-frame translation of the genome [29]. The best-characterized gene of a prokaryotic organism (i.e., having an experimentally proven function of interest) was chosen as a query for analysis. If highly homologous genes were absent, the test was repeated with less-characterized sequences (from purple bacteria only) as queries. The algorithm parameters were set to default values. A gene of interest was considered present when all of the following criteria were met for the protein alignment: identity > 40%, alignment length > 100 aa, bit score > 50, and e-value (the expectation value) ≤ 10⁻⁶. An e-value > 10⁻⁶ implied that the gene was absent. If the repeated search did not reveal a putative gene either, the gene was considered absent. We detected a small number of genes satisfying the search criteria. Amino acid sequences of the proteins from the UniProt database were used as search templates [30].
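The presence criteria above can be written as a single predicate. The following Python sketch is only an illustration of the stated thresholds; the `Hit` record and its field names are hypothetical and do not correspond to any particular BLAST output parser.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One tBLASTn alignment (hypothetical record, fields named for illustration)."""
    identity_pct: float      # percent identity of the protein alignment
    alignment_length: int    # alignment length in amino acids
    bit_score: float
    e_value: float


def gene_present(hit: Hit,
                 min_identity: float = 40.0,
                 min_length: int = 100,
                 min_bit_score: float = 50.0,
                 max_e_value: float = 1e-6) -> bool:
    """Return True only if the hit satisfies every criterion from the text."""
    return (hit.identity_pct > min_identity
            and hit.alignment_length > min_length
            and hit.bit_score > min_bit_score
            and hit.e_value <= max_e_value)
```

For example, a hit with 71% identity over 420 aa, a bit score of 600, and an e-value of 1e-150 passes, whereas the same hit with 35% identity does not.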

General Characteristics of the T. roseopersicina BBS Genome
We sequenced and assembled the genome of T. roseopersicina BBS. Its main statistical features are shown in Table 1. The T. roseopersicina BBS DNA consists of a circular chromosome 5,649,927 bp long (with a GC content of 63.94%), and plasmids are absent (Figure 1). The chromosome contains 5036 genes, 2 rRNA clusters (5S, 16S, 23S), and 51 tRNAs. Of 4978 protein-coding sequences, 2468 (49.6%) were functionally annotated (Figure 2).
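Summary statistics of this kind are straightforward to recompute from an assembled sequence and an annotation table. The sketch below shows one way to do so in Python; the function names are hypothetical, and the sequence in the usage note is a toy string, not the BBS chromosome.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)


def percent(part: int, total: int, digits: int = 1) -> float:
    """Share of `part` in `total`, as a rounded percentage."""
    return round(100.0 * part / total, digits)
```

With the counts reported above, `percent(2468, 4978)` gives 49.6, which matches the stated fraction of functionally annotated protein-coding sequences.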

Validating the Taxonomic Position of the BBS Strain
The genus Thiocapsa currently comprises nine species. Two of them have not been validated at the time of preparing this manuscript (LPSN, List of Prokaryotic names with Standing in Nomenclature, https://www.bacterio.net/, accessed on 1 June 2023). One of the two non-validated species includes the strain under consideration: the reclassification of T. roseopersicina BBS into a novel species, T. bogorovii BBS, was suggested by Tourova et al. in 2009 with BBS as the type (and currently the only) strain [31]. However, these data require further validation using a full genome, since GenBank contains only fragmented data on small sequences of T. roseopersicina BBS genomic DNA up to 40,000 base pairs in size (AF528191.1, JF712872.1, etc.). Therefore, it has not previously been possible to determine the extent to which this bacterium differs from the typical T. roseopersicina strain.
Analyzing the obtained whole genome of T. roseopersicina BBS through an integrated approach that included calculating the ANI and DDH parameters resulted in validation of the systematic position of T. bogorovii BBS. This bacterium is not closely related to any typical strain of the species from the Thiocapsa genus (T. marina, T. rosea, T. imhoffii, and T. roseopersicina) (Table 2). Thus, for the first time, we obtained the assembled whole genome of T. bogorovii BBS and confirmed the relevance of its classification into a novel typical species of the Thiocapsa genus. The systematic position of T. bogorovii BBS explains the existence of substantial differences in bacterial physiology from T. roseopersicina DSM 217T. In particular, T. bogorovii BBS is characterized by the absence of assimilatory sulfate reduction, vitamin B12 auxotrophy, and higher optimal values of growth medium salinity as well as the distinct range of utilized organic compounds [1].
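The decision logic behind such a comparison can be sketched as follows. The cutoffs used here (ANI of about 95% and digital DDH of 70%) are commonly cited conventions for species delineation and are an assumption of this sketch; the text above does not state which thresholds were applied, and the function name is hypothetical.

```python
def same_species(ani_pct: float, ddh_pct: float,
                 ani_cutoff: float = 95.0, ddh_cutoff: float = 70.0) -> bool:
    """Return True if both genome-relatedness indices reach the species cutoffs.

    Assumption: the default cutoffs are conventional values, not values
    taken from the manuscript.
    """
    return ani_pct >= ani_cutoff and ddh_pct >= ddh_cutoff
```

Under these assumed cutoffs, a pair of genomes with values well below both thresholds would be assigned to different species, which is the kind of outcome reported for BBS against the other Thiocapsa species in Table 2.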

A Comparison of the Whole Genomes of T. bogorovii BBS and Alc. vinosum DSM 180T
To study the specificities of sulfur metabolism and the role of the involved hydrogenases in T. bogorovii BBS, the whole sequence of its genome was characterized and compared with the genome of another representative of the Chromatiaceae family, Alc. vinosum DSM 180T, since its sulfur metabolism has been studied in detail. The whole sequence of Alc. vinosum DSM 180T consists of one circular chromosome and two plasmids [32]. Similar to T. bogorovii BBS, this bacterium is capable of phototrophic growth (anaerobic in the light) and chemotrophic growth (aerobic in the dark). Data on the major features of the genetic categories are given in Table 3. These bacteria belong to distinct genera of purple sulfur bacteria and are very different in genome composition. Nevertheless, sulfur metabolism is a common feature, which allows us to compare them in this respect.

Pigment Biosynthesis and Photocomplexes
The photosynthetic apparatus of many purple bacteria consists of three types of pigment-protein complexes: two light-harvesting antenna complexes (LH1 and LH2) capture photons of different wavelengths and transfer the excitation energy to the reaction center (RC, complex III) [33]. The core antenna LH1 is adjacent to the RC, forming an open ring [34]. The LH2 peripheral antenna is located at the periphery. There are three structural types of RC-LH1 complexes in purple bacteria [35]. The gaps in the LH1 ring are considered to be necessary for the penetration of the ubiquinone that links RC with the cytochrome bc1 complex. In Cereibacter sphaeroides, the PufX protein is present in each monomer of the dimeric RC-LH1 [36]. In the purple non-sulfur bacterium Rhodopseudomonas palustris [37], RC exists in the form of the RC-LH1 monomer and contains the protein W (similar to PufX). The third structural type of RC-LH1 complex has been identified in the purple sulfur bacterium Thermochromatium tepidum [38]. In this bacterium, RC exists as the RC-LH1 monomer and does not comprise proteins similar to the PufX or W proteins (however, the LH1 ring is open and contains channels). Recently, a fourth form of RC-LH1 with a closed ring has been reported in purple bacteria [39].
Gene clusters in T. bogorovii BBS related to photosynthesis have been investigated earlier. The localization of several genes is uncommon compared to other photosynthetic bacteria [8]. The obtained whole genome of this bacterium allowed us to specify the localization of the genes encoding the enzymes of the specific pathway of carotenoid synthesis in T. bogorovii BBS, the main carotenoid being spirilloxanthin [1,8]. It also allowed for identification of the genes of the light-harvesting photosynthetic complexes and the genes related to the synthesis of bacteriochlorophyll a, which are located outside of the previously studied genomic fragments (Table 4). Sequencing errors (substitutions and deletions) were detected in some previously characterized regions. According to the genomic analysis performed, the genes encoding the proteins W and PufX are absent from both T. bogorovii BBS and Alc. vinosum DSM 180T.

Autotrophy and RuBisCO
In purple bacteria, the major pathway for CO2 assimilation under autotrophic conditions is the reductive pentose phosphate cycle (Calvin-Benson-Bassham cycle) [40]. Ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), phosphoribulokinase (PRK), and sedoheptulose-bisphosphatase (SBP) are the enzymes that function exclusively in this cycle [41]. Together, RuBisCO and PRK can indicate that this cycle is present in the organism in question. Enzymes that differ from the typical plant form of RuBisCO have been identified in various microorganisms. There are four main forms of RuBisCO [42], with RuBisCO forms I, II, and III exhibiting carboxylase and oxygenase activity, but under potentially different physiological conditions. Form III has been found only in archaea and has therefore been allocated into a separate category. Form IV includes the RuBisCO-like protein (RLP). This enzyme does not catalyze ribulose 1,5-bisphosphate-dependent CO2 fixation; however, it might be involved in sulfur metabolism and stress responses [43,44] as well as in the methionine salvage pathway [45]. The function of RLP is unknown in many organisms. RuBisCO activity has been detected in the T. roseopersicina DSM 217T and T. bogorovii BBS cultures [3,7,46]. The peculiarity of the BBS strain is the insensitivity of RuBisCO synthesis to molecular oxygen [3]. In contrast, RuBisCO synthesis is suppressed by oxygen in most purple bacteria [40].
The green-like form I RuBisCO was the only one obtained and sequenced using specific oligonucleotide primers in T. roseopersicina DSM 217T and T. bogorovii BBS [31], whereas Alc. vinosum DSM 180T was shown to contain two forms (the green-like form I RuBisCO and the red-like form II RuBisCO).
Analysis of the T. bogorovii BBS genome using the RAST 2 algorithm revealed two copies of the gene encoding the green-like RuBisCO form I. It is worth noting that the genes encoding the small subunits are different, and their nucleotide sequences have 70% identity (using standard settings of the Nucleotide BLAST algorithm). Using tBLASTn, the bacterial genome was shown to contain a gene (LT988_08840, 1944337 to 1945626 base pairs) encoding a protein product with 71% amino acid sequence identity to the form IV "RuBisCO-like" protein of Alc. vinosum DSM 180T. The role of RLP in T. bogorovii BBS has not yet been elucidated.
The structural genes encoding two enzymes of the RuBisCO form I are far apart, and their genetic neighborhoods are different, similar to the RuBisCO genes in Alc. vinosum DSM 180T [32]. In one case, the RuBisCO structural genes (LT988_02750, rubisco L 566332-567750 bp and LT988_02755, rubisco S 567908-568264 bp) were located upstream of the genes encoding RuBisCO activation proteins (LT988_02760, CbbQ 568387-569193 bp and LT988_02770, CbbO 569671-572049 bp), with the 23Sr gene (LT988_02765, 569275-569637 bp) encoding the four-helix bundle protein located in-between. The transcriptional regulator of the RuBisCO operon (LT988_RS02745, CbbR) is located within 566015-565080 bp. This gene has an inverted orientation relative to the other RuBisCO genes.
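The coordinates quoted for this locus can be checked with a few lines of Python. The sketch below assumes that a start coordinate larger than the end coordinate denotes a gene on the reverse strand, which is how the inverted CbbR gene is written above; the `GeneSpan` class is a hypothetical helper, not part of any annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneSpan:
    """A gene given by its name and 1-based inclusive coordinates."""
    name: str
    start: int
    end: int

    @property
    def strand(self) -> str:
        # Assumption: start > end encodes the reverse strand.
        return "+" if self.start <= self.end else "-"

    @property
    def length_bp(self) -> int:
        return abs(self.end - self.start) + 1

    @property
    def codons(self) -> int:
        # Whole codons in the span, including the stop codon.
        return self.length_bp // 3


# Coordinates taken from the paragraph above.
rubisco_l = GeneSpan("rubisco L", 566332, 567750)
rubisco_s = GeneSpan("rubisco S", 567908, 568264)
cbb_r = GeneSpan("CbbR", 566015, 565080)

for gene in (cbb_r, rubisco_l, rubisco_s):
    print(gene.name, gene.strand, gene.length_bp, gene.codons)
```

All three spans have lengths divisible by three (936, 1419, and 357 bp), as expected for complete coding sequences, and only CbbR is reported on the reverse strand.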
In another case, based on the classical RAST 2 annotation, the RuBisCO structural genes (LT988_RS07075, CbbL, 1573243-1574658 bp and LT988_RS07080, CbbS, 1574821-1575165 bp) were located upstream of the genes encoding carboxysome proteins (carboxysome shell proteins CsoS2 (LT988_07085) and CsoS3 (LT988_07090), putative carboxysome peptides A and B (LT988_07095 and LT988_07100, respectively), two carboxysome shell proteins CsoS1 (LT988_07105 and LT988_07110)), and bacteriopheophytin (suspected to be related to carboxysome, LT988_07115). The transcription regulator of the RuBisCO operon is located at some distance (LT988_07130, 1582292-1583305 base pairs). Carboxysomes are organelle-like protein microcompartments found in cyanobacteria and many chemoautotrophic bacteria. They elevate the efficiency of carbon fixation by abolishing oxygen access to RuBisCO, located under the protein envelope, as well as via the mechanism of carbonic anhydrase-dependent CO2 concentration [47,48]. In Alc. vinosum DSM 180T, low CO2 caused a preferential expression of the RuBisCO form flanked by carboxysomal genes, compared to the second RuBisCO form that dominates at high CO2. This is considered to support the role of the carboxysome-dependent CO2 concentration mechanism in this bacterium [32]. However, there are still no data regarding the existence of carboxysomes in T. bogorovii BBS. Nevertheless, analysis of the genomes of Alc. vinosum DSM 180T and T. bogorovii BBS using RAST ver. 2.0 revealed 15 and 23 genes associated with the carboxysome system, 19 and 22 genes associated with photorespiration (involving RuBisCO oxygenase activity), and 13 and 17 genes associated with the Calvin-Benson-Bassham cycle, respectively.
Thus, the analysis of the genomic sequence of T. bogorovii BBS allowed us to detect several genes for different forms of RuBisCO and complemented the data obtained by Tourova et al., who found only part of the single gene encoding the green-like RuBisCO form I in this bacterium using oligonucleotide primers specific to different types and forms of RuBisCO genes [31]. Further studies are needed to confirm the expression activity of the identified genes and to determine their physiological roles in metabolism in T. bogorovii BBS.

Heterotrophy
T. bogorovii BBS assimilates organic substrates during photoheterotrophic growth and can utilize them both as electron donors and as carbon sources. The main metabolic pathways of carbon metabolism are the tricarboxylic acid cycle, the Entner-Doudoroff pathway, and the Embden-Meyerhof-Parnas pathway.
Organic acids and substrates metabolized through acetyl-CoA are assimilated via the tricarboxylic acid cycle (TCA cycle). T. bogorovii BBS has been categorized among the purple bacteria in which the tricarboxylic acid cycle does not function, based on its ability to consume only a few organic compounds from the growth medium, usually only as additional carbon sources. However, according to the results of KEGG Pathway analysis of the complete genome of T. bogorovii BBS and the genome of Alc. vinosum DSM 180T, both bacteria possess the genes of all enzymes necessary for the functioning of the tricarboxylic acid cycle [7].
The lack of a beneficial effect after adding certain substrates to an inorganic medium can be explained by the lack of their intracellular transport. The revealed contradictions are the basis for further studies of the functional activity of the genes of the considered metabolic pathways at the transcriptional and proteomic levels.
Earlier works have shown that T. bogorovii BBS has an open reductive TCA cycle (rTCA) typical of bacteria lacking the full set of TCA enzymes [7,49]. Essentially, rTCA functions as a reverse oxidative TCA employed by bacteria as a pathway for autotrophic CO2 fixation and acetate assimilation. The full turn of the cycle under autotrophic conditions leads to the fixation of four CO2 molecules with the formation of one oxaloacetate molecule. Two irreversible TCA cycle reactions are catalyzed in the reductive cycle by alternative enzymes. The formation of 2-oxoglutarate from succinyl-CoA and CO2 is catalyzed by 2-oxoglutarate synthase (EC: 1.2.7.3, 1.2.7.11). The irreversible citrate synthase can be substituted by the reversible ATP-dependent citrate lyase (EC: 2.3.3.8) [50] or by two sequentially functioning enzymes: citryl-CoA synthetase (EC: 6.2.1.18) and citryl-CoA lyase (EC: 4.1.3.34) [51]. Analysis of the T. bogorovii BBS genome with KEGG revealed the genes for one of these key enzymes, 2-oxoglutarate synthase (EC: 1.2.7.3, 1.2.7.11; LT988_18060, LT988_18065). However, the enzymes required to replace the irreversible citrate synthase involved in the oxidative TCA cycle have not been identified yet; therefore, this pathway has been termed the open rTCA cycle. The genes encoding the key rTCA cycle enzymes are absent in Alc. vinosum DSM 180T as well. The genes encoding the enzymes of TCA and rTCA cycles in T. bogorovii BBS are listed in Table 5.
Conversion of carbohydrates to phosphoenolpyruvate, pyruvate, and acetyl-CoA occurs via the Entner-Doudoroff and Embden-Meyerhof-Parnas pathways. According to the KEGG annotation, the genomes of T. bogorovii BBS and Alc. vinosum DSM 180T contain the genes of glycolytic enzymes (Embden-Meyerhof pathway) and enzymes involved in pyruvate oxidation and gluconeogenesis (Table 6). The Entner-Doudoroff pathway has no genetic potential for function in the bacteria when considered according to the KEGG Pathway database. This confirms the results previously obtained when studying T. bogorovii BBS at the physiological level.
According to KEGG, the genes required for the pentose phosphate pathway are present in the genome of T. bogorovii BBS (Table 7). It should be noted that T. bogorovii BBS has the gene LT988_13700 encoding D-glucose permease of the phosphotransferase system (PTS), which is absent from the genome of Alc. vinosum DSM 180T. This enzyme (EC 2.7.1.199) is a component (known as enzyme II) of the phosphoenolpyruvate (PEP)-dependent sugar-transporting PTS.
Further research is required to confirm the functions of the proteins encoded by the identified genes.

Hydrogenases
Hydrogenases belong to a class of oxidoreductases that can catalyze molecular hydrogen activation. They are widely distributed in archaea, bacteria, and some unicellular eukaryotes. These enzymes are involved in highly diverse biological processes [52] where they perform different functions. In some cases, the organisms produce hydrogen to remove an excess of reducing equivalents; in other cases, they take up hydrogen and utilize it as an electron source. Hydrogen uptake hydrogenases are used by nitrogen-fixing microorganisms to utilize the hydrogen generated by the nitrogenase system. A separate group of sensor hydrogenases is involved in the transcription induction of hydrogen uptake hydrogenases [52,53].
HupSL hydrogenase is involved in the uptake of the exogenous and endogenous molecular hydrogen with the transfer of electrons to the ubiquinone pool. This enzyme is also known to participate in the reverse uptake of hydrogen produced by nitrogen fixation [52]. This hydrogenase is thought to ensure bacterial growth under photoautotrophic conditions as well as chemolithotrophic growth under microaerophilic conditions in the dark [10,57]. HupS (LT988_02910) and hupL (LT988_02915) encode the small and large subunits of HupSLC hydrogenase, respectively. HupC (LT988_02920) is the cytochrome of the b type. The genome of T. bogorovii BBS also contains the genes hupDHI required for the biosynthesis of HupSLC hydrogenase [55]: LT988_02925 (hydrogenase expression/formation protein HupD), LT988_02930 hupH (comparison of the nucleotide sequence of hupH from AY837591.1 (GenBank) with the sequence from the whole genome of T. bogorovii BBS using tBLASTn revealed differences reflected in 92% identity of amino acid sequences), and LT988_02935 (rubredoxin gene hupI).
Hyd hydrogenase can reversibly catalyze the reaction towards hydrogen production as well as hydrogen uptake [58,59]. However, to date, the only known major physiological cellular function of Hyd is to catalyze the reduction of elemental sulfur with hydrogen uptake in the dark [10]. HydSL is encoded by LT988_03130 (HydS subunit) and LT988_03145 (HydL subunit). The structural genes encoding the large and small subunits are separated by two open reading frames (LT988_03135 and LT988_03140) called isp1 and isp2. The Isp1 protein encoded by the gene with the same name is a heme-containing transmembrane protein of b type, whereas the sequence of Isp2 demonstrates a substantial level of similarity with heterosulfide reductases [11].
Despite the presence of HupUV genes (LT988_05045 and LT988_05050) in the genome of T. bogorovii BBS, this bacterium does not express this enzyme that commonly serves as a hydrogen sensor in other bacteria and is involved in the synthesis of HupSL hydrogenases [60].
Soluble hydrogenases belonging to the HoxEFUYH type of Ni-Fe hydrogenases and involved in NAD+ reduction or NADH oxidation have been demonstrated experimentally [13,61,62]. The activity of Hox hydrogenases is closely linked to sulfur metabolism. Elemental sulfur serves as an electron donor for hydrogen production under light [10]. The Hox1 hydrogenase consists of the HoxYH hydrogenase (LT988_09435 and LT988_09445) and HoxFU diaphorase (LT988_09425 and LT988_09430). HoxE (LT988_09420) is a fifth subunit that is thought to be involved in electron transfer. These genes are followed by a gene (LT988_09450) encoding the hydrogenase maturation protein HoxW.
HydSL, HupSLC, and Hox1 enzymes have been identified and described in Alc. vinosum DSM 180T [32]; however, in contrast to T. bogorovii BBS, this bacterium lacks the genes of the sensor hydrogenase HupUV. The Alc. vinosum DSM 180T genome contains the genes of two other hydrogenases. The sequences of one of these hydrogenases (Alvin_0807 to Alvin_0810) show a high degree of similarity with the hydrogenase/sulforeductases (EC: 1.12.98.4) of Thermodesulfovibrio yellowstonii DSM 11347 and Chlorobium tepidum [32]. Searching for the genes encoding this enzyme in the genome of T. bogorovii BBS using tBLASTn revealed that the highest degree of similarity was between the sequences of the sulfhydrogenase and Hox2 hydrogenase genes.
Thus, analysis of the bacterial genome searching for known forms of hydrogenases revealed the presence of hydrogenase genes previously demonstrated experimentally.

Chemotrophic Metabolism
Upon photoautotrophic growth, sulfide, thiosulfate, sulfur, or H2 are utilized as electron donors. Chemoautotrophic growth in the dark under anaerobic conditions is possible due to thiosulfate used as a sulfur source as well as an electron donor and energy source for CO2 assimilation [2].
T. roseopersicina and T. bogorovii BBS are known to grow under chemolithotrophic conditions utilizing thiosulfate as an electron donor [2]. It should be noted that, in theory, CO2 fixation involving RuBisCO can occur in the presence of oxygen, since its synthesis continues in the presence of O2 in both Thiocapsa species. This might be related to the presence of the carboxysome-associated RuBisCO form described in the 'Autotrophy and RuBisCO' section.
Similar to Alc. vinosum DSM 180T [32], the genome of T. bogorovii BBS contains the genes encoding both oxidases required for the chemotrophic growth of the bacteria in the presence of oxygen. Cytochrome bd oxidase is encoded by the genes LT988_17870 (annotated as the ubiquinol oxidase subunit I, its amino acid sequence exhibits 85% identity with the sequence of the oxidase Alvin_2499 in Alc. vinosum DSM 180T), LT988_17865 (annotated as the cytochrome d ubiquinol oxidase subunit II, 80% identity with Alvin_2500 in Alc. vinosum DSM 180T), and LT988_17860 (annotated as the cytochrome bd-I oxidase subunit CydX, 73% identity with Alvin_2501 from Alc. vinosum DSM 180T). Cytochrome bd oxidase from Alc. vinosum DSM 180T is known to function mostly under microaerobic conditions, removing toxic oxygen during nitrogen fixation [64].
Cytochrome cbb3 oxidase from T. bogorovii BBS is encoded by the genes located between LT988_18880 and LT988_18925. Cytochrome cbb3 oxidase maintains its catalytic activity in the presence of low oxygen and is capable of proton translocation [32]. We demonstrated the presence of genes responsible for chemotrophic metabolism. Further research is required to confirm the functions of the proteins encoded by these genes.

Nitrogen Metabolism
According to genome annotation (Rapid Annotation using Subsystem Technology, RAST, version 2.0, online genome analysis software), the fraction of the genome involved in nitrogen metabolism in T. bogorovii BBS is 1.8 times greater than in Alc. vinosum DSM 180T (Table 2). Both bacteria are known to be diazotrophs [32,65]. Their ability to fix molecular nitrogen is provided by a Mo-containing two-component complex of the metalloenzyme nitrogenase encoded by nifHDK genes. The genomic DNA of T. bogorovii BBS contains all the genes of the nitrogenase complex: LT988_07970 (NifH, nitrogenase iron protein), LT988_07975 (NifD, nitrogenase molybdenum-iron protein alpha chain), and LT988_07980 (NifK, nitrogenase molybdenum-iron protein beta chain). Similar to Alc. vinosum DSM 180T [32,65], in T. bogorovii BBS the genes associated with nitrogen fixation are located in various genomic regions and organized in genetic clusters.
T. bogorovii BBS can utilize ammonium salts, urea, peptone, casein hydrolyzate, and arginine as nitrogen sources. However, it cannot grow in the presence of alanine or hydroxylamine [31]. Furthermore, T. bogorovii BBS does not grow when supplemented with glutamic acid or KNO3. According to the online analysis tools and KEGG Pathway database, this bacterium is missing genes involved in assimilatory nitrate reduction (ferredoxin-nitrate reductase [EC: 1.7 -]). At the genomic level, this bacterium has the potential for the first reaction of dissimilatory nitrate reduction, the conversion of nitrate into nitrite by nitrate reductase/nitrite oxidoreductase, alpha subunit [EC:1.7.5.1 1.7.99.-] (LT988_06635 encoding respiratory nitrate reductase subunit gamma; and LT988_07690 encoding nitrate reductase cytochrome c-type subunit). Analysis of the Alc. vinosum DSM 180T genome revealed the lack of the genes involved in the processes mentioned above [32].
Thus, the genome analysis revealed the genes responsible for nitrogen metabolism, in agreement with previous physiological data. Further studies are needed to confirm the functions of the proteins encoded by the identified genes.

Sulfur Metabolism
T. bogorovii BBS grows only in the presence of S⁰, S²⁻, or S₂O₃²⁻ [31]. It does not utilize cysteine and methionine as sulfur sources. Sulfide and thiosulfate are oxidized to sulfate via the formation of elemental sulfur accumulated in cells as granules. There are several known pathways for utilizing sulfur compounds as electron donors and acceptors for energy acquisition.
In contrast to Alc. vinosum DSM 180 T [32], T. bogorovii BBS is not capable of assimilatory sulfate reduction [31].This is confirmed by the results of KEGG Pathway genome analysis.This bacterium lacks the genes encoding the enzymes of phosphoadenosine phosphosulfate reductase [EC:1.8.

Thiosulfate Oxidation
Thiosulfate oxidation can occur in the periplasm with the formation of tetrathionate (S₄O₆²⁻) or in the periplasmic multi-enzyme sulfur oxidation system (Sox system) [66]. The main enzyme of the tetrathionate pathway in Alc. vinosum DSM 180T functions as tetrathionate reductase (TsdA, Alvin_0091 [67]). The electron transfer from TsdA to the photosynthetic or respiratory ETC upon thiosulfate oxidation involves the TsdB protein (diheme cytochrome c4 encoded by Alvin_2879 in Alc. vinosum) and possibly the protein Alvin_2880 [66]. Another possible acceptor of electrons from TsdA in the purple sulfur bacteria is the high-potential iron-sulfur protein HiPIP (350 mV). In the T. bogorovii BBS genome, there is no gene encoding TsdA or the alternative archaeal enzyme AoxDA (thiosulfate/quinone oxidoreductase, which is absent in phototrophic prokaryotes).
The second pathway of thiosulfate oxidation is mediated by the Sox system, which is involved in the disproportionation of thiosulfate with the formation of sulfate and molecular sulfur [66]. In this pathway, protein-bound sulfur atoms undergo oxidation. The genes involved in the Sox pathway are present in both Alc. vinosum DSM 180T [32] and T. bogorovii BBS (Table 8): LT988_16035 (SoxY), LT988_16040 (SoxZ), LT988_24235 (SoxB), LT988_24240 (SoxX), LT988_24245 (SoxA), and also LT988_24250 (61% identity with the amino acid sequence of the SoxXA-binding protein from Alc. vinosum DSM 180T (SoxK, Alvin_2170)), as well as LT988_24255 and LT988_09460 (61% and 43% identity, respectively, with the periplasmic sulfur transferase from Alc. vinosum DSM 180T (SoxL, Alvin_2171)). SoxL serves as an alternative to the SoxCD proteins and is typical of bacteria that do not accumulate sulfur. The SoxCD-encoding genes are absent in both T. bogorovii BBS and Alc. vinosum DSM 180T.
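The identity values quoted above (for example, 61% between LT988_24250 and SoxK) come from pairwise protein comparisons. As a minimal sketch of how such a figure can be computed, the function below counts identical residues over the gap-free columns of two already aligned sequences. The function name, the choice of denominator, and the toy sequences are assumptions made for illustration; they are not taken from the paper, and alignment tools such as BLAST may normalize identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment.

    Assumption: both inputs are rows of the same alignment, so they have
    equal length and use `gap` to mark insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where both sequences carry a residue
    matches = 0   # columns where those residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real Sox protein sequences).
toy_query = "MKT-LV"
toy_subject = "MRTALV"
toy_identity = percent_identity(toy_query, toy_subject)
```

In the toy alignment, four of the five gap-free columns match, so `toy_identity` is 80.0. Real comparisons would use full-length protein sequences aligned by a dedicated tool.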
The described SQRs are single-subunit flavoproteins bound to the cytoplasmic membrane. They are classified into six types based on structural data. The SqrA and SqrE types are not found in the purple sulfur bacteria, the SqrF and SqrD proteins are widely distributed among the Chromatiaceae purple sulfur bacteria, and the SqrB type is common among the members of the Ectothiorhodospiraceae family [66]. The primary product of the SQR reaction is polysulfide.
T. bogorovii BBS has the genes LT988_02950 and LT988_15980, annotated as FAD-dependent oxidoreductases (Table 8, reaction 6) according to GenBank and homologous to the genes from Alc. vinosum DSM 180T encoding the SqrD and SqrF proteins, respectively. In contrast to Alc. vinosum DSM 180T, T. bogorovii BBS lacks the genes Alvin_1196 and Alvin_1197, which are located downstream of SqrF and are possibly responsible for attaching the enzyme to the membrane in Alc. vinosum DSM 180T [32].
Sulfide oxidation can involve the flavocytochrome c sulfide dehydrogenase in the periplasmic space; however, the physiological role of this enzyme has not been clarified yet (knockout mutants oxidize sulfide at the same rate [32], and some species of green and purple sulfur bacteria do not produce the protein). T. bogorovii BBS has two genes, LT988_00625 and LT988_00620 (Table 8, Nos. 7 and 8), that are homologous to the genes encoding the FccB and FccA subunits of the flavocytochrome c sulfide dehydrogenase FccAB from Alc. vinosum DSM 180T.

Oxidizing Elemental Sulfur
Elemental sulfur is almost insoluble in water, and it is not fully understood how phototrophic organisms bind, activate, and take up this substrate [66]. Exogenous elemental sulfur mainly consists of S₈ rings, chains of polymeric sulfur, and traces of S₇ rings. The uptake of elemental sulfur by Alc. vinosum DSM 180T seems to require direct contact between the cells and the sulfur; an action of secreted compounds on the substrate outside the cell is possible but considered unlikely.
The purple sulfur bacteria of the Chromatiaceae family accumulate molecular sulfur in the periplasmic space in the form of globules surrounded by a protein coat. In Alc. vinosum DSM 180T, the protein coat is a monolayer consisting of four sulfur globule proteins, SgpABCD, that perform an exclusively structural function and are required for sulfur formation and deposition. SgpA (Alvin_1905, annotated as sulfur globule protein CV1) and SgpB (Alvin_0358, annotated as sulfur globule protein CV2) are similar, have a molecular mass of 10.5 kDa, and can partially substitute for each other. The SgpC protein encoded by Alvin_1325 (annotated as sulfur globule protein CV3) is important for globule expansion. SgpD (Alvin_2515) is the most abundant protein of these globules, as shown by proteomic studies. According to the genome sequencing data, Sgp proteins are present in all purple sulfur bacteria of the Chromatiaceae family in variable combinations [32] and absent in the members of the Ectothiorhodospiraceae family. Earlier, the protein sequences of two large Sgps isolated from Alc. vinosum strain ATCC 17899 (10.5 kDa) were demonstrated to be similar to one of the two Sgp proteins isolated from T. roseopersicina strain SMG219. Likewise, the sequences of the small Sgp proteins from these bacteria (8.5 and 8.7 kDa, respectively) were found to show a certain amount of similarity [68]. Three genes annotated as sulfur globule proteins were discovered in T. bogorovii BBS: LT988_04675, encoding sulfur globule protein CV3; LT988_09360, encoding sulfur globule protein CV1; and LT988_19155, also encoding sulfur globule protein CV1. Two more genes are classified as sulfur globule family proteins (LT988_08020 and LT988_19100). No associated KEGG Orthology function has been identified for these genes.
In Alc. vinosum, sulfur is present in the globules in the form of mono- and bis-organyl sulfanes [69]. The sulfur accumulated in the globules can undergo oxidation after being activated and then translocated to the cytoplasm by a perthiol carrier molecule. There are two known pathways of sulfur oxidation: the Dsr system (which involves the reversible dissimilatory sulfite reductase DsrAB) and the sulfur oxidation pathway involving a system of enzymes similar to heterodisulfide reductase [66].
Low-molecular-weight organic persulfides such as glutathione persulfide can serve as carrier molecules translocating sulfur from periplasmic or extracellular deposits to the cytoplasm [70]. The pathway by which such persulfide carrier molecules are formed has not yet been studied, and it is not known whether specific enzymes and transporters are involved in this process. In Alc. vinosum DSM 180T, sulfur from the globules is translocated to the active center of sulfite reductase DsrAB via a cascade of persulfide intermediates located on the rhodanese, TusA, and probably DsrE2A, DsrE, and DsrC [61]. Along with SoxL, six other genes that can encode rhodanese in Alc. vinosum DSM 180T have been annotated: Alvin_0258, Alvin_0866, Alvin_0868, Alvin_1587, Alvin_2599, and Alvin_3028. Currently, the role of these proteins in dissimilatory sulfur metabolism is not clear. The T. bogorovii BBS genome contains seven genes encoding products annotated as rhodanese-like domain-containing proteins (LT988_07780, LT988_09460, LT988_12845, LT988_16050, LT988_19985, LT988_23355, and LT988_24255), some of which show similarity to the rhodanese genes from Alc. vinosum DSM 180T. None of these proteins from T. bogorovii BBS has been assigned a function according to KEGG Orthology.
Previously, the product of the Rhd_2599 gene (sulfurtransferase Alvin_2599, a rhodanese-like protein) was shown experimentally to serve as a sulfur carrier and to participate in the sulfur transfer involved in oxidative dissimilatory sulfur metabolism [71]. In Alc. vinosum, this protein mobilizes sulfur and transfers it from a low-molecular-weight thiol (probably glutathione) to the TusA protein (Alvin_2600); TusA serves as a sulfur donor for DsrEFH (genes Alvin_1253, Alvin_1254, and Alvin_1255), which in turn persulfurates DsrC (Alvin_1256); persulfurated DsrC is likely to serve as a direct substrate for the reverse sulfite reductase DsrAB. Rhd_2599, TusA, and DsrE2 in Alc. vinosum DSM 180T have been shown to bind sulfur atoms via conserved cysteine residues.
The Dsr proteins of Alc. vinosum DSM 180T are encoded by a single gene cluster, dsrABEFHCMKLJOPNRS, which is located downstream of Alvin_1251 and upstream of Alvin_1265, and by Alvin_2601 (DsrE2). T. bogorovii BBS has the genes dsrABEFHCMKLJOPNRS (spanning LT988_06665 to LT988_06595 in the genome), but the only genes annotated in the same manner are dsrA (LT988_06665) and dsrB (LT988_06660); the other genes were identified using the tBlastn algorithm (Table 9). Furthermore, the genome contains several more genes annotated as TusE/DsrC/DsvC family sulfur relay proteins (LT988_03825, LT988_03875, LT988_06640, LT988_06690, LT988_16185, LT988_16425, and LT988_19515) according to GenBank.
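The locus-tag range quoted above can be checked for consistency with the length of the dsr cluster name. The sketch below is illustrative only: it assumes that locus tags in this genome are numbered in steps of 5 and that no other genes interrupt the run, neither of which is stated in the text. The helper names `split_cluster` and `locus_run` are hypothetical, and Table 9 remains the authority for the actual gene-to-tag assignments.

```python
def split_cluster(name: str, prefix_len: int = 3) -> list[str]:
    """Expand a compact cluster name such as 'dsrAB' into ['dsrA', 'dsrB']."""
    prefix, letters = name[:prefix_len], name[prefix_len:]
    return [prefix + letter for letter in letters]


def locus_run(prefix: str, first: int, last: int,
              step: int = 5, width: int = 5) -> list[str]:
    """List locus tags from `first` to `last` inclusive.

    Assumption: tags are numbered in fixed increments of `step`
    (not confirmed by the source). Works for ascending or descending runs.
    """
    signed_step = -step if first > last else step
    numbers = range(first, last + signed_step, signed_step)
    return [f"{prefix}_{n:0{width}d}" for n in numbers]


genes = split_cluster("dsrABEFHCMKLJOPNRS")
tags = locus_run("LT988", 6665, 6595)

# Compare the number of genes in the cluster name with the number of
# locus tags implied by the quoted range under the step-of-5 assumption.
same_length = len(genes) == len(tags)
```

Under these assumptions both lists contain 15 entries, so the range is at least compatible with one locus tag per dsr gene. The sixth tag in the run is LT988_06640, which the text also lists among the TusE/DsrC/DsvC family annotations; this is consistent with, but does not establish, its assignment as dsrC.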
The results obtained were compared with the data on the possible pathways of sulfur oxidation in purple sulfur bacteria summarized in the review [61]. Figure 3 shows a scheme of the reactions of sulfur metabolism that were revealed in T. bogorovii BBS at the genomic level.

Conclusions
The phototrophic purple sulfur bacterium T. bogorovii BBS, which belongs to the Chromatiaceae family, was isolated from the White Sea Estuary in 1969 [1]. In this work, we have obtained, assembled, and published the whole genome of T. bogorovii BBS (GenBank: CP089309.1). Analysis of the genome sequence allowed us to validate the systematic position of the bacterium and to confirm the appropriateness of classifying it as a new typical species of the genus Thiocapsa, as proposed by Tourova et al. [31]. Some genes involved in pigment synthesis in T. roseopersicina BBS have been described previously, but identification of genes outside the genomic DNA fragment examined by the authors was required. By mining the whole genome sequence of T. bogorovii, we found eleven additional genes coding for proteins involved in pigment biosynthesis.
We employed the functional gene annotation capabilities of the KEGG database and the Rapid Annotation using Subsystem Technology software for genome analysis (RAST; version 2.0), performed a comparative analysis with the genome of the purple bacterium Alc. vinosum DSM 180T, and manually searched for several genes using the tBlastn algorithm. As a result, we located and functionally characterized the genes in the genomic DNA that determine the function of the central metabolic pathways in T. bogorovii BBS. Two copies of the RuBisCO green-like form I genes allow RuBisCO-mediated CO₂ fixation in this bacterium. The genes are located far apart and have different neighborhoods. In one case, the gene is flanked by the genes encoding RuBisCO activation proteins. In the other case, the genes encoding carboxysome proteins and the regulator of RuBisCO operon transcription are located downstream of the RuBisCO structural genes. Along with concentrating CO₂, carboxysomes protect RuBisCO from O₂, which competes with CO₂. However, there is no information in the pertinent literature about carboxysomes in the cells of this bacterium. The genome of the bacterium under scrutiny also contains a gene belonging to the type IV RuBisCO-like proteins, which are not involved in CO₂ fixation. According to the KEGG Pathway database, T. bogorovii BBS has all of the genes encoding the enzymes that are required for the TCA and glyoxylate cycles.
In earlier works, T. bogorovii BBS was demonstrated to have the reductive TCA cycle, which is supported by the genomic data in the present work. The cycle is open due to the absence of the enzymes required to substitute for the irreversible citrate synthase of the oxidative TCA cycle.
The T. bogorovii BBS genome contains the genes of the Embden-Meyerhof-Parnas pathway, pyruvate oxidation, and gluconeogenesis for the synthesis and conversion of carbohydrates to phosphoenolpyruvate, pyruvate, and acetyl-CoA. T. bogorovii BBS has no genetic potential for the Entner-Doudoroff pathway.
We manually searched for hydrogenase genes and determined their locations in the bacterial genome. These locations are fully consistent with previously published data.
We have identified the genes for the cytochrome bd and cbb3 oxidases required for chemotrophic growth of the bacterium in the presence of oxygen.
The ability to fix atmospheric molecular nitrogen is provided by the nifHDK genes, which encode the Mo-containing nitrogenase complex.
T. bogorovii BBS lacks some of the genes required for assimilatory nitrate reduction, denitrification, and nitrification. However, it has the potential for dissimilatory nitrate reduction. This finding is new and is not yet supported by biochemical data.
It is known that the HydSL hydrogenase from T. bogorovii BBS can be easily immobilized on carbon electrodes with a proper orientation between the distal FeS cluster and the electrode, producing high current densities [72]. Taking into account that this hydrogenase is stable and can work even at 70 °C, one can conclude that it is a very promising enzyme for use in enzymatic hydrogenase electrodes. Several schemes of HydSL hydrogenase participation in the metabolism of T. bogorovii BBS exist [14,66,73]. All of them indicate that the HydSL hydrogenase in purple bacteria is closely connected with sulfur metabolism but, unfortunately, none of them takes into account all valid experimental data. We have genetically analyzed the sulfur metabolism of T. bogorovii BBS and compared it with the best-studied sulfur metabolism, that of Alc. vinosum [66]. This knowledge is important for understanding the mechanisms involved in electron transfer between hydrogenases and their native partners. The automatic annotation of the whole genome of the bacterium T. bogorovii BBS was refined, and the genomic positions of several studied genes, including those involved in sulfur metabolism, were localized. The bacterium has no genetic potential for assimilatory sulfate reduction, but it has the genes required for dissimilatory sulfate reduction and thiosulfate oxidation by the Sox complex. Figure 3 shows a scheme of the reactions of sulfur metabolism revealed in T. bogorovii BBS at the genomic level.
The functional potential of the central metabolic pathways identified at the genome level in the bacterium requires further experiments to confirm the activity of these pathways at the transcriptional and enzymatic levels. Some gene sequences are present in the T. bogorovii BBS genome in several copies. Further research is required to confirm the functions of the proteins encoded by the identified genes.

Figure 2. Functional annotation of the genomic DNA of T. roseopersicina BBS.


Figure 3. The scheme of sulfur metabolic pathways in purple sulfur bacteria. The proteins encoded by genes from the annotated sequence of T. bogorovii BBS with names different from those in Alc. vinosum DSM 180T are shown in red; the proteins with the same names are shown in green; the proteins absent in T. bogorovii BBS are shown in brown; currently unidentified proteins are shown in gray.


Table 1. Comparative genome statistics for the purple sulfur bacteria T. roseopersicina BBS and Alc. vinosum DSM 180T.

Table 2. The level of similarity between the T. roseopersicina BBS genome and other typical species of the Thiocapsa genus *.

Table 3. Comparison of major gene categories * encoded by the genomes of T. bogorovii BBS and Alc. vinosum DSM 180T.

Table 4. The genes encoding the enzymes of specific carotenoid synthesis, light-harvesting photosynthetic complexes, and genes related to bacteriochlorophyll a in T. bogorovii BBS. * indicates that the information is absent.

Table 5. Cont. * indicates that the sequence of a gene is absent in the genome in question.

Table 7. The genes encoding the enzymes of the pentose phosphate pathway (1-7) in T. bogorovii BBS.