Cytochrome cM Is Probably a Membrane Protein Similar to the C Subunit of the Bacterial Nitric Oxide Reductase

Abstract: Cytochrome cM was first described in 1994, and its sequence has since been found in the genomes of many cyanobacterial species. Numerous studies have sought to determine its function, but none has yielded conclusive results so far. Many of these studies are based on the assumption that cytochrome cM is a soluble protein located in the thylakoid lumen of cyanobacteria. In this work, we have reevaluated the sequence of cytochrome cM, and our results show that its most probable 3D structure is strongly similar to that of the C subunit of the bacterial nitric oxide reductase. The potential presence of an α-helix tail, which could anchor this protein in the thylakoid membrane, further supports this hypothesis, thus pointing to a new, unexpected role for this redox protein.


Introduction
Electron transport chains play a fundamental role in the metabolism of all living beings. They consist of a series of redox reactions, spatially separated, in which electrons are transferred from a donor molecule to an acceptor one. In phototrophic eukaryotes, the respiratory and photosynthetic electron transport chains are spatially separated into specialized organelles (mitochondria and chloroplasts). However, in cyanobacteria, both electron transport chains are located in the thylakoid membranes and share some components, such as plastoquinone (PQ), plastocyanin (Pc) or cytochrome (Cyt) c6 and the Cyt b6-f complex.
In cyanobacteria, two c-type cytochromes involved in the electron transport chains have been described: Cyt c6 and Cyt c550. While Cyt c6 is a soluble electron carrier involved in both photosynthesis and respiration, Cyt c550 is structurally bound to photosystem (PS) II and is not directly involved in electron transport [1]. One further c-type cytochrome has been described in cyanobacteria: the so-called Cyt cM [2]. The gene encoding Cyt cM is conserved in nearly every sequenced cyanobacterium [3,4], and it has been studied repeatedly in the unicellular organism Synechocystis sp. PCC 6803 [5,6]. In Synechocystis, the annotated gene contains two methionine codons (Figure 1), of which the distal one was proposed to be the actual start codon because it is preceded by a GA-rich motif, i.e., a probable ribosome binding site [2,6]. The sequence further comprises a hydrophobic amino-terminal region that was assumed to constitute either a membrane anchor or a signal peptide. However, owing to its close similarity to the signal peptide of Cyt c6 from Anabaena sp. PCC 7119, the latter was regarded as the most probable option [6].
Figure 1. The two potential translation initiation sites are underlined. The black triangle separates the hydrophobic region (orange) from the hydrophilic one (blue) and marks the point at which the signal peptide was presumed to end, i.e., right after the VLA motif (boxed). The characteristic CXXCH heme-binding motif (boxed) can also be observed.
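The CXXCH heme-binding motif mentioned above is simple enough to locate programmatically. The following is a minimal sketch, not part of the original analysis: the function name is illustrative, and the sequence is a hypothetical toy example, not the real Cyt cM sequence.

```python
import re

# Hypothetical toy sequence for illustration only; NOT the real Cyt cM sequence.
TOY_SEQUENCE = "MKLLAVLAGLLVLAAPSDGAALFSANCASCHMGGK"

# CXXCH: cysteine, any two residues, cysteine, histidine.
# The lookahead lets overlapping occurrences be reported as well.
HEME_MOTIF = re.compile(r"(?=(C..CH))")


def find_heme_motifs(sequence):
    """Return (1-based position, matched residues) for every CXXCH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEME_MOTIF.finditer(sequence.upper())]


print(find_heme_motifs(TOY_SEQUENCE))
```

Applied to a real sequence, the reported position indicates where the heme-binding site lies relative to the hydrophobic N-terminal region.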
Cho et al. [5] were the first to successfully detect the protein in cyanobacteria through Western blot analysis. Nevertheless, their results were not clear enough to conclude whether or not the protein retained the N-terminal hydrophobic domain in vivo, since the molecular weight of the protein detected by Western blotting did not match either of the two possible options [6]. As for the protein location, even though Bernroitner et al. [3] claimed to have determined that Cyt cM is located in the thylakoid and plasma membranes, these results were later questioned [7] given the high likelihood that cross-contamination between membranes had occurred. Since then, no other author has claimed to have established where in the cell the protein is located, probably because of its extremely low expression levels under normal conditions [8].
Based on the hypothesis that the actual Cyt cM protein sequence starts with the second methionine and that it contains a signal peptide targeting it to the thylakoid lumen (which would subsequently be cleaved off, generating a soluble globular protein), several authors have tried to determine its physiological function. Because Cyt cM was presumed to be an electron carrier similar to Cyt c6, speculation about its biological role has mainly linked it to the respiratory and photosynthetic electron transport chains. The fact that deletion mutants were able to carry out both processes without any significant phenotype [2] ruled out the possibility of Cyt cM being a main component of either system. However, there remained a chance that Cyt cM participated in respiration or photosynthesis as a secondary element, i.e., as a backup component that is only expressed under certain conditions to substitute or complement a major constituent. Interestingly, the expression levels of Cyt cM sharply increase under stress conditions such as low temperature and high-intensity light, whereas those of the two main electron carriers, plastocyanin (Pc) and Cyt c6, decrease [9]. Despite some further evidence indicating a "spare component" role for Cyt cM in photosynthesis [10,11], we ruled this out by showing that the surface electrostatic potential of Cyt cM is too different from those of Pc and Cyt c6 and by proving that the kinetic interaction constant between Cyt cM and PS I, to which Pc and Cyt c6 donate electrons, is negligible [6]. Moreover, the low redox potential of Cyt cM (+150 mV) makes it thermodynamically unlikely that it can be reduced by Cyt f (with a redox potential of +320 mV), as is the case with Pc and Cyt c6 [6,12]. Consequently, only respiration remained as a real possibility.
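The thermodynamic argument above can be made explicit with the standard relation ΔG = -nFΔE. The sketch below is illustrative only: it assumes a one-electron transfer and treats the midpoint potentials quoted in the text as if they were the operating potentials, which is an approximation. The function name is hypothetical.

```python
FARADAY_C_PER_MOL = 96485.332  # Faraday constant, C/mol


def electron_transfer_delta_g(donor_mv, acceptor_mv, n_electrons=1):
    """Free energy change (kJ/mol) for electron transfer from donor to acceptor.

    Potentials are midpoint redox potentials in mV; a positive result means
    the transfer is thermodynamically uphill under these assumptions.
    """
    delta_e_volts = (acceptor_mv - donor_mv) / 1000.0
    return -n_electrons * FARADAY_C_PER_MOL * delta_e_volts / 1000.0


# Midpoint potentials quoted in the text (mV).
CYT_F_MV = 320
CYT_CM_MV = 150

dg = electron_transfer_delta_g(CYT_F_MV, CYT_CM_MV)
print(f"Cyt f -> Cyt cM: {dg:+.1f} kJ/mol")  # prints "Cyt f -> Cyt cM: +16.4 kJ/mol"
```

A positive value of roughly 16 kJ/mol per electron indicates an unfavorable transfer. The actual driving force in vivo also depends on the ratio of oxidized to reduced species, which this sketch ignores.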
Initially, it was suggested that Cyt cM could reduce respiratory terminal oxidases (COX), since this was kinetically plausible [3] and it was impossible to create a ∆cytM/PSI double mutant [10]. However, the hypothesis of Cyt cM working as an electron donor in respiration was dismissed once it was observed that ∆cytM mutants in fact have a higher respiration rate under dark heterotrophic conditions [13], and that the lack of COX does not affect this at all [8].
In conclusion, none of the hypotheses formulated to date regarding the physiological role or the cellular location of Cyt cM has proved solid, and the matter remains an open debate. Since most attempts to determine either of these have been based on the assumption that it is a small soluble protein located in the thylakoid lumen, we have reevaluated the sequence and structure of Cyt cM in the present work. Our results suggest that, contrary to what was previously thought, Cyt cM could be a membrane protein with a structure similar to that of the C subunit of bacterial nitric oxide reductase.
The free software UCSF Chimera [19] (https://www.cgl.ucsf.edu/chimera/, version 1.15, accessed on 11 May 2021) was used to visualize protein 3D structures and to determine hydrophobicity and electrostatic potential, as well as for distance measurements and structure matching. For the electrostatic potential, the "Coulombic Surface Coloring" option was chosen. For comparison with NorC, we used the structure of nitric oxide reductase of Pseudomonas aeruginosa [20], which we obtained from the Protein Data Bank (https://www.rcsb.org/; PDB entry: 3O0R, accessed on 14 May 2021).
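As an illustration of how residue hydrophobicity can be quantified along a sequence, the following minimal sketch computes a sliding-window hydropathy profile. It assumes the Kyte-Doolittle scale, which is a common choice but may not match the settings used in this work. The window size and the toy sequence are hypothetical and are not taken from the data analyzed here.

```python
# Kyte-Doolittle hydropathy index (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy sequence: a hydrophobic "tail" followed by a polar "body".
TOY_TAIL = "LLAVLAGLLVLAA"
TOY_BODY = "DKSEGRNDQEKTS"

profile = hydropathy_profile(TOY_TAIL + TOY_BODY)
print(f"first window: {profile[0]:+.2f}, last window: {profile[-1]:+.2f}")
```

A sustained run of strongly positive windows at the N-terminus, followed by a drop to negative values, is the kind of pattern expected for a hydrophobic segment attached to a hydrophilic domain.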

Results and Discussion
To determine whether the N-terminal part of the protein corresponds to a signal peptide or a transmembrane helix, we retrieved all the protein sequences corresponding to the cytM gene that appear in either of the two main repositories of annotated cyanobacterial genes: CYORF and CyanoBase. A total of 18 sequences were obtained (Table 1) and then subjected to structural prediction with Phyre2, which uses a homology recognition (threading) approach.
The first striking result was that 15 out of 18 sequences produced almost identical structures with a long, protruding α-helix (or "tail") corresponding to the initial section of the protein (Figure 2A). This is precisely the region whose nature (signal peptide or transmembrane domain) has not been clearly established to this day. In fact, the VLA motif, repeatedly proposed as the cleavage site of the signal peptide, is located exactly at the junction of the tail with the "body" of the protein in the newly proposed structure (Figure 2B). The length of this helix, 5.1 nm (the average membrane being between 5 and 10 nm wide; Figure 2C), and the remarkably hydrophobic nature of this section (Figure 2D) are consistent with a membrane-anchoring function.
Table 1 footnotes: The "Phyre2" column states whether the structure predicted by Phyre2 for that specific sequence matched most of the structures predicted for the other sequences or differed from the rest. * The first model proposed by Phyre2 did not match the rest, but the second one (the next most probable according to the software's algorithm) did. ** None of the models proposed for this sequence by Phyre2 matched the rest.
At this point, it is important to note that the signal peptides of Cyt c6 and Cyt c6-like proteins from the cyanobacterium Anabaena sp. PCC 7119, which have a processing site (AXA) downstream from the N-terminal end, are recognized by E. coli. This results in transport of the proteins to the periplasmic space, a subcellular compartment equivalent to the thylakoid lumen, where the signal peptide is correctly processed [21-23]. However, the hydrophobic tail of the Synechocystis Cyt cM is not recognized by E. coli as a signal peptide. For this reason, it has only been possible to express the soluble region of Synechocystis Cyt cM in E. coli by replacing its hydrophobic tail with the signal peptide of Anabaena Cyt c6 [6].
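As a rough plausibility check on the 5.1 nm helix length reported above, the number of residues in an ideal α-helix of that length can be estimated from the canonical rise of about 0.15 nm per residue. This is a back-of-the-envelope sketch, not part of the original analysis, and the function names are illustrative.

```python
RISE_PER_RESIDUE_NM = 0.15  # canonical axial rise per residue in an ideal α-helix


def helix_residue_count(length_nm):
    """Approximate number of residues in an ideal α-helix of the given length."""
    return round(length_nm / RISE_PER_RESIDUE_NM)


def helix_length_nm(residue_count):
    """Approximate length of an ideal α-helix with the given number of residues."""
    return residue_count * RISE_PER_RESIDUE_NM


TAIL_LENGTH_NM = 5.1  # helix length reported in the text
print(f"{TAIL_LENGTH_NM} nm corresponds to about "
      f"{helix_residue_count(TAIL_LENGTH_NM)} residues")  # about 34 residues
```

Real helices deviate from ideal geometry, so this figure is only an order-of-magnitude guide to how long the hydrophobic tail would need to be.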
The template protein used by Phyre2 to calculate those 15 structures turned out to be the C subunit of nitric oxide reductase (NorC) from Pseudomonas aeruginosa. We therefore compared the returned models with the original template. Surprisingly, the overlap between both proteins was clear and almost total (Figure 3A). To confirm this, a second, independent prediction for the same 15 sequences was performed using I-TASSER, a different structure prediction program that incorporates an ab initio approach in its algorithm. In this new test, the structures obtained were, once again, almost identical to that of NorC (Figure 3B). We obtained similar results once more using AlphaFold, a novel deep learning algorithm that incorporates physical and biological knowledge about protein structure and leverages multiple sequence alignments [18]. Crucially, NorC is a subunit of nitric oxide reductase that has never been found in cyanobacteria, whereas the core subunit NorB is present in several cyanobacterial species [24] and its expression decreases in ∆cytM mutants [7]. Furthermore, the expression of Cyt cM sharply increases under oxidative stress [9], a condition in which nitric oxide reductase is highly active [25].
In order to determine similarities in charges between Cyt cM and NorC, we subsequently evaluated their surface electrostatic potential (Figure 4).
In this respect too, we found similarities between Cyt cM and NorC that might be indicative of a similar function, with Cyt cM acting as a functional part of a nitric oxide reductase and therefore being involved in the oxidative stress response. These included a positively charged tip at the end of the tail, a negatively charged region on the part of the body closest to the tail, another positively charged region on the opposite side of the body, and a further negatively charged pocket on the bottom. There are some clear differences too, such as a more extensive positive zone in the case of NorC, but the resemblance between the two structures is again clear. Moreover, the midpoint redox potentials of Cyt cM and NorC, a fundamental parameter for establishing the function of redox proteins, are relatively close, at +150 mV and +183 mV, respectively [6,26]. Interestingly, this contrasts with the major divergences previously found between the surface electrostatic potential and midpoint redox potential of Cyt cM and those of Cyt c6 and Pc [6].
Figure 3. Comparison of the predicted structures of Cyt cM and NorC. (A) Structure of Cyt cM predicted by Phyre2 (blue) vs. NorC (red). (B) The same Phyre2 prediction (blue) vs. the one by I-TASSER (yellow). In both panels, the structure predicted by Phyre2 is that of Cyt cM from Nostoc sp. PCC 7120, as in Figure 2.