A Hypothesis on How the Azolla Symbiosis Mitigates Nitrous Oxide Based on In Silico Analyses

Abstract: Nitrous oxide is a long-lived greenhouse gas that exists for 114 years in the atmosphere and is 298-fold more potent than carbon dioxide in its global warming potential. Two recent studies showcased the utility of Azolla plants for a lesser footprint in nitrous oxide production from urea and other supplements to the irrigated ecosystem, which mandates exploration since there is still no clear solution to nitrous oxide in paddy fields or in other ecosystems. Here, we propose a solution based on the evolution of a single cytochrome oxidase subunit II protein (WP_013192178.1) from the cyanobiont Trichormus azollae that we hypothesize to be able to quench nitrous oxide. First, we draw attention to a domain in the candidate protein that is emerging as a sensory periplasmic Y_Y_Y domain that is inferred to bind nitrous oxide. Secondly, we draw the phylogeny of the candidate protein, showcasing the poor bootstrap support of its position in the wider clade and hence its deviation from the core function. Thirdly, we show that the NtcA protein, the apical N-effecting transcription factor, can putatively bind to a promoter sequence of the gene coding for the candidate protein (WP_013192178.1), suggesting a function associated with heterocysts and N-metabolism. Our fourth point involves a string of histidines at the C-terminal extremity of the WP_013192178.1 protein that is missing in all other T. azollae cytochrome oxidase subunit II counterparts, suggesting that such histidines are perhaps involved in forming a Cu center. As the fifth point, we showcase a unique glycine-183 in a lengthy linker region containing multiple glycines that is absent in all proximal Nostocales cyanobacteria, which we predict to be a DNA binding residue. We propose a mechanism of action for the WP_013192178.1 protein based on our in silico analyses.
In total, we hypothesize the incomplete and rapid conversion of a likely heterocystous cytochrome oxidase subunit II protein to an emerging nitrous oxide sensing/quenching subunit based on bioinformatics analyses and past literature, which can have repercussions to climate change and consequently, future human life.


Introduction-Nitrous Oxide, Climate Change and Azolla
Before the oxygenation of the ancient world billions of years ago, prokaryotic lifeforms had to contend with nitrous oxide that arose from corona discharge rather than from lightning strikes (although some literature attributes nitrous oxide emanation to streaks of lightning), which anucleated microorganisms reduced using an enzyme designated as nitrous oxide reductase [1]. Nitrous oxide is a greenhouse gas that subsists in the atmosphere for 114 years and is 298-fold more potent than the benchmark gas, carbon dioxide, in global warming potential [2,3]. Nitrous oxide reductases are known to share the same protein fold as cytochrome oxidases (subunit II), the former being the more ancient and the latter having evolved from that ancient fold, although this is now disputed [4]. There are two biochemical pathways for the formation of nitrous oxide: the conversion of nitric oxide to nitrous oxide by nitric oxide reductases, and ammonia oxidation [5]. Although nitrous oxide reductases are mostly found in heterotrophic bacteria that are capable of reducing nitrous oxide to nitrogen gas, they are also found in Archaea and in select photosynthetic cyanobacteria. There is little information on cyanobacteria that carry the nosZ gene, whose product catalyzes the formation of nitrogen gas from nitrous oxide and is known to be present in cyanobacteria [6]. Therefore, it would be interesting to search for candidate nitrous-oxide-reducing enzymes in cyanobacteria, even ones for which the catalytic activity on nitrous oxide is secondary.
In cyanobacteria, there are two kinds of cytochrome oxidases: one that is expressed in all types of cells and ones that are specific for heterocysts, the cellular sites for nitrogen fixation [7]. They carry the designations heterocyst-specific coxBACII operon and globally expressed coxBAC1 operon, with the former found in heterocysts as well as pro-heterocysts [7]. Nitrogenases, which are localized inside heterocysts of cyanobacteria, are known to be cross-reactive to a myriad of gases, namely nitrogen, nitrous oxide, acetylene and carbon monoxide, which can become substrates for the molybdenum-type nitrogenases [8]. Nitrous oxide, just like oxygen, is a highly oxidative gas, and the ambient presence of nitrous oxide from biological and anthropogenic sources can lead to the evolution of systems that are capable of using nitrous oxide as an effector or substrate. One such source of nitrous oxide is urea, which was first synthesized in 1828 by Friedrich Wöhler and has carried rice cultivation on its shoulders for 150 years since the advent of the Industrial Revolution, especially the urea-responsive elite cultivars that were developed at the International Rice Research Institute (IRRI) during the Green Revolution of the 1960s-70s.
A conspicuous sight in irrigated rice fields is water ferns of the genus Azolla, which are omnipresent in most parts of Asia, particularly in rice belt countries such as Vietnam and Thailand. Azolla has been used for 1000 years or more in rice cultivation, and farmers depend on Azolla for growing rice cultivars because of a cyanobiont residing inside Azolla plants that is known for highly efficient nitrogen fixation [9]. The history of Azolla cultivation goes back to before urea was used, and the combined treatment of Azolla and urea is now a common sight owing to attempts at Azolla-mediated mitigation of greenhouse gases in response to rising global warming and the resultant effects of climate change [10,11]. The "how" of nitrous oxide mitigation by Azolla has remained elusive thus far. This communication aims to address that gap by dissecting the biology using bioinformatics.
Recent studies, together with the holistic nature of the Azolla system, have predicted and shown the advantages of using Azolla as a complete substitute for or as a supplement to urea [10,11]. Azolla has a cyanobiont living inside its fronds that has been assigned to three genera, namely Anabaena, Nostoc and Trichormus [12]. The cyanobiont is obligate, and it is commonly inferred that it is in transit from being a free-living microorganism to an endosymbiont and ultimately to plastid status [13]. The cyanobiont inside Azolla possesses a molybdenum-reliant nitrogenase enzyme, which furnishes copious amounts of ammonia from nitrogen gas in an energetic and efficient transformation [12].
It is mandatory that a solution to nitrous oxide is found from the biological arsenals that evolution has created. We show here in this bioinformatics manuscript that the Azolla symbiotic system can theoretically be a putative remedy for nitrous oxide emanation. We are especially interested in domain architecture of a single sensor protein, cellular localization, promoters, tertiary structures, linker regions, phylogeny, heterocyst microenvironment and physiology and plasticity of ancient folds that may have repercussions on accompanying an emerging function that we hypothesize is gaining in strength. The purpose of this manuscript is to connect the dots of theory as to how Azolla may be able to quench and possibly transform nitric/nitrous oxide as a solution to elevated nitrous oxide, as well as aiding scientists to better fashion such systems using newer methods such as CRISPR-Cas9, which may be able to enhance the function of an emerging protein.

Rationale for an In Silico Study of a Candidate Protein from Trichormus azollae
WP_013192178.1 was earlier identified by the first author of this study, using sequence comparison to a nosZ-containing cyanobacterial protein [3]. The comparison between WP_013192178.1 and the nosZ-containing protein yielded ~31.43% identity and ~55% sequence coverage [3]. Note that the assignment of nomenclature was based on computational methods and not wet experiments. According to the SCOP (structural classification of proteins) classification, proteins that share >30% sequence identity belong to the same protein/enzyme family, with most arbitrary cutoffs for coverage taken as >50% [14]. According to SCOP, members of a protein superfamily have low sequence conservation (<30%) and share a common evolutionary origin, which puts nitrous oxide reductases and cytochrome oxidases (subunit II) in the same superfamily, although the 31.43% sequence identity and 55% sequence coverage identify the WP_013192178.1 protein as belonging to the same family as the nitrous oxide reductases (nosZ gene products) [3]. It seems that this cytochrome oxidase subunit 2 has partial features of a nitrous oxide reductase. The complete set of T. azollae cytochrome oxidase subunit 2 proteins is shown in Table 1. The C-terminal copper center of nitrous oxide reductases is conserved in the candidate protein under study in this in silico exploration (Figure 1). In two recent studies, the researchers showed a significant reduction in nitrous oxide in urea/biochar-treated rice paddies in the presence of grown Azolla, which suggests that the Azolla cyanobiont may have a means to negate the nitrous oxide that is produced due to urea application and denitrification [10,11]. We suggest here that WP_013192178.1 is capable of nitrous oxide sensing and/or quenching, and we use bioinformatics and literature to further our claims.
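The family-versus-superfamily reasoning above can be written as a short sketch. The function name `scop_level` and its default cutoffs are illustrative assumptions that simply encode the >30% identity and >50% coverage rule of thumb quoted in the text; this is not an official SCOP procedure.

```python
def scop_level(identity_pct: float, coverage_pct: float,
               identity_cutoff: float = 30.0,
               coverage_cutoff: float = 50.0) -> str:
    """Rule-of-thumb assignment from pairwise identity and coverage.

    The cutoffs are the arbitrary thresholds quoted in the text (assumed
    defaults), not a definitive classification method.
    """
    if identity_pct > identity_cutoff and coverage_pct > coverage_cutoff:
        return "family"
    return "superfamily or more distant"


# Values reported for WP_013192178.1 versus the nosZ-containing protein [3].
print(scop_level(31.43, 55.0))
```

Because the reported identity lies only slightly above the 30% cutoff, the assignment is sensitive to the alignment method used, which is one reason the text treats it as a hypothesis.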
On this note, key studies have demonstrated that nitric oxide/nitrous oxide can interact with terminal cytochrome oxidases at sites found within the structure of the enzyme, although it is not the primary substrate [15]. Nitric oxide has been shown to be a substrate, inhibitor and effector of candidate cytochrome oxidases (Figure 2) [16]. At lower concentrations of oxygen, as is the case for heterocysts, the cytochrome oxidases are reduced, which allows NO to accumulate in the near vicinity (microenvironment) of the enzyme rather than being consumed [16]. The above sequestering reaction has been termed physiologically relevant.
It is also proposed that nitrous oxide may be converted into dinitrogen gas in the presence of enzymes [15]. In fact, both mammalian (bovine) and plant (bean) mitochondrial cytochrome oxidases have been shown to be inhibited by nitrous oxide [15]. The chemistry of the interaction between cytochrome oxidase II and nitrous oxide is shown to be participatory to multiple sites of the bovine heart cytochrome oxidase, although the method of participation is not suggested to involve ligand donation or oxidase function [15]. The cross-reaction of cytochrome oxidases with nitrous oxide has been reported in other studies, as a measure of concentration of cytochromes.
Therefore, the rationale of this study is that the SCOP-based assessment of likely family and putative (surrogate) sensor activity, i.e., nitric/nitrous oxide sensing, requires exploration in silico first, prior to the study of biological and biochemical parameters through lab-based experimentation. This is such an attempt.
Figure 1. The T. azollae WP_013192178.1 aligned against two nitrous oxide reductases (nosZ gene products), showcasing the key residues (in color) that are conserved in the motif H-X34-CXEXCX3HX2M. The putative Y_Y_Y domain is shown in a box. Stars indicate identical fully-conserved residues between the three sequences.
Figure 2. The likely functional aspects of cytochrome oxidases as a consequence of oxygen, nitric oxide and oxidative potential. The above illustration was adapted partially from a past drawing [16].
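The conserved motif H-X34-CXEXCX3HX2M named in the Figure 1 caption can be expressed as a regular expression and searched for in any protein sequence. The sketch below is a minimal illustration: the function name `find_copper_motif` and the toy sequence are hypothetical, and the real WP_013192178.1 sequence is not reproduced here.

```python
import re

# H-X34-CXEXCX3HX2M written as a regular expression ('.' = any residue).
COPPER_MOTIF = re.compile(r"H.{34}C.E.C.{3}H.{2}M")


def find_copper_motif(seq: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) positions of non-overlapping motif matches."""
    return [(m.start() + 1, m.end()) for m in COPPER_MOTIF.finditer(seq.upper())]


# Toy sequence (hypothetical): two flanking residues followed by one motif instance.
toy = "GG" + "H" + "A" * 34 + "CAEAC" + "AAA" + "H" + "AA" + "M"
print(find_copper_motif(toy))
```

A match only shows that the spacing of the candidate ligand residues is present; it does not by itself demonstrate copper binding.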

The Sensory Periplasmic Y_Y_Y Domain
A unique sequence motif yielded a stretch of three amino acids (KTP) in a short loop and a single helix (GDYSLI). The above nine-residue sequence motif is distinct from the same stretch in closely related proteins, and the highlighted mutations (TP and S) are not all simple wobble (third) position mutations, which emphasizes their integral and likely functional nature. The above stretch of nine amino acids is found in a region containing a periplasmic sensor Y_Y_Y domain (residues 260 to 284) that is shown in Figure 3, the function of which has remained elusive to the scientific community, although it is said to be involved in bacterial signal transduction. The periplasmic localization of this Y_Y_Y domain tallies with the partial periplasmic localization of the WP_013192178.1 protein, which consolidates the insights drawn from motif analyses. The Y_Y_Y domain has been associated with nitric oxide sensing, and the modification of this domain in the WP_013192178.1 protein hints at a sensory function for nitric oxide [17]. The Y_Y_Y domain is only found in the WP_013192178.1 protein and is missing in the closest allied protein sequences in T. azollae that it aligns with, suggesting a recent modification to transform into a nitric/nitrous oxide sensor as suggested in this study. Out of 506 Y_Y_Y-containing proteins in the PFAM database, 237 contain the histidine kinase domain outside of the Y_Y_Y domain [18]. Other domains found with Y_Y_Y motifs are the GGDEF adenylyl cyclase signaling domain and the SpoIIE sporulation domain, while 26/506 Y_Y_Y proteins were not two-component regulators in sequence [18].
When the 24-residue Y_Y_Y domain of the WP_013192178.1 protein is searched against the family "cyanobacteria" using the BLASTp search tool, the results reveal many proteins, with the top two being cytochrome oxidase subunit II proteins from Nostoc sp. FACHB-110 that, when searched for putative conserved domains, contained the Y_Y_Y domain. Full-length sequences in the list "after" Nostoc sp. FACHB-110 (Figure 4) did not contain the Y_Y_Y domain. Nostoc sp. FACHB-110 is a cyanobacterium that is found as "organism specific" in China. Therefore, we present a hypothesis here that all three cytochrome oxidases that have a putative Y_Y_Y domain are likely to have undergone some degree of evolution to handle nitric/nitrous oxide from the immediate ecosystem. Whether the nitric oxide/nitrous oxide binding is used for the concomitant transformation of the gases cannot be inferred from this study alone. None of the other full-length cytochrome oxidase subunit II proteins in Trichormus azollae contain the Y_Y_Y domain (Figure 5).
Figure 4. [...] Beyond Nostoc sp. FACHB-110 (sequences 2 and 3 above), a cyanobacterium found in China, the Y_Y_Y domain is absent, i.e., only three cyanobacterial sequences are likely to contain the putative Y_Y_Y domain in cytochrome oxidase subunit II proteins, and all appear to be associated with the rice-growing belt, namely T. azollae and Nostoc sp. FACHB-110 (Note: this is when the full-length sequence of the corresponding protein is searched for likely functional domains and not domains by themselves).

The Phylogeny of the Cytochrome Oxidase Subunit II Enzymes
Phylogeny was inferred using the maximum likelihood method with bootstrap support from 500 replicates. Phylogenetic kinships revealed WP_013192178.1, the cytochrome oxidase subunit 2 sequence of T. azollae, as an evolving protein and nosZ gene products of bacterial candidates as supplementary enzymes that evolved from a single origin ( Figure 6). A single archaeal contender was used as the outgroup to anchor the phylogenetic tree ( Figure 6). Cluster 3 of the phylogenetic tree contained sequences from all three cyanobionts, with an internal duplication event (of N. punctiforme) ( Figure 6). There is high bootstrap support for Cluster 1 and Cluster 3, intraspecifically, and this shows that the phylogenetic inferences are likely to be accurate for the above eight sequences, organized as two clusters. Cluster 1 appears to have committed to terminal outcomes in all three cyanobiont species using two consecutive symmetrical bifurcation events (both prior to Bryophyte evolution), but not cluster 2 and 3, which evolved in three stages. As for cluster 2, which too has conserved members of the three cyanobionts, there appears to be two evolutionary outcomes to N. punctiforme before the temporally distinct division of the common clade into its nodal tips carrying Cycas and Azolla cyanobiont sequences.
However, the key protein of emphasis (WP_013192178.1) in this study is poorly organized into a cluster of low bootstrap support (<41%) (Figure 6), so that we are unable to draw strong inferences from that clade. Nevertheless, the sequence deviation supports a case of divergent evolution, which we hypothesize to involve contact with or exposure to high nitrous oxide levels.
Figure 6. The phylogenetic tree of cyanobiont (T. azollae, Nostoc punctiforme and Nostoc cycadea) cytochrome oxidase subunit II sequences with bacterial nitrous oxide reductases (nosZ gene products) and one archaeal protein sequence (outgroup) for the inference of phylogeny from the above sequences to resolve the evolutionary relationships of proteins that possess the same fold. Clusters are named 1, 2 and 3 and as bacterial nosZ gene products. Haloarcula rubripromontori belongs to the kingdom Archaea.
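To make the idea of bootstrap support concrete, the sketch below resamples alignment columns with replacement and counts how often a query sequence keeps the same nearest neighbor. This is a deliberately simplified proxy based on p-distance, not the maximum likelihood procedure used for Figure 6; the function names, the toy alignment and the nearest-neighbor criterion are all assumptions made for illustration.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_pair_support(alignment: dict[str, str], query: str, partner: str,
                           replicates: int = 500, seed: int = 0) -> float:
    """Percentage of column-resampled replicates in which `partner` is the
    nearest neighbor of `query` (ties go to the first sequence in dict order)."""
    rng = random.Random(seed)
    length = len(alignment[query])
    others = [name for name in alignment if name != query]
    hits = 0
    for _ in range(replicates):
        # Sample alignment columns with replacement (one bootstrap replicate).
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(s[c] for c in cols) for n, s in alignment.items()}
        nearest = min(others, key=lambda n: p_distance(resampled[query], resampled[n]))
        hits += nearest == partner
    return 100.0 * hits / replicates


# Toy alignment (hypothetical sequences of equal length).
toy = {"A": "ACDEFGHIKL", "B": "ACDEFGHIKL", "C": "WWWWWWWWWW"}
print(bootstrap_pair_support(toy, "A", "B"))
```

A low percentage, such as the <41% reported for the clade containing WP_013192178.1, means the grouping changes between resampled data sets and should not be over-interpreted.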

Gene Synteny Analyses
When a gene synteny analysis was performed on the coxBAC operons of T. azollae and three other canonical model organisms (a single strain each of N. punctiforme, Anabaena cylindrica and Trichodesmium erythraeum), it could be seen that T. azollae had the most copies of the coxBAC operons (four complete operons) (Figure 7). Visualization of the synteny map was performed using SimpleSynteny v1.4 with default options [21].

Promoter for the NtcA Transcription Factor
One key regulator of nitrogen fixation, heterocyst development and function is the NtcA transcription factor, which is thought to regulate the downstream function of >2000 mostly nitrogen-metabolism-related genes [22]. We searched for the NtcA binding promoter region in the WP_013192178.1 encoding gene to see whether it was under top-down regulation. We found a 100% match for the NtcA promoter in the −66 to −52 region upstream of the WP_013192178.1 coding gene (Figure 8). The sequence reads 5′ GTA gcctgcac TAC 3′ (Figure 8), which matches the signature binding sequence (5′ GTA n(8) TAC 3′) of the NtcA transcription factors [23]. Therefore, the WP_013192178.1 encoding gene is likely to be part of the nitrogen response transcriptome and interactome and to function inside heterocysts. Heterocysts are oxygen depleted, photosystem-2 inactivated and ATP-producing (from photosystem 1), and they synthesize ammonium ions and transport such nitrogen currencies to the neighboring vegetative cells as glutamine. None of the upstream regions of the closest gene homologs of the focal gene under scrutiny here (the WP_013192178.1 coding gene) contain the NtcA binding sequence, suggesting that this modification is perhaps recent in origin.
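The promoter search described above amounts to scanning an upstream region for the signature GTA n(8) TAC. A minimal sketch follows; the function name `find_ntca_sites` and the toy upstream sequence are hypothetical, and positions are reported relative to the start codon under the assumption that the last base of the supplied sequence is position −1.

```python
import re

# Signature NtcA binding sequence GTA-N8-TAC; the lookahead allows overlapping hits.
NTCA_SITE = re.compile(r"(?=(GTA[ACGT]{8}TAC))", re.IGNORECASE)


def find_ntca_sites(upstream: str) -> list[tuple[int, int, str]]:
    """Return (start, end, site) with positions relative to the start codon.

    `upstream` is assumed to end immediately before the start codon,
    so its last base is position -1.
    """
    n = len(upstream)
    sites = []
    for m in NTCA_SITE.finditer(upstream):
        start, end = m.start(1), m.end(1)  # end is exclusive
        sites.append((start - n, end - 1 - n, m.group(1)))
    return sites


# Toy upstream region (hypothetical flanks around the site reported in the text).
toy_upstream = "TT" + "GTAgcctgcacTAC" + "A" * 10
print(find_ntca_sites(toy_upstream))
```

Running the same scan on the upstream regions of the closest homologs would be one way to reproduce the observation that only the focal gene carries the site.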
Under nitrogen-rich conditions, 2-oxoglutarate levels are low and NtcA is found in an inactive form [22]. In a recent publication, it was shown that the NtcA transcription factor of T. azollae putatively uses a disulfide bond to perform its robust function in modifying gene expression using NtcA-responsive promoters [22]. We propose here that nitrous oxide released from urea can gradually replace or function secondarily to the canonical function of reducing oxygen to water by the WP_013192178.1 enzyme due to extremely low levels of oxygen found inside heterocysts. However, cellular respiration inside heterocysts has to be taken into consideration, where the WP_013192178.1 enzyme can contribute as a terminal oxidase by keeping the oxygen concentrations extremely low so as not to inhibit the nitrogenase function, although the light-dependent photosystem-1-based Mehler activity in heterocysts is known to supersede respiration [24], again giving the cytochrome oxidase subunit II the freedom to acquire a newer substrate through rapid evolution. In fact, Cyt c6 is the key donor for both cytochrome oxidase enzymes as well as the engine for cyclic photosynthesis of photosystem 1 in heterocysts, while in vegetative cells, there is a division of labor between Cyt c6 and plastocyanin as electron donors [25].

Terminal Histidines
There are four histidines at the C-terminal extremity of the WP_013192178.1 protein (Figure 9). The copper center in bacterial nitrous oxide reductases is formed by six to seven histidine residues. Therefore, the presence of a string of histidines, which is completely absent in all other T. azollae cytochrome oxidase subunit II enzymes (Figure 9), suggests that this may be an adaptive feature to aid in the handling of nitrous oxide in periplasmic spaces, which happens to be the localization of all classical nitrous oxide reductases. The MBD2440653.1 and WP_199336312.1 sequences from Nostoc sp. FACHB-110, which possess the putative Y_Y_Y domain, contain the same histidines as the WP_013192178.1 protein. The C-terminal halves of the three proteins are highly conserved, suggesting a likely significant function (Figure 10). Clustered histidines are known to donate ligands for copper binding [26]. We state that our bioinformatics-based inferences require further functional validation.
Figure 9. Sequence alignment of the putative cytochrome oxidase subunit II enzymes in T. azollae. The exclusive presence of four histidines (colored in yellow) on the C-terminus of the WP_013192178.1 protein in T. azollae, perhaps priming the presence of a copper center. The locality (positively charged residues) containing the twin arginines is also shown in pink and [...] (NOTE: MBD2440653.1 and WP_199336312.1 sequences from Nostoc sp. FACHB-110 contain the same histidines). Stars indicate identical fully-conserved residues between the four sequences.
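A simple way to screen several subunit II sequences for a C-terminal histidine string is to count histidines in a short window at the C-terminus. In the sketch below, the window size of 10 residues and the threshold of four histidines are assumptions chosen to mirror the observation in the text, and the toy sequences are hypothetical.

```python
def c_terminal_histidines(seq: str, window: int = 10) -> int:
    """Count histidines within the last `window` residues of a protein sequence."""
    return seq.upper()[-window:].count("H")


def has_his_cluster(seq: str, window: int = 10, min_his: int = 4) -> bool:
    """Flag sequences whose C-terminal window holds at least `min_his` histidines."""
    return c_terminal_histidines(seq, window) >= min_his


# Toy sequences (hypothetical): one with a C-terminal His string, one without.
with_cluster = "M" + "A" * 30 + "GHHAHH"
without_cluster = "M" + "A" * 30 + "GKLAEL"
print(has_his_cluster(with_cluster), has_his_cluster(without_cluster))
```

Such a count only flags candidates; whether the histidines actually coordinate copper requires structural or biochemical validation, as the text notes.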

Translocation of the Periplasmic Domains
Periplasmic proteins such as nitrous oxide reductases conventionally require the presence of a transporter (TAT, Twin Arginine Translocation) to ferry the protein across from the cytoplasm to the periplasmic spaces and are identified by a leader sequence that includes the region "SRRXFLK" in bacteria. Although such a region is not found residue for residue, there are twin arginines in the N-terminal half surrounded by a strongly basic region (in the immediate vicinity) in the focal protein of this study (Figure 9).
Leader sequences of more than 60 residues are found in bacteria. For example, the Paracoccus denitrificans nitrous oxide reductase, and at least three other homologs from bacteria, have unusually lengthy leader sequences [27]. The leader sequence of the nitrous oxide reductase of Paracoccus denitrificans is cleaved between amino acids 57 and 58 [27]. However, we could not find a translocation signal peptide using the SignalP 5.0 web server, indicating that our inferences may be coincidental.
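The distinction drawn above, between a full "SRRXFLK" consensus and a bare pair of arginines in the N-terminal half, can be checked with two small pattern searches. The sketch below is illustrative only: the function names and toy sequences are hypothetical, and it is no substitute for a dedicated predictor such as SignalP.

```python
import re

# Tat consensus as quoted in the text, with X standing for any residue.
TAT_CONSENSUS = re.compile(r"SRR.FLK")


def has_tat_consensus(seq: str) -> bool:
    """True if the full S-R-R-x-F-L-K consensus occurs anywhere in the sequence."""
    return TAT_CONSENSUS.search(seq.upper()) is not None


def twin_arginines(seq: str) -> list[int]:
    """1-based positions of 'RR' pairs lying within the N-terminal half."""
    half = seq.upper()[: len(seq) // 2]
    return [m.start() + 1 for m in re.finditer(r"(?=RR)", half)]


# Toy sequences (hypothetical): twin arginines without and with the consensus.
print(twin_arginines("MSKRRKHAAAAAAAAAAAAA"), has_tat_consensus("MSKRRKHAAAAAAAAAAAAA"))
print(twin_arginines("MSRRAFLKAAAAAAAA"), has_tat_consensus("MSRRAFLKAAAAAAAA"))
```

A sequence that yields twin arginines but no consensus hit corresponds to the situation reported for the focal protein, which is why the inference is treated as tentative.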

Proposed Functional Mechanism of the Candidate Protein WP_013192178.1
A co-functional dependence has been attributed to nitrite reductases and NO reductases in heterotrophic bacteria [28]. It can be elaborated as follows [28]: (i) localization of NO reductase on the cytoplasmic membrane to preclude the diffusion of toxic NO to the cytoplasm; (ii) highly efficient NO binding to NO reductase for the concomitant catalytic conversion of NO to N2O, a stable product.
Therefore, the candidate protein under study here is likely to be present partially in the periplasmic spaces and to be effectuated by the NO present in that peri-cytoplasmic locality due to myriad processes. In the Azolla symbiosis, nitric oxide is made by conversion of arginine, while nitrous oxide is synthesized by ammonia oxidation and mitochondrial cytochrome oxidases activity in plants, while the microbiome in Azolla can also contribute nitrous oxide (nitrification and denitrification) while also taking into consideration urea breakdown in the ambience [29]. In fact, it is reported that aseptically grown plants emit nitrous oxide from mitochondrial nitric oxide conversion by terminal cytochrome oxidases [29] which strengthens our hypothesis.
We infer that these are the likely functional steps, based exclusively on bioinformatic inferences and past literature:
1. The NtcA transcription factor binds to the promoter of the gene to produce mRNA that will be translated into the corresponding WP_013192178.1 protein.
2. The periplasmic domain of the WP_013192178.1 protein is exported into the periplasmic space.
3. The nitric oxide sensing function will negotiate periplasmic NO as the trigger, using the Y_Y_Y domain to detect toxic NO in the periphery of the heterocyst membrane. Note that we also envisage direct binding of N2O to the Y_Y_Y domain, since both are small molecules and are known to bind to the same proteins.
4. The binding of a Y_Y_Y domain to nitric oxide has been reported at least once before to disentangle the sensor protein from its membrane localization, and this dissociation from a fixed membrane position has been linked to subsequent binding to DNA to effectuate downstream functions [17]. We propose a similar mechanism here.
5. The mid-section of the candidate protein (WP_013192178.1), comprising a loosely random-coiled section as shown in the homology model (Figure 11), is likely to be functional as a DNA binding surface. DNA binding could be involved in the transcription of nitric/nitrous oxide quenching/transforming ORFs or in other regulatory functions. The evolutionarily "elderly" global fold common to nitrous oxide reductases and cytochrome oxidases (subunit II) may be more adaptable due to its long vigil in time and may even accommodate novel, adaptive, catalytic properties [30]. Long coiled segments are characteristic of DNA and RNA binding proteins, including DNA and RNA chaperones [31], and can even serve as oligomerization interfaces.

DNA Binding and/or Oligomerization
It is of note that when the protein sequence WP_013192178.1 was queried for RNA and DNA binding residues using the DRNApred server [32], a single glycine (Glycine-183) in a glycine-rich region was identified as a DNA binding residue. Surprisingly, MBD2440653.1 and WP_199336312.1 from Nostoc sp. FACHB-110 (Figure 10) lack this glycine (Glycine-183), which is replaced by an alanine in both proteins. When a simple PSI-BLAST search was performed on the linker region, it was clear that Glycine-183 was unique and not found in any of the closest matches, suggesting a residue that is functionally significant to the WP_013192178.1 protein. A single glycine found in a hinge region has been shown to be crucial for DNA binding in the HapR protein from Vibrio cholerae, where a single G39D mutation abolished the DNA binding property [33].
The glycine-rich region is found in a "lengthy" connecting region of 43 amino acids between the N-terminus and the C-terminal cupredoxin-like fold (Figure 11). Such glycine-rich linker regions within DNA binding domains are found in proteins such as HapR, a quorum-sensing master regulatory protein in Vibrio cholerae, where they connect two helices and have been demonstrated to be significant for function. When searched using PSI-BLAST, the 43-residue linker region shows only ~76% sequence identity with its nearest match, showcasing a divergent nexus, and it is significantly longer than its counterparts from T. azollae (Figure 11). Glycine-rich regions are also used to form protein dimers or oligomers. In fact, glycine-rich linkers are flexible and connect functional domains without interfering with the functional aspects of the individual domains [34]. The lengths of glycine-rich linkers vary from 2 to 31 amino acids, and such linkers impose minimal constraints on the separated functionalities of evolutionarily fused proteins [34].
The glycine-rich region in WP_013192178.1 (Figure 11) is likely to be part of a DNA binding mechanism; whether this mechanism regulates specific genes, i.e., whether the protein acts as a transcription factor, remains to be established. It could even be that the coxBAC operon in heterocysts is regulated by this DNA binding activity stemming from glycine-183.
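The two comparisons made in this section (the residue present at a given alignment column in each homolog, and the percent identity between two aligned linker segments) can be sketched in a few lines of Python. This is only a minimal illustration and not the PSI-BLAST procedure itself: the function names and the short toy sequences are hypothetical and do not correspond to the real WP_013192178.1 linker, and PSI-BLAST computes identity over its own local alignments.

```python
def percent_identity(a, b):
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore any column in which either sequence carries a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def residues_at(alignment, column):
    """Residue carried by each sequence at a 1-based alignment column."""
    return {name: seq[column - 1] for name, seq in alignment.items()}


# Hypothetical toy alignment (NOT real sequence data): the query keeps a
# glycine at column 4, whereas both homologs carry an alanine there.
toy_alignment = {
    "query": "GSGGAGG",
    "homolog_1": "GSGAAGG",
    "homolog_2": "GSGAAGG",
}
column_4 = residues_at(toy_alignment, 4)
identity = percent_identity(toy_alignment["query"], toy_alignment["homolog_1"])
```

Applied to a real alignment of the linker region, the same two calls would flag a column at which the query alone retains a glycine.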

Phosphorylation of Serine Residues
There were six predicted serine phosphorylation sites in total in the WP_013192178.1 protein, two of which had compelling likelihoods: Ser-105 (0.719 probability) and Ser-304 (0.757 probability) (Figure 12). Of these, Ser-105 is not found in any of the other cytochrome oxidases of T. azollae, while position 304 is held by a serine located in the proximal 3-D space of the Y_Y_Y domain. The serine residue at position 5 is likewise not found in any of the other protein sequences. Such unique serines could be used for signal transduction via protein phosphorylation by relevant kinases.

Interpreting a Hydropathy Profile of the WP_013192178.1 Protein
The hydropathy profile from the Kyte and Doolittle method indicates that there are two highly hydrophobic regions that are likely to form helices crossing the inner membrane of the T. azollae heterocysts (Figure 13). The two highly hydrophobic (positive > 1.6) transmembrane domains are likely to lie within the first 110 residues, separated by a strongly hydrophilic "surface exposed" region (between residues 65 and 85); they can be seen in the homology model in Figure 13, where the two helices are depicted in blue. However, the region spanning residues 110-200 is weakly hydrophilic, indicating a region that is immersed in the cytoplasmic side of the cell. In addition, when the "outside negative" rule is applied, we arrive at a periplasmic region (220-300) rich in aspartates (D) and glutamates (E), suggesting its translocated nature (i.e., presence in the periplasmic space), including that of the Y_Y_Y domain [35]. Still, the region 260-284 is more hydrophobic than its immediate neighbors within the periplasmic domain, perhaps hinting at a binding function for an inert gas such as nitrous oxide, or for the bulky reduction intermediates that may show a preference for hydrophobic residues. In this regard, three valines (positions 300-302) in the near structural vicinity (3D space) of the Y_Y_Y domain stand out.
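For readers who wish to reproduce a profile of this kind, the Kyte and Doolittle calculation can be sketched as a sliding-window average over per-residue hydropathy indices. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the window length is a free parameter (the window used for Figure 13 is not stated here; 19 residues is a common choice for detecting membrane-spanning helices), and the 1.6 cutoff is taken from the text above.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window of the given length along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    # Running sum: add the residue entering the window, drop the one leaving.
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def hydrophobic_segments(profile, window, cutoff=1.6):
    """1-based (start, end) residue ranges covered by consecutive windows
    whose mean hydropathy exceeds the cutoff."""
    segments = []
    start = None
    for i, value in enumerate(profile):
        if value > cutoff and start is None:
            start = i
        elif value <= cutoff and start is not None:
            # The last qualifying window started at index i - 1.
            segments.append((start + 1, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start + 1, len(profile) - 1 + window))
    return segments
```

Called on the full WP_013192178.1 sequence, `hydrophobic_segments(hydropathy_profile(seq, 19), 19)` would list candidate membrane-spanning stretches to compare against the two helices discussed above; the exact boundaries depend on the window length chosen.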

Phylogenetic Reconstructions
The non-redundant amino acid sequences downloaded (as FASTA files) from each query were first aligned with the ClustalW algorithm in MEGA version X (default parameters), and phylogenetic reconstruction was performed using the neighbor-joining/maximum likelihood methods with support from 500 bootstrap replications. An archaeal outgroup was also assigned [36].
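As an aside on what a bootstrap replication involves, each replicate is a pseudo-alignment built by resampling the columns of the original alignment with replacement; a tree is then inferred from every replicate, and the support for a clade is the fraction of replicate trees that recover it. MEGA X performs this internally, so the sketch below is only an illustration of the resampling step, with hypothetical function names, and is not the software's own implementation.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement.

    `alignment` maps sequence names to aligned sequences of equal length.
    The same resampled columns are applied to every sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n=500, seed=0):
    """Generate n bootstrap replicates (500 matches the setting in the text)."""
    rng = random.Random(seed)  # fixed seed so the replicates are reproducible
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]
```

Tree inference on each replicate and the counting of recovered clades are left to the phylogenetics package in use.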

Multiple Sequence Alignments
The non-redundant downloaded amino acid sequences (as FASTA files) were aligned using the ClustalW algorithm in MEGA X and using Clustal servers available on the web, such as Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/; accessed on 22 January 2022).

Determination of Protein Families/Superfamilies
The PSI-BLAST search was also used to find suitable homologs. Following the structural classification of proteins (SCOP) database [37], 30% sequence identity was taken as the lower cutoff for protein families and 15% for protein superfamilies. A coverage of 50% was required; this value was arbitrary. Together, these criteria formed the basis for identifying true protein homologs belonging to a single family.
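The cutoffs above amount to a simple decision rule, which can be sketched as follows. This is a minimal illustration rather than the procedure actually scripted for this study: the `Hit` record, its field names and the function name are hypothetical, and identity and coverage are assumed to be expressed as percentages taken from the PSI-BLAST output.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str   # identifier of the matched sequence
    identity: float  # percent sequence identity to the query
    coverage: float  # percent query coverage


def classify_hit(hit, family_id=30.0, superfamily_id=15.0, min_coverage=50.0):
    """Assign a hit to 'family', 'superfamily' or 'rejected'.

    Cutoffs follow the text: 30% identity for families, 15% for
    superfamilies, and an (arbitrary) 50% coverage requirement.
    """
    if hit.coverage < min_coverage:
        return "rejected"
    if hit.identity >= family_id:
        return "family"
    if hit.identity >= superfamily_id:
        return "superfamily"
    return "rejected"


# Hypothetical hits, for illustration only (not real search results).
example_hits = [
    Hit("hit_A", identity=45.0, coverage=80.0),
    Hit("hit_B", identity=20.0, coverage=60.0),
    Hit("hit_C", identity=70.0, coverage=30.0),
]
labels = {h.accession: classify_hit(h) for h in example_hits}
```

In this toy example, the third hit is rejected despite its high identity because it fails the coverage requirement, which is the situation the coverage cutoff is meant to exclude.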

Building of Homology Models
Homology models were built using the SWISS-MODEL server (https://swissmodel.expasy.org/; accessed on 22 December 2021), utilizing the default parameters of the search box to search the PDB for structures of homologous proteins to be used as templates [39]. The structure with the highest homology was used as the template.

Prediction of DNA Binding Residues
DNA binding residues were predicted using the DRNApred server [32].

Conclusions
While viruses such as coronaviruses mutate in days, weeks and months, antibiotic-resistant bacterial strains are known to have arisen in the last 60-80 years, ever since the golden age (1940-1960) of antibiotic discovery, and evolutionary maladaptations stemming from pollutants such as tributyltin (discovered in the 1950s) are changing the reproductive biology/organs of marine biota (such as mollusks). There is no assured way of deciding whether a sensory reception module (Y_Y_Y motif) is gaining strength in the T. azollae cytochrome oxidase subunit II protein, or of stating whether the evolution is ongoing or terminal. Given the proximity of the sequence motif to the conserved Cys-XXX-Cys sequence, the nature of the mutations as non-wobble-position mutations, and the periplasmic presence of the Y_Y_Y motif, it is fair to propose that this prototypical protein may have emerged in the last 150 years from exposure to enhanced usage of urea and other synthetic N-fertilizers.
Nitrogen-containing compounds such as thalidomide (C13H10N2O4) have also had immediate footprints on the human health landscape, while even biogenic cyanotoxins, being nature's arsenal of nitrogenous molecules, have resulted in animal and human mortalities. Nitric oxide is also a potent toxic molecule and a reactive nitrogen species, as well as a signaling molecule, and it needs to be handled carefully since it can putatively harm organisms such as cyanobacteria when produced by nitric oxide synthases [42]. Inert nitrous oxide biogenesis is a solution to nitric oxide toxicity.
While urea may not have the same chemistry as toxins, we do not know the actual effect urea has on organismal physiology. Some plants contain ureases that are able to hydrolyze urea, similar to a spectrum of soil bacteria without which urea will tend to accumulate in the ambience and in plant parts/organs if internalized [43]. The genus Oryza contains urease enzymes but such catalysts appear to be lacking in the genus Azolla. Azolla is also poor in its internalization of nickel, which is a cofactor for ureases.
Receptive proteins, owing to their presence in the periplasmic space, interact with environmental pollutants and may, through exposure, be susceptible to developing a system to remediate the anthropogenically magnified molecule. This follows the same fundamental reasoning as the acquisition of antibiotic resistance, which arises through lateral transfer events via plasmid vectors as well as through mutations (due to increased evolvability) that allow resistance to an antibiotic in the immediate medium, primarily via SOS mutagenesis, in which the microbes produce error-prone DNA polymerases [44]. Furthermore, endosymbionts change faster than free-living organisms due to genome erosion, pseudogenization, and stronger substitutions that are common in prototypical symbionts that have not yet become a full-fledged plastid [13]. To conclude, we state that the WP_013192178.1 protein from T. azollae is possibly emerging as a candidate for nitric/nitrous oxide sensing (via an emerging Y_Y_Y domain).
The evolution of enzymes has been shown to involve the conservation of the key residues of the catalytic site while allowing for the modification of residues in the substrate binding site [45]. In an enzyme superfamily such as Nudix hydrolases, the amino acid arrangement of the Nudix box active site and the overall protein fold is retained, while the substrate binding site is tinkered to allow for diverse substrate recognition [46].
Heterocysts are known for ecological specialization, as in the thermophilic Fischerella spp., which modify the impermeability of the heterocyst envelopes to ensure anoxic conditions. This agrees with the plasticity-first hypothesis, which states that natural selection can customize ancestrally "plastic" features to fine-tune, fit or replace them with a locally optimized trait [47]. We hypothesize that the order of events was mediated by abiotically synthesized nitrous oxide, which paved the way to biotic oxygen, and that nature is now evolving a specific cytochrome oxidase subunit II enzyme to possibly quell urea-based nitrogen pollution by sensing nitric/nitrous oxide. Dollo's principle states that evolution discourages reinventing (exactly) an ancient trait from an eclipsed unidirectional fate, and reforming an enzymatic function is no easy feat. Many mutations are likely to be necessary to reimagine the same fold and the same superfamily with a reinvented substrate specificity (albeit for a substrate found abundantly in nature) so that it makes sense as a member of a receptive function.
Rapid evolution is proposed to occur even within the span of a century, which is about as long as urea has been in use, and is thought to leave an imprint on ecological interactions that are already underway [48,49]. Rapid evolution can take place due to a plethora of anthropogenic pressures, as with peppered moths during the Industrial Revolution, or due to natural causes, as exemplified by Darwin's finches in the face of fluctuating rainfall [49]. We tentatively state that the likely heterocyst-functioning WP_013192178.1 is undergoing "rapid" evolution.
We suggest that (1) the emerging sensory Y_Y_Y domain, (2) the accumulation of four histidines at the C-terminal extremity, (3) a putative lengthy linker region rich in glycines, including a likely unique DNA-binding glycine-183, (4) the NtcA-responsive promoter/binding site, (5) the secondary nature of cellular respiration in heterocysts to the Mehler activity, (6) the weak phylogeny relative to familial counterparts and (7) the promiscuity of ancient folds and structures are all indicative of a putative role in nitrous oxide sensing and in the effectuation of DNA-reliant downstream functions by an evolving cytochrome oxidase subunit II protein that may not be an optimal design at present, but may be one of utility. There is no solution to nitrous oxide in the atmosphere, and we hope that the clues and theories presented in this paper will trigger the characterization of the candidate protein's many activities, the resolution of its three-dimensional structure, gene knockout studies and precision gene-editing (CRISPR-Cas9), which may respectively validate the function and augment the utility of this enzyme to the Azolla symbiosis and the climate change-impacted crucible of mankind.
To end, because T. azollae is a strong nitrogen-fixing obligate cyanobiont, the addition of nitrous oxide conversion to dinitrogen gas could make this biological system one of invaluable utility to both climate change mitigation and crop nutrition. Determination of novel catalytic chemistries requires "wet lab" biochemical experiments supported by structural biology studies.
Author Contributions: D.G. was responsible for the conceptualization, designed the in silico study, performed the bioinformatics-based explorations, analyzed the data and wrote the manuscript. V.H. performed the gene synteny analyses and produced the accompanying literature. All authors have read and agreed to the published version of the manuscript.