Proteomic Analysis of Methanonatronarchaeum thermophilum AMET1, a Representative of a Putative New Class of Euryarchaeota, “Methanonatronarchaeia”

The recently discovered Methanonatronarchaeia are extremely halophilic and moderately thermophilic methyl-reducing methanogens representing a novel class-level lineage in the phylum Euryarchaeota related to the class Halobacteria. Here we present a detailed analysis of 1D-nano liquid chromatography–electrospray ionization tandem mass spectrometry data obtained for “Methanonatronarchaeum thermophilum” AMET1 grown in different physiological conditions, including variation of the growth temperature and substrates. Analysis of these data allows us to refine the current understanding of the key biosynthetic pathways of this triple extremophilic methanogenic euryarchaeon and identify proteins that are likely to be involved in its response to growth condition changes.


Introduction
Methanogenesis is a major biogeochemical process that is indispensable for the global carbon cycle and climate maintenance. Currently, only archaea have been conclusively shown to produce methane at anaerobic conditions in nature [1,2]. Methanogenic archaea belong to several distinct lineages. The best characterized ones are members of the phylum Euryarchaeota and include the classes Methanomicrobia, Methanobacteria, Methanopyri, Methanococci, the recently discovered candidate classes "Methanonatronarchaeia" and "Methanofastidiosa" and the order Methanomassiliicoccales within the class Thermoplasmata. Several putative methyl-reducing methanogens belong to the candidate phyla "Bathyarchaeota" and "Verstaraetearchaeota" that are classified within the Thaumarchaeota-Aigarchaeota-Crenarchaeota-Korarchaeota (TACK) superphylum [2,3]. Several methanogens that apparently represent additional, diverse archaeal groups so far have been identified only in metagenomic samples and the respective genomes are expected to become available in the near future [2]. Reconstruction of the gene gain and loss in the evolution of archaea suggests that the last the minimal possible growth temperature for this strain "34 • C", otherwise optimal conditions; 55 • C, as the maximum possible growth temperature for this strain "55 • C", otherwise optimal conditions; "TMA"-trimethylamine as an alternative electron acceptor (with much slower growth in comparison to methanol), otherwise optimal conditions; "FeS"-FeS as a growth factor (with slower and poorer growth in comparison to sterilized sediments), otherwise optimal conditions; "H 2 "-H 2 as an electron donor (poorly soluble at saturated salt concentration and with slower growth in comparison to soluble formate), otherwise optimal conditions. Cultivation was performed in all cases as described above.
Given the low growth rates of the pure culture of AMET1 under the different conditions tested and the highly resource-consuming nature of the proteomic analyses (more than 20 µg total protein) we decided to conduct shotgun proteomic analyses by combining cells from biological triplicates and further processing them as a single sample. We sacrificed the differences between replicates, which have been found to be minimal in similar previous studies, to investigate differences among cells from cultures obtained under different conditions. More in details, cell pellets obtained from three parallel cultures for each condition (except TMA with one replicate culture) were pooled and the resulting pellet dissolved in lysis buffer (8 M urea, 2 M thiourea, 5% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), 5 mM Tris (2-carboxyethyl) phosphine (TCEP)-HCl and a protease inhibitors cocktail). Homogenization of the cells was achieved by ultra-sonication for 5 min on ultrasonic bath. After homogenization, the lysed cells were centrifuged at 20,000× g for 10 min at 4 • C and the supernatant containing the solubilized proteins was concentrated and used for liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiment. Detailed descriptions of all methodological procedures used in this study for protein concentration, 1D-nano liquid chromatography-electrospray ionization tandem mass spectrometry, protein identification and confidence threshold based on the one-peptide rule [9] and data acquisition of the Exponentially Modified Protein Abundance Index (emPAI), are given in a previous work [3]. The emPAI was used as a relative quantitation score of the proteins in a complex mixture based on protein coverage by the peptide matches in a database search result [11].

Proteomic Data Normalization and Analysis
Normalized emPAI values (nemPAI, see Proteomic data normalization and analysis section in Methods) were obtained from the emPAI values by dividing each individual value by the sum of all emPAI values in a given experiment [12]; then multiplied by the number of proteins in M. thermophilum AMET1 distinct by their peptide composition (1529) to make the scale commensurate with the average protein abundance. Decimal logarithms of nemPAI values with all originally zero values replaced by artificial low values of 0.001 (the lowest observed non-zero abundance being 0.006) were used in most subsequent analyses (referred to as lPAI values).
When coarse-grained data was required for the correlation analysis, the following procedure was applied to each protein-specific vector of 6 lPAI values: the minimum lPAI value was subtracted from each data point, the result was divided by a threshold value (0.5 decimal log units, see Results for details) and rounded down to the nearest integer. This produced a stepwise scaling of the data in threshold value units.
Principal Component Analysis (PCA) was performed on the lPAI values using the prcomp function of R package with rescaling of the original data columns [13].
Normal equivalent of the distribution [14] of lPAI values for proteins with non-zero measured abundance was calculated as follows: the mean of the equivalent normal distribution µ e was set to the median lPAI value. The equivalent standard deviation σ e was calculated as 0.74 of the interquartile distance (as the interquartile distance of the normal distribution is 1.35 standard deviation units). Therefore, the z-score for the protein abundance can be calculated as z = (x−µ e )/σ e and the Bonferroni-corrected e-value as the standard normal distribution p-value of z multiplied by the number of distinct proteins (1529). Proteins with z >0 and e-values below 1 were considered super-abundant in the sense that their abundances exceeded the expectation of the abundance range under the assumption of the log-normal distribution.

Sequence Analysis
M. thermophilum AMET1 proteins were assigned to most recent archaeal Clusters of Orthologous Groups, arCOGs in the course of the previous work [3]. The PSI-BLAST program was used to search protein sequences against non-redundant database at NCBI (National Center for Biotechnology Information) [15]. Protein sequences for selected families were aligned using MUSCLE [16]. Protein secondary structure was predicted using Jpred 4.0 [17]. Transmembrane helices in proteins were predicted using TMHMM 2.0 [18].

Estimation of Significant Changes in Protein Abundance
To determine if a change in a protein abundance is likely to be biologically significant, it is essential to assess the level of non-biological noise in the data. Although this level is not directly measurable (due to the lack of precise reference data), the following approach allows an approximate estimation. Ribosomal proteins, specifically those 25 that are encoded within the ribosomal superoperon (AMET1_0469-AMET1_0497 on MRZU01000003.1), are present in the ribosome in equal stoichiometric ratios, so their relative abundances should be largely independent of the physiological conditions. Thus, variations of the measured relative abundances of these ribosomal proteins under different experimental conditions should be largely attributed to non-biological noise.
This analysis shows that the measurements have the root-mean-square deviation from the expectation by a factor of 1.8 (0.25 decimal log units). The threshold of 0.5 decimal log units (a factor of 3.16) renders 97% of the deviations insignificant. Therefore, in the further analysis, we called changes in a protein abundance as significant if they differed from the baseline measurements by a factor of 10 0.5 = 3.16 or greater.

Methanonatronarchaeum thermophilum AMET1 Protein Abundances under Optimal Growth Conditions
We identified 1026 (67%) of the 1529 predicted protein-coding genes of M. thermophilum AMET1 being expressed under the optimal growth conditions ( Table S1). Most of the predicted proteins that have not been detected in the proteomic experiments belong to "silent" gene islands and the largest islands correspond to predicted integrated elements ( Figure 1). Three other such regions are occupied by proviruses related to His2-like spindle-shaped haloviruses: partial AMET1_0244-AMET1_0250 and two apparently complete proviruses AMET1_0519-AMET1_0565, AMET1_1204-AMET1_1235, with integrases AMET1_0519 and AMET1_1204 respectively and His2-like major capsid proteins AMET1_0248, AMET1_0558 and AMET1_1226. These proviruses, however, could be replication-defective, because they do not appear to encode a family B DNA polymerase that is characteristic of His2 viruses [19]. Another island corresponds to a putative plasmid AMET1_0986-AMET1_0992, with two genes responsible for plasmid replication (AMET1_0989 and AMET1_0990). The silent islands also include uncharacterized integrated elements AMET1_0003-AMET1_0013 and AMET1_1363-AMET1_1373, with integrase AMET1_0013 and gene shared between the two islands (AMET1_0008 and AMET1_1363), suggesting that the elements might be related. The mechanism of silencing of such large DNA regions remains to be elucidated. Several transposable elements present in the genome are not expressed either. The distribution of non-zero nemPAI values is generally bell-shaped in logarithmic scale ( Figure  S1A). About 2% proteins have values of 9.0 or greater with 8 of these representing the super-abundant heavy tail of the distribution ( Table 1). As expected, many of these proteins perform house-keeping functions, such as ribosomal proteins, chromatin, or RNA-binding proteins and chaperons. The archaeal DNA-binding protein Alba (AMET1_0379) shows the highest abundance value, 161 (Table  1). Alba is thought to play a key role in 3D chromosome organization and gene expression [20]. Another chromatin associated protein, histone, is also very abundant with nemPAI value 22. Notably, Alba is not particularly abundant in neutraphilic HMET1 compared with the histone, with nemPAI values 7 and 156 respectively [3]. Thus, it appears that chromosome organization is significantly different in these two organisms. Another group of highly abundant proteins is related to methyl coenzyme M reductase, a key multisubunit protein complex that is involved in the last enzymatic step of methanogenesis and is considered to be a hallmark of methanogens [21,22] (Table 1). This observation is consistent with extremely high abundances of these subunits in HMET1 and other methanogens for which proteomic data are available [3,[23][24][25]. The distribution of non-zero nemPAI values is generally bell-shaped in logarithmic scale ( Figure S1A). About 2% proteins have values of 9.0 or greater with 8 of these representing the super-abundant heavy tail of the distribution ( Table 1). As expected, many of these proteins perform house-keeping functions, such as ribosomal proteins, chromatin, or RNA-binding proteins and chaperons. The archaeal DNA-binding protein Alba (AMET1_0379) shows the highest abundance value, 161 (Table 1). Alba is thought to play a key role in 3D chromosome organization and gene expression [20]. Another chromatin associated protein, histone, is also very abundant with nemPAI value 22. Notably, Alba is not particularly abundant in neutraphilic HMET1 compared with the histone, with nemPAI values 7 and 156 respectively [3]. Thus, it appears that chromosome organization is significantly different in these two organisms. Another group of highly abundant proteins is related to methyl coenzyme M reductase, a key multisubunit protein complex that is involved in the last enzymatic step of methanogenesis and is considered to be a hallmark of methanogens [21,22] (Table 1). This observation is consistent with extremely high abundances of these subunits in HMET1 and other methanogens for which proteomic data are available [3,[23][24][25]. Finally, four abundantly expressed proteins are either assigned only a general function prediction or remain uncharacterized ( Table 1). The first one is a transcriptional regulator containing a predicted winged helix-turn-helix domain (AMET1_0092, arCOG01060). In arCOGs, this subfamily is present in most of the methanogens but only in a few members of Halobacteria, which might point to involvement of this protein in regulation of methanogenesis pathways. Two other uncharacterized proteins belong to the CBS domain family (Table 1). These proteins are thought to be energy sensors that are involved in modulating various cellular metabolic processes [26]. There are numerous CBS proteins in M. thermophilum AMET1 including the cluster AMET1_0341-AMET1_0345 and most of these proteins are also highly abundant, suggesting important roles in metabolism regulation (Table S1). AMET1_0636 protein is not similar to protein family with known function. It has the largest nemPAI value (39.4) among uncharacterized proteins, with AMET1_1159 as a distant second, with nemPAI value of 8.0. Moreover, this protein has high nemPAI values under all conditions tested in this work (Table S1). Sequence analysis shows that homologs of this small protein (~40-90 amino acids (aa)) are present in many methanogens and other, mostly uncultivated archaea and bacteria (Table S2). Multiple alignment of selected proteins of this family is shown in Figure 2. Apparently, the core of the protein includes~40 aa and, according to the secondary structure prediction, is an all-beta strand domain ( Figure 2). AMET1_1159, like the majority of its homologs, consists of two duplicated domains whereas several archaea have only one domain which accounts for the variation in the protein length. In many prokaryotic species, these proteins contain variable positively charged amino acid patches ( Figure 2). Such patches are typical of RNA chaperones [27]. The high abundance of AMET1_1159 suggests that it plays an important role in house-keeping cellular functions which is compatible with the predicted RNA chaperone function.  Table S2). The sequences are denoted by their locus tag numbers. Two underlined locus tag numbers correspond to M. thermophilum AMET1 and HMET1. Secondary structure elements are shown above the alignment, with "E" indicating positions predicted to be in a beta strand. Amino acids in the conserved (80%) positions are coloured according to their physico-chemical properties as follows: yellow background indicates aliphatic residues (I,L,M,V), green background indicates small residues (A,G,P,S), red letters indicate positively charged residues (K,R), blue-indicate negatively charged residues (D,E,N,Q), magenta-aromatic residues (F,Y,W). Patches of positively charged residues are highlighted in bold red font.
Next, we compared the abundance levels of different functional classes of proteins under the optimal conditions ( Figure S1B). This comparison showed that the differences between functional classes are not substantial or statistically significant, with only proteins involved in translation showing an overall elevated abundance level.
Previously, we have found that the majority of proteins that are predicted to be involved in key metabolic pathways are produced in both M. thermophilum AMET1 and HMET1 as indicated by LC-MS/MS data analysis [3]. These include methyl-reduction pathway of methanogenesis, membrane respiratory chain and energy-converting membrane complexes, in particular, cytochromes, acetate incorporation pathway, CO2 fixation pathway through archaeal RUBISCO and a few other carboxylases, both glycolysis and gluconeogenesis pathway and others (Table S1). Most proteins involved in the biosynthetic pathways for all nucleotides, amino acids (including pyrrolysine), cofactors and lipids were also detected ( Table S1). The lipid biosynthesis enzymes, however, as well as many transporters, are not well recovered, presumably, due to the known limitations of the protein preparation methods in dealing with proteins that are tightly associated with the membrane [28]. Only a few proteins that are also detected but were not included in previous reconstruction of the aforementioned pathways are functionally characterized and allow us to add a few more details to the M. thermophilum AMET1 metabolic map. In addition to the Suf system of Fe-S cluster formation, M. thermophilum AMET1 expresses the Isc system that is much more abundant than Suf. Several Fe-S clusters containing redox proteins that are common in methanogens but rare in Halobacteria are also abundant, namely NorV-like flavorubredoxin (AMET1_0320) and rubrerythrin (AMET1_1450 and AMET1_0761). Like in most archaea, synthesis of aromatic amino acids in M. thermophilum AMET1 proceeds via an alternative pathway in which the two first steps of 3-dehydroquinate biosynthesis are catalysed by 2-amino-3,7-dideoxy-D-threo-hept-6-ulosonic acid synthase (AMET1_1249) and 3dehydroquinate synthase and (AMET1_1248) [29]. Both are detected in our proteomic data (Table S1).
Several proteins detected by proteomic analysis could not be confidently connected to the predicted metabolic network of AMET1. These include proteins comprising part of the hydrogenotrophic methanogenesis pathway, namely, coenzyme F420-dependent N(5),N(10)methylene tetrahydromethanopterin reductase (Mer), coenzyme F420-dependent N(5),N(10)methenyltetrahydromethanopterin dehydrogenase (Mtd), methenyltetrahydromethanopterin  Table S2). The sequences are denoted by their locus tag numbers. Two underlined locus tag numbers correspond to M. thermophilum AMET1 and HMET1. Secondary structure elements are shown above the alignment, with "E" indicating positions predicted to be in a beta strand. Amino acids in the conserved (80%) positions are coloured according to their physico-chemical properties as follows: yellow background indicates aliphatic residues (I,L,M,V), green background indicates small residues (A,G,P,S), red letters indicate positively charged residues (K,R), blue-indicate negatively charged residues (D,E,N,Q), magenta-aromatic residues (F,Y,W). Patches of positively charged residues are highlighted in bold red font.
Next, we compared the abundance levels of different functional classes of proteins under the optimal conditions ( Figure S1B). This comparison showed that the differences between functional classes are not substantial or statistically significant, with only proteins involved in translation showing an overall elevated abundance level.
Previously, we have found that the majority of proteins that are predicted to be involved in key metabolic pathways are produced in both M. thermophilum AMET1 and HMET1 as indicated by LC-MS/MS data analysis [3]. These include methyl-reduction pathway of methanogenesis, membrane respiratory chain and energy-converting membrane complexes, in particular, cytochromes, acetate incorporation pathway, CO 2 fixation pathway through archaeal RUBISCO and a few other carboxylases, both glycolysis and gluconeogenesis pathway and others (Table S1). Most proteins involved in the biosynthetic pathways for all nucleotides, amino acids (including pyrrolysine), cofactors and lipids were also detected ( Table S1). The lipid biosynthesis enzymes, however, as well as many transporters, are not well recovered, presumably, due to the known limitations of the protein preparation methods in dealing with proteins that are tightly associated with the membrane [28]. Only a few proteins that are also detected but were not included in previous reconstruction of the aforementioned pathways are functionally characterized and allow us to add a few more details to the M. thermophilum AMET1 metabolic map. In addition to the Suf system of Fe-S cluster formation, M. thermophilum AMET1 expresses the Isc system that is much more abundant than Suf. Several Fe-S clusters containing redox proteins that are common in methanogens but rare in Halobacteria are also abundant, namely NorV-like flavorubredoxin (AMET1_0320) and rubrerythrin (AMET1_1450 and AMET1_0761). Like in most archaea, synthesis of aromatic amino acids in M. thermophilum AMET1 proceeds via an alternative pathway in which the two first steps of 3-dehydroquinate biosynthesis are catalysed by 2-amino-3,7-dideoxy-D-threo-hept-6-ulosonic acid synthase (AMET1_1249) and 3-dehydroquinate synthase and (AMET1_1248) [29]. Both are detected in our proteomic data (Table S1).
Several proteins detected by proteomic analysis could not be confidently connected to the predicted metabolic network of AMET1. These include proteins comprising part of the hydrogenotrophic methanogenesis pathway, namely, coenzyme F420-dependent N(5),N(10)-methylene tetrahydromethanopterin reductase (Mer), coenzyme F420-dependent N(5),N(10)methenyltetrahydromethanopterin dehydrogenase (Mtd), methenyltetrahydromethanopterin cyclohydrolase (Mch) and formylmethanofuran-tetrahydromethanopterin formyltransferase (Ftr). As discussed previously, the reactions catalysed by these enzymes could not be connected to the rest of the methyl-reducing pathway because of the absence of genes for Mtr and Fdw complex subunits [3]. Nevertheless, all these proteins were detected by proteomic analysis (Table S1), confirming their functionality and suggesting that an unknown link exists between these reactions and the rest of the metabolic flow and remains to be identified in further experiments.
The UbiX-UbiD decarboxylase system is common among archaea, especially in methanogens. It has a considerable variety in terms of substrate specificity and physiological role [30]. Originally, it has been characterized as prenyltransferase required for bacterial ubiquinone biosynthesis but this pathway is absent in M. thermophilum AMET1 and most other methanogens, so the function of these proteins remains unknown [31]. We noticed, however, that in some archaea, including AMET1, ubiX and/or ubiD genes belong to the same gene neighbourhood with aconitase acoX, which is implicated in tricarboxylic acid cycle (TCA) [32] ( Figure S2). The TCA reactions predicted from the gene content are disconnected in M. thermophilum AMET1 [3]. Thus, this organism might possess a yet unknown modification of the TCA that could require the carboxylase activity of UbiX-UbiD. Alternatively, both AcoX and UbiX-UbiD carboxylase enzymes might comprise a new pathway not related to TCA. The intriguing possibility is that they might be involved in methanophenazine biosynthesis pathway, a specific methanogenic variant of respiratory quinone derivatives that are present in many archaea and but for most of them the biosynthesis pathway is not yet known [33,34].

The Structure of the Protein Abundance Space in Different Conditions
In order to determine the impact of different growth conditions (optimal, sub-and supraoptimal temperature, trimethylamin as an electron acceptor, FeS as growth factor, H 2 as electron donor) on the abundance of 1157 unique quality-filtered proteins obtained from the 6 sample groups, we applied PCA to the proteomic data ( Figure 3A). The first Principal Component (PC1) encompasses high positive contributions of abundances from all experimental conditions and these contributions are comparable in magnitude ( Figure 3A). PC1 captures most of the original data variance, 88%. This indicates that, overall, the abundances in all experimental conditions are strongly and positively correlated and that the condition-independent component of the abundance is the dominant trend in the data. PC2 and PC3 are largely composed of the contributions of the two suboptimal growth temperature conditions; these contributions have the same signs in PC2 and opposite signs in PC3, together accounting for further 7.5% of the data variance. The remaining components are largely composed of individual contributions from experiments with changed trophic conditions that exert uncorrelated effects on the structure of the data space.
The observed relationship between the experimental conditions allows for a straightforward interpretation of the data. Because the abundances of most proteins are highly correlated between the experiments whereas different conditions produce largely orthogonal (uncorrelated) effects on different genes (except for the suboptimal temperature conditions that seem to exert some common influence), the optimal conditions data appear to be a natural choice for the baseline abundance values, whereas the protein abundances in the 5 experiments with altered conditions demonstrate largely uncorrelated deviations from this baseline.
We also compared the 6 abundance profiles for M. thermophilum AMET1 with the single profile for HMET1 (at optimal growth conditions). As shown in Figure 3B, the HMET1 profile is sufficiently similar to all M. thermophilum AMET1 profiles to make the common signal dominant in the principal components space. However, all M. thermophilum AMET1 data points group closely together compared to the distance from the HMET1 profile, suggesting that evolutionary divergence affects protein expression to a much greater extent than the changes in growth conditions covered by our experiments.

Supra-and Suboptimal Temperature Effect on Protein Abundance
Both cold and heat shock responses typically affect expression of hundreds of genes, most of which are organism-specific [35][36][37]. In archaea, these responses have been studied in several model organisms, mostly using transcriptomics. A major common trend that is shared not only by archaea but also by bacteria and eukaryotes, involves the heat-induced over-expression of a distinct set of highly conserved heat shock proteins, most of which are molecular chaperones [38]. Compared to the heat shock response, the response to cold is apparently more species-specific [36,39].
The PCA described above shows that sub-or supraoptimal temperature contribute jointly and significantly to the variation of the protein abundances, suggesting that many changes in protein abundance in these conditions are correlated. Indeed, 138 proteins are similarly affected by supraand suboptimal temperatures (Figure 4). This result does not appear surprising given that M. thermophilum AMET1 grows equally poorly at 55 °C and 34 °C [3]. This list is dominated by the proteins that are either significantly downregulated or not detectable at all under these conditions ( Figure 4, Table 2). Specifically, we observed strong downregulation of proteins involved in pyrrolysine biosynthesis [40], a rare, non-canonical amino acid that is present in two of the most abundant proteins of M. thermophilum AMET1 (dimethylamine methyltransferase MtbB AMET1_0722/AMET1_0723 and trimethylamine methyltransferase MttB AMET1_0103/AMET1_0104) involved in methyl group reduction ( Table 2). We also detected significant decrease in the abundance of two glycosyltransferases, suggesting that change in

Supra-and Suboptimal Temperature Effect on Protein Abundance
Both cold and heat shock responses typically affect expression of hundreds of genes, most of which are organism-specific [35][36][37]. In archaea, these responses have been studied in several model organisms, mostly using transcriptomics. A major common trend that is shared not only by archaea but also by bacteria and eukaryotes, involves the heat-induced over-expression of a distinct set of highly conserved heat shock proteins, most of which are molecular chaperones [38]. Compared to the heat shock response, the response to cold is apparently more species-specific [36,39].
The PCA described above shows that sub-or supraoptimal temperature contribute jointly and significantly to the variation of the protein abundances, suggesting that many changes in protein abundance in these conditions are correlated. Indeed, 138 proteins are similarly affected by supra-and suboptimal temperatures (Figure 4). This result does not appear surprising given that M. thermophilum AMET1 grows equally poorly at 55 • C and 34 • C [3]. This list is dominated by the proteins that are either significantly downregulated or not detectable at all under these conditions ( Figure 4, Table 2). Specifically, we observed strong downregulation of proteins involved in pyrrolysine biosynthesis [40], a rare, non-canonical amino acid that is present in two of the most abundant proteins of M. thermophilum AMET1 (dimethylamine methyltransferase MtbB AMET1_0722/AMET1_0723 and trimethylamine methyltransferase MttB AMET1_0103/AMET1_0104) involved in methyl group reduction (Table 2). We also detected significant decrease in the abundance of two glycosyltransferases, suggesting that change in temperature affects surface S-layer composition. It has to be taken into account that high or low temperature at extreme medium alkalinity makes a large difference on resistance of the external glycoproteins to hydrolytic damage.  The ribosome properties also could be affected because we observed significant upregulation of ribosomal protein L31E and the converse downregulation of L21E (Table 2). These two ribosomal proteins are encoded in a minor ribosomal superoperon, in which the abundances of the remaining gene products do not change significantly (Table S1). Most of the ribosomal proteins in this superoperon are archaea-specific or shared with eukaryotes but not with bacteria. Both L31E and L21E are assembled relatively late into the eukaryotic large subunit and, unlike the core ribosomal proteins, potentially are the subject to tuning [41]. In eukaryotes L31E is involved in the interaction with ribosome-associated chaperones, which act first to help folding nascent proteins exiting the ribosome [42]; the archaeal orthologue of L31E is likely to perform a similar function, which would  # Proteins that change the abundance in the respective conditions only and not affected in other conditions are highlighted by bold and a larger font. $ "ON": the protein is not detected under the baseline condition but appears under the alternative condition; "OFF": the protein is detected under the baseline condition but not under the alternative condition; "UP": the protein is more abundant under the alternative condition compared to the baseline condition by a factor of more than 3.16; "DOWN": the protein is less abundant under the alternative condition compared to the baseline condition by a factor of more than 3.16. * The fold change is indicated in parenthesis; comma separates values of change at T55 and T34 conditions; if a protein is "ON", its nemPAI value for the respective condition is indicated; if a protein is "DOWN", its nemPAI value at optimal condition is indicated.
The ribosome properties also could be affected because we observed significant upregulation of ribosomal protein L31E and the converse downregulation of L21E (Table 2). These two ribosomal proteins are encoded in a minor ribosomal superoperon, in which the abundances of the remaining gene products do not change significantly (Table S1). Most of the ribosomal proteins in this superoperon are archaea-specific or shared with eukaryotes but not with bacteria. Both L31E and L21E are assembled relatively late into the eukaryotic large subunit and, unlike the core ribosomal proteins, potentially are the subject to tuning [41]. In eukaryotes L31E is involved in the interaction with ribosome-associated chaperones, which act first to help folding nascent proteins exiting the ribosome [42]; the archaeal orthologue of L31E is likely to perform a similar function, which would account for its heat-induced expression. From the archaeal ribosome structure, it is clear that L21E is one of the 5 ribosomal proteins that stabilize the attachment of 5S rRNA to the large subunit [43]. The role in L21E protein in temperature change response remains unknown.
In both conditions, we observed induction of several proteins from a putative integrated His2-like spindle-shaped halovirus (AMET1_0519-AMET1_0565) (Tabels 2 and S1). Induction of integrated elements under various stress conditions is a well-known phenomenon [44][45][46]. Finally, two transcriptional regulators of ArsR family were found to be significantly affected in the opposite directions ( Table 2). The aforementioned AMET1_0092, one of the most abundant proteins, is downregulated, whereas AMET1_0853 is upregulated. The AMET1_0092 protein belongs to arCOG01060 and shows a patchy distribution in archaea, although it is present in many methanogens; in contrast, AMET1_0853 belongs to arCOG01684 which is represented in most euryarchaea [47]. Both archaeal heat-responsive regulators characterized to date, Phr from Pyrococcus furiosus [48] and HSR1 from Archaeoglobus fulgidus [49], belong to arCOG01684. These transcriptional regulators have been shown to substantially induce the expression of the HSP20 family chaperone and CDC48, the AAA ATPase involved in protein folding and degradation control [48,49]. HSR1 is upregulated in heat stress in A. fulgidus [49] and AMET1_0853 is also upregulated much more in T55 conditions (Table 2). Thus, the proteomic data provide additional evidence of the involvement of these regulators in heat stress response in a distinct class of archaea. These findings are consistent with the previous reconstruction of the LACA gene set, which includes the key components of this response, namely, HSR1 family of arCOG01684, CDC48 ATPase of arCOG01308 and thermosome subunits of arCOG01257 [50] and with the observation that these protein families evolve largely at the expected evolutionary rates and are minimally prone to horizontal gene transfer [14]. Taken together, these lines of evidence suggest that at least some of the mechanisms for the regulation of heat shock protein expression are ancestral in Euryarchaeota and possibly, in all archaea.
Apart from those proteins that show similar responses to both supra-and suboptimal temperatures, many proteins respond specifically to either T55 or T34 conditions ( Figure 4, Table 2). The thermosome (HSP60 family chaperonins) heat induction has been demonstrated first in Pyrodictium occultum [51] and then in other archaea including Pyrococcus furiosus [52], Halobacterium salinarum [53] and Sulfolobus shibatae [54]. Two predicted thermosome subunits are also strongly and specifically upregulated in M. thermophilum AMET1 under T55 (Table 2). In agreement with the demonstrated role of the Phr-HSR1 family transcriptional regulators, we observed significant upregulation of CDC48 ATPase in T55, presumably due to the effect of the transcriptional regulator AMET1_0853. One of the most abundant proteins at T55 is AMET1_1336, which is currently annotated as "hypothetical protein" and can be confidently shown to contain a helix-turn-helix domain (HHpred probability 93%). This protein is below detection level under the optimal conditions but has a nemPAI value of 55 at T55 and 230 in the FeS conditions, where it is the most abundant protein. Orthologues of this protein are currently undetectable in archaea but are readily identifiable in many bacteria, including those of the genus Staphylococcus where it encoded in pathogenicity islands. Thus, this protein might play an important role in the regulation of the expression of integrated mobile elements under stress.
Compared to T55, T34 results in down-regulation of a greater number of proteins involved in house-keeping processes, such as translation, rRNA modification and replication (Tabels 2 and S1). Among the few upregulated proteins, there was no readily interpretable trend. Notably, however, AMET1_1156, a member of the HSP20 family of molecular chaperones is significantly upregulated, whereas its paralog AMET1_1255 is significantly downregulated under the T34 conditions. It remains to be determined if the transcriptional regulator AMET1_0853 is responsible for the regulation of the expression of these chaperones. Most bacteria and many mesophilic archaea encode the cold shock protein A (CspA), a RNA-binding protein known to be specifically involved in cold stress response, whereas most thermophiles lack this protein [55]. Both M. thermophilum AMET1 and HMET1 do not encode CspA. Recently, however, it has been shown that another RNA-binding protein, which contains a TRAM (named after uridine methylase TRM2 and the MiaB families of tRNA-modifying enzymes) domain, can function as a CspA analogue in thermophilic archaea [56,57]. Indeed, M. thermophilum AMET1 encodes two members of this family of TRAM domain proteins (AMET1_0306 and AMET1_1194) and both of them show moderately elevated abundance at T34 (Table S1).

Response to Growth Factor Changes
All three conditions that involved growth factor change, namely TMA, FeS and H 2 , appeared to be suboptimal and even stressful for M. thermophilum AMET1, affecting many proteins involved in central metabolic pathways. As with temperature stress, most of the affected proteins are downregulated, with a marginally greater effect in H 2 conditions compared to the FeS conditions and only few upregulated proteins were identified for each condition ( Figure 4B). No differences in the number of upregulated proteins have been observed ( Figure 4B).
The use of TMA instead of MeOH as the e-acceptor in the medium causes several predictable changes in the methyl-reduction pathways. The methylated-thiol corrinoid protein AMET1_1049 that is likely specific for methanol (MtaC) is downregulated, whereas several methylamine utilization proteins are upregulated, including one that is specific for TMA (MttC) and shows the largest fold change ( Table 3). The significantly downregulated transcriptional regulator AMET1_0372 might be responsible for the metabolic adjustment (Table 3). This protein belongs to arCOG01345 (COG03388), which is represented in the majority of Halobacteria, Methanomicrobia and Thermoplasmata but rarely in other groups of euryarchaea. Such conservation in several major archaeal lineages implies involvement of this regulator in some basic house-keeping pathways. highlighted by bold and a larger font. $ "ON": the protein is not detected under the baseline condition but appears under the alternative condition; "OFF": the protein is detected under the baseline condition but not under the alternative condition; "UP": the protein is more abundant under the alternative condition compared to the baseline condition by a factor of more than 3.16; "DOWN": the protein is less abundant under the alternative condition compared to the baseline condition by a factor of more than 3.16. * The fold change is indicated in parenthesis; if a protein is "ON", its nemPAI value for the respective condition is indicated; if a protein is "DOWN", its nemPAI value at optimal condition is indicated.
In both the FeS and H 2 conditions, key methyl-reduction pathways are negatively affected. Similar to the T55 and T34 conditions, this effect appears to be achieved by the downregulation of pyrrolysine biosynthesis (Table 3). In addition, FeS causes downregulation of both methanogenic corrinoid protein MtaC paralogs AMET1_1049 and AMET1_0748, which are likely involved in methanol reduction. Both proteins, especially AMET1_1049, are abundant at normal conditions. Both FeS and H 2 conditions, as expected, affect iron uptake and formation of Fe-S clusters, which are upregulated. However, under these two conditions, different ABC transporters are induced (Table 3). In the presence of H 2 , M. thermophilum AMET1 is expected to reduce motility because of the significant downregulation of chemotaxis proteins and FlaH protein, the KaiC-like ATPase involved in the archaellum assembly [58].

Positive and Negative Correlations between Abundances of Paralogous Proteins
Compared to its closest relatives, aerobic heterotrophic haloarchaea, anaerobic methanogenic M. thermophilum AMET1 has a relatively compact genome. Nevertheless, 178 of the 1084 arCOGs represented in this organism have 2 or more paralogs, comprising 639 out of the 1546 proteins. In an attempt to gain insight into the roles of paralogs in the cellular physiology of AMET1, the following analysis was performed. We selected 824 proteins that showed significant variation of their abundance between the 6 experimental conditions and grouped them into paralogous families (according to the arCOG classification) and singletons (Table S3). In this set, 238 proteins formed 75 paralogous families containing proteins that could be distinguished at the proteomics level, whereas the remaining 586 were singletons. For each pair of paralogs within a paralogous family, Pearson correlation coefficient between their coarse-grained abundance profiles was computed. The distribution of the correlation coefficients was compared to the distribution obtained for 10,000 pairs of randomly chosen singletons. This analysis showed that, compared to singletons, the abundance profiles of paralogs were characterized by a marked excess of highly correlated (r Pearson > 0.8) as well as anti-correlated (r Pearson < −0.8) pairs of profiles (10.5% and 12.5% vs. the expected 3.2% and 3.8%, respectively). This structure of correlations between the expression profiles of paralogs implies a dual role of gene paralogization in evolution. The positively correlated duplicated genes appear to be co-regulated and accordingly, can be inferred to function synergistically (as iso-enzymes?), providing physiological robustness and/or increased protein dosage. In contrast, the anti-correlated paralogs seem to be counter-regulated and, presumably, perform complementary functions under different physiological conditions. As a case in point for the apparent synergy between paralogs, we identified strong correlation between the abundances of predicted glycosyltransferases AMET1_0970, AMET1_0957, AMET1_0963, which all belong to arCOG01403, are encoded in the same predicted operon and are, most likely, coregulated given that they are involved in the same pathway of exopolysaccharide biosynthesis. In contrast, another member of the same paralogous family, AMET1_1410, is encoded in a different locus and is anticorrelated with the former three paralogs. Some paralogous proteins can be tightly co-regulated even though the respective genes are separated on the chromosome, for example, AMET1_0699 and AMET1_0151, homologs of periplasmic components of the ABC-type Fe 3+ -hydroxamate transport system. FtsZ family proteins, which are key components of cell division machinery, are known to form two ancestral archaeal clades FtsZ1 and FtsZ2 [59]. These two paralogs are likely subfunctionalized because only one, FtsZ1, appears to be required for cell division under normal conditions in archaea [59]. In agreement to this subfunctionalization, FtsZ1 (AMET1_0742) is highly expressed but anticorrelated with FtsZ2 (AMET1_1349) under different conditions.

Conclusions
Proteomic analysis of M. thermophilum AMET1 confirms that this extremely haloalkaliphilic and moderately thermophilic methanogen deploys a nearly complete repertoire of proteins involved in key methyl-reduction pathways as well as enzymes for biosynthesis of all amino acids, nucleotides and cofactors and transporters for all essential ions. We did not detect enzymes that could comprise any unknown major biosynthetic pathways although several gaps remain in the M. thermophilum AMET1 metabolic map. Proteins that are involved in methanol reduction are highly abundant under standard growth conditions (MeOH + formate, 48 • C) but downregulated under sub-optimal conditions. Apparently, the key mechanism of this metabolic switch is downregulation of pyrrolysine biosynthesis genes which are required for translation of methylamine methyltransferase proteins containing this non-canonical amino acid. The high level of UspA family proteins in all tested conditions is compatible with their role in the adaptation to high salt concentrations [3]. Furthermore, we predict that AMET1_0636, one of the most abundant proteins in M. thermophilum AMET1 cells in all the conditions, is an RNA chaperone. Proteins with the RNA chaperone functions are poorly characterized in general and remain virtually unknown in archaea. These observations suggest that RNA chaperone activity could be essential under high salt conditions. Considerable parts of the M. thermophilum AMET1 genome remain "silent" under optimal condition but are activated under stress, especially heat and cold shock; some of these "silent islands" correspond to integrated mobile genetic elements.
We identified several key proteins that are upregulated at suboptimal and supraoptimal temperatures and found that they belong to the same protein families that respond to these conditions in other archaea and even bacteria. These proteins include molecular chaperones (heat shock proteins), the AAA ATPase CDC48 and heat shock transcriptional regulator for the T55 conditions and a TRAM domain-containing putative RNA-binding protein for the T34 conditions. These findings are compatible with an ancient origin of the universal heat shock response, perhaps, preceding the Last Universal Cellular Ancestor, whereas the regulation of this response could have evolved as early as the common ancestor of euryarchaea.
Despite having a relatively compact genome, M. thermophilum AMET1 encodes a number of paralogs which show non-randomly correlated or anti-correlated abundances. Thus, proteomic analysis seems to allow identification of paralogs that appear to be co-regulated and function synergistically and those that are differentially regulated and could perform complementary functions.
Surprisingly, the protein abundance data for HMET1 substantially differ from those obtained for all 6 conditions tested for M. thermophilum AMET1, suggesting that, even in closely related organisms with the same type of central metabolism that share the same ecological niche (except for the salt composition resulting in a very different osmotic and pH conditions), protein abundance profiles can diverge relatively fast.
The present proteomic analysis sheds light on the environmental adaptation strategies in the recently discovered triple extremophilic methanogens, reveals common mechanisms of response to heat and cold shock in euryarchaea and yields many experimentally testable predictions of protein functions.
Supplementary Materials: The following are available online at http://www.mdpi.com/s1, Table S1: Protein abundances (nemPAI) mapped to M. thermophilum AMET1 genome and their change in different experimental conditions. Table S2: List of homologs for the putative RNA chaperon. Table S3: Comparison of paralog abundance profiles. Figure S1: Distribution of relative protein abundances (nemPAI) values. Figure S2: UbiD co-localization with predicted archaeal aconitase AcoX.