Article

Evolutionary Conserved Short Linear Motifs Provide Insights into the Cellular Response to Stress

by Sergey P. Zavadskiy 1, Denis S. Gruzdov 1, Susanna S. Sologova 1, Alexander A. Terentiev 2 and Nurbubu T. Moldogazieva 1,*
1 Nelyubin Institute of Pharmacy, I.M. Sechenov First Moscow State Medical University (Sechenov University), 119991 Moscow, Russia
2 Department of Biochemistry and Molecular Biology, N.I. Pirogov Russian National Research Medical University, 117997 Moscow, Russia
* Author to whom correspondence should be addressed.
Antioxidants 2023, 12(1), 96; https://doi.org/10.3390/antiox12010096
Submission received: 24 August 2022 / Revised: 22 November 2022 / Accepted: 22 December 2022 / Published: 30 December 2022

Highlights

  • Hundreds of short linear motifs (SLiMs) that exhibit a high degree of sequence similarity to two biologically active sites of human alpha-fetoprotein (AFP) were identified.
  • The SLiMs of interest are ubiquitously distributed and found in proteins of both eukaryotic and prokaryotic species.
  • Proteins retrieved by sequence alignment belonged to various functional classes and are directly or indirectly involved in the cellular response to stress.
  • Our findings provide insights into the common functions of evolutionarily conserved SLiMs and the putative involvement of AFP in the response to external and internal stimuli in cellular adaptation during embryonic development and cancer.

Abstract

Short linear motifs (SLiMs) are evolutionarily conserved functional modules of proteins composed of 3 to 10 residues and involved in multiple cellular functions. Here, we performed a search for SLiMs that exhibit sequence similarity to two segments of alpha-fetoprotein (AFP), a major mammalian embryonic and cancer-associated protein. Biological activities of the peptides, LDSYQCT (AFP14–20) and EMTPVNPGV (GIP-9), have been previously confirmed under in vitro and in vivo conditions. In our study, we retrieved a vast array of proteins that contain SLiMs of interest from both prokaryotic and eukaryotic species, including viruses, bacteria, archaea, invertebrates, and vertebrates. Comprehensive Gene Ontology enrichment analysis showed that proteins from multiple functional classes, including enzymes, transcription factors, ribosomal proteins, and proteins involved in signaling, cell cycle, and quality control, were implicated in cellular adaptation to environmental stress conditions. These include the response to oxidative and metabolic stress, hypoxia, DNA and RNA damage, protein degradation, as well as antimicrobial, antiviral, and immune responses. Thus, our data provide insights into the common functions of SLiMs that are evolutionarily conserved across all taxonomic categories. These SLiMs can serve as important players in cellular adaptation to stress, which is crucial for cell functioning.

1. Introduction

Short linear motifs (SLiMs) are evolutionarily conserved functional modules of proteins: amino acid stretches of 3 to 10 residues involved in recognition and targeting activities [1]. SLiMs function through transient interactions with a variety of binding partners, mostly globular domains of other proteins. They are thereby involved in protein–protein interactions, which underlie numerous cellular processes including signal transduction, metabolism, electron transfer, cell cycle, and membrane transport [2]. It is now recognized that similar SLiMs can be found in numerous non-homologous, unrelated proteins recruited to common regulatory functions [3]. Their evolutionary plasticity has facilitated a rapid growth in their use and resulted in their ubiquitous distribution across a variety of organisms.
A growing body of evidence indicates that, over long evolutionary time, short amino acid segments undergo mutations and multiple events of reuse in a variety of non-homologous proteins [4]. Evolutionary events such as duplication, fusion, and recombination have been suggested to provide a mechanism for the reuse and successful incorporation of such stretches into multiple unrelated proteins [5,6]. Therefore, such reuse of pre-existing sequences is likely to offer an evolutionary advantage for functional proteins.
Presumably, ancient proteins were quite short molecules that evolved into contemporary large, globular, functional proteins through the incorporation of short peptide stretches [7]. Indeed, Eck and Dayhoff described the incorporation of a prototype small iron–sulfur cluster-containing protein, ferredoxin, which is involved in electron transfer and redox regulation, into metabolic proteins [8]. This could have happened at the very early stages of biochemical evolution through the doubling of a prototype that was enriched in Ala, Asp, Pro, Ser, and Gly residues. The authors have identified two major functional types of primordial peptides composed of 9 to 38 residues in length. The first type represented nucleic acid-binding and ribosomal peptides, while the other type comprised catalytic peptides that can coordinate metal ions, iron–sulfur clusters, nucleotides, and nucleotide-derived cofactors [9]. Therefore, identifying evolutionarily conserved SLiMs with sequence similarity to the prototype peptides can be indicative of common ancestry and functional relationships [10,11].
Earlier, we identified a variety of human proteins that contain SLiMs with sequence similarity to two functionally important segments of human alpha-fetoprotein (AFP) [12]. These SLiMs have been proposed to orchestrate the functioning of multiple non-homologous human proteins during embryonic development, redox regulation, and cancer progression. AFP is a major mammalian development- and cancer-associated protein that in humans is composed of 609 amino acids organized in three structural domains (I, II, and III) [13]. Experimental data have shown that human and rodent AFPs are capable of binding metal ions and various hydrophobic ligands.
Multiple linear segments with putative and experimentally confirmed functions have been identified, enabling functional and structural mapping of human AFP [14]. The 34 amino acid-long stretch located in domain III, which encompasses residues 464 to 497 of full-length human AFP, has been designated as growth-inhibitory peptide (GIP) and chemically synthesized, purified, and characterized [15,16]. GIP was found to inhibit mouse uterine cell proliferation and cancer growth in an MCF-7 cell line model [17]. Its C-terminal segment, EMTPVNPGV (GIP-9), which encompasses residues 489 to 497, was found to be one of the most biologically active segments of GIP [18]. Moreover, human AFP and derived peptides have been experimentally shown to reduce the fetotoxicity of high doses of insulin and estrogens in murine and chick models [19].
Another AFP-derived peptide, LDSYQCT, is located in domain I and encompasses residues 32 to 38 of the full-length protein. In the mature protein this segment encompasses residues 14 to 20 and, consequently, it has been designated as AFP14–20 [20]. This heptapeptide has been shown to share a high degree of sequence similarity with a part of the receptor-binding domain of epidermal growth factor (EGF). AFP14–20 has also been chemically synthesized and has demonstrated immunomodulatory effects in cultures of human phytohemagglutinin (PHA)-activated lymphocytes [21]. Multiple analogs and fragments of this peptide have been obtained, and they display biological activity that correlates with amino acid composition, which influences conformational changes in the protein backbone [22].
Here, we used a SLiM search approach based on local sequence alignment algorithms to retrieve proteins that contain short GIP-9-like and AFP14–20-like motifs from protein primary structure databases. We identified a vast array of proteins from all taxonomies, including bacteria, viruses, archaea, and eukaryotes, that contain both SLiM types of interest. Amino acid composition analyses of all retrieved SLiMs revealed a high degree of sequence conservation and hotspot residues. Furthermore, we performed comprehensive Gene Ontology (GO) functional enrichment analysis and found that both motif types can be identified in proteins involved directly or indirectly in the cellular response to biotic and abiotic stress. Our data suggest that these conserved motifs underlie the involvement of a vast array of proteins in the cellular response to stress conditions. They also suggest that AFP can be involved in cellular adaptation to oxidative, genotoxic, and metabolic stress during embryonic development and cancer growth.

2. Materials and Methods

2.1. Mapping of AFP14–20-like and GIP-9-like Peptides

Both biologically active peptides, AFP14–20 and GIP-9, were mapped onto the three-dimensional (3D) structure of human AFP in order to assess their structural features. For this purpose, we used the 3D structure of human AFP that we previously constructed by homology-based modelling with Schrödinger software (release 2018-2) [23,24]. The PyMOL molecular graphics system, version 2.5, was used for structure visualization (https://pymol.org/2/ (accessed on 21 January 2022)) [25].

2.2. Search for Short Linear Motifs

We carried out local sequence alignment using both AFP-derived peptides, LDSYQCT and EMTPVNPGV, as queries for the sequence similarity search. The FastA suite [26] hosted by the European Bioinformatics Institute of the European Molecular Biology Laboratory (EMBL-EBI) (https://www.ebi.ac.uk/Tools/sss/fasta/ (accessed on 8 January 2022)) [27] was used. The alignment was performed against the UniProtKB protein knowledgebase (https://www.uniprot.org/ (accessed on 8 January 2022)), covering both the UniProtKB/Swiss-Prot (manually annotated and reviewed) and UniProtKB/TrEMBL (automatically annotated) sections [28]. No restriction on taxonomic categories was applied. The GLSEARCH algorithm (version 36.3.8h) provided the best search for sequences that match the query peptides. Default parameters were used: BLOSUM50 matrix, gap open -10, gap extension -2, and expectation value (E-value) upper limit 10 and lower limit 0, returning up to 500 alignments.

2.3. Amino Acid Conservation Analysis

SLiMs obtained with the FastA GLSEARCH algorithm were further subjected to amino acid substitution analysis. The frequency of each residue at each position of the SLiMs was calculated as follows: N = a/b × 100%. Here, a is the number of occurrences of a given residue at a given position and b is the total number of SLiMs. All SLiMs were taken into account, including those aligned to AFP itself from all species and those from uncharacterized and hypothetical proteins. Graphical representation of the amino acid conservation was performed with the WebLogo3 tool (http://weblogo.threeplusone.com/create.cgi (accessed on 5 February 2022)) [29].
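
The per-position calculation N = a/b × 100% can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function name `position_frequencies` and the four-motif toy list are hypothetical, and the sketch assumes that all motifs are ungapped and of equal length.

```python
from collections import Counter


def position_frequencies(motifs):
    """Return, for each position, a dict mapping residue -> N = a / b * 100.

    a is the count of a residue at that position and b is the number of
    motifs. Assumes ungapped motifs of equal length (an assumption of
    this sketch, not a statement about the original analysis).
    """
    if not motifs:
        raise ValueError("at least one motif is required")
    length = len(motifs[0])
    if any(len(m) != length for m in motifs):
        raise ValueError("all motifs must have the same length")

    total = len(motifs)  # b in the formula
    table = []
    for pos in range(length):
        counts = Counter(m[pos] for m in motifs)  # a for every residue
        table.append({res: 100.0 * n / total for res, n in counts.items()})
    return table


# Hypothetical toy data: variants of the LDSYQCT query, not retrieved hits.
toy_motifs = ["LDSYQCT", "MDSFQCT", "LNTYKCS", "LDSYQCT"]
freqs = position_frequencies(toy_motifs)
print(freqs[0])  # position 1: leucine in 3 of 4 motifs
print(freqs[5])  # position 6: cysteine in all motifs
```

Positions are zero-based in the code, so `freqs[5]` corresponds to position 6 in the text.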

2.4. Functional Classification of Retrieved Proteins

All proteins extracted from both the Swiss-Prot and TrEMBL sections of the UniProtKB database were subjected to GO term-based functional classification [30] in both the molecular function and biological process categories (http://geneontology.org/ (accessed on 14 May 2022)). These included all retrieved proteins from both prokaryotic and eukaryotic taxonomies. Since TrEMBL is a large section that contains automatically annotated proteins, cut-offs of E-value 0.1 and identity degree of 57.1% for AFP14–20-like motifs and E-value 0.1 and identity degree of 55.6% for GIP-9-like motifs were applied to alignments against this section of UniProtKB. In addition to UniProtKB, the InterPro protein family resource (https://www.ebi.ac.uk/interpro/ (accessed on 7 June 2022)) was used for functional annotation of the retrieved proteins [31].
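
The identity cut-offs correspond arithmetically to 4 identical residues out of 7 (57.1%) for the heptapeptide query and 5 out of 9 (55.6%) for the nonapeptide query. The sketch below shows how such a filter could be applied. It rests on several assumptions that the text does not state: hits are kept when the E-value is at most 0.1 and the identity is at least the cut-off, the alignment is ungapped and spans the full query, and identity is rounded to one decimal place. The function names and example hit strings are hypothetical.

```python
def percent_identity(query, hit):
    """Percentage of identical residues in an ungapped, full-length alignment."""
    if len(query) != len(hit):
        raise ValueError("query and hit must have the same length")
    matches = sum(q == h for q, h in zip(query, hit))
    return 100.0 * matches / len(query)


def passes_cutoff(query, hit, e_value, min_identity, max_e_value=0.1):
    """Keep a hit if E-value <= max_e_value and identity >= min_identity.

    Identity is rounded to one decimal so that 5/9 (55.56%) meets the
    reported 55.6% cut-off. The direction of both comparisons is an
    assumption of this sketch.
    """
    identity = round(percent_identity(query, hit), 1)
    return e_value <= max_e_value and identity >= min_identity


# Hypothetical hits, used only to exercise the filter.
print(passes_cutoff("LDSYQCT", "MNSYQCA", e_value=0.05, min_identity=57.1))      # 4 of 7 identical
print(passes_cutoff("EMTPVNPGV", "EMTPVAAAA", e_value=0.05, min_identity=55.6))  # 5 of 9 identical
print(passes_cutoff("EMTPVNPGV", "EMTPAAAAA", e_value=0.05, min_identity=55.6))  # 4 of 9 identical
```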

2.5. Gene Set Enrichment Analysis

For further gene set enrichment analysis, two lists of genes coding for the retrieved proteins containing either AFP14–20-like or GIP-9-like motifs were manually created. UniProtKB IDs were used and, when needed, converted into Ensembl gene IDs and STRING-db protein IDs. These datasets were used as backgrounds for GO enrichment analysis. The created lists were first uploaded into the PANTHER classification system (http://pantherdb.org/ (accessed on 17 May 2022)) of the Gene Ontology resource [32]. The R/Bioconductor packages in the graphical ShinyGO v0.75 suite (http://bioinformatics.sdstate.edu/go/ (accessed on 24 May 2022)) were used [33] for further functional enrichment analysis. Characteristics of each gene list were compared with those of the other genes of the whole genome (background), and Student’s t-test was applied. Additionally, the gProfiler functional enrichment analysis resource [34] (https://biit.cs.ut.ee/gprofiler/ (accessed on 28 May 2022)) was used. Here, a gSCS statistical threshold of 0.2 was applied, and ENTREZGENE_ACC numerical IDs were used to extract all known gene sets.

3. Results

3.1. Biologically Active Peptides Are Located on AFP Surface

Figure 1 depicts the overall U-shaped architecture and 3D organization of human AFP, whose secondary structure elements are alpha-helices and loops, with no beta-strands. Visualization of the obtained structure showed that the two distinct functionally important segments of human AFP with experimentally confirmed biological activities, AFP14–20 and GIP-9, are located on the protein surface and are therefore accessible to the solvent and/or to protein binding.
The first peptide segment is located in domain I, close to the N-terminus, and is arranged in an α-helical conformation. The second segment encompasses the C-terminal part of the GIP-34 peptide, which occupies the longest α-helical stretch in domain III. Only a small part of the GIP-9 peptide is arranged in an α-helix, while the remaining part represents a disordered region, and this may play a role in its functionality.

3.2. Proteins Containing SLiMs of Interest Are Biologically Diverse

Local sequence alignment retrieved 464 proteins from the Swiss-Prot section of the UniProtKB database and 500 proteins from its TrEMBL section (with maximum E-value 6.9 × 10⁻⁴) that contain SLiMs with sequence similarity to the LDSYQCT peptide. They covered proteins from a wide range of taxonomic categories and included uncharacterized and hypothetical proteins. Table 1 shows the most representative proteins from various species aligned with the LDSYQCT sequence, together with the alignment E-values: the lower the E-value, the higher the statistical significance of the alignment. In the alignment column, the upper sequence is the query, whereas the lower sequence is from the retrieved protein. Proteins that contain AFP14–20-like motifs play various biological roles, including transcriptional and translational regulation, oxidoreductase and electron transfer activity, protein quality control, host–pathogen interaction, and biotic and abiotic stress response, and they include components of ribosomes and of the toxin–antitoxin system.
SLiMs with sequence similarity to the EMTPVNPGV nonapeptide were identified in 258 proteins from the Swiss-Prot section and 500 proteins from the TrEMBL section (with maximum E-value 4.3 × 10⁻²) of the UniProtKB database. These proteins covered all taxonomic categories and included AFP from different biological species as well as uncharacterized and hypothetical proteins. Table 2 contains the most representative proteins aligned to the GIP-9 segment and the alignment E-values. Proteins that contain GIP-9-like motifs also have a wide range of biological roles, including involvement in cell signaling, transcriptional regulation, metabolic processes, response to chemicals, immune response, and electron transfer.

3.3. SLiMs of Interest Are Enriched in Conserved Residues

After the exclusion of the same proteins from different taxonomies, 199 AFP14–20-like and 280 GIP-9-like unique motifs were identified. We then assessed amino acid frequencies at each position of the unique SLiMs. The most conserved residue in AFP14–20-like motifs was cysteine (C), which accounted for 100% of residues at position 6. The second-most conserved residue was aspartic acid (D), which constituted 82.9% of all residues at position 2 and was replaced predominantly by the physicochemically similar asparagine (N) and glutamate (E). Two aromatic amino acids, tyrosine (Y) and phenylalanine (F), together comprised 83.9% of all residues at position 4. At position 1, 57.3% of residues were leucine (L), which can be substituted by other hydrophobic residues: methionine (M), isoleucine (I), and valine (V). Serine (S) constituted 54.8% of all residues at position 3 and was replaced mostly by hydrophilic amino acids: T, K, and E. Hydroxyl group-containing residues, T and S, constituted 65.3% of all residues at position 7. Glutamine (Q) comprised 67.8% of all residues at position 5 and was replaced by charged and hydrophilic amino acids: lysine (K), aspartate (D), and arginine (R). Applying a 5% threshold to these frequencies gave the following notation for the consensus sequence: L[MIV]D[NE]S[TKE]Y[F]Q[KDR]CT[S]. Figure 2A graphically depicts the frequency of each amino acid at every position of the retrieved AFP14–20-like motifs.
As for GIP-9-like motifs, the three most conserved positions were 4, 7, and 8. Positions 4 and 7 were occupied by proline (P), which comprised 96% and 98% of all residues, respectively. The third-most conserved residue was glycine (G), which comprised 92% of all residues at position 8. The least conserved position was 2, where methionine (35.4%) was the most frequent residue and could be replaced by other hydrophobic amino acids: L, I, and V. At position 1, glutamic acid constituted 60.4% of all residues and was replaced most frequently by D, Q, and K, which have similar physicochemical properties. Threonine (T) constituted 51.8% of all residues at position 3 and was replaced most frequently by serine (S), a physicochemically similar residue (12.9%). Position 5 was occupied mostly by large hydrophobic residues: V (55.0%), I (20.4%), and L (7.9%). At position 6, asparagine (N) comprised 56.8% of all residues, and the main replacements were D and S, while position 9 was occupied by large hydrophobic amino acids: V (47.1%), I (18.2%), and L (12.1%). On the basis of these calculations, the following notation for the consensus sequence was identified: E[DQ]M[LIV]T[S]PV[LI]N[DS]PGV[LI]. Figure 2B graphically depicts the frequency of each amino acid at every position of the identified GIP-9-like motifs.
Therefore, both SLiM types of interest contain a large proportion of conserved amino acid residues indicating that they are evolutionarily preserved through all biological species starting from bacteria and viruses to higher eukaryotes. Interestingly, the consensus sequences were enriched in D, S, and P residues found in prototype peptides, which have been proposed to give rise to modern proteins.
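
The consensus notation used above (the most frequent residue at each position, followed in brackets by the alternatives that pass a frequency threshold) can be derived mechanically from a motif list. The following is a minimal sketch under stated assumptions: the function name `consensus_notation` is hypothetical, the toy motifs are illustrative rather than retrieved hits, all motifs are assumed to be ungapped and of equal length, and "at or above 5%" is an assumed reading of the threshold.

```python
from collections import Counter


def consensus_notation(motifs, threshold=5.0):
    """Build a consensus string such as 'L[M]D[N]...' from aligned motifs.

    For each position the most frequent residue is written first;
    other residues whose frequency (in %) is at or above `threshold`
    follow in square brackets, ordered by decreasing frequency.
    """
    total = len(motifs)
    parts = []
    for column in zip(*motifs):  # one tuple of residues per position
        counts = Counter(column)
        # Sort by descending count, then alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        top = ranked[0][0]
        alternatives = "".join(
            res for res, n in ranked[1:] if 100.0 * n / total >= threshold
        )
        parts.append(top + (f"[{alternatives}]" if alternatives else ""))
    return "".join(parts)


# Hypothetical toy data: variants of the LDSYQCT query.
toy_motifs = ["LDSYQCT", "MDSFQCT", "LNTYKCS", "LDSYQCT"]
print(consensus_notation(toy_motifs))                  # L[M]D[N]S[T]Y[F]Q[K]CT[S]
print(consensus_notation(toy_motifs, threshold=30.0))  # LDSYQCT
```

With only four toy motifs every substitution has a frequency of 25%, so all of them pass the 5% threshold; on a set of several hundred motifs the same threshold would discard rare substitutions.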

3.4. Retrieved Genes Ubiquitously Exist

Furthermore, we classified the unique genes that code for proteins containing both SLiM types of interest by taxonomic category. We found that the retrieved genes are widely distributed among all taxonomic groups, including bacteria, viruses, archaea, and various invertebrate and vertebrate species, including mammals and primates. Figure 3A,B depicts the taxonomic distribution of unique genes coding for proteins with AFP14–20-like and GIP-9-like motifs, respectively.
Up to 64% and 74% of AFP14–20-like and GIP-9-like motifs, respectively, were found in bacterial proteins, while about 10% and 16% of motifs, respectively, were identified in mammalian proteins. Some retrieved genes had orthologs in multiple biological species; each such gene was therefore treated as a single unique gene. For example, both SLiMs of interest were found in cytochrome c biogenesis protein CcmE and malate dehydrogenase from a wide range of bacterial species, while transmembrane protein TMEM258 was found in various eukaryotic species (see Table 1 and Table 2).

3.5. Retrieved Proteins Are Functionally Diverse

We used GO term annotations provided in the UniProtKB and InterPro databases to classify all retrieved proteins according to molecular function and biological process categories (Figure 4). The total number of terms can differ from the total number of proteins aligned to each SLiM type of interest because (i) more than one GO term may be assigned to a unique protein and (ii) the same unique protein can belong to a variety of taxonomic categories. As shown in Figure 4A, metal ion binding, catalytic activity, and transferase activity were the predominant molecular function terms for AFP14–20-like motif-containing proteins. Additionally, there were proteins with oxidoreductase/electron transfer, DNA/RNA-binding, transcription factor, antimicrobial defense, and immune response activities. The largest portion of proteins aligned to the GIP-9 segment of human AFP belonged to oxidoreductases and to metal ion/iron-sulfur cluster binding, heme binding, and DNA binding proteins (Figure 4B).
Categorization of the retrieved proteins according to biological process terms showed that the majority of AFP14–20-like motif-containing proteins are involved in transcriptional regulation, oxidative stress response, RNA processing, and host–pathogen defense response (Figure 4C). GIP-9-like motif-containing proteins were involved in aerobic respiration/electron transfer, response to environmental stress, metabolic processes, regulation of gene expression, translation, DNA repair, and protein quality control (Figure 4D).
Additionally, prominent roles belonged to proteins involved in response to the pathogen and immune response. There were apparent relationships between molecular function and biological process terms. For example, DNA binding and metal ion binding activities can be assigned to transcriptional regulation, while electron transfer/oxidoreductase activities and, partly, metal ion binding activity underlie cell response to oxidative stress and antimicrobial, antifungal, and antiviral defense responses.

3.6. Prokaryotic Genes Are Required for Stress Tolerance

To identify the most statistically significant GO categories, we carried out gene set enrichment analysis using the ShinyGO v0.75 and gProfiler suites. In the GO classification system, 389 unique genes encoding AFP14–20-like motif-containing proteins and 273 unique genes encoding GIP-9-like motif-containing proteins were mapped to Ensembl gene IDs. Figure 5 depicts a typical GO term-based functional categorization of genes encoding AFP14–20-like motif-containing proteins. From our gene set list, up to 41 bacterial genes were mapped to Ensembl genome IDs.
As shown in Figure 5A, at an FDR cutoff of 0.2, bacterial genes associated with nucleotide/nucleic acid binding, ion/metal ion binding, and ATP binding activities were retrieved at high statistical significance (low p-value) in the molecular function categories. Not surprisingly, biological processes involved in metabolism and in the nucleotide/nucleic acid and amino acid biosynthesis required for bacterial reproduction were overrepresented (Figure 5B). However, when all available gene sets were retrieved, oxidoreductase and chaperone activity as well as chemical stimuli/stress response and SOS response activities were among the statistically significant categories identified for bacterial proteins (Figure 5C).
These data were confirmed by functional enrichment analysis of genes encoding GIP-9-like motif-containing proteins. From our gene list, up to 29 unique genes were mapped to Ensembl genome IDs in each bacterial taxonomy. Figure 6A–C depicts the enrichment analyses with all available gene sets for three representative bacterial species. As shown in Figure 6, pathways associated with metabolic processes, nucleic acid and protein biosynthesis, translation, and DNA repair are the most statistically significant. However, pathways that underlie the cellular response to biotic and abiotic stress and chemical stimuli were also identified. They included the SOS response and the oxidative stress response, which involves oxidoreductase/electron transfer enzymes including those containing Fe-S clusters.
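
An FDR cutoff such as the 0.2 used above is typically applied to p-values that have been adjusted for multiple testing. The text does not state which adjustment procedure the tools applied, so the sketch below uses the Benjamini-Hochberg procedure purely as an assumed, commonly used example. The function name and the p-values are hypothetical and do not come from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.2]
print(adj)
print(significant)  # indices of terms passing an FDR cutoff of 0.2
```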

3.7. Eukaryotic Genes Are Responsible for Stress and Defense Response

Figure 7 depicts the Manhattan plots that illustrate GO terms for human (A and C) and A. thaliana (B and D) gene sets coding for (A and B) AFP14–20-like motif-containing and (C and D) GIP-9-like motif-containing proteins.
In humans, up to 54 unique genes encoding AFP14–20-like motif-containing proteins were mapped to Ensembl gene IDs. In other mammals, the number of corresponding unique genes ranged from 34 to 53 and from 14 to 20, respectively.
In GO-based molecular function terms, H. sapiens protein/receptor binding, ion/metal ion-binding and calcium-binding, DNA and heterocyclic compound (nucleotide)-binding as well as dioxygenase and oxidoreductase activities were among the most significant categories (Figure 8A). As expected, in GO biological process terms, biosynthetic and developmental processes as well as cell communication and cell signaling pathways were identified as the most significant functional terms (Figure 8B).
Unexpectedly, response to stress and chemical stimulus and DNA damage were also among the statistically significant biological processes. This picture was typical for various animal species, where a wide range of stress response proteins including oxidoreductases, ubiquitin activating enzymes, channel activity regulators, and cell signaling proteins were retrieved. In plants, up to 47 unique genes encoding AFP14–20-like motif-containing proteins were mapped to Ensembl gene IDs. These included proteins important for cell division, such as those involved in RNA binding, nucleotide biosynthesis, and translation. Interestingly, proteins implicated in the stress/defense response, such as oxidoreductases and proteins involved in the killing of other organisms, were also among the significant ones in plants (Figure 8C,D).
As for GIP-9-like motif-containing proteins, fewer statistically significant GO terms were identified (Figure 9). Up to 21 unique genes in mammals and up to 15 unique genes in plants were mapped to Ensembl gene IDs.
In H. sapiens, protein and nucleotide binding activities along with chaperone and ion/metal ion binding activities were among the overrepresented GO molecular function terms (Figure 9A). Among biological process terms, immune and defense response as well as autophagy and apoptosis (Figure 9B) were significant for human genes. In plants, NADPH-dependent oxidoreductase, ion channel, and RNA/DNA binding activities, which underlie the response to external stimulus, protein localization, and cellular metabolism, were identified (Figure 9C,D).

4. Discussion

SLiMs are often found in the rapidly evolving intrinsically disordered regions of proteins, and motif acquisition can proceed through convergent evolution [92]. Frequent mutations, small size, and low complexity make it difficult to identify motifs and to study their functions. Here, we used bioinformatics and GO enrichment analyses to search for SLiMs with sequence similarity to two AFP-derived sequences, LDSYQCT (AFP14–20) and EMTPVNPGV (GIP-9). We identified a vast array of similar motifs across all taxonomic categories including bacteria, viruses, archaea, and various eukaryotic species.
One of the most prominent molecular functions of human and rodent AFPs is metal ion binding [93], which is shared by the majority of the proteins retrieved in our study. This capability underlies the involvement of proteins in various cellular processes including metabolism, transcriptional regulation, and redox regulation. Most prokaryotic proteins were, unsurprisingly, involved in the nucleotide, nucleic acid, amino acid, and protein biosynthesis necessary for reproduction. However, the overwhelming majority of both prokaryotic and eukaryotic proteins, including enzymes, transcription factors, quality control proteins, and ribosomal proteins, were involved in cellular adaptation to environmental changes and various stress conditions. Our data suggest that AFP can use the SLiMs of interest to provide cellular adaptation to stress conditions during embryonic development and cancer growth.

4.1. AFP14–20-like Motif-Containing Proteins

We found that most bacterial and archaeal proteins containing short segments aligned with AFP14–20 at high statistical significance (E-value of ~10⁻⁵–10⁻⁴) are involved in maintaining cellular redox balance (Table 1). For example, iron–sulfur (Fe-S) cluster proteins such as rubredoxins, ferredoxins, anaredoxin, and desulfoferrodoxin exert antioxidant activity and play important roles in bacterial adaptation to environmental changes [44,46]. These proteins have a unique structural characteristic: four Cys residues surround the Fe-S clusters involved in electron transfer from cognate reductases to cytochrome P-450s, which helps maintain pathogen viability [35]. Fe-S clusters are found in many enzymes central to metabolic processes such as nitrogen fixation, respiration, and DNA processing and repair. Additionally, enzymes with flavin oxidoreductase activity were retrieved, such as choline dehydrogenase (Cdh), which oxidizes choline to betaine aldehyde for its further oxidation to betaine. Betaine is a source of CH3 groups for the biosynthesis of nucleotides, amino acids, and other compounds, and provides adaptation of phototrophic bacteria to osmotic stress [53]. Choline oxidation is associated with electron transfer to the electron transport chain (ETC) and ROS generation [38]. Consistently, NADH-quinone oxidoreductase (ETC complex I), which is one of the major sites of ROS production in many bacterial strains [65], was also aligned to the AFP14–20 segment. Additionally, a variety of modulators of the environmental stress response were aligned to the AFP14–20 segment. Among them were histidine kinase response regulator protein, stress response protein YhaX [43,59], Sel1 domain-containing protein [42], and RagB/SusD family nutrient uptake outer membrane protein [39], which regulate the host cell response to pathogens.
Moreover, the bacterial enzymes 8-oxo-dGTP diphosphatase MutT and dITP/XTP pyrophosphatase, which are involved in the SOS response through the removal of oxidatively damaged and non-canonical nucleotides, were retrieved [41].
Transcription factors that regulate gene expression in bacteria, archaea, and viruses for their adaptation to environmental stress conditions were also among the retrieved proteins. They included helix-turn-helix domain-containing AraC family and TetR transcriptional regulators, which typically bind to target DNA and regulate pathogenic properties by sensing small molecule inducers such as urea, bicarbonate, and glycerol [49,52]. Bacterial ribosomal enzymes that catalyze posttranslational modification of proteins involved in translation were also aligned to the AFP14–20 segment. An example is rimI, which encodes the ribosomal protein S18-alanine acetyltransferase [36]. Among viral proteins, we found proteins involved in host–pathogen interaction that promote nucleic acid replication and modulate the host adaptive immune response. They included host range factor 1 [48] and infected cell protein 47 (ICP47) [60], which function under changing redox conditions. For example, ICP47 directly binds the transporter associated with antigen processing (TAP), leading to the occurrence of empty MHC-I, which is under redox control through disulfide bond oxidation/reduction [94].
In plants, the Rho family of Ras-related GTP-binding (Rop) proteins work as signaling switches that control growth, development, and apoptosis in response to various environmental stimuli [54]. Proteins containing the highly conserved catalytic PRONE (plant-specific Rop nucleotide exchanger) domain, which has strong substrate specificity for members of the Rop family, were aligned to the AFP14–20 segment. Additionally, developmental proteins with antimicrobial activity such as gibberellic acid-stimulated Arabidopsis (GASA) proteins [66] were retrieved. Also retrieved, though at low significance, was the acidic leucine-rich nuclear phosphoprotein 32-related protein 2, which is involved in histone chaperone activity and in the integration of the environmental stress response in plants, and in immunomodulation and tumor progression in humans [95].
In animals, a variety of small proteins with ion channel regulator and toxin activity such as U-scoloptoxin [55], auger peptide hheTx2 [58], leiurutoxin-3 [62], and others produced by various mollusks, snakes, and insects were aligned with the AFP14–20 segment. Additionally, Cys-rich and metal ion-binding small proteins including defensins, ranatuerins, and brevinins figured prominently in the alignment. These host defense proteins have key roles in the oxidative stress response and immune response and exhibit antimicrobial, antifungal, and antiviral activities [61]. Defensins have been implicated in tumor growth, exhibiting both tumor-suppressive and tumor-promoting effects [56]. In human carcinomas, defensins exert antitumor effects through induction of apoptosis, inhibition of angiogenesis, and immunomodulation.
Furthermore, transcription regulators that are involved in the response to changes in microenvironmental conditions were retrieved. They include CCHC-type domain-containing protein, C2H2-type zinc finger protein 142, nucleus accumbens-associated protein 1 (NAC1), and retinoic acid receptor RXR-gamma-B, which are involved in various diseases including cancer and neurodevelopmental disorders [37,40,60,63]. Additionally, the retrieval of a calcium-binding EGF-like domain protein is notable because the AFP14–20 motif is a part of EGF and EGF-like domains involved in various signaling pathways [51]. Among them are JAG1/Notch signaling cascades, which activate a number of oncogenic factors that regulate cell proliferation, metastasis, angiogenesis, and drug resistance [47]. Furthermore, denticleless protein homolog (DTL) has been associated with the response to DNA damage and the immunosuppressive tumor microenvironment [45].
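To make the notion of an "AFP14–20-like" segment more concrete, the following minimal sketch slides the query heptapeptide LDSYQCT along a target sequence and reports windows that share a minimum number of identical residues. It is an illustration only and not the procedure used in this study, which relied on a database sequence similarity search with E-value statistics. The function name `scan_for_motif`, the `Hit` record, the identity threshold, and the toy target sequence are all assumptions introduced here for illustration; the toy sequence is not a real protein.

```python
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    start: int       # 1-based start position of the window in the target
    segment: str     # target residues aligned to the query (no gaps)
    identities: int  # number of positions identical to the query


def scan_for_motif(query: str, target: str, min_identities: int) -> Iterator[Hit]:
    """Ungapped sliding-window scan: yield every window of the target that
    shares at least `min_identities` identical residues with the query."""
    k = len(query)
    for i in range(len(target) - k + 1):
        window = target[i:i + k]
        identities = sum(q == t for q, t in zip(query, window))
        if identities >= min_identities:
            yield Hit(i + 1, window, identities)


AFP14_20 = "LDSYQCT"  # query peptide taken from the article

# Hypothetical target sequence, invented purely for illustration.
toy_target = "MKTLDSYECTGGAPLDAYQCSKK"

hits = list(scan_for_motif(AFP14_20, toy_target, min_identities=5))
for hit in hits:
    print(hit.start, hit.segment, f"{hit.identities}/{len(AFP14_20)}")
# 4 LDSYECT 6/7
# 15 LDAYQCS 5/7
```

A simple identity count of this kind ignores conservative substitutions, gaps, and database size, which is why a real search reports substitution-matrix scores and E-values rather than raw identities.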

4.2. GIP-9-like Motif-Containing Proteins

As shown in Table 2, prokaryotic proteins containing sequences aligned with the GIP-9 segment at high significance are primarily involved in maintaining genomic stability, transcriptional regulation, translation, and cell division. These included bacterial 2’-deoxycytidine-5’-triphosphate deaminase [67], forkhead-associated (FHA) domain-containing protein [68], chromosome partitioning protein ParA [69], AcrR family transcriptional regulator [70], dual specificity phosphatase [76], and glutamyl-Q tRNA(Asp) synthetase [79], as well as 30S and 50S ribosomal proteins [83]. They are involved in protection from DNA damage, genotoxicity, injury, osmotic stress, etc.
Additionally, thermonuclease family, ribonuclease HII, and PINc domain-containing proteins, which are involved in DNA and RNA degradation under stress conditions as a bacterial defense mechanism [75,88], were among the retrieved proteins. Furthermore, components of bacterial toxin–antitoxin systems such as addiction module HigA family antidote, which promote adaptation and persistence by modulating bacterial growth in response to stress [86], were also retrieved.
Qualitatively, most bacterial proteins, including the cupredoxin domain-containing protein, play pivotal roles in many metabolic pathways and in the regulation of redox homeostasis, both of which are crucial for pathogen survival [82]. NADPH-dependent oxidoreductases such as malate dehydrogenase and short-chain dehydrogenase (SDR) family oxidoreductase are among the enzymes that undergo a thiol-based redox switch and thereby take part in the adaptive response to oxidative stress conditions [78]. These also include a cytochrome c biogenesis protein that provides heme binding to the apoprotein of cytochrome c and cytochrome c-552, the components of electron transfer and mitochondrial redox regulation [50,73]. Other proteins containing GIP-9-like segments that are involved in the pathogen response to oxidative stress included protein kinases and proteases.
Many proteins, including those responsible for cell cycle control and embryonic development, are regulated under oxidative stress conditions. For example, the Cys residue of CoA-binding protein can undergo S-thiolation in response to oxidative and metabolic stress [71]. Some homologs of bacterial stress response proteins are implicated in disease pathogenesis in humans. For example, the divalent cation tolerance protein CutA homolog has been proposed to mediate acetylcholinesterase activity and copper homeostasis, which are implicated in Alzheimer’s disease [87]. In proteobacteria, cell division proteins display redox transformation due to electron transfer and reduction of oxygen, nitrogen, and hydrogen sulfide [72]. Among viral proteins, the envelope glycoprotein E, which is involved in the host immune response to the pathogen, and a viral protein kinase, which has a role in virus virulence and tumor pathogenesis [77], were retrieved.
Similar to AFP14–20-like motifs, GIP-9-like motifs were identified in small Fe-S cluster-containing proteins such as ferredoxins and Cisd2-a protein [80]. Furthermore, coevolution of bacteria with their hosts enabled them to tolerate oxidative stress conditions with the use of an antioxidant system (AOS) that includes both enzymatic and non-enzymatic components [96]. In this context, cupin domain-containing proteins contribute to counteracting the host defense owing to their functional diversity, which includes an AOS component, the superoxide dismutase (SOD) enzyme [74]. Additionally, glutathione S-transferases play important roles in the environmental stress response through S-glutathionylation of Cys residues and thiol groups, resulting in detoxification of the target molecule [91].
In eukaryotes, GIP-9-like motifs were found in proteins such as small proline-rich proteins, which regulate the cell cycle, cell proliferation, and differentiation [85]. Interestingly, these proteins can be involved in tumor progression, and their functioning is under redox control [97]. Additionally, ceruloplasmin, a major copper-carrying plasma protein that possesses ferroxidase activity and is involved in redox regulation [84], was retrieved. In plants and algae, photosystem II stability and assembly factor HCF136, which is essential for the formation of the photosystem II complex, and plastocyanin-like domain-containing protein, which is involved in electron transfer during photosynthesis [81], were retrieved. They are regulated by redox switches between active and inactive states during the light–dark transition [98]. Additionally, various stress-related proteins involved in quality control machinery, including a C2H2-type zinc finger-containing protein and zinc metalloproteinases [89], were identified among GIP-9-like motif-containing proteins. Moreover, a variety of transmembrane proteins involved in the ER stress response such as TMEM258 [99], as well as antimicrobial peptides, were retrieved, though at lower statistical significance.
In mammals, transcription factors of the forkhead box and homeobox families, such as forkhead box protein O1 (FOXO1) and homeobox protein Hox-C5 (HOXC5), which play important roles in metabolism, cell proliferation, apoptosis, development, and stress resistance [90], were identified. Additionally, HSP family members, along with tumor necrosis factor (TNF) ligand family cytokines and the Wnt-1 protein involved in the Wnt/β-catenin signaling pathway, which are key players in redox regulation and cancer development [100], were among the retrieved proteins, though at lower significance.

5. Conclusions

In our study, we undertook a comprehensive functional enrichment analysis of a wide range of proteins from all taxonomic groups and different functional classes. All these proteins share a structural characteristic, the presence of conserved SLiMs. Both types of short sequences used as queries for the sequence similarity search were derived from AFP, a major mammalian embryo-specific and tumor-associated protein. Therefore, the identification of a variety of transcription factors and proteins involved in cell signaling, cell cycle progression, cell proliferation and differentiation, and protein quality control was anticipated. Unexpectedly, however, various prokaryotic and eukaryotic proteins responsible for the cellular response to both biotic and abiotic stress were retrieved as containing both AFP14–20-like and GIP-9-like motifs. They included proteins implicated in adaptation to and protection against pathogens, reactive oxygen species, toxins, and various chemical agents. Moreover, the overwhelming majority of retrieved transcription factors and proteins involved in replication and translation have been reported to participate in cellular and organismal adaptation to environmental stress stimuli.
We hypothesize that both AFP-derived peptides may have arisen from prototype peptides over a long evolutionary time. At the early stages of biochemical evolution, these peptides would have been involved in the cellular stress response and have preserved this function in modern proteins, including AFP. Therefore, bioinformatics and GO functional enrichment analyses of SLiMs allow insight into the common functions of a variety of proteins and the involvement of AFP in the cellular response to external and internal stimuli during embryonic development and cancer growth. Nevertheless, our data require further confirmation with the use of experimental approaches.

Author Contributions

Conceptualization, N.T.M. and A.A.T.; Methodology, S.P.Z. and S.S.S.; Investigation, D.S.G., N.T.M., S.P.Z. and S.S.S.; Validation and Formal Analysis, D.S.G. and S.P.Z.; Data Curation and Visualization, D.S.G. and S.P.Z.; Writing—original draft, N.T.M. and S.P.Z.; Writing—review and editing, N.T.M. and A.A.T.; Resources, D.S.G. and S.P.Z.; Supervision, A.A.T. and N.T.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available in a publicly accessible repository. The data presented in this study are openly available in FigShare at https://doi.org/10.6084/m9.figshare.20456727 (accessed on 21 December 2022).

Conflicts of Interest

The authors declare no financial, professional, or personal competing interests.

References

  1. Neduva, V.; Russell, R.B. Linear motifs: Evolutionary interaction switches. FEBS Lett. 2005, 579, 3342–3345.
  2. Davey, N.E.; Van Roey, K.; Weatheritt, R.J.; Toedt, G.; Uyar, B.; Altenberg, B.; Budd, A.; Diella, F.; Dinkel, H.; Gibson, T.J. Attributes of short linear motifs. Mol. Biosyst. 2012, 8, 268–281.
  3. Van Roey, K.; Davey, N.E. Motif co-regulation and co-operativity are common mechanisms in transcriptional, post-transcriptional and post-translational regulation. Cell Commun. Signal. 2015, 13, 45.
  4. Kolodny, R. Searching protein space for ancient sub-domain segments. Curr. Opin. Struct. Biol. 2021, 68, 105–112.
  5. Nepomnyachiy, S.; Ben-Tal, N.; Kolodny, R. Global view of the protein universe. Proc. Natl. Acad. Sci. USA 2014, 111, 11691–11696.
  6. Höcker, B. Design of proteins from smaller fragments-learning from evolution. Curr. Opin. Struct. Biol. 2014, 27, 56–62.
  7. Romero Romero, M.L.; Rabin, A.; Tawfik, D.S. Functional proteins from short peptides: Dayhoff’s hypothesis turns 50. Angew. Chem. Int. Ed. Engl. 2016, 55, 15966–15971.
  8. Eck, R.V.; Dayhoff, M.O. Evolution of the structure of ferredoxin based on living relics of primitive amino acid sequences. Science 1966, 152, 363–366.
  9. Alva, V.; Söding, J.; Lupas, A.N. A vocabulary of ancient peptides at the origin of folded proteins. Elife 2015, 4, e09410.
  10. Verschueren, E.; Vanhee, P.; van der Sloot, A.M.; Serrano, L.; Rousseau, F.; Schymkowitz, J. Protein design with fragment databases. Curr. Opin. Struct. Biol. 2011, 21, 452–459.
  11. Kolodny, R.; Nepomnyachiy, S.; Tawfik, D.S.; Ben-Tal, N. Bridging themes: Short protein segments found in different architectures. Mol. Biol. Evol. 2021, 38, 2191–2208.
  12. Sologova, S.S.; Zavadskiy, S.P.; Mokhosoev, I.M.; Moldogazieva, N.T. Short linear motifs orchestrate functioning of human proteins during embryonic development, redox regulation, and cancer. Metabolites 2022, 12, 464.
  13. Terentiev, A.A.; Moldogazieva, N.T. Alpha-fetoprotein: A renaissance. Tumour Biol. 2013, 34, 2075–2091.
  14. Terentiev, A.A.; Moldogazieva, N.T. Structural and functional mapping of alpha-fetoprotein. Biochemistry 2006, 71, 120–132.
  15. Muehlemann, M.; Miller, K.D.; Dauphinee, M.; Mizejewski, G.J. Review of growth inhibitory peptide as a biotherapeutic agent for tumor growth, adhesion, and metastasis. Cancer Metastasis Rev. 2005, 24, 441–467.
  16. Mizejewski, G.J.; Eisele, L.; Maccoll, R. Anticancer versus antigrowth activities of three analogs of the growth-inhibitory peptide: Relevance to physicochemical properties. Anticancer Res. 2006, 26, 3071–3076.
  17. Jacobson, H.I.; Andersen, T.T.; Bennett, J.A. Development of an active site peptide analog of α-fetoprotein that prevents breast cancer. Cancer Prev. Res. 2014, 7, 565–573.
  18. Zhu, Z.; West, G.R.; Wang, D.C.; Collins, A.B.; Xiao, H.; Bai, Q.; Mesfin, F.B.; Wakefield, M.R.; Fang, Y. AFP peptide (AFPep) as a potential growth factor for prostate cancer. Med. Oncol. 2021, 39, 2.
  19. Butterstein, G.; Morrison, J.; Mizejewski, G.J. Effect of alpha-fetoprotein and derived peptides on insulin- and estrogen-induced fetotoxicity. Fetal Diagn. Ther. 2003, 18, 360–369.
  20. Moldogazieva, N.T.; Shaitan, K.V.; Antonov, M.Y.; Vinogradova, I.K.; Terentiev, A.A. Influence of intramolecular interactions on conformational and dynamic properties of analogs of heptapeptide AFP14-20. Biochemistry 2011, 76, 1321–1336.
  21. Moldogazieva, N.T.; Shaitan, K.V.; Antonov, M.Y.; Mokhosoev, I.M.; Levtsova, O.V.; Terentiev, A.A. Human EGF-derived direct and reverse short linear motifs: Conformational dynamics insight into the receptor-binding residues. J. Biomol. Struct. Dyn. 2018, 36, 1286–1305.
  22. Moldogazieva, N.T.; Terentiev, A.A.; Antonov, M.Y.; Kazimirsky, A.N.; Shaitan, K.V. Correlation between biological activity and conformational dynamics properties of tetra- and pentapeptides derived from fetoplacental proteins. Biochemistry 2012, 77, 469–484.
  23. Terentiev, A.A.; Moldogazieva, N.T.; Levtsova, O.V.; Maximenko, D.M.; Borozdenko, D.A.; Shaitan, K.V. Modeling of three-dimensional structure of human alpha-fetoprotein complexed with diethylstilbestrol: Docking and molecular dynamics simulation study. J. Bioinform. Comput. Biol. 2012, 10, 1241012.
  24. Moldogazieva, N.T.; Ostroverkhova, D.S.; Kuzmich, N.N.; Kadochnikov, V.V.; Terentiev, A.A.; Porozov, Y.B. Elucidating binding sites and affinities of ERα agonists and antagonists to human alpha-fetoprotein by in silico modeling and point mutagenesis. Int. J. Mol. Sci. 2020, 21, 893.
  25. Rosignoli, S.; Paiardini, A. Boosting the full potential of PyMOL with structural biology plugins. Biomolecules 2022, 12, 1764.
  26. Pearson, W.R. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 1990, 183, 63–98.
  27. Cantelli, G.; Bateman, A.; Brooksbank, C.; Petrov, A.I.; Malik-Sheriff, R.S.; Ide-Smith, M.; Hermjakob, H.; Flicek, P.; Apweiler, R.; Birney, E.; et al. The European Bioinformatics Institute (EMBL-EBI) in 2021. Nucleic Acids Res. 2022, 50, D11–D19.
  28. UniProt Consortium. UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Res. 2021, 49, D480–D489.
  29. Crooks, G.E.; Hon, G.; Chandonia, J.M.; Brenner, S.E. WebLogo: A sequence logo generator. Genome Res. 2004, 14, 1188–1190.
  30. Gene Ontology Consortium. The Gene Ontology resource: Enriching a GOld mine. Nucleic Acids Res. 2021, 49, D325–D334.
  31. Blum, M.; Chang, H.Y.; Chuguransky, S.; Grego, T.; Kandasaamy, S.; Mitchell, A.; Nuka, G.; Paysan-Lafosse, T.; Qureshi, M.; Raj, S.; et al. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 2021, 49, D344–D354.
  32. Mi, H.; Ebert, D.; Muruganujan, A.; Mills, C.; Albou, L.P.; Mushayamaha, T.; Thomas, P.D. PANTHER version 16: A revised family classification, tree-based classification tool, enhancer regions and extensive API. Nucleic Acids Res. 2021, 49, D394–D403.
  33. Ge, S.X.; Jung, D.; Yao, R. ShinyGO: A graphical gene-set enrichment tool for animals and plants. Bioinformatics 2020, 36, 2628–2629.
  34. Raudvere, U.; Kolberg, L.; Kuzmin, I.; Arak, T.; Adler, P.; Peterson, H.; Vilo, J. g:Profiler: A web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 2019, 47, W191–W198.
  35. Andreini, C.; Ciofi-Baffoni, S. Basic iron-sulfur centers. Met. Ions Life Sci. 2020, 20.
  36. Pletnev, P.I.; Shulenina, O.; Evfratov, S.; Treshin, V.; Subach, M.F.; Serebryakova, M.V.; Osterman, I.A.; Paleskava, A.; Bogdanov, A.A.; Dontsova, O.A.; et al. Ribosomal protein S18 acetyltransferase RimI is responsible for the acetylation of elongation factor Tu. J. Biol. Chem. 2022, 298, 101914.
  37. Wang, Y.; Yu, Y.; Pang, Y.; Yu, H.; Zhang, W.; Zhao, X.; Yu, J. The distinct roles of zinc finger CCHC-type (ZCCHC) superfamily proteins in the regulation of RNA metabolism. RNA Biol. 2021, 18, 2107–2126.
  38. Mailloux, R.J.; Young, A.; Chalker, J.; Gardiner, D.; O’Brien, M.; Slade, L.; Brosnan, J.T. Choline and dimethylglycine produce superoxide/hydrogen peroxide from the electron transport chain in liver mitochondria. FEBS Lett. 2016, 590, 4318–4328.
  39. Potempa, J.; Madej, M.; Scott, D.A. The RagA and RagB proteins of Porphyromonas gingivalis. Mol. Oral Microbiol. 2021, 36, 225–232.
  40. Kameyama, S.; Mizuguchi, T.; Fukuda, H.; Moey, L.H.; Keng, W.T.; Okamoto, N.; Tsuchida, N.; Uchiyama, Y.; Koshimizu, E.; Hamanaka, K.; et al. Biallelic null variants in ZNF142 cause global developmental delay with familial epilepsy and dysmorphic features. J. Hum. Genet. 2022, 67, 169–173.
  41. Fitzgerald, D.M.; Rosenberg, S.M. Biology before the SOS response-DNA damage mechanisms at chromosome fragile sites. Cells 2021, 10, 2275.
  42. Mittl, P.R.; Schneider-Brachert, W. Sel1-like repeat proteins in signal transduction. Cell Signal. 2007, 19, 20–31.
  43. Ishii, E.; Eguchi, Y. Diversity in sensing and signaling of bacterial sensor histidine kinases. Biomolecules 2021, 11, 1524.
  44. Sushko, T.; Kavaleuski, A.; Grabovec, I.; Kavaleuskaya, A.; Vakhrameev, D.; Bukhdruker, S.; Marin, E.; Kuzikov, A.; Masamrekh, R.; Shumyantseva, V.; et al. A new twist of rubredoxin function in M. tuberculosis. Bioorg. Chem. 2021, 109, 104721.
  45. Li, Z.; Wang, R.; Qiu, C.; Cao, C.; Zhang, J.; Ge, J.; Shi, Y. Role of DTL in hepatocellular carcinoma and its impact on the tumor microenvironment. Front. Immunol. 2022, 13, 834606.
  46. Sjöholm, J.; Oliveira, P.; Lindblad, P. Transcription and regulation of the bidirectional hydrogenase in the cyanobacterium Nostoc sp. strain PCC 7120. Appl. Environ. Microbiol. 2007, 73, 5435–5446.
  47. Xiu, M.X.; Liu, Y.M.; Kuang, B.H. The oncogenic role of Jagged1/Notch signaling in cancer. Biomed. Pharmacother. 2020, 129, 110416.
  48. Tachibana, A.; Hamajima, R.; Tomizaki, M.; Kondo, T.; Nanba, Y.; Kobayashi, M.; Yamada, H.; Ikeda, M. HCF-1 encoded by baculovirus AcMNPV is required for productive nucleopolyhedrovirus infection of non-permissive Tn368 cells. Sci. Rep. 2017, 7, 3807.
  49. Corbella, M.; Liao, Q.; Moreira, C.; Parracino, A.; Kasson, P.M.; Kamerlin, S.C.L. The N-terminal helix-turn-helix motif of transcription factors MarA and Rob drives DNA recognition. J. Phys. Chem. 2021, 125, 6791–6806.
  50. Waghwani, H.K.; Douglas, T. Cytochrome c with peroxidase-like activity encapsulated inside the small DPS protein nanocage. J. Mater. Chem. B 2021, 9, 3168–3179.
  51. Hong, G.; Kuek, V.; Shi, J.; Zhou, L.; Han, X.; He, W.; Tickner, J.; Qiu, H.; Wei, Q.; Xu, J. EGFL7: Master regulator of cancer pathogenesis, angiogenesis and an emerging mediator of bone homeostasis. J. Cell Physiol. 2018, 233, 8526–8537.
  52. Cuthbertson, L.; Nodwell, J.R. The TetR family of regulators. Microbiol. Mol. Biol. Rev. 2013, 77, 440–475.
  53. Imhoff, J.F.; Rahn, T.; Künzel, S.; Keller, A.; Neulinger, S.C. Osmotic adaptation and compatible solute biosynthesis of phototrophic bacteria as revealed from genome analyses. Microorganisms 2020, 9, 46.
  54. Lin, Y.; Zeng, Y.; Zhu, Y.; Shen, J.; Ye, H.; Jiang, L. Plant Rho GTPase signaling promotes autophagy. Mol. Plant 2021, 14, 905–920.
  55. Khamtorn, P.; Peigneur, S.; Amorim, F.G.; Quinton, L.; Tytgat, J.; Daduang, S. De novo transcriptome analysis of the venom of Latrodectus geometricus with the discovery of an insect-selective Na channel modulator. Molecules 2021, 27, 47.
  56. Xu, D.; Lu, W. Defensins: A double-edged sword in host immunity. Front. Immunol. 2020, 11, 764.
  57. Loppin, B.; Dubruille, R.; Horard, B. The intimate genetics of Drosophila fertilization. Open Biol. 2015, 5, 150076.
  58. Imperial, J.S.; Kantor, Y.; Watkins, M.; Heralde, F.M., 3rd; Stevenson, B.; Chen, P.; Hansson, K.; Stenflo, J.; Ownby, J.P.; Bouchet, P.; et al. Venomous auger snail Hastula (Impages) hectica (Linnaeus, 1758): Molecular phylogeny, foregut anatomy and comparative toxicology. J. Exp. Zool. B Mol. Dev. Evol. 2007, 308, 744–756.
  59. Gomez-Arrebola, C.; Solano, C.; Lasa, I. Regulation of gene expression by non-phosphorylated response regulators. Int. Microbiol. 2021, 24, 521–529.
  60. Zhang, Y.; Ren, Y.J.; Guo, L.C.; Ji, C.; Hu, J.; Zhang, H.H.; Xu, Q.H.; Zhu, W.D.; Ming, Z.J.; Yuan, Y.S.; et al. Nucleus accumbens-associated protein-1 promotes glycolysis and survival of hypoxic tumor cells via the HDAC4-HIF-1α axis. Oncogene 2017, 36, 4171–4181.
  61. Conlon, J.M.; Kolodziejek, J.; Mechkarska, M.; Coquet, L.; Leprince, J.; Jouenne, T.; Vaudry, H.; Nielsen, P.F.; Nowotny, N.; King, J.D. Host defense peptides from Lithobates forreri, Hylarana luctuosa, and Hylarana signata (Ranidae): Phylogenetic relationships inferred from primary structures of ranatuerin-2 and brevinin-2 peptides. Comp. Biochem. Physiol. Part D Genom. Proteom. 2017, 9, 49–57.
  62. Xu, C.Q.; He, L.L.; Brône, B.; Martin-Eauclaire, M.F.; Van Kerkhove, E.; Zhou, Z.; Chi, C.W. A novel scorpion toxin blocking small conductance Ca2+ activated K+ channel. Toxicon 2004, 43, 961–971.
  63. Leal, A.S.; Reich, L.A.; Moerland, J.A.; Zhang, D.; Liby, K.T. Potential therapeutic uses of rexinoids. Adv. Pharmacol. 2021, 91, 141–183.
  64. Cheng, J.T.; Wang, Y.Y.; Zhu, L.Z.; Zhang, Y.; Cai, W.Q.; Han, Z.W.; Zhou, Y.; Wang, X.W.; Peng, X.C.; Xiang, Y.; et al. Novel transcription regulatory sequences and factors of the immune evasion protein ICP47 (US12) of herpes simplex viruses. Virol. J. 2020, 17, 101.
  65. Ohnishi, T.; Ohnishi, S.T.; Salerno, J.C. Five decades of research on mitochondrial NADH-quinone oxidoreductase (complex I). Biol. Chem. 2018, 399, 1249–1264.
  66. Zhang, S.; Wang, X. One new kind of phytohormonal signaling integrator: Up-and-coming GASA family genes. Plant Signal. Behav. 2017, 12, e1226453.
  67. Dos Vultos, T.; Mestre, O.; Tonjum, T.; Gicquel, B. DNA repair in Mycobacterium tuberculosis revisited. FEMS Microbiol. Rev. 2009, 33, 471–487.
  68. Almawi, A.W.; Matthews, L.A.; Guarné, A. FHA domains: Phosphopeptide binding and beyond. Prog. Biophys. Mol. Biol. 2017, 127, 105–110.
  69. Jalal, A.S.B.; Le, T.B.K. Bacterial chromosome segregation by the ParABS system. Open Biol. 2020, 10, 200097.
  70. Kang, S.M.; Kim, D.H.; Jin, C.; Ahn, H.C.; Lee, B.J. The crystal structure of AcrR from Mycobacterium tuberculosis reveals a one-component transcriptional regulation mechanism. FEBS Open Bio 2019, 9, 1713–1725.
  71. Gout, I. Coenzyme A, protein CoAlation and redox regulation in mammalian cells. Biochem. Soc. Trans. 2018, 46, 721–728.
  72. Geerlings, N.M.J.; Karman, C.; Trashin, S.; As, K.S.; Kienhuis, M.V.M.; Hidalgo-Martinez, S.; Vasquez-Cardenas, D.; Boschker, H.T.S.; De Wael, K.; Middelburg, J.J.; et al. Division of labor and growth during electrical cooperation in multicellular cable bacteria. Proc. Natl. Acad. Sci. USA 2020, 117, 5478–5485.
  73. Zhang, B.; Pan, C.; Feng, C.; Yan, C.; Yu, Y.; Chen, Z.; Guo, C.; Wang, X. Role of mitochondrial reactive oxygen species in homeostasis regulation. Redox Rep. 2022, 27, 45–52.
  74. Karlik, E. Potential stress tolerance roles of barley germins and GLPs. Dev. Genes Evol. 2021, 231, 109–118.
  75. Hu, Y.; Meng, J.; Shi, C.; Hervin, K.; Fratamico, P.M.; Shi, X. Characterization and comparative analysis of a second thermonuclease from Staphylococcus aureus. Microbiol. Res. 2013, 168, 174–182.
  76. Pulido, R.; Lang, R. Dual specificity phosphatases: From molecular mechanisms to biological function. Int. J. Mol. Sci. 2019, 20, 4372.
  77. Villalaín, J. Envelope E protein of dengue virus and phospholipid binding to the late endosomal membrane. Biochim. Biophys. Acta Biomembr. 2022, 1864, 183889.
  78. Sellés Vidal, L.; Kelly, C.L.; Mordaka, P.M.; Heap, J.T. Review of NAD(P)H-dependent oxidoreductases: Properties, engineering and application. Biochim. Biophys. Acta Proteins Proteom. 2018, 1866, 327–347.
  79. Katz, A.; Elgamal, S.; Rajkovic, A.; Ibba, M. Non-canonical roles of tRNAs and tRNA mimics in bacterial cell biology. Mol. Microbiol. 2016, 101, 545–558.
  80. Sengupta, S.; Nechushtai, R.; Jennings, P.A.; Onuchich, J.N.; Padulla, P.A.; Azad, R.K.; Mittler, R. Phylogenetic analysis of the CDGSH iron-sulfur binding domain reveals its ancient origin. Sci. Rep. 2018, 8, 4840.
  81. Ma, H.; Zhao, H.; Liu, Z.; Zhao, J. The phytocyanin gene family in rice (Oryza sativa L.): Genome-wide identification, classification and transcriptional analysis. PLoS ONE 2011, 6, e25184.
  82. Zhang, K.; Wang, F.; Liu, B.; Xu, C.; He, Q.; Cheng, W.; Zhao, X.; Ding, Z.; Zhang, W.; Zhang, K.; et al. ZmSKS13, a cupredoxin domain-containing protein, is required for maize kernel development via modulation of redox homeostasis. New Phytol. 2021, 229, 2163–2178.
  83. Svetlov, M.S. Ribosome-associated quality control in bacteria. Biochemistry 2021, 86, 942–951.
  84. Prohaska, J.R. Role of copper transporters in copper homeostasis. Am. J. Clin. Nutr. 2008, 88, 826S–829S.
  85. Zhu, G.; Cai, H.; Yem, L.; Mo, Y.; Zhu, M.; Zeng, Y.; Song, X.; Yang, C.; Gao, X.; Wang, J.; et al. Small proline-rich protein 3 regulates IL-33/ILC2 axis to promote allergic airway inflammation. Front. Immunol. 2020, 12, 758829.
  86. Bordes, P.; Sala, A.J.; Ayala, S.; Texier, P.; Slama, N.; Cirinesi, A.M.; Guillet, V.; Mourey, L.; Genevaux, P. Chaperone addiction of toxin-antitoxin systems. Nat. Commun. 2016, 7, 13339.
  87. Zhao, Y.; Wang, Y.; Hu, J.; Zhang, X.; Zhang, Y.W. CutA divalent cation tolerance homolog (Escherichia coli) (CUTA) regulates β-cleavage of β-amyloid precursor protein (APP) through interacting with β-site APP cleaving protein 1 (BACE1). J. Biol. Chem. 2012, 287, 11141–11150.
  88. Hyjek, M.; Figiel, M.; Nowotny, M. RNases H: Structure and mechanism. DNA Repair 2019, 84, 102672.
  89. Mackeh, R.; Marr, A.K.; Fadda, A.; Kino, T. C2H2-type zinc finger proteins: Evolutionarily old and new partners of the nuclear hormone receptors. Nucl. Recept. Signal. 2018, 15, 1550762918801071.
  90. Wang, Y.; Zhou, Y.; Graves, D.T. FOXO transcription factors: Their clinical significance and regulation. Biomed. Res. Int. 2014, 2014, 925350.
  91. Vaish, S.; Gupta, D.; Mehrotra, R.; Mehrotra, S.; Basantani, M.K. Glutathione S-transferase: A versatile protein family. 3 Biotech 2020, 10, 321.
  92. Davey, N.E.; Cyert, M.S.; Moses, A.M. Short linear motifs-ex nihilo evolution of protein regulation. Cell Commun. Signal. 2015, 13, 43.
  93. Terentiev, A.A.; Moldogazieva, N.T. Cell adhesion proteins and alpha-fetoprotein. Similar structural motifs as prerequisites for common functions. Biochemistry 2007, 72, 920–935.
  94. Kienast, A.; Preuss, M.; Winkler, M.; Dick, T.P. Redox regulation of peptide receptivity of major histocompatibility complex class I molecules by ERp57 and tapasin. Nat. Immunol. 2007, 8, 864–872.
  95. Gursoy-Yuzugullu, O.; Ayrapetov, M.K.; Price, B.D. Histone chaperone Anp32e removes H2A.Z from DNA double-strand breaks and promotes nucleosome reorganization and DNA repair. Proc. Natl. Acad. Sci. USA 2015, 112, 7507–7512.
  96. Moldogazieva, N.T.; Mokhosoev, I.M.; Mel’nikova, T.I.; Zavadskiy, S.P.; Kuz’menko, A.N.; Terentiev, A.A. Dual character of reactive oxygen, nitrogen, and halogen species: Endogenous sources, interconversions and neutralization. Biochemistry 2020, 85 (Suppl. S1), S56–S78.
  97. Chen, J.; Wang, Y.; Zhang, W.; Zhao, D.; Zhang, L.; Fan, J.; Li, J.; Zhan, Q. Membranous NOX5-derived ROS oxidizes and activates local Src to promote malignancy of tumor cells. Signal Transduct. Target. 2020, 5, 139.
  98. García-Cerdán, J.G.; Furst, A.L.; McDonald, K.L.; Schünemann, D.; Francis, M.B.; Niyogi, K.K. A thylakoid membrane-bound and redox-active rubredoxin (RBD1) functions in de novo assembly and repair of photosystem II. Proc. Natl. Acad. Sci. USA 2019, 116, 16631–16640.
  99. Graham, D.B.; Lefkovith, A.; Deelen, P.; de Klein, N.; Varma, M.; Boroughs, A.; Desch, A.N.; Ng, A.C.Y.; Guzman, G.; Schenone, M.; et al. TMEM258 is a component of the oligosaccharyltransferase complex controlling ER stress and intestinal inflammation. Cell Rep. 2016, 17, 2955–2965.
  100. Fittipaldi, S.; Mercatelli, N.; Dimauro, I.; Jackson, M.J.; Paronetto, M.P.; Caporossi, D. Alpha B-crystallin induction in skeletal muscle cells under redox imbalance is mediated by a JNK-dependent regulatory mechanism. Free Radic. Biol. Med. 2015, 86, 331–342.
Figure 1. The overall architecture of AFP is a U-shaped structure composed of three domains: I (orange, residues 19–210), II (green, residues 211–402), and III (cyan, residues 403–601). Two functionally important segments are shown: AFP14–20 with sequence LDSYQCT (residues 32–38, colored in blue) and GIP-9 with sequence EMTPVNPGV (residues 489–497, colored in red), which is a part of GIP-34 (residues 464–497, colored in pink).
Figure 2. WebLogo representation of amino acid abundances at each position of (A) AFP14–20-like and (B) GIP-9-like motifs identified in proteins retrieved from UniProtKB database. The overall height of every stack indicates residue conservation at each position, while a symbol height within the stack indicates relative frequency of each residue at that position. Colors of symbols are as follows: hydrophobic and glycine—green, hydrophilic and positively charged—orange, negatively charged and their amides—blue, aromatic plus proline—purple, and cysteine—red.
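The stack heights described in the Figure 2 caption can be illustrated with a short calculation. The following is a minimal sketch, assuming the usual sequence-logo definition in which the height of a stack is the information content of that position (log2(20) bits for proteins minus the Shannon entropy of the observed residues). The function name `column_information` and the toy input are hypothetical, and the small-sample correction that WebLogo can apply is omitted here.

```python
from collections import Counter
from math import log2


def column_information(sequences):
    """Return the information content (bits) of each column of aligned,
    equal-length protein segments. Assumption: no small-sample correction."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all segments must have the same length")
    max_bits = log2(20)  # maximum for a 20-letter amino acid alphabet
    heights = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        # Shannon entropy of the residues observed at this position
        entropy = -sum((c / total) * log2(c / total) for c in counts.values())
        heights.append(max_bits - entropy)
    return heights


if __name__ == "__main__":
    # Toy input: the AFP14-20 segment and three hits listed in Table 1
    motifs = ["LDSYQCT", "MDSYQCT", "LDSYQCS", "LDGYQCT"]
    for position, bits in enumerate(column_information(motifs), start=1):
        print(position, round(bits, 2))
```

Fully conserved positions reach the maximum of log2(20) bits, and any variation at a position lowers its stack.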
Figure 3. Taxonomic distribution of genes encoding proteins that were retrieved from the UniProtKB database as aligned with (A) the AFP14–20 and (B) the GIP-9 segment of human AFP. The number of unique genes in each taxonomic category is shown above each column. Prokaryotes—bacteria (blue), viruses (brown), archaea (grey); mammals—Homo sapiens (blue), primates (brown), other mammals (grey); vertebrates—birds (blue), fishes (brown), amphibia (grey); invertebrates—reptiles (blue), insects (brown), nematodes (grey); plants—higher plants (blue), algae (brown); other eukaryotes—S. cerevisiae (blue), fungi (brown), mollusks, scorpions, spiders, etc. (grey).
Figure 4. Gene Ontology categorization of proteins retrieved from the UniProtKB knowledgebase as aligned with (A,C) the AFP14–20 and (B,D) the GIP-9 segment of human AFP, in (A,B) molecular function and (C,D) biological process terms. Categories are ranked in decreasing order of the number of unique genes. Unique genes were counted manually, without taking the statistical significance of a category into account.
Figure 5. (A) Molecular function, (B) biological process, and (C) all-available gene set categorization in Gene Ontology terms of a representative bacterial genome (Acinetobacter sp.). Lollipop charts at aspect ratio 1.5 and −log10(FDR) heat maps for each category are shown. FDR is calculated based on the nominal p-value from the hypergeometric test. FDR shows how likely the enrichment is by chance. Larger gene sets tend to have smaller FDR. N. of Genes indicates the number of genes for each category.
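The FDR values in the Figure 5 caption derive from hypergeometric-test p-values. As a hedged illustration of how that can work, the sketch below computes a one-sided hypergeometric p-value for a single category and then adjusts a set of p-values. The function names and the example counts are hypothetical. The Benjamini-Hochberg procedure is an assumption, since the caption does not name the adjustment method.

```python
from math import comb, log10


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of seeing at least k category genes in a list
    of n genes drawn from a background of N genes, K of which are in the
    category."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (genes in list and category, list size,
    # genes in category, background size)
    categories = [(8, 50, 40, 3000), (3, 50, 120, 3000), (5, 50, 60, 3000)]
    pvals = [hypergeom_pvalue(*c) for c in categories]
    for p, fdr in zip(pvals, benjamini_hochberg(pvals)):
        print(f"p={p:.3g}  FDR={fdr:.3g}  -log10(FDR)={-log10(fdr):.2f}")
```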
Figure 6. Prokaryotic genes coding for proteins containing GIP-9-like motifs. All-available gene set analysis of (A) Desulfotomaculum guttoideum, (B) Bacillus selenitireducens, and (C) Clostridium aminophilum genes. Categories are ranked by fold enrichment order; that is, the percentage of genes in the list belonging to each category divided by the corresponding percentage in the background. Fold enrichment indicates how drastically genes of a certain pathway are overrepresented. N. of Genes indicates the number of genes for each category.
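The fold enrichment defined in the Figure 6 caption can be written out directly. This is a minimal sketch: the function name `fold_enrichment` and the example counts are hypothetical and are not taken from the article.

```python
def fold_enrichment(list_hits, list_size, background_hits, background_size):
    """Fraction of list genes in a category divided by the fraction of
    background genes in the same category."""
    if list_size <= 0 or background_size <= 0:
        raise ValueError("list and background sizes must be positive")
    if background_hits <= 0:
        raise ValueError("the category must contain at least one background gene")
    list_fraction = list_hits / list_size
    background_fraction = background_hits / background_size
    return list_fraction / background_fraction


if __name__ == "__main__":
    # Hypothetical example: 12 of 200 list genes against 150 of 5000
    # background genes belong to the category
    print(fold_enrichment(12, 200, 150, 5000))
```

A value above 1 means the category is overrepresented in the gene list relative to the background. The ratio of percentages equals the ratio of fractions, so no scaling by 100 is needed.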
Figure 7. Grouping of eukaryotic genes. Manhattan plots of all H. sapiens (A,C) and A. thaliana (B,D) gene sets coding for (A,B) AFP14–20-like motif-containing and (C,D) GIP-9-like motif-containing proteins. The x-axis represents functional terms that are grouped and color-coded by data sources, while the y-axis shows the adjusted enrichment p-values in negative log10 scale. MF, molecular function; BP, biological process; CC, cellular component; KEGG, KEGG pathway; REAC, Reactome; WP, WikiPathways; TF, transcription factor; MIRNA, microRNA; HPA, Human Protein Atlas; CORUM, CORUM dataset; and HP, human phenotype. Each circle indicates a functional enrichment term, and the circle size corresponds to the term size; larger terms have larger circles.
Figure 8. H. sapiens and A. thaliana gene set enrichment analysis of AFP14–20-like motif-containing proteins. Human genes categorized in (A) molecular function terms and (B) biological process terms. A. thaliana genes categorized in (C) molecular function terms and (D) biological process terms. Color codes indicate data inferred from: dark brown—experiment/direct assay, light brown—genetic and physical interactions, yellow—sequence similarity, dark purple—high throughput experiment, green—curator, blue—reviewed computational data.
Figure 9. H. sapiens and A. thaliana gene set enrichment analysis of GIP-9-like motif-containing proteins. Human genes categorized in (A) molecular function terms and (B) biological process terms. A. thaliana genes categorized in (C) molecular function terms and (D) biological process terms. Color codes indicate data inferred from: dark brown—experiment/direct assay, light brown—genetic and physical interactions, yellow—sequence similarity, dark purple—high throughput experiment, green—curator, blue—reviewed computational data.
Table 1. Representative proteins retrieved from the UniProtKB database as containing AFP14–20-like motifs (at E-value < 0.05).

| Protein Name | Species | Entry Code | Gene Symbol | Alignment | Aa Positions | Identity Degree | E-Value | Biological Roles | Reference |
|---|---|---|---|---|---|---|---|---|---|
| Rubredoxin | Methanoregulaceae archaeon | TR: A0A1V5A688 | rub_2 | LDSYQCT / MDSYQCT | 1–7 | 85.7% | 3.9 × 10⁻¹¹ | Electron transfer, iron-binding, redox regulation | [35] |
| Ribosomal protein S18-alanine N-acetyltransferase | Acinetobacter sp. | TR: A0A5C8C7V6 | rimI | LDSYQCT / LDSYQCT | 1–7 | 100% | 1.5 × 10⁻⁹ | Translational regulation | [36] |
| CCHC-type domain-containing protein | Crassostrea gigas | TR: K1QN18 | CGI | LDSYQCT / MDSYQCS | 11–17 | 71.4% | 3.4 × 10⁻⁶ | Transcriptional regulation, response to environmental changes | [37] |
| Choline dehydrogenase | Comamonadaceae bacterium | TR: A0A2H0JD95 | COW02_01535 | LDSYQCT / LDSYQCT | 134–140 | 100% | 5.3 × 10⁻⁶ | Oxidative stress response | [38] |
| RagB/SusD family nutrient uptake outer membrane protein | Ginsengibacter hankyongi | TR: A0A5J5IHS1 | FW778_00440 | LDSYQCT / LDSYQCT | 303–309 | 100% | 5.3 × 10⁻⁶ | Host cell response to pathogen | [39] |
| Zinc-finger protein 142 | Nothobranchius kuhntae | TR: A0A1A8JQF5 | ZNF142 | LDSYQCT / LDSYRCS | 24–30 | 71.4% | 8.8 × 10⁻⁶ | DNA-binding, response to environmental changes | [40] |
| Ferredoxin-type protein NapF | Salipiger sp. | TR: A0A2A3JNE6 | CLG85_24025 | LDSYQCT / LDSAQCT | 3–9 | 85.7% | 1.1 × 10⁻⁵ | Nitrate oxidation, redox balance | [35] |
| 8-oxo-dGTP diphosphatase MutT | Spirochaetae bacterium | TR: A0A2N1RAE0 | MutT (CVV52_19070) | LDSYQCT / MDAYQCT | 81–87 | 71.4% | 1.5 × 10⁻⁵ | Removal of oxidatively damaged guanine, DNA repair | [41] |
| Sel1 domain protein repeat-containing protein | Nitrosococcus halophilus | TR: D5BUP3 | Nhal_0240 | LDSYQCT / LDGYQCT | 63–69 | 85.7% | 3.0 × 10⁻⁵ | Protein degradation, response to pathogen | [42] |
| Histidine kinase response regulator | Bacteroidetes bacterium | TR: A0A2M7KDX5 | COZ59_01780 | LDSYQCT / MDGYQCT | 45–51 | 71.4% | 3.8 × 10⁻⁵ | Regulation of stress response | [43] |
| Ferredoxin | Candidatus electrothrix aarhusiensis | TR: A0A444IS91 | H206_03280 | LDSYQCT / IDTYQCS | 6–12 | 57.1% | 5.5 × 10⁻⁵ | Electron transfer, metal ion binding, redox regulation | [44] |
| DTL protein | Balaeniceps rex | TR: A0A7L2U6N2 | Dtl | LDSYQCT / LDSYQCS | 10–16 | 85.7% | 5.8 × 10⁻⁵ | Response to DNA damage and immunosuppressive microenvironment | [45] |
| Anaredoxin | Nostoc sp. | SP: Q44141 | Adx | LDSYQCT / LESYQCM | 19–25 | 71.4% | 6.1 × 10⁻⁵ | Oxidoreductase, endonuclease, redox regulation | [46] |
| Protein jagged-1 | Trichoplax sp. H2 | TR: A0A369RNS5 | TrispH2_012046 | LDSYQCT / LDQYQCT | 207–213 | 85.7% | 1.1 × 10⁻⁴ | Notch signaling, angiogenesis, response to hypoxia | [47] |
| Host range factor 1 | Lymantria dispar multicapsid nuclear polyhedrosis virus | SP: Q90165 | HRF-1 | LDSYQCT / VDSYKCT | 14–20 | 71.4% | 1.6 × 10⁻⁴ | Host response to virus | [48] |
| Helix-turn-helix domain-containing protein | Cytophagaceae bacterium | TR: A0A4Q3N6Z7 | EOO38_22880 | LDSYQCT / LDDYQCT | 59–65 | 85.7% | 2.4 × 10⁻⁴ | DNA binding, response to pathogen | [49] |
| Cytochrome c | Ignavibacteriae bacterium | TR: A0A660Z7I5 | DRQ13_06320 | LDSYQCT / LDTYQCT | 239–245 | 85.7% | 2.9 × 10⁻⁴ | ETC component, oxidative stress | [50] |
| Calcium binding EGF domain protein | Trichinella nativa | TR: A0A1Y3EHZ0 | D917_09763 | LDSYQCT / MDSYQCR | 79–85 | 71.4% | 2.9 × 10⁻⁴ | Cell proliferation and adhesion, angiogenesis under hypoxia | [51] |
| TetR family transcriptional regulator | Pedobacter duraqua | TR: A0A4V3C417 | CLV32_0466 | LDSYQCT / LDSYQCK | 73–78 | 85.7% | 3.4 × 10⁻⁴ | Sensing small molecule inducers | [52] |
| Flavin oxidoreductase | Salinivibrio sharmensis | TR: A0A1V3GXS2 | BZG19_13810 | LDSYQCT / LDSYHCT | 188–194 | 85.7% | 4.0 × 10⁻⁴ | Oxidative stress response | [53] |
| PRONE domain-containing protein | Prunus persica; Prunus armeniaca | TR: M5VXJ9 | PRUPE_ppa002319mg | LDSYQCT / MDSYQCT | 666–672 | 85.7% | 4.5 × 10⁻⁴ | Response to environmental stimuli | [54] |
| U-scoloptoxin(05)-Er1a | Ethmostigmus rubripes | SP: P0DPX8 | N/A | LDSYQCT / LECYQCT | 21–27 | 71.4% | 7.1 × 10⁻⁴ | Toxin activity, defense response | [55] |
| Fungal defensin eurocin | Aspergillus amstelodami | SP: K7NSL0 | N/A | LDSYQCT / GDAYQCS | 6–11 | 57.1% | 9.2 × 10⁻⁴ | Antimicrobial peptide, defense response | [56] |
| Yemanuclein | Drosophila melanogaster | SP: P25992 | yem | LDSYQCT / LDDYQCT | 846–852 | 85.7% | 1.0 × 10⁻³ | DNA binding, chromatin assembly, genome stability | [57] |
| dITP/XTP pyrophosphatase | Legionella pneumophila | SP: Q5X245 | lpp2548 | LDSYQCT / LNEYQCT | 160–166 | 71.4% | 5.5 × 10⁻³ | Preventing non-canonical nucleotide incorporation, SOS response | [41] |
| Augerpeptide hheTx2 | Hastula hectica | SP: P0CI09 | N/A | LDSYQCT / SDSCQCT | 11–17 | 71.4% | 8.7 × 10⁻³ | Toxin activity, C-rich antimicrobial peptide | [58] |
| Stress response protein YhaX | Bacillus subtilis | SP: O07539 | yhaX | LDSYQCT / LESYQCN | 96–102 | 71.4% | 8.9 × 10⁻³ | Mg and Cu ion binding, response to stress | [59] |
| Nucleus accumbens-associated protein 1 | Mus musculus | SP: Q7TSZ8 | Nacc1 | LDSYQCT / LDSVQCT | 172–178 | 85.7% | 9.2 × 10⁻³ | Response to hypoxic microenvironment | [60] |
| Ranatuerin-3 | Lithobates catesbeianus | SP: P82780 | N/A | LDSYQCT / LDKIKCT | 18–24 | 57.1% | 1.3 × 10⁻² | Host antimicrobial response | [61] |
| Leiurutoxin-3 | Leiurus quinquestriatus | SP: P45661 | N/A | LDSYQCT / YDSSQCE | 8–14 | 57.1% | 1.5 × 10⁻² | K⁺-channel regulator, defense response | [62] |
| Retinoic acid receptor RXR-gamma-B | Danio rerio | SP: Q6DHP9 | rxrgb | LDSYQCT / MSSYQCT | 112–118 | 71.4% | 1.8 × 10⁻² | Gene expression and immune response | [63] |
| Infected cell protein 47 | Human herpesvirus 2 | SP: P14345 | US12 | LDSYQCT / LDSSRCT | 12–18 | 71.4% | 2.4 × 10⁻² | Inhibiting CD8⁺ host adaptive immune response | [64] |
| NADH-quinone oxidoreductase subunit A | Roseiflexus sp.; Azotobacter vinelandii | SP: A5UXK0 | nuoA | LDSYQCT / LDTYECG | 39–45 | 57.1% | 2.7 × 10⁻² | Electron transfer, oxidative stress response | [65] |
| Brevinin-2Re | Pelophylax ridibundus | SP: C0HKZ9 | N/A | LDSYQCT / LDKIQCK | 18–24 | 57.1% | 3.0 × 10⁻² | Antimicrobial defense response | [61] |
| Gibberellin-regulated protein 6 | Arabidopsis thaliana | SP: Q6NMQ7 | GASA6 | LDSYQCT / LKSYQCG | 38–44 | 71.4% | 4.7 × 10⁻² | Plant development, antimicrobial response | [66] |
Table 2. Representative proteins retrieved from the UniProtKB database as containing GIP-9-like motifs (at E-value < 0.05).

| Protein Name | Species | Entry Code | Gene Symbol | Alignment | Aa Positions | Identity Degree | E-Value | Biological Role | Reference |
|---|---|---|---|---|---|---|---|---|---|
| 2′-deoxycytidine 5′-triphosphate deaminase | Parvularcula sp. | TR: A0A357L903 | DEA40_15450 | EMTPVNPGV / EMTPINPGL | 184–192 | 77.8% | 2.2 × 10⁻⁷ | Maintaining dNTP pool and genomic stability | [67] |
| FHA domain-containing protein | Cryobacterium sp. | TR: A0A6H3K8T7 | E3O68_01825 | EMTPVNPGV / ERTPVNPGV | 64–72 | 88.9% | 9.1 × 10⁻⁷ | DNA damage response, innate immune response | [68] |
| Chromosome partitioning protein ParA | Verrucomicrobiaceae bacterium | TR: A0A4Q3BDS1 | EOP84_15500 | EMTPVNPGV / EMTPFNPGL | 70–78 | 77.8% | 7.4 × 10⁻⁶ | Chromosome partitioning and segregation | [69] |
| AcrR family transcriptional regulator | Gordonia humi | TR: A0A840EPS7 | BKA16_000043 | EMTPVNPGV / EMSPVDPGV | 158–166 | 77.8% | 3.1 × 10⁻⁵ | Transcriptional regulation, resistance to toxic chemicals | [70] |
| CoA-binding protein | Acidocella sp. | TR: A0A257Q4I9 | B7Z75_09205 | EMTPVNPGV / EVTPVNPGL | 42–50 | 77.8% | 3.2 × 10⁻⁵ | Cellular metabolism under redox control | [71] |
| Cell division protein FtsL | Betaproteobacteria bacterium | TR: A0A2N2UB63 | ftsL | EMTPVNPGV / KMRPVNPGI | 72–80 | 66.7% | 9.2 × 10⁻⁵ | Chromosome scaffolding, cell cycle, Zn²⁺ ion sensitivity | [72] |
| Cytochrome c-type biogenesis protein CcmE | Agrobacterium fabrum | SP: Q8UGR1 | ccmE | EMTPVNPGV / EKTPVNPGT | 43–51 | 77.8% | 1.2 × 10⁻⁴ | Apoprotein-heme interaction, redox response | [73] |
| Cupin domain-containing protein | Bradyrhizobium sp. | TR: A0A525IHT2 | E7774_03250 | EMTPVNPGV / EITPVGPGV | 75–83 | 77.8% | 5.3 × 10⁻⁴ | Response to biotic and abiotic stress; SOD activity | [74] |
| Thermonuclease family protein | Chloroflexia bacterium | TR: A0A7W0PLY7 | H0T93_01160 | EMTPVNPGV / EITPVNPGI | 126–134 | 77.8% | 5.9 × 10⁻⁴ | DNA and RNA degradation, defense response | [75] |
| Dual specificity phosphatase | Dicentrarchus labrax | TR: A0A8C4HFW9 | N/A | EMTPVNPGV / NLTPVNPGV | 25–33 | 77.8% | 6.9 × 10⁻⁴ | Cell signaling, protection from genotoxicity, and injury | [76] |
| Envelope glycoprotein E | Varicella-zoster virus | SP: P09259 | gE | EMTPVNPGV / EITPVNPGT | 524–532 | 77.8% | 7.0 × 10⁻⁴ | Viral immune response | [77] |
| NADPH-dependent oxidoreductase | Brevibacterium aurantiacum | TR: A0A4Z0KM68 | EB834_09640 | EMTPVNPGV / RMTPVSPGV | 135–143 | 77.8% | 3.7 × 10⁻³ | Oxidative stress response | [78] |
| Ferredoxin | Planctomycetes bacterium | TR: A0A3L7UG95 | DWI22_14495 | EMTPVNPGV / EMSPLCPGI | 43–51 | 55.6% | 4.8 × 10⁻³ | Iron-sulfur cluster binding, redox response | [35] |
| Glutamyl-Q tRNA(Asp) synthetase | Gammaproteobacteria bacterium | TR: A0A4Y8UV00 | gluQ | EMTPVNPGV / ELRPVNPGV | 17–25 | 77.8% | 4.6 × 10⁻³ | Response to amino acid availability | [79] |
| Cisd2-a protein | Symbiodinium pilosum | TR: A0A812YDJ7 | cisd2-a | EMTPVNPGV / KPTPVNPGI | 31–39 | 66.7% | 5.6 × 10⁻³ | Electron transfer, redox response | [80] |
| Plastocyanin-like domain-containing protein | Strigops habroptila | TR: A0A672VBQ4 | N/A | EMTPVNPGV / EMSPENPGT | 23–32 | 66.7% | 7.2 × 10⁻³ | Electron transfer regulated by light-dark switches | [81] |
| Cupredoxin domain-containing protein | Thermoleophilaceae bacterium | TR: A0A838PN94 | H0U20_06955 | EMTPVNPGV / ELNPANPGV | 37–45 | 66.7% | 9.7 × 10⁻³ | Oxidative stress response | [82] |
| 50S ribosomal protein L13e | Aeropyrum pernix | SP: Q9YEN9 | rpl13e | EMTPVNPGV / KLGPVDPGV | 15–23 | 55.6% | 1.1 × 10⁻² | Ribosome assembly | [83] |
| Ceruloplasmin | Pterocles gutturalis | TR: A0A093CDT1 | CP | EMTPVNPGV / EMTPQNPGT | 166–174 | 77.8% | 1.1 × 10⁻² | Copper-binding ferroxidase activity | [84] |
| Proline-rich protein 2 | Lottia gigantea | SP: B3A0R8 | PRH2 | EMTPVNPGV / PMSPVRPGV | 90–98 | 66.7% | 1.1 × 10⁻² | Cell cycle regulation under redox control | [85] |
| Addiction module HigA family antidote | Thiogranum longum | TR: A0A4R1H8U5 | DFR30_1540 | EMTPVNPGV / KLTPIHPGV | 4–12 | 55.6% | 1.7 × 10⁻² | Plasmid addiction, bacterial growth under stress conditions | [86] |
| Divalent-cation tolerance protein CutA | Actinomadura rudentiformis | TR: A0A6H9YAH5 | F8566_39675 | EMTPVNPGV / EVTPGNPGV | 10–18 | 77.8% | 1.3 × 10⁻² | Response to Cu²⁺ ion | [87] |
| Ribonuclease HII | Rhodanobacter sp. | TR: A0A522L7A0 | rnhB | EMTPVNPGV / ELTPANPGL | 3–11 | 66.7% | 1.7 × 10⁻² | RNA binding, defense response | [88] |
| C2H2-type domain-containing protein | Gibberella nygamai | TR: A0A2K0W957 | FNYG_07693 | EMTPVNPGV / EPTPVNPGL | 133–141 | 77.8% | 2.0 × 10⁻² | Transcriptional regulation, response to environmental changes | [89] |
| Forkhead box protein O1 | Bos taurus | SP: E1BPQ1 | FOXO1 | EMTPVNPGV / IMTPVDPGV | 445–453 | 77.8% | 2.8 × 10⁻² | Metabolic homeostasis under oxidative stress | [90] |
| Glutathione S-transferase | Caulobacter vibrioides | TR: A0A258CQ14 | B7Z12_21310 | EMTPVNPGV / EMIPVNIGV | 30–38 | 77.8% | 3.8 × 10⁻² | Substrate S-glutathionylation and detoxification | [91] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zavadskiy, S.P.; Gruzdov, D.S.; Sologova, S.S.; Terentiev, A.A.; Moldogazieva, N.T. Evolutionary Conserved Short Linear Motifs Provide Insights into the Cellular Response to Stress. Antioxidants 2023, 12, 96. https://doi.org/10.3390/antiox12010096