Highlights
- Hundreds of short linear motifs (SLiMs) that exhibit a high degree of sequence similarity to two biologically active sites of human alpha-fetoprotein (AFP) were identified.
- The SLiMs of interest are ubiquitously distributed and found in proteins of both eukaryotic and prokaryotic species.
- Proteins retrieved by sequence alignment belong to various functional classes that are directly or indirectly involved in the cellular response to stress.
- Our findings provide insights into the common functions of evolutionarily conserved SLiMs and the putative involvement of AFP in the response to external and internal stimuli during cellular adaptation in embryonic development and cancer.
Abstract
Short linear motifs (SLiMs) are evolutionarily conserved functional modules of proteins composed of 3 to 10 residues and involved in multiple cellular functions. Here, we performed a search for SLiMs that exhibit sequence similarity to two segments of alpha-fetoprotein (AFP), a major mammalian embryonic and cancer-associated protein. Biological activities of the peptides, LDSYQCT (AFP14–20) and EMTPVNPGV (GIP-9), have been previously confirmed under in vitro and in vivo conditions. In our study, we retrieved a vast array of proteins that contain SLiMs of interest from both prokaryotic and eukaryotic species, including viruses, bacteria, archaea, invertebrates, and vertebrates. Comprehensive Gene Ontology enrichment analysis showed that proteins from multiple functional classes, including enzymes, transcription factors, ribosomal proteins, and proteins involved in signaling, cell cycle, and quality control, were implicated in cellular adaptation to environmental stress conditions. These include response to oxidative and metabolic stress, hypoxia, DNA and RNA damage, protein degradation, as well as antimicrobial, antiviral, and immune response. Thus, our data provide insights into the common functions of SLiMs that are evolutionarily conserved across all taxonomic categories. These SLiMs can serve as important players in cellular adaptation to stress, which is crucial for cell functioning.
1. Introduction
Short linear motifs (SLiMs) are evolutionarily conserved functional modules of proteins that represent amino acid stretches composed of 3 to 10 residues involved in recognition and targeting activities [1]. SLiMs function through transient interactions with a variety of binding partners, mostly with globular domains of other proteins. They are thereby involved in protein–protein interactions, which underlie numerous cellular processes including signal transduction, metabolism, electron transfer, cell cycle, and membrane transport [2]. It is now recognized that similar SLiMs can be found in numerous non-homologous, unrelated proteins recruited to common regulatory functions [3]. They exhibit evolutionary plasticity that has facilitated a rapid growth in their use and resulted in their ubiquitous distribution across a variety of organisms.
A growing body of evidence indicates that, over long evolutionary time, short amino acid segments undergo mutations and multiple events of reuse in a variety of non-homologous proteins [4]. Evolutionary events such as duplication, fusion, and recombination have been suggested to provide a mechanism for the reuse and successful incorporation of such stretches into multiple unrelated proteins [5,6]. Therefore, such reuse of pre-existing sequences is likely to offer an evolutionary advantage for functional proteins.
Presumably, ancient proteins were quite short molecules that evolved into contemporary large, globular, functional proteins through the incorporation of short peptide stretches [7]. Indeed, Eck and Dayhoff described the incorporation of a prototype small iron–sulfur cluster-containing protein, ferredoxin, which is involved in electron transfer and redox regulation, into metabolic proteins [8]. This could have happened at the very early stages of biochemical evolution through the doubling of a prototype that was enriched in Ala, Asp, Pro, Ser, and Gly residues. The authors have identified two major functional types of primordial peptides composed of 9 to 38 residues in length. The first type represented nucleic acid-binding and ribosomal peptides, while the other type was catalytic peptides that can coordinate metal ions, iron–sulfur clusters, nucleotides, and nucleotide-derived cofactors [9]. Therefore, identifying evolutionarily conserved SLiMs with sequence similarity to the prototype peptides can be indicative of common ancestry and functional relationships [10,11].
Earlier, we identified a variety of human proteins that contain SLiMs with sequence similarity to two functionally important segments of human alpha-fetoprotein (AFP) [12]. These SLiMs have been proposed to orchestrate the functioning of multiple non-homologous human proteins during embryonic development, redox regulation, and cancer progression. AFP is a major mammalian development- and cancer-associated protein that in humans is composed of 609 amino acids organized in three structural domains (I, II, and III) [13]. Experimental data have shown that human and rodent AFPs are capable of binding metal ions and various hydrophobic ligands.
Multiple linear segments with putative and experimentally confirmed functions have been identified to enable functional and structural mapping of human AFP [14]. The 34-amino-acid stretch located in domain III, encompassing residues 464 to 497 of full-length human AFP, has been designated growth-inhibitory peptide (GIP) and has been chemically synthesized, purified, and characterized [15,16]. GIP was found to exert inhibitory effects on mouse uterine cell proliferation and on cancer growth in an MCF-7 cell line model [17]. Its C-terminal segment, EMTPVNPGV (GIP-9), which encompasses residues 489 to 497, was found to be one of the most biologically active segments of GIP [18]. Moreover, human AFP and derived peptides have been experimentally shown to reduce the fetotoxicity of high doses of insulin and estrogens in murine and chick models [19].
Another AFP-derived peptide, LDSYQCT, is located in domain I and encompasses residues 32 to 38 of the full-length protein. In the mature protein, this segment encompasses residues 14 to 20 and, consequently, it has been designated AFP14–20 [20]. This heptapeptide has been shown to share a high degree of sequence similarity with a part of the receptor-binding domain of epidermal growth factor (EGF). AFP14–20 has also been chemically synthesized and has demonstrated immunomodulatory effects in cultures of human phytohemagglutinin (PHA)-activated lymphocytes [21]. Multiple analogs and fragments of this peptide have been obtained; their biological activity correlated with amino acid composition, which influences conformational changes in the protein backbone [22].
Here, we used a SLiM search approach based on local sequence alignment algorithms to retrieve proteins that contain short GIP-9-like and AFP14–20-like motifs from protein primary structure databases. We identified a vast array of proteins from all taxonomies, including bacteria, viruses, archaea, and eukaryotes, that contain both SLiM types of interest. Amino acid composition analyses of all retrieved SLiMs revealed a high degree of sequence conservation and hotspot residues. Furthermore, we performed comprehensive Gene Ontology (GO) functional enrichment analysis and found that both motif types can be identified in proteins involved directly or indirectly in the cellular response to biotic and abiotic stress. Our data suggest that these conserved motifs underlie the involvement of a vast array of proteins in the cellular response to stress conditions. They also suggest that AFP can be involved in cellular adaptation to oxidative, genotoxic, and metabolic stress during embryonic development and cancer growth.
2. Materials and Methods
2.1. Mapping of AFP14–20-like and GIP-9-like Peptides
Both biologically active peptides, AFP14–20 and GIP-9, were mapped onto the three-dimensional (3D) structure of human AFP in order to assess their structural features. For this purpose, we used the 3D structure of human AFP that we previously constructed by homology-based modelling with Schrödinger software (release 2018-2) [23,24]. The PyMOL molecular graphics system, version 2.5, was used for structure visualization (https://pymol.org/2/ (accessed on 21 January 2022)) [25].
2.2. Search for Short Linear Motifs
We carried out local sequence alignment using both AFP-derived peptides, LDSYQCT and EMTPVNPGV, as queries for the sequence similarity search. The FastA suite [26] provided by the European Bioinformatics Institute of the European Molecular Biology Laboratory (EMBL-EBI) (https://www.ebi.ac.uk/Tools/sss/fasta/ (accessed on 8 January 2022)) [27] was used. The alignment was performed against the UniProtKB protein knowledgebase (https://www.uniprot.org/ (accessed on 8 January 2022)), including both the UniProtKB/Swiss-Prot (manually annotated and reviewed) and UniProtKB/TrEMBL (automatically annotated) sections [28]. No restriction on taxonomic categories was applied. The GLSEARCH algorithm (version 36.3.8 h) provided the most suitable search for sequences that match the query peptides. Default parameters were used: BLOSUM50 matrix, gap open -10, gap extension -2, and expectation value (E-value) upper limit 10 and lower limit 0, returning up to 500 alignments.
2.3. Amino Acid Conservation Analysis
SLiMs obtained with the FastA GLSEARCH algorithm were further subjected to amino acid substitution analysis. The frequency of each amino acid at each position across all SLiMs was calculated as follows: N = a/b × 100%. Here, a is the number of occurrences of a given residue at a given position and b is the total number of SLiMs. All SLiMs were taken into account, including those aligned to AFP itself from all species and those from uncharacterized and hypothetical proteins. Graphical representation of the amino acid conservation was performed with the WebLogo3 tool (http://weblogo.threeplusone.com/create.cgi (accessed on 5 February 2022)) [29].
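The per-position calculation N = a/b × 100% can be illustrated with a minimal Python sketch. It assumes that all motifs are ungapped and of equal length; the function name `position_frequencies` and the four toy motifs are hypothetical and are not taken from the retrieved dataset.

```python
from collections import Counter


def position_frequencies(motifs):
    """Return one dict per position mapping residue -> N = a/b * 100 (%)."""
    if not motifs:
        raise ValueError("no motifs given")
    length = len(motifs[0])
    if any(len(m) != length for m in motifs):
        raise ValueError("all motifs must have the same length")
    b = len(motifs)  # total number of SLiMs
    frequencies = []
    for pos in range(length):
        # a = number of occurrences of each residue at this position
        counts = Counter(m[pos] for m in motifs)
        frequencies.append({res: a / b * 100.0 for res, a in counts.items()})
    return frequencies


# Hypothetical toy input, for illustration only
toy_motifs = ["LDSYQCT", "LDSFQCT", "MDTYKCS", "LNSYQCT"]
freqs = position_frequencies(toy_motifs)
print(freqs[5])  # frequencies at position 6 of the toy motifs
```

Each dictionary in the returned list corresponds to one stack of a sequence logo, although the sketch reports plain percentages rather than the information content that a logo displays.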
2.4. Functional Classification of Retrieved Proteins
All proteins extracted from both the Swiss-Prot and TrEMBL sections of the UniProtKB database were subjected to GO term-based functional classification [30] in both the molecular function and biological process categories (http://geneontology.org/ (accessed on 14 May 2022)). These included all retrieved proteins from both prokaryotic and eukaryotic taxonomies. Since TrEMBL is a large section that contains automatically annotated proteins, cut-offs of E-value 0.1 and identity 57.1% for AFP14–20-like motifs and E-value 0.1 and identity 55.6% for GIP-9-like motifs were applied to alignments against this section of UniProtKB. In addition to UniProtKB, the InterPro protein family resource (https://www.ebi.ac.uk/interpro/ (accessed on 7 June 2022)) was used for functional annotation of the retrieved proteins [31].
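The identity cut-offs quoted above correspond to 4 identical residues out of 7 (57.1%) and 5 out of 9 (55.6%). The following Python sketch shows how such a filter could be applied. It assumes ungapped alignments of the same length as the query, identity rounded to one decimal place, and cut-offs interpreted as "E-value at most 0.1" and "identity at least the stated value"; these interpretations, the function names, and the example hit sequences are hypothetical.

```python
def percent_identity(query, hit):
    """Percentage of identical residues in an ungapped, equal-length alignment."""
    if len(query) != len(hit):
        raise ValueError("query and hit must have the same length")
    matches = sum(q == h for q, h in zip(query, hit))
    return round(matches / len(query) * 100.0, 1)


def passes_cutoff(query, hit, e_value, max_e_value=0.1, min_identity=57.1):
    """Keep a hit only if it satisfies both the E-value and identity cut-offs."""
    return e_value <= max_e_value and percent_identity(query, hit) >= min_identity


# Hypothetical hits, for illustration only
print(percent_identity("LDSYQCT", "MDTYKCT"))      # 4 of 7 residues identical
print(percent_identity("EMTPVNPGV", "DLTPINPGL"))  # 5 of 9 residues identical
print(passes_cutoff("EMTPVNPGV", "DLTPINPGL", e_value=0.05, min_identity=55.6))
```

Rounding before the comparison matters here: 5/9 is 55.56%, which would fail a strict comparison against 55.6% if it were left unrounded.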
2.5. Gene Set Enrichment Analysis
For further gene set enrichment analysis, two lists of genes coding for the retrieved proteins containing either AFP14–20-like or GIP-9-like motifs were manually created. The UniProtKB IDs were used and, when needed, converted into Ensembl gene IDs and STRING-db protein IDs. These datasets were used as backgrounds for GO enrichment analysis. The created lists were first uploaded into the PANTHER classification system (http://pantherdb.org/ (accessed on 17 May 2022)) of the Gene Ontology resource [32]. The R/Bioconductor packages in the graphical ShinyGO v0.75 suite (http://bioinformatics.sdstate.edu/go/ (accessed on 24 May 2022)) were used [33] for further functional enrichment analysis. Characteristics of a list of genes were compared with those of the other genes of the whole genome (background) and Student’s t-test was applied. Additionally, the gProfiler functional enrichment analysis resource [34] (https://biit.cs.ut.ee/gprofiler/ (accessed on 28 May 2022)) was used. Here, the gSCS statistical threshold was set to 0.2 and ENTREZGENE_ACC numerical IDs were used to extract all known gene sets.
3. Results
3.1. Biologically Active Peptides Are Located on AFP Surface
Figure 1 depicts the overall U-shaped architecture and 3D organization of human AFP, with secondary structure elements represented by alpha-helices and loops and no beta-strands. Visualization of the obtained structure showed that the two distinct functionally important segments of human AFP with experimentally confirmed biological activities, AFP14–20 and GIP-9, are located on the protein surface and are therefore accessible to the solvent and/or protein binding partners.
Figure 1.
The overall architecture of AFP is represented by a U-shaped structure composed of three domains: I (orange, residues 19–210), II (green, residues 211–402), and III (cyan, residues 403–601). Two functionally important segments are shown: AFP14–20 with sequence LDSYQCT (residues 32–38, colored in blue) and GIP-9 with sequence EMTPVNPGV (residues 489–497, colored in red), which is a part of GIP-34 (residues 464–497, colored in pink).
The first peptide segment is located in domain I, close to the N-terminus, and is arranged in an α-helical conformation. The second segment encompasses the C-terminal part of the GIP-34 peptide, which occupies the longest α-helical stretch in domain III. Only a small part of the GIP-9 peptide is arranged in an α-helix, while the remaining part represents a disordered region, and this can have a role in its functionality.
3.2. Proteins Containing SLiMs of Interest Are Biologically Diverse
Local sequence alignment enabled retrieval of 464 proteins from the Swiss-Prot section of the UniProtKB database and 500 proteins from its TrEMBL section (with maximum E-value 6.9 × 10−4) that contain SLiMs with sequence similarity to the LDSYQCT peptide. They covered proteins from a wide range of taxonomic categories and included uncharacterized and hypothetical proteins. Table 1 shows the most representative proteins from various species aligned with the LDSYQCT sequence and the alignment E-values: the lower the E-value, the higher the statistical significance of the alignment. In the alignment column, the upper sequence is the query, whereas the lower sequence is from the retrieved protein. Proteins that contain AFP14–20-like motifs play various biological roles, including transcriptional and translational regulation, oxidoreductase and electron transfer activity, protein quality control, host–pathogen interaction, and biotic and abiotic stress response, and they include components of ribosomes and of the toxin–antitoxin system.
Table 1.
Representative proteins retrieved from the UniProtKB database as containing AFP14–20-like motifs (at E-value < 0.05).
SLiMs with sequence similarity to EMTPVNPG octapeptide were identified in 258 proteins from the Swiss-Prot section and 500 proteins from the TrEMBL section (with maximum E-value 4.3 × 10−2) of UniProtKB database. These proteins covered all taxonomic categories and included AFP from different biological species, uncharacterized and hypothetical proteins. Table 2 contains the most representative proteins aligned to GIP-9 segment and the alignment E-values. Proteins that contain GIP-9-like motifs also have a wide range of biological roles including involvement in cell signaling, transcriptional regulation, metabolic processes, response to chemicals, immune response, electron transfer, etc.
Table 2.
Representative proteins retrieved from the UniProtKB database as containing GIP-9-like motifs (at E-value < 0.05).
3.3. SLiMs of Interest Are Enriched in Conserved Residues
After the exclusion of identical proteins from different taxonomies, 199 AFP14–20-like and 280 GIP-9-like unique motifs were identified. We then assessed amino acid frequencies at each position of the unique SLiMs. The most conserved residue in AFP14–20-like motifs was cysteine (C), which accounted for 100% of residues at position 6. The second-most conserved residue was aspartic acid (D), which constituted 82.9% of all residues at position 2 and was replaced, predominantly, by the physicochemically similar asparagine (N) and glutamate (E). Two aromatic amino acids, tyrosine (Y) and phenylalanine (F), comprised 83.9% of all residues at position 4. At position 1, 57.3% of residues were represented by leucine (L), which was substituted by other hydrophobic residues: methionine (M), isoleucine (I), and valine (V). Serine (S) constituted 54.8% of all residues at position 3 and was replaced, mostly, by the hydrophilic amino acids T, K, and E. The hydroxyl group-containing residues T and S constituted 65.3% of all residues at position 7. Glutamine (Q) comprised 67.8% of all residues at position 5 and was replaced by charged and hydrophilic amino acids: lysine (K), aspartate (D), and arginine (R). These calculations, with the application of a threshold of 5%, resulted in the following notation for the consensus sequence: L[MIV]D[NE]S[TKE]Y[F]Q[KDR]CT[S]. Figure 2A graphically depicts the frequency of each amino acid at every position of the retrieved AFP14–20-like motifs.
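The consensus notation used here, in which the most frequent residue at each position is followed by bracketed alternatives that reach the 5% threshold, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: it treats the threshold as inclusive, orders alternatives by decreasing frequency, and operates on hypothetical toy motifs; the function name `consensus_notation` is likewise hypothetical.

```python
from collections import Counter


def consensus_notation(motifs, threshold=5.0):
    """Build a consensus string such as 'L[M]D[N]...' from equal-length motifs.

    For every position, the most frequent residue is written first, followed
    in brackets by other residues whose frequency (%) is >= threshold.
    """
    if any(len(m) != len(motifs[0]) for m in motifs):
        raise ValueError("all motifs must have the same length")
    total = len(motifs)
    parts = []
    for column in zip(*motifs):
        counts = Counter(column)
        # Sort by decreasing count, then alphabetically for a stable order
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        top_residue = ranked[0][0]
        alternatives = [
            res for res, n in ranked[1:] if n / total * 100.0 >= threshold
        ]
        suffix = "[" + "".join(alternatives) + "]" if alternatives else ""
        parts.append(top_residue + suffix)
    return "".join(parts)


# Hypothetical toy input, for illustration only
toy_motifs = ["LDSYQCT", "LDSFQCT", "MDTYKCS", "LNSYQCT"]
print(consensus_notation(toy_motifs))        # threshold of 5%
print(consensus_notation(toy_motifs, 30.0))  # stricter threshold
```

Raising the threshold removes rare substitutions from the brackets, so a fully invariant position such as the cysteine at position 6 appears without any bracket at every threshold.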
Figure 2.
WebLogo representation of amino acid abundances at each position of (A) AFP14–20-like and (B) GIP-9-like motifs identified in proteins retrieved from UniProtKB database. The overall height of every stack indicates residue conservation at each position, while a symbol height within the stack indicates relative frequency of each residue at that position. Colors of symbols are as follows: hydrophobic and glycine—green, hydrophilic and positively charged—orange, negatively charged and their amides—blue, aromatic plus proline—purple, and cysteine—red.
As for GIP-9-like motifs, the three most conserved positions were 4, 7, and 8. Positions 4 and 7 were occupied by proline (P), which comprised 96% and 98% of all residues, respectively. The third-most conserved residue was glycine (G), which comprised 92% of all residues at position 8. The least conserved position was 2, where methionine (35.4%) was the most frequent residue and could be replaced by other hydrophobic amino acids: L, I, and V. At position 1, glutamic acid constituted 60.4% of all residues and was replaced most frequently by D, Q, and K, which have similar physicochemical properties. Threonine (T) constituted 51.8% of all residues at position 3 and was replaced most frequently by serine (S), a physicochemically similar residue (12.9%). Position 5 was occupied, mostly, by large hydrophobic residues: V (55.0%), I (20.4%), and L (7.9%). At position 6, asparagine (N) comprised 56.8% of all residues and the most frequent replacements were D and S, while position 9 was occupied by large hydrophobic amino acids: V (47.1%), I (18.2%), and L (12.1%). On the basis of these calculations, the following notation for the consensus sequence was obtained: E[DQ]M[LIV]T[S]PV[LI]N[DS]PGV[LI]. Figure 2B graphically depicts the frequency of each amino acid at every position of the identified GIP-9-like motifs.
Therefore, both SLiM types of interest contain a large proportion of conserved amino acid residues indicating that they are evolutionarily preserved through all biological species starting from bacteria and viruses to higher eukaryotes. Interestingly, the consensus sequences were enriched in D, S, and P residues found in prototype peptides, which have been proposed to give rise to modern proteins.
3.4. Retrieved Genes Ubiquitously Exist
Furthermore, we classified unique genes that code for proteins containing both SLiM types of interest according to taxonomic category. We found that the retrieved genes are widely distributed among all taxonomic groups, including bacteria, viruses, archaea, and various invertebrate and vertebrate species, including mammals and primates. Figure 3A,B depicts the taxonomic distribution of unique genes coding for proteins with AFP14–20-like and GIP-9-like motifs, respectively.
Figure 3.
Diagram representations of the taxonomic distribution of genes encoding proteins that were retrieved from the UniProtKB database as aligned with the (A) AFP14–20 and (B) GIP-9 segments of human AFP. Numbers of unique genes in each taxonomic category are shown above each column. Prokaryotes: bacteria (blue), viruses (brown), archaea (grey); mammals: Homo sapiens (blue), primates (brown), other mammals (grey); vertebrates: birds (blue), fishes (brown), amphibia (grey); invertebrates: reptiles (blue), insects (brown), nematodes (grey); plants: higher plants (blue), algae (brown); other eukaryotes: S. cerevisiae (blue), fungi (brown), mollusks, scorpions, spiders, etc. (grey).
Up to 64% and 74% of AFP14–20-like and GIP-9-like motifs, respectively, were found in bacterial proteins, while about 10% and 16% of motifs, respectively, were identified in mammalian proteins. Some retrieved genes had orthologs in multiple biological species; therefore, each such gene was treated as a unique gene. For example, both SLiMs of interest were found in cytochrome c biogenesis protein CcmE and malate dehydrogenase from a wide range of bacterial species, while transmembrane protein TMEM258 was retrieved from various eukaryotic species (see Table 1 and Table 2).
3.5. Retrieved Proteins Are Functionally Diverse
We used GO term annotations provided in the UniProtKB and InterPro databases to classify all retrieved proteins according to the molecular function and biological process categories (Figure 4). The total number of terms can differ from the total number of proteins aligned to each SLiM type of interest because (i) more than one GO term may be assigned to a unique protein and (ii) the same unique protein can belong to a variety of taxonomic categories. As shown in Figure 4A, metal ion binding, catalytic activity, and transferase activity were the predominant molecular function terms for AFP14–20-like motif-containing proteins. Additionally, there were proteins that exert oxidoreductase/electron transfer, DNA/RNA-binding, transcription factor, antimicrobial defense, and immune response activities. The largest portion of proteins aligned to the GIP-9 segment of human AFP belonged to oxidoreductases and to metal ion/iron–sulfur cluster binding, heme binding, and DNA binding proteins (Figure 4B).
Figure 4.
Gene Ontology categorization of proteins retrieved from the UniProtKB knowledgebase. (A,B) Molecular function and (C,D) biological process terms for proteins aligned with the (A,C) AFP14–20 and (B,D) GIP-9 segments of human AFP. Categories are ranked in decreasing order of the number of unique genes. The number of unique genes was counted manually, without taking into account the statistical significance of each category.
Categorization of the retrieved proteins according to biological process terms showed that the majority of AFP14–20-like motif-containing proteins are involved in transcriptional regulation, oxidative stress response, RNA processing, and host–pathogen defense response (Figure 4C). GIP-9-like motif-containing proteins were involved in aerobic respiration/electron transfer, response to environmental stress, metabolic process, regulation of gene expression, translation, DNA repair, and protein quality control (Figure 4D).
Additionally, prominent roles belonged to proteins involved in response to the pathogen and immune response. There were apparent relationships between molecular function and biological process terms. For example, DNA binding and metal ion binding activities can be assigned to transcriptional regulation, while electron transfer/oxidoreductase activities and, partly, metal ion binding activity underlie cell response to oxidative stress and antimicrobial, antifungal, and antiviral defense responses.
3.6. Prokaryotic Genes Are Required for Stress Tolerance
In order to identify the most statistically significant GO categories, we carried out gene set enrichment analysis with the ShinyGO v0.75 and gProfiler suites. In the GO classification system, 389 unique genes encoding AFP14–20-like motif-containing proteins and 273 unique genes encoding GIP-9-like motif-containing proteins were mapped to Ensembl gene IDs. Figure 5 depicts a typical GO term-based functional categorization of genes encoding AFP14–20-like motif-containing proteins. From our gene set list, up to 41 bacterial genes were mapped to Ensembl genome IDs.
Figure 5.
(A) Molecular function, (B) biological process, and (C) all-available gene set categorization in Gene Ontology terms for a representative bacterial genome (Acinetobacter sp.). Lollipop charts at aspect ratio 1.5 and -log10 (FDR) heat maps for each category are shown. FDR is calculated based on the nominal p-value from the hypergeometric test. FDR shows how likely the enrichment is by chance. Larger gene sets tend to have smaller FDR. N. of Genes indicates the number of genes for each category.
As shown in Figure 5A, at FDR cutoff 0.2, bacterial genes associated with nucleotide/nucleic acid binding, ion/metal ion binding and ATP binding activities were retrieved at high statistical significance (low p-value) in molecular function categories. Not surprisingly, biological processes involved in metabolism and nucleotide/nucleic acid and amino acid biosynthesis required for bacterial reproduction were overrepresented (Figure 5B). However, when all available gene sets were retrieved, oxidoreductase and chaperone activity as well as chemical stimuli/stress response and SOS response activities were identified among statistically significant categories identified for bacterial proteins (Figure 5C).
These data were confirmed by functional enrichment analysis of genes encoding GIP-9-like motif-containing proteins. From our gene list, up to 29 unique genes were mapped to Ensembl genome IDs in each bacterial taxonomy. Figure 6A–C depicts the all-available gene set enrichment analyses for three representative bacterial species. As shown in Figure 6, pathways associated with metabolic processes, nucleic acid and protein biosynthesis, translation, and DNA repair are the most statistically significant. However, pathways that underlie cellular response to biotic and abiotic stress and chemical stimuli were identified. They included SOS response and oxidative stress response that occurs with the involvement of oxidoreductase/electron transfer enzymes including those containing Fe-S clusters.
Figure 6.
Prokaryotic genes coding for proteins containing GIP-9-like motifs. All-available gene set analysis of (A) Desulfotomaculum guttoideum, (B) Bacillus selenitireducens, and (C) Clostridium aminophilum genes. Categories are ranked by fold enrichment order; that is, the percentage of genes in the list belonging to each category divided by the corresponding percentage in the background. Fold enrichment indicates how drastically genes of a certain pathway are overrepresented. N. of Genes indicates the number of genes for each category.
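The fold enrichment defined in the Figure 6 caption and the hypergeometric test named in the Figure 5 caption can be written out explicitly. The Python sketch below is a generic illustration of these two quantities, not the ShinyGO implementation; the function names, argument names, and the gene counts in the example are hypothetical, and multiple-testing correction (FDR) is not included.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Share of list genes in the category divided by the background share.

    k: genes in the list that belong to the category
    n: genes in the list
    K: genes in the background that belong to the category
    N: genes in the background
    """
    return (k / n) / (K / N)


def hypergeom_pvalue(k, n, K, N):
    """Nominal p-value P(X >= k) of the one-sided hypergeometric test."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts, for illustration only:
# 5 of 29 list genes fall in a category that holds 100 of 3000 background genes
print(fold_enrichment(5, 29, 100, 3000))
print(hypergeom_pvalue(5, 29, 100, 3000))
```

Because the p-value depends on the absolute counts while fold enrichment depends only on proportions, a small category can show a high fold enrichment with weak statistical support, which is consistent with the caption's remark that larger gene sets tend to have smaller FDR.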
3.7. Eukaryotic Genes Are Responsible for Stress and Defense Response
Figure 7 depicts the Manhattan plots that illustrate GO terms for human (A and C) and A. thaliana (B and D) gene sets coding for (A and B) AFP14–20-like motif-containing and (C and D) GIP-9-like motif-containing proteins.
Figure 7.
Grouping of eukaryotic genes. Manhattan plots of all H. sapiens (A,C) and A. thaliana (B,D) gene sets coding for (A,B) AFP14–20-like motif-containing and (C,D) GIP-9-like motif-containing proteins. The x-axis represents functional terms that are grouped and color-coded by data sources, while the y-axis shows the adjusted enrichment p-values in negative log10 scale. MF, molecular function; BP, biological process; CC, cellular component; KEGG, KEGG pathway; REAC, Reactome; WP, WikiPathways; TF, transcription factor; MIRNA, microRNA; HPA, Human Protein Atlas; CORUM, CORUM dataset; and HP, human phenotype. Each circle indicates a functional enrichment term, while the circle sizes correspond to the term size; larger terms have larger circles.
In humans, up to 54 unique genes encoding AFP14–20-like motif-containing proteins were mapped to Ensembl gene IDs. In other mammals, the number of corresponding unique genes ranged from 34 to 53 for AFP14–20-like motifs and from 14 to 20 for GIP-9-like motifs.
In GO-based molecular function terms, H. sapiens protein/receptor binding, ion/metal ion-binding and calcium-binding, DNA and heterocyclic compound (nucleotide)-binding as well as dioxygenase and oxidoreductase activities were among the most significant categories (Figure 8A). As expected, in GO biological process terms, biosynthetic and developmental processes as well as cell communication and cell signaling pathways were identified as the most significant functional terms (Figure 8B).

Figure 8.
H. sapiens and A. thaliana gene set enrichment analysis of AFP14–20-like motif-containing proteins. Human genes categorized in (A) molecular function terms and (B) biological process terms. A. thaliana genes categorized in (C) molecular function terms and (D) biological process terms. Color codes indicate data inferred from: dark brown—experiment/direct assay, light brown—genetic and physical interactions, yellow—sequence similarity, dark purple—high throughput experiment, green—curator, blue—reviewed computational data.
Unexpectedly, response to stress and chemical stimulus and DNA damage were also among the statistically significant biological processes. This picture was typical for various animal species, where a wide range of stress response proteins including oxidoreductases, ubiquitin-activating enzymes, channel activity regulators, and cell signaling proteins were retrieved. In plants, up to 47 unique genes encoding AFP14–20-like motif-containing proteins were mapped to Ensembl gene IDs. These included proteins important for cell division, such as those involved in RNA binding, nucleotide biosynthesis, and translation. Interestingly, proteins implicated in stress/defense response, such as oxidoreductases and proteins involved in the killing of other organisms, were also among the significant ones in plants (Figure 8C,D).
As for GIP-9-like motif-containing proteins, fewer statistically significant GO terms were identified (Figure 9). Up to 21 unique genes in mammals and up to 15 unique genes in plants were mapped to Ensembl gene IDs.

Figure 9.
H. sapiens and A. thaliana gene set enrichment analysis of GIP-9-like motif-containing proteins. Human genes categorized in (A) molecular function terms and (B) biological process terms. A. thaliana genes categorized in (C) molecular function terms and (D) biological process terms. Color codes indicate data inferred from: dark brown—experiment/direct assay, light brown—genetic and physical interactions, yellow—sequence similarity, dark purple—high throughput experiment, green—curator, blue—reviewed computational data.
In H. sapiens GO molecular function terms, protein and nucleotide binding activities along with chaperone and ion/metal ion binding activities were overrepresented (Figure 9A). In biological process terms, immune and defense response as well as autophagy and apoptosis were identified as significant for human genes (Figure 9B). In plants, NADPH-dependent oxidoreductase, ion channel, and RNA/DNA binding activities, which underlie response to external stimulus, protein localization, and cellular metabolism, were identified (Figure 9C,D).
4. Discussion
SLiMs are often found in the rapidly evolving intrinsically disordered regions of proteins, and motif acquisition can proceed through convergent evolution [92]. Frequent mutations, small size, and low complexity make it difficult to identify motifs and to study their functions. Here, we used bioinformatics and GO enrichment analyses to search for SLiMs with sequence similarity to two AFP-derived sequences, LDSYQCT (AFP14–20) and EMTPVNPGV (GIP-9). We identified a vast array of similar motifs across all taxonomic categories including bacteria, viruses, archaea, and various eukaryotic species.
One of the most prominent molecular functions of human and rodent AFPs is their metal ion binding capability [93], which is shared by most of the proteins retrieved in our study. This capability underlies the involvement of proteins in various cellular processes including metabolism, transcriptional regulation, and redox regulation. Most prokaryotic proteins were, unsurprisingly, involved in nucleotide, nucleic acid, amino acid, and protein biosynthesis necessary for reproduction. However, the overwhelming majority of both prokaryotic and eukaryotic proteins, including enzymes, transcription factors, quality control proteins, and ribosomal proteins, were involved in cellular adaptation to environmental changes and various stress conditions. Our data suggest that AFP can use the SLiMs of interest to provide cellular adaptation to stress conditions during embryonic development and cancer growth.
4.1. AFP14–20-like Motif-Containing Proteins
We found that most bacterial and archaeal proteins containing short segments aligned with AFP14–20 at high statistical significance (E-value of ~10⁻⁵–10⁻⁴) are involved in maintaining cellular redox balance (Table 1). For example, iron–sulfur (Fe-S) cluster proteins such as rubredoxins, ferredoxins, anaredoxin, and desulfoferrodoxin exert antioxidant activity and play important roles in bacterial adaptation to environmental changes [44,46]. These proteins share a characteristic arrangement of four Cys residues that surround the Fe-S clusters involved in electron transfer from cognate reductases to cytochrome P-450s, which helps maintain pathogen viability [35]. Fe-S clusters are found in many enzymes central to metabolic processes such as nitrogen fixation, respiration, and DNA processing and repair. Additionally, enzymes with flavin oxidoreductase activity such as choline dehydrogenase (Cdh), which oxidizes choline to betaine aldehyde for its further oxidation to betaine, were retrieved. Betaine is a source of the CH3-group for biosynthesis of nucleotides, amino acids, etc., and supports adaptation of phototrophic bacteria to osmotic stress [53]. Choline oxidation is associated with electron transfer to the electron transport chain (ETC) and ROS generation [38]. Consistent with this, NADH-quinone oxidoreductase (ETC complex I), which is one of the major sites of ROS production in many bacterial strains [65], was also aligned to the AFP14–20 segment. Additionally, a variety of modulators of the environmental stress response were aligned to the AFP14–20 segment. Among them were histidine kinase response regulator protein, stress response protein YhaX [43,59], Sel1 domain-containing protein [42], and RagB/SusD family nutrient uptake outer membrane protein [39], which regulate the host cell response to the pathogen.
Moreover, bacterial 8-oxo-dGTP diphosphatase MutT and dITP/XTP pyrophosphatase enzymes, which are involved in the SOS response through the removal of oxidatively damaged and non-canonical nucleotides, were retrieved [41].
Transcription factors that regulate gene expression in bacteria, archaea, and viruses for their adaptation to environmental stress conditions were also among the retrieved proteins. They included a helix-turn-helix domain-containing AraC family regulator and a TetR transcriptional regulator that typically bind to target DNA and regulate pathogenic properties by sensing small molecule inducers such as urea, bicarbonate, and glycerol [49,52]. Bacterial ribosomal enzymes that catalyze posttranslational modification of proteins involved in translation were also aligned to the AFP14–20 segment. An example is rimI, which encodes the ribosomal protein S18-alanine acetyltransferase [36]. Among viral proteins, we found proteins involved in host–pathogen interaction through promoting nucleic acid replication and modulating the host adaptive immune response. They included host range factor 1 [48] and infected cell protein 47 (ICP47) [60], which function under changing redox conditions. For example, ICP47 directly binds the transporter associated with antigen processing (TAP), leading to the occurrence of empty MHC-I, which is under redox control due to disulfide bond oxidation/reduction [94].
In plants, the Rho family of Ras-related GTP-binding (Rop) proteins work as signaling switches that control growth, development, and apoptosis in response to various environmental stimuli [54]. Proteins containing a highly conserved catalytic PRONE (plant-specific Rop nucleotide exchanger) domain, which has strong substrate specificity for members of the Rop family, were aligned to the AFP14–20 segment. Additionally, developmental proteins with antimicrobial activity such as gibberellic acid-stimulated Arabidopsis (GASA) [66] were retrieved. The acidic leucine-rich nuclear phosphoprotein 32-related protein 2, which is involved in histone chaperone activity and the integration of the environmental stress response in plants, and in immunomodulation and tumor progression in humans [95], was also retrieved, though at low significance.
In animals, a variety of small proteins with ion channel regulator and toxin activity such as U-scoloptoxin [55], auger peptide hheTx2 [54], leiurutoxin-3 [62], and others produced by various mollusks, snakes, and insects were aligned with the AFP14–20 segment. Additionally, Cys-rich and metal ion binding small proteins including defensins, ranatuerins, and brevinins figure prominently in the alignment. These host defense proteins have key roles in oxidative stress response, immune response, and antimicrobial, antifungal, and antiviral activities [61]. Defensins have been implicated in tumor growth, exhibiting both tumor-suppressive and tumor-promoting effects [56]. In human carcinomas, defensins exert antitumor effects through induction of apoptosis, inhibition of angiogenesis, and immunomodulation.
Furthermore, transcription regulators that are involved in the response to changes in microenvironmental conditions were retrieved. They include CCHC-type domain-containing protein, C2H2-type zinc finger protein 142, nucleus accumbens-associated protein 1 (NAC1), and retinoic acid receptor RXR-gamma-B, which are involved in various diseases including cancer and neurodevelopmental disorders [37,40,60,63]. Additionally, the retrieval of a calcium-binding EGF-like domain protein is notable because the AFP14–20 motif is a part of EGF and EGF-like domains involved in various signaling pathways [51]. Among them are JAG1/Notch signaling cascades, which activate a number of oncogenic factors that regulate cell proliferation, metastasis, angiogenesis, and drug resistance [47]. Furthermore, denticleless protein homolog (DTL) has been associated with the response to DNA damage and the immunosuppressive tumor microenvironment [45].
4.2. GIP-9-like Motif-Containing Proteins
As shown in Table 2, prokaryotic proteins containing sequences aligned with the GIP-9 segment at high significance are primarily involved in maintaining genomic stability, transcriptional regulation, translation, and cell division. These included bacterial 2’-deoxycytidine-5’-triphosphate deaminase [67], forkhead-associated (FHA) domain-containing protein [68], chromosome partitioning protein ParA [69], AcrR family transcriptional regulator [70], dual specificity phosphatase [76], and glutamyl-Q tRNA(Asp) synthetase [79], as well as 30S and 50S ribosomal proteins [83]. They are involved in protection from DNA damage, genotoxicity, injury, osmotic stress, etc.
Additionally, thermonuclease family ribonuclease HII and PINc domain-containing proteins, which are involved in DNA and RNA degradation under stress conditions as a bacterial defense mechanism [75,89], were among the retrieved proteins. Furthermore, components of bacterial toxin–antitoxin systems such as addiction module HigA family antidote, which promote adaptation and persistence by modulating bacterial growth in response to stress [86], were also retrieved.
Qualitatively, most bacterial proteins, including the cupredoxin domain-containing protein, play pivotal roles in many metabolic pathways and in the regulation of redox homeostasis, which are crucial for pathogen survival [82]. NADPH-dependent oxidoreductases such as malate dehydrogenase and short-chain dehydrogenase (SDR) family oxidoreductase are among the enzymes that undergo a thiol-based redox switch as part of the adaptive response to oxidative stress conditions [78]. These also include a cytochrome c biogenesis protein that provides heme binding to the apoprotein of cytochrome c and cytochrome c-552, the components of electron transfer and mitochondrial redox regulation [50,73]. Other proteins containing GIP-9-like segments and involved in the pathogen response to oxidative stress included protein kinases and proteases.
Many proteins, including those responsible for cell cycle control and embryonic development, are regulated under oxidative stress conditions. For example, the Cys residue of CoA-binding protein can undergo S-thiolation in response to oxidative and metabolic stress [71]. Some homologs of bacterial stress response proteins are implicated in disease pathogenesis in humans. For example, divalent cation tolerance protein CutA homolog has been proposed to mediate acetylcholinesterase activity and copper homeostasis, which are implicated in Alzheimer’s disease [87]. In proteobacteria, cell division proteins display redox transformation due to electron transfer and reduction of oxygen, nitrogen, and hydrogen sulfide [72]. Among viral proteins, the envelope glycoprotein E, which is involved in the host immune response to the pathogen, and a viral protein kinase, which has a role in virus virulence and tumor pathogenesis [77], were retrieved.
Similar to AFP14–20-like motifs, GIP-9-like motifs were identified in small Fe-S cluster-containing proteins such as ferredoxins and Cisd2-a protein [80]. Furthermore, coevolution of bacteria with their hosts enabled them to tolerate oxidative stress conditions with the use of an antioxidant system (AOS) that includes both enzymatic and non-enzymatic components [96]. In this context, cupin domain-containing proteins contribute to counteracting the host defense due to functional diversity that includes an AOS component, the superoxide dismutase (SOD) enzyme [74]. Additionally, glutathione S-transferases play important roles in the environmental stress response due to S-glutathionylation of Cys residue and thiol groups resulting in target molecule detoxification [91].
In eukaryotes, GIP-9-like motifs were found in proteins such as small proline-rich proteins, which regulate the cell cycle, cell proliferation, and differentiation [85]. Interestingly, these proteins can be involved in tumor progression and their functioning is under redox control [97]. Additionally, ceruloplasmin, a major copper-carrying plasma protein that possesses ferroxidase activity and is involved in redox regulation [84], was retrieved. In plants and algae, photosystem II stability and assembly factor HCF136, which is essential for the formation of the photosystem II complex, and plastocyanin-like domain-containing protein, which is involved in electron transfer during photosynthesis [81], were retrieved. They are regulated by redox switches between active and inactive states during the light–dark transition [98]. Additionally, various stress-related proteins involved in quality control machinery, including a C2H2-type zinc finger-containing protein and zinc metalloproteinases [85], were identified among GIP-9-like motif-containing proteins. Moreover, a variety of transmembrane proteins involved in the ER stress response, such as TMEM258 [99], and antimicrobial peptides were retrieved, though at lower statistical significance.
In mammals, members of homeobox family transcription factors such as forkhead box protein O1 (FOXO1) and homeobox protein Hox-C5 (HOXC5) that play important roles in metabolism, cell proliferation, apoptosis, development, and stress resistance [90] were identified. Additionally, HSP family members, along with tumor necrosis factor (TNF) ligand family cytokines and Wnt-1 protein involved in the Wnt/β-catenin signaling pathway, which are key players in redox regulation and cancer development [100], were among the retrieved proteins, though at lower significance.
5. Conclusions
In our study, we undertook a comprehensive functional enrichment analysis of a wide range of proteins from all taxonomic groups and different functional classes. All these proteins have similar structural characteristics regarding the presence of conserved SLiMs. Both types of short sequences used as queries for the sequence similarity search were derived from AFP, a major mammalian embryo-specific and tumor-associated protein. Therefore, the identification of a variety of transcription factors and proteins involved in cell signaling, cell cycle progression, cell proliferation and differentiation, and protein quality control was anticipated. However, unexpectedly, various prokaryotic and eukaryotic proteins responsible for the cellular response to both biotic and abiotic stress were retrieved as containing both AFP14–20-like and GIP-9-like motifs. They included proteins implicated in adaptation to and protection against pathogens, reactive oxygen species, toxins, and various chemical agents. Moreover, the overwhelming majority of the retrieved transcription factors and proteins involved in replication and translation were reported to participate in cellular and organismal adaptation to environmental stress stimuli.
We hypothesize that both AFP-derived peptides could have arisen from prototype peptides over a long evolutionary time. At the early stages of biochemical evolution, these peptides were involved in the cellular stress response and preserved this function in modern proteins, including AFP. Therefore, bioinformatics and GO functional enrichment analyses of SLiMs allow insight into the common functions of a variety of proteins and the involvement of AFP in the cellular response to external and internal stimuli during embryonic development and cancer growth. Nevertheless, our data require further confirmation with the use of experimental approaches.
Author Contributions
Conceptualization, N.T.M. and A.A.T.; Methodology, S.P.Z. and S.S.S.; Investigation, D.S.G., N.T.M., S.P.Z. and S.S.S.; Validation and Formal Analysis, D.S.G. and S.P.Z.; Data Curation and Visualization, D.S.G. and S.P.Z.; Writing—original draft, N.T.M. and S.P.Z.; Writing—review and editing, N.T.M. and A.A.T.; Resources, D.S.G. and S.P.Z.; Supervision, A.A.T. and N.T.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are available in a publicly accessible repository. The data presented in this study are openly available in FigShare at https://doi.org/10.6084/m9.figshare.20456727 (accessed on 21 December 2022).
Conflicts of Interest
The authors declare no financial, professional, or personal competing interests.
References
- Neduva, V.; Russell, R.B. Linear motifs: Evolutionary interaction switches. FEBS Lett. 2005, 579, 3342–3345.
- Davey, N.E.; Van Roey, K.; Weatheritt, R.J.; Toedt, G.; Uyar, B.; Altenberg, B.; Budd, A.; Diella, F.; Dinkel, H.; Gibson, T.J. Attributes of short linear motifs. Mol. Biosyst. 2012, 8, 268–281.
- Van Roey, K.; Davey, N.E. Motif co-regulation and co-operativity are common mechanisms in transcriptional, post-transcriptional and post-translational regulation. Cell Commun. Signal. 2015, 13, 45.
- Kolodny, R. Searching protein space for ancient sub-domain segments. Curr. Opin. Struct. Biol. 2021, 68, 105–112.
- Nepomnyachiy, S.; Ben-Tal, N.; Kolodny, R. Global view of the protein universe. Proc. Natl. Acad. Sci. USA 2014, 111, 11691–11696.
- Höcker, B. Design of proteins from smaller fragments-learning from evolution. Curr. Opin. Struct. Biol. 2014, 27, 56–62.
- Romero Romero, M.L.; Rabin, A.; Tawfik, D.S. Functional proteins from short peptides: Dayhoff’s hypothesis turns 50. Angew. Chem. Int. Ed. Engl. 2016, 55, 15966–15971.
- Eck, R.V.; Dayhoff, M.O. Evolution of the structure of ferredoxin based on living relics of primitive amino acid sequences. Science 1966, 152, 363–366.
- Alva, V.; Söding, J.; Lupas, A.N. A vocabulary of ancient peptides at the origin of folded proteins. eLife 2015, 4, e09410.
- Verschueren, E.; Vanhee, P.; van der Sloot, A.M.; Serrano, L.; Rousseau, F.; Schymkowitz, J. Protein design with fragment databases. Curr. Opin. Struct. Biol. 2011, 21, 452–459.
- Kolodny, R.; Nepomnyachiy, S.; Tawfik, D.S.; Ben-Tal, N. Bridging themes: Short protein segments found in different architectures. Mol. Biol. Evol. 2021, 38, 2191–2208.
- Sologova, S.S.; Zavadskiy, S.P.; Mokhosoev, I.M.; Moldogazieva, N.T. Short linear motifs orchestrate functioning of human proteins during embryonic development, redox regulation, and cancer. Metabolites 2022, 12, 464.
- Terentiev, A.A.; Moldogazieva, N.T. Alpha-fetoprotein: A renaissance. Tumour Biol. 2013, 34, 2075–2091.
- Terentiev, A.A.; Moldogazieva, N.T. Structural and functional mapping of alpha-fetoprotein. Biochemistry 2006, 71, 120–132.
- Muehlemann, M.; Miller, K.D.; Dauphinee, M.; Mizejewski, G.J. Review of growth inhibitory peptide as a biotherapeutic agent for tumor growth, adhesion, and metastasis. Cancer Metastasis Rev. 2005, 24, 441–467.
- Mizejewski, G.J.; Eisele, L.; Maccoll, R. Anticancer versus antigrowth activities of three analogs of the growth-inhibitory peptide: Relevance to physicochemical properties. Anticancer Res. 2006, 26, 3071–3076.
- Jacobson, H.I.; Andersen, T.T.; Bennett, J.A. Development of an active site peptide analog of α-fetoprotein that prevents breast cancer. Cancer Prev. Res. 2014, 7, 565–573.
- Zhu, Z.; West, G.R.; Wang, D.C.; Collins, A.B.; Xiao, H.; Bai, Q.; Mesfin, F.B.; Wakefield, M.R.; Fang, Y. AFP peptide (AFPep) as a potential growth factor for prostate cancer. Med. Oncol. 2021, 39, 2.
- Butterstein, G.; Morrison, J.; Mizejewski, G.J. Effect of alpha-fetoprotein and derived peptides on insulin- and estrogen-induced fetotoxicity. Fetal Diagn. Ther. 2003, 18, 360–369.
- Moldogazieva, N.T.; Shaitan, K.V.; Antonov, M.Y.; Vinogradova, I.K.; Terentiev, A.A. Influence of intramolecular interactions on conformational and dynamic properties of analogs of heptapeptide AFP14-20. Biochemistry 2011, 76, 1321–1336.
- Moldogazieva, N.T.; Shaitan, K.V.; Antonov, M.Y.; Mokhosoev, I.M.; Levtsova, O.V.; Terentiev, A.A. Human EGF-derived direct and reverse short linear motifs: Conformational dynamics insight into the receptor-binding residues. J. Biomol. Struct. Dyn. 2018, 36, 1286–1305.
- Moldogazieva, N.T.; Terentiev, A.A.; Antonov, M.Y.; Kazimirsky, A.N.; Shaitan, K.V. Correlation between biological activity and conformational dynamics properties of tetra- and pentapeptides derived from fetoplacental proteins. Biochemistry 2012, 77, 469–484.
- Terentiev, A.A.; Moldogazieva, N.T.; Levtsova, O.V.; Maximenko, D.M.; Borozdenko, D.A.; Shaitan, K.V. Modeling of three-dimensional structure of human alpha-fetoprotein complexed with diethylstilbestrol: Docking and molecular dynamics simulation study. J. Bioinform. Comput. Biol. 2012, 10, 1241012.
- Moldogazieva, N.T.; Ostroverkhova, D.S.; Kuzmich, N.N.; Kadochnikov, V.V.; Terentiev, A.A.; Porozov, Y.B. Elucidating binding sites and affinities of ERα agonists and antagonists to human alpha-fetoprotein by in silico modeling and point mutagenesis. Int. J. Mol. Sci. 2020, 21, 893.
- Rosignoli, S.; Paiardini, A. Boosting the full potential of PyMOL with structural biology plugins. Biomolecules 2022, 12, 1764.
- Pearson, W.R. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 1990, 183, 63–98.
- Cantelli, G.; Bateman, A.; Brooksbank, C.; Petrov, A.I.; Malik-Sheriff, R.S.; Ide-Smith, M.; Hermjakob, H.; Flicek, P.; Apweiler, R.; Birney, E.; et al. The European Bioinformatics Institute (EMBL-EBI) in 2021. Nucleic Acids Res. 2022, 50, D11–D19.
- UniProt Consortium. UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Res. 2021, 49, D480–D489.
- Crooks, G.E.; Hon, G.; Chandonia, J.M.; Brenner, S.E. WebLogo: A sequence logo generator. Genome Res. 2004, 14, 1188–1190.
- Gene Ontology Consortium. The Gene Ontology resource: Enriching a GOld mine. Nucleic Acids Res. 2021, 49, D325–D334.
- Blum, M.; Chang, H.Y.; Chuguransky, S.; Grego, T.; Kandasaamy, S.; Mitchell, A.; Nuka, G.; Paysan-Lafosse, T.; Qureshi, M.; Raj, S.; et al. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 2021, 49, D344–D354.
- Mi, H.; Ebert, D.; Muruganujan, A.; Mills, C.; Albou, L.P.; Mushayamaha, T.; Thomas, P.D. PANTHER version 16: A revised family classification, tree-based classification tool, enhancer regions and extensive API. Nucleic Acids Res. 2021, 49, D394–D403.
- Ge, S.X.; Jung, D.; Yao, R. ShinyGO: A graphical gene-set enrichment tool for animals and plants. Bioinformatics 2020, 36, 2628–2629.
- Raudvere, U.; Kolberg, L.; Kuzmin, I.; Arak, T.; Adler, P.; Peterson, H.; Vilo, J. g:Profiler: A web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 2019, 47, W191–W198.
- Andreini, C.; Ciofi-Baffoni, S. Basic iron-sulfur centers. Met. Ions Life Sci. 2020, 20.
- Pletnev, P.I.; Shulenina, O.; Evfratov, S.; Treshin, V.; Subach, M.F.; Serebryakova, M.V.; Osterman, I.A.; Paleskava, A.; Bogdanov, A.A.; Dontsova, O.A.; et al. Ribosomal protein S18 acetyltransferase RimI is responsible for the acetylation of elongation factor Tu. J. Biol. Chem. 2022, 298, 101914.
- Wang, Y.; Yu, Y.; Pang, Y.; Yu, H.; Zhang, W.; Zhao, X.; Yu, J. The distinct roles of zinc finger CCHC-type (ZCCHC) superfamily proteins in the regulation of RNA metabolism. RNA Biol. 2021, 18, 2107–2126.
- Mailloux, R.J.; Young, A.; Chalker, J.; Gardiner, D.; O’Brien, M.; Slade, L.; Brosnan, J.T. Choline and dimethylglycine produce superoxide/hydrogen peroxide from the electron transport chain in liver mitochondria. FEBS Lett. 2016, 590, 4318–4328.
- Potempa, J.; Madej, M.; Scott, D.A. The RagA and RagB proteins of Porphyromonas gingivalis. Mol. Oral Microbiol. 2021, 36, 225–232.
- Kameyama, S.; Mizuguchi, T.; Fukuda, H.; Moey, L.H.; Keng, W.T.; Okamoto, N.; Tsuchida, N.; Uchiyama, Y.; Koshimizu, E.; Hamanaka, K.; et al. Biallelic null variants in ZNF142 cause global developmental delay with familial epilepsy and dysmorphic features. J. Hum. Genet. 2022, 67, 169–173.
- Fitzgerald, D.M.; Rosenberg, S.M. Biology before the SOS response-DNA damage mechanisms at chromosome fragile sites. Cells 2021, 10, 2275.
- Mittl, P.R.; Schneider-Brachert, W. Sel1-like repeat proteins in signal transduction. Cell Signal. 2007, 19, 20–31.
- Ishii, E.; Eguchi, Y. Diversity in sensing and signaling of bacterial sensor histidine kinases. Biomolecules 2021, 11, 1524.
- Sushko, T.; Kavaleuski, A.; Grabovec, I.; Kavaleuskaya, A.; Vakhrameev, D.; Bukhdruker, S.; Marin, E.; Kuzikov, A.; Masamrekh, R.; Shumyantseva, V.; et al. A new twist of rubredoxin function in M. tuberculosis. Bioorg. Chem. 2021, 109, 104721.
- Li, Z.; Wang, R.; Qiu, C.; Cao, C.; Zhang, J.; Ge, J.; Shi, Y. Role of DTL in hepatocellular carcinoma and its impact on the tumor microenvironment. Front. Immunol. 2022, 13, 834606.
- Sjöholm, J.; Oliveira, P.; Lindblad, P. Transcription and regulation of the bidirectional hydrogenase in the cyanobacterium Nostoc sp. strain PCC 7120. Appl. Environ. Microbiol. 2007, 73, 5435–5446.
- Xiu, M.X.; Liu, Y.M.; Kuang, B.H. The oncogenic role of Jagged1/Notch signaling in cancer. Biomed. Pharmacother. 2020, 129, 110416.
- Tachibana, A.; Hamajima, R.; Tomizaki, M.; Kondo, T.; Nanba, Y.; Kobayashi, M.; Yamada, H.; Ikeda, M. HCF-1 encoded by baculovirus AcMNPV is required for productive nucleopolyhedrovirus infection of non-permissive Tn368 cells. Sci. Rep. 2017, 7, 3807.
- Corbella, M.; Liao, Q.; Moreira, C.; Parracino, A.; Kasson, P.M.; Kamerlin, S.C.L. The N-terminal helix-turn-helix motif of transcription factors MarA and Rob drives DNA recognition. J. Phys. Chem. 2021, 125, 6791–6806.
- Waghwani, H.K.; Douglas, T. Cytochrome c with peroxidase-like activity encapsulated inside the small DPS protein nanocage. J. Mater. Chem. B 2021, 9, 3168–3179.
- Hong, G.; Kuek, V.; Shi, J.; Zhou, L.; Han, X.; He, W.; Tickner, J.; Qiu, H.; Wei, Q.; Xu, J. EGFL7: Master regulator of cancer pathogenesis, angiogenesis and an emerging mediator of bone homeostasis. J. Cell Physiol. 2018, 233, 8526–8537.
- Cuthbertson, L.; Nodwell, J.R. The TetR family of regulators. Microbiol. Mol. Biol. Rev. 2013, 77, 440–475.
- Imhoff, J.F.; Rahn, T.; Künzel, S.; Keller, A.; Neulinger, S.C. Osmotic adaptation and compatible solute biosynthesis of phototrophic bacteria as revealed from genome analyses. Microorganisms 2020, 9, 46.
- Lin, Y.; Zeng, Y.; Zhu, Y.; Shen, J.; Ye, H.; Jiang, L. Plant Rho GTPase signaling promotes autophagy. Mol. Plant 2021, 14, 905–920.
- Khamtorn, P.; Peigneur, S.; Amorim, F.G.; Quinton, L.; Tytgat, J.; Daduang, S. De novo transcriptome analysis of the venom of Latrodectus geometricus with the discovery of an insect-selective Na channel modulator. Molecules 2021, 27, 47.
- Xu, D.; Lu, W. Defensins: A double-edged sword in host immunity. Front. Immunol. 2020, 11, 764.
- Loppin, B.; Dubruille, R.; Horard, B. The intimate genetics of Drosophila fertilization. Open Biol. 2015, 5, 150076.
- Imperial, J.S.; Kantor, Y.; Watkins, M.; Heralde, F.M., 3rd; Stevenson, B.; Chen, P.; Hansson, K.; Stenflo, J.; Ownby, J.P.; Bouchet, P.; et al. Venomous auger snail Hastula (Impages) hectica (Linnaeus, 1758): Molecular phylogeny, foregut anatomy and comparative toxicology. J. Exp. Zool. B Mol. Dev. Evol. 2007, 308, 744–756.
- Gomez-Arrebola, C.; Solano, C.; Lasa, I. Regulation of gene expression by non-phosphorylated response regulators. Int. Microbiol. 2021, 24, 521–529.
- Zhang, Y.; Ren, Y.J.; Guo, L.C.; Ji, C.; Hu, J.; Zhang, H.H.; Xu, Q.H.; Zhu, W.D.; Ming, Z.J.; Yuan, Y.S.; et al. Nucleus accumbens-associated protein-1 promotes glycolysis and survival of hypoxic tumor cells via the HDAC4-HIF-1α axis. Oncogene 2017, 36, 4171–4181.
- Conlon, J.M.; Kolodziejek, J.; Mechkarska, M.; Coquet, L.; Leprince, J.; Jouenne, T.; Vaudry, H.; Nielsen, P.F.; Nowotny, N.; King, J.D. Host defense peptides from Lithobates forreri, Hylarana luctuosa, and Hylarana signata (Ranidae): Phylogenetic relationships inferred from primary structures of ranatuerin-2 and brevinin-2 peptides. Comp. Biochem. Physiol. Part D Genom. Proteom. 2017, 9, 49–57.
- Xu, C.Q.; He, L.L.; Brône, B.; Martin-Eauclaire, M.F.; Van Kerkhove, E.; Zhou, Z.; Chi, C.W. A novel scorpion toxin blocking small conductance Ca2+ activated K+ channel. Toxicon 2004, 43, 961–971.
- Leal, A.S.; Reich, L.A.; Moerland, J.A.; Zhang, D.; Liby, K.T. Potential therapeutic uses of rexinoids. Adv. Pharmacol. 2021, 91, 141–183.
- Cheng, J.T.; Wang, Y.Y.; Zhu, L.Z.; Zhang, Y.; Cai, W.Q.; Han, Z.W.; Zhou, Y.; Wang, X.W.; Peng, X.C.; Xiang, Y.; et al. Novel transcription regulatory sequences and factors of the immune evasion protein ICP47 (US12) of herpes simplex viruses. Virol. J. 2020, 17, 101.
- Ohnishi, T.; Ohnishi, S.T.; Salerno, J.C. Five decades of research on mitochondrial NADH-quinone oxidoreductase (complex I). Biol. Chem. 2018, 399, 1249–1264.
- Zhang, S.; Wang, X. One new kind of phytohormonal signaling integrator: Up-and-coming GASA family genes. Plant Signal. Behav. 2017, 12, e1226453.
- Dos Vultos, T.; Mestre, O.; Tonjum, T.; Gicquel, B. DNA repair in Mycobacterium tuberculosis revisited. FEMS Microbiol. Rev. 2009, 33, 471–487.
- Almawi, A.W.; Matthews, L.A.; Guarné, A. FHA domains: Phosphopeptide binding and beyond. Prog. Biophys. Mol. Biol. 2017, 127, 105–110.
- Jalal, A.S.B.; Le, T.B.K. Bacterial chromosome segregation by the ParABS system. Open Biol. 2020, 10, 200097.
- Kang, S.M.; Kim, D.H.; Jin, C.; Ahn, H.C.; Lee, B.J. The crystal structure of AcrR from Mycobacterium tuberculosis reveals a one-component transcriptional regulation mechanism. FEBS Open Bio 2019, 9, 1713–1725.
- Gout, I. Coenzyme A, protein CoAlation and redox regulation in mammalian cells. Biochem. Soc. Trans. 2018, 46, 721–728.
- Geerlings, N.M.J.; Karman, C.; Trashin, S.; As, K.S.; Kienhuis, M.V.M.; Hidalgo-Martinez, S.; Vasquez-Cardenas, D.; Boschker, H.T.S.; De Wael, K.; Middelburg, J.J.; et al. Division of labor and growth during electrical cooperation in multicellular cable bacteria. Proc. Natl. Acad. Sci. USA 2020, 117, 5478–5485.
- Zhang, B.; Pan, C.; Feng, C.; Yan, C.; Yu, Y.; Chen, Z.; Guo, C.; Wang, X. Role of mitochondrial reactive oxygen species in homeostasis regulation. Redox Rep. 2022, 27, 45–52.
- Karlik, E. Potential stress tolerance roles of barley germins and GLPs. Dev. Genes Evol. 2021, 231, 109–118.
- Hu, Y.; Meng, J.; Shi, C.; Hervin, K.; Fratamico, P.M.; Shi, X. Characterization and comparative analysis of a second thermonuclease from Staphylococcus aureus. Microbiol. Res. 2013, 168, 174–182.
- Pulido, R.; Lang, R. Dual specificity phosphatases: From molecular mechanisms to biological function. Int. J. Mol. Sci. 2019, 20, 4372.
- Villalaín, J. Envelope E protein of dengue virus and phospholipid binding to the late endosomal membrane. Biochim. Biophys. Acta Biomembr. 2022, 1864, 183889.
- Sellés Vidal, L.; Kelly, C.L.; Mordaka, P.M.; Heap, J.T. Review of NAD(P)H-dependent oxidoreductases: Properties, engineering and application. Biochim. Biophys. Acta Proteins Proteom. 2018, 1866, 327–347.
- Katz, A.; Elgamal, S.; Rajkovic, A.; Ibba, M. Non-canonical roles of tRNAs and tRNA mimics in bacterial cell biology. Mol. Microbiol. 2016, 101, 545–558.
- Sengupta, S.; Nechushtai, R.; Jennings, P.A.; Onuchich, J.N.; Padulla, P.A.; Azad, R.K.; Mittler, R. Phylogenetic analysis of the CDGSH iron-sulfur binding domain reveals its ancient origin. Sci. Rep. 2018, 8, 4840.
- Ma, H.; Zhao, H.; Liu, Z.; Zhao, J. The phytocyanin gene family in rice (Oryza sativa L.): Genome-wide identification, classification and transcriptional analysis. PLoS ONE 2011, 6, e25184.
- Zhang, K.; Wang, F.; Liu, B.; Xu, C.; He, Q.; Cheng, W.; Zhao, X.; Ding, Z.; Zhang, W.; Zhang, K.; et al. ZmSKS13, a cupredoxin domain-containing protein, is required for maize kernel development via modulation of redox homeostasis. New Phytol. 2021, 229, 2163–2178.
- Svetlov, M.S. Ribosome-associated quality control in bacteria. Biochemistry 2021, 86, 942–951.
- Prohaska, J.R. Role of copper transporters in copper homeostasis. Am. J. Clin. Nutr. 2008, 88, 826S–829S.
- Zhu, G.; Cai, H.; Yem, L.; Mo, Y.; Zhu, M.; Zeng, Y.; Song, X.; Yang, C.; Gao, X.; Wang, J.; et al. Small proline-rich protein 3 regulates IL-33/ILC2 axis to promote allergic airway inflammation. Front. Immunol. 2020, 12, 758829.
- Bordes, P.; Sala, A.J.; Ayala, S.; Texier, P.; Slama, N.; Cirinesi, A.M.; Guillet, V.; Mourey, L.; Genevaux, P. Chaperone addiction of toxin-antitoxin systems. Nat. Commun. 2016, 7, 13339.
- Zhao, Y.; Wang, Y.; Hu, J.; Zhang, X.; Zhang, Y.W. CutA divalent cation tolerance homolog (Escherichia coli) (CUTA) regulates β-cleavage of β-amyloid precursor protein (APP) through interacting with β-site APP cleaving protein 1 (BACE1). J. Biol. Chem. 2012, 287, 11141–11150.
- Hyjek, M.; Figiel, M.; Nowotny, M. RNases H: Structure and mechanism. DNA Repair 2019, 84, 102672.
- Mackeh, R.; Marr, A.K.; Fadda, A.; Kino, T. C2H2-type zinc finger proteins: Evolutionarily old and new partners of the nuclear hormone receptors. Nucl. Recept. Signal. 2018, 15, 1550762918801071.
- Wang, Y.; Zhou, Y.; Graves, D.T. FOXO transcription factors: Their clinical significance and regulation. Biomed. Res. Int. 2014, 2014, 925350. [Google Scholar] [CrossRef]
- Vaish, S.; Gupta, D.; Mehrotra, R.; Mehrotra, S.; Basantani, M.K. Glutathione S-transferase: A versatile protein family. 3 Biotech. 2020, 10, 321. [Google Scholar] [CrossRef] [PubMed]
- Davey, N.E.; Cyert, M.S.; Moses, A.M. Short linear motifs-ex nihilo evolution of protein regulation. Cell Commun. Signal. 2015, 13, 43. [Google Scholar] [CrossRef] [PubMed]
- Terentiev, A.A.; Moldogazieva, N.T. Cell adhesion proteins and alpha-fetoprotein. Similar structural motifs as prerequisites for common functions. Biochemistry 2007, 72, 920–935. [Google Scholar] [CrossRef] [PubMed]
- Kienast, A.; Preuss, M.; Winkler, M.; Dick, T.P. Redox regulation of peptide receptivity of major histocompatibility complex class I molecules by ERp57 and tapasin. Nat. Immunol. 2007, 8, 864–872. [Google Scholar] [CrossRef]
- Gursoy-Yuzugullu, O.; Ayrapetov, M.K.; Price, B.D. Histone chaperone Anp32e removes H2A.Z from DNA double-strand breaks and promotes nucleosome reorganization and DNA repair. Proc. Natl. Acad. Sci. USA 2015, 112, 7507–7512. [Google Scholar] [CrossRef]
- Moldogazieva, N.T.; Mokhosoev, I.M.; Mel’nikova, T.I.; Zavadskiy, S.P.; Kuz’menko, A.N.; Terentiev, A.A. Dual character of reactive oxygen, nitrogen, and halogen species: Endogenous sources, interconversions and neutralization. Biochemistry 2020, 85 (Suppl. S1), S56–S78. [Google Scholar] [CrossRef]
- Chen, J.; Wang, Y.; Zhang, W.; Zhao, D.; Zhang, L.; Fan, J.; Li, J.; Zhan, Q. Membranous NOX5-derived ROS oxidizes and activates local Src to promote malignancy of tumor cells. Signal Transduct. Target. 2020, 5, 139. [Google Scholar] [CrossRef]
- García-Cerdán, J.G.; Furst, A.L.; McDonald, K.L.; Schünemann, D.; Francis, M.B.; Niyogi, K.K. A thylakoid membrane-bound and redox-active rubredoxin (RBD1) functions in de novo assembly and repair of photosystem II. Proc. Natl. Acad. Sci. USA 2019, 116, 16631–16640. [Google Scholar] [CrossRef]
- Graham, D.B.; Lefkovith, A.; Deelen, P.; de Klein, N.; Varma, M.; Boroughs, A.; Desch, A.N.; Ng, A.C.Y.; Guzman, G.; Schenone, M.; et al. TMEM258 is a component of the oligosaccharyltransferase complex controlling ER stress and intestinal inflammation. Cell Rep. 2016, 17, 2955–2965. [Google Scholar] [CrossRef]
- Fittipaldi, S.; Mercatelli, N.; Dimauro, I.; Jackson, M.J.; Paronetto, M.P.; Caporossi, D. Alpha B-crystallin induction in skeletal muscle cells under redox imbalance is mediated by a JNK-dependent regulatory mechanism. Free Radic. Biol. Med. 2015, 86, 331–342. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).