Unveiling Three Functionally Diverse Isoforms of eIF4E in Cowpea Through a Multi-Omics Approach

de Luna-Aragão, Madson Allan; Alves de Andrade, Fernanda; Penna, Saulo Rafael Mendes; Maciel, Laiane Silva; Rodrigues-Paixão, Laura Maria; Lemos, Ayug Bezerra; Ferreira, José Diogo Cavalcanti; Aragão, Francisco José Lima; Pandolfi, Valesca; Benko-Iseppon, Ana Maria

doi:10.3390/agronomy16070766

by Madson Allan de Luna-Aragão ¹,†, Fernanda Alves de Andrade ²,†, Saulo Rafael Mendes Penna ², Laiane Silva Maciel ², Laura Maria Rodrigues-Paixão ², Ayug Bezerra Lemos ², José Diogo Cavalcanti Ferreira ³, Francisco José Lima Aragão ⁴, Valesca Pandolfi ²,* and Ana Maria Benko-Iseppon ²,*

¹ Institute of Biological Sciences (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil

² Department of Genetics (dGEN), Center of Biosciences (CB), Federal University of Pernambuco (UFPE), Recife 50670-901, Brazil

³ Federal Institute of Education, Science and Technology of Pernambuco (IFPE), Abreu e Lima 53515-120, Brazil

⁴ Embrapa Recursos Genéticos e Biotecnologia, PqEB W5 Norte, Brasília 70770-900, Brazil

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Agronomy 2026, 16(7), 766; https://doi.org/10.3390/agronomy16070766

Submission received: 1 March 2026 / Revised: 31 March 2026 / Accepted: 1 April 2026 / Published: 6 April 2026

(This article belongs to the Special Issue Recent Advances in Legume Crop Protection—2nd Edition)

Abstract

The eukaryotic translation initiation factor 4E (eIF4E) family plays a dual role in plants, regulating cap-dependent protein synthesis and mediating susceptibility to viruses in the family Potyviridae. In cowpea (Vigna unguiculata (L.) Walp.), an economically important legume cultivated worldwide, the structural determinants of these isoforms remain largely unexplored. This study characterizes the genomic organization, evolutionary history, and conformational dynamics of eIF4E, eIF(iso)4E, and nCBP in cowpea using a multi-omics approach. Genome mining identified three paralogous genes located on chromosomes 4, 6, and 7, showing high synteny with Phaseolus vulgaris. Phylogenetic analysis confirmed nCBP as the ancestral Class I lineage, distinct from the Class II eIF4E and eIF(iso)4E clades. Theoretical models for the isoforms were generated and subsequently validated by molecular dynamics simulations, revealing that while all isoforms preserve the canonical tertiary architecture and an electropositive cap-binding pocket, eIF(iso)4E exhibits superior structural compactness and hydrogen-bond stability. These biophysical features highlight their role as a stable anchor for viral VPg proteins. By elucidating the atomic-level landscape of these factors, we provide a robust structural framework to guide allele mining and genome-editing strategies aiming to engineer virus-resistant cowpea cultivars without compromising agronomic performance.

Keywords:

eukaryotic translation initiation factor 4E; nuclear cap-binding protein; plant translation; Vigna unguiculata; bioinformatics; molecular dynamics simulation

1. Introduction

The eukaryotic translation initiation factor 4E (eIF4E) is a key component of the translation initiation complex, responsible for recognizing and binding the 5′ cap structure (m⁷GpppN) of messenger RNAs (mRNAs) [1,2]. This interaction promotes the recruitment of the other factors of the eIF4F complex and the ribosome, triggering the initiation of protein synthesis [3]. Therefore, eIF4E serves as a central regulatory point in controlling translation efficiency [4].

In plants, eIF4E is not a single entity but part of a diversified family of cap-binding proteins, including the canonical eIF4E, its plant-specific paralog eIF(iso)4E, and the nuclear cap-binding proteins nCBP [2,5,6]. According to Joshi et al. [7], in angiosperms, the isoforms eIF4E and eIF(iso)4E are classified within Class I and act as canonical cap-binding proteins, whereas the non-canonical isoform nCBP is classified within Class II.

This diversification, absent in most animals, allows plants to fine-tune protein synthesis under specific developmental or stress conditions. For example, Arabidopsis thaliana carries multiple eIF4E-related isoforms with distinct affinities for the 5′ cap, differential expression across tissues, and varied responses to viral infection [2]. Similarly, crops such as lettuce (Lactuca sativa) [8], barley (Hordeum vulgare) [9], and pepper (Capsicum annuum) [10] possess natural variants of eIF4E or eIF(iso)4E that confer recessive resistance to specific viruses by disrupting viral RNA translation. In parallel, nCBPs, though primarily involved in RNA processing and nuclear export, also show stress-responsive expression in species such as rice (Oryza sativa) [11] and soybean (Glycine max) [12], highlighting their role in broader RNA metabolism during environmental adaptation.

In cowpea (Vigna unguiculata (L.) Walp.), despite the availability of genomic resources, the molecular diversity, evolutionary history, and structural determinants of its eIF4E-related isoforms remain unexplored. Analyzing these isoforms may provide valuable insights into variations in cap-binding affinity, interaction networks, and structural features, such as loop flexibility and surface charge, that influence selective mRNA translation and viral susceptibility. This understanding is particularly relevant given that, in agricultural environments [13], the species is exposed to combined abiotic and biotic pressures [14]. Among these, infections by cowpea aphid-borne mosaic virus (CABMV) significantly impact its global production.

Previously, we characterized the phenotypic responses of cowpea cultivars to CABMV infection through a combination of in vivo inoculation bioassays and targeted computational modeling [15]. By screening 27 cowpea cultivars, including the specific genotypes selected for the present study, we confirmed a direct association between naturally occurring non-synonymous mutations in the eIF4E gene and viral resistance or susceptibility. From this validated group, we selected six specific genotypes for the present study based on their contrasting phenotypic profiles: Bajão and IT85F-2687 (resistant to CABMV), and Boca Negra, BR14 Mulato, Pingo de Ouro, and Santo Inácio (susceptible). Building upon this validated phenotypic framework, the present study expands the scope to the entire eIF4E gene family. Here, we employ a multi-omics and structural bioinformatics approach to decipher the evolutionary history, genomic organization, and atomic-level conformational dynamics of the eIF4E, eIF(iso)4E, and nCBP isoforms that physically dictate these plant-virus interaction outcomes. By integrating genomic and structural biology data with comparative analyses across plant species, we provide new insights into the molecular basis of translation regulation in cowpea and identify potential targets for breeding and biotechnological strategies aimed at improving stress resilience and resistance to viral infection.

2. Materials and Methods

2.1. eIF4E Isoforms Sequence Mining and Characterization in Genomes

Annotated gene sequences encoding cap-binding protein isoforms, including eIF4E, eIF(iso)4E, and nCBP, were retrieved from publicly available genome assemblies of Vigna unguiculata, Phaseolus vulgaris, and Arabidopsis thaliana. Genome data were obtained from the NCBI Genome and Phytozome (https://phytozome-next.jgi.doe.gov/) databases (both accessed on 18 March 2025), using the most recent and complete reference assemblies for each species: V. unguiculata (ID: 540), P. vulgaris (ID: 442), and A. thaliana (ID: 167). A. thaliana was included as a model species with extensive functional annotation and well-characterized eIF4E-family proteins, while P. vulgaris was selected as a close phylogenetic relative of V. unguiculata (Fabaceae), enabling comparative analyses among related legumes to identify lineage-specific features. Candidate isoforms in V. unguiculata were first located through keyword searches of genome annotations using the terms “eukaryotic translation initiation factor 4E”, “eIF4E”, “eIF(iso)4E”, “nuclear cap-binding protein” and “nCBP”. The retrieved entries were subsequently validated via BLASTp [16] similarity searches using experimentally characterized eIF4E, eIF(iso)4E, and nCBP sequences from A. thaliana as queries.

2.2. Primer Design

After obtaining the sequences of the eIF4E isoforms (Section 2.1), three isoform-specific primer pairs were designed (Table 1) to amplify the full-length coding sequence (CDS) using SnapGene software (v 8.0) (GSL Biotech; available at snapgene.com). Primer specificity was verified in silico using Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/) [17] (accessed on 20 March 2025) to ensure exclusive targeting of the V. unguiculata isoforms.

2.3. RNA Extraction and cDNA Synthesis

The selection of these specific cultivars was based on our previous framework [15], which established their phenotypic responses to CABMV through in vivo inoculation bioassays. Within this validated group, Bajão and IT85F-2687 consistently demonstrated a resistant phenotype, whereas Boca Negra, BR14 Mulato, Pingo de Ouro, and Santo Inácio were characterized as susceptible. This established biological contrast served as the foundation for our comparative structural and dynamic modeling, allowing us to investigate how naturally occurring eIF4E mutations previously identified [15] correlate with the biophysical properties of the three isoforms analyzed in the present study.

Total RNA was extracted from 28-day-old plants of six cowpea cultivars (Bajão, Boca Negra, BR14 Mulato, IT85F-2687, Pingo de Ouro, and Santo Inácio) using the SV Total RNA Isolation System kit (Promega, Madison, WI, USA) according to the manufacturer’s specifications. These cultivars were selected based on our previous study [15], in which mutations in the eIF4E gene were identified as potentially involved in modulating the interaction between the viral genome-linked protein (VPg) of CABMV and the cowpea eIF4E protein.

For each cultivar, three biological replicates were analyzed, corresponding to independent samples obtained from different plants grown under the same experimental conditions. RNA integrity and concentration were assessed by agarose gel electrophoresis and fluorometry (Qubit 2.0, Invitrogen, Carlsbad, CA, USA), respectively. For each sample, 1 µg of RNA was reverse-transcribed into cDNA using the ImProm-II Reverse Transcription System (Promega).

2.4. Amplification, Cloning and Sequencing

The CDSs of eIF4E, eIF(iso)4E, and nCBP were amplified from cDNA obtained from six cowpea cultivars using gene-specific primer pairs (Table 1). PCR amplifications were performed using DNA polymerase (LGC Biosearch Technologies) following the manufacturer’s instructions. The PCR program consisted of an initial denaturation at 95 °C for 5 min, followed by 35 cycles of 95 °C for 1 min, 58 °C for 1 min, and 72 °C for 40 s, with a final extension at 72 °C for 7 min. Reactions were carried out in a TC-412 thermocycler (Bibby Scientific, Staffordshire, UK).
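As a quick check of the cycling program described above, the following minimal Python sketch sums the programmed hold times. It uses only the temperatures and durations stated in the text; the constant and function names are illustrative, and ramping time between temperatures is ignored, so the real run time on the thermocycler would be somewhat longer.

```python
# Thermocycling program as described in the text (times in seconds).
INITIAL_DENATURATION_S = 5 * 60   # 95 °C for 5 min
CYCLES = 35
CYCLE_STEPS_S = {
    "denaturation_95C": 60,       # 95 °C for 1 min
    "annealing_58C": 60,          # 58 °C for 1 min
    "extension_72C": 40,          # 72 °C for 40 s
}
FINAL_EXTENSION_S = 7 * 60        # 72 °C for 7 min


def total_hold_time_s() -> int:
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    total = total_hold_time_s()
    print(f"{total} s = {total / 60:.1f} min")  # 6320 s = 105.3 min
```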

PCR products were cloned into the pGEM®-T Easy vector (Promega Corporation, Madison, WI, USA) and transformed into Escherichia coli DH5α competent cells by heat shock, as previously described [18]. Transformant colonies were selected on Luria–Bertani agar plates supplemented with ampicillin (100 mg L⁻¹) and screened by blue–white selection using X-gal and IPTG, according to the manufacturer’s instructions for the pGEM®-T Easy Vector System (Promega Corporation, Madison, WI, USA). Plasmid DNA from positive clones was purified using the Wizard® Plus SV Minipreps DNA Purification System (Promega Corporation, Madison, WI, USA). Insert presence was confirmed by Sanger sequencing with the BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) on an ABI 3730XL DNA Analyzer (Applied Biosystems, Foster City, CA, USA).

2.5. Chromosomal Location Assessment

Transcript sequences were aligned to the V. unguiculata (ID: 540) reference genome deposited in Phytozome [19]. The corresponding gene loci, including their genomic coordinates and structural annotations, were directly retrieved from this platform (https://phytozome-next.jgi.doe.gov/, accessed on 30 May 2025). The identified loci were mapped and visualized on assembled chromosomes using the TBtools-II package [20].

2.6. Structural Genomics, Expansion Mechanisms, and Synteny Analysis

The gene structure pattern of V. unguiculata eIF4E isoforms was retrieved from Phytozome and visualized using the Gene Structural Display Server 2.0 (GSDS) tool [21]. To investigate the evolution of the eIF4E genes in V. unguiculata, we examined the mechanisms that allowed these genes to disperse across the species’ chromosomes. To this end, the MCScanX v1.0.0 tool [22], incorporated into the TBtools-II package [20], was used to perform a BLASTp [16] search of the V. unguiculata proteome against itself, aiming to identify similarity patterns and genomic organization that would allow the eIF4E genes to be classified according to their gene duplication events. Based on these patterns, the genes were categorized as singletons or as tandem, segmental, dispersed, or proximal duplications.
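The duplication categories listed above can be illustrated with a simplified Python sketch. This is not MCScanX's implementation: the decision order, the `proximal_window` parameter, and the toy gene names are assumptions made only to show how chromosome and gene order could separate the five categories once self-BLASTp homologs and collinear-block membership are known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    rank: int  # ordinal position of the gene along its chromosome


def classify(gene, homologs, collinear_genes, proximal_window=20):
    """Assign one duplication category to `gene`.

    homologs        -- genes with a significant self-BLASTp hit to `gene`
    collinear_genes -- names of genes anchored in collinear (syntenic) blocks
    proximal_window -- hypothetical cutoff (in genes) for 'proximal' duplicates
    """
    if gene.name in collinear_genes:
        return "segmental"
    if not homologs:
        return "singleton"
    # Distances (in gene ranks) to homologs located on the same chromosome.
    same_chrom = [abs(gene.rank - h.rank) for h in homologs if h.chrom == gene.chrom]
    if any(d == 1 for d in same_chrom):
        return "tandem"
    if any(1 < d < proximal_window for d in same_chrom):
        return "proximal"
    return "dispersed"


if __name__ == "__main__":
    # Toy genes on three different chromosomes (names are placeholders).
    a = Gene("geneA", "4", 120)
    b = Gene("geneB", "6", 310)
    c = Gene("geneC", "7", 45)
    print(classify(a, [b, c], collinear_genes=set()))  # dispersed
```

In this toy case the homologs lie on different chromosomes and outside collinear blocks, so the gene falls into the dispersed category.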

The same methodology was applied to identify syntenic blocks using the species’ proteome against the proteomes of three related species: Phaseolus vulgaris, Glycine max, and Lens culinaris, which served as a basis to contextualize the expansion and diversification of the eIF4E gene family in V. unguiculata. The inclusion of these species aimed to reinforce the robustness of the results and allow for a more comprehensive and comparative characterization of the syntenic blocks associated with eIF4E genes.

2.7. Phylogenetic and Evolutionary Analysis

Phylogenetic analyses were performed for the eIF4E, eIF(iso)4E, and nCBP isoforms identified in V. unguiculata cultivars, as well as in other Fabaceae species (Supplementary Table S1). Annotated genes, corresponding to the amino acid sequences, were obtained from the NCBI and Phytozome databases (accessed on 12 October 2025; Supplementary Table S1). Only complete and high-quality entries were retained, and entries labeled as “partial,” “fragment,” “low quality,” “uncharacterized,” “predicted,” “hypothetical,” “unknown,” or “unnamed” were removed. Redundant sequences (identity > 99%) were removed using CD-HIT (v4.8.1).
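The label-based curation step described above could be expressed as a simple header filter. The sketch below is illustrative only: the function names and the toy records are assumptions, the match is a case-insensitive substring test, and the redundancy removal performed with CD-HIT is not reproduced here.

```python
# Labels named in the text as grounds for removing an entry.
EXCLUDED_LABELS = (
    "partial", "fragment", "low quality", "uncharacterized",
    "predicted", "hypothetical", "unknown", "unnamed",
)


def keep_entry(header: str) -> bool:
    """Return True if the FASTA header carries none of the excluded labels."""
    lowered = header.lower()
    return not any(label in lowered for label in EXCLUDED_LABELS)


def filter_records(records):
    """records: iterable of (header, sequence) tuples; keep curated entries only."""
    return [(h, s) for h, s in records if keep_entry(h)]


if __name__ == "__main__":
    toy = [
        ("seq1 eukaryotic translation initiation factor 4E", "MAVE"),
        ("seq2 hypothetical protein", "MKLT"),
        ("seq3 eIF(iso)4E, partial", "MATE"),
    ]
    for header, _ in filter_records(toy):
        print(header)  # only seq1 is retained
```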

Orthologs were identified using reciprocal BLAST searches (v.2.16.0) [23] (accessed on 23 October 2025). For genes with multiple splice variants, the canonical isoform was prioritized; in its absence, the transcript with the highest level of experimental evidence was selected. Amino acid sequences were aligned using the MAFFT L-INS-i algorithm, and ambiguously aligned regions were removed with Gblocks (v0.91b) [24].
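The reciprocal-search logic can be sketched as follows. This is a minimal illustration, assuming that each search has already been reduced to (query, subject, bit score) rows; the function names and sequence identifiers are hypothetical, and the thresholds actually applied in the study are not stated in the text.

```python
def best_hits(rows):
    """Keep the top-scoring subject for each query.

    rows -- iterable of (query, subject, bitscore) tuples
    """
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


if __name__ == "__main__":
    # Toy hit tables with placeholder identifiers.
    species1_vs_species2 = [("s1_g1", "s2_g1", 410.0), ("s1_g1", "s2_g2", 180.0),
                            ("s1_g2", "s2_g2", 395.0)]
    species2_vs_species1 = [("s2_g1", "s1_g1", 408.0), ("s2_g2", "s1_g1", 175.0)]
    print(reciprocal_best_hits(species1_vs_species2, species2_vs_species1))
    # [('s1_g1', 's2_g1')]
```

Here `s1_g2` is not reported because the best hit of `s2_g2` in the reverse search points elsewhere, which is the asymmetry a reciprocal search is meant to catch.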

Maximum Likelihood (ML) phylogenies were inferred using RAxML (v8.2.12) [25], applying the most suitable substitution model determined by ModelTest-NG (v0.1.7) [26] under the Bayesian Information Criterion (BIC). Branch support was evaluated with 1000 ultrafast bootstrap (BS) replicates. Individual trees were generated for eIF4E, eIF(iso)4E, and nCBP, while a combined tree including all isoform classes was reconstructed to visualize broader evolutionary patterns. In individual reconstructions, sequences from Arabidopsis thaliana (L.) Heynh., Raphanus sativus (L.), and Hirschfeldia incana (L.) Lagr.-Foss (Brassicaceae) served as outgroup. Final tree files were visualized in FigTree (v1.4.4) (https://tree.bio.ed.ac.uk/software/figtree/) (accessed on 12 November 2025) and edited in iTOL (v6) [27].

In addition, a distance-based phylogenetic tree was inferred using the Neighbor-Joining (NJ) method implemented in MEGA software (v.11) [28]. The reliability of the resulting distances and branch topology was assessed with 1000 bootstrap (BS) replicates, and the final branch lengths were expressed as the number of amino acid substitutions per site (Figure S8).

2.8. Protein Sequences, Alignments and Conserved Domain of eIF4E Isoforms

The primary amino acid sequences of the eIF4E isoforms were obtained by translating the corresponding nucleotide sequences using the ORFfinder tool [29] (https://www.ncbi.nlm.nih.gov/orffinder/, accessed on 12 August 2025). Signal peptides and conserved functional domains were identified using the CD-Search tool [30] (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, accessed on 12 August 2025). Multiple sequence alignments were generated using the ClustalW [31] algorithm implemented in MEGA software (v.11) [28] and visualized in Jalview (v. 2.11) [32].
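The translation step can be illustrated with a short Python sketch that applies the standard genetic code to a CDS. It is not a substitute for ORFfinder, which also searches for open reading frames; the function name and the toy sequence are assumptions chosen for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(cds: str) -> str:
    """Translate a CDS in frame 1, stopping at the first stop codon.

    Codons containing characters other than A/C/G/T are rendered as 'X';
    a trailing incomplete codon is ignored.
    """
    seq = cds.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Toy sequence (not a cowpea CDS): ATG TTC TGG GAA GAT TAA
    print(translate_cds("ATGTTCTGGGAAGATTAA"))  # MFWED
```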

2.9. Molecular Modeling, Model Validation, and Molecular Dynamics Simulations

Three-dimensional structures of the eIF4E isoforms were predicted using the AlphaFold 3 (AF3) neural network. Model reliability was assessed based on the Predicted Local Distance Difference Test (pLDDT) and Predicted Aligned Error (PAE) confidence metrics [33]. Stereochemical quality and thermodynamic stability were validated using ProSA [34], PROCHECK [35], and QMEANDisCo [36]. Molecular dynamics (MD) simulations were conducted using the GROMACS suite with the GROMOS 53A6 force field [37]. The protein systems were solvated with the Simple Point Charge (SPC) water model within a cubic box (10 nm edge length). To replicate physiological ionic strength, Na⁺ and Cl⁻ ions were added to a concentration of 0.15 M [37]. The simulation protocol began with energy minimization via the steepest descent algorithm (50,000 steps) to resolve steric clashes, followed by an equilibration phase in the NVT ensemble at 300 K with position restraints applied to the solute [37]. The LINCS algorithm was employed to constrain bond lengths, enabling a 2 fs integration time step with the leapfrog integrator [38]. Unrestrained production runs were executed for 100 ns under the NPT ensemble (300 K, 1 atm). Trajectory analyses evaluated structural stability and fluctuation profiles through Root Mean Square Deviation (RMSD), Root Mean Square Fluctuation (RMSF), B-factor, Radius of Gyration (RG), and Hydrogen Bond (HB) dynamics [37]. Finally, the Electrostatic Surface Potential (ESP) was calculated using the APBS-PDB2PQR suite [39].
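Two of the simulation parameters above lend themselves to a back-of-the-envelope check. The Python sketch below estimates how many ion pairs correspond to 0.15 M in a 10 nm cubic box and how many integration steps a 100 ns run requires at 2 fs. The function names are illustrative, and the ion estimate is an upper bound under the assumption that the whole box volume is solvent; it ignores the volume occupied by the protein and any counterions needed for neutralization.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def ion_pairs_for_box(edge_nm: float, molar: float) -> float:
    """Approximate number of Na+/Cl- pairs for a cubic box at a given molarity."""
    volume_l = (edge_nm ** 3) * 1e-24  # 1 nm^3 = 1e-24 L
    return molar * AVOGADRO * volume_l


def n_steps(total_ns: float, dt_fs: float) -> int:
    """Number of integration steps for a run of total_ns at time step dt_fs."""
    return round(total_ns * 1e6 / dt_fs)  # 1 ns = 1e6 fs


if __name__ == "__main__":
    print(round(ion_pairs_for_box(10.0, 0.15)))  # 90
    print(n_steps(100.0, 2.0))                   # 50000000
```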

3. Results

3.1. Assessment of eIF4E Coding Sequences of Cowpea Cultivars

The CDSs of the eIF4E, eIF(iso)4E, and nCBP genes were sequenced in six cowpea cultivars. The sequence alignment showed that the eIF4E CDSs (Figure S1) were highly conserved, with an overall sequence identity of 98.48% (Figure 1A). Despite this high conservation, some polymorphisms were detected among the cultivars. Notably, the Bajão eIF4E CDS displayed a 6-bp insertion between positions 224 and 230 (Figure S2). The eIF(iso)4E CDSs (Figure S3) were similarly conserved, displaying a mean pairwise identity of 98.46% (Figure 1B), although some polymorphisms were observed (Figure S3). Among the nCBP CDSs, the Santo Inácio sequence included a 3-bp insertion between positions 88 and 91 (Figure S4) and also presented the highest number of nucleotide polymorphisms among the cultivars. The CDSs of nCBP showed an overall sequence identity of 98.68% (Figure 1C).
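Identity values such as those reported above depend on how gaps are counted. The following Python sketch shows one common convention, assumed here for illustration and not necessarily the one used to produce the reported percentages: columns where both sequences have a gap are skipped, and columns where only one has a gap count as mismatches. The function names and toy sequences are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pairwise_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def mean_pairwise_identity(alignment: dict) -> float:
    """Mean percent identity over all sequence pairs in an alignment."""
    return mean(pairwise_identity(alignment[p], alignment[q])
                for p, q in combinations(sorted(alignment), 2))


if __name__ == "__main__":
    toy = {"cv1": "ATGGCT---AAA", "cv2": "ATGGCTGCTAAA", "cv3": "ATGGCT---AAG"}
    print(f"{mean_pairwise_identity(toy):.2f}%")
```

Under this convention an insertion such as the 6-bp one described for Bajão lowers identity by six mismatching columns against every sequence that lacks it.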

3.2. Chromosomal Location and Expansion Mechanisms of the eIF4E Isoform Gene Family

Structural genomics helps map how genes are distributed across a genome and fills gaps in our understanding of their evolution. In V. unguiculata, the three genes encoding eIF4E proteins were located on chromosomes 4, 6, and 7. Specifically, the nCBP gene was mapped to chromosome 4, the canonical eIF4E to chromosome 6, and the eIF(iso)4E isoform to chromosome 7 (Figure 2). Taken together, the genomic distribution and orthologous relationships suggest that the V. unguiculata eIF4E gene family expanded via dispersed duplication events, resulting in the retention of paralogs on non-homologous chromosomes.

3.3. Gene Structure of the eIF4E Family in Cowpea and Related Species

Gene structure was compared across seven plant species, including V. unguiculata. To facilitate cross-species comparisons, eIF4E gene models are shown with distinct colors within the CDSs, with each color corresponding to a different species (Figure 3).

Species considered more evolutionarily basal, such as Ceratopteris richardii (Pteridophyte) and Physcomitrium patens (Bryophyte), were included in the analysis to investigate the structural pattern and possible evolutionary trajectory of this gene family compared to higher plants. In P. patens, six eIF4E genes were identified, with five of them in triplicate. In C. richardii, four eIF4E genes were identified. In both species, gene structures vary in the number of exons from five to six. Notably, C. richardii includes loci exceeding 10 kb, resulting in a substantially larger genomic architecture compared to the other analyzed species.

In V. unguiculata, the structural elements of the analyzed genes were highly conserved. The nCBP gene contained an additional exon (a total of six exons), whereas the remaining structures comprised five exons. Furthermore, the copy Vigun06g182700.2 shows variation in the length of the last exon compared to Vigun06g182700.1, consistent with a possible post-duplication structural modification. Among the analyzed angiosperms, a broadly conserved structural pattern was also observed, similar to that described for the more basal species P. patens and C. richardii and for V. unguiculata. Most gene structures retain a conserved number of exons (ranging from five to six), while total gene length varies from <2 kb to ~5 kb.

Additionally, specific variations were identified in certain gene structures. For example, the second copy of the gene AT5G35620 presents more than one 5′ UTR region. In rice (Oryza sativa), the gene LOC_Os03g15590 lacks UTR regions at both termini. Similar modifications were also observed in eIF4E genes of bryophytes and pteridophytes. Finally, although A. thaliana and Glycine max follow the same structural patterns described above, both species harbor a higher number of eIF4E gene copies than the other angiosperms evaluated.

3.4. Chromosomal Distribution and Synteny Analysis of eIF4E Genes in Legumes

To assess genomic conservation and evolutionary collinearity, a synteny analysis was performed. This analytical approach evaluates the preservation of gene order and structural organization across different genomes, providing critical insights into evolutionary history, chromosomal rearrangements, and orthologous relationships. For this study, three well-characterized species from the Fabaceae family were selected based on prior analyses: Lens culinaris, Glycine max, and Phaseolus vulgaris. Among these, P. vulgaris is particularly significant due to its close phylogenetic relationship with V. unguiculata (Figure 4).

In P. vulgaris, syntenic blocks corresponding to all three V. unguiculata eIF4E genes were identified. Notably, high chromosomal conservation was observed, with syntenic regions maintained between corresponding chromosomes in both species. This was not the case for the other analyzed legumes. In L. culinaris, the number of genes shared with V. unguiculata was preserved, but these genes mapped to different chromosomes. In G. max, a similar pattern was observed, most notably the duplication of the V. unguiculata eIF4E genes.

3.5. Phylogenetic and Evolutionary Analysis

Evolutionary relationships among eIF4E family isoforms were reconstructed using a curated dataset of 106 protein sequences of three isoforms: eIF4E, eIF(iso)4E, and nCBP. The dataset comprised 95 Fabaceae sequences and 11 Brassicaceae sequences (including eIF4E1b and eIF4E1c from A. thaliana), the latter used as outgroups in the individual reconstructions, thereby allowing robust phylogenetic anchoring. The eIF4E and eIF(iso)4E isoforms presented similar sample sizes in Fabaceae, totaling 33 and 34 sequences, respectively, whereas nCBP was represented by 28 sequences.

The reconstructed phylogeny showed a robust topology, supporting three main clades (Figure 5), corresponding to the respective isoforms: Class I (nCBP) and Class II (eIF4E and eIF(iso)4E). Class I formed the most basal clade, representing the most ancestral form in Fabaceae. Subsequently, the Class II canonical eIF4Es diverged (BS: 100%), followed by the emergence of eIF(iso)4E, which appears as the most derived lineage among the analyzed isoforms (BS: 95%). The clear separation between the nCBP isoform and the Class II isoforms, together with the contrasting branch lengths observed in the NJ tree (Figure S8), reinforces the evolutionary distinction between these groups. The nCBP clade presents shorter and more uniform branches, indicating less divergence between its sequences. In contrast, the canonical lineages eIF4E and eIF(iso)4E exhibit relatively longer branches, evidencing greater molecular divergence between the analyzed sequences (Figure S8).

Additionally, the individual phylogenetic trees supported the topology initially reconstructed in the combined phylogeny (Figure S1). For both combined and individual analyses (Figure S1), phylogenetically closer genera, such as Phaseolus (L.) and Vigna Savi. (tribe Phaseoleae), and Medicago (L.) and Trifolium (L.) (tribe Trifolieae), maintained their positioning as sister groups without significant alterations between reconstructions.

3.6. Characterization of Conserved Amino Acids and Sequence Alignment of eIF4E Proteins

The conservation of specific amino acid residues and functional domains is critical for maintaining the structural integrity and biological activity of eukaryotic translation initiation factors. The primary structure dictates the conformational features required for essential cellular functions, such as mRNA 5′-cap recognition, as well as interactions with viral determinants such as the viral genome-linked protein (VPg). To characterize the conservation profiles and assess the levels of identity at the amino acid level, the deduced amino acid sequences of the eIF4E, eIF(iso)4E, and nCBP genes from six cowpea cultivars were aligned with orthologous reference sequences from A. thaliana and P. vulgaris.

The multiple alignment of the amino acid sequences of eIF4E, eIF(iso)4E, and nCBP isoforms from six V. unguiculata cultivars with orthologous reference sequences from A. thaliana and P. vulgaris revealed a higher variability in the N-terminal region (residues 1–66) across all analyzed sequences (Figures S5–S7). This domain exhibited distinct point divergences among the V. unguiculata cultivars and reference species, in contrast to the highly conserved C-terminal region.

The nCBP protein sequences (Figure S5) showed high conservation between V. unguiculata, A. thaliana, and P. vulgaris. The cowpea orthologs were nearly identical to the A. thaliana and P. vulgaris sequences within the core blocks. Critical tryptophan (W) residues and the FWED motif (Phenylalanine-Tryptophan-Glutamic acid-Aspartic acid), located at residues 92–95, were strictly conserved in all analyzed sequences. For the canonical eIF4E protein (Figure S6), sequence analysis indicated high conservation between V. unguiculata and P. vulgaris. Moreover, the hallmark tryptophan (W) residues, as well as acidic residues such as glutamic acid (E) and aspartic acid (D), were conserved across all sequences. Likewise, the eIF(iso)4E isoform (Figure S7) exhibited a high degree of amino acid conservation, with the spatial arrangement of critical tryptophan residues and the hydrophobic core maintained in all cowpea cultivars. Unlike the nCBP and eIF4E sequences, no insertions or deletions were observed in the eIF(iso)4E group among the six V. unguiculata cultivars and the P. vulgaris reference.

Taken together, these alignments demonstrate that while the core domains of the eIF4E isoforms remain highly conserved among cowpea cultivars and across A. thaliana and P. vulgaris orthologs, consistent with retention of essential cap-binding features, sequence variation is restricted to the N-terminal regions.

3.7. Three-Dimensional Modeling and Quality Assessment

The rigorous validation of theoretical models is fundamental to ensuring the reproducibility and reliability of structural biology data. Accordingly, we evaluated stereochemical quality and folding stability metrics to support the selection of robust structures (Supplementary Table S2). Three-dimensional models were generated for the three eIF4E-family isoforms (eIF4E, eIF(iso)4E, and nCBP) from six V. unguiculata cultivars (Bajão, Boca Negra, BR14-Mulato, IT85F-2687, Pingo de Ouro, and Santo Inácio) and compared with models from reference species, including A. thaliana and the phylogenetically related legume P. vulgaris. All predictions were produced with AlphaFold 3 and met high-quality criteria, as evidenced by pLDDT scores exceeding 90.0 and PAE values consistently below 5 Å, indicating well-defined domains. Together, these criteria indicate that the theoretical models are suitable for detailed structural characterization.
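The two confidence thresholds mentioned above could be applied programmatically as in the sketch below. It assumes that the thresholds are evaluated on the mean pLDDT and the mean PAE of each model, which the text does not specify; the function name and the toy values are likewise hypothetical.

```python
from statistics import mean


def passes_confidence(plddt_per_residue, pae_matrix,
                      min_plddt=90.0, max_pae_angstrom=5.0):
    """Check a predicted model against the pLDDT and PAE cutoffs from the text.

    plddt_per_residue -- sequence of per-residue pLDDT values (0-100)
    pae_matrix        -- square list of lists of predicted aligned errors (Å)
    Assumption: both cutoffs are applied to model-wide means.
    """
    mean_plddt = mean(plddt_per_residue)
    mean_pae = mean(value for row in pae_matrix for value in row)
    return mean_plddt > min_plddt and mean_pae < max_pae_angstrom


if __name__ == "__main__":
    toy_plddt = [95.0, 92.0, 91.0]
    toy_pae = [[0.0, 2.0, 3.0], [2.0, 0.0, 2.5], [3.0, 2.5, 0.0]]
    print(passes_confidence(toy_plddt, toy_pae))  # True
```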

Complementing the AlphaFold confidence metrics, the structural validity of the generated models was further assessed using standard geometric and energetic evaluations. Analyses performed in ProSA-web showed Z-Score values ranging from −5.00 to −6.48 for all theoretical models, placing them within the range characteristic of experimentally determined protein structures of similar size. Regarding stereochemical quality, the PROCHECK analysis revealed exceptional backbone conformation, with >90% of residues located in the most favored regions of the Ramachandran plot (average of 94.19%). Furthermore, QMEANDisCo analysis supported the global reliability of the structures, yielding scores consistently >0.74 (global mean of 0.80). Detailed structural assessment data for all theoretical models are provided in Supplementary Table S2. Collectively, these metrics confirm that the modeled eIF4E, eIF(iso)4E, and nCBP structures showed high stereochemical accuracy and were suitable for downstream physicochemical analyses.

The structural analysis of the generated models revealed that all three isoforms, nCBP, eIF4E, and eIF(iso)4E, adopt the canonical “cupped hand” architecture characteristic of the eIF4E family in eukaryotes. As illustrated in the secondary structure visualization (Figure 6), the protein core is composed of a curved β-sheet formed by antiparallel β-strands (shown in red), which serve as the scaffold for the cap-binding pocket. These strands are stabilized by α-helices (shown in blue) found on the dorsal surface of the protein. The loops connecting these secondary elements (shown in white) exhibited significant variation in length; notably, the loops near the entrance of the binding cavity were longer and structurally more flexible, whereas the loops connecting the dorsal helices were comparatively shorter and more rigid.

To quantify the structural conservation, we performed a global superimposition of the backbone atoms. The resulting structural alignments demonstrated a high degree of spatial convergence across all models. Specifically, the superimposition of the nCBP models yielded a Root Mean Square Deviation (RMSD) of 1.330 Å (Figure 7A). Similarly, the canonical eIF4E structures presented an RMSD of 1.665 Å (Figure 7B), while the eIF(iso)4E models exhibited the closest backbone agreement, with an RMSD of 1.006 Å (Figure 7C). These RMSD values (<2.0 Å) confirm that, despite the sequence variations observed in the loops and N-terminal regions, the overall three-dimensional folding and the spatial arrangement of the secondary structure elements remain highly conserved across the species analyzed.
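
For readers unfamiliar with the metric, the RMSD between two sets of equivalent backbone atoms can be sketched as follows. This is a minimal Python illustration, not the software used in the study: the function name `rmsd` and the toy coordinates are hypothetical, and the sketch assumes the two coordinate sets have already been optimally superimposed (for example with the Kabsch algorithm), so the fitting step itself is omitted.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that atoms are
    listed in the same order. The result has the same units as the input.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical toy coordinates in Å: model_b is model_a shifted by 1 Å along y
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
example_rmsd = rmsd(model_a, model_b)  # every atom is displaced by exactly 1 Å
```

Because every atom contributes equally, a few highly divergent loop residues can raise the global value even when the core is nearly identical.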

3.8. Structural Dynamics, Stability and Features of eIF4E Proteins

Molecular dynamics (MD) simulations provide a critical approach to transcend the static limitations of theoretical models, offering high-resolution insights into the time-dependent conformational behavior of biological macromolecules. This computational method is particularly useful for evaluating protein stability under physiological conditions and for decoding the dynamic features, such as domain flexibility and rigidity, that are intrinsic to biological function and ligand interaction.

To assess the structural stability of the V. unguiculata eIF4E isoforms, 100 ns MD simulations were performed for all modeled systems. The RMSD profiles (Figure 8A,C,E) showed that the protein backbones reached equilibrium after an initial relaxation phase, typically within the first 20 ns of simulation. Across all three isoforms (eIF4E, eIF(iso)4E, and nCBP), the trajectories exhibited stable plateaus with average RMSD values ranging from 0.2 to 0.4 nm. This behavior demonstrates that the AlphaFold-generated models maintained their structural integrity and fold stability throughout the simulation time, with no evidence of unfolding or aberrant deviation from the starting conformation.

Complementing the global stability analysis, the Root Mean Square Fluctuation (RMSF) was calculated to map the local flexibility of individual amino acid residues (Figure 8B,D,F). Fluctuation patterns were remarkably conserved among the six cowpea cultivars and the reference species (A. thaliana and P. vulgaris), reinforcing the structural homology of these proteins. As expected, the regions of highest flexibility (peaks > 0.3 nm) corresponded to the N-terminal tails and the loops connecting secondary structure elements. In contrast, the regions forming the central β-sheet core and the dorsal α-helices exhibited low fluctuation values (<0.15 nm), reflecting their structural rigidity required to maintain the architecture of the cap-binding pocket.
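
The per-residue fluctuation described above can be illustrated with a short Python sketch. It is an assumption-laden example rather than the analysis code used in the study: the function name `rmsf` and the toy trajectory are hypothetical, and the frames are assumed to have been fitted to a common reference beforehand so that overall rotation and translation do not contribute.

```python
import math

def rmsf(frames):
    """Per-atom RMSF from a trajectory.

    frames: list of frames, each a list of (x, y, z) tuples in the same atom order.
    Returns one value per atom, in the units of the input coordinates.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean_pos = tuple(sum(f[i][k] for f in frames) / n_frames for k in range(3))
        # Mean squared displacement of atom i from its average position
        msd = sum(math.dist(f[i], mean_pos) ** 2 for f in frames) / n_frames
        result.append(math.sqrt(msd))
    return result

# Hypothetical three-frame trajectory (nm): atom 0 is static, atom 1 oscillates
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)],
]
toy_rmsf = rmsf(toy_frames)
# Threshold of 0.3 nm taken from the text; here no atom exceeds it
flexible = [i for i, value in enumerate(toy_rmsf) if value > 0.3]
```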

Complementing the RMSF profiles, the B-factors were mapped onto the three-dimensional structures to spatially visualize regions of high conformational plasticity (Figure 9). The color-coded projection reveals a distinct flexibility landscape, where the central β-sheet core and dorsal α-helices exhibit consistently low B-factor values, represented in blue, indicating structural rigidity across all isoforms. Conversely, the highest B-factor values, represented in yellow to red, are strictly localized to the interconnecting loops and N-terminal tails. In particular, the loops flanking the mRNA 5′ cap-binding pocket display elevated displacement. This feature is notably pronounced in the canonical eIF4E isoform, where the loops forming the entrance to the binding cavity exhibited a higher degree of flexibility compared to the nCBP and eIF(iso)4E models, reinforcing the dynamic nature of the ligand-entry region.

To further characterize the conformational landscape and verify the structural compactness of the modeled systems, additional geometric and energetic properties were evaluated over the MD trajectories: the Radius of Gyration (RG), the Solvent Accessible Surface Area (SASA), the total number of intramolecular Hydrogen Bonds (HBs), and the Minimum Distance Matrices (MDMAT). The results are compiled in Figure 10 and Figure 11.

The RG serves as a critical indicator of protein compactness and global folding stability. As observed in the central panels (Figure 10B,E,H), the RG values for all V. unguiculata cultivars remained remarkably stable throughout the simulation, with no significant drift or abrupt fluctuations. The trajectories settled into a steady state (~1.45–1.55 nm), indicating that the eIF4E isoforms maintain a tightly packed tertiary structure without expansion or unfolding events. This behavior was consistent across all three isoforms and mirrored the profiles of the reference species, A. thaliana and P. vulgaris.
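
The compactness measure discussed above is the mass-weighted root mean square distance of the atoms from the center of mass. The Python sketch below shows the calculation for a single frame; the function name `radius_of_gyration`, the default of unit masses, and the toy coordinates are assumptions made for illustration and are not taken from the study.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z) tuples; masses: optional list of atomic masses.
    If masses is None, unit masses are assumed (a simplification).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Center of mass
    com = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    )
    # Mass-weighted mean squared distance from the center of mass
    weighted = sum(m * math.dist(c, com) ** 2 for m, c in zip(masses, coords))
    return math.sqrt(weighted / total_mass)

# Hypothetical test geometry: the eight corners of a cube of edge 2 centered
# at the origin, so each point lies sqrt(3) from the center of mass
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
cube_rg = radius_of_gyration(cube)
```

Evaluating this quantity for every frame gives the time series whose flatness is interpreted above as the absence of expansion or unfolding.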

Complementing the compactness analysis, the SASA was calculated to assess the exposure of the protein domains to the aqueous environment. The SASA plots (Figure 10C,F,I) display stable oscillation patterns around equilibrium values, suggesting that the hydrophobic core remains protected and the protein-solvent interface is preserved. The absence of increasing trends in SASA further confirms the structural integrity of the models, as unfolding would typically result in a substantial increase in solvent-exposed surface area.

Furthermore, the stability of the secondary and tertiary structural elements was monitored through the analysis of intramolecular HBs (Figure 10A,D,G). The time-dependent profiles reveal a predominantly constant density of hydrogen bond interactions throughout the simulations. Although dynamic fluctuations are inherent to molecular motion, the average number of bonds remained consistent across the nCBP and eIF4E isoforms. Within the eIF(iso)4E class, the trajectories largely converged to a similar range, with the notable exception of the Boca Negra cultivar, which exhibited a slightly higher average density of intramolecular hydrogen bonds compared to the other models. This persistence of the HB network is crucial for maintaining the distinct α-helical and β-sheet architecture observed in the initial theoretical models, ensuring the functional scaffold required for cap-binding activity.
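
Hydrogen bonds in MD trajectories are usually counted with a geometric criterion. The Python sketch below illustrates one widely used definition, a donor-acceptor distance of at most 0.35 nm combined with a hydrogen-donor-acceptor angle of at most 30°. These cutoffs are assumed here for illustration, since the criterion applied in this study is not restated in this section, and the function names and toy coordinates are hypothetical.

```python
import math

def angle_deg(vertex, p, q):
    """Angle in degrees at `vertex` between the vectors vertex->p and vertex->q."""
    v1 = [p[k] - vertex[k] for k in range(3)]
    v2 = [q[k] - vertex[k] for k in range(3)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))

def count_hbonds(donor_hydrogen_pairs, acceptors, max_dist=0.35, max_angle=30.0):
    """Count donor-H...acceptor combinations that satisfy both geometric cutoffs.

    donor_hydrogen_pairs: list of (donor_xyz, hydrogen_xyz) tuples, in nm.
    acceptors: list of acceptor coordinates, in nm.
    The cutoffs are assumed defaults; this sketch does not exclude an atom
    that appears as both donor and acceptor.
    """
    count = 0
    for donor, hydrogen in donor_hydrogen_pairs:
        for acceptor in acceptors:
            close_enough = math.dist(donor, acceptor) <= max_dist
            if close_enough and angle_deg(donor, hydrogen, acceptor) <= max_angle:
                count += 1
    return count

# Hypothetical geometry: only the first acceptor meets both criteria
toy_pairs = [((0.0, 0.0, 0.0), (0.1, 0.0, 0.0))]
toy_acceptors = [(0.3, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.3, 0.0)]
toy_hbonds = count_hbonds(toy_pairs, toy_acceptors)
```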

A comparative quantitative assessment across the three isoforms highlights the structural conservation of the eIF4E family in cowpea, despite the specific sequence variations. The RG trajectories showed remarkable convergence, with all systems stabilizing at an average compactness value of approximately 1.50 nm, confirming that the global folding volume is preserved regardless of the isoform. In terms of solvent exposure, the SASA values fluctuated within a consistent range of 85–95 nm² for all groups. Regarding the energetic landscape, the hydrogen bond analysis suggested a slight distinction for the eIF(iso)4E isoform, which maintained a higher average density of intramolecular interactions (~155–165 bonds) compared to the eIF4E and nCBP models (~140–150 bonds).

To outline the spatial connectivity and the time-averaged conformational ensemble of the eIF4E isoforms, the Minimum Distance Matrices (MDMAT) were calculated for all simulated systems. Unlike RMSD, which provides a global deviation value, the MDMAT generates a pairwise map representing the smallest distance between residue pairs averaged over the entire trajectory, a method crucial for visualizing the stability of tertiary contacts and identifying the preservation of secondary structure elements within the three-dimensional space. The resulting matrices (Figure 11) revealed a distinct and highly conserved topological signature across all three isoforms, where the main diagonal represents the zero-distance self-interaction and the broadened blue regions along this diagonal indicate stable local secondary structures, such as the α-helices located on the dorsal face of the protein. More significantly, the matrices displayed characteristic off-diagonal features, visible as lines perpendicular to the main diagonal, which correspond to the antiparallel β-strands forming the structural core. These regions exhibited mean inter-residue distances consistently below 0.6 nm, suggesting that the “cupped hand” architecture was tightly maintained via stable hydrogen bond networks and hydrophobic packing throughout the simulation.
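
The construction of such a matrix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool used in the study: the function names, the nested-list data layout, and the toy trajectory are hypothetical. For each frame, the smallest atom-atom distance between every pair of residues is computed, and the values are then averaged over all frames.

```python
import math

def min_distance(res_a, res_b):
    """Smallest atom-atom distance between two residues (lists of (x, y, z) tuples)."""
    return min(math.dist(p, q) for p in res_a for q in res_b)

def mean_min_distance_matrix(frames):
    """Trajectory-averaged matrix of minimum inter-residue distances.

    frames: list of frames; each frame is a list of residues;
            each residue is a list of atom coordinates (x, y, z).
    """
    n_res = len(frames[0])
    matrix = [[0.0] * n_res for _ in range(n_res)]
    for frame in frames:
        for i in range(n_res):
            for j in range(i + 1, n_res):
                d = min_distance(frame[i], frame[j])
                matrix[i][j] += d  # the matrix is symmetric,
                matrix[j][i] += d  # so both halves are filled
    n_frames = len(frames)
    return [[value / n_frames for value in row] for row in matrix]

# Hypothetical two-frame, three-residue trajectory (coordinates in nm)
toy_traj = [
    [[(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)], [(0.4, 0.0, 0.0)], [(1.0, 0.0, 0.0)]],
    [[(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)], [(0.6, 0.0, 0.0)], [(1.0, 0.0, 0.0)]],
]
toy_matrix = mean_min_distance_matrix(toy_traj)
# Residue pairs whose mean minimum distance is below 0.6 nm (threshold from the text)
contacts = [(i, j) for i in range(3) for j in range(i + 1, 3) if toy_matrix[i][j] < 0.6]
```

In the toy example, residue pairs (0, 1) and (1, 2) fall below the 0.6 nm threshold, whereas the pair (0, 2) does not.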

A comparative assessment between the isoforms highlights subtle but specific structural nuances. The matrices for the canonical eIF4E (Figure 11A) exhibit the most defined contact patterns, particularly in the central region (residues ~40–120) where the β-sheet core is located, while the regions of long-range distance (>1.5 nm) represent the spatial separation between the flexible N-terminus and the C-terminal domains. Similarly, the topological maps for eIF(iso)4E (Figure 11B) are strikingly similar to the canonical isoform, reinforcing the high structural homology, although the contact density in the N-terminal region differs slightly due to sequence variations and the differential length of the tail. Regarding nCBP (Figure 11C), the matrices display a preserved core folding pattern indistinguishable from the other isoforms in the central domains, yet subtle variations in the off-diagonal intensities suggest minor differences in the packing of the external loops, consistent with the slightly higher RMSF values observed for this isoform.

Numerically, the matrices demonstrate a remarkable convergence across all V. unguiculata cultivars and reference species. The core structural elements maintain average atomic distances of <0.5 nm, while the overall protein dimensions result in maximum intramolecular distances of approximately 2.0 nm. The absence of aberrant distance shifts or the disappearance of contact signals in any of the 28 analyzed models provides robust evidence that the theoretically predicted structures represent stable, thermodynamically favorable conformations that persist over the time scale of the simulation.

Finally, to map the physicochemical surface properties governing ligand recognition, the electrostatic potential distribution was calculated using the Adaptive Poisson-Boltzmann Solver (APBS) for all modeled structures. Electrostatic interactions play a pivotal role in the recruitment and orientation of the mRNA 5′ cap, serving as a long-range guidance mechanism. The resulting surface potential maps (Figure 12) reveal a striking conservation of the electrostatic fingerprint across the eIF4E, eIF(iso)4E, and nCBP isoforms.

The quantitative analysis of the surface charge distribution, scaled from −5 kT/e to +5 kT/e, identified a prominent, deep cleft characterized by a strong positive potential located on the concave face of the proteins. This electropositive region is structurally conserved in all six V. unguiculata cultivars and coincides with the cap-binding pockets observed in the reference structures of A. thaliana and P. vulgaris. This feature is biologically essential, as the dense concentration of positive charges (shown in blue) provides the necessary electrostatic environment to neutralize and bind the negatively charged triphosphate bridge of the 7-methylguanosine (m7GpppN) cap. Conversely, the dorsal surfaces of the proteins exhibited a predominantly electronegative or neutral potential, likely involved in preventing non-specific aggregation or mediating interactions with other translation initiation partners. The preservation of this electrostatic signature across the different isoforms and genotypes, combined with the structural stability data and dynamic conservation, strongly supports the functional viability of these theoretical models for detailed interaction studies.

4. Discussion

4.1. Experimental Analysis and Coding Sequences Assessment

The comparative analysis of the eukaryotic translation initiation factor 4E (eIF4E) isoforms demonstrated a high degree of overall sequence conservation, juxtaposed with specific genetic alterations across the evaluated cultivars [40]. Resistance to Potyviridae members has frequently been associated with single nucleotide polymorphisms (SNPs) in eIF4E genes or their isoforms [41]. Consistent with this model, although the coding sequences (CDS) of the three V. unguiculata eIF4E isoforms analyzed in this study were highly conserved, we identified several SNPs across the cultivars. Likewise, a previous study from our group [15] found a relationship between mutations in V. unguiculata eIF4E and Potyvirus resistance, reinforcing the functional relevance of naturally occurring variation in translation initiation factors during viral infection. Nevertheless, potyviruses can bypass SNP-mediated resistance through mutations in viral proteins or by exploiting alternative isoforms of the host eIF4E gene [42,43].

In legumes (Fabaceae), particularly in economically significant crops such as cowpea, understanding the evolutionary dynamics and structural conservation of these genes is crucial for developing resistance strategies. While the eIF4E family has been well characterized in model species like A. thaliana, its organization in V. unguiculata and its syntenic relationships with related organisms, such as P. vulgaris and Glycine max, reveal complex evolutionary patterns driven by selective pressures.

4.2. Genomic Architecture and Expansion Mechanisms

Regarding genomic aspects, the three genes identified in V. unguiculata reflect a conservation pattern reported for other dicots, including watermelon (Citrullus lanatus) [44], but different from species with higher copy number variations driven by specific polyploidization events, such as G. max or Solanum spp. (e.g., tomato and pepper) [45]. In cowpea, the expansion of eIF4E genes appears to have occurred predominantly through dispersed or translocated duplications rather than recent whole-genome duplications. This mechanism, often associated with transposable elements and epigenetic regulation, results in non-collinear and spatially distant gene copies [46,47], explaining the distinct chromosomal locations observed in the analysis.

The gene architecture remains highly conserved, characterized by a 5–6 exon structure. This pattern is maintained across both phylogenetically distant organisms and closely related species, like Phaseolus vulgaris and Lens culinaris. However, structural variations such as the loss of UTR regions or exons, as observed in G. max, O. sativa, and Brassica rapa [48], indicate that functional redundancy can lead to pseudogenization or reduced purifying selection efficiency in specific paralogs [49]. In V. unguiculata, by contrast, the functional domains appear preserved, suggesting that the spaced distribution allows these copies to be subject to distinct selective pressures, potentially implying neofunctionalization [50].

4.3. Synteny and Comparative Genomics

Syntenic analysis reveals collinear blocks between V. unguiculata and P. vulgaris, with eIF4E genes located on corresponding chromosomes (4, 6, and 7). However, chromosomal rearrangements during the divergence of these species appear to have influenced gene orientation. The gene identified on chromosome 7 of V. unguiculata lies on the same chromosome as its P. vulgaris counterpart and at an equivalent position along the chromosomal axis, but at the opposite chromosomal end, indicating an inversion in the orientation of the chromosomal arm. This pattern suggests the occurrence of structural rearrangements, such as inversions or chromosomal reorganization events, during the evolutionary history of the Vigna and Phaseolus lineages. These findings highlight the genomic plasticity of these species despite the conservation of the gene locus, as previously reported [51].

In contrast, G. max displays a duplication of all cowpea eIF4E orthologs, reflecting multiple rounds of whole-genome duplication that have shaped the soybean genome [41,52]. While copy number is increased in soybean, functional redundancy may compromise the functionality of some copies to maintain genetic machinery balance [52]. Conversely, Lens culinaris preserves the gene quantity but in distinct genomic regions, likely due to chromosome number reduction. This indicates that genome size reduction (as seen in Utricularia gibba) does not necessarily imply a loss of essential regulatory genes, which are maintained under strong selective pressure.

4.4. Evolutionary History and Phylogenetic Reconstruction

The topology obtained from the reconstruction of the eIF4E family confirms the division of Fabaceae into two main paralogous lineages: Class I (nCBP) and Class II (eIF4E and eIF(iso)4E). This organization is consistent with analyses in Viridiplantae, indicating that eIF4E and eIF(iso)4E form a monophyletic clade distinct from the ancestral nCBP lineage (equivalent to 4EHP/4EHP-like) [53,54,55,56]. Specifically, the conservation of the basal position of Class I (Figure 2) supports the hypothesis that nCBP retains features of the ancestral cap-binding protein present in green algae prior to the duplication events that gave rise to the canonical forms in vascular plants [54].

The identification of representatives of both classes in cowpea cultivars (Bajão, Boca Negra, BR14-Mulato, IT85F-2687, Pingo de Ouro, and Santo Inácio), as well as in genera of the subfamily Caesalpinioideae (e.g., Senna and Prosopis), which diverged before the emergence of the Papilionoideae subfamily, suggests that the common ancestor of these legumes already possessed all three gene copies. The positioning of Vigna sequences within the tribe Phaseoleae fills an important evolutionary gap, demonstrating that the diversification within Class I likely derives from a single duplication event followed by the maintenance of its primary function.

This maintenance is significant because the eIF4E-family evolution is closely linked to pressures imposed by virus–host interactions [43]. The reconstruction of the monophyletic clade of eIF4E and eIF(iso)4E supports the “duplication-degeneration-complementation” or subfunctionalization hypothesis, in which one paralog preserves core translation activity while the other accumulates mutations that may confer viral resistance without compromising plant viability [53,57]. Finally, the clustering of A. thaliana sequences within the Fabaceae isoform clades further highlights deep conservation across different Eudicot families [7,54], suggesting that after ancient duplication events, isoforms were passed vertically without significant recent losses in this lineage.

4.5. eIF4E Protein Sequence Conservation and Biological Signatures

The characterization of the eIF4E gene family in cowpea represents a crucial step toward understanding the translational machinery of this legume, filling a gap in the genomic data for a variety of crops [15]. Our comparative analysis revealed a shared architectural signature across all analyzed isoforms: a highly conserved C-terminal core juxtaposed with a divergent N-terminal domain (Figures S4–S6). This structural organization is consistent with the evolutionary paradigm of eukaryotic translation initiation factors, where the core domain undergoes strong purifying selection to maintain the stereochemical requirements for 5′-cap (m7GpppN) recognition, while the N-terminus remains evolutionarily variable [43,58]. This pronounced divergence is partly attributable to the presence of signal peptide motifs. However, while these motifs show a baseline level of conservation within the plant kingdom, they also display pronounced species-specific variability [7,43,59].

The variability observed in the N-terminal region (residues 1–66) among the cowpea cultivars and the reference species (A. thaliana and P. vulgaris) likely reflects the intrinsically disordered nature of this domain [60]. In plants, the N-terminus does not contribute directly to cap binding; instead, it is thought to play a regulatory role, often serving as a docking site for accessory proteins or undergoing post-translational modifications that modulate translational activity [15,61]. The divergence found in this region suggests that, while the fundamental mechanism of translation initiation is preserved, species-specific or even cultivar-specific regulatory fine-tuning may occur, potentially influencing how different cowpea genotypes respond to biotic and abiotic stresses [62,63]. Biologically, the strict conservation of the tryptophan (W) residues and the FWED motif across all analyzed sequences underscores the functional competence of the cowpea isoforms [64]. These aromatic residues form the “tryptophan sandwich”, a hydrophobic stack that intercalates the 7-methylguanosine (m⁷G) moiety of the mRNA cap, essential for the recruitment of the ribosome [61,64]. The presence of conserved acidic residues, such as glutamic acid (E) and aspartic acid (D), further facilitates this interaction by neutralizing the positive charge of the mRNA 5′ cap [59,65]. The identification of these motifs in V. unguiculata (Figure S5) suggests that the three described genes encode translation initiation factors capable of driving protein synthesis.

Notably, the eIF(iso)4E isoform exhibited a unique conservation profile: no insertions or deletions were observed among the six cowpea cultivars and the P. vulgaris reference (Figure S6). This remarkable stability, greater than that observed for nCBP and eIF4E, suggests that the iso isoform may be under distinct evolutionary constraints in legumes [7,43]. Given that eIF(iso)4E is frequently recruited by potyviruses (such as the Bean common mosaic necrosis virus) as a host susceptibility factor, its high conservation in cowpea potentially indicates that this protein could serve as a stable and evolutionarily constrained target for viral subversion [15,43,66]. This extensive conservation across diverse plant species likely contributes to the broad host range observed for many potyviruses, as the viral VPg (Viral Protein genome-linked) appears to have evolved to recognize these highly conserved structural determinants [66]. Nevertheless, it is crucial to note that despite this global conservation, specific point mutations can disrupt this interaction without compromising the physiological function of the protein in translation [66]. Indeed, our previous study [15] identified discrete nonsynonymous substitutions in the cowpea eIF4E gene. Such changes could be sufficient to confer partial evasion (reduced susceptibility) or resistance to specific Potyvirus strains in certain cultivars [67]. Therefore, while the high background conservation of eIF4E protein sequences facilitates viral susceptibility across legume species, the identification of such rare, naturally occurring resistance alleles remains a priority for breeding programs aiming to disrupt the eIF4E-VPg interactome.

4.6. Structural Validation, Stability and Conformational Dynamics

The structural dynamics and interaction profiles predicted in our expanded in silico models provide a robust biophysical rationale for the cultivar-specific phenotypes previously characterized by our group [15]. While the current molecular dynamics simulations represent theoretical biophysical states, they are strictly anchored in a validated framework of experimental CABMV inoculation bioassays. In that prior context, the specific cowpea cultivars harboring the alleles modeled herein demonstrated unambiguous in vivo resistance or susceptibility profiles driven by eIF4E mutations. Consequently, the enhanced thermodynamic stability, domain flexibility, and distinct conformational signatures observed in this study, particularly the proposed structural competence of the eIF(iso)4E isoform, are not isolated computational artifacts. Instead, they provide a comprehensive mechanistic basis that explains the biological reality of viral subversion and genetic resistance established in plants.

The reliability of theoretical protein models is a prerequisite for their application in downstream functional analyses, particularly in the absence of experimentally resolved structures for a variety of important crops like cowpea. The structural validation conducted in this study demonstrates that the AlphaFold 3-generated models for eIF4E, eIF(iso)4E, and nCBP achieve stereochemical and energetic quality comparable to that of high-resolution experimentally determined structures [33]. The convergence of high pLDDT scores (>90.0) and low PAE values (<5 Å), reinforced by QMEANDisCo global quality estimates averaging 0.80, ProSA Z-scores with a mean of −5.69, and PROCHECK analysis revealing an average of 94.18% of residues in the most favored regions of the Ramachandran plot, confirms that the modeled isoforms adopt a stable fold, devoid of the steric clashes or geometric distortions often associated with template-based modeling [33,68]. Detailed structural assessment data for all theoretical models are comprehensively listed in Supplementary Table S2.

Biologically, the structural analysis reveals that all cowpea isoforms preserve the canonical “cupped hand” architecture, a hallmark of the eIF4E family essential for its interaction with the mRNA 5′ cap [61,65,69]. The strict conservation of the antiparallel β-sheet core, supported by dorsal α-helices (Figure 6), provides the necessary scaffold to form the cap-binding pocket [61]. The low RMSD values obtained from the superimposition of V. unguiculata models with orthologs from A. thaliana and P. vulgaris (<1.7 Å for all isoforms, Figure 7) underscore the evolutionary pressure to maintain this tertiary fold [15]. This structural rigidity is critical, as even minor deviations in the core geometry could disrupt the precise positioning of the conserved tryptophan residues required for the “sandwich stacking” interaction with the 7-methylguanosine moiety [61,70].

Despite the static conservation of the core, the molecular dynamics simulations highlighted the importance of conformational plasticity in specific regions. The RMSF profiles (Figure 8) delineated a clear dynamic partitioning: a rigid core necessary for stable ligand binding, contrasted with flexible loops and N-terminal tails. This distinct mobility landscape was spatially corroborated by the analysis of B-factors, which mapped the regions of highest disorder precisely to the loops flanking the mRNA cap-binding pocket (Figure 9). This flexibility is unlikely to be a simulation artifact and more likely reflects a functional requirement, since the loops surrounding the binding cavity often undergo induced-fit movements to accommodate the mRNA cap or to facilitate the release of the translation factor after initiation [70,71]. The behavior observed in the cowpea isoforms mirrors that of the reference species, suggesting that the dynamic regulation of cap accessibility is a conserved mechanism in legumes.

The comparative analysis of stability metrics revealed that all three isoforms exhibit a high degree of structural compactness and stability. Specifically, the radius of gyration and solvent accessible surface area values fluctuated within a similar range across the eIF4E, eIF(iso)4E, and nCBP models. Despite this global similarity in folding volume, the hydrogen bond analysis highlighted a subtle distinction for the eIF(iso)4E isoform, which displayed a higher density of intramolecular interactions. Biologically, this structural conservation across isoforms underscores the functional plasticity of the eIF4E family [43]. In various plant-virus pathosystems, Potyviridae members have demonstrated the ability to recruit either the canonical eIF4E or the eIF(iso)4E isoform as host susceptibility factors, depending on the specific virus strain and plant genotype [43]. Conversely, while nCBP shares the conserved core architecture, it is frequently described as a specialized factor essential for specific plant developmental stages and mRNA processing, rather than serving as a primary target for viral subversion [6,72]. The theoretical thermodynamic stability and structural compactness shared by both the canonical eIF4E and eIF(iso)4E isoforms suggest that both proteins potentially possess the conformational profiles required to act as putative anchors for the Potyvirus’ VPg proteins [15,61]. Consequently, the preferential recruitment of one isoform over another by different viral strains is likely influenced by precise surface residue complementarity and host-genotype specificity, rather than global stability differences alone [33].

Furthermore, the analysis of the contact distances provided a detailed map of the spatial connectivity within these proteins (Figure 11). The conserved contact patterns across cultivars confirm that the N-terminal sequence variants do not propagate into conformational disturbances within the C-terminal core. This modular independence allows the N-terminus to evolve and acquire new regulatory motifs without compromising the basal translation initiation function performed by the C-terminus [15].

Moreover, the electrostatic surface potential analysis (Figure 12) provided the most direct link between structure and function. The prominent electropositive cleft conserved in the concave face of all cowpea isoforms is biologically required [61]. This positive potential is essential to electrostatically steer and neutralize the negatively charged triphosphate bridge of the mRNA 5′ cap [61]. The perfect alignment of this electrostatic fingerprint between V. unguiculata and the reference species suggests that the modeled proteins are fully competent for cap recognition. Conversely, the electronegative/neutral dorsal surface likely serves to prevent non-specific RNA binding, ensuring that the interaction is strictly specific to the 5′ terminus [15,61]. Consequently, the comparative structural benchmarking against A. thaliana and, crucially, the phylogenetically proximal legume P. vulgaris, provides robust evolutionary validation for the V. unguiculata models.

As one of the pioneering comprehensive three-dimensional descriptions of the cowpea eIF4E gene family, this study fills a significant gap in the structural genomics of this crop. Beyond their validation, these theoretical models represent a strategic resource for genetic improvement programs, providing a detailed structural roadmap that could guide the rational exploration of resistance alleles and help clarify the potential atomic-level determinants governing the interaction with Potyvirus’ VPg proteins, thereby facilitating the development of virus-resistant cultivars.

Finally, the comprehensive structural and dynamic characterization of the eIF4E, eIF(iso)4E, and nCBP isoforms in V. unguiculata establishes a foundational benchmark for the functional genomics of this legume. By elucidating the conserved molecular architecture and the electrostatic determinants governing the 5′ cap interaction, this study provides the first rational structural framework to guide the engineering of these translation factors. The confirmed stability of the theoretical models directly supports crop breeding programs, offering precise coordinates for allele mining of natural variants or for targeted mutagenesis via gene editing technologies, such as CRISPR/Cas9. By shifting the paradigm from a sequence-based description to a three-dimensional mechanistic understanding of the susceptibility mediated by the eIF4E-VPg interaction, our data support the development of cultivars with improved resistance to Potyvirus infection, thereby contributing to agricultural sustainability and food security in regions reliant on essential legume crops.

5. Conclusions

This study provides the first comprehensive genomic and structural characterization of the eIF4E, eIF(iso)4E, and nCBP gene family in V. unguiculata. Our analyses revealed that, while these isoforms share a highly conserved core and electrostatic signature essential for mRNA 5′ cap recognition, they follow distinct evolutionary trajectories and exhibit different dynamic behaviors. The genomic organization, characterized by dispersed duplications, and the high synteny with P. vulgaris underscore the evolutionary pressure to maintain these essential translation factors in legumes. Molecular dynamics simulations highlighted eIF(iso)4E as the most compact and structurally stable isoform, a feature that may suggest its preferential recruitment by Potyviridae as a host susceptibility factor. By transitioning from a linear sequence perspective to a three-dimensional conformational understanding, we identified that the N-terminal plasticity and the rigid cap-binding pocket are key determinants of their function. Consequently, these high-resolution models provide a strategic structural framework for breeding initiatives. They enable the rational identification of non-synonymous mutations that can disrupt viral VPg interaction without compromising physiological translation, paving the way for the development of cowpea cultivars with broader-spectrum resistance to Potyviruses.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16070766/s1, Table S1: Accession numbers for the 106 isoforms analyzed in this study, including 95 accessions for Fabaceae and outgroups used in the reconstruction; Table S2: Validation metrics for the structural quality of eIF4E, eIF(iso)4E, and nCBP theoretical models; Figure S1: Individual phylogenetic reconstructions of the eIF4E family isoforms; Figure S2: Multiple nucleotide sequence alignment of the eIF4E coding region (CDS) in six cowpea (Vigna unguiculata) cultivars; Figure S3: Multiple nucleotide sequence alignment of the eIF(iso)4E coding region (CDS) in six cowpea (Vigna unguiculata) cultivars; Figure S4: Multiple nucleotide sequence alignment of the nCBP coding region (CDS) in six cowpea (Vigna unguiculata) cultivars; Figure S5: Multiple amino acid sequence alignment of nCBP proteins from Vigna unguiculata cultivars and reference species; Figure S6: Multiple amino acid sequence alignment of eIF4E proteins from Vigna unguiculata cultivars and reference species; Figure S7: Multiple amino acid sequence alignment of eIF(iso)4E proteins; Figure S8: Neighbor-Joining cladogram for the eIF4E family isoforms in Fabaceae.

Author Contributions

Conceptualization: M.A.d.L.-A. and F.A.d.A.; methodology: M.A.d.L.-A., F.A.d.A., S.R.M.P., L.S.M., L.M.R.-P. and A.B.L.; formal analysis: M.A.d.L.-A. and F.A.d.A.; investigation: M.A.d.L.-A., F.A.d.A., S.R.M.P., L.S.M., L.M.R.-P. and A.B.L.; writing (original draft): M.A.d.L.-A. and F.A.d.A.; review and editing: V.P., J.D.C.F., A.M.B.-I. and F.J.L.A.; visualization: M.A.d.L.-A. and F.A.d.A.; supervision: V.P. and A.M.B.-I.; project administration: M.A.d.L.-A., F.A.d.A. and V.P.; and resources: V.P., A.M.B.-I. and F.J.L.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by CNPq (National Council for Scientific and Technological Development; financial support numbers 404070/2024-8, 406657/2023-8, and 406048/2022-3, and grant 167660/2022-5) and FAPEMIG (Minas Gerais State Research Support Foundation; grant 23072.242088/2024-72).

Data Availability Statement

The original contributions presented in the study are included in the article and its Supplementary Materials. Raw data and additional resources supporting this research can be accessed via the GitHub repository at https://github.com/madsondeluna/agronomy-cowpea-eif4e, accessed on 25 March 2025. Further inquiries can be directed to the corresponding authors.

Acknowledgments

The authors acknowledge the National Laboratory for Scientific Computing (LNCC) for providing high-performance computing resources through the Santos Dumont supercomputer, which supported the computational analyses performed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

Cap 5′	5′ cap structure (m⁷GpppN)
APBS	Adaptive Poisson-Boltzmann Solver
AF3	AlphaFold 3
BIC	Bayesian Information Criterion
CDS	Coding Sequence
cDNA	Complementary DNA
CABMV	cowpea aphid-borne mosaic virus
ESP	Electrostatic Surface Potential
eIF4E	eukaryotic Translation Initiation Factor 4E
eIF(iso)4E	eukaryotic Translation Initiation Factor Isoform 4E
HB	Hydrogen Bond
ML	Maximum Likelihood
MDMAT	Mean Minimum Distance Matrices
mRNA	Messenger RNA
MD	Molecular Dynamics
nCBP	Novel cap-binding protein
PCR	Polymerase Chain Reaction
PAE	Predicted Aligned Error
pLDDT	Predicted Local Distance Difference Test
RG	Radius of Gyration
RMSD	Root Mean Square Deviation
RMSF	Root Mean Square Fluctuation
SPC	Simple Point Charge
SNPs	Single Nucleotide Polymorphisms
UTR	Untranslated Region
VPg	Viral Protein genome-linked
SASA	Solvent Accessible Surface Area
WGD	Whole-genome Duplications
TEs	Transposable Elements

Figure 1. Pairwise sequence identity matrices of eIF4E, eIF(iso)4E, and nCBP coding sequences among six cowpea cultivars. Heatmaps display the percentage of nucleotide identity among: (A) eIF4E, (B) eIF(iso)4E, and (C) nCBP. The values in each cell indicate the identity percentage for each pairwise comparison. The color scale reflects the degree of conservation, from yellow (100% identity) to dark blue (lower identity). The overall mean identity for each gene is shown at the top of each panel.
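The identity percentages summarized in these heatmaps can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and that columns with a gap in either sequence are skipped. The function name and the toy sequences are hypothetical and do not come from the authors' pipeline or data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real cultivar sequences).
toy = {"cv1": "ATGGCAACAA", "cv2": "ATGGCAACAA", "cv3": "ATGGCTACAA"}
names = sorted(toy)

# Full pairwise matrix, keyed by (row, column) name pairs.
matrix = {(p, q): percent_identity(toy[p], toy[q]) for p in names for q in names}
```

The mean of the off-diagonal cells of such a matrix would correspond to the overall mean identity reported at the top of each panel, under the same gap-handling assumption.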

Figure 2. Genomic distribution of eIF4E isoforms in Vigna unguiculata. The linear diagram depicts the chromosomal loci of the three identified genes: nCBP (blue) on chromosome 4 (Chr 4), eIF4E (green) on chromosome 6 (Chr 6), and eIF(iso)4E (red) on chromosome 7 (Chr 7). Gray bars represent the chromosomes, and the X-axis indicates physical positions in megabase pairs (Mb). Exact start and stop coordinates (bp) are shown below each locus marker, illustrating the dispersed chromosomal arrangement of the gene family.

Figure 3. Gene structure of the eIF4E family in Vigna unguiculata compared to other species. The different colors in the CDS distinguish genes among the analyzed species. UTR regions are shown in gray, and dotted lines indicate introns.

Figure 4. Synteny analysis between Vigna unguiculata and legume species. Chromosomal blocks of Vigna unguiculata (Vu01–Vu11, purple) are compared with chromosomes of Phaseolus vulgaris (Chr01–Chr11, green), Lens culinaris (Lcu.2RBY.Chr1–Chr7, orange), and Glycine max (Gm01–Gm20, blue). Red lines denote conserved syntenic regions, whereas gray lines indicate collinear regions.

Figure 5. Phylogenetic analysis of eIF4E family isoforms in Fabaceae. Topology was inferred using the Maximum Likelihood method based on the JTT + I amino acid substitution model. Branch colors mark three distinct clades according to isoform class: Class I, nCBP (blue); Class II, eIF4E (green) and eIF(iso)4E (red). Dashed branches indicate Vigna unguiculata cultivars. Node support was assessed using 1000 bootstrap replicates.

Figure 6. Three-dimensional structural overview of eIF4E isoforms. Ribbon diagrams illustrating the secondary structure elements of (A) nCBP, (B) eIF4E, and (C) eIF(iso)4E models. The structures are color-coded to distinguish α-helices (blue), β-strands (red), and loops/coils (white). The black arrows indicate a 180° rotation of the protein structure to display the opposite face. The canonical “cupped hand” fold, characteristic of the eIF4E family, is preserved across all three isoforms, with the antiparallel β-sheets forming the central cap-binding core.

Figure 7. Structural superimposition of eIF4E isoforms modeled in Vigna unguiculata and reference species. The global alignment of backbone atoms was performed for (A) nCBP, (B) eIF4E, and (C) eIF(iso)4E proteins. The ensembles include models from six Vigna unguiculata cultivars (Bajão, Boca Negra, BR14-Mulato, IT85F-2687, Pingo de Ouro, and Santo Inácio) superimposed with orthologs from Arabidopsis thaliana and Phaseolus vulgaris. The calculated Root Mean Square Deviation (RMSD) values of 1.330 Å, 1.665 Å, and 1.006 Å for panels (A), (B), and (C), respectively, indicate a high degree of structural conservation across the analyzed genotypes, despite sequence variations in the N-terminal region and at isolated positions along the core domain. The black arrows indicate a 180° rotation of the protein structure to display the opposite face.

Figure 8. Structural stability and flexibility analysis during 100 ns of molecular dynamics (MD) simulation. The plots illustrate the temporal evolution of the modeled eIF4E isoform structures. The left panels (A,C,E) show the backbone Root Mean Square Deviation (RMSD, in nm) over the simulation time (ns). The plateau reached by the trajectories indicates structural equilibrium for (A) eIF4E, (C) eIF(iso)4E, and (E) nCBP. The right panels (B,D,F) show the per-residue Root Mean Square Fluctuation (RMSF, in nm), highlighting regions of higher rigidity (valleys) and local flexibility (peaks) for (B) eIF4E, (D) eIF(iso)4E, and (F) nCBP. Line colors correspond to models from the reference species (A. thaliana and P. vulgaris) and the six V. unguiculata cultivars, as detailed in the embedded legend of each plot.
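For readers unfamiliar with the two metrics plotted here, the following is a minimal sketch of how RMSD and RMSF are defined. It assumes every frame has already been superimposed on the reference structure and uses a toy two-atom trajectory; the function names and coordinates are illustrative only and do not reproduce the analysis tools used in the study.

```python
import math

# One frame is a list of (x, y, z) positions, one per atom, in nm.


def rmsd(frame, reference):
    """RMSD of one frame against a reference (frames assumed pre-fitted)."""
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom fluctuation around the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = tuple(sum(f[i][k] for f in trajectory) / n_frames for k in range(3))
        msf = sum(math.dist(f[i], mean) ** 2 for f in trajectory) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy trajectory: atom 0 is rigid, atom 1 oscillates along x.
traj = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)],
]
rmsd_series = [rmsd(frame, traj[0]) for frame in traj]  # one value per frame
fluct = rmsf(traj)                                      # one value per atom
```

In this toy case the rigid atom gives an RMSF of zero (a "valley") while the oscillating atom gives a positive value (a "peak"), which mirrors how the right-hand panels are read.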

Figure 9. B-factor mapping and conformational flexibility of eIF4E isoforms. The three-dimensional structures of (Left Panel) nCBP, (Middle Panel) eIF4E, and (Right Panel) eIF(iso)4E are rendered with a color gradient representing the B-factors. The scale ranges from blue (low B-factor, high rigidity) to red (high B-factor, high flexibility). The labels correspond to the modeled sequences and their source organisms: (A,B) Reference species (A. thaliana and P. vulgaris, except for the eIF4E panel where (A–C) represent A. thaliana isoforms); (C–H/D–J) V. unguiculata cultivars (Bajão, Boca Negra, BR14-Mulato, IT85F-2687, Pingo de Ouro, and Santo Inácio).

Figure 10. Time-dependent analysis of structural compactness and stability metrics during MD simulation. The panels display the evolution of structural properties over 100 ns for eIF4E (Top Row: (A–C)), eIF(iso)4E (Middle Row: (D–F)), and nCBP (Bottom Row: (G–I)). Left Column (A,D,G): Number of Hydrogen Bonds. The plot shows the total count of intramolecular hydrogen bonds, indicating the maintenance of the internal interaction network. Middle Column (B,E,H): Radius of Gyration (RG). The RG values (in nm) measure the global compactness of the protein structure; flat trajectories indicate a stably folded state. Right Column (C,F,I): Solvent Accessible Surface Area (SASA). The profiles (in nm²) represent the surface area exposed to the solvent, serving as a proxy for tertiary structure integrity. The color code for V. unguiculata cultivars and reference species (A. thaliana and P. vulgaris) follows the standard scheme used in previous figures.
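The radius of gyration shown in the middle column can be sketched as the mass-weighted root mean square distance of the atoms from their center of mass. The example below is a minimal illustration under the assumption of equal atomic masses when none are supplied; the function name and coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration, in the same length unit as the input."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses if none given
    total = sum(masses)
    # Center of mass, computed component by component.
    com = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    )
    weighted = sum(m * math.dist(c, com) ** 2 for m, c in zip(masses, coords))
    return math.sqrt(weighted / total)


# Toy example: the same four points, compact versus expanded by a factor of two.
compact = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
expanded = [(2 * x, 2 * y, 2 * z) for x, y, z in compact]
rg_compact = radius_of_gyration(compact)
rg_expanded = radius_of_gyration(expanded)
```

A structure that unfolds or expands during a simulation would show a rising value, whereas the flat trajectories described in the caption correspond to a value that stays close to constant over time.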

Figure 11. Mean Minimum Distance Matrices (MDMAT) of eIF4E isoforms over MD simulations. The heatmaps represent the smallest distance between residue pairs averaged over the entire trajectory for (A) eIF4E, (B) eIF(iso)4E, and (C) nCBP models. The X and Y axes correspond to the residue indices of the protein sequences. The color scale indicates the mean distance in nanometers (nm), where dark blue regions (<0.6 nm) indicate residues in close spatial proximity, representing stable secondary structures (α-helices) and tertiary contacts (antiparallel β-sheets), and yellow regions (>1.5 nm) represent residues that are spatially distant within the three-dimensional fold. The consistent “checkerboard” patterns observed across all cultivars (Bajão, Boca Negra, BR14-Mulato, IT85F-2687, Pingo de Ouro, Santo Inácio) and reference species (A. thaliana, P. vulgaris) confirm the preservation of the global fold and the stability of the intramolecular contact network.

Figure 12. Electrostatic surface potential of eIF4E isoforms in Vigna unguiculata and reference species. The surface potential maps were calculated using APBS and are colored according to the electrostatic scale in units of kT/e, ranging from −5.0 (red, anionic) to +5.0 (blue, cationic). White areas represent neutral potential. (Left Panel) nCBP: Models for (A) A. thaliana (At), (B) P. vulgaris (Pv), and (C–H) V. unguiculata cultivars demonstrate a conserved positive cleft. (Middle Panel) eIF4E: Comparative electrostatics for (A–C) At isoforms, (D) Pv, and (E–J) cowpea cultivars, highlighting the canonical cap-binding pocket. (Right Panel) eIF(iso)4E: Surface maps for (A) At, (B) Pv, and (C–H) V. unguiculata models. The prominent blue regions correspond to the positively charged binding groove essential for interacting with the negatively charged phosphate groups of the mRNA 5′ cap. The strong conservation of this electrostatic profile across all genotypes supports the functional competence of the modeled structures.

Table 1. Primer pairs used to amplify the coding sequences (CDS) of eIF4E isoforms (nCBP, eIF4E, and eIF(iso)4E) from cDNA derived from cowpea cultivars.

Primer	Sequence (5′–3′)	Fragment Size (bp)
nCBP_CDS.F	ATGGATTTCACAGCGGAGAAGAA	705
nCBP_CDS.R	CTACCCTCTCAACCAAGTGTTCC	705
eIF4E_CDS.F	ATGGTTGTGGAAGATTCACAAA	693
eIF4E_CDS.R	TCATATCACGTATTTATTTTTAGCACCC	693
eIF(iso)4E_CDS.F	ATGGCAACAAGCGAGGAAG	594
eIF(iso)4E_CDS.R	TTATACGGTGTATCGACCCTTTG	594
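As a quick consistency check on Table 1, the sketch below tests whether each primer pair is compatible with amplifying a complete CDS: the forward primer should begin with the ATG start codon, the reverse complement of the reverse primer should end with a stop codon, and the fragment size should be a multiple of three. The pairing of fragment sizes with genes follows the order in which they appear in the table, and the derived residue count assumes that the fragment corresponds exactly to the CDS; that count is an illustration and is not a value reported in the article.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Primer sequences and fragment sizes as listed in Table 1.
PRIMERS = {
    "nCBP": ("ATGGATTTCACAGCGGAGAAGAA", "CTACCCTCTCAACCAAGTGTTCC", 705),
    "eIF4E": ("ATGGTTGTGGAAGATTCACAAA", "TCATATCACGTATTTATTTTTAGCACCC", 693),
    "eIF(iso)4E": ("ATGGCAACAAGCGAGGAAG", "TTATACGGTGTATCGACCCTTTG", 594),
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def check_cds_primers(forward: str, reverse: str, size_bp: int) -> dict:
    """Sanity checks for a primer pair meant to span a full CDS."""
    return {
        "starts_with_ATG": forward.startswith("ATG"),
        # The reverse primer is written 5'-3' on the antisense strand, so its
        # reverse complement should end with the stop codon of the CDS.
        "ends_with_stop": reverse_complement(reverse)[-3:] in STOP_CODONS,
        "in_frame": size_bp % 3 == 0,
        # Codons minus the stop codon; assumes the fragment is exactly the CDS.
        "expected_residues": size_bp // 3 - 1,
    }


report = {gene: check_cds_primers(*pair) for gene, pair in PRIMERS.items()}
```

All three pairs pass the start, stop, and reading-frame checks under these assumptions, which is consistent with primers designed to amplify full-length coding sequences.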

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
