Article

A Genetic Study of Spillovers in the Bean Common Mosaic Subgroup of Potyviruses

by Mohammad Hajizadeh 1,*, Karima Ben Mansour 2,3 and Adrian J. Gibbs 4,*
1 Department of Plant Protection, Faculty of Agriculture, University of Kurdistan, Sanandaj 66177-15175, Iran
2 Ecology, Diagnostics and Genetic Resources of Agriculturally Important Viruses, Fungi and Phytoplasmas, Crop Research Institute, Drnovská 507, 161 06 Prague, Czech Republic
3 Department of Plant Protection, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 129, 165 00 Prague, Czech Republic
4 Emeritus Faculty, Australian National University, Canberra, ACT 2601, Australia
* Authors to whom correspondence should be addressed.
Viruses 2024, 16(9), 1351; https://doi.org/10.3390/v16091351
Submission received: 28 July 2024 / Revised: 16 August 2024 / Accepted: 20 August 2024 / Published: 23 August 2024
(This article belongs to the Special Issue Plant Virus Spillovers)

Abstract

Nine viruses of the bean common mosaic virus subgroup of potyviruses are major international crop pathogens, but their phylogenetically closest relatives from non-crop plants have mostly been found only in SE Asia and Oceania, which is thus likely to be their “centre of emergence”. We have compared over 700 of the complete genomic ORFs of the crop pandemic and the non-crop viruses in various ways. Only one-third of crop virus genomes are non-recombinant, but more than half the non-crop virus genomes are. Four of the viruses were from crops domesticated in the Old World (Africa to SE Asia), and the other five were from New World crops. There was a temporal signal in only three of the crop virus datasets, but it confirmed that the most recent common ancestors of all the crop viruses were before inter-continental marine trade started after 1492 CE, whereas all the crown clusters of the phylogenies are from after that date. The non-crop virus datasets are genetically more diverse than those of the crop viruses, and Tajima’s D analyses showed that their populations were contracting, and only one of the crop viruses had a significantly expanding population. dN/dS analyses showed that most of the genes and codons in all the viruses were under significant negative selection, and the few that were under significant positive selection were mostly in the PIPO-encoding region of the P3 protein, or the PIPO protein itself. Interestingly, more positively selected codons were found in non-crop than in crop viruses, and, as the hosts of the former were taxonomically more diverse than the latter, this may indicate that the positively selected codons are involved in host range determination; AlphaFold3 modelling was used to investigate this possibility.

1. Introduction

The appearance in the second half of 2019 of a major viral pandemic in the human population has recharged interest in how viruses “emerge” or “spill over” from the wild and cause pandemics. Some plant pandemics are devastating, and many are less obvious but nonetheless damaging [1]. However, the sources of most viral pandemics are usually unknown and often controversial [2]. The bean common mosaic virus (BCMV) subgroup of potyviruses is of interest as it contains many viruses of non-crop species but also many well-known crop pandemic viruses, which might have resulted from spillovers.
Viruses of the BCMV subgroup were first recognised more than a century ago [3] when a mosaic disease of Phaseolus vulgaris, the common bean, was first described in Russia by Iwanowski in 1899, then in the eastern U.S.A. by Stewart in 1917 [4], and in more detail by Pierce in 1934 [5]. The disease was shown to be seed-borne, sap-transmitted but with difficulty [6], and also some isolates were transmitted by several species of probing aphids [7]. Host range tests indicated that there were probably several different viruses causing the bean mosaic pandemic, as well as similar diseases of other cultivated legumes such as soybean, adzuki, etc. Serological tests [8] failed to distinguish between these viruses as, it was later found, the principal antigenic sites on the virions of potyviruses are poor immunogens and unstable. However, in 1959, Brandes showed that all viruses with rod-shaped or filamentous virions could be placed into usefully predictive groups by the length and morphology of their virions [9,10], and these criteria showed that most of the different viruses causing bean mosaic pandemics were placed in a group with potato virus Y, which provided the name “potyvirus” [11]. Finally, peptide analysis [12] and then gene sequencing [13,14,15] showed that many of the seed-borne potyviruses of common bean and other legumes formed several clearly defined species in a subgroup of the potyviruses, which was named after bean common mosaic virus (BCMV) by Dijkstra in 1992, who included viruses from both leguminous and non-leguminous host plants [16].
Gibbs et al. [17] analysed the coat protein genes of potyviruses isolated in Australia and found that 17 Australian potyviruses isolated from various crops were from 13 different potyvirus lineages and were closely related to other viruses in those lineages, whereas another 18 Australian potyviruses were found only in native plants; 14 of these were from the BCMV subgroup and were related more distantly to viruses of the BCMV subgroup isolated from uncultivated plants or indigenous crops in SE Asia. They concluded that most crop potyviruses are recent migrants to Australia, whereas the BCMV subgroup of potyviruses had probably first diversified in SE Asia and spread to Australia sometime in the past. Thus, the BCMV subgroup viruses present a rare, perhaps unique, opportunity to make genetic comparisons of several related crop viruses with their possible source populations, and this may provide new ideas on the features that permit or promote spillovers.
The genomes of all potyviruses [18] are single molecules of ss-RNA and have short terminal non-coding regions. Each genome mostly encodes a single polyprotein that, after translation, is hydrolysed by the three proteases it includes [19] to produce ten proteins. The first two proteins are the N-terminal P1 protein (a self-hydrolysing serine protease) and the HC-pro (a helper component and self-hydrolysing cysteine protease). The remaining polyprotein is hydrolysed at seven sites by NIa-pro (the eighth protein in genome order) to produce eight proteins: P3, 6K1, CI (the cytoplasmic inclusion protein), 6K2, NIa-VPg (the genome 5′-terminal capping protein), NIa-pro, NIb (the RNA-dependent RNA polymerase), and, at the 3′ end, CP (the coat protein). In the middle of the P3 genes of all potyviruses is a conserved motif of six adenine residues that produces transcriptional slippage [20] in a small proportion of transcriptions. The slipped transcripts are translated to produce the eleventh protein, called P3N-PIPO, which is shorter than P3 and has a novel C-terminal region. In viruses of the sweet potato feathery mottle virus subgroup, another similar overlapping gene of the P1 protein produces a twelfth protein [21,22]. A considerable body of research has aimed to determine the functions of the various proteins; however, it is clear that most act cooperatively in a complex biochemical life cycle [23,24,25,26,27].
The BCMV subgroup is now represented in the international GenBank database by over 800 full-length genomic sequences, and this allows for a range of genetic comparisons to be made. Several of these viruses were first found in crop plants and are represented in the GenBank database by more sequences than others, probably because crop pathogens are of more interest to plant pathologists than non-crop pathogens, so we used a count of the number of sequences in GenBank to sort the BCMV subgroup sequences into the “crop” and “non-crop” or “outgroup” viruses. We then compared the gene sequences of the crop viruses with those of the phylogenetically closest non-crop viruses in various ways, looking for differences that may have arisen during spillovers and/or adaptation to crops.

2. Materials and Methods

All the full-length genomes of viruses of the BCMV subgroup were downloaded from GenBank. To obtain these, the genomic sequences of the 22 virus isolates contributing to the BCMV group in Figure 3 of Ref. [28] were each used for a Discontiguous Mega BLAST search of the GenBank database, each collecting the nearest 250 sequences. The 5500 resulting sequences were pooled, the duplicates were removed, and around 800 unique sequences remained, of which 731 were complete, or nearly so. Their Accession numbers are listed in File S1.
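The pooling step described above (22 searches, each returning up to 250 hits, so at most 22 × 250 = 5500 records before duplicates are removed) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pool_unique_hits` is hypothetical, the accession numbers are invented placeholders rather than entries from File S1, and treating different version suffixes of one accession as duplicates is our assumption, not a rule stated in the text.

```python
from typing import Iterable

def pool_unique_hits(hit_lists: Iterable[Iterable[str]]) -> list[str]:
    """Pool accession lists from several searches and drop duplicates,
    keeping first-seen order.

    Assumption: version suffixes (e.g. '.1') and letter case are ignored,
    so 'AA000001.1' and 'aa000001' count as the same record.
    """
    seen: set[str] = set()
    unique: list[str] = []
    for hits in hit_lists:
        for accession in hits:
            key = accession.strip().split(".")[0].upper()
            if key and key not in seen:
                seen.add(key)
                unique.append(key)
    return unique

# Toy input: three searches with overlapping, made-up accession numbers.
toy_searches = [
    ["AA000001.1", "AA000002.1", "AA000003.1"],
    ["AA000002.1", "AA000004.1"],
    ["aa000001", "AA000005.2"],
]
pooled = pool_unique_hits(toy_searches)
print(len(pooled), pooled)

# Upper bound on the pooled set size for the search design in the text.
max_hits = 22 * 250
print(max_hits)
```

In practice the hit lists would come from the BLAST result tables; the completeness check that reduced the unique set to 731 sequences is a separate step not shown here.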
The 731 unique sequences were aligned by MAFFT using default parameters and trimmed using BioEdit 7.2.5 [29] to obtain the main open reading frames (ORFs) used for all analyses. ORFs were aligned using MAFFT [30] with default parameters and, after trimming and degapping, were aligned by PAL2NAL [31]. The aligned sequences were tested for the presence of phylogenetic anomalies probably resulting from recombination using all the options in RDP v.5.5 [32], with default parameters, namely that an anomaly was detected by five or more methods with an average chance probability of <10⁻⁵ [33,34]. Figure S2 is a histogram showing the vital statistics of the 11,001 nt (3667 codon) alignment of 731 sequences we analysed. The NJ tree option in ClustalX [35] was used to calculate, with default parameters, a phylogeny of the ORFs (Figure S1). All the sequence names were checked phylogenetically, and several sequences with unique names were found to be isolates of others, notably isolates of BCMV, WMV, or ZYMV, and were therefore grouped with them.
The best-fit substitution model for the ORFs was assessed using MEGA11 [36] and found to be GTR + γ4 + I, and LG + γ4 + I for their encoded amino acid sequences. Maximum likelihood (ML) phylogenies were calculated using PhyML3.0 or MEGA11. Phylogenies were dated using IQ-TREE 2.3.4-Windows [37]. The statistical support for nodes of ML trees was assessed using the SH option in PhyML [38]. Phylogenies were drawn using FigTree Version 1.4.4 (http://tree.bio.ed.ac.uk/software/FigTree/ (accessed on 30 November 2023)) and a commercial graphics package.
The ORFs of all 248 non-recombinant (n-rec) genomes were used to generate a ML phylogeny using MEGA11, and this was converted to a pairwise patristic distance matrix using PATRISTIC [39]. The matrix was interrogated in MS Excel to identify the 15 sequences with the smallest average patristic distance for each of the crop virus clusters. We call these clusters “outgroup” clusters as some included crop sequences but no more than two sequences were chosen for each set from each of the other crop virus clusters. PATRISTIC was also used for comparing the phylogenies calculated from alignments of nucleotide sequences and the amino acid sequences they encoded.
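The selection of outgroup sets described above was done by the authors in MS Excel; the same logic can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the function `pick_outgroup`, the dictionary layout of the distance matrix, and the toy distances and labels are all hypothetical. The sketch ranks candidates by mean patristic distance to the sequences of one crop virus cluster and allows at most two candidates from any other crop virus cluster.

```python
from collections import Counter

def pick_outgroup(dist, crop_ids, candidates, cluster_of, n=15, cap=2):
    """Return up to n candidates with the smallest mean patristic distance
    to the crop cluster, taking at most `cap` from any other crop cluster.

    dist       : dict mapping frozenset({id_a, id_b}) -> patristic distance
    cluster_of : dict mapping candidate id -> other crop cluster label;
                 ids missing from it are treated as non-crop viruses
    """
    def mean_dist(cand):
        return sum(dist[frozenset((cand, c))] for c in crop_ids) / len(crop_ids)

    chosen, used = [], Counter()
    for cand in sorted(candidates, key=mean_dist):
        label = cluster_of.get(cand)
        if label is not None:
            if used[label] >= cap:
                continue  # cap reached for this other crop cluster
            used[label] += 1
        chosen.append(cand)
        if len(chosen) == n:
            break
    return chosen

# Toy data: two crop sequences (c1, c2) and five candidates.
toy = {"a": (0.5, 0.7), "b": (0.9, 0.9),
       "x1": (0.2, 0.2), "x2": (0.3, 0.3), "x3": (0.4, 0.4)}
dist = {}
for cand, (d1, d2) in toy.items():
    dist[frozenset((cand, "c1"))] = d1
    dist[frozenset((cand, "c2"))] = d2

# x1, x2 and x3 all belong to one other crop cluster, labelled "X".
picked = pick_outgroup(dist, ["c1", "c2"], list(toy),
                       {"x1": "X", "x2": "X", "x3": "X"}, n=4, cap=2)
print(picked)
```

Here `x3` is skipped even though it is closer than `a` and `b`, because two sequences from cluster "X" have already been taken.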
The program DnaSP v.6.10.01 [40] was used to analyse the genetic differences between each of the crop virus populations and its 15 outgroup sequence populations (Table S1). Estimates were calculated for the average pairwise nucleotide diversity (π), number of segregating sites (S), mean non-synonymous substitutions per non-synonymous site (dN), mean synonymous substitutions per synonymous site (dS), and the ratio of non-synonymous nucleotide diversity to synonymous nucleotide diversity (dN/dS). It was concluded that genes were under positive, neutral, or negative selection when their dN/dS ratios were >1, =1, and <1, respectively [34]. In addition, Tajima’s D genetic tests of neutrality were used to determine whether the populations had a greater or smaller diversity than expected. The FUBAR (Fast Unconstrained Bayesian AppRoximation) method [41] implemented in the online Datamonkey server (https://www.datamonkey.org/ (accessed on 30 November 2023)) was used to find evidence of “significant pervasive selection pressures”, both positive and negative, on individual codons in the genes.
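To make the metrics concrete, the following Python sketch computes S, the mean number of pairwise differences (k), π, and Tajima’s D from a small alignment using the standard textbook formulae. It is not DnaSP’s implementation and makes simplifying assumptions that are ours: sequences are equal-length and free of gaps and ambiguity codes, and the function name `diversity_stats` and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt

def diversity_stats(seqs):
    """Return (S, k, pi, D) for equal-length, gap-free aligned sequences.

    S  : number of segregating (polymorphic) sites
    k  : mean number of nucleotide differences per sequence pair
    pi : k divided by alignment length (per-site nucleotide diversity)
    D  : Tajima's D (nan if there are no segregating sites)
    """
    n, length = len(seqs), len(seqs[0])
    if n < 4 or any(len(s) != length for s in seqs):
        raise ValueError("need at least 4 aligned sequences of equal length")

    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    pi = k / length
    if S == 0:
        return S, k, pi, float("nan")

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    D = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return S, k, pi, D

# Toy alignment: every variant occurs in only one sequence (rare alleles).
toy_alignment = ["AAAAAA", "AAAAAT", "AAAATA", "AAATAA"]
S, k, pi, D = diversity_stats(toy_alignment)
print(S, k, pi, D)
```

In the toy alignment every polymorphism is a singleton, so k falls below S/a1 and D is negative, the pattern associated with an excess of rare alleles.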
The AlphaFold3 server [42] was used to model the amino acid sequences of individual genes and combinations of the genes of the peanut mottle virus (PeMoV) and the dasheen mosaic virus (DashMV). The models obtained from AlphaFold3 were investigated using Visual Studio Code 1.90. The ProtParam tool of Expasy [43] was used to estimate the pI and hydropathicity of proteins.

3. Results

3.1. The Data

All the full-length genomic sequences of viruses of the BCMV subgroup were downloaded (February 2023) from GenBank, and 731 unique sequences were found (File S1 and Figure S1). Nine viruses (Figure 1) were found to be represented by the complete genomes of 15 or more isolates, which indicates that these viruses had attracted the attention of plant pathologists. They are the bean common mosaic necrosis virus (BCMNV), bean common mosaic virus (BCMV), cowpea aphid-borne mosaic virus (CpAbMV), dasheen mosaic virus (DashMV), East Asia passiflora virus (EAPV), peanut mottle virus (PeMoV), soybean mosaic virus (SbMV), watermelon mosaic virus (WMV), and zucchini yellow mosaic virus (ZYMV). We call these the “crop viruses”. The other 37 viruses were represented by the genomic sequences of eight or fewer isolates (see Figure 1 legend). These we call “non-crop” or “outgroup” viruses; many had been isolated from wild plants, although some came from regional specialist crops or indigenous medicinal plants, such as ginseng (Panax spp.) and crow-dipper (Pinellia ternata; a medicinal arum) (Table S1).
The countries from which the crop and non-crop isolates came, and the numbers of isolates from each of those countries, are shown in Figure 2. The non-crop isolates came from only 14 countries, and most of them (>80%) were from SE and E Asia and Australia, whereas the crop virus isolates came from 30 countries worldwide, with fewer than half (<50%) from that region.
A phylogeny of all 248 n-rec genomic ORFs was used to identify fifteen n-rec sequences from among the non-crop plants with the smallest average patristic distance to each of the seven crop viruses, excluding SbMV and WMV as these had fewer than nine n-rec sequences. We call these sets “outgroup” viruses as some included sequences from other crop virus clusters, but no more than two sequences were included from any other crop virus set. The Accession numbers, hosts, and provenances of these sequence sets, 232 ORFs in total, are listed in Table S1, and Figure 3 shows their phylogeny. It can be seen that PeMoV has a unique set of non-crop viruses closest to it, DashMV and ZYMV mostly share another set of non-crop viruses, and the other four crop viruses are closely related and share different combinations drawn from a single cluster of non-crop and crop viruses.

3.2. Recombinants

Genetic recombination is often stated to be an important source of genetic diversity in virus population evolution [44,45,46], and recombinants have been found to be common in the populations of many potyviruses. Recombination has been previously reported in populations of all the crop BCMV subgroup viruses [33,47,48,49,50,51,52]. Our RDP analysis of all 731 genomic ORFs of the BCMV subgroup found that only 248 (34%) of the sequences showed no significant evidence of recombination (see Section 2).
If a particular recombination event had been a key trigger establishing a spillover, then one would expect all isolates resulting from that spillover to show evidence of that event, though subsequent recombination events in some isolates might obscure the first event. The RDP analysis showed that most WMV isolates and most SbMV isolates were recombinants; therefore, it is possible that the spillovers of these two viruses were triggered by recombination. These two viruses are closely related crown clusters of a BCMV sub-sub-lineage with isolates of uraria mosaic virus (LC477217) and wisteria vein mosaic virus (NC_007216) forming the basal clusters. The nearest major “parent” of WMV (Figure 4) is a Korean isolate of EAPV (LC656468), and its nearest minor parent is a Chinese isolate of BCMV (MW834586) from poplar (Populus alba var. pyramidalis); the recombinants were produced by Event 60, identified in the RDP analysis of all the ORF sequences (seven methods RGBMCS3, mean p < 10⁻²¹), and were found in 121 isolates. The two most basal nodes of WMV branch from isolates (KF274031, KC845322) obtained from the Chinese hosts Ailanthus altissima (‘Tree of Heaven’; Sapindales) and Atractylodes macrocephala (a medicinal plant; Asterales). The next most basal node branches from isolates (MK217416 and KX926428) from Panax ginseng (a medicinal plant; Apiales) and Alcea rosea (hollyhock; Malvales). The third and subsequent nodes subtend clusters of WMV isolates, mostly from cucurbits grown worldwide.
The basal events of the SbMV phylogeny (Figure 4) are superficially similar to those of WMV. The basal lineage of SbMV consists of sequences from isolates found in Pinellia spp. (a medicinal arum); these are recombinants closest to DashMV and are now also found in Typhonium giganteum, the Giant Voodoo Lily, a Chinese ornamental arum. The main SbMV population is of two lineages arising from RDP Event 9, found in 21 sequences, and from RDP Event 90. However, there is no SH statistical support distinguishing whether the main SbMV lineages arose before or after the divergence within the Pinellia cluster. When the recombinant regions are removed and the n-rec remnants are used in a BLASTn search of GenBank, sequences of WMV (LC787269, MT780537) and Calla lily latent virus (EF105298/9) are found to be closest.
None of the other crop viruses show evidence that recombination was associated with the spillover events; there are many recombinants in the BCMV population, but all are sub-lineage specific.

3.3. Dating

Analyses using the TempEst v.1.5.3 and IQ-TREE v.2.3.4 programs found “temporal signals” only in the EAPV, PeMoV, and ZYMV n-rec ORF individual datasets, and not in the alignment containing all 232 n-rec sequences. The IQ-TREE analyses found that EAPV sequences had a “most recent common ancestor” (MRCA) date of 8717 BCE with its major divergences occurring after 1592 CE; likewise, the PeMoV population had a single basal divergence dated 173 CE but with its major divergence occurring after 1894 CE, and the ZYMV population had an MRCA of 683 CE, with all its crown clusters formed after 1500 CE.
Comparing these datings, and perhaps extrapolating them in detail to the other four crop viruses, was unlikely to be useful as the histories of some of the viruses obviously involved significant host changes with concomitant changes in selection. However, in all the phylogenies, the crop virus MRCAs were older than 1500 CE (i.e., they predate post-Columbian world marine trade), and all of the nodes subtending their major crown clusters were younger than that date (see Section 4).

3.4. Population Genetics

The sequences of the ORFs of seven n-rec crop sequence populations and the corresponding outgroup virus sequences were compared using basic population genetic metrics calculated using the DnaSP program suite, and the results are in Table 1. They could not be calculated for SbMV and WMV, as no n-rec sequences of WMV were found—and only three of SbMV—and a minimum of six sequences are required for calculating most population genetic metrics.
The nucleotide diversities of the different viral datasets showed (Table 1), as expected, that the crop virus sequences were all less variable than their respective outgroup ones; mean crop virus sequence numerically weighted diversity 0.102 ± 0.100 compared with mean outgroup virus sequence diversity 0.283 ± 0.035. PeMoV had the least diverse set of crop sequences, with the smallest nucleotide diversity (π = 0.02) and number of segregating sites (S = 1231), but had the most diverse set of outgroup sequences (π = 0.379, and S = 6201), which reflects the fact that its branches are basal in the phylogeny and its mostly unique group of outgroup sequences were therefore on long branches (Figure 3).
All the outgroup populations gave large positive Tajima’s D estimates that are statistically significant (Table 1), which is evidence of “balancing selection and/or significant population contraction” as “rare alleles are scarce” [53]. Although Tajima’s D estimates for all the crop sequences were more negative than those of the outgroup sequence sets, unexpectedly, only PeMoV had a statistically significant negative Tajima’s D estimate, indicative of a “population expansion after a bottleneck, or a recent selective sweep” resulting in “an excess of rare alleles”, namely a spillover. Only 15 full-length n-rec genomic sequences of PeMoV were in GenBank when this project started in 2023, and they formed a single crown cluster, although the sequences came from isolates collected in five countries (Brazil, China, Iran, Korea, and Turkey). However, although the populations of the other crop viruses had several crown clusters mixed with basal outliers, samples of isolated crown clusters of those viruses, each of nine to fifteen sequences, did not give statistically significant negative Tajima’s D values; these crown clusters were mostly of the ORFs of isolates collected from single countries. Estimates for individual genes found that the P1, HC-pro, and P3 genes had larger Tajima’s D values than the others, suggesting that they are more responsive to selection changes, and the PIPO gene had consistently smaller Tajima’s D values than the other genes, suggesting that it had been less affected by selection.
The dN/dS estimates showed that all the sequences from crop isolates, except those of DashMV, were under greater negative selection (i.e., smaller dN/dS) than those from the corresponding outgroup sequences, probably reflecting increased purifying selection. Furthermore, dN/dS scans (Figure 5 for BCMV crop) showed that negatively selected codons are present throughout the genome, whereas positively selected codons are clustered in the P3 gene, which encodes two proteins, P3 and P3N-PIPO, reported to function as movement proteins [23,54,55].
As described above, the potyvirus P3 gene (codons 1141 to 1504) has a motif, 5′-GAAAAAA-3′, in its centre, which causes slippage when being transcribed [20], and as a result, a small proportion of the progeny transcripts [56] of the P3 gene have an extra “A” in the slippage region and translate the C-terminal region of the P3 gene in its +1 reading frame. Thus, the resulting protein, P3N-PIPO, has a P3 N-terminal region and a different C-terminal region, which has been named PIPO (“Pretty Important Protein Overlapping”). The P3 gene is 364 codons long, and the slippage motif is 165 codons from its 5′-terminus; however, the PIPO region is only 71 to 94 codons long in different viruses, and thus, the P3N-PIPO protein is smaller than the P3 protein.
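The effect of the extra “A” on the downstream codons can be illustrated with a toy sequence. The Python sketch below is only a schematic: the 24 nt sequence is invented (it is not a potyvirus sequence), the helper names `slipped_transcript` and `codons` are hypothetical, and the code simply inserts one adenine after the first occurrence of the motif and splits both versions into codons to show that every codon downstream of the motif changes.

```python
MOTIF = "GAAAAAA"  # slippage motif quoted in the text

def slipped_transcript(orf: str, motif: str = MOTIF) -> str:
    """Return the sequence with one extra 'A' inserted after the first motif."""
    start = orf.find(motif)
    if start < 0:
        raise ValueError("slippage motif not found")
    end = start + len(motif)
    return orf[:end] + "A" + orf[end:]

def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, dropping any trailing 1-2 nt."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

# Invented toy ORF: two codons, the motif, then four downstream codons.
toy_orf = "ATGGCTGAAAAAACCGGTTTACGT"
slipped = slipped_transcript(toy_orf)

print(codons(toy_orf))   # original reading frame
print(codons(slipped))   # frame after the inserted 'A'
```

Upstream of the motif the two codon lists are identical, which mirrors the shared N-terminal region of P3 and P3N-PIPO; downstream they differ at every position, which mirrors the different C-terminal regions.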
We therefore further investigated the number and position of individual positively and negatively selected codons in the ORFs and P3N-PIPO regions of the different crop and outgroup datasets using FUBAR (Fast Unconstrained Bayesian AppRoximation for inferring selection) (Table S2 and Table S3), and Figure 6 summarises the results of those analyses.
The FUBAR analyses (Figure 6) agree with the dN/dS results in finding that most codons were under negative selection in all sequence comparisons. The crop and outgroup ORFs had similar numbers of positively selected codons (1.43 codons/genome (±1.39) in the crop virus ORFs compared with 1.86 (±0.90) in outgroup virus ORFs); however, the distribution of those sites throughout the ORFs was not random, and 80% were in the P3 gene (codons 1141–1504), in agreement with the dN/dS scan results (Figure 5). The P3 codons that were positively selected most frequently (Figure 6) were codons 1316 and 1326 (five sets each) and codon 1336 (four sets). All but two of the positively selected codons were in the PIPO-encoding region of P3. Furthermore, in the P3N-PIPO sequences, codons 1366 (five comparisons) and 1375 (four comparisons) were the codons positively selected most frequently. Only three pairs of positively selected codons overlapped in the PIPO region: codon 1340 in P3 with codon 1339 in P3N-PIPO, codon 1340 in P3 with codon 1340 in P3N-PIPO, and codon 1347 in P3 with codon 1346 in P3N-PIPO.
Finally, to better understand the results of the FUBAR analyses, we modelled the potyvirus proteins involved using the online AlphaFold3 server [42]. We used proteins of two of the viruses of the BCMV subgroup: PeMoV (sequences KF977830 and MZ442685) and DashMV (KT026108 and KY242358). These were chosen because patristic distance comparisons of the nucleotide and encoded amino acid sequences of all 232 n-rec sequences showed that those of PeMoV and DashMV were the most different, and hence most likely to separate functional and structural similarities and differences from biological variation.
AlphaFold3 models of the P3 and P3N-PIPO amino acid sequences (Figure 7) show what complex and interesting molecules they are. The P3 and the P3N-PIPO proteins have identical N-terminal regions. These consist of a series of eight short alpha-helices folded to form an irregular polyhedral head. The N-terminal regions of the P3 C-terminal and PIPO regions form “brims” of longer helices around the heads of both proteins. The brim in P3 includes the positively selected codons (1316, 1326, and 1336). The C-terminal regions are significantly different in structure and composition; the first part of the P3 protein is a long helix of 106 amino acids with a C-terminus of two short helices that fold to lie across the long helix, whereas the PIPO protein, which is shorter, has a helix of 53 amino acids with a bend after the 35th, and a C-terminal region of 20 amino acids that are “intrinsically disordered”.
The isoelectric points (pIs) estimated by ProtParam of the P3N head proteins are, on average, 6.4, but the pIs of the brim helices, the long helix of P3, and the bent helices of PIPO are from 9.5 to 10.3; however, the P3 C-terminal short helices have pIs of 4.45.
The hydropathicity indices of the long P3 helix and folded PIPO helices are positive, indicating that they are hydrophobic, whereas all other parts of P3 and P3N-PIPO are negative and hence likely to be hydrophilic. The PIPO protein is compositionally less variable than the regions of the P3 protein, and all three regions (i.e., the helices and the ID C-terminus) have pIs around 10.
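A hydropathicity index of the kind reported by ProtParam (the grand average of hydropathicity, GRAVY) is the mean of the Kyte-Doolittle hydropathy values of the residues in a sequence. The following Python sketch shows the calculation; the function name `gravy` is hypothetical and the peptides are invented toy examples, not regions of P3 or P3N-PIPO.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(protein: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a protein sequence.

    Positive values indicate an overall hydrophobic sequence and
    negative values an overall hydrophilic one. Non-standard residue
    codes raise a KeyError rather than being silently skipped.
    """
    residues = protein.upper()
    if not residues:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)

# Invented toy peptides.
hydrophobic_peptide = "LIVLIVAAF"
hydrophilic_peptide = "KDEKDERNQ"
print(round(gravy(hydrophobic_peptide), 2))
print(round(gravy(hydrophilic_peptide), 2))
```

Applied to separate regions of a protein (for example a helix versus a terminal tail), the sign of the value gives the hydrophobic or hydrophilic classification used in the text.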
The elongated shape of the P3 proteins is intriguing. The fact that the long helix providing that shape is of 106 amino acids in all 232 n-rec sequences, and all indels in the P3 gene are confined to regions of the N and C-termini that have no effect on the length of the protein, suggests that the length has functional significance. The acidic composition of the C-terminus compared with the helix and head to which it is attached again suggests functional significance. The length of alpha-helical peptides is determined by the backbone: 3.6 residues/turn with each turn adding 0.54 nm along the axis. Therefore, the P3 long helix of 106 amino acids is 15.9 nm long, and as it constitutes about 80% of the length of the protein, the long axis of the P3 protein is about 20.5 nm and its width maximally 5 nm. This suggests that the P3 protein would not pass from cell to cell through unaltered plasmodesmata, although this is a complex issue.
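The length estimate above follows directly from the helix geometry quoted in the text (3.6 residues per turn, 0.54 nm per turn). The short Python sketch below reproduces the arithmetic; the function name `helix_length_nm` is hypothetical, and the 80% figure is the approximate proportion given in the text, so the whole-protein value is only indicative.

```python
RESIDUES_PER_TURN = 3.6   # alpha-helix geometry quoted in the text
RISE_PER_TURN_NM = 0.54   # axial rise per turn, in nanometres

def helix_length_nm(n_residues: int) -> float:
    """Axial length of an ideal alpha-helix with n_residues residues."""
    return n_residues / RESIDUES_PER_TURN * RISE_PER_TURN_NM

long_helix_nm = helix_length_nm(106)

# The helix is said to make up about 80% of the protein's length.
approx_protein_axis_nm = long_helix_nm / 0.80

print(round(long_helix_nm, 1))          # helix length in nm
print(round(approx_protein_axis_nm, 1)) # rough long-axis estimate in nm
```

With the rounded 80% proportion the long axis comes out at roughly 20 nm, in line with the estimate of about 20.5 nm given in the text; the small difference reflects the rounding of that proportion.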
AlphaFold3 will also co-model (i.e., model and dock) two or more proteins. We therefore made a search of all combinations of the P3 and P3N-PIPO proteins of PeMoV (KF977830) with each of its other nine proteins and found the P3 and CI protein combination gave the largest “interface predicted template modelling score” (ipTM) of 0.6, with ipTMs of >0.5 being considered a “true structure”, and the other combinations giving an average ipTM of 0.15. To check whether this was biologically informative, we co-modelled all combinations of the P3 and CI molecules of two isolates of PeMoV and DashMV: KF977830 with KT026108 and MZ442685 with KY242358. The models gave varying pTMs and ipTMs around the values above, but those with the largest ipTM consistently showed the brim region (codons 1279–1286) of the head of the P3 protein closest to a region of the CI protein where there was a helix (codons 1814–1825) and a short anti-parallel beta-strand (codons 1830–1834). Thus, none of these structurally defined regions involved the codons found by FUBAR to be most frequently positively selected.

4. Discussion

Our study, using full-length genomes of a large number of BCMV subgroup viruses, has confirmed an earlier analysis of their CP gene sequences [17], which concluded that the BCMV subgroup of potyviruses probably first diverged in uncultivated plants (or local specialist crops) in SE and East Asia and spread as far as Australia, and more recently, it also produced some major crop pathogenic viruses that have spread worldwide. Here, we used phylogenetic and population genetic methods to compare the crop viruses with the most closely phylogenetically related non-crop or outgroup viruses looking for differences that might have occurred during each of nine spillovers from a non-crop virus to a major crop pathogen.
We have identified nine viruses of the BCMV subgroup of potyviruses as major pandemic viruses. Three of them (DashMV, CpAbMV, and SbMV) were isolated from dasheen or taro, cowpea, and soybean, which are among the earliest plants to be domesticated and have been grown for many thousands of years in SE Asia. Cowpea (Vigna unguiculata) was domesticated in sub-Saharan Africa before 2500 BCE and by 400 BCE was long established in all of its modern major production regions of the Old World, including Africa, the Mediterranean, India, and Southeast Asia [58]. Taro (Colocasia esculenta) is a member of the Araceae and probably the earliest domesticated crop. Archaeological studies indicate that the crop has been cultivated for at least 28,000 years [59] over a vast area spanning Africa and India to South China, Melanesia, and northern Australia [60]. Taro is polymorphic and the shape of the corm distinguishes dasheen (Colocasia esculenta var. esculenta) and eddoe (Colocasia esculenta var. antiquorum) types [61], and genetic analyses indicate that taro may have been domesticated on multiple occasions. Soybean (Soja max) was domesticated in East Asia more than 3000 years ago from wild Glycine soja and other Glycine species found in the eastern regions of the Old World and Oceania. Interestingly, the phylogenies of n-rec DashMV and CpAbMV (Figure S1) are distinct from those of the other pandemic viruses in the subgroup in that they consist of a few long-branch lineages; there were no n-rec SbMV sequences in our data. A fourth virus, WMV, probably moved from various uncultivated plants and minor crops and became a crop pathogen in NE Asia when watermelon (Citrullus lanatus) became widely grown there around 1000 years ago [33]. The other five viruses were isolated from crops domesticated in the Americas [62]: two from common bean (Phaseolus vulgaris) and one each from passion fruit (Passiflora edulis), peanut (Arachis hypogaea), and zucchini (Cucurbita pepo).
Thus, several of the nine pandemic viruses of the BCMV subgroup were generated from “new encounters” of the sort described and discussed by Refs. [1,63], and it is likely that the spread of the host crop in worldwide trade and to new agricultural areas is the main factor in them having become major crop pathogens. This was confirmed in our limited dating analyses, which showed that although many of the crop viruses had “Most Recent Common Ancestors” back to 8700 BCE (EAPV), none of the nodes subtending their major crown clusters were older than 1500 CE, which was when the post-Columbian era of world marine trade started [64,65]. However, there is also the possibility that this dating has been influenced by the “time dependent rate phenomenon” [66].
It is the carriage by seafarers of infected plant material, especially seeds, that most likely explains the presence of BCMV subgroup viruses on at least three oceanic islands (Figure 2) and the “genetic connectivity” of ZYMV isolates in SE Asia and NW Australia [67,68]. These viruses are less likely to have been spread long distances over oceans by aphids as potyviruses do not persist in flying or starving aphids. “The plant virus transmissions database” [69] records that seven of the nine crop viruses we have studied are “seed-borne”, but that DashMV and WMV are “not seed-borne”. That may be correct for DashMV, as its host is vegetatively propagated, but it is questionable for WMV given its worldwide distribution; whether or not a virus is recorded as seed-borne depends greatly on the number of seeds that were tested.
Genetic recombination may represent a significant evolutionary force for plant RNA viruses [70,71,72]; for example, the necrogenic lineages of potato virus Y are mostly recombinants of non-necrogenic lineages [73]. However, there is no evidence that recombination generates new virus species in crops; indeed, Moreno et al. [74] studied a WMV population in melon with 7% recombinants and found “strong selection against isolates with recombinant proteins, even when originated from closely related strains.” Desbiez and Lecoq [75] reported that “the P1 of WMV was 135 amino-acids longer than that of SMV, and the N-terminal half of the P1 showed no relation to SMV but was 85% identical to BCMV. This suggests that WMV has emerged through an ancestral recombination event”. However, we find that this is not supported by the more detailed recombination events now revealed; the large proportion of recombinants found in WMV and SbMV populations may not only indicate the sharing of infected seeds but also reflect the larger number of genomes that have been compared, as this makes it more likely for recombinants to be found. None of the other seven BCMV subgroup viruses showed evidence of recombination being involved in their emergence.
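The sampling argument above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sequenced genome is independently a recombinant with a fixed probability p; this is a simplification and is not the RDP5 analysis used in this study. The function name and the choice of p = 0.07 (borrowed from the WMV melon population figure quoted above) are hypothetical.

```python
def prob_at_least_one_recombinant(p: float, n: int) -> float:
    """Probability that a sample of n genomes contains at least one recombinant.

    Assumption (illustrative only): each genome is independently a
    recombinant with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Complement of "none of the n genomes is a recombinant".
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Hypothetical sample sizes; p = 0.07 is used only as an example rate.
    for n in (5, 10, 50, 200):
        print(n, round(prob_at_least_one_recombinant(0.07, n), 3))
```

Under this assumption the chance of finding at least one recombinant rises steadily with the number of genomes compared, so a heavily sequenced virus would be expected to yield recombinants even when they are individually rare.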
Like many studies of potyviruses already published, we have found evidence of strong negative selection in all nine BCMV subgroup viruses. Our arbitrary method for distinguishing crop and non-crop viruses seems to be effective as the two groups gave different, but internally consistent, results; for example, the non-crop viruses were consistently under more negative selection than the crop viruses, and their populations were contracting. The finding that most of the positively selected codons of these viruses were concentrated in their P3 proteins, and the P3N-PIPO proteins derived from them, was notable, especially as more were found in the analyses of the non-crop viruses than the crop viruses. This suggests that the selection identified by FUBAR is linked in some way with host preferences as the hosts of the outgroup viruses were more taxonomically diverse than the crop virus hosts (Table S2); the crop viruses were isolated from plants of nine families (Araceae, Berberidaceae, Fabaceae, Moraceae, Orchidaceae, Passifloraceae, and Pedaliaceae), and the outgroup viruses from plants of 13 families (Amaranthaceae, Apocynaceae, Asparagaceae, Asphodelaceae, Basellaceae, Begoniaceae, Fabaceae, Hyacinthaceae, Iridaceae, Liliaceae, Melanthiaceae, Orchidaceae, and Passifloraceae), but with only three families shared (Fabaceae, Orchidaceae, and Passifloraceae). However, no correlations were found between the phylogenies of the nucleotide or encoded amino acid sequences of the P3 or P3N-PIPO genes, or the positively selected codons found in them, and the host differences (i.e., monocotyledonous versus dicotyledonous hosts; asterid versus rosid versus caryophyllid hosts [76]).
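The abstract attributes the population-trend inferences of this study to Tajima's D. The sketch below is not the authors' pipeline; it is a minimal implementation of the standard Tajima (1989) statistic for a small gap-free alignment, with a hypothetical function name and toy sequences invented for illustration. Under the conventional reading, negative values reflect an excess of rare variants (often taken as population expansion or purifying selection) and positive values an excess of intermediate-frequency variants (often taken as population contraction or balancing selection).

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences.

    Assumptions (illustrative): no gaps or ambiguity codes, and at least
    four sequences (the variance term is zero for fewer).
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) alignment columns.
    S = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if S == 0:
        return float("nan")  # D is undefined without polymorphism

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimates, scaled by its
    # estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


if __name__ == "__main__":
    rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]   # singletons only
    common_variants = ["AA", "AA", "TT", "TT"]         # variants at 50%
    print(tajimas_d(rare_variants), tajimas_d(common_variants))
```

The two toy alignments are chosen so that the sign of D differs: one contains only singleton variants, the other only variants at intermediate frequency. Real analyses would also need a significance test, which this sketch omits.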
AlphaFold3 models of the P3 and P3N-PIPO amino acid sequences show what complex and interesting molecules they are. The elongated shape of the P3 proteins is intriguing, and the fact that the long helix providing that shape is of 106 amino acids in all 232 n-rec sequences despite large sequence differences suggests that the length has functional significance and the protein may be membrane-spanning. As potyviruses cause such significant damage in many crops worldwide, much international effort is currently being made to understand the molecular biology of their replication as, for example, a knowledge of how they spread from cell to cell within the plant might provide methods to control that process. However, there are significant differences of opinion about which of the 11 potyvirus proteins are involved [23,77,78,79]. The acidic composition of the C-terminus compared with the helix and head to which it is attached, as well as the size and chemistry of the various regions of P3, again suggests functional significance.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v16091351/s1, File S1: GenBank Accession numbers of the sequences compared; Figure S1: Branching pattern of the neighbor-joining phylogeny of the principal genomic ORFs of 731 BCMV subgroup viruses. The clusters of each of the nine major “crop” viruses are shown; Figure S2: Vital statistics of the 731 sequence alignment (including indels); Table S1: Accession numbers, hosts and provenances of sequences used in popgen analyses; Table S2: Results of FUBAR searches for negatively and positively selected codons in the main ORFs of seven crop viruses and their corresponding outgroup viruses; Table S3: Results of FUBAR searches for negatively and positively selected codons in the P3 genes of seven crop viruses and their corresponding outgroup viruses.

Author Contributions

Conceptualisation, A.J.G. and M.H.; methodology, A.J.G., M.H. and K.B.M.; software, A.J.G., M.H. and K.B.M.; formal analysis, A.J.G., M.H. and K.B.M.; writing—original draft preparation, A.J.G.; writing—review and editing, M.H. and K.B.M.; project administration, A.J.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data generated or analysed during this study are included in this published article and its Supplementary Materials. Further details are available from the corresponding authors upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Jones, R.A.C. Global Plant Virus Disease Pandemics and Epidemics. Plants 2021, 10, 233. [Google Scholar] [CrossRef]
  2. Nguyen, H.D.; Tomitaka, Y.; Ho, S.Y.W.; Duchêne, S.; Vetten, H.J.; Lesemann, D.; Walsh, J.A.; Gibbs, A.J.; Ohshima, K. Turnip Mosaic Potyvirus Probably First Spread to Eurasian Brassica Crops from Wild Orchids about 1000 Years Ago. PLoS ONE 2013, 8, e55336. [Google Scholar] [CrossRef] [PubMed]
  3. Smith, K. A Textbook of Plant Virus Diseases, 2nd ed.; J&A Churchill: London, UK, 1957. [Google Scholar]
  4. Stewart, V.B.; Reddick, D. Bean Mosaic. Phytopathology 1917, 7, 61. [Google Scholar]
  5. Pierce, W.H. Viruses of the Bean. Phytopathology 1934, 24, 87–115. [Google Scholar]
  6. Pierce, W.H.; Hungerford, C.W. Symptomatology, Transmission, Infection and Control of Bean Mosaic in Idaho; Research Bulletin 7; Agricultural Experiment Station of the University of Idaho: Moscow, ID, USA, 1929. [Google Scholar]
  7. Zaumeyer, W.J.; Kearns, C.W. The Relation of Aphids to the Transmission of Bean Mosaic. Phytopathology 1936, 26, 614–629. [Google Scholar]
  8. Beemster, A.B.R.; van der Want, J.P.H. Serological Investigations of the Phaseolus Viruses 1 and 2. Antonie Leeuwenhoek J. Microbiol. Serol. 1951, 17, 285–296. [Google Scholar] [CrossRef]
  9. Brandes, J.; Wetter, C. Classification of Elongated Plant Viruses on the Basis of Particle Morphology. Virology 1959, 8, 99–115. [Google Scholar] [CrossRef] [PubMed]
  10. Brandes, J.; Quantz, L. Elektronenmikroskopische Untersuchungen Des Weißkleevirus Und Des Steinkleevirus. Arch. Mikrobiol. 1957, 26, 369–372. [Google Scholar] [CrossRef]
  11. Harrison, B.D.; Finch, J.T.; Gibbs, A.J.; Hollings, M.; Shepherd, R.J.; Valenta, V.; Wetter, C. Sixteen Groups of Plant Viruses. Virology 1971, 45, 356–363. [Google Scholar] [CrossRef]
  12. McKern, N.M.; Ward, C.W.; Shukla, D.D. Strains of Bean Common Mosaic Virus Consist of at Least Two Distinct Potyviruses; Archives of Virology (ARCHIVES SUPPL.); Springer: Berlin/Heidelberg, Germany, 1992; Volume 5, pp. 407–414. [Google Scholar]
  13. McKern, N.M.; Mink, G.I.; Barnett, O.W.; Mishra, A.; Whittaker, L.A.; Silbernagel, M.J.; Ward, C.W.; Shukla, D.D. Isolates of Bean Common Mosaic Virus Comprising Two Distinct Potyviruses. Phytopathology 1992, 82, 923–929. [Google Scholar] [CrossRef]
  14. Vetten, H.J.; Lesemann, D.E.; Maiss, E. Serotype A and B Strains of Bean Common Mosaic Virus Are Two Distinct Potyviruses; Archives of Virology (ARCHIVES SUPPL.); Springer: Berlin/Heidelberg, Germany, 1992; Volume 5, pp. 415–431. [Google Scholar]
  15. Khan, J.A.; Lohuis, D.; Goldbach, R.; Dijkstra, J. Sequence Data to Settle the Taxonomic Position of Bean Common Mosaic Virus and Blackeye Cowpea Mosaic Virus Isolates. J. Gen. Virol. 1993, 74, 2243–2249. [Google Scholar] [CrossRef] [PubMed]
  16. Dijkstra, J.; Khan, J.A. A Proposal for a Bean Common Mosaic Subgroup of Potyviruses; Archives of Virology (ARCHIVES SUPPL.); Springer: Berlin/Heidelberg, Germany, 1992; Volume 5, pp. 389–395. [Google Scholar]
  17. Gibbs, A.J.; Trueman, J.W.H.; Gibbs, M.J. The Bean Common Mosaic Virus Lineage of Potyviruses: Where Did It Arise and When? Arch. Virol. 2008, 153, 2177–2187. [Google Scholar] [CrossRef]
  18. Inoue-Nagata, A.K.; Jordan, R.; Kreuze, J.; Li, F.; López-Moya, J.J.; Mäkinen, K.; Ohshima, K.; Wylie, S.J.; Siddell, S.G.; Lefkowitz, E.J.; et al. ICTV Virus Taxonomy Profile: Potyviridae 2022. J. Gen. Virol. 2022, 103, 001738. [Google Scholar] [CrossRef] [PubMed]
  19. Revers, F.; García, J.A. Chapter Three—Molecular Biology of Potyviruses. In Advances in Virus Research; Elsevier: Amsterdam, The Netherlands, 2015; Volume 92, pp. 101–199. [Google Scholar]
  20. Chung, B.Y.W.; Miller, W.A.; Atkins, J.F.; Firth, A.E. An Overlapping Essential Gene in the Potyviridae. Proc. Natl. Acad. Sci. USA 2008, 105, 5897–5902. [Google Scholar] [CrossRef] [PubMed]
  21. Clark, C.A.; Davis, J.A.; Abad, J.A.; Cuellar, W.J.; Fuentes, S.; Kreuze, J.F.; Gibson, R.W.; Mukasa, S.B.; Tugume, A.K.; Tairo, F.D.; et al. Sweetpotato Viruses: 15 Years of Progress on Understanding and Managing Complex Diseases. Plant Dis. 2012, 96, 168–185. [Google Scholar] [CrossRef]
  22. Li, F.; Xu, D.; Abad, J.; Li, R. Phylogenetic Relationships of Closely Related Potyviruses Infecting Sweet Potato Determined by Genomic Characterization of Sweet Potato Virus G and Sweet Potato Virus 2. Virus Genes 2012, 45, 118–125. [Google Scholar] [CrossRef]
  23. Chai, M.; Wu, X.; Liu, J.; Fang, Y.; Luan, Y.; Cui, X.; Zhou, X.; Wang, A.; Cheng, X. P3N-PIPO Interacts with P3 via the Shared N-Terminal Domain to Recruit Viral Replication Vesicles for Cell-to-Cell Movement. J. Virol. 2020, 94, 1110–1128. [Google Scholar] [CrossRef]
  24. Mingot, A.; Valli, A.; Rodamilans, B.; San León, D.; Baulcombe, D.; García, J.A.; López-Moya, J.J. The P1N-PISPO Trans-Frame Gene of Sweet Potato Feathery Mottle Potyvirus Is Produced during Virus Infection and Functions as an RNA Silencing Suppressor. J. Virol. 2016, 90, 3543–3557. [Google Scholar] [CrossRef]
  25. Untiveros, M.; Olspert, A.; Artola, K.; Firth, A.E.; Kreuze, J.F.; Valkonen, J.P.T. A Novel Sweet Potato Potyvirus Open Reading Frame (ORF) Is Expressed via Polymerase Slippage and Suppresses RNA Silencing. Mol. Plant Pathol. 2016, 17, 1111–1123. [Google Scholar] [CrossRef] [PubMed]
  26. Vijayapalani, P.; Maeshima, M.; Nagasaki-Takekuchi, N.; Miller, W.A. Interaction of the Trans-Frame Potyvirus Protein P3N-PIPO with Host Protein PCaP1 Facilitates Potyvirus Movement. PLoS Pathog. 2012, 8, e1002639. [Google Scholar] [CrossRef] [PubMed]
  27. Wen, R.H.; Hajimorad, M.R. Mutational Analysis of the Putative Pipo of Soybean Mosaic Virus Suggests Disruption of PIPO Protein Impedes Movement. Virology 2010, 400, 1–7. [Google Scholar] [CrossRef] [PubMed]
  28. Gibbs, A.J.; Hajizadeh, M.; Ohshima, K.; Jones, R.A.C. The Potyviruses: An Evolutionary Synthesis Is Emerging. Viruses 2020, 12, 132. [Google Scholar] [CrossRef]
  29. Hall, T.A. BioEdit: A User-Friendly Biological Sequence Alignment Editor and Analysis Program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 1999, 41, 95–98. [Google Scholar]
  30. Katoh, K.; Rozewicki, J.; Yamada, K.D. MAFFT Online Service: Multiple Sequence Alignment, Interactive Sequence Choice and Visualization. Brief. Bioinform. 2018, 20, 1160–1166. [Google Scholar] [CrossRef] [PubMed]
  31. Suyama, M.; Torrents, D.; Bork, P. PAL2NAL: Robust Conversion of Protein Sequence Alignments into the Corresponding Codon Alignments. Nucleic Acids Res. 2006, 34, 609–612. [Google Scholar] [CrossRef] [PubMed]
  32. Martin, D.P.; Varsani, A.; Roumagnac, P.; Botha, G.; Maslamoney, S.; Schwab, T.; Kelz, Z.; Kumar, V.; Murrell, B. RDP5: A Computer Program for Analyzing Recombination in, and Removing Signals of Recombination from, Nucleotide Sequence Datasets. Virus Evol. 2020, 7, veaa087. [Google Scholar] [CrossRef]
  33. Ben Mansour, K.; Gibbs, A.J.; Komínková, M.; Komínek, P.; Brožová, J.; Kazda, J.; Zouhar, M.; Ryšánek, P. Watermelon Mosaic Virus in the Czech Republic, Its Recent and Historical Origins. Plant Pathol. 2023, 72, 1528–1538. [Google Scholar] [CrossRef]
  34. Shokri, S.; Shujaei, K.; Gibbs, A.J.; Hajizadeh, M. Evolution and Biogeography of Apple Stem Grooving Virus. Virol. J. 2023, 20, 105. [Google Scholar] [CrossRef]
  35. Jeanmougin, F.; Thompson, J.D.; Gouy, M.; Higgins, D.G.; Gibson, T.J. Multiple Sequence Alignment with Clustal X. Trends Biochem. Sci. 1998, 23, 403–405. [Google Scholar] [CrossRef]
  36. Tamura, K.; Stecher, G.; Kumar, S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol. Biol. Evol. 2021, 38, 3022–3027. [Google Scholar] [CrossRef]
  37. Minh, B.Q.; Schmidt, H.A.; Chernomor, O.; Schrempf, D.; Woodhams, M.D.; Von Haeseler, A.; Lanfear, R.; Teeling, E. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol. Biol. Evol. 2020, 37, 1530–1534. [Google Scholar] [CrossRef]
  38. Shimodaira, H.; Hasegawa, M. Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference. Mol. Biol. Evol. 1999, 16, 1114–1116. [Google Scholar] [CrossRef]
  39. Fourment, M.; Gibbs, M.J. PATRISTIC: A Program for Calculating Patristic Distances and Graphically Comparing the Components of Genetic Change. BMC Evol. Biol. 2006, 6, 1. [Google Scholar] [CrossRef] [PubMed]
  40. Librado, P.; Rozas, J. DnaSP v5: A Software for Comprehensive Analysis of DNA Polymorphism Data. Bioinformatics 2009, 25, 1451–1452. [Google Scholar] [CrossRef]
  41. Murrell, B.; Moola, S.; Mabona, A.; Weighill, T.; Sheward, D.; Kosakovsky Pond, S.L.; Scheffler, K. FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection. Mol. Biol. Evol. 2013, 30, 1196–1205. [Google Scholar] [CrossRef]
  42. Abramson, J.; Adler, J.; Dunger, J.; Evans, R.; Green, T.; Pritzel, A.; Ronneberger, O.; Willmore, L.; Ballard, A.J.; Bambrick, J.; et al. Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3. Nature 2024, 630, 493–500. [Google Scholar] [CrossRef] [PubMed]
  43. Gasteiger, E.; Hoogland, C.; Gattiker, A.; Duvaud, S.; Wilkins, M.R.; Appel, R.D.; Bairoch, A. The Proteomics Protocols Handbook; Springer Protocols Handbooks; Springer: Berlin/Heidelberg, Germany, 2005. [Google Scholar]
  44. Acosta-Leal, R.; Duffy, S.; Xiong, Z.; Hammond, R.W.; Elena, S.F. Advances in Plant Virus Evolution: Translating Evolutionary Insights into Better Disease Management. Phytopathology 2011, 101, 1136–1148. [Google Scholar] [CrossRef]
  45. Pérez-Losada, M.; Arenas, M.; Galán, J.C.; Palero, F.; González-Candelas, F. Recombination in Viruses: Mechanisms, Methods of Study, and Evolutionary Consequences. Infect. Genet. Evol. 2015, 30, 296–307. [Google Scholar] [CrossRef]
  46. Jaya, F.R.; Brito, B.P.; Darling, A.E. Evaluation of Recombination Detection Methods for Viral Sequencing. Virus Evol. 2023, 9, vead066. [Google Scholar] [CrossRef]
  47. Mohammadi, M.; Gibbs, A.J.; Hosseini, A.; Hosseini, S. An Iranian Genomic Sequence of Beet Mosaic Virus Provides Insights into Diversity and Evolution of the World Population. Virus Genes 2018, 54, 272–279. [Google Scholar] [CrossRef]
  48. Hajizadeh, M.; Gibbs, A.J.; Amirnia, F.; Glasa, M. The Global Phylogeny of Plum Pox Virus Is Emerging. J. Gen. Virol. 2019, 100, 1457–1468. [Google Scholar] [CrossRef] [PubMed]
  49. Ohshima, K.; Tomitaka, Y.; Wood, J.T.; Minematsu, Y.; Kajiyama, H.; Tomimura, K.; Gibbs, A.J. Patterns of Recombination in Turnip Mosaic Virus Genomic Sequences Indicate Hotspots of Recombination. J. Gen. Virol. 2007, 88, 298–315. [Google Scholar] [CrossRef] [PubMed]
  50. Kawakubo, S.; Tomitaka, Y.; Tomimura, K.; Koga, R.; Matsuoka, H.; Uematsu, S.; Yamashita, K.; Ho, S.Y.W.; Ohshima, K. The Recombinogenic History of Turnip Mosaic Potyvirus Reveals Its Introduction to Japan in the 19th Century. Virus Evol. 2022, 8, veac060. [Google Scholar] [CrossRef] [PubMed]
  51. Gibbs, A.J.; Ohshima, K. Potyviruses and the Digital Revolution. Annu. Rev. Phytopathol. 2010, 48, 205–223. [Google Scholar] [CrossRef] [PubMed]
  52. Abadkhah, M.; Hajizadeh, M.; Koolivand, D. Global population genetic structure of Bean common mosaic virus. Arch. Phytopathol. Plant Prot. 2020, 53, 266–281. [Google Scholar] [CrossRef]
  53. Eckshtain-Levi, N.; Weisberg, A.J.; Vinatzer, B.A. The Population Genetic Test Tajima’s D Identifies Genes Encoding Pathogen-Associated Molecular Patterns and Other Virulence-Related Genes in Ralstonia Solanacearum. Mol. Plant Pathol. 2018, 19, 2187–2192. [Google Scholar] [CrossRef]
  54. Xue, M.; Arvy, N.; German-Retana, S. The Mystery Remains: How Do Potyviruses Move within and between Cells? Mol. Plant Pathol. 2023, 24, 1560–1574. [Google Scholar] [CrossRef]
  55. Cui, X.; Yaghmaiean, H.; Wu, G.; Wu, X.; Chen, X.; Thorn, G.; Wang, A. The C-Terminal Region of the Turnip Mosaic Virus P3 Protein Is Essential for Viral Infection via Targeting P3 to the Viral Replication Complex. Virology 2017, 510, 147–155. [Google Scholar] [CrossRef]
  56. Kärblane, K.; Firth, A.E.; Olspert, A. Turnip Mosaic Virus Transcriptional Slippage Dynamics and Distribution in RNA Subpopulations. Mol. Plant-Microbe Interact. 2022, 35, 835–844. [Google Scholar] [CrossRef]
  57. Xu, J.; Zhang, Y. How Significant Is a Protein Structure Similarity with TM-Score = 0.5? Bioinformatics 2010, 26, 889–895. [Google Scholar] [CrossRef]
  58. Herniter, I.A.; Muñoz-Amatriaín, M.; Close, T.J. Genetic, Textual, and Archeological Evidence of the Historical Global Spread of Cowpea (Vigna Unguiculata [L.] Walp.). Legume Sci. 2020, 2, e57. [Google Scholar] [CrossRef]
  59. Loy, H.; Spriggs, M.; Wickler, S. Direct Evidence for Human Use of Plants 28,000 Years Ago: Starch Residues on Stone Artefacts from the Northern Solomon Islands. Antiquity 1992, 66, 898–912. [Google Scholar] [CrossRef]
  60. Chaïr, H.; Traore, R.E.; Duval, M.F.; Rivallan, R.; Mukherjee, A.; Aboagye, L.M.; Van Rensburg, W.J.; Andrianavalona, V.; De Pinheiro Carvalho, M.A.A.; Saborio, F.; et al. Genetic Diversification and Dispersal of Taro (Colocasia Esculenta (L.) Schott). PLoS ONE 2016, 11, e0157712. [Google Scholar] [CrossRef]
  61. Kreike, C.M.; Van Eck, H.J.; Lebot, V. Genetic Diversity of Taro, Colocasia Esculenta (L.) Schott, in Southeast Asia and the Pacific. Theor. Appl. Genet. 2004, 109, 761–768. [Google Scholar] [CrossRef]
  62. Piperno, D.R. The Origins of Plant Cultivation and Domestication in the New World Tropics Patterns, Process, and New Developments. Curr. Anthropol. 2011, 52, S453–S470. [Google Scholar] [CrossRef]
  63. Jones, R.A.C. Disease Pandemics and Major Epidemics Arising from New Encounters between Indigenous Viruses and Introduced Crops. Viruses 2020, 12, 1388. [Google Scholar] [CrossRef] [PubMed]
  64. Salvaggio, J.E. Fauna, Flora, Fowl, and Fruit: Effects of the Columbian Exchange on the Allergic Response of New and Old World Inhabitants. Allergy Proc. 1992, 13, 335–344. [Google Scholar] [CrossRef]
  65. Phaseolus Vulgaris. 2024. Available online: https://en.wikipedia.org/wiki/phaseolus_vulgaris (accessed on 30 June 2024).
  66. Ho, S.Y.; Lanfear, R.; Bromham, L.; Phillips, M.J.; Soubrier, J.; Rodrigo, A.G.; Cooper, A. Time-dependent rates of molecular evolution. Mol. Ecol. 2011, 20, 3087–3101. [Google Scholar] [CrossRef]
  67. Maina, S.; Coutts, B.A.; Edwards, O.R.; de Almeida, L.; Kehoe, M.A.; Ximenes, A.; Jones, R.A.C. Zucchini Yellow Mosaic Virus Populations from East Timorese and Northern Australian Cucurbit Crops: Molecular Properties, Genetic Connectivity, and Biosecurity Implications. Plant Dis. 2017, 101, 1236–1245. [Google Scholar] [CrossRef]
  68. Maina, S.; Barbetti, M.J.; Edwards, O.R.; Minemba, D.; Areke, M.W.; Jones, R.A.C. Zucchini Yellow Mosaic Virus Genomic Sequences from Papua New Guinea: Lack of Genetic Connectivity with Northern Australian or East Timorese Genomes, and New Recombination Findings. Plant Dis. 2019, 103, 1326–1336. [Google Scholar] [CrossRef]
  69. Peters, D.; Matsumura, E.E.; van Vredendaal, P.; van der Vlugt, R.A.A. The plant virus transmissions database. J. Gen. Virol. 2024, 105, 001957. [Google Scholar] [CrossRef] [PubMed]
  70. Bujarski, J.J. Genetic Recombination in Plant-Infecting Messenger-Sense RNA Viruses: Overview and Research Perspectives. Front. Plant Sci. 2013, 4, 42516. [Google Scholar] [CrossRef]
  71. Simon-Loriere, E.; Holmes, E.C. Why Do RNA Viruses Recombine? Nat. Rev. Microbiol. 2011, 9, 617–626. [Google Scholar] [CrossRef] [PubMed]
  72. Wylie, S.J.; Jones, R.A.C. Role of Recombination in the Evolution of Host Specialization within Bean Yellow Mosaic Virus. Phytopathology 2009, 99, 512–518. [Google Scholar] [CrossRef]
  73. Gibbs, A.J.; Ohshima, K.; Yasaka, R.; Mohammadi, M.; Gibbs, M.J.; Jones, R.A.C. The Phylogenetics of the Global Population of Potato Virus Y and Its Necrogenic Recombinants. Virus Evol. 2017, 3, vex002. [Google Scholar] [CrossRef] [PubMed]
  74. Moreno, I.M.; Malpica, J.M.; Díaz-Pendón, J.A.; Moriones, E.; Fraile, A.; García-Arenal, F. Variability and Genetic Structure of the Population of Watermelon Mosaic Virus Infecting Melon in Spain. Virology 2004, 318, 451–460. [Google Scholar] [CrossRef]
  75. Desbiez, C.; Lecoq, H. The Nucleotide Sequence of Watermelon Mosaic Virus (WMV, Potyvirus) Reveals Interspecific Recombination between Two Related Potyviruses in the 5′ Part of the Genome. Arch. Virol. 2004, 149, 1619–1632. [Google Scholar] [CrossRef]
  76. Zuntini, A.R.; Carruthers, T.; Maurin, O.; Bailey, P.C.; Leempoel, K.; Brewer, G.E.; Epitawalage, N.; Françoso, E.; Gallego-Paramo, B.; McGinnie, C.; et al. Phylogenomics and the Rise of the Angiosperms. Nature 2024, 629, 843–850. [Google Scholar] [CrossRef]
  77. Morozov, S.Y.; Solovyev, A.G. Small hydrophobic viral proteins involved in intercellular movement of diverse plant virus genomes. AIMS Microbiol. 2020, 6, 305–329. [Google Scholar] [CrossRef]
  78. Pasin, F.; Daròs, J.-A.; Tzanetakis, I.E. Proteome expansion in the Potyviridae evolutionary radiation. FEMS Microbiol. Rev. 2022, 46, fuac011. [Google Scholar] [CrossRef]
  79. Qin, L.; Liu, H.; Liu, P.; Jiang, L.; Cheng, X.; Li, F.; Shen, W.; Qiu, W.; Dai, Z.; Cui, H. Rubisco small subunit (RbCS) is coopted by potyvirids as the scaffold protein in assembling a complex for viral intercellular movement. PLoS Pathog. 2024, 20, e1012064. [Google Scholar] [CrossRef] [PubMed]
Figure 1. A histogram showing the numbers of sequences of different BCMV subgroup viruses in GenBank (blue bars) found in crops, and the numbers of those sequences found to be non-recombinant (n-rec; orange bars). Virus acronyms: WMV, watermelon mosaic virus; SbMV, soybean mosaic virus; ZYMV, zucchini yellow mosaic virus; BCMV, bean common mosaic virus; BCMNV, bean common mosaic necrosis virus; EAPV, East Asian passiflora virus; DashMV, dasheen mosaic virus; CpAbMV, cowpea aphid-borne mosaic virus; PeMoV, peanut mottle virus. The histogram also shows the total and n-rec numbers of sequences of non-crop BCMV subgroup viruses; they were eight sequences of hardenbergia mosaic virus (seven n-rec); six of telosma mosaic virus (one n-rec); four of basella rugose mosaic virus (four n-rec), Paris mosaic necrosis virus (three n-rec), and wisteria vein mosaic virus (two n-rec); three each of beet mosaic virus (two n-rec), freesia mosaic virus (two n-rec), passionfruit Vietnam virus, and passionfruit woodiness virus (three n-rec); two each of begonia flower breaking virus (two n-rec), blue squill virus A (two n-rec), calla lily latent virus, passiflora chlorosis virus (two n-rec), passiflora foetida virus Y (two n-rec), yam bean mosaic virus (one n-rec), and zantedeschia mild mosaic virus (two n-rec); and one each of achyranthes bidentata mosaic virus (one n-rec), atractyloides macrocephala virus, fritillary virus Y (one n-rec), gomphocarpus mosaic virus (one n-rec), impatiens flower break potyvirus (one n-rec), keunjorong mosaic virus (one n-rec), passiflora virus Y (one n-rec), passionfruit severe mottle virus, pleione flower breaking virus (one n-rec), polygonatum kingianum virus 3 (one n-rec), polygonatum kingianum virus 4 (one n-rec), polygonatum mosaic-associated virus 1 (one n-rec), saffron latent virus isolate (one n-rec), and uraria mosaic virus (one n-rec). 
Sequences of blackeye cowpea mosaic virus, lygodium japonicum potyvirus, peanut stripe virus, and poplar mosaic virus (China) were phylogenetically indistinguishable from those of bean common mosaic virus and were pooled with them, and that of vanilla mosaic virus was pooled with those of dasheen mosaic virus. Likewise, those of artemisia carvifolia potyvirus, cerasus yedoensis potyvirus, and cucurbita moschata potyvirus were pooled with those of watermelon mosaic virus, that of soybean virus A with those of wisteria vein mosaic virus, and those of allium fistulosum potyvirus, brassica caulorapa potyvirus, cerasus yedoensis potyvirus, cucurbita moschata potyvirus, luffa aegyptiaca potyvirus, sapindus mukorossi potyvirus, and solanum melongena potyvirus with those of zucchini yellow mosaic virus.
Figure 2. A diagram showing the 30 countries (upper map and red discs) from which the complete ORFs of BCMV subgroup crop viruses have been isolated, and the corresponding 14 countries (lower map and green discs) providing the ORFs of non-crop virus isolates. The disc sizes are related to the number of isolates obtained from each country and range from 61 non-crop virus sequences from China to single isolates from three oceanic islands (Cook, Hawaii, and Reunion); the diameter of each disc is scaled to the square root of the number of isolates from that country.
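The square-root scaling described in this caption makes the area of each disc, not its diameter, proportional to the number of isolates. A minimal sketch of that scaling follows; the function name and the `unit_diameter` parameter are hypothetical, and the counts used are only the extremes quoted in the caption (61 sequences and single isolates).

```python
from math import pi, sqrt


def disc_diameter(n_isolates: int, unit_diameter: float = 1.0) -> float:
    """Diameter of a map disc scaled to the square root of the isolate count.

    `unit_diameter` (hypothetical) is the diameter drawn for one isolate.
    """
    if n_isolates < 0:
        raise ValueError("isolate count must be non-negative")
    return unit_diameter * sqrt(n_isolates)


def disc_area(diameter: float) -> float:
    """Area of a disc with the given diameter."""
    return pi * (diameter / 2) ** 2


if __name__ == "__main__":
    d_single = disc_diameter(1)
    d_largest = disc_diameter(61)
    # The diameter grows with sqrt(n), so the area grows linearly with n.
    print(d_largest / d_single, disc_area(d_largest) / disc_area(d_single))
```

With this choice a country contributing 61 sequences is drawn with a disc about 7.8 times wider, and 61 times larger in area, than one contributing a single isolate.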
Figure 3. A phylogeny of the seven n-rec crop virus sequence sets (Table S1) and the 45 non-crop isolates closest to them. The crop virus clusters have been collapsed and represented as triangles, and these, together with the large and small discs, are colour-coded to indicate the groupings used for population genetic comparisons. There were too few n-rec SbMV sequences for useful population genetic comparisons, and no n-rec WMV sequences, so these two viruses are represented in Figure 3 by black triangles. Crop virus sets contributing outgroup sequences to other crop sequence sets are marked with coloured ellipses.
Figure 4. Basal recombinants. Phylogenies showing the relationships between the sequences basal to the WMV and SbMV sequences from crop isolates, which are collapsed and represented as triangles. Four WMV isolates involved in Event 60 of the RDP analysis are labelled “SbMV”, and were obtained from unusual hosts in NE Asia, whereas the relationships of the four SbMV isolates from Pinellia spp. are more complex and unresolved statistically. Nodes marked with a red disc have >0.99 SH statistical support. The results of the RDP5 analyses are available from the authors upon request.
Figure 5. dN/dS map of the 60 main ORFs of the BCMV genomes, obtained using the online Datamonkey server (https://www.datamonkey.org/ (accessed on 30 November 2023)) and drawn using Excel.
Figure 6. Summary of the FUBAR analyses for positively and negatively selected sites in the P3 and P3N-PIPO genes of seven viruses and their outgroups (Table S1); the same site was never found to be positively selected in both the virus dataset and its outgroup. Positively selected sites are red discs; negative are blue. Discs for zero positive or negative sites are shown as circles; individual discs show how many viruses (0–7) gave a significant positive or negative FUBAR result. Rows separate the numbers of datasets (0–7) providing each point; total sites/row at right. Green dotted line shows the position of the 3′most codon of the P3N gene. Codon numbering from the 731-sequence alignment used in this study.
Figure 7. AlphaFold3 models of the P3 and P3N-PIPO proteins encoded in the genome of PeMoV (KF977830). Support for the models of the P3 and P3N-PIPO proteins of several BCMV subgroup viruses was judged by the “predicted template modelling score” (pTM); all scores were around 0.6, where a value above 0.5 means the overall model “might be similar to the true structure” [57], especially as all had essentially the same structure.
Table 1. The population genetic parameters and demography tests estimated for the alignments of seven crop viruses, their outgroups, and their combined alignments.
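| Virus | Number of Sequences | Nucleotide Diversity | Number of Segregating Sites | Tajima’s D | dN/dS |
|---|---|---|---|---|---|
| BCMNV-all | 32 | 0.244 | 5529 | 2.297 * | 0.221 |
| BCMNV-crop | 17 | 0.069 | 1803 | 0.674 ns | 0.089 |
| BCMNV-outgroup | 15 | 0.299 | 5472 | 2.678 ** | 0.237 |
| BCMV-all | 60 | 0.194 | 5689 | 1.593 ns | 0.185 |
| BCMV-crop | 45 | 0.114 | 4499 | 0.104 ns | 0.106 |
| BCMV-outgroup | 15 | 0.287 | 5419 | 2.501 ** | 0.221 |
| CpAbMV-all | 23 | 0.294 | 5455 | 3.252 *** | 0.233 |
| CpAbMV-crop | 8 | 0.212 | 4311 | 1.092 ns | 0.137 |
| CpAbMV-outgroup | 15 | 0.290 | 5344 | 2.616 ** | 0.229 |
| DashMV-all | 26 | 0.311 | 5780 | 3.442 *** | 0.261 |
| DashMV-crop | 11 | 0.306 | 5386 | 2.516 ** | 0.264 |
| DashMV-outgroup | 15 | 0.311 | 5623 | 2.807 ** | 0.261 |
| EAPV-all | 43 | 0.195 | 5378 | 1.553 ns | 0.217 |
| EAPV-crop | 28 | 0.054 | 2880 | −1.155 ns | 0.163 |
| EAPV-outgroup | 15 | 0.269 | 5147 | 2.387 * | 0.199 |
| PeMoV-all | 30 | 0.304 | 6248 | 2.632 ** | 0.362 |
| PeMoV-crop | 15 | 0.020 | 1231 | −2.273 ** | 0.081 |
| PeMoV-outgroup | 15 | 0.379 | 6201 | 3.273 *** | 0.370 |
| ZYMV-all | 90 | 0.176 | 5727 | 1.371 ns | 0.189 |
| ZYMV-crop | 75 | 0.095 | 3564 | 0.669 ns | 0.072 |
| ZYMV-outgroup | 15 | 0.286 | 5315 | 2.624 ** | 0.238 |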
ns = not statistically significant; * = significant at p < 0.1; ** = significant at p < 0.01; *** = significant at p < 0.001.
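As an illustration of how the nucleotide diversity, number of segregating sites, and Tajima’s D columns of Table 1 relate to one another, the following is a minimal Python sketch of the standard Tajima (1989) calculation. It is not the authors’ pipeline: the function name `tajimas_d` and the toy alignments are hypothetical, and the sketch assumes a gap-free alignment of equal-length sequences with no ambiguity codes. It does not compute the significance levels marked in the table.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (per-site nucleotide diversity, segregating sites, Tajima's D).

    Assumes `seqs` is a gap-free alignment of equal-length strings
    (hypothetical helper for illustration only).
    """
    n = len(seqs)
    length = len(seqs[0])
    if n < 4 or any(len(s) != length for s in seqs):
        raise ValueError("need at least 4 aligned sequences of equal length")

    # S: number of alignment columns with more than one nucleotide state.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # k: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    pi = k / length  # nucleotide diversity per site

    if seg_sites == 0:
        return pi, 0, float("nan")  # D is undefined without variation

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares the two estimates of diversity: k and S / a1.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (k - seg_sites / a1) / sqrt(variance)
    return pi, seg_sites, d


# Toy alignments (hypothetical data, not from the study).
rare_variants = ["AAAAAA", "AAAAAT", "AAAATA", "AAATAA"]
common_variants = ["AAAA", "AAAA", "TTTT", "TTTT"]
print(tajimas_d(rare_variants))
print(tajimas_d(common_variants))
```

In the toy data, an excess of rare variants (each difference carried by a single sequence) gives a negative D, whereas variants at intermediate frequency give a positive D. This is the conventional basis for reading negative values as a sign of population expansion and positive values as a sign of contraction.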
