High Prevalence of Three Potyviruses Infecting Cucurbits in Oklahoma and Phylogenetic Analysis of Cucurbit Aphid-Borne Yellows Virus Isolated from Pumpkins

Field information about viruses infecting crops is fundamental for understanding the severity of the effects they cause in plants. To determine the status of cucurbit viruses, surveys were conducted for three consecutive years (2016–2018) in different agricultural districts of Oklahoma. A total of 1331 leaf samples from >90 fields were randomly collected from both symptomatic and asymptomatic cucurbit plants across 11 counties. All samples were tested with the dot-immunobinding assay (DIBA) against the antisera of 10 known viruses. Samples infected with papaya ringspot virus (PRSV-W), watermelon mosaic virus (WMV), zucchini yellow mosaic virus (ZYMV), and cucurbit aphid-borne-yellows virus (CABYV) were also tested by RT-PCR. Of the 10 viruses, PRSV-W was the most widespread, with an overall prevalence of 59.1%, present in all 11 counties, followed by ZYMV (27.6%), in 10 counties, and WMV (20.7%), in seven counties, while the remaining viruses were present sporadically with low incidence. Approximately 42% of the infected samples were positive, with more than one virus indicating a high proportion of mixed infections. CABYV was detected for the first time in Oklahoma, and the phylogenetic analysis of the first complete genome sequence of a CABYV isolate (BL-4) from the US showed a close relationship with Asian isolates.


Introduction
The members of the Cucurbitaceae, family, commonly referred to as cucurbits, are major cash crops in the United States (US) and worldwide. Cucurbitaceae includes 15 tribes, 95-97 genera, and approximately 1000 species, which are grown in tropical and subtropical regions [1][2][3]. Among them, the genus Cucumis (including cucumbers and melons), Citrullus (watermelons), and Cucurbita (pumpkin sand squashes) are popular. Archeological evidence suggests that pumpkins and squashes (Cucurbita spp) were first cultivated in American continents >10,000 years ago; watermelons were first cultivated in Africa at least 4000 years ago, while cucumbers and melons have been in Asia for > 5000 years [4]. In the US, six cucurbits-cantaloupes (Cucumis melo), cucumbers (Cucumis sativus), honeydew (Cucumis melo), pumpkins, squashes (Cucurbita pepo, C. maxima, and C. moschata), and watermelons (Citrullus lanatus)-are among the top fifteen fresh vegetables. The value of production from these crops was >two billion US dollars in 2017, which accounts for nearly 13% of the total value of production from fresh vegetables in the US. [5]. In Oklahoma, cucurbits have been cultivated for >70 years, and currently, they are grown across approximately <6000 acres, where viral diseases are one of the major factors limiting their production [6].
Our previous study [6], conducted during the 2008-2010 growing seasons, was limited to only watermelons in four Oklahoma counties and showed that PRSV-W was the most prevalent virus, followed by WMV and ZYMV, while MNSV, SqMV, and CMV were present with an incidence of less than 1% [6]. The purpose of this study was to further expand our knowledge by surveying more growing areas (11 counties from eight agricultural districts) in Oklahoma ( Figure 1) and determine the current status of viruses infecting cucurbits in terms of their relative prevalence and distribution in five major cucurbit crops: cantaloupes, cucumbers, pumpkins, squashes, and watermelons. In addition, a new virus, CABYV, was reported for the first time in Oklahoma. We determined the first complete genome sequence from the US and deduced the evolutionary relationship of CABYV with other complete genomes from around the world.

Field Symptoms
A wide range of virus symptoms were observed in the cucurbit fields. The most common symptoms were leaf distortion, light and severe mosaics, mottling, and chlorosis ( Figure 2A). The other notable symptoms were curling, narrowing, thickening, stunting, rolling, cupping, and a shoestring appearance on the leaves of cucurbits. The ring spots were also observed on pumpkin fruit ( Figure 2B). CABYV-positive leaves showed light yellowing and vein thickening on pumpkins ( Figure 2C).

Field Symptoms
A wide range of virus symptoms were observed in the cucurbit fields. The most common symptoms were leaf distortion, light and severe mosaics, mottling, and chlorosis ( Figure 2A). The other notable symptoms were curling, narrowing, thickening, stunting, rolling, cupping, and a shoestring appearance on the leaves of cucurbits. The ring spots were also observed on pumpkin fruit ( Figure 2B). CABYV-positive leaves showed light yellowing and vein thickening on pumpkins ( Figure 2C).

Distribution of Viruses in Different Counties
We collected 1331 samples from >90 cucurbit fields during the three growing seasons. Among the 10 tested viruses, PRSV-W was the most dominant virus infecting cucurbits in Oklahoma, with an average infection rate of 59.1%, followed by ZYMV (27.6%) and WMV (20.7%). Apart from these three potyviruses, the other four viruses were present in low abundance: MNSV (5.7%), SqMV (1.4%), CABYV (1.3%), and CGMMV (1.1%). None of the samples tested positive for the remaining three viruses (CMV, cucurbit yellow stunting disorder virus (CYSDV), and squash leaf curl virus (SLCV)). Samples from all eleven counties were infected with at least one virus. The virus prevalence in different counties ranged from 11.8 % (Atoka County) to 100% (Pontotoc County). The leaf samples collected in Atoka County were infected with only one type of virus, while six different viruses were present in the samples from Muskogee County. PRSV-W was present in all eleven counties, followed by ZYMV in 10 counties, WMV in seven counties, MNSV in five counties, and CABYV and SqMV in three counties, while CGMMV was detected in only one county (Table 1).

Distribution of Viruses in Different Counties
We collected 1331 samples from >90 cucurbit fields during the three growing seasons. Among the 10 tested viruses, PRSV-W was the most dominant virus infecting cucurbits in Oklahoma, with an average infection rate of 59.1%, followed by ZYMV (27.6%) and WMV (20.7%). Apart from these three potyviruses, the other four viruses were present in low abundance: MNSV (5.7%), SqMV (1.4%), CABYV (1.3%), and CGMMV (1.1%). None of the samples tested positive for the remaining three viruses (CMV, cucurbit yellow stunting disorder virus (CYSDV), and squash leaf curl virus (SLCV)). Samples from all eleven counties were infected with at least one virus. The virus prevalence in different counties ranged from 11.8 % (Atoka County) to 100% (Pontotoc County). The leaf samples collected in Atoka County were infected with only one type of virus, while six different viruses were present in the samples from Muskogee County. PRSV-W was present in all eleven counties, followed by ZYMV in 10 counties, WMV in seven counties, MNSV in five counties, and CABYV and SqMV in three counties, while CGMMV was detected in only one county (Table 1).   Note: The data combine the samples collected from all three years. The values inside the brackets are respective percentages. The viruses with the highest percent incidence within the respective counties are shown in bold script. The sum of individual incidence is more than the total virus incidence due to mixed infections.

Distribution of Viruses in Different Hosts
Cantaloupes had the highest viral incidence, with 91.0% of the samples infected with at least one of the 10 viruses tested, followed by pumpkins (86.3%), squashes (84.0%), cucumbers (76.5%), and watermelons (48.2%). None of the three samples from gourds tested positive for any of the ten viruses tested. At least six different viruses were detected in pumpkins, followed by cucumber samples, containing five different viruses, while cantaloupes, squashes, and watermelons were infected with four viruses each. The distribution of PRSV-W was that it was dominant in all five major hosts, and its incidence ranged from 43.0% (133 out of 309) in watermelons to 88.06% (59 out of 67) in cantaloupes ( Table 2).   Note: The values inside the parentheses are their respective percentages. The virus with the highest percent incidence within the respective host is shown in bold script. The sum of individual incidence is more than the total virus incidence due to mixed infections.

Distribution of Viruses in Different Months and Years of the Growing Seasons
The three potyviruses PRSV-W, WMV, and ZYMV were present in all three growing seasons (2016-2018). PRSV-W was the most dominant virus every year, and its incidence increased each year (Table 3). On the other hand, the incidence of WMV decreased from 40% in 2016 to 7.5% in 2018. The incidence of ZYMV slightly decreased in 2017 from 2016; however, it increased multi-fold in 2018. CGMMV was found only in 2016, MNSV only in 2018, SqMV in 2016 and 2018, and CABYV in 2017 and 2018. The incidence of 10 viruses increased from 66.7% in July to 77.9% in August (Table 4). It marginally decreased in September (71.6%) and increased to as high as 91.5% in October. The incidence of PRSV-W was the highest in all four months, with an average incidence of >46% each month for all three years. Similarly, the average incidence of ZYMV was >20% every month. However, the incidence of WMV increased from 0.6% in July to 36.3% in October.

Distribution of Viruses in Different Growth Stages of Plants
For assessing the viral disease during different growth stages, plants collected during the survey were divided into three groups, namely, the pre-flowering stage (i.e., plants before flowering), fruiting stage (i.e., plants from flowering to having immature fruits), and harvesting stage (i.e., plants bearing fruits ready to harvest and beyond). The incidence of viruses in the pre-flowering stage of the plants was 55.6%, followed by the fruiting stage with 67.9%, and by the harvesting stage, the virus incidence reached 85.1%. The incidence of PRSV-W during different growth stages did not differ significantly; however, the incidence of ZYMV gradually increased from 20.3% at the pre-flowering stage to 29.6% at the harvesting stage. Interestingly, the incidence of WMV was zero at the pre-flowering stage but overtook ZYMV at the harvesting stage, with an incidence of 30.6% (Table 5).  Note: The values inside the brackets are respective percentages. The virus with the highest percent incidence within the respective year is shown in bold script. The sum of individual incidence is more than the total virus incidence due to mixed infections.   Note: The values inside the brackets are respective percentages. The virus with the highest percent incidence within the respective month is shown in bold script. The sum of individual incidence is more than the total virus incidence due to mixed infection.  Note: The values inside the brackets are respective percentages. The virus with the highest percent incidence within the respective growth stage is shown in bold script. The sum of individual incidence is more than the total virus incidence due to mixed infections.

Mixed Infections
In this study, 1023 out of 1331 (76.9%) symptomatic cucurbit samples tested positive for at least one of the 10 tested viruses. Out of the 1023 virus-positive samples, 434 samples were infected with more than one virus, which accounts for 32.6% of the total samples collected. There were as many as 21 different combinations of mixed infections from the field samples based on the dot-immunobinding assay (DIBA) results (Table 6). A total of 343 samples were infected with two different viruses, 84 were infected with three different viruses, and seven samples were infected with four different viruses. The most common mode of double infection was PRSV-W+ZYMV, followed by PRSV-W+WMV and WMV+ZYMV. A combination of MNSV, PRSV-W, and ZYMV was the most common triple infection, followed by the combination involving three potyviruses: PRSV-W, WMV, and ZYMV. Three different combinations of four viruses infecting the same samples were observed. Notably, PRSV-W, WMV, and ZYMV were common in all the quadruple mixedinfection combinations (Table 6).

Statistical Analysis of Various Virus Infections
There were 76 sample populations, of which 14 were eliminated from the statistical analysis because they had less than six samples. All samples were classified as either infected or uninfected by any virus, with H o signifying that there was no difference among the 62 sample populations in the distribution of plants in the two categories. The infected samples were not split into those with a single virus versus multiple viruses due to the low expected number of multiple-virus-infected plants in many sample populations. There was a significant difference among sample populations (X 2 = 841.08, df = 61, p < 0.0001). In order to determine why the significant differences among sample populations existed, the virus distributions were analyzed among the three years of the study (2016-2018), hosts, growth stages, or counties from where they were collected.
The sample populations from each year of the survey were analyzed separately for the overall infection rate. All three tests corresponding to three sample-collection years revealed significant differences among the particular year's sample populations (2016: X 2 = 239.83, df = 15, p < 0.0001; 2017: X 2 = 242.19, df = 9, p < 0.0001; X 2 = 282.85, df = 35, p < 0.0001), which indicated that yearly differences did not explain the variation in the overall infection rate. The samples from all five hosts were classified by developmental stage, and within each classification, each sample was either uninfected, infected by a single virus, or infected by two or more viruses. For cantaloupes, cucumbers, and squashes, there were no samples collected during the flowering stage. Each host was tested separately for the independence of the developmental stage and infection category (H o = independent). For all five hosts except cantaloupes, a significant difference from independent association of developmental stage and infection category existed (cantaloupe: X 2 = 1.44, df = 2, p = 0.4873; cucumber: X 2 = 6.77, df = 2, p = 0.0340; pumpkin: X 2 = 51.20, df = 4, p < 0.0001; squash: X 2 = 13.30, df = 2, p = 0.0013; watermelon: X 2 = 63.68, df = 4, p < 0.0001). The cantaloupes had only six samples in the fruiting stage, which may have accounted for the non-significant result. Similarly, the virus incidence was also analyzed as a product of the spatial location of the hosts by grouping sample populations solely based on the county from where they were collected without regard to the date collected, plant species, or developmental stage. There were eleven groups corresponding to the eleven counties of this study. The samples fell into one of three categories: uninfected, infected by a single virus, or infected by two or more viruses. There was a significant difference in the distribution of viruses in the three categories from different counties (X 2 = 304.90, df = 20, p < 0.0001). Finally, the predicted numbers against observed numbers generated from the multinomial target theory approach for each year were tested separately to test the possibility of random independent events. Only virus types present in a particular year were used to calculate probabilities. The data from 2016 (X 2 = 77.6, df = 1, p < 0.001) and 2018 (X 2 = 42.4, df = 1, p < 0.001) differed significantly from the independent event model. By contrast, the 2017 data did not differ from the model predictions (X 2 = 1.4, df = 1, p = 0.237).

Molecular Confirmation of Selected Viruses
More than 10% of the DIBA-positive samples for PRSV-W, WMV, and ZYMV were further confirmed at the molecular level with RT-PCR using specific primers. The expected PCR products for PRSV-W, WMV, and ZYMV, respectively, were 963 bp, 979 bp, and 902 bp ( Figure 3). After sequencing the PCR products, the lengths of the CP gene of the three viruses were 864, 852, and 840 nucleotides (nt) for PRSV-W, WMV, and ZYMV, respectively. One isolate each of ZYMV and CABYV was also completely sequenced [26,27]. None of the selected DIBA-negative samples were positive according to RT-PCR using primer pairs for the aforementioned viruses.

Sequence Analysis of CABYV BL-4 Isolate
The complete genome of the CABYV BL-4 isolate was 5679 nt long and was the first complete genome sequenced from the US [26]. The schematic representation of the genomic organization of the BL-4 isolate is illustrated in Figure 4. The genome contained a 20-nt-long 5' untranslated region (UTR) and 165-nt-long 3′ UTR. The open reading frame (ORF) contained six overlapping proteins, with a 199-nt-long intergenic non-coding region (NCR) between the P1-P2 fusion protein and readthrough protein (P5). The P1-P2 protein has ribosomal slippage at nt position 1488, and P5 has a stop codon at nt position

Sequence Analysis of CABYV BL-4 Isolate
The complete genome of the CABYV BL-4 isolate was 5679 nt long and was the first complete genome sequenced from the US [26]. The schematic representation of the genomic organization of the BL-4 isolate is illustrated in Figure 4. The genome contained a 20-ntlong 5' untranslated region (UTR) and 165-nt-long 3 UTR. The open reading frame (ORF) contained six overlapping proteins, with a 199-nt-long intergenic non-coding region (NCR) between the P1-P2 fusion protein and readthrough protein (P5). The P1-P2 protein has ribosomal slippage at nt position 1488, and P5 has a stop codon at nt position 4108-4110 (at the end of the coat protein region). The nt positions of six different proteins and their respective amino acid lengths are shown in Table 7. The nt similarity of the CABYV BL-4 isolate was >90% with Chinese, Korean, and Japanese isolates and <90% with other CABYV isolates ( Table 8). The query cover for the Brazilian isolates was only 61% because it is a recombinant strain with around 60% CABYV at the 5' end and 40% of unknown virus origin at the 3' end of the genome [28].

Sequence Analysis of CABYV BL-4 Isolate
The complete genome of the CABYV BL-4 isolate was 5679 nt long and was the first complete genome sequenced from the US [26]. The schematic representation of the genomic organization of the BL-4 isolate is illustrated in Figure 4. The genome contained a 20-nt-long 5' untranslated region (UTR) and 165-nt-long 3′ UTR. The open reading frame (ORF) contained six overlapping proteins, with a 199-nt-long intergenic non-coding region (NCR) between the P1-P2 fusion protein and readthrough protein (P5). The P1-P2 protein has ribosomal slippage at nt position 1488, and P5 has a stop codon at nt position 4108-4110 (at the end of the coat protein region). The nt positions of six different proteins and their respective amino acid lengths are shown in Table 7. The nt similarity of the CABYV BL-4 isolate was >90% with Chinese, Korean, and Japanese isolates and <90% with other CABYV isolates ( Table 8). The query cover for the Brazilian isolates was only 61% because it is a recombinant strain with around 60% CABYV at the 5' end and 40% of unknown virus origin at the 3' end of the genome [28].

Phylogenetic Analysis of CABYV Isolates
The maximum likelihood (ML) tree inferred in MEGA7 [29] using default parameters revealed two major (G1 and G2) phylogroups ( Figure 5). The G1 phylogroup contains CABYV isolates from Europe and Brazil, while the G2 phylogroup contains isolates from the rest of the world. The G1 phylogroup is further divided into two subgroups (G1a and G1b). G1a contains European isolates and two Brazilian isolates form a separate cluster (G1b). Similarly, the second phylogroup is divided into three subgroups: first, the BL-4 isolate together with Chinese, Korean, and Japanese isolates (G2a); second, the Asian recombinant strains from China and Taiwan (G2b); and third, the non-recombinant isolate from Taiwan together with PNG, East Timor, and Indonesian isolates (G2c). The Bayesian inference (BI) tree deduced from Bayesian evolutionary analysis sampling trees (BEAST) also showed two distinct phylogroups as in the ML tree ( Figure S1). While the clustering pattern within the first group was identical to the ML tree, the second group has two additional subgroups: first, the US isolate with the Beijing isolate, and second, the CABYV-C isolate from Taiwan, thereby making five subgroups.

Recombination Analysis of CABYV Isolates
Recombination analysis showed that a total of 61 possible recombination events were detected by all seven algorithms within Recombination Detection Program version 4 (RDP4) [30]. Only the recombination events detected by at least three methods with high statistical support were considered. Altogether, there were 22 recombination events involving 24 isolates. The BL-4 isolate from the US is a major parental isolate for the HD118 and HD1 isolates from Korea, with recombination detected at nt position 854-3862 ( Figure  6). While all the methods except GENECONV identified BL-4 as a parental isolate of HD118, only four methods-RDP, Bootscan, MaxChi, and 3Seq-identified it as a major parental isolate of HD1. Similarly, the BL-4 isolate also served as a minor parent for the C-TW2 isolate from Taiwan at the 5′ end of the genome (Table 9).

Recombination Analysis of CABYV Isolates
Recombination analysis showed that a total of 61 possible recombination events were detected by all seven algorithms within Recombination Detection Program version 4 (RDP4) [30]. Only the recombination events detected by at least three methods with high statistical support were considered. Altogether, there were 22 recombination events involving 24 isolates. The BL-4 isolate from the US is a major parental isolate for the HD118 and HD1 isolates from Korea, with recombination detected at nt position 854-3862 ( Figure 6). While all the methods except GENECONV identified BL-4 as a parental isolate of HD118, only four methods-RDP, Bootscan, MaxChi, and 3Seq-identified it as a major parental isolate of HD1. Similarly, the BL-4 isolate also served as a minor parent for the C-TW2 isolate from Taiwan at the 5 end of the genome (Table 9). volving 24 isolates. The BL-4 isolate from the US is a major parental isolate for the HD118 and HD1 isolates from Korea, with recombination detected at nt position 854-3862 ( Figure  6). While all the methods except GENECONV identified BL-4 as a parental isolate of HD118, only four methods-RDP, Bootscan, MaxChi, and 3Seq-identified it as a major parental isolate of HD1. Similarly, the BL-4 isolate also served as a minor parent for the C-TW2 isolate from Taiwan at the 5′ end of the genome (Table 9).

Discussion
This study presented three years (2016-2018) of survey data for 10 important viruses infecting major cucurbit crops in Oklahoma. The survey results showed that viruses are common in cucurbits, and potyviruses are a major and continuous threat to cucurbit production in Oklahoma.
The symptoms appearing on virus-infected plants afflicted by the three potyviruses were similar, so it was difficult to distinguish the virus type in the field based on appearance, except for ringspot caused by PRSV-W on fruits. Around 77% of those symptomatic plants sampled were positive for at least one of the tested viruses. The remaining samples might have been infected with viruses that were not tested in this study. The symptoms caused by viruses in pumpkins, squashes, and cantaloupes were more severe than those caused by virus infections in cucumbers and watermelons.
Our previous study [6] only focused on watermelon plants and was conducted in four Oklahoma counties, which showed that PRSV-W was the most dominant virus, followed by WMV and ZYMV. This study shows that the prevalence among the three potyviruses has changed over time. For example, PRSV has maintained its top ranking, while ZYMV has overtaken WMV in the overall incidence. CABYV was reported for the first time in Oklahoma during this study [31], which presents the first complete genome of CABYV in the US. In addition, the DIBA results suggest that 15 cucumber samples out of 19 collected from Cimarron County in 2016 were infected with CGMMV. CGMMV was recently reported in the US [23], and this is the first report of it from Oklahoma. Since CGMMV is seed transmitted, it could pose a problem in other states unless preventive measures are applied. The incidence of MNSV and SqMV reported in this study was low and was similar to that found in the previous study [6]. Interestingly, there was no CMV infection during the 2016-2018 growing seasons. The prevalence of CMV was less than 1% in Oklahoma during the 2008-2010 survey [6], and there were zero incidences in this study. These results suggest that despite being one of the most widely distributed viruses infecting plants, including cucurbits [32], CMV might have been eliminated from the major cucurbits in Oklahoma due to the absence of virus inoculum in alternate hosts and vectors. The primary mode of virus transmission (Table 10) for four (three potyviruses and CABYV) of the seven viruses detected in this study is aphid vectors, which accounted for >90% of total virus infections. Three viruses (ZYMV, MNSV, and CGMMV) are also transmitted by infected seeds. The cultivation of different cucurbit crops in near proximity was also frequently observed during the course of this study, which might have enabled viruses to spread among other cucurbit hosts easily. These results suggest that controlling aphid populations early in the growing season, the use of virus-free seed, and good agricultural practices are key in minimizing viral epidemics in cucurbit crops in the state. Based on our results, the dynamics of viruses differ among the various cucurbit hosts. Altogether, four viruses (MNSV, PRSV-W, WMV, and ZYMV) were present in all five major cucurbit crops. CGMMV was only present in cucumbers, and CABYV and SqMV were only present in pumpkins. PRSV-W was dominant in each host, and the distribution of other viruses was significantly different in different hosts. For instance, MNSV was present in 74.6% (50 out of 67) of the samples from cantaloupes but was present with low incidence in other hosts, such as 39.2% (20 out of 51) in cucumbers, 2.6% (4 out of 156) in squashes, and less than 1% incidence in watermelons (1 out of 309) and pumpkins (1 out of 745). Similarly, the incidence of WMV was higher in pumpkins, squashes, and watermelons than cucumbers and cantaloupes (Table 5). ZYMV was more prevalent than WMV in cucumbers, cantaloupes, and squashes, but WMV was frequently found in pumpkins and watermelons. These host preferences of the different viruses among the various hosts could be driven by the preferences of their vectors for particular hosts present in the respective fields or nearby alternate hosts. Although the composition of the viruses in different years changed, PRSV-W, WMV, and ZYMV were consistently present, albeit with different rates of incidence. The other four viruses were mostly sporadic; however, outbreaks of these viruses in the near future cannot be ruled out. CABYV requires more attention owing to its rapid emergence and high incidence in Europe and Asia [12,13,33,34].
In 2016, ZYMV was not detected in Caddo, Muskogee, McCurtain, Payne, and Tulsa counties. However, in 2018, the virus was detected in all these locations. Moreover, it was detected in additional counties we surveyed in 2018 including Cherokee, Carter, and Pontotoc. These results suggest that ZYMV is an emerging virus infecting cucurbits in Oklahoma. Although there are only a few alternative hosts that are potential reservoirs or overwintering hosts for ZYMV [35], it was found to be highly stable and could be transmitted via contact-mediated wounding and/or abrasion as well [36]. Unlike PRSV-W and WMV, ZYMV can spread even when aphids are absent because ZYMV can be transmitted via virus-infected seeds [37]. ZYMV-infected seeds might have provided primary infection foci for the virus, and the other two modes of transmission (aphids and contact) helped in the transmission of ZYMV to adjacent plants. We also speculate that the rapid spread of the virus to other locations might be due to its ability to be transmitted by seeds. Interestingly, while the prevalence of WMV and ZYMV was significantly higher at the harvesting stage than at the young (pre-flowering and fruiting) stage, the incidence of PRSV remained similar throughout the different developmental stages.
The viral infection rates varied significantly among different counties, hosts, collection years, and growth stages. The discrepancy in the infection rates among different counties and collection years might be due to the presence or absence of wild hosts for these viruses and simply due to differences in agricultural practices such as crop rotation and the use of virus-susceptible varieties. The relatively higher infection rates in the four hosts other than watermelons could also be attributed to the larger sizes of the leaves, which would facilitate the aphid trapping. A gradual increase in the infection rate was observed as the plants grew older. After the flowering stages of plants, the plants are more likely to attract aphids, which in turn increase the chances of the transmission of viruses from infected plants to uninfected plants nearby. In addition, as the plants grow larger and cover more area, transmission to and from other plants becomes more likely. It was previously reported that aphid performance and movement are correlated with the age and species of plants [38]. Therefore, a higher incidence of viruses with the progression of plant growth was also probably due to the higher activity of aphids during the later stages of plant growth. Another factor, which could result in increased viral infection during later stages of the host, is physiological changes induced by the virus in infected plants that result in attracting vectors to infected plants in preference to healthy plants. For instance, ZYMV helped to increase the production of volatile compounds in plants, facilitating the attraction of a higher number of aphid vectors [39]. Similarly, a PRSV-W and Aphis gossypii combination resulted in an increase in the vector population by plant enrichment, thereby allowing the swift transmission of the virus in the same field or to nearby fields [40]. In another study, aphids acquiring CMV were found to encourage the virus to move towards uninfected plants by reducing the quality of the vector's host plants [41].
A high proportion of mixed virus infections (434 out of 1023 samples, 42.42%) was observed among the infected samples. The rate of mixed infections involving three potyviruses was 88.94% (386 out of 434) of the total mixed infections recorded in this study and 37.73% of the total infected samples ( Table 6). Our previous study [20] revealed that the most common combination of mixed infections in the southern United States was PRSV-W+WMV, followed by WMV+ZYMV and PRSV-W+ZYMV. In Oklahoma, 18.9% of samples were found to be infected with at least two viruses among 869 cucurbit samples [6]. A high incidence of mixed infections involving PRSV-W, WMV, and ZYMV has also been reported previously in different studies around the world: in the USA [6,20,22], Iran [12], Spain [13], Brazil [42], Panama [14], Venezuela [43], South Africa [44], Ivory Coast [45], Argentina [18], India [16], and France [19]. However, the proportion of mixed infections involving these three viruses in this study was higher than that in previous studies.
Mixed infections lead to not only symptom variation but also a change in the pattern of vector transmission and infectivity. In addition, the preference and fitness of vectors can also be changed differently by mixed virus infections compared to single infections [46]. They may also lead to virus-virus interactions within a host, resulting in new variants with novel features [47]. Mixed virus infections can be antagonistic [47][48][49][50], synergistic [42,[51][52][53], or neutral [53,54]. Additionally, the suppression of post-transcriptional expression by one virus helps in the replication of another virus, thereby forming a synergistic association [55]. Sometimes, the same plant can be co-infected with a different virus strain, which leads to cross-protection [56]. The cross-protection strategy has been used to control different viral species including potyviruses [57]. At this stage, it is not clear whether the mixed infections in this study were synergistic or not. The symptoms shown by the virus-infected cucurbit plants in mixed infections were more severe than in single infections, indicating the possibility of synergism. However, the gradual increase in ZYMV and the reverse in WMV seen in our survey from 2016 to 2018 raised suspicion regarding synergism. A recent study [39] conducted on mixed infections involving WMV and ZYMV in a greenhouse setting showed that ZYMV dominated WMV. ZYMV replicated at a similar rate in both single and mixed infections, whereas WMV replication sharply declined in combination with ZYMV due to in-plant competition. However, the close evaluation of the interaction of these viruses in the field is desired to reach a conclusion about the synergism and competition. Additionally, the role of another dominant virus, PRSV-W, should be determined in future studies.
There have been limited studies about the incidence of viruses infecting cucurbits lately, but this study indicates that the distribution of viruses in cucurbits changes over time. The vector population, the inoculum source, the susceptibility of cultivars to a virus, and environmental factors play an important role in the inconsistency of virus incidence at different time points and locations.
Despite being geographically distant, the CABYV BL-4 isolate was found to be evolutionarily closer to Asian isolates. Based on evolutionary analyses, CABYV was broadly classified in the Asian and Mediterranean subgroups by Shang et al., 2009 [58], and in four groups (Asian, Mediterranean, Taiwanese, and recombinant) by Kwak et al., 2018 [34]. The phylogenetic grouping of the isolates was loosely based on geographical location in both studies. Our study shows a similar result to the 2009 study [36] but with modified subgrouping due to the addition of other isolates from different parts of the world. The Asian group has expanded to include the US and PNG isolate as well ( Figure 5). Similarly, the European group clusters together with Brazilian isolates, with high bootstrap and posterior probability values in ML and BI phylogenetic trees. Hence, we propose the first group (G1) as the European and Brazilian isolates and the second group (G2) as the Asian and US isolates ( Figure 5).
In RNA viruses, including Luteoviridae, recombination is considered a main evolutionary force that shapes their diversity. The intergenic noncoding region (IR) of the Luteoviruses, in particular, is considered a hotspot for genome recombination [59,60]. This study further corroborates this evidence, as 11 out of 22 possible recombination events involved this region. Although none of the recombination detection methods in RDP4 found any evidence of the US isolate being a recombinant, it served as a major parental isolate for two Korean isolates: HD1 and HD118. The BL-4 isolate also serves as a minor parental isolate for the Taiwanese isolate C-TW2, with Sq-2003 being the major parent. Interestingly, the HD1 and HD118 isolates were detected as products of recombination between HS2 from Korea and CZ from China in a recent study [34]. The nucleotide positions of the recombination breakpoints detected for these isolates in this study were also different. The discrepancy might be due to the enhanced robustness in the analysis provided by the inclusion of 12 additional complete genomes in this study. In addition, the overrepresentation of South Korean isolates and the smaller number of isolates from other countries might have caused inconsistencies in the recombination and phylogenetic analysis.
In conclusion, our study has provided further knowledge about and meaningful insights into the epidemiology of various viruses and the fluctuation in the distribution and prevalence of these viruses over time. In addition, the re-emergence of known or emergence of newly reported viruses will also be important for future management studies for a particular crop and locality.

Survey Area and Sampling
Surveys of viruses in cucurbit crops were conducted during three growing seasons (2016-2018) in eight out of nine agricultural districts of Oklahoma (Figure 1), including one to three counties in each district. A total of 11 counties that include Cimarron (panhandle district), Blaine (west-central), Caddo (south-west), Payne (central), Carter, Atoka, Pontotoc (south-central), Tulsa (north-east), Muskogee, Cherokee (east-central), and McCurtain (south-east) were surveyed. Overall, there were more than 90 fields, ranging from those of a few acres with small-scale farms to large commercial fields up to 50 acres in area.
The six cucurbit hosts (cantaloupes, cucumbers, gourds, pumpkins, squashes, and watermelons), depending on their availability in the respective counties, were sampled. Young, fully expanded leaves showing typical virus-like symptoms (symptomatic) and some healthy leaves (asymptomatic) were randomly collected from different cucurbit hosts. The number of leaves sampled from each plant ranged from 2 to 4. The collected leaves were put in individual Ziploc plastic bags, labeled, brought to the laboratory on ice, and processed within 48 h of collection.

Sample Preparation for Dot-Immunobinding Assay
Samples were prepared and tested with the dot-immunobinding assay (DIBA) as described previously [6,61,62]. Approximately 100 mg of leaf tissue from each sample was crushed in an individual plastic bag containing one volume of phosphate-buffered saline (PBS), pH 7.4, at room temperature. The extracted sap was then centrifuged for 2-3 min to obtain a clear supernatant, and 2 µL of the supernatant was dotted and replicated on 10 nitrocellulose membranes (Bio-Rad laboratories, Hercules, CA, USA). Each nitrocellulose membrane was made to accommodate 121 samples (11 × 11). Leaf sap extracted from healthy squashes grown in the growth chamber was used as a negative control, while positive controls for each virus were obtained commercially (Nano Diagnostics, LLC, Fayettville, AR, USA and Agdia, Inc, Elkhart, IN, USA) and were also dotted on each membrane. After dotting, the membranes were air-dried for 10 min and stored at 4 • C until development.
During the membranes' development, all the blocking and antibody solutions were prepared in PBS, pH 7.4, and contained 600 mM glucose and 600 mM mannose. All incubations were performed at room temperature with gentle agitation on a rocking roller. After the incubation of the membranes with their respective virus-specific polyclonal antibodies, all the membranes were washed three times (10-15 min/wash) in in AP 7.5 (alkaline phosphatase buffer-100 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2 mM MgCl 2 , and 0.05% Triton X-100)-for 20 min each. After washing, the membranes were incubated in PBS for 1 h with a goat anti-rabbit IgG-alkaline phosphatase conjugate antibody (Southern Biotechnology, Birmingham, AL, USA). The membranes were washed again as described above. Before adding the substrate, all the membranes were briefly washed once in AP 9.5 (alkaline phosphatase buffer: 100 mM Tris-HCl, pH 9.5, 100 mM NaCl, and 5 mM MgCl 2 ). The membranes were incubated in AP 9.5 containing 0.33 mg of nitro blue tetrazolium (NBT) per mL and 0.17 mg of 5-bromo-4-chloro-3-indolyl phosphate (BCIP) per mL (VWR, USA) in the dark for 5 to 10 min. The reaction was stopped with 10 mM Tris-HCl, pH 7.5, containing 5 mM EDTA. Both positive and negative results were determined visually according to the intensity of the color in comparison with the positive and negative controls.

Statistical Analysis
Chi-square tests were used to determine if differences existed in the infection rates among the sample populations from all three years. An infected plant was defined as harboring any of the 10 viruses tested. The sample populations were based on the date of collection (month/year), location (counties), plant host (cantaloupes, cucumbers, pumpkins, squashes, or watermelons), and developmental stage (pre-flowering, fruiting, or harvesting). Samples from gourds were not included in the analysis due to the low sample population size (n = 3).
The differences in incidence rates among the sample populations were tested for each year, and different developmental stages within each plant species were analyzed. Spatial effects on the county level were also analyzed. For these analyses, the entire data set was used since sample populations were being combined based on the year of collection, species, or county. In each analysis, chi-square statistical tests were used.
Finally, we tested whether the infection of plants by multiple virus types could be explained as the result of random independent events. The multinomial target theory approach was used to generate the expected number, which was used in a chi-square test of whether the phenomenon of multiple virus infections was due to random, independent infection events. For each virus species, the probability that a plant would be infected was calculated by using the observed data without accounting for whether a plant was or was not infected with multiple viruses. These values were used to deduce the probabilities for each of the single infections (i.e., P 1 *(1−P 2 )*(1−P 3 )* . . . , the probability of no infection (i.e., (1−P 1 )*(1−P 2 )*(1−P 3 )* . . . ) and the probability of multiple virus infections (i.e., at least two infections, 1−P (no infection)-Σ P (infected only by virus i)).

Molecular Confirmation of Selected Viruses
The confirmation of the three potyviruses (PRSV-W, WMV, and ZYMV) was performed by reverse transcription-polymerase chain reaction (RT-PCR) in at least 10% of the DIBA-positive samples. For WMV, primers from the previous study [6] were used for amplifying the coat protein (CP) gene. For amplifying the PRSV and ZYMV CP genes, primers were designed using the primer3 software package available online, and the following primers were used (PRSVCPF: 5 -CTGATGATTATCAACTTGTT-3 ; PRSVCPR: 5 -TAAGGTGAAACAGGGTGGAG-3 ; ZYMVCPF: 5 -GAACAAGGAGACACTGTGAT-3 ; ZYMVCPR: 5 -GCAGCGAAACAATAACCTAG-3 ). The expected PCR-amplified product sizes of 963, 979, and 902 for PRSV-W, WMV, and ZYMV, respectively, included a part of the Nib gene, the full CP gene, and a part of the 3 untranslated region. A small portion (50-100 mg) of the infected leaf was used to extract total RNA by the TRI Reagent (Molecular Research Centre Inc, Cincinnati, OH, USA) method as described previously [6]. Five microliters of the RNA template was used for cDNA synthesis, and 1 µL of the cDNA was used for the PCR. The PCR conditions for ZYMV and PRSV were 94 • C for 2 min for initial denaturation, and then 30 cycles of 94 • C for 30 s, 50 • C for 40 s, and 72 • C for 40 s, with a final extension at 72 • C for 10 min. For WMV, the annealing temperature was 48 • C. The amplified PCR products were run in a 1% agarose gel, and expectedly sized DNA fragments were excised and gel purified (Qiagen, Hilden, Germany). Three microliters of each gel-purified product was ligated using the pGEM-T Easy Vector (Promega Corp, Madison, WI, USA) and transformed into Escherichia coli DH5α competent cells (New England Biolabs, Ipswich, MA, USA). Transformed cells were subjected to blue-white screening using Luria-Bertani agar (LBA), carbenicillin, isopropyl-thiogalactopyranoside (IPTG), and X-gal. Three to five clones were selected and sequenced in both directions by Sanger sequencing using an Applied Biosystems 3130 at the Department of Biological Science, the University of Tulsa, Oklahoma. Nucleotide sequences were analyzed using the basic local alignment search tool (BLAST).

Phylogenetic Analysis of CABYV Complete Genomes
A CABYV BL-4 isolate was collected during the 2017 growing season from Blaine County in Oklahoma and was completely sequenced as described previously [26]. The 45 published complete genome sequences of CABYV (Table S2) were retrieved from Gen-Bank (ncbi.nlm.nih.gov). These nucleotide sequences were aligned using ClustalX in MEGA7. The ML and BI trees were built with default parameters in MEGA7 and BEAST, respectively. Both trees were visualized in Figure version 1.4.3.

Recombination Detection
Recombination events, putative minor and major parents of recombinant isolates, and recombination breakpoints were analyzed using seven methods incorporated in RDP4 with default settings for all 45 complete genome sequences available in the GenBank database. Only recombination events with Bonferroni-corrected P-values less than the cut-off of 0.05 in three or more methods were regarded as recombinants to reduce the false detection of recombination. The recombination events that were indicated by RDP4 as "may be due to factor other than recombination" were also excluded. The breakpoint locations and recombination region were verified individually by using methods that detected the recombination.
Supplementary Materials: The following are available online at https://www.mdpi.com/2076-081 7/10/1/53/s1. Tables S1 and S2. Figure S1: Bayesian inference tree inferred in BEAST for comparison with ML tree. The posterior probability values of > 50 are shown at the respective nodes. The collapsed 22 South Korean isolates are denoted by a dark triangle. The phylogenetic grouping is shown on their respective branches. The US isolate is shown in red color. Isolates from phylogroup G1a are shown in blue, G1b in turquoise G2a (except BL-4) in black, G2b in green, and G2c in pink color.