Interspecies Recombination-Led Speciation of a Novel Geminivirus in Pakistan

Recombination between isolates of different virus species is a known source of speciation. Weeds serve as mixing vessels for begomoviruses, which infect a wide range of economically important plants, thereby facilitating recombination. Chenopodium album is an economically important weed distributed worldwide. Here, we present the molecular characterization of a novel recombinant begomovirus identified from C. album in Lahore, Pakistan. The complete DNA-A genome of the virus associated with the leaf distortion observed in infected C. album plants was cloned and sequenced. DNA sequence analysis showed that the nucleotide sequence of the virus shared 93% identity with those of the rose leaf curl virus and the duranta leaf curl virus. Interestingly, this newly identified virus is composed of open reading frames (ORFs) from different origins. Phylogenetic networks and complementary recombination detection methods revealed extensive recombination among the sequences. The infectious clone of the newly detected virus was found to be fully infectious in C. album and Nicotiana benthamiana, as the viral DNA was successfully reconstituted from systemically infected tissues of inoculated plants, thus fulfilling Koch’s postulates. Our study reveals the speciation, through recombination, of an emergent ssDNA plant virus associated with C. album, for which we propose the tentative name ‘Chenopodium leaf distortion virus’ (CLDV).


Introduction
Interspecific interactions play a primary role in the diversification and organization of life [1]. Numerous recurrent formations of allopolyploid species have been reported in the plant and animal kingdoms [2,3]. Genetic recombination is a major source of genetic variability in viruses and creates new opportunities for viruses to overcome selection pressures [4][5][6][7]. The expansion of viral host ranges, alteration of transmission vector specificities, and increases in virulence and pathogenesis are associated with recombination [8][9][10]. Recombination between isolates of different species has been widely reported as a source of speciation [11]. Among the DNA viruses, the role of recombination in the formation of new species is well documented for geminiviruses (family: Geminiviridae) [12][13][14]; recombination therefore plays an essential role in geminivirus diversification and evolution [15][16][17].
The family Geminiviridae includes some of the most damaging plant pathogens that affect a wide host range and cause economic losses throughout the world [18,19]. These plant-infecting viruses have very compact monopartite (DNA-A) or bipartite (DNA-A and DNA-B) genomes [20][21][22]. Geminiviruses infect both monocots and dicots [23] and are

Sample Collection and Virus Detection
In June 2018, during a routine survey to record begomoviruses other than the cotton-infecting viruses in Pakistan, leaf distortion was observed in C. album weeds growing in a residential area of Lahore, Pakistan (Figure 1A,B). Leaf tissues from three plants were collected and stored at −20 °C until processing. Whiteflies, the insect vectors of begomoviruses, were observed on all symptomatic plants. Total nucleic acid was extracted from the samples using the Viral Gene-spin Viral DNA/RNA Extraction Kit (iNtRON Biotechnology, Inc., Seongnam, Korea) following the manufacturer's instructions.
Circular DNA was amplified from the extracted total DNA by rolling circle amplification (RCA) (TempliPhi Amplification Kit; GE Healthcare Life Sciences, Uppsala, Sweden) and then digested with the restriction enzymes BamHI, XhoI, HindIII, and EcoRV (TaKaRa Bio, Inc., Shiga, Japan) [42,43]. The RCA products digested with each of these enzymes were visualized by gel electrophoresis and determined to be approximately 2.7 kb in size. In addition to RCA, the presence of begomoviruses was confirmed by PCR amplification of the coat protein (CP) and replication enhancer protein (REn) regions using the begomovirus-specific primers Beg-F (5′-CCGTGCTGCTGCCCCCATTGTCCGCGTCAC-3′) and Beg-R (5′-CTGCCACAACCATGGATTCACGCACAGGG-3′), with a target size of about 1.1 kb [44]. The amplicons from both RCA and PCR were cloned into the pGEM-3Zf (+) vector (Promega Corporation, WI, USA) and sequenced by a commercial sequencing service, Macrogen (Seoul, Korea). Sequence contigs were assembled and analyzed using BLASTn and BLASTx [45]. We also attempted to detect satellite molecules (alphasatellite and betasatellite) and DNA-B by PCR using universal primers [46,47]. Southern hybridization analysis was conducted to confirm replication of the newly detected virus in the samples using a modified version of the method of Southern et al. [48,49].
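As an illustrative aside, the two begomovirus-specific primers quoted above can be checked for basic properties before use. The following is a minimal sketch, not part of the original protocol; the function names `gc_fraction`, `reverse_complement`, and `primer_summary` are hypothetical helpers introduced here for illustration.

```python
# Minimal sketch (not the authors' pipeline): basic checks on the Beg-F/Beg-R
# primers quoted in the text. All function names are illustrative assumptions.

PRIMERS = {
    "Beg-F": "CCGTGCTGCTGCCCCCATTGTCCGCGTCAC",
    "Beg-R": "CTGCCACAACCATGGATTCACGCACAGGG",
}

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def primer_summary(primers: dict) -> dict:
    """Map each primer name to its (length, GC fraction)."""
    return {name: (len(seq), gc_fraction(seq)) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, (length, gc) in primer_summary(PRIMERS).items():
        print(f"{name}: {length} nt, GC = {gc:.2f}")
```

The reverse complement of the reverse primer is what one would search for on the sequenced strand when locating the end of the expected 1.1 kb amplicon.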

Infectious Clone Construction and Infectivity
An infectious clone (1.1 mer) of the detected virus was constructed to check its infectivity in the host plants (Supplementary Figure S1). Two partial genomes flanked by restriction sites, i.e., SpeI/BamHI and BamHI/XbaI, respectively, were amplified using primer sets based on the obtained sequence and were ligated into the pGEM-T Easy vector (Promega, Madison, WI, USA) using the TA cloning technique according to the manufacturer's instructions. This was followed by sequencing (Macrogen, Seoul, Korea) and restriction digestion using the specific enzymes. The two partial genomes were introduced into the pCAMBIA1303 vector, which was first transformed into competent Escherichia coli strain DH5α using the heat shock method and then into Agrobacterium strain GV3101. Agrobacterium GV3101 cultures (both transformed and untransformed) were grown in LB broth in the presence of a pCAMBIA1303 selection antibiotic, such as kanamycin (50 mg/L), and strain-specific selection antibiotics, such as gentamicin and rifampicin (50 mg/L), at 28 °C with agitation for 30 h (until the OD value at 600 nm was 0.8-1.0). Untransformed Agrobacterium GV3101 carrying no plasmid was used as the mock. Agro-inoculation was performed by the pin-pricking method [50] in approximately 4- and 6-week-old N. benthamiana and C. album plants, respectively. Leaf tissue samples were collected from mock and infected plants 28 days post-inoculation (dpi) to check infectivity by PCR using the primers CLDV-IC1/F (5′-ACTAGTTTTGGCAATCGGTGTCTCAC-3′) and CLDV-IC1/R (5′-GGATCCACACTCGTTTACATCC-3′), specifically designed to amplify IC1 (intergenic region, movement protein, and coat protein) of the infectious clone of CLDV, with a target size of 1.4 kb. The infection caused by CLDV in N. benthamiana and C. album plants was further confirmed by Southern blot hybridization.
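The cloning strategy relies on restriction sites carried at the 5′ ends of the primers. As a hedged illustration, the sketch below scans the two IC1 primers for the recognition sequences of SpeI (ACTAGT), BamHI (GGATCC), and XbaI (TCTAGA); the helper name `find_sites` is an assumption made for this example and is not taken from the original work.

```python
# Minimal sketch: locate restriction-enzyme recognition sites in a primer.
# Recognition sequences are the standard ones for SpeI, BamHI and XbaI;
# the function name is a hypothetical helper for illustration only.

SITES = {"SpeI": "ACTAGT", "BamHI": "GGATCC", "XbaI": "TCTAGA"}


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Return {enzyme: [0-based start positions]} for enzymes found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping sites are not missed.
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    ic1_primers = {
        "CLDV-IC1/F": "ACTAGTTTTGGCAATCGGTGTCTCAC",
        "CLDV-IC1/R": "GGATCCACACTCGTTTACATCC",
    }
    for name, primer in ic1_primers.items():
        print(name, find_sites(primer))
```

In this reading, the forward primer begins with the SpeI site and the reverse primer begins with the BamHI site, consistent with the SpeI/BamHI fragment described for IC1.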

Target-Specific Primer Construction and PCR Processing
To explore the intriguing genomic composition of the detected virus, i.e., Chenopodium leaf distortion virus (CLDV), primers were designed for each putative parental virus against an ORF other than the one it shares with the new virus genome. For example, the ageratum enation virus (AEV) was found to share the highest identity with the Rep and C4 proteins of CLDV, so new primers were designed to target the gene encoding the CP of AEV to confirm whether the detected virus was a separate entity. Similarly, primers were designed to target the genes encoding the Rep proteins of the rose leaf curl virus (RLCV), duranta leaf curl virus (DLCV), papaya leaf crumple virus (PaLCrV), and catharanthus yellow mosaic virus (CYMV) (Table 1).

Nucleotide Sequences and Phylogenetic Analysis
A total of 56 full-length DNA-A sequences of AEV (30), RLCV (5), CYMV (6), PaLCrV (9), and DLCV (5) were used in this study, including the sequence of CLDV (Supplementary Table S1). These viruses were selected because BLASTn analysis showed that they share sequence identity with CLDV, although the percentage varies. All sequences were retrieved from the GenBank database (www.ncbi.nlm.nih.gov (accessed on 23 November 2021)) and were aligned at the nicking site in the nonanucleotide motif at the origin of replication (5′-TAATATT//AC-3′). All multiple-sequence alignments were constructed using the MUSCLE method as implemented in MEGA X [51] and then corrected manually. Phylogenetic tree construction was performed using MrBayes version 3.2.7a provided by the CIPRES server [52]. In addition, distances were corrected with the best-fit model estimated with jModelTest v2.1.6 on XSEDE on the CIPRES Gateway [52][53][54]. Phylogenetic trees were visualized and edited in iTOL using a Newick file generated through FigTree [55]. The full-length genome sequences of these top hits, aligned with the MUSCLE algorithm, were subjected to pairwise comparison using SDT v1.2 [56].
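Because geminivirus genomes are circular, GenBank records can start at arbitrary positions, and aligning at the nicking site amounts to rotating each sequence to a common origin. The following minimal sketch shows one way to do this, assuming the motif is present on the given strand and that the first base after the nick (the A of "AC") becomes position 1; the function name `rotate_to_nick` is a hypothetical helper, not a tool used in the study.

```python
# Minimal sketch: rotate a circular genome so it starts right after the nick
# in the nonanucleotide TAATATT//AC. Assumes the motif occurs on the strand
# supplied; the function name and error handling are illustrative choices.

NONANUCLEOTIDE = "TAATATTAC"
NICK_OFFSET = 7  # the nick falls after TAATATT, i.e. 7 bases into the motif


def rotate_to_nick(genome: str) -> str:
    """Return the circular genome rewritten to begin at the base after the nick."""
    seq = genome.upper()
    # Doubling the sequence lets us find a motif that spans the linear ends.
    doubled = seq + seq
    idx = doubled.find(NONANUCLEOTIDE)
    if idx == -1 or idx >= len(seq):
        raise ValueError("nonanucleotide motif not found on this strand")
    start = (idx + NICK_OFFSET) % len(seq)
    return seq[start:] + seq[:start]


if __name__ == "__main__":
    toy = "GGGTAATATTACCCC"  # toy circular sequence, not a real genome
    print(rotate_to_nick(toy))
```

After rotation, every sequence begins with "AC" and ends with "TAATATT", so homologous positions line up before multiple-sequence alignment.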

Recombination Analysis
Putative recombinants and major and minor "parents" within the datasets were determined using the RDP, GeneConv, Bootscan, MaxChi, Chimaera, SiScan, and 3Seq recombination detection methods implemented in the RDP4 v4.100 suite [57]. In RDP4, the major parent and minor parent are the presumed parents contributing the larger and the smaller fraction of the sequence, respectively. All methods were run with default settings and a p-value cutoff of 0.05.

Nucleotide Diversity and Haplotype Variability Indices
The average pairwise number of nucleotide differences per site (nucleotide diversity, π) was estimated using DnaSP version 6.12.03 (Librado and Rozas 2009, Universitat de Barcelona, Barcelona, Spain). DnaSP version 6.12.03 was also used to calculate the number of haplotypes (H), the number of segregating sites (S), haplotype diversity (Hd), Tajima's D [58], and Fu and Li's F [59]. Nucleotide diversity estimates the average pairwise differences among sequences, while haplotype diversity reflects the frequency and number of haplotypes in the population. Tajima's D test is based on the difference between the number of segregating sites and the average number of nucleotide differences, whereas Fu and Li's F test is based on the difference between the number of singletons and the average number of nucleotide differences between every pair of sequences. Statistically significant differences among the mean nucleotide diversity of all datasets were estimated and represented using GraphPad Prism version 8.0 (Harvey Motulsky 1989, Dotmatics, CA, USA).
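To make the definitions of S, π, H, and Hd concrete, the following is a minimal sketch that computes them from a small alignment. It assumes a gap-free alignment of equal-length sequences and uses the standard estimator Hd = n/(n − 1) × (1 − Σ p_i²); it is not a substitute for DnaSP, and the function name `diversity_indices` is a hypothetical helper introduced here.

```python
# Minimal sketch of the summary statistics described in the text.
# Assumes a gap-free alignment; real tools treat gaps and missing data
# explicitly. The function name is an illustrative assumption.
from collections import Counter
from itertools import combinations


def diversity_indices(alignment):
    """Return S, pi, H and Hd for a list of aligned, equal-length sequences."""
    n = len(alignment)
    if n < 2:
        raise ValueError("need at least two sequences")
    seqs = [s.upper() for s in alignment]
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    # S: alignment columns showing more than one nucleotide state.
    segregating = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # pi: mean number of pairwise differences, divided by alignment length.
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    pi = total_diffs / (n_pairs * length)

    # H and Hd: number of distinct sequences and haplotype diversity.
    counts = Counter(seqs)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    return {"S": segregating, "pi": pi, "H": len(counts), "Hd": hd}


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "ACGT", "TCGA"]  # toy data, not from the study
    print(diversity_indices(toy))
```

Tajima's D and Fu and Li's F compare such quantities (S or the number of singletons against the mean pairwise differences) after normalization, which is omitted here for brevity.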

Estimation of Gene Genealogies through TCS
The genealogical networks produced by the method of Templeton, Crandall, and Sing (TCS) identify both the relationships among the different sequences and the number of nucleotide substitutions connecting them. All sequences of AEV, RLCV, and CYMV, the moieties identified in the novel recombinant begomovirus CLDV, along with the highly identical DLCV and PaLCrV, were analyzed using statistical parsimony with the program TCS (v. 1.21) implemented in the software Population Analysis with Reticulate Trees (POPART) [60,61].

Detection of a New Virus in C. album
Sequencing of cloned DNA fragments following the digestion of the RCA product with BamHI confirmed the presence of a new begomovirus species, i.e., chenopodium leaf distortion virus (CLDV), in all samples (Figure 1C). The complete nucleotide sequence of DNA-A of CLDV was deposited in GenBank under the accession number MN423112. The expected amplicon size of 1.1 kb targeting the CP and REn regions was observed by PCR in all symptomatic samples (Figure 1D), yielding the same sequencing results as MN423112. NCBI basic local alignment search tool (BLASTn) analysis revealed 93% identity of the nucleotide sequence of the newly identified virus (2.7 kb) with those of the rose leaf curl virus (RLCV; MN746285) and the duranta leaf curl virus (DLCV; MN537564), respectively. No satellite molecules or DNA-B were detected in association with the DNA-A using universal primers. The Southern blot hybridization assay confirmed replication of the viral DNA in C. album, producing a distinct, specific band of the expected full-length genome size (Figure 1E).
Viruses 2022, 14, 2166

Genome Organization and Homology Analysis of Genes
DNA-A of CLDV contained six ORFs following the Old World organization. Amino acid-level sequence analysis of each ORF using BLASTx showed that AC1, which encodes the Rep protein, shared 91% identity with AC1 of AEV (AGO59951), and AC4 shared 92% identity with AC4 of AEV (AGO59954). Similarly, AC2 (TrAP) and AC3 (REn) were 99% identical to AC2 (QAY29069) and AC3 (QAY29070) of RLCV, respectively. The ORF AV1 (CP) was 100% similar to AV1 of CYMV (YP_009112873), and AV2 showed 100% identity with AV2 of RLCV (ADU33215) (Figure 2A; Table S2). In the overlapping regions CP/MP (∆1) and Rep/TrAP (∆2), RLCV showed higher identity to CLDV than did CYMV and AEV, respectively (Supplementary Figure S3). These similarities and differences between the DNA-A of the newly identified virus and the reference sequences make a strong case for CLDV as a separate new species.

Infectivity through Infectious Clone Inoculation
Mild symptoms were observed in both groups of CLDV-infected plants, i.e., N. benthamiana and C. album. Leaf tissues were harvested and analyzed by PCR to investigate viral replication ability; the virus was detected in the samples, which confirmed its presence (Figure 3A-E). CLDV (1.4 kb amplicon) was successfully detected by PCR in all three C. album samples and all three N. benthamiana samples. Sample no. 3 of N. benthamiana (Lane 6 in Figure 2E) shows a faint band, which might be due to poor inoculation or another experimental error resulting in a lower virus titer; nevertheless, sequencing confirmed that CLDV was reconstituted in this sample. The virus reconstituted in C. album and N. benthamiana maintained the exact nucleotide sequence of the original clone. PCR using vector-specific primers gave negative results, indicating that the amplified product derived from the replicating virus itself rather than from residual vector construct in the plant tissue. Southern blot hybridization data also confirmed the viral infection (Figure 3F).

Target-Specific Primer Construction and PCR Analysis
To explore the intriguing genomic composition of the detected virus, primers were constructed for each putative parental virus against an ORF other than the one it shares with the new virus genome (Table 1). In all cases, the results were negative, showing the presence of only CLDV in the host samples (Supplementary Figure S4).

Phylogenetic and Recombination Analysis
To assess the evolutionary relatedness among these populations, we performed molecular phylogenetic analysis of CLDV and closely associated viruses using full-genome sequences (the sequences included in the analysis are listed in Supplementary Table S1). CLDV was found to share a clade with RLCV (Figure 4A). Pairwise sequence comparison with the Sequence Demarcation Tool version 1.2 (SDT v1.2) showed that CLDV has a maximum identity of 93% with RLCV and DLCV isolates (Figure 4B). Recombination analysis detected recombination events in the genome of CLDV. RDP analysis indicated that CLDV is probably a recombinant genome that originated through recombination between isolates of CYMV (GenBank MH643737; 86% similar), AEV (KC795968; 88% similar), and RLCV (GenBank GQ478342; 93% similar). The recombinant region spans nucleotide coordinates 2280-1059, covering the AC1, AC4, IR, V2, and CP genes. The recombination event was supported by a low p-value of 2.98 × 10⁻¹⁵, detection by the maximum number of recombination detection methods, i.e., RGBMCS3, and an acceptable R score of 0.47 (Figure 5 and Supplementary Table S3).
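The recombinant coordinates 2280-1059 run from a higher to a lower position because the region wraps around the origin of the circular genome. The sketch below illustrates how such an interval can be handled; the genome length of 2750 nt used in the example is a hypothetical placeholder (the text only states approximately 2.7 kb), and the function names are assumptions made for illustration.

```python
# Minimal sketch: intervals on a circular genome with 1-based, inclusive
# coordinates. HYPOTHETICAL_GENOME_LENGTH is a placeholder, not the real
# CLDV length, which the text gives only as approximately 2.7 kb.

HYPOTHETICAL_GENOME_LENGTH = 2750


def circular_span(start: int, end: int, genome_length: int) -> int:
    """Length of the region start..end, wrapping past the origin if needed."""
    if start <= end:
        return end - start + 1
    return (genome_length - start + 1) + end


def position_in_region(pos: int, start: int, end: int) -> bool:
    """True if pos lies inside start..end on the circular genome."""
    if start <= end:
        return start <= pos <= end
    # Wrapped region: from start to the last base, then from base 1 to end.
    return pos >= start or pos <= end


if __name__ == "__main__":
    span = circular_span(2280, 1059, HYPOTHETICAL_GENOME_LENGTH)
    print(f"Region 2280-1059 covers {span} nt under the assumed length")
    print(position_in_region(5, 2280, 1059))     # near the origin
    print(position_in_region(1500, 2280, 1059))  # outside the region
```

A wrapped interval of this kind is what allows a single event to cover genes on both sides of the intergenic region.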

Estimation of Genealogies through TCS
The TCS method is a valuable tool for analyzing species or genes at the population level. TCS calculations revealed that most of the isolates have accumulated a considerable number of mutations relative to each other. CLDV is a species that has arisen from recombination between RLCV, DLCV, and CYMV. Although the CLDV nucleotide sequence has high identity with those of RLCV, the genealogical network analysis placed the CLDV genome between RLCV and DLCV (Figure 7).

Discussion
Geminiviruses can adapt and evolve quickly as a result of genome-associated changes and recombination events [62,63]. Such recombination events, which can enhance virulence, have also been documented among members of the genus Begomovirus [64][65][66][67]. Recombination and recurrent mutation can occur in all plant viruses. In this study, we characterized a new recombinant virus (tentatively named CLDV) with ORFs originating from three different viruses: AEV, RLCV, and CYMV (Figure 2). There are no known begomovirus species with such diverse origins. The C. album samples with leaf distortion symptoms processed in this study were collected in Lahore, Pakistan, a region where RLCV, DLCV, AEV, and CYMV had previously been detected in various hosts [44][68][69][70]. As all viruses that constitute CLDV have been reported in the same region, we considered three possible scenarios of CLDV speciation: (i) whiteflies acquired the aforementioned viruses from different hosts and transmitted them to C. album, which might have acted as a mixing vessel facilitating interspecific recombination; (ii) these viruses recombined in another host, and the recombinant was then transmitted to C. album by whiteflies; or (iii) the interspecies recombination took place inside the insect vector, which in this case is the whitefly. To investigate the first scenario, we designed target-specific primers for each parental virus against an ORF other than the one it shares with CLDV (Table 1). PCR amplification gave no positive results, which argues against the first scenario, i.e., the parental viruses co-infecting and recombining within the sampled C. album plants (Supplementary Figure S4). For the second scenario, we collected samples of several other hosts, i.e., Vinca rosea, Ficus virens, Duranta repens, Rosa indica, Cestrum nocturnum, etc., from the areas surrounding the location of the CLDV-infected C. album samples and processed them by PCR using begomovirus-specific primers as well as the target-specific primers mentioned above, but could not find CLDV in any of them (data not shown). Another host of CLDV may have been missed during sample collection. Attempts to detect CLDV in the insect vector, i.e., whiteflies, did not succeed either and need further investigation, but the presence of whiteflies on C. album plants supports CLDV being a whitefly-vectored begomovirus.
Since genetic variation influences viral emergence, evolution, and vector transmission [4], we investigated the existing genetic diversity of each related virus, i.e., AEV, RLCV, CYMV, PaLCrV, and DLCV, to determine the extent of genomic variation in these datasets. The average pairwise number of nucleotide differences per site (nucleotide diversity, π), the number of haplotypes (H), the number of segregating sites (S), haplotype diversity (Hd), Tajima's D value [58], and Fu and Li's F value [59] were calculated for these viruses using DnaSP version 6.12.03 (Librado and Rozas 2009, Universitat de Barcelona, Barcelona, Spain), as shown in Figure 6 and Supplementary Table S4. Although we found genetic diversity within each of these viruses, the datasets differ in the number of sequences (owing to the scarcity of sequences reported in NCBI GenBank for some viruses), so we cannot conclude that the results are reliable, specifically with regard to CLDV. Nevertheless, genetic diversity was observed in all of these CLDV-related viruses, which supports the possibility that CLDV was derived from these viruses.
The current genomic composition of the virus cannot be considered original, as recombination events were clearly identified during recombination analysis (Figure 5). Phylogenies are useful tools for establishing genealogical relationships among organisms or their parts (e.g., genes) [60]. Phylogenetic analysis with MrBayes highlighted the evolutionary relatedness among the viruses and placed CLDV between RLCV and DLCV (Figure 4A,B). Alongside this traditional method of phylogenetic analysis, an alternative approach, TCS [60], was used to estimate gene genealogies at the population level, and it showed the same result (Figure 7). On the basis of all these evaluations, we believe that RLCV is the parent virus here and acquired the CP of CYMV and the Rep and C4 of AEV through separate recombination events, as shown in Figure 8. Given the absence of the other component, i.e., DNA-B, and of any associated satellite molecules, the novel begomovirus CLDV can be considered a monopartite begomovirus; Koch's postulates were fulfilled by successfully reconstituting the virus from the host after agro-inoculation (Figure 3).
Based on the general ICTV demarcation criteria, the newly detected virus (CLDV) should be categorized as a new isolate of RLCV (93% sequence identity between them). However, the ICTV report clearly describes exceptions in the case of recombinant viruses [29], such as tomato yellow leaf curl Malaga virus and tomato yellow leaf curl Axarquia virus, which have ≥91% identity to both parental viruses (tomato yellow leaf curl virus and tomato yellow leaf curl Sardinia virus); this would cause both parental species to merge into a single species, despite the fact that all isolates of the parental viruses have <91% identity to each other. Following this rule, CLDV, which shows >91% identity to both RLCV and DLCV, is categorized as a new species rather than proposing the merger of these parental species (88% identical) into a single species.
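The demarcation logic discussed above can be illustrated with a short sketch. It assumes pairwise identity is computed over alignment columns in which neither sequence has a gap, and it uses the 91% threshold cited in the text; the function names `pairwise_identity` and `bridges_two_species` are hypothetical helpers and do not reproduce SDT or any ICTV tool.

```python
# Minimal sketch of the species-demarcation reasoning in the text.
# The 0.91 threshold comes from the passage above; the identity calculation
# (gapped columns ignored) and all function names are assumptions.

SPECIES_THRESHOLD = 0.91


def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of matching bases over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


def bridges_two_species(id_to_first: float, id_to_second: float,
                        id_between_parents: float,
                        threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when an isolate meets the threshold against two viruses that
    themselves fall below it, i.e. the recombinant exception case."""
    return (id_to_first >= threshold
            and id_to_second >= threshold
            and id_between_parents < threshold)


if __name__ == "__main__":
    # Values quoted in the text: CLDV vs RLCV and DLCV (93%), RLCV vs DLCV (88%).
    print(bridges_two_species(0.93, 0.93, 0.88))
```

With the identities reported here, the check returns True, which is the situation in which strict application of the threshold would otherwise merge the two parental species.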
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v14102166/s1, Supplementary Figure S1: Schematic diagram of IC construction of CLDV. The 1.1 mer IC was constructed by the addition of restriction enzyme sites, i.e., SpeI at the start of IC1 and XbaI at the end of IC2. BamHI is the common point of digestion (end of IC1; start of IC2) existing naturally in the sequence. The sequences of both IC1 and IC2 are also shown in the box on the right side. Restriction enzyme sites in the sequences (IC1 and IC2) are shown in bold letters. Both IC1 and IC2 were ligated with digested pCambia-1303, followed by transformation into Agrobacterium strain GV3101. Supplementary Figure S2. Alignment of each ORF of CLDV with its respective identical ORF. AEV was aligned with CLDV in the cases of AC1 and AC4. RLCV was aligned with CLDV in the cases of AC2, AC3, and AV2. In the case of AV1, CYMV was aligned with CLDV. Supplementary Figure S3. Analysis of the common region between the ORFs: (∆1) coat protein-movement protein and (∆2) replication protein-transcriptional activator protein. The common region of CLDV ORFs was compared with their respective counterparts, i.e., RLCV and CYMV in the case of ∆1; RLCV and AEV in the case of ∆2. Analysis was done at both the nucleotide and amino acid levels. In the case of ∆1, RLCV showed more identity with CLDV than CYMV. In the case of ∆2, RLCV showed more identity to CLDV than AEV. The asterisk in red color (*) shows the variations in the nucleotides, whereas the asterisk in black color (*) shows the variations in the comparison at the amino acid level. Supplementary Figure S4