Molecular Variability and Host Distribution of ‘Candidatus Phytoplasma solani’ Strains from Different Geographic Origins

Knowledge of phytoplasma genetic variability is a tool to study their epidemiology and to implement effective monitoring and management of their associated diseases. ‘Candidatus Phytoplasma solani’ is associated with “bois noir” disease in grapevines, and with yellowing and decline symptoms in many plant species, causing serious damage during epidemic outbreaks. The epidemiology of the diseases associated with this phytoplasma is complex and related to numerous factors, such as interactions between the host plants and insect vectors and spread through infected plant propagation material. The genetic variability of ‘Ca. P. solani’ strains in different host species and in different geographic areas during the last two decades was studied by RFLP analyses coupled with sequencing of the vmp1, stamp, and tuf genes. A total of 119 strains were examined, 25 molecular variants were identified, and the variability of the studied genes was linked to both geographic distribution and year of infection. The crucial question in ‘Ca. P. solani’ epidemiology is how to trace back the epidemic cycle of the infections. This study presents some relevant features of differential strain distribution that are useful for disease monitoring and forecasting, illustrating and comparing the phytoplasma molecular variants identified in various regions, host species, and time periods.


Introduction
"Bois noir" (BN), the most widespread grapevine yellows disease, represents a worldwide threat to viticulture. It is associated with the presence of 'Candidatus Phytoplasma solani' [1], an obligate cell-wall-lacking bacterium that belongs to the class Mollicutes and is transmitted by polyphagous phloem-feeding insects [2,3]. It is enclosed in the 16SrXII-A ribosomal subgroup and associated also with the "stolbur" disease in vegetable crop species, mostly belonging to the Solanaceae (tomato, potato, and pepper) and Apiaceae (carrot, celery, and parsley) families [4,5]. Due to its complex ecology comprising diverse insect vectors and a broad range of host plant species, it is difficult to design effective strategies for the management of both "bois noir" and "stolbur" diseases. Insect vectors represent one of the critical points in the spread of this phytoplasma. The polyphagous cixiid Hyalesthes obsoletus Signoret transmitting the phytoplasma is ubiquitous in Europe to a wide range of wild and cultivated plants [6][7][8][9][10], which in return represent a reservoir of the pathogen in and outside the cultivated fields. Reptalus panzeri and R. quinquecostatus have been reported as vectors of BN in Serbian and French vineyards, respectively [11,12], while Anaceratagallia ribauti was reported as vector of "stolbur" to broad bean plants [13]. Other studies described the ability of R. panzeri collected in maize fields with a reddening disease to transmit 'Ca. P. solani' to grapevine plants [11]. Recent transmission trials conducted with insects collected in a Northern Italy vineyard showed that at least eight insect species (Aphrodes makarovi, Dicranotropis hamata, Dictyophara europaea, Euscelis incisus, Euscelidius variegatus, Laodelphax striatellus, Philaenus spumarius, and Psammotettix alienus/confinis) are vectoring BN [14], therefore confirming the complex epidemiology of 'Ca. P. solani'-associated diseases [1]. 
Moreover, a large genetic diversity was described for this phytoplasma after grapevine-infecting strain molecular characterization on multiple genes (i.e., tuf, secY, vmp1, and stamp), highlighting the presence of many genetic lineages or variants [15][16][17], and of a positive selective pressure determining the strain population complexity in different vineyard agroecosystems [18]. One of the first genes used for epidemiological studies is the housekeeping gene tuf (elongation factor Tu), of which four variants and several subvariants (tuf types) were described in Europe [19][20][21][22][23]. 'Ca. P. solani' tuf type b1 was mainly identified in Hyalesthes obsoletus and Convolvulus arvensis, Vitex agnus-castus and Crepis foetida, and Reptalus panzeri [20][21][22]. Tuf types a and b2, harbored by Urtica dioica, are reported to be only transmitted by H. obsoletus. A tuf type c was erratically detected in hedge bindweed (Calystegia sepium) in a restricted area of Germany [19,24]; a tuf type b3 variant was reported in vineyards in the Republic of Azerbaijan [25]; and a tuf type d was very recently described in Serbia, in a few crop species (sugar beet, parsnip, and parsley) [23].
Molecular epidemiological studies focused on the distribution of BN and "stolbur" strains in their hosts (plants and insects) increased the knowledge about their transmission in vineyard agroecosystems and natural environments. Recently, the use of several molecular markers suggested the possibility of differentiating BN strains according to their differential virulence in grapevine plants [17,26]. However, in most cases molecular markers are used in combination to resolve epidemic cycles at a regional level [11,20,21,24,27,28,29,30].
Multilocus sequence typing (MLST), based on molecular characterization of more variable genes such as vmp1 and stamp, evidenced a large variability among 'Ca. P. solani' strains within the tuf types [17,31]. Molecular approaches using vmp1- and stamp-based molecular markers increased the knowledge of the population structures and dynamics of this phytoplasma [17,32] and of its transmission routes throughout vineyards and their surroundings [21,33].
In the present study, a characterization of 'Ca. P. solani' strains collected in the last two decades in different European regions and from different host species was carried out by RFLP analyses and sequencing of tuf, stamp, and vmp1 genes to verify possible correlation between the variants and the disease outbreaks towards designing focused monitoring and control strategies.

Amplification of 'Ca. P. solani' Strains
The genes tuf, stamp, and vmp1 were studied for molecular differentiation of the 'Ca. P. solani' strains used. All the PCRs were performed in a final volume of 25 µL containing 12.5 µL of PCR Master Mix (2X) (Fermentas, Lithuania; 0.05 U/µL Taq DNA polymerase, reaction buffer, 4 mM MgCl2, and 0.4 mM of each dNTP), 10.5 µL of sterile distilled water (SDW), 0.5 µL of each primer at 20 pmol/µL (final concentration 0.4 µM), and 1 µL of DNA template (20 ng). Positive controls were used in all PCR amplifications. Samples containing SDW as a template were used as negative controls in both the PCR and nested PCR assays.
The tuf gene was amplified using the primer pairs fTuf1/rTuf1 and fTufAy/rTufAy in nested PCR [37]. The stamp gene was amplified with primers StampF and StampR0 and the nested primers StampF1 and StampR1 following the described reaction conditions [15]. The vmp1 gene was amplified with H10F1/R1 [38], followed by nested PCR with the TYPH10F/R primer pair [39]. A 6 µL aliquot of each PCR product was separated by electrophoresis through a 1% agarose gel, stained with ethidium bromide, and visualized with a UV transilluminator, with a 1 kb DNA ladder (Bioline, England) as a marker.
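The reaction composition stated above can be checked with a short calculation. The following is a minimal sketch in Python, not part of the original protocol: the function name and the dictionary layout are illustrative assumptions, and only the volumes and stock concentrations come from the text.

```python
# Reaction components as stated in the text (volumes in µL).
components = {
    "PCR Master Mix (2X)": 12.5,
    "SDW": 10.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "DNA template": 1.0,
}

total_volume = sum(components.values())


def final_concentration(stock, volume_uL, total_uL):
    """Dilution rule C1*V1 = C2*V2; the result has the same unit as `stock`."""
    return stock * volume_uL / total_uL


# 20 pmol/µL is numerically the same as 20 µM, so the result is in µM.
primer_uM = final_concentration(20.0, components["forward primer"], total_volume)

# The 2X master mix is diluted to 1X when it makes up half of the reaction.
master_mix_fold = 2.0 * components["PCR Master Mix (2X)"] / total_volume

print(f"total volume: {total_volume} µL")
print(f"primer final concentration: {primer_uM} µM")
print(f"master mix final strength: {master_mix_fold}X")
```

With the stated volumes, the components add up to the 25 µL reaction and each primer ends up at 0.4 µM, in agreement with the values given in the text.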

Restriction Fragment Length Polymorphism (RFLP) Analyses
RFLP analyses of the tuf, stamp, and vmp1 gene amplicons were performed using HpaII, Tru1I, and RsaI restriction enzymes, respectively. All the enzymes were from Thermo Fisher, Lithuania, and were used according to the manufacturer's instructions. The restriction products were separated by electrophoresis in 6.7% or 8% polyacrylamide gel, stained, and visualized as described above, using the ΦX174/HaeIII DNA ladder (Fermentas, Lithuania) as a marker. To verify the accuracy of the determination and recognition of the different RFLP patterns obtained in the PCR/RFLP analysis, the pDRAW32 software (AcaClone software, http://www.acaclone.com (accessed on 15 September 2021)) was used for virtual digestion of the stamp and vmp1 sequenced amplicons with the Tru1I and RsaI endonucleases, respectively. The tuf amplicons showed less variable RFLP profiles; therefore, attribution to tuf variants was based on both RFLP profiles and sequence similarity.
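The principle of virtual digestion can be illustrated with a few lines of Python. This is a simplified sketch and does not reproduce pDRAW32: the function name and the toy sequence are hypothetical, and only the enzyme names come from the text. The recognition sites used (HpaII C^CGG, Tru1I T^TAA, RsaI GT^AC) are palindromic, so scanning one strand of a linear amplicon is sufficient.

```python
import re

# Recognition site and cut offset (number of bases of the site that stay
# on the left-hand fragment) for the three enzymes named in the text.
ENZYMES = {
    "HpaII": ("CCGG", 1),  # C^CGG
    "Tru1I": ("TTAA", 1),  # T^TAA
    "RsaI": ("GTAC", 2),   # GT^AC
}


def virtual_digest(sequence, enzyme):
    """Return fragment lengths (5' to 3') of a linear sequence cut by `enzyme`."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # A lookahead pattern reports every occurrence, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy sequence, NOT a real vmp1 or stamp amplicon.
toy = "AAAGTACAAAAGTACAA"
fragments = virtual_digest(toy, "RsaI")
print(fragments)  # fragment lengths in base pairs
```

Comparing the predicted fragment lengths with the bands observed on the polyacrylamide gel is the kind of cross-check described above; fragments too short to resolve on a gel still appear in the virtual pattern.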

Sequencing and Phylogenetic Analyses
Direct sequencing of 88 amplicons from the different genes (9 tuf, 47 stamp, and 32 vmp1), selected considering the RFLP profiles, the host species, and the quality of the amplicon bands in the agarose gel, was performed by Macrogen Inc. (Netherlands) on both strands, using the same primers employed for the amplification. Raw sequences were assembled and edited using the Pregap4 and Gap4 software from the Staden package [40], and the representative ones were deposited in the GenBank database. Nucleotide sequences were compiled in FASTA format, and multiple alignments were performed with ClustalW [41]. The vmp1 gene sequences were trimmed to approximately 1,300 nt and the stamp gene sequences to approximately 500 nt, and phylogenetic analyses were carried out with MEGA X [42] using the neighbor-joining method [43], with 1000 bootstrap replicates to estimate the robustness of the analysis. Phylogenetic trees were constructed based on the nucleotide sequences of the vmp1 and stamp genes produced in this work, together with sequences of strains from previous studies [17,26,44,45] retrieved from NCBI GenBank (Table 1). Stamp gene nucleotide sequences were analyzed by sequence identity matrix to calculate their genetic diversity and aligned with 70 sequences of previously defined stamp sequence variants [26,44,45]. A nucleotide sequence identity of 100% was required for sequence variant attribution.
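The identity-based attribution described above can be sketched as follows. This is an illustrative Python example, not the software used in the study: the function names, the reference labels, and the short toy sequences are all hypothetical, and the sequences are assumed to be already aligned and trimmed to the same length.

```python
def percent_identity(a, b):
    """Percent identity of two aligned, equal-length sequences.

    A column counts as a match only when both characters are identical,
    so a gap opposite a base is scored as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_matrix(seqs):
    """Pairwise identity for a dict {name: aligned sequence}."""
    return {(p, q): percent_identity(seqs[p], seqs[q]) for p in seqs for q in seqs}


def assign_variant(query, references):
    """Return the name of the reference that is 100% identical, else None."""
    for name, ref in references.items():
        if len(ref) == len(query) and percent_identity(query, ref) == 100.0:
            return name
    return None


# Hypothetical toy references, NOT real stamp variants.
references = {"ref_A": "ACGTACGT", "ref_B": "ACGTTCGT"}
matrix = identity_matrix(references)
print(matrix[("ref_A", "ref_B")])
print(assign_variant("ACGTTCGT", references))
print(assign_variant("ACGAACGT", references))
```

Under the 100% threshold, a query that differs from every reference by even a single position is not attributed to any known variant, which is how a previously undescribed variant would be flagged.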

Results
The 119 'Ca. P. solani' strains tested provided amplification of the stamp and vmp1 genes in 111 samples, while the tuf gene was amplified in 108 samples. Readable RFLP profiles for all three genes were obtained for 94 strains (Table 2), while for 25 samples, one or two genes did not give amplification or the RFLP profile was inconclusive. Twenty samples gave amplification of two genes, while five grapevine samples were amplified for only one gene, indicating a different rate of amplification according to the gene employed.
RFLP and sequencing analysis on the tuf gene showed the prevalence of the tuf type b1 profile [19,20] identified in samples from Serbia, Italy, Spain, Portugal, Montenegro, and Bulgaria, while tuf type b2 was only found in two grapevine samples from Hungary. Additionally, 15 grapevine samples from Italy, mainly collected in 2010 and 2020 and one sample collected in 2019, showed a tuf type a profile, which was also identified in Parthenocissus spp. from Italy in 2005, 2018, and 2020 (Table 2).
A phylogenetic tree was constructed with 26 vmp1 gene sequences representing the different RFLP profiles observed, and 18 strains retrieved from the NCBI GenBank database representing the vmp1 gene profiles according to the literature [11,28]. The sequences generally clustered according to the RFLP profiles (Figure 1). Only the two samples (strain P-TV from Italy and pepper 223-17 from Serbia) that exhibited a 1,200 bp fragment after nested TYPH10F/R PCR on the vmp1 gene showed an identical RsaI restriction profile (Figure 2), while in the phylogenetic tree they appeared to cluster separately (Figure 1). These two strains were differentiated by AluI virtual digestion (data not shown) and resulted in the V7-A and V7 profiles, respectively [11]. The enzymatic digestion with RsaI on vmp1 gene amplicons allowed the identification of 14 RFLP profiles (Table 2), according to 23 V-types reported in previous studies [11,28,31]. Furthermore, three vmp1 gene amplicon sizes were obtained, approximately 1,700, 1,450, and 1,200 bp long. The largest polymorphism was found in the 1,450 bp amplicons, for which 10 RFLP profiles were differentiated (V2-TA, V3, V4, V11, V14, V15, V18, V17, und1, und2, and und3); on the other hand, only the profiles V7 and V7-A from the shortest amplicon and V11 and V12 from the longest were detected (Figure 2).
The RFLP analysis conducted on stamp gene amplicons revealed the presence of five profiles (Table 2). A phylogenetic tree was constructed using representative nucleotide sequences of the stamp gene obtained in this study and 70 stamp sequences retrieved from previous studies [19,27,44,46]. The phylogenetic analysis showed the presence of 11 stamp variants (St1-3, St2-11, St3-2, St4-3, St5-8, St8-4, St9-1, St10-2, St11-2, St18-2, and St19-6) determined by comparison with the available stamp gene dataset [44,46].
Two variants identified in tomato from Portugal and grapevine from Spain were found for the first time in the present study and were deposited at NCBI GenBank with the accession numbers MW759855 (tomato P3) and MW759856 (grapevine 3S) (Table 3). The phylogenetic tree constructed using the stamp representative sequences showed the presence of two main stamp clusters, a and b, enclosing tuf type a (nettle-related) and tuf type b (bindweed-related) samples, respectively (Figure 3). The subcluster a-II enclosed stamp variants St8, St9, and St19 related to tuf type a samples (grapevine PM1, grapevine TB3, Parthenocissus 1). The subcluster a-I encompassed St11 stamp sequences of tuf type b2 (grapevine 10). Moreover, grapevine TB11, tuf type b1, was enclosed within the subcluster a-I, while all the other stamp variants were enclosed in stamp b-I (St10), stamp b-II (St1, St2, St5), and stamp b-III (St3, St4) subclusters.
Figure 3. Unrooted phylogenetic tree inferred from stamp gene nucleotide sequences of 'Ca. P. solani' strains representative of stamp sequence variants previously described [17,26,41,42] and identified in this work (Table 3). Phylogenetic analysis was carried out using the neighbor-joining method and bootstrap-replicated 1000 times. Phytoplasma strains included in the phylogenetic analysis are given in the tree image. The GenBank accession number of each sequence is given in parentheses; gene sequences obtained in the present study are indicated in bold. Clusters are shown on the right.
The 25 lineages obtained by the combination of the restriction profiles of the tested genes were mainly discriminated by the vmp1 gene that allowed differentiation of 15 variants (Table 2). V2-TA, V4, and V3 were the prevalent profiles, detected in 19.8%, 19.8%, and 25.5% of the samples, respectively (Figure 4). Furthermore, V2-TA, V3, V4, V12, and V14 profiles were detected in both grapevines and other species, whereas und1, und3, V11, and V18 were only detected in grapevines. Profiles und1 (Serbia), und2 (Slovenia), and und3 (Spain), detected only in grapevines, were unique, and differed from the already-described profiles. Moreover, one sample from tobacco from Serbia (strain 150/10) and one sample from grapevine from Italy (strain RA 9827) showed mixed profiles (Table 2). Most of the samples tested originated from Serbia and Italy, and the distribution of the different vmp1 RFLP profiles showed that only the V14 profile was detected in both countries. Considering all the samples tested, five vmp1 profiles could be detected both in grapevines and in other host species (profiles V2-TA, V3, V4, V12, and V14) (Figure 5).
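The definition of a lineage as a combination of the tuf, stamp, and vmp1 profiles can be expressed as a simple tally. The sketch below is illustrative only: the strain names and profile labels are hypothetical placeholders rather than data from Table 2, and excluding strains with a missing profile is an assumption made here for simplicity.

```python
from collections import Counter

# Hypothetical records: (strain, tuf profile, stamp profile, vmp1 profile).
# None marks a gene with no amplification or an inconclusive profile.
records = [
    ("strain_1", "tuf_x", "stamp_p1", "vmp_a"),
    ("strain_2", "tuf_x", "stamp_p1", "vmp_a"),
    ("strain_3", "tuf_y", "stamp_p2", "vmp_b"),
    ("strain_4", "tuf_y", "stamp_p2", "vmp_c"),
    ("strain_5", "tuf_y", None, "vmp_c"),
]


def lineage_counts(rows):
    """Count strains per (tuf, stamp, vmp1) combination, skipping incomplete rows."""
    complete = [
        (tuf, stamp, vmp1)
        for _, tuf, stamp, vmp1 in rows
        if None not in (tuf, stamp, vmp1)
    ]
    return Counter(complete)


counts = lineage_counts(records)
total = sum(counts.values())
for combination, n in counts.most_common():
    print(combination, n, f"{100 * n / total:.1f}%")
```

In this toy data set, two strains share one combination and two others differ only in their vmp1 profile, which mirrors the observation that vmp1 contributes most of the discrimination between lineages.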

Discussion
The genetic variability of the 'Ca. P. solani' strains and the broad range of plant host species infected are the key points in the study of the population genetics and ecology of this phytoplasma. To provide an overall insight into its genetic variability and host distribution, two membrane protein coding genes (vmp1 and stamp) involved in the recognition of and interaction with its hosts [15,38] were studied. They showed a high sequence variability that makes them useful for studying the phytoplasma population dynamics. Moreover, the study of the elongation factor Tu (tuf) gene allowed the distinction of three variants (tuf type a, tuf type b2, and tuf type b1) involved in two BN disease cycles [19,20], while no other tuf variants were found [23,25,46]. Additionally, the first identification of tuf type a (nettle-associated type) in naturally infected P. quinquefolia and P. tricuspidata added new host plant species to this 'Ca. P. solani' tuf type and indicated its possible involvement in alternative epidemiological cycles with different and previously undescribed host species.
The polymorphisms detected in the stamp gene improved the knowledge of the phytoplasma strain population structure and dynamics. Currently, 70 nucleotide sequence variants have been described [19,26,31,32,44,45], and the two new variants detected in this work, together with the 11 already published, confirmed the large genetic variability of this gene. The definition of the stamp variants showed that the genetic variability of this gene could be underestimated and not fully exploited by RFLP analysis alone, since variants are often characterized by small insertions or deletions that are not detected by restriction analysis. Considering the vmp1 gene, the host species distribution of the V2-TA and V4 profiles was quite wide, since the former was detected in corn, grapevine, potato, tomato, parsley, and parsnip, while the latter was identified in bindweed, carrot, grapevine, potato, tomato, parsley, periwinkle, parsnip, tobacco, and valerian. The vmp1 V4 profile was detected in grapevine samples from Italy, Croatia, Serbia, and Bosnia and Herzegovina [11,27,29,47,48,49]. On the contrary, the V3 profile showed a host species distribution limited to grapevine, H. obsoletus, P. quinquefolia, and P. tricuspidata from Italy, and it was detected in all the tuf type a strains, only in Italian samples. The V14 profile was found in potato, grapevine, bindweed, celery, parsley, periwinkle, pepper, and valerian, mainly in Eastern European countries, in agreement with previous reports [17,31]. Out of the 17 samples in which it was identified (from Italy, Serbia, Montenegro, and Hungary), it was detected in only one grapevine sample from Central Italy (strain J1 from the Marche region).
This study indicated that the variability and, in some cases, the unique combination of the environmental and agroecological conditions play an important role in strain selection, making some strains prevalent and/or endemic in a specific geographic area. The presence of previously unreported vmp1 RFLP patterns (und1, und2, and und3) demonstrated the high degree of plasticity of this gene, which calls for further studies to fully understand its complexity and variability in this phytoplasma. However, studies focusing on the correlation between different symptomatology and strain variability are still necessary to confirm the presence of virulent or mild strains in the diverse host species.
While the high variability of the vmp1 gene has proven to be useful for discriminating 'Ca. P. solani' lineages, the results of this study indicated that the epidemiology of this phytoplasma is more complex than already shown, since strains connected to the nettle and grapevine cycle [26] have been identified in new host species. Although the strains analyzed in this work were collected in different years and countries, the variability detected was only partially consistent with the year or the country of collection. However, lineage I was detected from 2005 to 2020 only in grapevines and Parthenocissus spp. in Italian cultivations, while lineage III was only identified in Serbia and Hungary from 2009 to 2018 in diverse plant host species. Lineage IV was identified in diverse host species only in Serbia until 2013, but in 2016 and 2017 it was also identified in Italy and Montenegro, and lineage IX was only retrieved in 2009-2013 in a few host species in Serbia and Bulgaria. The results of this survey confirmed that the plasticity of these genes can be connected to both year and location of collection; however, comparable analyses of more 'Ca. P. solani' strains should be carried out to confirm the epidemiological trends indicated by the identified lineage diversity.
The trade of asymptomatic infected propagation material, due to the lack of screening and certification protocols, and the ability of diverse insect vectors to transmit 'Ca. P. solani' in the different geographical regions are jeopardizing molecular-based epidemiological studies. It is nevertheless very important to continue the molecular monitoring of the 'Ca. P. solani' populations to verify the possible emergence or re-emergence and spread of epidemic strains of the pathogen, which can also be identified by their genetic homogeneity in the studied genes.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: The archived datasets analyzed or generated during this study were deposited in the NCBI GenBank.