Phytoplasma Taxonomy: Nomenclature, Classification, and Identification

Simple Summary
Phytoplasmas are vector-borne and graft-transmissible bacteria that cause various plant diseases, leading to severe economic losses. Since phytoplasmas cannot be cultured in cell-free media, their identification and taxonomy rely on molecular techniques and gene sequences. In this article, we summarize the recent advances in phytoplasma taxonomy from three aspects: (i) nomenclature (naming ‘Candidatus Phytoplasma’ species); (ii) classification (group and subgroup assignment based on 16S rRNA gene sequences); and (iii) identification (fine differentiation of phytoplasma strains). In addition, some important issues, especially those related to recognizing new ‘Candidatus Phytoplasma’ species, are discussed. This information will be helpful for rapid diagnosis of phytoplasma diseases and accurate taxonomic identification of both emerging and known phytoplasma strains.

Abstract
Phytoplasmas are pleomorphic, wall-less intracellular bacteria that can cause devastating diseases in a wide variety of plant species. Rapid diagnosis and precise identification of phytoplasmas responsible for emerging plant diseases are crucial to preventing further spread of the diseases and reducing economic losses. Phytoplasma taxonomy (identification, nomenclature, and classification) has lagged behind that of culturable bacteria, largely due to the lack of axenic phytoplasma culture and the consequent inaccessibility of phenotypic characteristics. However, the rapid expansion of molecular techniques and the advent of high-throughput genome sequencing have tremendously enhanced nucleotide sequence-based phytoplasma taxonomy. In this article, the key events and milestones that shaped the current phytoplasma taxonomy are highlighted. In addition, the distinctions and relatedness of two parallel systems, the ‘Candidatus Phytoplasma’ species nomenclature system and the group/subgroup classification system, are clarified. Both systems are indispensable as they serve different purposes. Furthermore, some hot-button issues in phytoplasma nomenclature are discussed, especially those pertinent to the implementation of the newly revised guidelines for ‘Candidatus Phytoplasma’ species description. To conclude, the challenges and future perspectives of phytoplasma taxonomy are briefly outlined.


Introduction
Plant diseases characterized by flower abnormality, yellowing, and witches'-broom were long thought to be caused by viruses until 1967 [1,2], when Doi et al. discovered small, pleomorphic, wall-less bacteria in ultrathin electron microscopic sections of infected phloem tissues [3]. The bacteria were named mycoplasma-like organisms (MLOs) because of their morphological resemblance to the mycoplasmas that infect humans and animals [3]. The first 16S rRNA gene sequence of an MLO (Oenothera MLO 86-7, accession number M30790) was reported in 1989 and was phylogenetically distinct from those of mycoplasmas [4]. In 1993, according to the proposal of International Committee on Systematic [...] [1][2][3][4][5][6][9][10][11][12][13][14][15][16][17][18][19][20][21][22].
Taxonomy (from Greek taxis (arrangement) and nomos (law)) is a broad biological science concerned with nomenclature, classification and identification, which are three related but distinct aspects [23]. Nomenclature is the naming of organisms. Scientific names established according to the binomial nomenclature help scientists around the world better communicate about and study the same organism(s). Classification is the orderly arrangement of organisms into groups or taxa based on their similarity. Identification is the recognition of a known or unknown organism and its assignment to an existing or a new taxon [23]. Nomenclature is more academic, whereas classification can be designed according to practical needs [24]. In recent years, significant progress has been made in bacterial taxonomy; for example, the transition from the 16S rRNA gene toward the whole genome sequence in culturable bacteria [20]. For unculturable phytoplasmas, the taxonomy has also advanced in order to adapt to and align better with that of culturable bacteria. In this article, recent advancements in phytoplasma taxonomy are explored from the perspectives of nomenclature, classification and identification.

Phytoplasma Nomenclature: Delineation of Candidatus Phytoplasma Species
Biology 2022, 11, 1119
The traditional polyphasic approach, which integrates phenotypic and genotypic data and reflects the ecological nature of the bacteria, is considered the gold standard for bacterial taxonomy [25]. The phenotypic markers mainly include morphological, physiological, and biochemical characteristics of cultivable bacteria [26]; however, the inability to culture phytoplasmas in vitro has made these phenotypic characteristics inaccessible for differentiating phytoplasmas. Several decades ago, scientists attempted to distinguish phytoplasmas by using the symptoms they induce, plant host range, insect vector specificity and serological correlations as markers, but were ultimately unsuccessful due to lack of consistency [27][28][29][30][31]. The subsequent development of culture-independent genotypic approaches based on hereditary information has rapidly and considerably enhanced bacterial systematics as a whole, providing high levels of resolution and differentiation. In particular, the advent of DNA sequencing technology and the exploitation of 16S rRNA gene sequences have tremendously facilitated studies of the taxonomy, tree of life, evolution, and diversity of unculturable bacteria [32][33][34]. Based on 16S rRNA gene sequences, many bacteria have been reclassified and renamed [35,36].
As with many other unculturable bacteria, the higher-rank taxa of phytoplasmas (Mycoplasmatota [originally named Tenericutes]/Mollicutes/Acholeplasmatales/incertae sedis-Family II) were named in the absence of a type genus and species [37][38][39], while the Candidatus status was used to reserve the putative lower-rank taxa (genus and species) [10]. The term Candidatus was first introduced in 1994 for nonculturable bacteria, granting an appropriate status to potential taxa based on 16S rRNA gene sequences ([10]; Figure 1). Candidatus is not a rank, nor is it governed by the Prokaryotic Code [40]. Currently, all phytoplasma strains are accommodated within the provisional genus 'Candidatus Phytoplasma'. The main function of the phytoplasma nomenclature system is to name 'Candidatus Phytoplasma' species, as the species is the most basic taxon of bacteria [12].
2.1. Transition from 16S rRNA Gene to Whole Genome-Based Nomenclature of Candidatus Phytoplasma Species?
The first 'Candidatus Phytoplasma' species ('Ca. Phytoplasma aurantifolia') was named based on its 16S rRNA gene sequence in 1995 [41]. In 2004, detailed guidelines (rules a through g) for naming 'Candidatus Phytoplasma' species were proposed by the IRPCM Phytoplasma/Spiroplasma Working Team-Phytoplasma taxonomy group [12]. According to the guidelines, 'Ca. Phytoplasma' species may be delineated based on the identity of 16S rRNA gene sequences longer than 1200 bp (rule a). A new 'Candidatus Phytoplasma' species can be recognized if the phytoplasma shares less than 97.5% 16S rRNA gene sequence identity with every previously established 'Candidatus Phytoplasma' species (rule b). This threshold was adopted because it corresponds to the DNA-DNA hybridization (DDH) reassociation value (70%) considered suitable for demarcating bacterial species. If a phytoplasma shares more than 97.5% 16S rRNA gene sequence identity with an existing species but clearly represents "an ecologically separated population", it also qualifies as a new 'Candidatus Phytoplasma' species (rule c).
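Rules (a) and (b) amount to a length check followed by a threshold comparison on pairwise sequence identity. The following Python sketch is illustrative only: the function names are hypothetical, the sequences are assumed to be already aligned, and the convention of counting a gap opposite a base as a mismatch is an assumption, since the guidelines quoted above do not prescribe how identity is computed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap are skipped, and a
    gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def qualifies_under_2004_rules_a_b(identities_to_named_species, query_length_bp):
    """Rule (a): sequence longer than 1200 bp.
    Rule (b): identity below 97.5% to every previously named species."""
    long_enough = query_length_bp > 1200
    return long_enough and all(i < 97.5 for i in identities_to_named_species)
```

Rule (c), which rests on ecological evidence rather than on sequence identity, is deliberately left out of this sketch.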
In genotypic characterization, the 16S rRNA gene serves as a backbone for bacterial taxonomy [35,39-42]. In many bacterial genera, however, the 16S rRNA gene alone is not sufficient to differentiate species, and, in some cases, multi-locus sequence analysis (MLSA) of alternative housekeeping genes is needed for phylogenetic studies [43][44][45][46]. In addition, advances in genome sequencing technology have made whole genome sequence-based genotypic characterization possible. In recent years, the whole-genome average nucleotide identity (ANI) has emerged as a robust measure for assessing species boundaries and estimating the genetic relatedness between two genomes. Ample data have shown that whole-genome ANI correlates with DNA-DNA hybridization relatedness, the traditional microbiological criterion for defining species [47,48]. In bacteria, an ANI value of 95% to 96% has been generally accepted for circumscribing species [47,49]. In 2019, Bergey's Manual suggested the beginning of the transition from 16S rRNA gene sequence-based to whole genome-based taxonomy in bacteria (Figure 1) [20].
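To make the ANI criterion concrete, the sketch below averages the identities of genome fragments that pass two alignment filters and compares the result with the 95% to 96% band mentioned above. It is a simplified illustration, not the procedure of any particular ANI tool: the function names are hypothetical, the per-fragment hits are assumed to come from a prior alignment step, and the default filter cut-offs (at least 30% identity over at least 70% of the fragment) are assumptions chosen for the example.

```python
from statistics import mean


def average_nucleotide_identity(fragment_hits, min_identity=30.0,
                                min_aligned_fraction=0.70):
    """Mean percent identity over fragments whose best hit passes both filters.

    fragment_hits: iterable of (percent_identity, aligned_fraction) tuples,
    one per query-genome fragment (assumed to be precomputed).
    """
    kept = [identity for identity, fraction in fragment_hits
            if identity >= min_identity and fraction >= min_aligned_fraction]
    if not kept:
        raise ValueError("no fragment passed the filters")
    return mean(kept)


def species_call(ani, lower=95.0, upper=96.0):
    """Interpret an ANI value against the 95-96% species boundary band."""
    if ani >= upper:
        return "same species"
    if ani < lower:
        return "different species"
    return "borderline"


# Hypothetical hits: the last two fail the aligned-fraction or identity filter.
hits = [(98.0, 0.90), (96.0, 0.80), (99.0, 0.50), (25.0, 0.95)]
ani = average_nucleotide_identity(hits)
```

Treating values inside the 95% to 96% band as "borderline" is a design choice of this sketch; the text above gives the band but does not say how such cases should be resolved.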
The first complete genome of a phytoplasma (onion yellows phytoplasma mild strain, OY-M) was published in 2004 [6] (data deposited into GenBank in 2003). So far, 47 phytoplasma genomes (35 draft and 12 complete) have been sequenced, involving 13 groups and 29 subgroups (Figure 2 and Table 1) [6,7,14, ...]. The size of the complete genomes ranges from 576 to 960 kb. As shown in Figure 2, 12 phytoplasma genomes were published in 2021 alone. Even so, only a small proportion of phytoplasmas have had their genomes sequenced, compared with the nearly one thousand known phytoplasma strains (covering 37 groups and more than 150 subgroups, see Section 3). No doubt, the ever-increasing phytoplasma genome sequence data will serve as an excellent and more comprehensive framework for phytoplasma taxonomy.

The 2004 guidelines have served for nearly 20 years. To date, approximately 50 'Candidatus Phytoplasma' species have been formally named [15]. While most species were delineated based on 16S rRNA gene identity scores (rules a and b), several species were recognized because each represents an ecologically distinct population (rule c). In contrast, the delineation of 'Ca. Phytoplasma tritici' and 'Ca. Phytoplasma sacchari' exploited whole genome information in addition to unique ecological properties [14,82]. Naming phytoplasma species based on whole genome information goes beyond the 2004 guidelines. Meanwhile, after pairwise comparisons of 16S rRNA gene sequence identity scores with the corresponding whole-genome ANI scores, bacterial taxonomists have twice revised the 16S rRNA gene sequence identity threshold for delineating new bacterial species: from 97% to 98.7% and then to the current 98.65% [49,83,84]. To embrace these developments and to incorporate the "whole genome" concept into phytoplasma taxonomy, Bertaccini et al. recently revised the guidelines for naming 'Ca. Phytoplasma' species (referred to as the "2022 guidelines" hereafter) [15].

The Newly Revised 2022 Guidelines and Proposed Amendments
The major revisions in the 2022 guidelines for naming a new 'Candidatus Phytoplasma' species are as follows: (1) the required length of the 16S rRNA gene sequence was extended from >1200 bp to >1500 bp (full length or nearly full length of the 16S rRNA gene); (2) the threshold of 16S rRNA gene identity was changed from 97.5% to 98.65%; (3) a whole-genome ANI criterion was proposed; and (4) if a strain shares >98.65% identity in 16S rRNA gene sequence and >95% genome ANI with a previously established species, an MLSA approach can be used to demarcate a new species. Criteria for five housekeeping genes were proposed. However, in our opinion, some provisions in the 2022 guidelines lack clarity and precision. Below, we compare the newly revised 2022 guidelines with the original 2004 guidelines and discuss the issues that require clarification and amendment (Table 2).
In rule (a) of the 2004 IRPCM guidelines, the term 'related strain' was clearly defined: the strain from which the 16S rRNA gene sequence was obtained should be named the 'reference strain' and not the 'type strain', and strains in which even minimal differences in the 16S rRNA gene sequence from the reference strain are detected do not 'belong' to the Candidatus species but are 'related' to it. The 2022 revised guidelines, however, introduce the notion of member strains in the following statement: "Strains sharing >98.65% sequence identity when compared with the reference strain are considered members of the respective 'Ca. Phytoplasma' species. Strains showing identity <98.65% to the reference strain, but >98.65% with other strains of the same 'Ca. Phytoplasma' species should be considered as related to this 'Ca. Phytoplasma' species."
The term 'member strain' was not well conceived and could lead to the erroneous assignment of a single strain to more than one species. For example, the 16S rRNA gene sequence of alder yellows phytoplasma strain ALY (AY197646) shares 99.7%, 99.35%, and 98.98% identity with those of 'Ca. Phytoplasma ulmi' reference strain EY1 (AY197655), 'Ca. Phytoplasma rubi' reference strain RuS (AY197648), and 'Ca. Phytoplasma ziziphi' reference strain JWB-G1 (AB052876), respectively. According to the 2022 guidelines, ALY would be a member strain of 'Ca. Phytoplasma ulmi', 'Ca. Phytoplasma rubi', and 'Ca. Phytoplasma ziziphi' simultaneously. Likewise, the 16S rRNA gene sequence of plum leptonecrosis phytoplasma strain LNp (JQ868450) shares 99.93%, 98.86%, and 98.66% identity with those of 'Ca. Phytoplasma prunorum' reference strain ESFY-G1 (AJ542544), 'Ca. Phytoplasma pyri' reference strain PD1 (AJ542543), and 'Ca. Phytoplasma mali' reference strain AP15 (AJ542541), respectively. Therefore, according to the 2022 guidelines, LNp would be a member strain of 'Ca. Phytoplasma prunorum', 'Ca. Phytoplasma pyri', and 'Ca. Phytoplasma mali' at the same time. Conceptually and practically, a strain can be related to more than one species, but cannot be a member of more than one species simultaneously. Therefore, the term "member strain" should be abolished and the term "related strain", as coined in the original 2004 guidelines, should be restored.
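The ambiguity can be reproduced in a few lines of Python. The sketch below applies the >98.65% membership threshold of the 2022 guidelines to the identity values quoted above for strains ALY and LNp; the dictionary layout and the function name are hypothetical and serve only to illustrate the argument.

```python
MEMBER_THRESHOLD = 98.65  # percent 16S rRNA gene identity, 2022 guidelines

# Identity (%) of each strain to the reference strains, as quoted in the text.
identities = {
    "ALY": {"Ca. Phytoplasma ulmi": 99.7,
            "Ca. Phytoplasma rubi": 99.35,
            "Ca. Phytoplasma ziziphi": 98.98},
    "LNp": {"Ca. Phytoplasma prunorum": 99.93,
            "Ca. Phytoplasma pyri": 98.86,
            "Ca. Phytoplasma mali": 98.66},
}


def member_species(identity_by_species, threshold=MEMBER_THRESHOLD):
    """Species of which the strain would be a 'member' under the 2022 wording."""
    return sorted(species for species, identity in identity_by_species.items()
                  if identity > threshold)


for strain, table in identities.items():
    print(strain, "->", member_species(table))
```

Both strains end up with three species each, which is exactly the multiple-membership outcome criticized above.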
Rule (c) of the 2004 IRPCM guidelines emphasized the importance of the 'ecological population'; it reflects the ecological nature of bacteria. No matter how we "modernize" our standards or framework, this provision should be retained. In other words, in addition to the 16S rRNA gene sequence identity-based and whole-genome ANI-based criteria for demarcating 'Candidatus Phytoplasma' species, delineation criteria based on ecological features and properties should be in place as well.
In the 2022 revised guidelines, a proposal was made to allow naming new phytoplasma species based on two out of five housekeeping genes (groEL, tuf, rp, secA and secY), each with its own criterion, if a strain shares >98.65% identity in 16S rRNA gene sequence and >95% genome ANI with previously reported species. So far, about 20,000 phytoplasma-related nucleotide sequences are available in the NCBI database, including approximately 8000 16S rRNA gene sequences, 1300 ribosomal protein-encoding gene sequences (rps3, rpl15, rpl22, etc.), 880 secY gene sequences, 570 tuf gene sequences, and sequences of other genes that are often used for differentiating closely related phytoplasma strains, such as vmp, cpn60, amp, map, and secA. For a particular gene, such as secY, the length of the sequences deposited in GenBank for different phytoplasma strains varies considerably, ranging from 150 to 1400 bp. Excessively short sequences are of little value for comparative analysis. In addition, no gene other than the 16S rRNA gene currently has universal primers capable of amplifying all phytoplasma strains. Even the widely used generic rp and secY primers can only amplify phytoplasmas that belong to certain 16Sr groups. Without universal primers and sufficient sequence data for a thorough comparative analysis of these five housekeeping genes, it is still difficult to establish objective criteria for MLSA-based species delineation. However, if a strain clearly represents an ecologically separated population, MLSA could be used to demonstrate significant molecular diversity in addition to fulfilling the unique vectorship and host specificity requirements. In such a case, the MLSA assay should not be limited to the five genes proposed in the 2022 guidelines. Evidence of significant molecular diversity from other housekeeping genes should be accepted as well.
Based on the above reasoning, we propose the following amendments to the 2022 revised guidelines. For clarity and consistency, the amendments are structured in the same fashion as the original 2004 IRPCM guidelines:
(a) The 'Ca. Phytoplasma' species description should refer to a single, unique 16S rRNA gene sequence (full length or nearly full length, >1500 bp) or a whole genome sequence with at least 60% coverage. The strain from which this sequence was obtained should be named the 'reference strain' and not the 'type strain'. Strains in which even minimal differences in the 16S rRNA gene sequence from the reference strain are detected are referred to as 'related' to the Candidatus species.
(b) In general, a strain can be described as a novel 'Ca. Phytoplasma' species if its 16S rRNA gene sequence shares <98.65% identity, or its whole genome shares an ANI score of <95-96%, with that of any previously described 'Ca. Phytoplasma' species.
(c) There are, however, cases of phytoplasmas that share >98.65% identity of their 16S rRNA gene sequences or >95-96% ANI of their genomes, but clearly represent ecologically separated populations and, therefore, may deserve description as separate species. For such cases, description of two different species is recommended only when all three conditions apply, namely those concerning vector specificity, host or host behavior, and molecular divergence.
In summary, a phytoplasma may be recognized as a novel 'Ca. Phytoplasma' species if it meets one of the following three criteria: sharing <98.65% 16S rRNA gene sequence identity, sharing <95-96% genome-wide ANI, or representing an ecologically separated population. Fulfillment of rule (c) shall be demonstrated by vector specificity, unique host or host behavior, and molecular divergence.
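The proposed amendments can be read as a short decision procedure. The Python sketch below is one possible reading, not an official implementation: the class and field names are hypothetical, and the default ANI cut-off of 95.0 is an assumption, because the amendments give the boundary as a range (<95-96%) rather than as a single value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    # Highest % identity of the full-length 16S rRNA gene (>1500 bp) to any
    # previously described species; None if not available.
    max_16s_identity: Optional[float] = None
    # Highest genome-wide ANI (%) to any previously described species.
    max_ani: Optional[float] = None
    # The three lines of evidence required by rule (c).
    distinct_vector: bool = False
    distinct_host_or_host_behavior: bool = False
    molecular_divergence: bool = False


def novel_species_basis(e: Evidence, ani_cutoff: float = 95.0) -> Optional[str]:
    """Return the rule that supports a novel species description, or None."""
    if e.max_16s_identity is not None and e.max_16s_identity < 98.65:
        return "rule (b): 16S rRNA gene identity < 98.65%"
    if e.max_ani is not None and e.max_ani < ani_cutoff:
        return "rule (b): genome-wide ANI below cut-off"
    if (e.distinct_vector and e.distinct_host_or_host_behavior
            and e.molecular_divergence):
        return "rule (c): ecologically separated population"
    return None
```

For example, `novel_species_basis(Evidence(max_16s_identity=99.2, max_ani=97.0))` returns `None`, whereas supplying all three ecological flags for the same strain returns the rule (c) basis.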

Phytoplasma Classification: 16Sr Group/Subgroup Classification System Based on Collective RFLP Profiles
Classification is the systematic and orderly arrangement of organisms into groups or categories according to established criteria. Unlike a taxonomic nomenclature system, a classification scheme is often designed to meet practical needs, with less emphasis on academic significance. Therefore, different scientists may classify the same organism differently [24]. Phytoplasma classification has followed this principle. Phenotypic approaches such as symptomology, vectorship, and serology were employed to classify phytoplasmas in the early days, but they proved neither suitable nor practical [85,86], as in many cases the same phytoplasma strain may induce different symptoms in different hosts, and different phytoplasma strains may share a common vector or cause diseases exhibiting similar symptoms [87]. It was not until the 1990s that the 16Sr group/subgroup classification scheme was established, based on RFLP profiles of the PCR-amplified F2nR2 fragment of the 16S rRNA gene [11,35,88,89]. This classification system is the most widely adopted by phytoplasma researchers so far [90][91][92][93].
The RFLP-based phytoplasma classification scheme exploits a high-resolution subset of the 16S rRNA gene characteristics, namely, the recognition sites of 17 restriction enzymes, to differentiate diverse phytoplasmas [11,87]. The 16Sr groups delineated with this RFLP classification scheme are consistent with the 16S rRNA gene phylogenetic clades. More advantageously, by distinguishing subtle pattern differences, this scheme is able to identify and distinguish different subgroup lineages within any given group [13,88,94,95]. Operationally, traditional RFLP analysis requires actual enzymatic digestion, gel electrophoresis, and visual comparison of the resulting banding patterns, which is laborious and is now rarely performed. The current virtual RFLP analysis approach operates on DNA sequences but retains the principles and criteria of the original phytoplasma classification scheme. Given accurate sequence data, the virtual gel patterns generated by computer-simulated RFLP analysis faithfully duplicate the classical and authoritative patterns established by conventional RFLP analysis. The new pattern types derived from virtual RFLP analysis have also been confirmed by actual enzymatic digestion and gel electrophoresis [94]. Furthermore, based on the virtual RFLP analysis approach, the interactive online tool iPhyClassifier was constructed, enabling and facilitating database-guided phytoplasma classification and identification [13].
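The core of a virtual RFLP analysis is an in silico digestion: locate every recognition site in the sequence, cut, and list the fragment sizes. The Python sketch below illustrates this under stated assumptions. It is not the iPhyClassifier implementation; the function names are hypothetical, the input is a toy sequence rather than a real F2nR2 fragment, and only four common enzymes are included for illustration, since the 17 enzymes of the scheme are not listed in this section.

```python
import re

# Recognition sequence and cut offset (number of bases of the site that stay
# on the left-hand fragment): AluI AG^CT, HaeIII GG^CC, MseI T^TAA, RsaI GT^AC.
ENZYMES = {
    "AluI": ("AGCT", 2),
    "HaeIII": ("GGCC", 2),
    "MseI": ("TTAA", 1),
    "RsaI": ("GTAC", 2),
}


def virtual_digest(sequence: str, enzyme: str) -> list:
    """Fragment sizes (bp, largest first) after digesting a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # A zero-width lookahead reports every site start, left to right.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    sizes = [end - start for start, end in zip(bounds, bounds[1:]) if end > start]
    return sorted(sizes, reverse=True)


def virtual_rflp_profile(sequence: str) -> dict:
    """Collective profile: fragment sizes for every enzyme in ENZYMES."""
    return {name: virtual_digest(sequence, name) for name in ENZYMES}


toy = "GGGAGCTCCCTTAAGGG"  # toy sequence with one AluI site and one MseI site
profile = virtual_rflp_profile(toy)
```

An enzyme without a site in the sequence yields a single fragment of full length, which is itself an informative band when collective patterns are compared.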
Some scientists might think that the RFLP approach is obsolete. The truth is RFLP analysis still plays an important role in the classification and differentiation of many unculturable and fastidious bacteria, and fungi [96][97][98]. Examples include classifications of genus Basidiobolus [97] and genus Vibrio [98]. In the past five years (2017 to present), around 15,000 papers have been published on the classification and differentiation of bacteria and fungi based on RFLP analysis, including nearly 1600 articles on phytoplasma classification and identification. Computer-simulated virtual RFLP analysis undoubtedly enhanced the applicability of the RFLP analysis-based classification.
Importantly, the 16Sr group/subgroup classification system complements 'Candidatus Phytoplasma' species affiliation assignment. A striking example is the aster yellows (AY) phytoplasma group, which contains hundreds of known strains around the globe. The current taxonomic system assigns all the AY strains as 'Ca. Phytoplasma asteris'-related strains, which grossly masks the differences among the strains. In contrast, the existing 16Sr group classification scheme can differentiate the AY strains into more than two dozen subgroups, each of which has its own unique RFLP profile. In addition, some subgroups are only (or predominantly) present in certain geographical regions and are associated with different ecological niches [93,98].
In addition, in certain cases, the current phytoplasma taxonomic system may even have difficulty assigning strains to the existing 'Ca. Phytoplasma' species. For example, the 16S rRNA gene sequence of a strain (KJ452548) in the elm yellows phytoplasma group shares 99.1-99.3% identity with those of both 'Ca. Phytoplasma ulmi'- and 'Ca. Phytoplasma ziziphi'-related strains. So, which species should this strain be affiliated with, 'Ca. Phytoplasma ulmi' or 'Ca. Phytoplasma ziziphi'? The RFLP-based group/subgroup classification system can at least provide distinguishing RFLP markers to separate them and classify the strain into a new subgroup other than 16SrV-A and 16SrV-B. This example demonstrates that the group/subgroup classification system effectively avoids the ambiguity caused by the term 'Candidatus Phytoplasma sp.'-related strain, and helps diagnosticians and regulatory agencies distinguish closely related phytoplasma strains.
In 2007, based on the virtual RFLP analysis of all 16S rRNA gene sequences available at the time (F2nR2 fragment of about 1250 bp), the number of phytoplasma classification groups was expanded from 19 to 28 (16SrXIX-16SrXXVIII), and some potentially new species were proposed with suggested reference strains (Table 3). In the present review, the groups/subgroups corresponding to 'Candidatus Phytoplasma' species, especially the newly named species, are updated (Table 3). Two new groups (16SrXXXVIII and 16SrXXXIX) are established based on the criterion that the collective F2nR2 RFLP pattern of any new group representative must have a similarity coefficient of <0.85 with those of all previously recognized 16Sr groups [94] (Supplementary Table S1). The reference strains of 'Ca. Phytoplasma noviguineense' and 'Ca. Phytoplasma dypsidis' were designated as the representative strains of 16SrXXXVIII-A (LC228755) and 16SrXXXIX-A (MT536195), respectively.
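The group-level criterion can be illustrated with a small calculation. The text gives only the threshold (<0.85), so the Python sketch below assumes a band-sharing coefficient of the Nei-Li form, F = 2Nxy/(Nx + Ny), where Nx and Ny are the total numbers of fragments in the two collective profiles and Nxy is the number of fragments they share; the function names and the example profiles are hypothetical.

```python
from collections import Counter


def similarity_coefficient(profile_x: dict, profile_y: dict) -> float:
    """F = 2*Nxy / (Nx + Ny), pooled over all enzymes (assumed Nei-Li form).

    Each profile maps an enzyme name to a list of fragment sizes in bp.
    """
    shared = total_x = total_y = 0
    for enzyme in set(profile_x) | set(profile_y):
        bands_x = Counter(profile_x.get(enzyme, []))
        bands_y = Counter(profile_y.get(enzyme, []))
        shared += sum((bands_x & bands_y).values())  # multiset intersection
        total_x += sum(bands_x.values())
        total_y += sum(bands_y.values())
    if total_x + total_y == 0:
        raise ValueError("both profiles are empty")
    return 2 * shared / (total_x + total_y)


def is_new_group(query: dict, group_representatives: dict,
                 threshold: float = 0.85) -> bool:
    """True if the query is below the threshold against every representative."""
    return all(similarity_coefficient(query, rep) < threshold
               for rep in group_representatives.values())


# Hypothetical two-enzyme profiles (a real analysis would use 17 enzymes).
x = {"AluI": [500, 400, 300], "MseI": [700, 500]}
y = {"AluI": [500, 400, 300], "MseI": [600, 600]}
```

With these toy profiles, three of the ten fragments in total are shared pairwise, giving F = 2 × 3 / 10 = 0.6, which would fall below the 0.85 group threshold.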
Currently, there are a total of 37 groups and 48 named 'Candidatus Phytoplasma' species (Table 3). Each group should contain at least one Candidatus species [138]. As shown in Table 3, nearly ten novel groups have been identified since 2007 (16SrXXIX-16SrXXXIX). However, it is noteworthy that no new phytoplasmas have been identified in groups 16SrXXIII-16SrXXVIII during the past 15 years. This suggests that the phytoplasmas belonging to these groups may be rare, or that the sequences representing these groups contain errors. We also noted that several pairs of strains share high sequence identity but have very low RFLP similarity coefficients. Such discrepancies might be caused by indels or sequencing errors within restriction enzyme recognition sites.
* Abolished: 'Ca. Phytoplasma australasia' was originally described by White et al. [101]. It was later removed from the 'Ca. Phytoplasma' species list by the IRPCM, as its 16S rRNA gene sequence shares 99.5% identity with that of 'Ca. Phytoplasma aurantifolia' and there is no evidence that it represents an ecologically separated population [12]. 'Ca. Phytoplasma australasia' was erroneously included in the 2022 guidelines [15] and should be removed.

Phytoplasma Identification: Detection, Diagnostics and Characterization
The early identification and diagnosis of phytoplasmas and phytoplasmal diseases are vital for the formulation and implementation of rapid control measures. Early diagnosis not only thwarts the further spread of disease and reduces direct economic losses from plant death or damage, but also prevents delays and restrictions on the import and export of plant materials. Plants infected by phytoplasmas often exhibit remarkable symptoms, including virescence (flower petals turning green), phyllody (leafy flowers), cauliflower-like inflorescence (repetitive initiation of inflorescence meristems), and witches'-broom (excessive shoot proliferation) [139,140]. In addition to these characteristic symptoms, phytoplasma infection can also induce general symptoms seen in diseases caused by various other plant pathogens, such as leaf discoloration (purple leaves and leaf yellowing), little leaf, stem fasciation, and stunting [139][140][141]. Furthermore, asymptomatic phytoplasma infections have been reported as well [142].
As phytoplasmas cannot be cultured in vitro, the routine culture-dependent metrics and characteristics for bacterial identification (morphological observation, biochemical assay, serotyping and antibiotic inhibition/resistance pattern assessment) cannot be employed. Phytoplasma detection and characterization therefore rely heavily on molecular diagnostic techniques. With the rapid development of these techniques, a variety of fast, sensitive, and cost-effective phytoplasma detection methods have emerged, ranging from PCR, nested PCR, real-time PCR, droplet digital PCR (ddPCR), and loop-mediated isothermal amplification (LAMP) to CRISPR-based detection methods. These methods are based on highly conserved phytoplasma gene sequences, such as those of the 16S rRNA, rp, secY and tuf genes [11,143-149].
Currently, the most widely adopted procedure for phytoplasma identification and further classification includes the following steps: (i) PCR or nested PCR amplification of phytoplasma DNA using universal primers for the 16S rRNA gene, for example, P1, P7, P1A, P7A, 16S-SR, R16F2n, and R16R2 [103,144,150,151]; (ii) sequencing of the PCR amplicons (direct sequencing or sequencing after amplicon cloning); and (iii) sequence analysis using iPhyClassifier, which classifies the phytoplasma strain under study into an existing 16Sr group/subgroup and assigns (relates) the strain to a previously named 'Candidatus Phytoplasma' species. Results from the last step also offer opportunities for establishing new groups/subgroups and discovering novel 'Candidatus Phytoplasma' species.
MLSA-based classification schemes have been established for many bacteria, but have not yet been implemented for non-culturable phytoplasmas (see Section 2.2 for reasons). Nevertheless, MLSA remains a very effective method for studying phytoplasma diversity and for fine differentiation of closely related phytoplasmas. For example, an MLSA-based approach revealed the genetic diversity of apple proliferation phytoplasmas [152]; in addition, MLSA characterization based on the 16S rRNA, rp, and secY genes indicated that azalea little leaf phytoplasmas represent a distinct lineage within the 16SrI group [153].

Challenges and Perspectives
Phytoplasma genome sequence information is essential to further advancing phytoplasma taxonomy. Since axenic phytoplasma culture is unattainable, DNA samples for phytoplasma genome sequencing are usually prepared from infected plants. As host DNA accounts for the overwhelming majority of such genomic DNA preparations, there is a risk of host DNA contamination during genome assembly and mapping. A careful assessment of genome coverage statistics is vital because genome-based delineation of new Candidatus Phytoplasma species relies solely on the accuracy of the genome sequences. Genome coverage statistics can indicate not only contamination, but also the completeness of the genome. For incomplete (draft) genomes, a minimum of 60% coverage has been suggested for microbial species delineation [154].
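As a rough illustration of what such an assessment can look like, the sketch below summarizes per-base read depth and flags contigs whose depth is far from that of the rest of the assembly, a common heuristic for spotting possible host-derived contigs. It is a sketch under stated assumptions rather than a prescribed workflow: the function names, the `fold` parameter, and the reading of the 60% figure as breadth of coverage (percentage of positions covered by at least one read) are assumptions introduced here, and depth values would in practice come from a read-mapping tool.

```python
from statistics import mean, median


def coverage_summary(depths, min_depth=1):
    """Summarize per-base read depth along a genome or contig."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "breadth_pct": 100.0 * covered / len(depths),  # % of bases covered
        "mean_depth": mean(depths),
        "median_depth": median(depths),
    }


def meets_minimum(summary, minimum_pct=60.0):
    """Assumption: the 60% minimum is interpreted as breadth of coverage."""
    return summary["breadth_pct"] >= minimum_pct


def flag_contigs(contig_depths, fold=5.0):
    """Return names of contigs whose mean depth differs from the median
    contig depth by more than `fold` in either direction."""
    typical = median(contig_depths.values())
    if typical <= 0:
        raise ValueError("median contig depth must be positive")
    return sorted(
        name
        for name, depth in contig_depths.items()
        if depth > typical * fold or depth < typical / fold
    )
```

Flagged contigs are only candidates for closer inspection (for example, by sequence similarity searches); unusual depth alone does not prove that a contig is host-derived.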
In addition, according to rule (e) in the 2004 Guidelines, the reference strain (maintained in micro-propagation if available) should be sent to and deposited with the scientific community or authorized organizations by the authors of the Candidatus species description paper. Implementing this rule has become increasingly difficult, or even infeasible, because of strict international regulations. A good alternative might be to submit gene clones of the reference strain to the scientific committee or to authorized organizations in different countries in America, Europe, and Asia, etc. (to be determined by phytoplasma scientists).
Currently, Groups 16SrV, 16SrX, and 16SrXII each contain four or five Candidatus Phytoplasma species (Table 3). There are occasions where an unknown strain shares an identical (or nearly identical) 16S rDNA sequence identity score with the reference strains of more than one Candidatus species within a given 16Sr group. Such a scenario makes it difficult to determine with which Candidatus species the unknown strain should be affiliated. The decision is even tougher if any of the Candidatus species involved has quarantine implications. The 2022 Guidelines revised the 16S rRNA gene sequence identity threshold for demarcating phytoplasma species from 97.5% to 98.65%. The new threshold will likely increase the number of new Candidatus species within certain 16Sr groups, which will make matters worse. Subgroup classification and multilocus strain typing may help alleviate the problem.
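The ambiguity described above can be made concrete with a small sketch. Assuming the query and reference 16S rRNA gene sequences have already been aligned to equal length, it computes percent identity against each reference and reports whether the query clears the 98.65% threshold for exactly one species, for several species with tied best scores, or for none. The function names, the `tie_margin` parameter, the placeholder species labels, and the choice to ignore gapped alignment columns are assumptions for illustration only.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def species_candidates(query, references, threshold=98.65, tie_margin=0.0):
    """Score a query against reference strains and report ambiguous affiliation.

    references: mapping of species label -> aligned reference sequence.
    """
    scores = {name: percent_identity(query, ref) for name, ref in references.items()}
    above = {name: s for name, s in scores.items() if s >= threshold}
    if not above:
        # Below threshold for every reference; other criteria would be needed.
        return {"status": "below_threshold", "candidates": [], "scores": scores}
    best = max(above.values())
    top = sorted(name for name, s in above.items() if best - s <= tie_margin)
    status = "ambiguous" if len(top) > 1 else "assigned"
    return {"status": status, "candidates": top, "scores": scores}
```

An "ambiguous" result is the situation in which additional markers, such as subgroup classification or multilocus typing, would be needed to decide the affiliation.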

Conclusions
This article reviews the latest progress in phytoplasma taxonomy from three aspects: nomenclature, classification, and identification. The 'Candidatus Phytoplasma' species/nomenclature system and the group/subgroup classification system are two parallel systems that serve different purposes. The nomenclature system focuses on naming new species based on one of three criteria: 16S rRNA gene sequence identity (<98.65%), whole-genome ANI (<95-96%), or representation of ecologically separated populations. Currently, 48 Candidatus Phytoplasma species have been named. The group/subgroup classification system is based on collective RFLP profiles of the F2nR2 region of the 16S rRNA gene. The genetically diverse phytoplasmas have been classified into 37 groups and more than 150 subgroups.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biology11081119/s1, Table S1: Identification of two new 16Sr groups/subgroups (16SrXXXVIII-A and 16SrXXXIX-A) based on similarity coefficients derived from virtual RFLP analysis of 16S rRNA genes.
Author Contributions: Conceptualization, W.W. and Y.Z.; writing-original draft preparation, W.W. and Y.Z.; writing-review and editing; funding acquisition, W.W. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.