The Challenge of Environmental Samples for PCR Detection of Phytopathogenic Bacteria: A Case Study of Citrus Huanglongbing Disease

Abstract: Huanglongbing (HLB) is the most devastating citrus disease and is associated with three bacterial species of the genus ‘Candidatus Liberibacter’ transmitted by insect vectors. The early detection of HLB is based on PCR methods, and it is one of the cornerstones for preventing incursion into disease-free countries. However, the detection of phytopathogenic bacteria with PCR-based methods is problematic in surveys that include a variety of samples of different origins. Here, we first report the proportion of amplifications obtained by two standardized real-time PCR methods for the diagnosis of HLB in various environmental samples that include plants, psyllid vectors, and parasitic wasps of the psyllids. The results of 4915 samples showed that 9.3% of them were amplified by the first rapid screening test and only 0.3% by the more specific tests. Most of the amplifications were associated with parasitic wasps. We designed primers external to the target regions of both real-time PCR protocols to determine whether the amplifications belonged to one of the three ‘Ca. Liberibacter’ species associated with HLB. The bioinformatic analysis of the sequences obtained with these primers revealed that all these amplifications came from other prokaryotic organisms present in the samples. The primers developed in this study overcome the problem of undesired amplification in environmental samples. Thus, they could be used in future survey protocols to prevent the eradication of negative trees and the generation of unjustified alarms.


Introduction
Accurate detection is one of the benchmarks of plant bacterial disease management. The challenge of detection is greater when the target organism cannot be grown under in vitro conditions, and detection is based only on molecular methods. PCR-based methods have become essential in detection protocols; however, they present problems of sensitivity, specificity and robustness in complex environmental samples such as leaves, roots, insects and soil [1]. Sometimes it is necessary, after designing new primers, to adjust sequences, reagents and amplification conditions to increase the specificity of the reaction as new information becomes available [2]. One plant disease that can serve as a paradigm for detection problems is huanglongbing (HLB), the most devastating disease affecting many citrus species [3,4].

Table 1. Analysis of plant and insect samples in the period 2009-2018 for the detection of huanglongbing (HLB)-associated bacteria by the real-time PCR protocols described by Bertolini et al. [19] and Li et al. [21].

Insect Material
A total of 2132 insect samples from the classical biological control program developed to introduce T. dryi into mainland Europe were selected for HLB diagnosis (Table 1). Of these, 1081 samples were specimens of T. dryi. Eighty-two were collected in Pretoria (South Africa), and the rest belonged to different generations of the colony established at Instituto Canario de Investigaciones Agrarias, ICIA (Canary Islands, Spain) (F1; F2-4; F5-9). The other 1051 samples were T. erytreae individuals obtained from colonies established in a greenhouse maintained on young lemon trees cv. Eureka grafted on HLB-free Citrus macrophylla rootstocks at ICIA. The identification of both insect species was performed by PCR amplification and the sequencing of the mitochondrial COI gene according to Folmer et al. [33].

DNA Extraction from Plant Material and Insects
For the analysis of the plant material, three biological replicates from each sample were placed in separate plastic bags and stored at 4 °C prior to analysis, according to the EPPO protocol [18]. Midribs of the leaves were crushed in extraction bags (Bioreba), using a Homex 6 homogenizer (Bioreba), in PBS extraction buffer at 1:5-10 (w/v). A 1.5 mL aliquot of each crude plant extract was either processed immediately by real-time PCR or stored at −20 °C until use. For the analysis of the insect material, individual specimens were preserved in 70% ethanol until analysis. Total DNA was obtained from 200 µL of crude plant extract or from insect material using the CTAB (cetyl trimethyl ammonium bromide) extraction method [18]. The purified DNA was analyzed immediately or preserved at −20 °C until use.

Detection of 'Ca. Liberibacter' spp.
Two protocols, based on real-time PCRs with targets in the 16S rDNA gene, were used for the detection of HLB-associated 'Ca. Liberibacter' spp. (CaLspp) following EPPO [18]: (1) a real-time PCR with universal primers and a TaqMan probe for all CaLspp according to Bertolini et al. [19]; and (2) three real-time PCRs, also using a TaqMan probe, for the specific identification of 'Ca. L. africanus' (CaLaf), 'Ca. L. americanus' (CaLam) and 'Ca. L. asiaticus' (CaLas), as described by Li et al. [21]. Three replicates were included for each sample. StepOne Plus (Applied Biosystems, Foster City, CA, USA) or LightCycler® 480 (Roche) thermocyclers were used for the amplification and for data management and analysis.

Verification of Positive Samples
To determine the status of samples positive for HLB-associated bacteria, a set of primers for conventional PCR was designed in the 16S rDNA gene: CalsppF-sec, 5′-GAG AGT TTG ATC CTG GCT CA-3′, and CalsppR-sec, 5′-TCC TCT CAG ACC AGC TAT-3′. To this end, a sequence alignment of ten reference whole genomes of CaLspp available in the databases [34] was performed (accessed in September 2019). The accession numbers of the reference genomes were NC_012985; NC_014774; NC_020549; NC_022793; NZ_AOFG00000000; NZ_AP014595; NZ_CP004021; NZ_CP010804; NZ_CP019958; NZ_CP029348. According to this alignment, these primers amplify a DNA fragment of 266 bp that includes the target regions of the two real-time PCR protocols described above (Figure 1). The designed primers were used only to analyze the samples that were positive by the real-time PCRs [19,21], not for CaLspp detection. The reaction mixture contained 0.5 µM of each primer, 0.15 mM MgCl2, 0.2 mM dNTPs, 1× standard reaction buffer (from a 10× stock) and 1 U of DNA polymerase (Biotools). The PCR protocol consisted of one step of 95 °C for 10 min followed by 45 cycles of amplification (95 °C for 15 s and 55 °C for 1 min). Conventional PCR products were purified using the QIAquick PCR Purification Kit (Qiagen), and sequencing was performed by the Sanger method, assembling the forward and reverse reads. The quality of the chromatograms was determined with the SangeranalyseR package for Sanger sequencing data [35] in R [36]. An NCBI BLAST search [37] was used to identify related sequences and homologies. Samples that did not show nucleotide homology with any CaLspp were named undesirable or undesired amplifications.
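As a quick sanity check on the two verification primers given above, their basic properties can be computed in a few lines. This is a minimal sketch, not part of the published protocol: the function names are hypothetical, and the Wallace rule used for the melting temperature is only a rough rule of thumb that the authors do not mention using.

```python
# Primer sequences as given in the text, with spaces removed.
PRIMERS = {
    "CalsppF-sec": "GAGAGTTTGATCCTGGCTCA",
    "CalsppR-sec": "TCCTCTCAGACCAGCTAT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule:
    2 per A/T plus 4 per G/C. An assumption for illustration only."""
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt",
              f"GC={gc_percent(seq):.1f}%",
              f"Tm~{wallace_tm(seq)} C",
              "revcomp:", reverse_complement(seq))
```

The reverse complement of CalsppR-sec is the sequence one would search for on the forward strand of a reference genome when locating the 3′ end of the expected 266 bp fragment.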

Bioinformatic Analysis of the Sequences
Geneious Prime 2020 software (Biomatters Ltd., Auckland, New Zealand) was used to calculate the match identity of the probe and primer sequences using as reference the consensus and CaLaf sequences (NZ_CP004021). The consensus sequence was obtained by a MUSCLE [38] alignment using different bacterial species sequences. These sequences included, on the one hand, species that showed high identity by BLASTn

Phylogenetic Analysis of the Undesirable Amplifications
Twelve representative samples with undesired amplifications, including citrus plants, psyllid vectors and parasitoid species, were selected for the phylogenetic analysis. To this end, fragments of 266 bp obtained with the Calspp-sec primers were aligned with sequences of bacterial species from GenBank that showed high identity in BLASTn, as described above. Sequences were aligned with the Clustal W algorithm using MEGA X software [39], and a maximum likelihood tree was obtained by selecting the best nucleotide substitution model according to MEGA X [39].
Agronomy 2021, 11, 10

Analysis of Plant Samples
Real-time PCR analyses revealed that 213 out of the 2783 plant samples analyzed (7.65%) were positive by real-time PCR with the 'Ca. Liberibacter' spp. protocol proposed for the first screening [19] (Table 1). Six samples out of 213 also showed amplification signal with specific HLB primers, but with cycle threshold (Ct) values near the detection limit [21]: sample IVIA 5029.1 from Citrus sp. was positive for both specific primers designed to detect CaLam and CaLas; sample IVIA 5029.2 from Murraya koenigii amplified only with specific primers for CaLam; and sample IVIA 5361 from Citrus unshiu, and P10, P11 and P48 from Citrus sp. only with specific primers for CaLaf (Table 2).

Analysis of Insect Samples
Real-time PCR results of T. erytreae samples showed that nine out of 1051 specimens were positive by the Bertolini et al. PCR [19], representing only 0.86% of the field samples and confirming the usefulness of this protocol for the first screening of the HLB vector. The average Ct value of all insect samples was 33.11, with a minimum of 26.29 and a maximum of 35.34. However, the nine samples were negative by the Li et al. PCR [21] (Table 1). In the case of T. dryi, 237 out of 1081 (21.92%) were positive by Bertolini et al. [19], and five of these samples (0.46%) were also positive by Li et al. [21], near its detection limit (Table 1).

Table 2. Highest sequence identity at the nucleotide level in the GenBank database of the samples positive by the PCR protocols of Bertolini et al. [19] or Li et al. [21].

[Table 2 body not recovered. Column headers: Host, Origin, Sample.]
* Bacteria not classified according to available databases [34] (accessed in July 2020).

Sequence Analyses of the Amplified Fragments
A representative selection of 47 samples positive by the real-time PCRs [19,21] was sequenced in a second round by conventional PCR with the primers designed in this study. These samples included twelve from different Rutaceae plant species, thirty from T. dryi and five from T. erytreae (Table 2) (GenBank accession numbers: MW248533-MW248552). Comparative sequence analyses using BLASTn showed that none corresponded to any species of 'Ca. Liberibacter'. All matched 16S rDNA genes from other bacteria, so they were non-target amplifications. With respect to Rutaceae plants, samples IVIA 5029.1, IVIA 5029.2 and IVIA 5361 showed the best BLASTn match with Sphingomonas sp., with a sequence identity of 90-98%. Samples IVIA 5020 and IVIA 5516 presented a sequence identity of 98-100% with soil bacteria from the genus Rhizobium. Sample 209R showed a 93% sequence identity with Bradyrhizobium sp., and sample 209A showed a 94% match with an uncultured bacterium, both belonging to the order Rhizobiales, which includes Rhizobium. Samples P10, P11, P15 and P16 presented sequence identities of 100% with Phyllobacterium sp., while only one sample, P48, showed an identity of 99% with an uncultured Asaia sp. Twenty-six of the 30 amplified fragments from T. dryi also showed a high sequence identity (99%) with Asaia sp. (Table 2). The remaining four samples showed homology with other sequences: two with Wolbachia sp., an endosymbiont of Drosophila simulans; one with the partial sequence of the 16S rDNA gene of Rhizobium sp. 3041; and one with Ochrobactrum pseudogrignonense strain K8. One of the T. erytreae samples (i273) showed the best BLASTn match with a non-classified uncultured bacterial isolate from soil, with a sequence identity of 99%. The remainder of the T. erytreae samples best matched two endosymbionts from other Hemiptera using BLASTn.
Specifically, two samples (i800; i360) showed a sequence identity of 91% with Sodalis sp., an endosymbiont of Porphyrophora polonica L., and the other two (i487; i502) showed a sequence identity of 89% with a secondary endosymbiont of B. cockerelli.

Phylogenetic Analysis of the Amplifications
The phylogenetic analysis of the fragments that include the target region of the two real-time PCRs [19,21] confirmed the results obtained by BLASTn searches, demonstrating that the samples with positive results did not match any species of 'Ca. Liberibacter' or even the clade of this genus (Figure 2). Sequences of both plant and insect material were shown to be clustered into groups of several orders such as Rhizobiales and Sphingomonadales, which include cultured and uncultured bacteria. The insect samples also clustered in the orders of Rickettsiales and Enterobacterales.
The match identity value of the HLBr primer was 67.3% in the consensus sequence. In this case, the sample sequence showed match identity values between 87. 5

Discussion
Accurate and reliable diagnosis, detection, and identification techniques are key elements in the prevention, regulation, and management of bacterial diseases of plants. A suitable early detection protocol helps to keep crops free of pathogenic bacteria and allows rapid and appropriate responses when they are already present, even at low concentration. Knowledge of how a real-time PCR protocol performs in routine analysis permits its adequate integration into diagnostic schemes, the correct interpretation of results, and the design of optimal risk management strategies [41]. Accuracy, sensitivity, specificity, speed, economic sustainability, and ease of use are the main characteristics that a detection protocol must meet, and it is necessary to know the advantages and drawbacks of the protocols used in large-scale surveys. In this study, we have shown that specificity is particularly important when dealing with varied environmental samples, which have a diverse microbiota that can be mistaken for the target and produce false positive results. There is an increasing number of tools for designing specific primers and a growing number of sequences available in databases. However, the absolute specificity of primers or probes is very difficult to achieve and even to predict. In fact, it is estimated that there are between 0.8 and 1.6 million prokaryotic operational taxonomic units (OTUs) worldwide [42]. Therefore, in the design of primers and probes, it is important to take into account both the available sequence databases and the empirical evidence from experimental studies. Extensive surveys and reliable analytical methods are required to demonstrate and confirm the absence of HLB-associated bacteria. In many Mediterranean countries, this work is being carried out using the methodology described for the analysis of plants and vector specimens [19,21,43].
To date, HLB has not been reported in these areas, but this situation is subject to change, and optimal detection tools are needed. In the present work, an extensive analysis of Rutaceae host plants, insect vectors and parasitic wasps was carried out following the EPPO protocol for the detection of HLB-associated 'Ca. Liberibacter' spp. [18], using two real-time PCRs targeting the 16S rDNA gene [19,21]. Of the field plant samples analyzed, from different origins, 7.65% showed amplification with the genus-specific primers [19], and 2.81% of those positives (six of 213) also amplified with the species-specific primers [21]. In the case of the T. erytreae analysis, less than 1% of the samples were amplified in the first screening method and none by the species-specific PCR protocols, confirming the results obtained in previous intensive surveys carried out in Spain [43]. The greatest number of amplifications with universal genus primers (21.92%) was obtained in the analysis of specimens of the parasitoid T. dryi, which had not previously been tested by the authors of the evaluated protocols [19,21]. This result could be due partly to the genetic relationship of the analyzed specimens, all of South African origin. Only 2.95% tested positive for one of the species-specific protocols [21].
The genus-specific 16S rDNA primers [19] are included in the protocol because of their high sensitivity. In non-HLB endemic areas, it is likely that the pathogenic bacterial population is well below 1.7 cells per phloem cell, which is the concentration assumed in symptomatic plants [44], while asymptomatic samples are expected to have Ct values no lower than 35 [23]. In the present study, the amplification signals were generally obtained at Ct values above 35. Although these late values could indicate the presence of the target sequence at low concentration, in more than 97% of these samples no amplification was obtained in the second step with the species-specific primers. In a recent work on the detection of CaLas, Ct values greater than 35 were attributed to non-specific sequences of unknown bacteria [23]. Therefore, the signal obtained in the first stage in 9.34% of the samples suggested undesired amplifications of non-target bacteria with homologies in part of the target sequence of the HLB-associated bacteria.
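The screening percentages quoted in this work follow directly from the counts reported in the Results. The short sketch below recomputes them; the dictionary and function names are hypothetical and only the counts come from the text.

```python
# (screening positives, samples analyzed) as reported in the Results.
SCREENING_COUNTS = {
    "plants": (213, 2783),
    "T. erytreae": (9, 1051),
    "T. dryi": (237, 1081),
}


def percent(positives: int, total: int) -> float:
    """Proportion of positives expressed as a percentage."""
    return 100.0 * positives / total


def summarize(counts: dict) -> dict:
    """Per-group and overall screening-positive percentages, rounded to 2 decimals."""
    summary = {name: round(percent(p, n), 2) for name, (p, n) in counts.items()}
    all_positives = sum(p for p, _ in counts.values())
    all_samples = sum(n for _, n in counts.values())
    summary["all samples"] = round(percent(all_positives, all_samples), 2)
    return summary


if __name__ == "__main__":
    for group, value in summarize(SCREENING_COUNTS).items():
        print(f"{group}: {value}%")
```

The overall figure uses 459 positives out of 4915 samples, that is, the sum of the three groups.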
The lower specificity of the 16S rDNA assay has been described as its main drawback [29,45]. This is because homologies with sequences from host and/or citrus-associated endophyte organisms [46] compromise a reliable and specific diagnosis of HLB. The verification step developed in this study showed that all positive samples obtained according to the workflow based on two real-time PCRs [19,21] were non-target amplifications not corresponding to any species of 'Ca. Liberibacter'. Among the organisms identified in the analyses of plant samples, it should be noted that Sphingomonas is a genus comprising more than 55 species, some of which inhabit the soil and rhizosphere [47]. Interestingly, the other genus identified in the order Rhizobiales, Phyllobacterium, is phylogenetically close to 'Ca. Liberibacter' [48] and also contains species associated with the leaves and roots of plants [49]. Since phylogenetic relationships have been found between some metabolically diverse species of Rhizobiales, such as CaLas and Agrobacterium tumefaciens [50,51], it is believed that 'Ca. Liberibacter' evolved from a common ancestor through diversification and reduction processes that occurred during adaptation to the host [52].
In the insect analyses, amplification with the genus-specific 16S rDNA primers was obtained in less than 1% of T. erytreae specimens, in line with previous work [43]. However, more than 20% of the parasitic wasp specimens were amplified. Of the T. dryi fragments selected for sequencing, more than 86% showed a 99% sequence identity with an uncultured Asaia sp., a genus whose species are frequently associated with plants and insects [53,54]; in fact, this bacterial species was also identified in a citrus sample (P48). Of the wasp samples, 19% were also amplified with the CaLaf-specific primers. Although the match identity of the CaLaf primers and probe described by Li et al. [21] was lower than that of the genus-specific primers described by Bertolini et al. [19], some non-target amplifications could still be expected.
Other selected spurious amplicons of T. dryi showed 100% identity with Ochrobactrum pseudogrignonense, a bacterium that can be associated with insects as an endosymbiont: Ochrobactrum sp. has been isolated from the intestinal region of termites, where it participates in the degradation of hemicelluloses [55]. Two other samples of T. dryi revealed sequences similar to those of the endosymbiont Wolbachia sp., a group of alpha-proteobacteria that infect a wide range of insects and filarial nematodes [56]. Wolbachia sp. has been described recently in T. dryi from Kenya [57]. The association of T. dryi with Wolbachia sp. might have important consequences for the classical biological control program, because this bacterium can affect the reproduction of its host [58]. Moreover, a better understanding of the multitrophic interactions between citrus, psyllids, endosymbionts and pathogens could lead to more effective management strategies [56,59,60].
Our results show that the non-target amplifications matched the primers and probes. Since the target is a small fragment of a highly conserved gene, 16S rDNA, the probability of finding it in prokaryotes that share a habitat with HLB-associated bacteria is high, both in citrus and in insect hosts. Diagnosis by PCR is based on specific and discriminating sequence signatures. Specificity can be inferred from the comparison of sequences, but discrimination requires empirical evidence [61]. Today, the availability of several 'Ca. Liberibacter' genomes allows the specificity of the 16S rDNA oligonucleotides to be redefined in silico. This study identifies different bacterial species that share the same habitat as the target organisms (CaLspp) and that should be considered in the future when defining discriminating sequence signatures for HLB detection by PCR.
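The idea of checking an oligonucleotide against candidate non-target sequences in silico can be illustrated with a simple ungapped comparison. This is a minimal sketch under stated assumptions: the functions are hypothetical, the target below is an invented sequence carrying one mismatch, and the calculation is not claimed to reproduce the match identity values computed with the software used in this study.

```python
def percent_identity(oligo: str, window: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(oligo) != len(window):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(oligo, window))
    return 100.0 * matches / len(oligo)


def best_ungapped_match(oligo: str, target: str) -> tuple:
    """Slide the oligo along the target and return (best identity, start index).

    Only ungapped, same-strand placements are considered; the first
    best-scoring position is reported.
    """
    if len(target) < len(oligo):
        raise ValueError("target is shorter than the oligo")
    best_score, best_start = -1.0, -1
    for start in range(len(target) - len(oligo) + 1):
        score = percent_identity(oligo, target[start:start + len(oligo)])
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


if __name__ == "__main__":
    forward_primer = "GAGAGTTTGATCCTGGCTCA"  # CalsppF-sec, from the Methods
    # Hypothetical non-target fragment: the primer site with one substitution,
    # flanked by arbitrary bases. Not a real database sequence.
    hypothetical_target = "TTAC" + "GAGAGTTTGATCATGGCTCA" + "GGAT"
    print(best_ungapped_match(forward_primer, hypothetical_target))
```

A high identity at the primer and probe sites of a co-occurring bacterium is only a warning sign; as noted above, whether it actually amplifies still has to be established empirically.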
Complete genomic sequences of more CaLas, CaLaf and CaLam strains will allow systematic screening of unique genes of HLB-associated bacteria across the genome [26]. Recently, whole genome sequencing revealed a missing nucleotide G in the sequence of the forward primer of Li et al. [21,23]. This made it impossible to distinguish a low CaLas titer (Ct > 30) from the absence of CaLas in samples of citrus fruits and of the psyllid vector, D. citri [23]. Because of the complexity of the plant and insect microbiome, the limited knowledge about it, and the limited number of sequences of all the HLB-associated 'Ca. Liberibacter' species available in the current nucleotide sequence databases, it is not feasible to filter out bioinformatically all the possible sequences that could result in false positives. Therefore, a good strategy is to identify undesired amplifications empirically, by combining different sets of primer pairs using a consensus approach [62]. This in silico re-evaluation of the specificity of primers and probes is applicable to many pathogenic bacteria [61], particularly in complex samples such as plants and insect vectors. Sample DNA can be a template for non-specific binding of CaLas primers, forming non-specific products. In the same way, host tissues may contain elements that affect the efficiency of the qPCR reaction and contribute to the variability of the result, as observed in other environmental samples [63,64].
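The empirical strategy described here, in which a real-time PCR signal is accepted only after the amplicon has been sequenced and compared against the databases, can be summarized as a small decision rule. The sketch below is illustrative only: the class, field names and status labels are hypothetical, and the genus check on the best BLASTn hit is a simplification of the manual interpretation a diagnostician would make.

```python
from dataclasses import dataclass
from typing import Optional

TARGET_GENUS = "Liberibacter"


@dataclass
class SampleResult:
    screening_positive: bool      # genus-level real-time PCR [19]
    specific_positive: bool       # any species-specific real-time PCR [21]
    top_blast_hit: Optional[str]  # best BLASTn hit of the sequenced amplicon,
                                  # or None if it has not been sequenced yet


def classify(result: SampleResult) -> str:
    """Assign a status label to a sample from its PCR and sequencing results."""
    if not (result.screening_positive or result.specific_positive):
        return "negative"
    if result.top_blast_hit is None:
        return "unverified: sequence the amplicon"
    if TARGET_GENUS.lower() in result.top_blast_hit.lower():
        return "confirmed"
    return "undesired amplification"


if __name__ == "__main__":
    print(classify(SampleResult(True, False, "uncultured Asaia sp.")))
    print(classify(SampleResult(True, True, None)))
    print(classify(SampleResult(False, False, None)))
```

Under this rule a sample that amplifies in both real-time PCRs is still reported as unverified until sequence data are available, which mirrors the recommendation that survey positives be confirmed by sequencing.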
The accurate identification of HLB-associated bacteria is necessary in all areas where the disease is a major threat to facilitate early identification in plants, insect vectors, and biocontrol agents. Accurate diagnosis assists in the management of HLB-affected trees and the development of HLB-free nursery materials [23,24]. Reliable diagnosis and the differentiation of HLB-associated species are also essential to reduce the spread of this disease through international trade, as well as to minimize the economic impact of possible false positive diagnoses, in particular in non-affected citrus-producing areas such as European Union countries [27]. Although detection based solely on a fragment of the 16S rDNA gene is a valid approach for the rapid screening of samples because of its high sensitivity, the proposed protocol for sequencing the amplicons obtained is recommended for accurate results. Any positives obtained in general surveys should be confirmed with sequence data. This will allow the reliable identification of non-target bacteria and prevent false positive results.
Funding: This work was funded by Project TROPICSAFE (Insect-borne prokaryote-associated diseases in tropical and subtropical perennial crops) (grant number 727459) from the European Union's Horizon 2020 research and innovation program and Project E-RTA2015-00005-C06-01 (Métodos de control y contención de Trioza erytreae, vector del huanglongbing de los cítricos) from Programa Estatal de I+D+I Orientada a los Retos de la Sociedad of the Spanish Government.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.