A Novel High Discriminatory Protocol for the Detection of Borrelia afzelii, Borrelia burgdorferi Sensu Stricto and Borrelia garinii in Ticks

Bacteria of the Borrelia burgdorferi sensu lato complex are the causative agents of Lyme borreliosis (LB). Even if the conventional diagnosis of LB does not rely on the species itself, an accurate species identification within the complex will provide a deepened epidemiological scenario, a better diagnosis leading to a more targeted therapeutic approach, as well as promote the general public’s awareness. A comparative genomics approach based on the 210 Borrelia spp. genomes available in 2019 were used to set up three species-specific PCR protocols, able to detect and provide species typing of Borrelia afzelii, Borrelia burgdorferi sensu stricto (s.s.) and Borrelia garinii, the three most common and important human pathogenic Lyme Borrelia species in Europe. The species-specificity of these protocols was confirmed on previously identified B. afzelii, B. burgdorferi s.s. and B. garinii specimens detected in Ixodes ricinus samples. In addition, the protocols were validated on 120 DNA samples from ticks collected in Sweden, showing 88% accuracy, 100% precision, 72% sensitivity and 100% specificity. The proposed approach represents an innovative tool in epidemiological studies focused on B. burgdorferi s.l. occurrence in ticks, and future studies could suggest its helpfulness in routine diagnostic tests for health care.


Introduction
Lyme borreliosis (LB), one of the most common tick-borne diseases in Europe, the United States (US) and Asia, is a multisystemic infectious and inflammatory disease caused by spirochetes of the Borrelia burgdorferi sensu lato (s.l.) complex transmitted by hard ticks belonging to the family Ixodidae [1][2][3][4]. In recent decades, the incidence of LB in humans has increased by approximately 200% worldwide, resulting in between 240,000 to 440,000 new cases per year in the US [5] and about 86,000 human LB cases in Europe [6,7].
The aim of this study was to develop an easy, fast and reliable real-time PCR tool, based on the amplification of specific gene fragments of the main LB etiological agents in Europe (B. afzelii, B. burgdorferi s.s. and B. garinii), to better comprehend the distribution of these species and raise awareness of clinicians and the general public.
The unique fragments identified for each species of interest were determined following a comparative genomics approach based on the available Borrelia spp. genomes. The proposed protocol was validated on DNA samples from ticks collected in Sweden.

Genomes Download and Annotation Revision
Borrelia genome files (n = 210) were downloaded from the PATRIC database [37] and accessed on 16 July 2019. The assemblies were analyzed with OrthoANI [38] to assess their average nucleotide identity (ANI): genomes with ANI > 95% were assigned to the same cluster as described by Jain et al. [39]. Each group was named after the most abundant species in the respective cluster. This allowed us to find any mistakes and confirm the correct annotation of the genome's assemblies.

Target Genes Selection
The pan-genome of Borrelia (i.e., the set of all the genes/Open Reading Frames, ORFs, in the genus) was calculated, by analyzing the downloaded genomic sequences with Roary software version 3.11.2 [40]. Briefly, the tool takes assembly genomes as an input, and the ORFs are called and clustered based on their genetic similarity allowing to group the sequences in orthologous clusters representative of the genes present in the dataset. A principal coordinate analysis (PCoA) was performed based on the gene presence/absence in the genomes with the R package Adegenet [41]. PCoA represents, in a Cartesian space, the patterns found in distance matrices to explain most of the variance in the data set. Therefore, PCoA analysis characterizes the degree of similarity of a set of genomes, considering the whole information derived from the gene presence/absence analysis. Subsequently, based on the PCoA results, the clusters of species were identified by the discriminant analysis of principal components (DAPC [42]), and the contribution of each gene to the discrimination of a specific cluster on the PCoA, was determined by the "loading plot" function of the package Adegenet [41]. Briefly, using this approach, higher loading scores are attributed to genes that have the largest between-species variance and the smallest within-species variance. The genes with the highest loading scores were selected for B. afzelii, B. burgdorferi s.s. and B. garinii. The selected genes were chosen to be targets for the newly designed real-time PCR amplification (qPCR).

Primer Design
The sequences of the selected genes were analyzed using EasyPrimer [43] to identify the most suitable regions for primer design, and then species-specific primers were manually designed. For each species, the primers and the specificity of the amplified fragments were validated in silico by BLAST searches [44]. In addition, the cross-reaction between these primers and various organisms (e.g., mammals_taxid: 40,674; hard ticks_taxid: 6939; Anaplasmataceae spp_taxid: 942) were also tested by BLAST searches [44] excluding B. burgdorferi s.l._taxid: 64,895 from the analyses.

Real-Time PCR Assay Set Up
The three species-specific primer sets for the typing protocols were used in qPCR by the CFX Connect Real-time PCR detection system (Biorad ® , Hercules, CA, USA). Each 20 µL reaction contained a final concentration of 1× SsoAdvanced Universal SYBR ® Green Supermix (Biorad ® , Hercules, CA, USA), 0.25 µM of each primer, 1 µL of tick DNA and ddH 2 O up to the final volume. The thermal profile for the three reactions was set up as follows: 95 • C for 180 s; 40 cycles (95 • C for 10 s, 52 • C for 15 s and 72 • C for 15 s) and a melt curve from 55 • C to 95 • C with increments of 0.5 • C per cycle. Each sample was tested in duplicate.

Protocol Validation
A preliminary validation to test the species-specificity of the newly designed primers was performed on the nucleic acids obtained from six B. burgdorferi s.l.-positive female I. ricinus samples (two B. afzelii strains, two B. burgdorferi s.s. strains and two B. garinii strains; see Table S1) identified by the amplification and sequencing of a 5S-23S rRNA fragment of the spirochetes [26]. Briefly, total nucleic acid (NA) was extracted with MagNA Pure LC Total Nucleic Acid Isolation Kit (Roche, Basilea, Swiss) and reverse-transcribed to cDNA using illustra™ Ready-to-Go RT-PCR Beads kit (GE Healthcare, Little Chalfont, UK) as described in [45]. Each sample was tested with the three newly designed primer sets in separate tubes. To evaluate the accuracy, precision, sensitivity and specificity of the protocol, a much larger dataset of ticks (n = 120) was tested with the three designed PCR primer sets. In detail, the total NA was extracted from the ticks using the Magnatrix 8000+ extraction robot (Magnetic Biosolutions, Stockholm, Sweden) and the Vet Viral NA kit (NorDiag ASA, Oslo, Norway) and reverse-transcribed to cDNA using illustra™ Ready-to-Go RT-PCR Beads kit (GE Healthcare, Little Chalfont, UK). These samples were previously tested both for DNA and RNA-viral pathogens by Fluidigm (unpublished data). For that analysis, a mix of equal parts of total synthesized NA and cDNA were used as templates.
The 120 samples included 62 NA samples retrieved from the RåFäst-project collection (i.e., questing ticks collected by cloth-dragging method from Grimsö and Bogesund, Sweden, in 2013 [46]; Table S2) and 58 NA samples retrieved from the CLINF-project collection (i.e., ticks detached from dogs or cats in northern Sweden during 2018-2019). All the NA samples were stored at −20 • C at the National Veterinary Institute (Uppsala, Sweden; Table S2).
Additionally, samples with incoherent typing results between Fluidigm and the newly described protocol were also subjected to the amplification and sequencing of a 5S-23S rRNA fragment, as described by Wilhelmsson et al. [26]. Metrics to evaluate the new protocols were computed considering the adjustments made to the Fluidigm typing after the 5S-23S rRNA approach of the selected samples with incoherent results. Accuracy ((true positive + true negative)/total samples)), precision (true positive/(true positive+ false positive)), sensitivity (true positive/(true positive + false negative)) and specificity (true negative/(true negative + false positive)) were calculated.

Bioinformatics Results: Validating PATRIC Annotation
Average nucleotide identity clusters of the 210 downloaded Borrelia genomes revealed that 10 out of 210 (4.8%) were mis-annotated. A table with the correspondence of the genome annotation with the clusters of genomes with ANI > 95% is available in Table S3. PCoA performed on the gene presence/absence analysis is reported in Figure 1  The colors were manually added to group the genomes based on the average nucleotide identity (ANI) analyses: the genomes with an ANI > 95% were clustered and highlighted by the same color. Moreover, two genomes were explicitly indicated in the plot because their annotation on the PATRIC database (in black) was incoherent with the ANI clusters computed in this work (indicated by the color of the dot).

Bioinformatics Results: Target Genes Selection and Primers Design
Pangenome analysis revealed the absence of core genes among the 210 analyzed genomes (i.e., the absence of a set of orthologous sequences conserved in all aligned genomes), and DAPC analysis allowed the identification of species-specific genes. The three primer pairs designed on the genes selected to be highly specific and discriminant for each of the three B. burgdorferi s.l. species of interest are reported in Table 1.

Protocol Validation
The results of the three qPCRs carried out on the control samples obtained from six ticks (two positive for B. afzelii, two for B. burgdorferi s.s. and two for B. garinii; see Table S1), confirmed the primers species-specificity and the absence of cross-amplification among the three species. The specificity of each reaction was confirmed through the sequencing of the amplified fragments that showed 100% identity with the fragment of the corresponding species. The newly designed protocol was called LyDet, as it is aimed at Lyme bacteria detection. The NA previously extracted from the larger dataset of 120 tick specimens was typed by LyDet, and the results were compared to those obtained by Fluidigm on the same dataset. LyDet protocol highlighted the presence of 12 B. afzelii out of 24 Fluidigm-positives, 1 B. burgdorferi s.s. out of 9 Fluidigm-positives and 19 B. garinii out of 22 Fluidigm-positives. Additionally, six samples were identified by Fluidigm as Borrelia spp. (n = 4), B. spielmanii (n = 1) and B. miyamotoi (n = 1) were identified as B. garinii (n = 4) and B. afzelii (n = 2) by LyDet; these results were also confirmed by 5S-23S rRNA amplicon sequencing (Table S2). In addition, the eight samples typed as B. burgdorferi s.s. by Fluidigm and negative to LyDet protocol also resulted as being negative to the 5S-23S rRNA amplification (Table S2). Lastly, all the Borrelia-negative samples to Fluidigm were also negative by LyDet.
A comparative matrix of the results of both Fluidigm and LyDet methods is reported in Table 2. In detail, the matrix compares the two typing approaches on the 120 samples considering the differences between the two methods (LyDet and Fluidigm). LyDet showed 88% accuracy, 100% precision, 72% sensitivity and 100% specificity.

Discussion
The comparative genomics performed in this work highlights the lack of a core genome in Borrelia genus. This feature makes it difficult to identify the proper target gene to develop an unequivocal molecular typing protocol in general for B. burgdorferi s.l. complex species, and in particular for the three main etiological agents of LB in Europe (B. afzelii, B. burgdorferi s.s. and B. garinii). Indeed, numerous typing methods to identify Borrelia spp. that cause LB in both arthropods and vertebrates have been proposed over the years [47].
Thus, the best approach for B. burgdorferi s.l. species typing should be Whole Genome Sequencing (WGS), although several issues related to genome assembling are still limiting factors [48]. However, this technique may still be time-consuming, requires complex data computing and is still expensive to apply in routine diagnostics or in epidemiological studies.
Interestingly, comparative genomics on several Borrelia spp. has revealed that frequent inaccurate species assignments are present in public databases. This can possibly lead to incorrect interpretations concerning, e.g., the circulation of a certain B. burgdorferi s.l. species in a specific geographical area (leading to non-reliable distribution patterns), or incorrect diagnosis could also occur, resulting in improper therapeutic strategies. To our knowledge, there is no standardized procedure to fill the gaps in these discrepancies, and this can represent a growing problem in the future.
The aim of this work was to develop a typing protocol for an easier, rapid detection and identification of B. afzelii, B. burgdorferi s.s. and B. garinii in ticks. For this purpose, bioinformatics analyses were performed to select species-specific loci for each species. Based on the alignment of each locus, species-specific primers were designed on conserved regions flanking variable ones. The intraspecific variability of the amplified fragments would not have allowed the specificity-and consequently the sensitivity-of a probe-based approach. For this reason, the three newly designed qPCRs were set up in separate tubes and using SybrGreen reagent as a fluorescent molecule. The species-specificity of the protocol and the absence of cross-reactions with other Borrelia species were assessed on a dataset of already species-typed samples identified as B. afzelii, B. burgdorferi s.s. and B. garinii, as well as on tick samples previously analyzed and typed using a Fluidigm approach (unpublished data). Furthermore, two specimens identified using Fluidigm as B. miyamotoi and B. spielmanii were amplified by LyDet assay and assigned to B. afzelii and B. garinii species, respectively. The subsequent sequencing of the 5S-23S gene fragment confirmed the result provided by LyDet, reiterating the species-specificity of the proposed protocol.
Some previously published investigations on B. burgdorferi s.l. in ticks revealed that species typing was occasionally not determinable [26,34]. On the contrary, the high sensitivity of LyDet protocol allowed the typing of three B. garinii and one B. afzelii, which were generically assigned to Borrelia spp. group by Fluidigm, thus leading to an improvement of the species-specific detection of Borrelia bacteria.
There are few studies published that quantify the frequency of co-infections by different Borrelia spp. in ticks [49,50]. This can, in part, be due to the fact that it might be unfeasible to obtain the species identification by Sanger or whole genome sequencing in samples where multiple Borrelia species are present. The detection of co-infections by different Lyme Borrelia species is another potential outcome linked to using the LyDet protocol. One potential limitation of LyDet is that this protocol is able to detect only three given Borrelia species, and it might thus miss other, new, emerging or re-emerging ones. This might become an issue since these undetected species could be involved in the clinical picture or could be relevant when performing a general screening aimed at assessing the occurrence of more Borrelia species. One possible solution would be to perform a general Borrelia spp. screening according to an already described method (e.g., [26]), along with the species-specific LyDet approach.
The LyDet approach confirmed the Fluidigm results to only 50% of B. afzelii, 11% of B. burgdorferi s.s. and 86% of B. garinii samples, while the remaining were negative. The lower detection rate of LyDet compared to Fluidigm could be attributed to the template sample. Indeed, in LyDet assays total NA alone was used as template, while in Fluidigm a mix of total NA and cDNA was used for the detection of genetic material from pathogens in the screened tick samples (i.e., DNA from bacteria and/or protozoa, as well as RNA from RNA-viruses). It is worth mentioning that the Fluidigm approach is also based on a pre-amplification step that can increase the signal of those targets showing extremely low DNA concentrations.
Even if setting up a method for the quantification of bacterial DNA (and therefore the number of bacteria) in a given sample was beyond the scope of the present study, bacterial load in ticks is considered relevant by some authors. For example, if the analyzed tick has been detached from a patient, information on the bacterial load can help in predicting transmission risks of the bacteria to the host [26,28]. In such cases, the LyDet protocol could be optimized and improved in order to quantify the species-specific bacteria load, either by using a certain amount of target sequences (i.e., through the use of plasmids) or by assessing the number of copies of the target genes of the three PCR systems in the respective Borrelia genomes.
Eight out of nine samples identified as B. burgdorferi s.s. with the Fluidigm test produced negative results when analyzed with the LyDet protocol. This result was also confirmed by the absence of amplification of the 5S-23S rRNA fragment. This can be attributed to either a low DNA quality of the template or to an interpretation of Fluidigm results for B. burgdorferi s.s. that "should be interpreted with care" as stated in the first description of the method [34]. However, the limited number of B. burgdorferi s.s. specimens detected by LyDet are coherent with low prevalence rates of this Borrelia species in Sweden, as recently described [26]. Further analyses should be performed on B. burgdorferi s.s. positive samples for an enhanced validation of the method.
The absence of gene sequencing and the sensibility and specificity of the LyDet assay make this method particularly reliable for large screenings of ticks for B. afzelii, B. burgdorferi s.s. and B. garinii detection. The epidemiology and ecology of these species are indeed pivotal for human health and could help to promote prevention against tick exposure. However, the low number of genomes available in PATRIC for B. garinii and B. afzelii and the genomic plasticity of Borrelia DNA may decrease the sensitivity of the tool. Nevertheless, genomes of B. burgdorferi s.l. available in the PATRIC database are continuously upgraded [37], and the addition of new genomes to our analysis would greatly benefit the specificity of the method.
Future applications of the recently developed method could include clinical investigations on LB patients to provide a quick and straightforward identification of the related Borrelia spp. This method could certainly complement, but not replace, the clinical evaluation and the diagnostic tests currently used. In fact, the validity of methods such as LyDet needs to be checked periodically by comparing the target sequences to those that are continuously generated and successively made available in sequence databases.

Supplementary Materials:
The following supporting information can be downloaded at: https://www. mdpi.com/article/10.3390/pathogens11111234/s1, Table S1: Information about the six B. burgdorferi s.l.-positive female I. ricinus samples (two positive for B. afzelii strains, two for B. burgdorferi s.s. strains and two for B. garinii strains) used for PCR primers validation; Table S2: Information about the 120 tick samples used to validate the protocol; Table S3: Correspondence table of the PATRIC genome annotation (rows) with the cluster of genomes with ANI > 95% (columns). The 10 mis-annotation found in the PATRIC database are indicated in bold red font.
Author Contributions: All authors contributed to the study's conception and design. Material preparation was performed by P.W., P.-E.L., P.K., K.U., A.O. and S.M.; data collection was performed by G.C. and M.P.; analysis was performed by R.N. The first draft of the manuscript was written by G.C. and M.P.; all authors commented on previous versions of the manuscript. The final manuscript was written by C.B., A.C. and G.G. All authors have read and agreed to the published version of the manuscript.
Funding: Work was partially supported by PRIN_MIUR2012 (code 2012A4F828) to C.B., P.W. and P.-E.L. were supported by the European Union through the European Regional Development Fund and the Interreg North Sea Region Programme 2014-2020 as part of the NorthTick project (reference number J- No. 38-2-7-19). G.G. was supported by the Swedish Research Council (Vetenskapsrådet, VR), project "TICKBIOCON-Ticks from animals and their microbiome: interaction between pathogens and endosymbionts as a potential target for biocontrol tools", registration number 2018-03830.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.