The Population Structure of Borrelia lusitaniae Is Reflected by a Population Division of Its Ixodes Vector

Populations of vector-borne pathogens are shaped by the distribution and movement of vector and reservoir hosts. To study what impact host and vector association have on tick-borne pathogens, we investigated the population structure of Borrelia lusitaniae using multilocus sequence typing (MLST). Novel sequences were acquired from questing ticks collected in multiple North African and European locations and were supplemented by publicly available sequences at the Borrelia Pubmlst database (accessed on 11 February 2020). Population structure of B. lusitaniae was inferred using clustering and network analyses. Maximum likelihood phylogenies for two molecular tick markers (the mitochondrial 16S rRNA locus and a nuclear locus, Tick-receptor of outer surface protein A, trospA) were used to confirm the morphological species identification of collected ticks. Our results confirmed that B. lusitaniae does indeed form two distinguishable populations: one containing mostly European samples and the other mostly Portuguese and North African samples. Of interest, Portuguese samples clustered largely based on being from north (European) or south (North African) of the river Targus. As two different Ixodes species (i.e., I. ricinus and I. inopinatus) may vector Borrelia in these regions, reference samples were included for I. inopinatus but did not form monophyletic clades in either tree, suggesting some misidentification. Even so, the trospA phylogeny showed a monophyletic clade containing tick samples from Northern Africa and Portugal south of the river Tagus suggesting a population division in Ixodes on this locus. The pattern mirrored the clustering of B. lusitaniae samples, suggesting a potential co-evolution between tick and Borrelia populations that deserve further investigation.


Introduction
Populations of vector-borne infectious disease agents are shaped by migration of, and selection pressure exerted by, their hosts and vectors [1][2][3][4] and these processes leave genetic signatures in the genomes of the organisms. Investigation of the geographical distribution of genetic lineages and their diversity permit inference of evolutionary processes such as selection or migration that have shaped populations [4]. Ticks range among the most important vectors for microbial pathogens [5] but many pathogens they transmit are difficult to study [6]. Disease agents that have been associated with Ixodes ticks include Borrelia spp., Babesia spp., Anaplasma spp., Rickettsia spp., and viruses such as tick-borne encephalitis virus or Powassan virus with species varying in their geographical distribution and pathogenic potential [5], [7]. The abundance of ticks, tick-borne disease agents and resulting host-parasite interactions have been reported to be influenced by environmental changes, among them changes in land use, increases in temperature due to climate change and others [3,8,9]. In this study, as a proxy for tick-transmitted pathogens, we investigated a population of the most frequently found microbes transmitted by ticks in the temperate northern hemisphere, i.e., microbes that belong to the Borrelia burgdorferi sensu lato species complex.
The Borrelia burgdorferi sensu lato species complex comprises more than 20 spirochetal bacterial species that use ticks of the genus Ixodes as vector and small to medium sized vertebrate species as reservoir hosts. Borrelia species vary considerably in the scale of their host and vector specificity [1,6,10,11], and this is reflected in the geographic distribution of their lineages [11][12][13]. Furthermore, host-and vector associations have been identified as drivers for diversification and speciation in Borrelia [6,11,14,15]. For example, B. burgdorferi sensu stricto (s.s.) Johnson et al. 1984, the main cause of human Lyme borreliosis in North America, is considered a generalist species because it can have different groups of vertebrate species such as rodents or birds as reservoir hosts and several Ixodes species as vectors [16,17]. The distribution range in North America includes the Northeast and Midwest of the USA [18] ranging now into Canada [19][20][21] and Western coastal regions [22,23]. Phylogenies based on multilocus sequence typing (MLST) or genomic SNPs have revealed a complex population structure, but strains do not cluster according to geography [24][25][26][27]. Species using only avian reservoirs, such as Borrelia garinii Baranton et al. 1982 or Borrelia valaisiana Wang et al. 1997, showed very little population structure likely owing to the migration and dispersal pattern of their avian reservoir hosts [12,13,[28][29][30][31][32]. In comparison, species that use hosts with smaller migration ranges such as rodents or lizards show more pronounced population structure, supposedly reflecting the migration range of their hosts [12,33].
Borrelia lusitaniae Le Fleche et al. 1997 was isolated for the first time in Portugal [34] and described as a species in 1997 [35]. The species has been associated with lizards as reservoir hosts [36][37][38] and Ixodes ricinus Linnaeus, 1758 as its main vector [39][40][41]. Several species of the family Lacertidae, e.g., Psammodromus algirus Linnaeus, 1758 [36,38,42], Podarcis spp. Wagler, 1830 [37,38,43,44], Teira dugesii Milne-Edwards 1829 [45] as well as Lacerta spp. Linnaeus, 1758 [43,46] were incriminated as potential reservoir hosts for B. lusitaniae. Some studies reported isolation of B. lusitaniae from, or detection of B. lusitaniae by PCR in, larvae feeding on the rodent Apodemus sylvaticus Linnaeus, 1758 [38,47]. Although this is not proof of reservoir competence, it cannot be immediately discounted that hosts other than lizards may serve as reservoirs for B. lusitaniae; further studies are needed to clarify this point. Initial reports of B. lusitaniae came from around the Mediterranean basin (Tunisia, Morocco, Portugal, Italy) [34,41,[48][49][50] but now it has also been reported in countries such as Slovakia, Poland, and Latvia [46,[51][52][53]. Remarkably high Borrelia infection rates of adult ticks (at the time identified as I. ricinus) (75%, 41 out of 55) were found in Portugal near Grândola of which a majority were identified as B. lusitaniae [54] suggesting favorable conditions in terms of reservoir hosts and vector ticks for this Borrelia species. First studies showed that B. lusitaniae induced disease in C3H/HeN mice, causing both nodular interstitial cystitis and vasculitis of the great vessels of the heart [55]. From an epidemiological perspective, further investigations of B. lusitaniae populations will be important as to date two isolates have been obtained from human clinical cases [56][57][58].
The population structure of B. lusitaniae was analyzed based on single genes (outer surface protein A; ospA) [59] and multilocus sequence typing (MLST) [33]. The study on ospA included samples from Italy, Germany, Morocco and Portugal. Phylogenetic analyses demonstrated that Portuguese tick isolates (PoTiB1, PoTiB2 and PoTiB3, all isolated from ticks from around Águas de Moura, south of the river Tagus) formed a clade together with North African isolates while isolates from Germany, Italy and a Portuguese human isolate (from Lisbon region) formed a second clade [59]. The MLST study included samples from Portugal collected in Mafra (north of the river Tagus) and Grândola (south of the river Tagus). Phylogenetic analyses of MLST sequences showed that isolates from north of the Tagus river mostly clustered together while samples from Grândola formed a second cluster suggesting two populations in Portugal-one from the north of the Tagus river, the other one from south of the Tagus river [33]. To confirm the population structure of B. lusitaniae, we analyzed, using MLST, samples from various European countries (Austria, Croatia, Latvia, Serbia, Slovakia, Ukraine and Portugal) and from North Africa (Algeria). The Portuguese samples included specimens from north (Mafra and Coimbra) and south (Santiago do Cacém) of the river Tagus and from Madeira island. In addition, we tested the hypothesis that the population structure of B. lusitaniae is shaped by association with different vector species. In 2014, a new Ixodes species was described and named Ixodes inopinatus Estrada-Peña, Nava et Petney 2014 [60]. The authors describe that I. inopinatus can be found on lizards and suggest that it replaces I. ricinus in dry Mediterranean regions of Algeria, Morocco, Portugal, Spain and Tunisia. A previous study on the population structure of I. ricinus had shown that there was a divergent clade in North Africa [61]. The latter study used concatenated sequences of four mitochondrial and nuclear genes including the 16S rRNA locus and Tick-receptor of outer surface protein A (trospA). In the present study, we used these two loci to establish whether there are different populations of Ixodes in Portugal and whether the ticks collected in Algeria may be identified as I. inopinatus.

Tick Collection and Processing
Questing ticks were collected by standardized flagging method [62]. Ticks from Algeria were removed from cattle, one specimen was collected in Portugal feeding on a P. algirus (PoTiB11/T3087), three specimens were collected in 2018 in Germany from great tits Parus major Linnaeus, 1758 (11-E12, 7-F5, 8-C12) [63], and three specimens feeding on humans (13310PT18, 13360PT18 and 14401PT18). Collected ticks were morphologically determined into stages and species [60,64] and stored in 70% ethanol until further analyses, except for one sample that was inoculated in BSK medium (see Table S1).
In Slovakia, ticks were collected in years 2013-2017 in an ecotone between the forest and a small mountain stream on Martinské hole mountains (49. Countries where ticks were collected and Borrelia samples characterized are shown in Figure 1. Details on geographic origin, species and number of ticks included in the current analysis are given in Tables 1 and S1. Countries where ticks were collected and Borrelia samples characterized are shown in Figure 1. Details on geographic origin, species and number of ticks included in the current analysis are given in Tables 1 and S1. Figure 1. Map of countries from which ticks and B. lusitaniae were included in this study. Samples from Austria, Latvia, Serbia and some samples from Portugal were retrieved from the MLST website. The locations of sample collection were slightly shifted for visualization purposes. The map was generated in R. For details, see Tables 1, 2, S1 and S2.

DNA Extraction
For DNA isolation from ticks, different methods were used. In Slovakia and Ukraine, alkaline-hydrolysis was used as described [65] on individual ticks. In Portugal, a DNAeasy kit (Qiagen, Hilden, Germany) was used. Ticks sampled in Algeria were individually extracted using the MagNA Pure 96 system using the MagNA Pure96 DNA and Viral NA Small Volume Kit (Roche Diagnostics, Meylan, France) according to the manufacturer's instruction and for ticks from Germany the Qiagen BioSprint 96 DNA Blood kit (Qiagen) was used. DNA samples were stored frozen (at −20 °C and −80 °C, respectively), until further processing. In samples from Croatia, Slovakia, and Ukraine the presence of tick DNA was confirmed by amplification of a 620 bp fragment of the tick mitochondrial gene cytochrome B (Black & Roehrdanz 1998). Only PCR positive samples were further analyzed. The locations of sample collection were slightly shifted for visualization purposes. The map was generated in R. For details, see Tables 1 and 2, S1 and S2.

DNA Extraction
For DNA isolation from ticks, different methods were used. In Slovakia and Ukraine, alkaline-hydrolysis was used as described [65] on individual ticks. In Portugal, a DNAeasy kit (Qiagen, Hilden, Germany) was used. Ticks sampled in Algeria were individually extracted using the MagNA Pure 96 system using the MagNA Pure96 DNA and Viral NA Small Volume Kit (Roche Diagnostics, Meylan, France) according to the manufacturer's instruction and for ticks from Germany the Qiagen BioSprint 96 DNA Blood kit (Qiagen) was used. DNA samples were stored frozen (at −20 • C and −80 • C, respectively), until further processing. In samples from Croatia, Slovakia, and Ukraine the presence of tick DNA was confirmed by amplification of a 620 bp fragment of the tick mitochondrial gene cytochrome B (Black & Roehrdanz 1998). Only PCR positive samples were further analyzed.
MLST on Borrelia positive samples was done primarily using a nested or semi-nested PCR (HotStar Taq Master Mix Kit or HotStarTaq Plus DNA Polymerase kit, Qiagen, Hilden, Germany), with primers and PCR conditions as previously described [70,71] [see also https://pubmlst.org/borrelia/ (11 February 2020)]. As the uvrA gene did not amplify in all cases, fragments of the seven housekeeping genes clpA, clpX, nifS, pepX, pyrG, recG, and rplB were amplified and sequenced. As positive control DNA of B. japonica or B. turdi was used while double distilled water was used as negative control. Table 2 and Table S2 give an overview of B. lusitaniae positive samples that were typed by MLST and for which at least seven MLST housekeeping genes gave good sequences without ambiguities. MLST sequences were submitted and are available via pubmlst.org/borrelia/ (11 May 2020) at the University of Oxford.
Purification of PCR products was done using a NucleoSpin Gel and PCR Clean-up according to the manufacturer's recommendation (MACHEREY-NAGEL, Düren, Germany). For Slovakian and Algerian samples Sanger sequencing of all the genes was performed by Eurofins Genomics (Eurofins Genomics Germany GmbH, Ebersberg, Germany and Basel, Switzerland).

PCR on Tick Genes
To confirm the morphological identification of ticks and to differentiate I. ricinus and I. inopinatus, two previously used loci, the mitochondrial 16S rRNA locus and the nuclear gene trospA were amplified [61,72] and sequenced. We included tick samples that were characterized for Borrelia or originated from the same regions as the Borrelia positive tick samples, but they were not necessarily identical. Three morphologically identified samples that matched the 16S rRNA of I. inopinatus were available, designated as such, and were included as controls (for details see Table S1). In tick samples from Algeria, Slovakia and Portugal, the 16S rRNA gene was amplified according to [72] and the trospA locus was amplified using primers and conditions with slight modification according to [61]. The implemented modification was a touch-down protocol for the initial six rounds of amplification to minimize non-specific background; the annealing temperatures started at 61 • C and decreased 1 • C per cycle until a temperature of 56 • C was reached. Further amplification rounds were performed at 56 • C. Further details on PCR conditions are given in supplementary information.

Bioinformatic Analysis Recombination and Network Analysis on MLST Genes of Borrelia Samples
Recombination signals were analyzed in the MLST sequence alignment and in both trospA and 16S rRNA sequences employing the PhiTest implemented in SplitTree4 [73]. Because the MLST alignment had recombination signals NeighborNet was run instead of a regular phylogeny; we used SplitTree4 to construct the net and selected the uncorrected P distance and equal angle display.
To identify the potential recombination events in the MLST alignment we used Gubbins with default parameters [74]. We also conduct individual recombination analysis via PhiPack [75], following the method described for A. baumannii [76], on each the fragments of the seven housekeeping genes employed for the MLST. Of note, out of the seven housekeeping genes in clpA we found signals of recombination (p-value = 3.66 × 10 −2 ).

Hierarchical Population Structure Analysis (HPSA) of Borrelia
For a hierarchical population structure analysis using hierBAPS [77], we used the alignment of concatenated MLST sequences. The analysis was run at four levels of clustering and the number of initial clusters was set to 20. We also computed the coefficient of differentiation via MEGA X [78] for the two major groups found at the first level of HPSA clustering.

Hierarchical Clustering of MLST Genes of Borrelia Samples
To assess the hypothesis of having two genetic clades in Europe, we applied a hierarchical clustering algorithm [79]. This algorithm aggregated the isolates towards one group using the allelic profile data on the seven genes (clpA, clpX, nifS, pepX, pyrG, recG and rplB). The hierarchical clustering algorithm treats every isolate as its own cluster and will then look for the closest related isolate based on a similarity measure (allelic profile) and fuses these into a cluster. This step is repeated until all isolates belong to a cluster which can be visualized in a dendrogram. The similarity measure was Gower's similarity coefficient [80]. This coefficient can range from 0 to 1, where a score of 1 indicates complete similarity. Identical isolates fuse directly in the dendrogram, whereas dissimilar isolates only fuse at a more upward branch. R (version 4.0.2) [81] was used to map the data with package "rnaturalearth" and "ggplot2". The dendrogram was build using the R package "dendextend" [82].

GoeBURST Analysis of Borrelia
We used PHYLOVIZ 1.1 [83] to conduct goeBURST analyses using the triple locus variant setting [84] using ST numbers, allelic profiles and meta information. Previous data available for B. lusitaniae isolates were downloaded from the Borrelia MLST database [www. pubmlst.org/borrelia/ (11 February 2020)] and included in our network and goeBURST analyses.

Maximum Likelihood Phylogenies of Tick Samples
As 16S rRNA and trospA did not show signals of recombination regular phylogenies were constructed. For both loci, the best model for the phylogeny was selected via jModelTest [85]. For each locus, a maximum likelihood (ML) phylogeny was run using PhyML [86] setting the best model, which was HK+I for 16S and K80+I for trospA.

Identification of Borrelia lusitaniae in Ticks
Ticks collected in Europe (Croatia, Portugal, Slovakia, Ukraine,) and North Africa (Algeria) that tested positive for B. burgdorferi in screening PCRs were further analyzed by MLST to identify B. lusitaniae. Details of B. lusitaniae samples (country of origin, region, MLST alleles and STs) included in the present study are given in Tables 2 and S2. An overview of the geographical distribution can be found in Figure 1. For 15 B. lusitaniae positive samples at least seven MLST genes provided good sequences and these samples were included in the MLST analysis conducted here (Table 2). These samples were complemented with 40 B. lusitaniae samples available via the Borrelia MLST database [(pubmlst.org/borrelia/ (11 February 2020)]. For several of the samples, uvrA did not yield a PCR product and, thus, we describe the isolates by seven genes, namely: clpA, clpX, nifS, pepX, pyrG, recG and rplB. Most samples of B. lusitaniae were from Portugal (n = 24), followed by Serbia (n = 15) and Algeria (n = 6) ( Table 2). Among the 55 isolates there are 19 differing alleles of clpA, 19 different alleles of gene clpX, nine different alleles of gene nifS, 12 alleles of pepX, 14 alleles of pyrG, 10 alleles of recG genes and 17 rplB alleles.

Clustering Analysis of B. lusitaniae
Based on previous results for B. lusitaniae [33,59], we investigated the hypothesis that isolates may fall into two groups, one cluster of isolates from Portugal and Algeria and one cluster of isolates from Central/Eastern Europe. We assessed this hypothesis by using hierarchical clustering, numbering the isolates 1-55 (Table 2). In the dendogram in Figure 2A two different clusters are obvious (depicted in red and green branches). Within these groups identical isolates were identified: e.g., isolates 18, 8, 9, and 1, 2, 30 were identical.

Multilocus Sequence Typing (MLST)
For 15 B. lusitaniae positive samples at least seven MLST genes provided good sequences and these samples were included in the MLST analysis conducted here ( Table 2). These samples were complemented with 40 B. lusitaniae samples available via the Borrelia MLST database [(pubmlst.org/borrelia/ (11 February 2020)]. For several of the samples, uvrA did not yield a PCR product and, thus, we describe the isolates by seven genes, namely: clpA, clpX, nifS, pepX, pyrG, recG and rplB. Most samples of B. lusitaniae were from Portugal (n = 24), followed by Serbia (n = 15) and Algeria (n = 6) ( Table 2). Among the 55 isolates there are 19 differing alleles of clpA, 19 different alleles of gene clpX, nine different alleles of gene nifS, 12 alleles of pepX, 14 alleles of pyrG, 10 alleles of recG genes and 17 rplB alleles.

Clustering Analysis of B. lusitaniae
Based on previous results for B. lusitaniae [33,59], we investigated the hypothesis that isolates may fall into two groups, one cluster of isolates from Portugal and Algeria and one cluster of isolates from Central/Eastern Europe. We assessed this hypothesis by using hierarchical clustering, numbering the isolates 1-55 (Table 2). In the dendogram in Figure  2A two different clusters are obvious (depicted in red and green branches). Within these groups identical isolates were identified: e.g., isolates 18, 8, 9, and 1, 2, 30 were identical.
The isolates (numbered 1-55) were plotted on a map ( Figure 2B); numbers and color coding for cluster 1 (red) and cluster 2 (green) correspond in Figure 2A,B. Samples from Central/Eastern Europe all fell into cluster 1 while all samples from Algeria fell into cluster 2. Interestingly, isolates from Portugal were assigned to different clusters, a subset fell into cluster 1 and a subset into cluster 2 (inlet in Figure 2B). Six isolates from Mafra and Lisbon seem to be different from other Portuguese and North African isolates.  The isolates (numbered 1-55) were plotted on a map ( Figure 2B); numbers and color coding for cluster 1 (red) and cluster 2 (green) correspond in Figure 2A,B. Samples from Central/Eastern Europe all fell into cluster 1 while all samples from Algeria fell into cluster 2. Interestingly, isolates from Portugal were assigned to different clusters, a subset fell into cluster 1 and a subset into cluster 2 (inlet in Figure 2B). Six isolates from Mafra and Lisbon seem to be different from other Portuguese and North African isolates.

goeBURST Analysis
Since the cluster analysis was in line with previous results suggesting a population division of B. lusitaniae, we used goeBURST to obtain an additional view of the population using a different method. The goeBURST diagram also indicated a population division (Figure 3). Although there was one link based on tiebreak rule 1 between ST7-007* from Algeria and ST7-004* from Mafra in Portugal, several connections (one link without choice of tiebreak rules and several lower level edges; see figure legend Figure 3 for further information) were found between STs from Algeria and STs originating from south of the Tagus river in Portugal (Grândola and Águas de Moura), resulting in a clonal complex (CC) that included 17 STs (CC7-003). A bushy clonal complex (CC918) was also formed consisting of STs from Slovakia, Serbia, Austria, Latvia, Ukraine and one ST (ST7-005) from Mafra, Portugal.
When we used goeBURST on higher level settings (level 4, using four alleles to infer inter-isolate relationships; level 5, using 5 alleles and level 6, six alleles), the picture of the relationship within populations became more obvious (see Supplementary Figures S1-S3). Most of the Portuguese isolates clustered with North African isolates, except for some isolates from Lisbon (human), Mafra and Madeira island which clustered with those from Austria, Croatia, Latvia, Serbia and Slovakia. Since the cluster analysis was in line with previous results suggesting a population division of B. lusitaniae, we used goeBURST to obtain an additional view of the population using a different method. The goeBURST diagram also indicated a population division (Figure 3). Although there was one link based on tiebreak rule 1 between ST7-007* from Algeria and ST7-004* from Mafra in Portugal, several connections (one link without choice of tiebreak rules and several lower level edges; see figure legend Figure 3 for further information) were found between STs from Algeria and STs originating from south of the Tagus river in Portugal (Grândola and Águas de Moura), resulting in a clonal complex (CC) that included 17 STs (CC7-003). A bushy clonal complex (CC918) was also formed consisting of STs from Slovakia, Serbia, Austria, Latvia, Ukraine and one ST (ST7-005) from Mafra, Portugal.
When we used goeBURST on higher level settings (level 4, using four alleles to infer inter-isolate relationships; level 5, using 5 alleles and level 6, six alleles), the picture of the relationship within populations became more obvious (see Supplementary Figures S1-S3). Most of the Portuguese isolates clustered with North African isolates, except for some isolates from Lisbon (human), Mafra and Madeira island which clustered with those from Austria, Croatia, Latvia, Serbia and Slovakia.

Sequence Analysis
All analyses were conducted with concatenated sequences of seven MLST genes. As previously reported [33] we found evidence for recombination (p = 0.00125) in the B. lusitaniae dataset. Therefore, a NeighborNet [73] was constructed instead of a phylogeny. The net shows that there are two very well differentiated groups (see Figure 4A,B), which was confirmed by the first level of the HPSA (Table S3). One group consisted of Portuguese and Algerian isolates ( Figure 4A); the other group combined isolates from Eastern Europe, Austria and Portugal ( Figure 4B). Supplementary Figure S4 provides the whole network, where two major groups are well differentiated. Notably, we did not find any recombination events between these two groups using Gubbins; further suggesting that these are

Sequence Analysis
All analyses were conducted with concatenated sequences of seven MLST genes. As previously reported [33] we found evidence for recombination (p = 0.00125) in the B. lusitaniae dataset. Therefore, a NeighborNet [73] was constructed instead of a phylogeny. The net shows that there are two very well differentiated groups (see Figure 4A,B), which was confirmed by the first level of the HPSA (Table S3). One group consisted of Portuguese and Algerian isolates ( Figure 4A); the other group combined isolates from Eastern Europe, Austria and Portugal ( Figure 4B). Supplementary Figure S4 provides the whole network, where two major groups are well differentiated. Notably, we did not find any recombination events between these two groups using Gubbins; further suggesting that these are well differentiated groups that may not exchange DNA. We also computed the coefficient of differentiation (N ST ) using MEGA X; the value of the coefficient was 0.71 (SE 0.04), which implies considerable levels of population differentiation between the two groups.
well differentiated groups that may not exchange DNA. We also computed the coefficient of differentiation (NST) using MEGA X; the value of the coefficient was 0.71 (SE 0.04), which implies considerable levels of population differentiation between the two groups.
The second level of clustering using HPSA found clearly defined sub-clusters (see Table S3) within the two major groups, which suggests further geographic structuring. For instance, in major group 2 ( Figure 4B) there is a cluster of Serbian isolates (blue label) and there is also a tight cluster of three Portuguese isolates (green label). Figure 4. NeighborNet of the MLST alignment generated in Splitstree4 [73]. The net was yielded considering uncorrected P distance and equal angle display. Panel A shows the major group that includes the North African and some Portuguese isolates, whereas panel B shows the other major group containing most of the European isolates. The different color labels indicate the clusters found at the second level of clustering of the hierarchical population structure analysis (HPSA) using hierBAPS [77] (see Table S3). The scale bar shows the number of substitutions per site.
The second level of clustering using HPSA found clearly defined sub-clusters (see Table S3) within the two major groups, which suggests further geographic structuring. For instance, in major group 2 ( Figure 4B) there is a cluster of Serbian isolates (blue label) and there is also a tight cluster of three Portuguese isolates (green label).

Analysis of Tick Samples
One hypothesis for the strong population division of B. lusitaniae was that this was due to vector association. To test this hypothesis, we investigated tick samples that originated from the same geographic region as the B. lusitaniae samples as well as from other regions. Ticks were identified to species status using morphological criteria [60,64,87]. Sequence data for the mitochondrial 16S rRNA gene and a nuclear gene, trospA, were used for corroboration. These loci were chosen because they showed in previous studies that there was a population division of I. ricinus between Europe and North Africa [61]. Because a new Ixodes species, I. inopinatus, was described originally in the Mediterranean region [60], samples from Germany that were identified as I. inopinatus morphologically and using molecular data (16S rRNA) were used as a control for I. inopinatus.
Given that neither of the loci, 16S rRNA and trospA, showed recombination signals (16S rRNA p = 0.82 and trospA p = 0.13), we employed ML phylogenies to depict the evolutionary relationships for each locus. All ticks tested in Slovakia were determined as I. ricinus by sequencing both 16S rRNA and trospA genes and comparing the sequences via BLASTn [88] searches to sequences available in GenBank. In the 16S rRNA ML phylogeny and as far as I. ricinus samples are concerned, the Portuguese and North African samples were dispersed in several groups across the tree ( Figure 5A, green color coding). Furthermore, the I. inopinatus samples (both from GenBank and our "reference samples") grouped together with samples that were classified as I. ricinus morphologically. Figure 5B shows the phylogeny of trospA, which seems to agree with the MLST NeighborNet in Figure 4: Two clusters emerged in the trospA phylogeny according to the geographical distribution (whatever the tick species) as in the B. lusitaniae MLST NeighborNet. Interestingly, in the trospA phylogeny, with only one exception all the Portuguese samples cluster with the North African samples ( Figure 5B, green color coding). The only exception being the Portuguese sample R1756PT13 collected in Mafra (north of the river Tagus), which clusters with other non-Portuguese samples from Europe (red color coding). Of note, the three I. inopinatus control ("reference") samples from Germany (blue color coding) did not form a monophyletic group, as they were located on different branches far apart in the tree. Taken together these results suggest that unlike the 16S rRNA-based phylogeny, the trospA phylogeny is consistent with the MLST NeighborNet in that samples from North Africa and Portugal form a separate clade. Notably, neither in the trospA phylogeny nor in the 16S RNA phylogeny did the I. inopinatus samples form a monophyletic group.

Population Structure of Borrelia lusitaniae
In this study, we have further evaluated the population structure of B. lusitaniae including additional samples from Croatia, Portugal, Slovakia, Ukraine as well as from Algeria using MLST. Consistent with previously reported data [59], our data clearly show a subdivision between North African isolates and isolates from Latvia, Serbia, Slovakia, and Austria. The population division in Portugal [33] was also corroborated. In the cluster analysis isolates from north of the Tagus river clustered together (cluster 1) and only two isolates fell into cluster 2 together with isolates from south of the Tagus river which is consistent with previously reported data by [33]. Southern Portuguese isolates showed a higher degree of genetic relatedness to isolates from Algeria using goeBURST than to isolates from Mafra. However, two Portuguese STs from south of the Tagus river clustered as singletons in the goeBURST diagram with TLV settings. When higher level settings were considered, two singletons from south of the Tagus river (ST67, ST64) clustered with Algerian samples while singletons from Mafra (ST69, ST7-006) and Madeira island (ST925) clustered with samples from Europe ( Supplementary Information Figures S1-S3). Thus, of the two major clonal complexes, one (CC918) consisted of STs from Austria, Croatia, Slovakia, Serbia, Latvia, Ukraine and one Portuguese ST from Mafra, the other (CC7-007*) consisted primarily of STs from Algeria, southern Portugal and two isolates from Mafra and Coimbra. The division was strongly supported through the sequenced-based network analysis where most STs from Mafra (except one) clustered with STs from other European countries and all STs from Grândola and Algeria fell into one cluster together with PoTiB2 and PoTiB3 from Águas de Moura (south of river Tagus), corroborating data published using ospA as molecular marker [47,59]. In summary, our data support, and agree with, previously published data demonstrating a population division of B. lusitaniae.  [86]. Phylogeny based on the 16S RNA gene (A). The model HK+I was used to construct the tree. Red labels mark I. ricinus non-Portuguese European sequences; green labels highlight Ixodes sequences of Portuguese and North African samples; and the blue labels denote I. inopinatus "reference" sequences from Germany. Phylogeny based on the trospA gene (B). The model K80+I was used to construct the tree. Red labels mark I. ricinus non-Portuguese European sequences; green labels highlight Portuguese and North African Ixodes sequences; and the blue labels denote I. inopinatus "reference" sequences. The scale bars show the number of substitutions per site.

Population Structure of Borrelia lusitaniae
In this study, we have further evaluated the population structure of B. lusitaniae including additional samples from Croatia, Portugal, Slovakia, Ukraine as well as from Algeria using MLST. Consistent with previously reported data [59], our data clearly show a subdivision between North African isolates and isolates from Latvia, Serbia, Slovakia, and Austria. The population division in Portugal [33] was also corroborated. In the cluster analysis isolates from north of the Tagus river clustered together (cluster 1) and only two isolates fell into cluster 2 together with isolates from south of the Tagus river which is consistent with previously reported data by [33]. Southern Portuguese isolates showed a higher degree of genetic relatedness to isolates from Algeria using goeBURST than to isolates from Mafra. However, two Portuguese STs from south of the Tagus river clustered as singletons in the goeBURST diagram with TLV settings. When higher level settings were considered, two singletons from south of the Tagus river (ST67, ST64) clustered with Algerian samples while singletons from Mafra (ST69, ST7-006) and Madeira island (ST925) clustered with samples from Europe ( Supplementary Information Figures S1-S3). Thus, of the two major clonal complexes, one (CC918) consisted of STs from Austria, Croatia, Slovakia, Serbia, Latvia, Ukraine and one Portuguese ST from Mafra, the other (CC7-007*) consisted primarily of STs from Algeria, southern Portugal and two isolates from Mafra and Coimbra. The division was strongly supported through the sequenced-based network analysis where most STs from Mafra (except one) clustered with STs from other European countries and all STs from Grândola and Algeria fell into one cluster together with PoTiB2 and PoTiB3 from Águas de Moura (south of river Tagus), corroborating data published using ospA as molecular marker [47,59]. In summary, our data support, and agree with, previously published data demonstrating a population division of B. lusitaniae.
Recombination analysis and an analysis of the degree of differentiation between the two B. lusitaniae clusters suggested considerable differentiation. Previous work on Borrelia has shown that intraspecific recombination rates are 50 times higher than interspecific recombination rates [89]. Furthermore, population genomics analyses of ocean bacteria investigating early events in population differentiation suggest that gradual separation of genes pools is accompanied by population-specific recombination [90,91]. It appears that the two major groups (populations) maybe in the early stages of speciation; each one on its own trajectory. First, both the population structure analysis and Neighbor net analysis clearly unveiled two very well-defined groups. Secondly, the absence of recombination events between these two groups may suggest that there has been no recent gene flow between the two groups. Clearly further investigations considering whole genome sequences are needed to establish if there is gene flow between the two major populations.
The observed population division of B. lusitaniae did not seem to follow climate or vegetation zones known for Europe [92]. North Portugal may be influence by the Atlantic region while countries such as Italy, Spain and Algeria belong to the sub-Mediterranean or Mediterranean region. Although one could speculate that the division of Portuguese B. lusitaniae populations may follow the pattern of floristic regions or may be influenced by a geographic barrier (river Tagus), it would not explain the observed clustering of B. lusitaniae STs from North Portugal with STs from other European regions.

Host and Vector Associations
Previous work has suggested that host association is a key factor for driving diversification and speciation in Borrelia [6,14]. Subsequently it was suggested that vector associations may also drive diversification and ultimately speciation in Borrelia [11]. In view of the latest described Ixodes species (I. inopinatus) and its geographic distribution in North Africa, southern Spain and Portugal [60], it was a very attractive hypothesis to investigate whether the association with different vector species may be responsible for the observed population division. However, our data do not unambiguously support the hypothesis that an association with I. inopinatus is responsible for the observed population division of B. lusitaniae. We added three sample of I. inopinatus as reference because these were morphologically identified as I. inopinatus and their 16S rRNA sequences were identical to samples designated I. inopinatus in GenBank. Surprisingly, the I. inopinatus sequences from GenBank (GenBank accession numbers for 16S rRNA/trospA, respectively: GU074596TN/GU074839TN; GU074598DZ/GU074841DZ; GU074602MA/GU074845MA) and our own I. inopinatus reference samples did not form a monophyletic group. These data call for good reference sequences for this tick species and currently we cannot confirm the hypothesis of I. inopinatus association being the reason for the population division of B. lusitaniae. Intriguingly, in the trospA tree sequences clustered according to geographical origin: a divergent clade contained (with one exception) all the Portuguese and North African samples, samples designated as I. inopinatus in GenBank and one sample that the 16S rRNA BLAST search identified as I. inopinatus (3117DZ16). A separation between North African and Eurasian I. ricinus was also found by [61] and [93] using concatenated sequences of four mitochondrial and nuclear genes or 125 single nucleotide polymorphisms, respectively, although none of the studies included tick samples from Portugal. Thus, the B. lusitaniae population structure obtained by MLST seems to match the population structure of Ixodes using the trospA gene. Although these data are suggestive of co-evolution between B. lusitaniae and its vector, the drivers for this process remain to be elucidated.
Regarding host association, in the various countries were B. lusitaniae has been described, several lizard species have been suggested as reservoir hosts. In Slovakia, Germany and Italy green lizards (L. viridis) [46], sand lizards (L. agilis) [43] and common wall lizards (Podarcis muralis) [37,43,44] were described as reservoir hosts. However, these species do not occur on the Iberian Peninsula or in North Africa (see also https://www.Eurolizards.com). In Portugal and Spain, L. schreiberi is commonly found but its role as reservoir for B. lusi-taniae cannot be confirmed with current data, although feeding I. ricinus nymphs have been found on this host [38]. Psammodromus algirus, the Algerian lizard, is widespread in North Africa (Algeria, Morocco, Tunisia) and on the Iberian Peninsula but its distribution in Italy or other countries is very restricted. The species has been shown to be divided into differentiated populations in East and West Iberia and North Africa. Furthermore, this species also shows some lineage differentiation in its western range in Iberia, between northern and southern populations, but miscegenation and relatively small sample size has precluded any taxonomic divisions [94,95]. The species has been identified as reservoir host for B. lusitaniae in Tunisia [36,42,96] and Portugal [38]. Given the different geographic distributions of potential reservoir hosts for B. lusitaniae, it remains to be investigated how narrow the niche for reservoir hosts of B. lusitaniae is and serum sensitivity or transmission experiments could help solving this question.

Conclusions
In this study, we have expanded investigations on the population structure of B. lusitaniae and confirmed a population division separating samples from southern Portugal and Algeria from samples from northern Portugal and other European countries. Molecular analyses of morphologically identified Ixodes samples (encompassing I. ricinus and I. inopinatus) acquired from the same geographical regions as Borrelia samples showed that the two loci investigated, i.e., 16S rRNA and trospA, were phylogenetically incongruous also with respect to clustering of Ixodes species. One clade in the trospA phylogeny consisted mostly of tick samples from North Africa and Portugal and mirrored the population division found in Borrelia. These data suggest that some co-evolution between Ixodes and B. lusitaniae populations may have occurred which warrants further investigation.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/ 10.3390/microorganisms9050933/s1, Table S1: Ixodes tick samples included in this study including information on processing of samples, Table S2: Borrelia samples included in this study, sequence type (ST) number (if available), allele numbers of the seven genes included for analyses and details of sample processing, Table S3