Phylogenetic History and Phylogeographic Patterns of the European Wildcat (Felis silvestris) Populations

Simple Summary The European wildcat is an iconic small-sized predator that is still threatened by habitat fragmentation, accidental or illegal killings, and hybridization with domestic cats. However, phylogenetic and phylogeographic patterns of the taxon, though essential to design appropriate long-term conservation management actions, have been poorly investigated. Therefore, in this study, for the first time, we describe the most geographically-representative evolutionary history of the species in Europe based on mitochondrial DNA sequences. Our results clearly show that the European wildcat genetic variability was mainly originated during Pleistocene climatic oscillations and successively modeled by both historical natural gene flow among wild lineages and more recent wild x domestic anthropogenic hybridization events. Abstract Disentangling phylogenetic and phylogeographic patterns is fundamental to reconstruct the evolutionary histories of taxa and assess their actual conservation status. Therefore, in this study, for the first time, the most exhaustive biogeographic history of European wildcat (Felis silvestris) populations was reconstructed by typing 430 European wildcats, 213 domestic cats, and 72 putative admixed individuals, collected across the entire species’ distribution range, at a highly diagnostic portion of the mitochondrial ND5 gene. Phylogenetic and phylogeographic analyses identified two main ND5 lineages (D and W) roughly associated with domestic and wild polymorphisms. Lineage D included all domestic cats, 83.3% of putative admixed individuals, and also 41.4% of wildcats; these latter mostly showed haplotypes belonging to sub-clade Ia, that diverged about 37,700 years ago, long pre-dating any evidence for cat domestication. Lineage W included all the remaining wildcats and putative admixed individuals, spatially clustered into four main geographic groups, which started to diverge about 64,200 years ago, corresponding to (i) the isolated Scottish population, (ii) the Iberian population, (iii) a South-Eastern European cluster, and (iv) a Central European cluster. Our results suggest that the last Pleistocene glacial isolation and subsequent re-expansion from Mediterranean and extra-Mediterranean glacial refugia were pivotal drivers in shaping the extant European wildcat phylogenetic and phylogeographic patterns, which were further modeled by both historical natural gene flow among wild lineages and more recent wild x domestic anthropogenic hybridization, as confirmed by the finding of F. catus/lybica shared haplotypes. The reconstructed evolutionary histories and the wild ancestry contents detected in this study could be used to identify adequate Conservation Units within European wildcat populations and help to design appropriate long-term management actions.


Introduction
Biogeographic and phylogeographic patterns are key factors in clarifying the evolutionary histories of taxa and identifying appropriate conservation management units (ESU and MU [1,2]) In this study, therefore, we analyzed variation patterns of a portion of the mitochondrial NADH dehydrogenase subunit 5 in a geographically representative set of European wildcat and domestic cat individuals with the scope to (1) accurately describe European wildcat phylogenetic and phylogeographic patterns in Europe, (2) date back divergence times among the main haplogroups in the European wildcat phylogenetic history, and (3) evaluate different likely biogeographic scenarios to clarify the origin of shared haplotypes between wild and domestic cats.

Sampling
A total of 705 high-quality DNA cat samples, opportunistically collected from found dead animals between 1998 and 2010 in 12 sub-regions belonging to six European biogeographic macro-regions (Figure 1), were selected from the ISPRA Felis DNA biobank [25]. Sampled individuals had been previously identified morphologically by collectors according to phenotype, life history traits, and biometric indices [43][44][45]. Samples had also been previously genotyped at 31 microsatellites (STR) loci [25] and genetically classified as domestic (Felis catus), wild (Felis silvestris), or putative wild x domestic admixed cats through Bayesian clustering analyses based on an assignment threshold of posterior probability proportion to belong to the wildcat cluster qi = 0.90 [33] (Supplementary Table S1). According to these criteria, the selected samples were re-classified into 420 European wildcats, 213 domestic cats, and 72 putative admixed individuals (see Mattucci et al. [25] for details about sampling, DNA extraction, genotyping, and assignment methods). Due to the high level of domestic introgression spread in Scotland and Hungary, felid samples from these two sub-regions were mainly represented by wild x domestic admixed cats [25].  [46]. Macro-regions (Italy, Central Europe, Iberian Peninsula, Balkans, Eastern Europe, Scotland) are indicated in different colors. Each macroregion was further subdivided into sub-regions (It-cnt: Central Italy; It-Sicily: Sicily; Eu-cnt: Center-North Germany; Eu-sw: Southern and Western regions of Central Europe; Ib-prt: Portugal; Ib-sp: Spain; Balk-n: Northern Balkans; Balk-s: Southern Balkans; Balk-nw Eastern Italian Alps and North-Western Dinarides; East-n: Poland; East-s: Hungary). Samples from the North-Eastern Italian Alps Figure 1. Map of the approximate cat sampling locations. Dark grey areas correspond to the most recently updated IUCN Felis silvestris distribution range [46]. Macro-regions (Italy, Central Europe, Iberian Peninsula, Balkans, Eastern Europe, Scotland) are indicated in different colors. Each macro-region was further subdivided into sub-regions (It-cnt: Central Italy; It-Sicily: Sicily; Eu-cnt: Center-North Germany; Eu-sw: Southern and Western regions of Central Europe; Ib-prt: Portugal; Ib-sp: Spain; Balk-n: Northern Balkans; Balk-s: Southern Balkans; Balk-nw Eastern Italian Alps and North-Western Dinarides; East-n: Poland; East-s: Hungary). Samples from the North-Eastern Italian Alps were included in the Balkans group (blk-nw) according to the most recent genetic structure detected in Europe through Bayesian assignment procedures performed using nuclear microsatellite genetic profiles [25]. Small circles indicate the approximate geographic locations within the main sampling areas. Each sub-region is provided with the number of analyzed samples classified on the base of their 31-STR assignment values [25]: W = wildcat; D = domestic cat; H = putative admixed. Further details are available in Supplementary Table S1.

Mitochondrial DNA Sequencing and Haplotype Identification
Selected samples were sequenced for a portion of 835 bp of the mitochondrial NADH dehydrogenase subunit 5 (ND5; nucleotides 13,149-13,983 mapped on the mitochondrial genome of the domestic cat; NCBI Reference Sequence NC001700). This mtDNA region is particularly suitable for phylogeny reconstructions, because of the absence of nuclear mitochondrial segments (numts) and its reduced homoplasy, and is highly discriminating, since it contains seven diagnostic mutations useful to distinguish the European wildcat (F. silvestris) from the domestic/African cat (F. catus/lybica) lineages [20,31].
Identical haplotypes were identified and collapsed using DNASP v5.10.01 [49], and possible correspondences with haplotypes already published in GENBANK were checked using BLAST [50]. The final dataset of unique cat NAD5 haplotypes was then used to perform all the downstream diversity, phylogeographic and phylogenetic analyses.
The best nucleotide substitution model scheme was computed in PARTITIONFINDER v2 [51] by the Bayesian information criterion (BIC), and, subsequently, three phylogenetic trees were constructed through different computational approaches: (i) the neighbor-joining (NJ, [52]) method, using Smart Model Selection (SMS, [53]) algorithms implemented in MEGA v11.0 [54] and performing 10,000 random bootstrap replications; (ii) the maximumlikelihood method (ML, [55]) implemented in PHYML v3.0 [56] with the heuristic search by topological rearrangement of an initial tree (Near-Neighbor-Interchange) and 5000 random bootstrap replications; (iii) the Bayesian method (BT) implemented in BEAST v2.1.3 [57], which further allowed to estimate divergence times among nodes. Due to the strong relationship between taxa, a strict molecular clock model with a fixed mean substitution rate (2.28 × 10 −8 /site/year [47]) and constant population size as coalescent priors were selected. The Bayesian posterior probabilities (BPPs), as well as the high posterior densities for the node ages (HPDs), were extrapolated by performing three independent MCMC runs of 100,000,000 steps with a burn-in period of 10,000,000 steps and picking genealogies every 2000 steps. The results of the three chains were simultaneously analyzed in TRACER v1.7 [58].
The corresponding portion of the ND5 sequence of Felis margarita retrieved from GENEBANK (Accession number: EF587034) was used as an outgroup in NJ and ML tree reconstructions. Conversely, the BT tree was calculated without the inclusion of an outgroup, as suggested by BEAST developers [57], and the tree rooting point was estimated using as a prior calibration point the time interval in which the common ancestor between F. silvestris and F. catus/lybica likely coalesced (230,000-173,000 years ago [20]).
Each supported node was annotated with bootstrap values for NJ and ML trees and the highest posterior density (HPD) for the BT tree (e.g.: NJ/ML/HPD).
Haplotype pairwise genetic distances among cat taxa and cat biogeographic clusters, assessed with an analysis of molecular variance (AMOVA), and the φ ST and φ SC indexes for genetic differentiation [60], were computed in ARLEQUIN v3.5.1.3 [61], running 10,000 permutations to evaluate the significance of each parameter.
Further spatial analysis of molecular variance (SAMOVA) was performed using SAMOVA v2.0 [62], which defines clusters of geographically homogeneous populations, based on an a priori definition of the number of K groups, and uses a simulated annealing procedure to maximize the proportion of total genetic variance between groups with an AMOVA approach. We tested a different number of groups (with K from 2 to 9), each time with the simulated annealing process repeated 10,000 times, starting with a different partition of the population samples into the K groups. The selection of the best K-repartitions was based on the highest significant values of the F CT genetic differentiation index. F CT estimates differentiation among those groups of populations. The closer F CT is to 1, the more divergent the groups are from each other.

Approximate Bayesian Computation Analyses
The Approximate Bayesian Computation (ABC) simulations [63], implemented in DIYABC v2.1.0 [64], were run to model plausible demographic scenarios and estimate divergence times (in generations). We performed two types of ABC analyses: (a) considering only populations carrying wildcat mitochondrial haplotypes [20]; and (b) considering populations showing mito-nuclear discordance (see Results). In details, to avoid over-computation, we designed the smallest number of evolutionary scenarios using as prior populations the spatially geographical and genetically homogeneous clusters found through SAMOVA analyses. Successively, we simulated alternative evolutionary hypotheses using haplotype distribution and divergence times estimated from the reconstructed Bayesian phylogeny (see Results) and modeled population dynamics, taking into account the main phylogeographic findings reported in the literature [20,25,31]. Therefore, we tested four demographic scenarios for the wildcat haplotypes (Supplementary Figure S1), hypothesizing that the four clusters split sequentially (scenarios 1 and 3) or simultaneously (scenarios 2 and 4) and that Cluster 4 diverged by isolation (scenarios 1 and 2) or followed a gene flow with other European populations (scenarios 3 and 4). The haplogroup, including wildcats showing mito-nuclear discordances, was analyzed by testing three different scenarios: (a) the three clusters split in recent times from few common ancestral haplotypes (scenario 1); (b) the three clusters split in different sequential evolutionary events (scenario 2); (c) same as scenario 2 but considering longer coalescence time (scenario 3). We ran 4 × 10 6 simulations for each scenario using prior uniform distributions of the effective population sizes and Animals 2023, 13, 953 6 of 17 time parameters with the gamma-distributed mutation model with Gamma shape = 4.0. Scenarios were compared by estimating posterior probabilities using the logistic regression method using 1% of the simulated datasets. For the best models, posterior distributions of the parameters were estimated with a logit-transformed linear regression on the 1% simulated datasets closest to the observed data. Scenario confidence was evaluated by comparing observed and simulated summary statistics. Finally, the goodness-of-fit of the posterior parameters for the best-performing scenarios was tested via the model checking option with default settings, and significance was assessed after Bonferroni correction for multiple testing.

Results
Haplotype alignment did not show any indels or stop codons and the aminoacidic sequence was concordant with the domestic cat ND5 protein sequence (NCBI Reference Sequence NC001700). Thus, we excluded the amplification of numts or pseudogenes. After regrouping procedures, we identified 29 haplotypes among the 715 sequences, counting 32 polymorphic sites, including 23 parsimony informative sites, with a total haplotype diversity Hd = 0.862 ± 0.006 and a nucleotide diversity π = 0.870 ± 0.013 (Table 1). We compared the resulting haplotype sequences with the NCBI nucleotide database using the Blast algorithm founding 12 new unpublished haplotypes (Supplementary Table S2). Table 1. Summary of genetic variability statistics obtained by analyzing a portion of the mitochondrial ND5 gene and considering as sample groups: (a) STR Bayesian assignment cat populations [25]; (b) main phylogenetic lineages; and (c) biogeographic macro-regions within each phylogenetic lineage. For lineages W and DW (lineage D pruned by domestic cats), the statistics are also provided for each macro-region. N = number of samples; SD = standard deviation.

Phylogenetic Analyses and Divergence Time Estimates
The best fit evolutionary model for the 29-haplotype alignment was Kimura's twoparameter (K80 [65]) model with invariable sites (I = 0.80) for the first and second codon positions, whereas the Hasegawa, Kishino, and Yano (HKY [66]) model with a gamma distribution and four discrete categories was selected for the third codon position.
The NJ, ML, and BT phylogenetic trees showed very concordant topologies for the main clades (Supplementary Figure S2); thus, we described in detail directly the topology of the tree generated by BEAST, which presented posterior probabilities of the main internodes > 0.90 (Figure 2a). In the BT tree, haplotypes were clearly split into two main and strongly supported lineages (node 1, 100/100/1), diverging 197,500 years ago (95% HPD 173,002-226,274 years ago): (i) lineage D, including 15 haplotypes, shared in 448 (62.3%) individuals and (ii) lineage W, including 14 haplotypes, shared in 267 individuals (37.7%) ( Table 1).
Haplotypes were separated into three main categories according to the previous 31-STR Bayesian assignment tests performed by Mattucci et al. [25]: (i) category "d" included haplotypes found only among domestic cats; (ii) category "dw" included haplotypes found either in domestic, wild or putative admixed individuals; and iii) category "w" included haplotypes found only among wildcats and putative admixed individuals.  [25], were partitioned and colored as domestic, wild, and putative admixed groups. (c) MJ networks among ND5 haplotypes, including only wildcats and putative admixed individuals (n = 502). Each haplotype was divided and colored according to the proportion of individuals belonging to each biogeographic macro-region. Small bars indicate the number of mutations (greater than one) between two different haplotypes. The frequency of each haplotype is proportional to the size of the circles.

Phylogeographic Analyses of European Wildcat and Putative Admixed Individuals
MJ network, reconstructed using the entire 29-haplotype alignment, was highly concordant with the topology of the BT tree (Figure 2), identifying two main haplogroups (D and W), which were clearly separated by seven diagnostic polymorphisms, previously described by Driscoll et al. [47], though one of them (D) included seven domestic-wild shared haplotypes (Figure 2b). The AMOVA (Table 2), performed considering European wildcats, domestic cats, and putative admixed cats as different groups, detected a higher proportion of variation within (63.3%) than among (36.7%) populations and a significant differentiation index ϕst = 0.37 (p < 0.01). However, to avoid that domestic cat distribution, strongly linked to human activities, might inflate estimates, the initial dataset was pruned from samples genetically assigned to the domestic cat group and phylogeographic analyses were focused on wildcat and putative admixed populations. Thus, a further MJ network was reconstructed using ND5 sequences from 430 wildcats and 72 putative admixed individuals, corresponding to 21 haplotypes and characterized by 26 polymorphic sites and 21 parsimony informative sites (Figure 2c). In the network, the presence of two main haplogroups, significantly differentiated (ϕST = 0.97, p < 0.01; Table 2) was still evident: (i) DW, cleaned from "d" haplotypes, including 174 wildcats and 61 putative admixed cats; and (ii) W, including 256 wildcats and 11 putative admixed individuals (Figure 2c).
Haplogroup DW showed a split, roughly corresponding to node 2 of the BT tree (Figure 2c), which separated clades I-II and III. In particular, sub-clade Ia included haplotype  [25], were partitioned and colored as domestic, wild, and putative admixed groups. (c) MJ networks among ND5 haplotypes, including only wildcats and putative admixed individuals (n = 502). Each haplotype was divided and colored according to the proportion of individuals belonging to each biogeographic macro-region. Small bars indicate the number of mutations (greater than one) between two different haplotypes. The frequency of each haplotype is proportional to the size of the circles.
Lineage D included eight haplotypes, identified by the prefix "d", exclusively detected in 25 domestic cat genotypes, and other seven haplotypes, identified by the prefix "dw", found in 174 wildcat (41.4% of the wild), 188 domestic cats, and 61 putative admixed (84.7% of the wild) genotypes (Supplementary Table S3). In this lineage, we found the first highly supported split (node 2, 100/88/0.99) dating back about 80,000 years ago (95% HPD 31,561-145,850 years ago), distinguishing two major clade groups, I-II and III (Figure 2a Material and Supplementary Table S3 for details).

Phylogeographic Analyses of European Wildcat and Putative Admixed Individuals
MJ network, reconstructed using the entire 29-haplotype alignment, was highly concordant with the topology of the BT tree (Figure 2), identifying two main haplogroups (D and W), which were clearly separated by seven diagnostic polymorphisms, previously described by Driscoll et al. [47], though one of them (D) included seven domestic-wild shared haplotypes (Figure 2b). The AMOVA (Table 2), performed considering European wildcats, domestic cats, and putative admixed cats as different groups, detected a higher proportion of variation within (63.3%) than among (36.7%) populations and a significant differentiation index φst = 0.37 (p < 0.01). However, to avoid that domestic cat distribution, strongly linked to human activities, might inflate estimates, the initial dataset was pruned from samples genetically assigned to the domestic cat group and phylogeographic analyses were focused on wildcat and putative admixed populations. Thus, a further MJ network was reconstructed using ND5 sequences from 430 wildcats and 72 putative admixed individuals, corresponding to 21 haplotypes and characterized by 26 polymorphic sites and 21 parsimony informative sites (Figure 2c). In the network, the presence of two main haplogroups, significantly differentiated (φ ST = 0.97, p < 0.01; Table 2) was still evident: (i) DW, cleaned from "d" haplotypes, including 174 wildcats and 61 putative admixed cats; and (ii) W, including 256 wildcats and 11 putative admixed individuals (Figure 2c). Table 2. Analysis of molecular variance (AMOVA) on a portion of the mitochondrial ND5 gene computed among and within three different sample groups: (a) the STR Bayesian assignment populations [25]; (b) the main phylogenetic lineages; and (c) the biogeographic macro-regions within each phylogenetic lineage. φ ST : differences among groups; φ SC : differences among populations within a group. All values were highly significant (p < 0.05). Lineage DW represents lineage D pruned by domestic cats. Haplogroup DW showed a split, roughly corresponding to node 2 of the BT tree (Figure 2c), which separated clades I-II and III. In particular, sub-clade Ia included haplotype dw4, which was the most frequent within haplogroup DW, was found in 146 individuals, and showed a spread distribution across central and southern Europe (Figure 3). Scotland (Cluster 1), the Iberian Peninsula (Cluster 2), and the remaining European populations (Cluster 3) (Figure 3), with a lower overall FCT = 0.28 (p < 0.05) and most of the variation explained within populations (63.06%).

Percentage of Variation
The independent AMOVAs performed considering six biogeographic repartitions ( [25]; Figure 1), yielded concordant results but with a slightly lower proportion of variation explained among macro-regions (58.24%) for the haplogroup W and a higher proportion of variance within populations (82.8%) in haplogroup DW (Table 2).  Haplogroup W showed a split, roughly corresponding to node 5 of the BT tree (Figure 2c Table S3 for details).
The spatial analysis of molecular variance (SAMOVA) showed that the most statisticallysupported geographic partition within haplogroup W corresponded to K = 4 population groups, with an overall F CT = 0.64 (p < 0.01) and 63.2% of variation explained among the detected repartitions. The four groups included Italy and South-Eastern Europe (Cluster 1), Central and North-Eastern Europe, the Balkans (Cluster 2), Scotland (Cluster 3), and the Iberian Peninsula (Cluster 4) (Figure 3).
The independent AMOVAs performed considering six biogeographic repartitions ( [25]; Figure 1), yielded concordant results but with a slightly lower proportion of variation explained among macro-regions (58.24%) for the haplogroup W and a higher proportion of variance within populations (82.8%) in haplogroup DW (Table 2).

Demographic Analyses
A weak significant sign of population expansion was detected in the domestic cats of lineage D with a near bell-shaped curve in the mismatch plot (Figure 4), a Tajima's D = −0.413 (p = 0.078) and a Fu and Li's F = −2.337 (p < 0.05) ( Table 2). The mismatch distribution curve (Figure 4) and the slightly significant negative values of Tajima's and Fu and Li's estimators suggested a low degree of population expansion also for lineage W (Table 1). However, within this haplogroup, only Italy and the Balkan macro-regions presented an increasing trend in the mismatch plot consistent with significant negative values of Tajima and Fu and Li's statistics (Table 1, Figure 4), although these values were lower than two, suggesting caution in considering a hypothesis of actual expansion [67]. Among dw haplotypes, only those of Central Europe showed traces of an actual expansion trend with significant negative values of the statistics (Tajima's D = −2.002; p < 0.01 and FU and Li's F = −1.106; p < 0.05; Table 1) and a low peak in the mismatch plot (Figure 4).

Approximate Bayesian Computation Analyses
ABC simulations for haplogroup W provided the best support for scenario 4 (simul taneous population splitting with the following gene flow, Figure 5a

Approximate Bayesian Computation Analyses
ABC simulations for haplogroup W provided the best support for scenario 4 (simultaneous population splitting with the following gene flow, Figure 5a) with a posterior probability of 0.372 (95% C.I. 0.000-0.796). Under this scenario, the median values of the divergence time showed that Cluster 1, Cluster 2, and Cluster 3 started isolating about 40,800 generations ago (5-95% quartile: 13,900-53,600). Considering a wildcat generation time of two years [16], the time from the most recent common ancestor (TMRCA) of these populations was estimated at about 81,600 years ago (5-95% quartile: 27,800-107,200 years ago). The following admixture event between Cluster 1 and Cluster 2 contributed to generating Cluster 4 approximately 8920 years ago. Simulations on the DW haplotype (Figure 5b) showed the best posterior probability for scenario 1 (simultaneous splitting in recent times) with a posterior probability of 0.748 (95% C.I. 0.367-1.000). Median values of the divergence time from TMRCA suggested 540 generations ago (5-95% quartile: 42.2-5250), corresponding to about 1080 years ago (5-95% quartile: 84.4-10,500).
populations was estimated at about 81,600 years ago (5-95% quartile: 27,800-107,200 years ago). The following admixture event between Cluster 1 and Cluster 2 contributed to generating Cluster 4 approximately 8920 years ago. Simulations on the DW haplotype ( Figure  5b) showed the best posterior probability for scenario 1 (simultaneous splitting in recent times) with a posterior probability of 0.748 (95% C.I. 0.367-1.000). Median values of the divergence time from TMRCA suggested 540 generations ago (5-95% quartile: 42.2-5250), corresponding to about 1080 years ago (5-95% quartile: 84.4-10,500). Figure 5. Graphical representation of the resulting population sizes and divergence times estimated for the two best-simulated scenarios inferred by ABC simulations, using a generation time g = 2 years. The width of branches is proportional to the inferred effective population sizes. (a) The best scenario was inferred using lineage W haplotypes. (b) The best scenario was inferred using lineage DW (lineage D pruned by domestic cats) haplotypes. Cluster numbers refer to the best K repartition obtained by SAMOVA analyses for each of the two lineages. Figure 5. Graphical representation of the resulting population sizes and divergence times estimated for the two best-simulated scenarios inferred by ABC simulations, using a generation time g = 2 years. The width of branches is proportional to the inferred effective population sizes. (a) The best scenario was inferred using lineage W haplotypes. (b) The best scenario was inferred using lineage DW (lineage D pruned by domestic cats) haplotypes. Cluster numbers refer to the best K repartition obtained by SAMOVA analyses for each of the two lineages.

Discussion
Pleistocene climate oscillations significantly shaped the biogeographic patterns and genetic structure of many mammal species within the Palaearctic region [5]. Several examples of shifts in the distribution and genetic composition of mammal populations following glacial and sea-level cycles have been clearly described, highlighting how recurring eastwest colonization waves introduced new genetic variants, and subsequent post-glacial recolonizations from Mediterranean and extra-Mediterranean refugia further modified their genetic makeup [5,10,[68][69][70]. In addition, for some species affected by anthropogenic hybridization, such as the European wildcat (Felis silvestris), the wolf (Canis lupus), or the wild boar (Sus scrofa), the genetic mosaic has been further altered by the introgression of domestic variants [31,71,72]. Therefore, here we describe phylogenetic and phylogeographic patterns obtained by analyzing a diagnostic fragment of the mtDNA on a wide sampling of European wildcats collected across Europe to (i) detect clear signs of genetic differentiation between central and southern European wildcat populations, likely resulting from glacial isolation and consequent post-glacial recolonization processes, and (ii) disentangle the origin of shared haplotypes between wild and domestic cats.

Phylogenetic Patterns
Although based on partial ND5 sequences, our phylogenetic reconstructions on European wild and domestic cats well reflected their evolutionary relationships confirming previous studies on larger portions of the mitochondrial DNA [20,31,73], as well as entire mitogenomes [74]. Indeed, we detected four main F. catus/lybica haplogroups (D: Ia, Ib, II, and III), showing an overall high level of haplotype diversity (0.735 ± 0.018) and originating about 80,000 years ago. According to Driscoll et al. [20], such clades might reflect the multiple origins of the F. catus/lybica lineage, whose estimated coalescence time, although older, is largely included within our confidential intervals. Whitin F. catus/lybica lineage, clades I and II showed a higher haplotype richness and frequency among samples (in particular, they included 58.2% of all domestic cats). Such haplogroups could be roughly associated with lineages A/B, previously found by Driscoll et al. [20] and Ottoni et al. [31], which represent the Near Eastern Felis contribution to the mtDNA pool of present-day domestic cats. On the other hand, clade III, showing a basal phyletic position, might correspond to lineage C described by Driscoll et al. [20] and Ottoni et al. [31], which includes north African wildcat haplotypes (possibly of ancient Egypt derivation) later spread in Europe [31].
Concordant results with previous studies [20,31,73] were further found for the F. silvestris lineage W, which showed two main clades, IV and V, dating back about 62,400 years ago. Similar divergent times and phylogenetic patterns have been observed also in other species, such as the pine marten (Martes martes) [69] and the wolf (Canis lupus) [75] and might be the result of Pleistocene climatic oscillations on mammal population distribution and evolution [5,69,75,76].

Phylogeographic Patterns
A spatial analysis of molecular variance (SAMOVA) was performed to clarify phylogeographic patterns within the European wildcat lineage W, showing that its haplotypes were spatially clustered into at least four main geographic groups roughly concordant with the biogeographic regions previously detected analyzing nuclear markers [25,38]: (1) a South-Eastern European cluster, spanning from Italy to Hungary; (2) a Central European cluster, spanning from the Italian and Balkan Alps to Germany; (3) a cluster including the isolated Scottish population; (4) a cluster including the Iberian population. In particular, Cluster 1 showed the overall lowest genetic variability. However, further analyses, including additional wildcat samples from other unsampled and formerly in contact with eastern Countries, such as Anatolia, might reveal increased genetic variability levels and shed light on the possible contributions of such populations in shaping the current European wildcat evolutionary history in eastern European populations, such as that living in Hungary. This cluster was mainly represented in clade IV with (a) the dominant haplotype w1, shared at low frequencies also with the Iberian Peninsula wildcats, which could be the result of a colonization wave from eastern Europe during glacial periods and subsequent isolation south of the Alps, which acted as a natural barrier to gene flow [23,25,77], and (b) the presence of a private haplotype w5 in Sicily, that could result from the long-lasting isolation of the Island from the Peninsula [78]. These recolonization and subsequent isolation patterns are further confirmed by the presence of shared mtDNA haplotypes in other mammal species, such as the pine marten (Martes martes) [69], the red deer (Cervus elaphus) [79], the roe deer (Capreolus capreolus) [5], and the brown bear (Ursus arctos) [80] in southern Europe.
Cluster 2 showed a higher genetic variability, and was mainly represented in clade V, characterized by the predominance of haplotype w4, which might have originated in extra-Mediterranean or in the Dinaric-Alpine refugia and successively widespread in Central Europe, as hypothesized through molecular studies carried out also on the hedgehog (Erinaceus europaeus) [81], the Eurasian lynx (Lynx lynx) [82] and the red deer (Cervus elaphus) [9] and confirmed by the absence of Pleniglacial wildcat archaeological findings [23].
Cluster 3 showed a single unique haplotype w3, which might be the legacy of the mtDNA gene pool of a continental wild ancestor which migrated to Britain via land bridge approximately 10,000 years ago [83] and originated the Island population, which successively experienced recurrent bottlenecks during glacial maximums, more recent demographic declines, and a compromising anthropogenic admixture with domestic cats [84]. Finally, Cluster 4 showed a) the balanced presence of haplogroups IV and V, which might derive from the Pleistocene population migration waves from Central Europe, as supported also by ABC analyses, and b) the occurrence of three private haplotypes, w9 and w12 in Spain and w8 in Portugal, which might have originated during the subsequent isolation south of the Pyrenes [85,86].

Mito-Nuclear Discordance and Evolutionary Hypotheses
Our phylogenetic and phylogeographic history of the European wildcat assessed by partial mitochondrial sequences revealed the widespread presence of haplotypes shared between wild and domestic cat populations. Indeed, a consistent proportion of individuals previously assigned to the wildcat population through STR Bayesian clustering analyses [25] were included in the mitochondrial lineage D. Cases of Felis mito-nuclear discordances have been already described and generally attributed to recent F. catus mitochondrial introgressions, which may have likely eroded domestic ancestry at the nuclear loci after a few backcrossing generations [87], leaving exogenous traces only at the mtDNA or at a small portion of the nuclear genome [16]. Accordingly, our mtDNA data showed several shared haplotypes (dw1, dw2, dw3, dw5, dw7), mostly frequent within domestic cats, which seem to have recently differentiated and simultaneously split about 1000 years ago, as further revealed by the best ABC evolutionary scenario, supporting the hypothesis of an F. catus introgression.
However, other two shared haplotypes, belonging to the D subclade Ia (dw4 and dw6), mostly frequent within wildcats, seem to have originated about 37,000 years ago, which, according to the last available archaeological and genetic findings, long pre-dated any evidence for cat domestication [20,30,88], thus suggesting their possible natural F. lybica derivation. This latter hypothesis fits well with the scenario of a late Pleistocene European wildcat migration toward the Levant and Anatolia regions already occupied by F. lybica populations [89,90]. Such a syntopic event might have promoted a natural inter-taxon gene flow introgressing F. lybica mitochondrial signatures in some European wildcat individuals [31].
A possible female-biased directionality of the admixture patterns might justify the high numbers of F. catus/lybica mtDNA hyplotypes found in the European wildcats analyzed in this study, though this hypothesis should be confirmed or denied by further Y-chromosome haplotype analyses.

Conclusions
In this study, using a short but highly diagnostic portion of the mtDNA, we provide the first exhaustive description of the European wildcat phylogenetic and phylogeographic structure across the entire species' range in the continent. Our results suggest the presence of at least three main continental biogeographic clusters, roughly corresponding to the Iberian Peninsula, the South-Eastern European Mainland, and Central Europe, whose origin fits well with a model of species glacial isolation and post-glacial re-expansion from the Mediterranean and extra-Mediterranean refugia during the late Pleistocene. As expected, a fourth biogeographic cluster was identified in the isolated and almost genetically compromised Scotland wildcat population [84], which showed unique wild and domestic haplotypes. Based on their wild ancestry content, such biogeographic clusters could be used to identify four possible preliminary corresponding Conservation Units (CU, [91]) to be treated as different management priorities and preserved through well-planned conservation actions, depending on their wild genomic mito-nuclear concordance. Interestingly, our data also show the presence of mtDNA haplotypes shared between wild and domestic cat populations, likely resulting from two different independent evolutionary processes, historical natural gene flow among wild lineages, and recent wild x domestic anthropogenic hybridization. Future studies, based on the analyses of entire mitogenomes and whole nuclear genomes of early domesticated cats from museum collections, modern and ancient European and African wildcats collected also in their overlapping distribution ranges (from Turkey to the Near East), could undoubtedly help researchers to disentangle this complex biogeographic mosaic, clarify the evolutionary histories and admixture patterns, as well as shed light on the origin of the current mito-nuclear variability of the European wildcat populations and their long-term adaptive potential.