Highly Clonal Structure and Abundance of One Haplotype Characterise the Diplodia sapinea Populations in Europe and Western Asia

Diplodia sapinea is a cosmopolitan endophyte and opportunistic pathogen that has occurred on several conifer species in Europe for at least 200 years. In Europe, disease outbreaks have increased on several Pinus spp. in the last few decades. In this study, the genetic structure of the European and western Asian D. sapinea population was investigated using 13 microsatellite markers. In total, 425 isolates from 15 countries were analysed. A high clonal fraction and low genetic distance between most subpopulations were found. One single haplotype dominates the European population, being represented by 45.3% of all isolates and found in nearly all investigated countries. Three genetically distinct subpopulations were found: Central/North European, Italian and Georgian. The recently detected subpopulations of D. sapinea in northern Europe (Estonia) share several haplotypes with the German subpopulation. The northern European subpopulations (Latvia, Estonia and Finland) show relatively high genetic diversity compared to those in central Europe, suggesting either that the fungus has existed in the North in an asymptomatic/endophytic mode for a long time or that it has spread recently by multiple introductions. Considerable genetic diversity was found even among isolates of a single tree, as 16 isolates from a single tree resulted in a lower clonal fraction index than most subpopulations in Europe, which might reflect cryptic sexual proliferation. According to currently published allelic patterns, D. sapinea most likely originates from North America or from some unsampled population in Asia or Central America. In order to enable the detection of endophytic or latent infections of planting stock by D. sapinea, new species-specific PCR primers (DiSapi-F and Diplo-R) were designed. During the search for Diplodia isolates across the world for species-specific primer development, we identified D. africana in California, USA, and in the Canary Islands, which are the first records of this species in North America and in Spain.

The native range of D. sapinea is thought to be in pine forests of the Northern Hemisphere [42,44]. However, the highest genetic diversity is observed in South Africa, when compared to populations of the Northern Hemisphere [36,45,47]. Although populations of some European countries have been characterised by molecular methods [24,43,47,48], the genetic structure of D. sapinea in Europe (including subpopulations recently established in northern Europe) is still poorly known.
The aim of this study was to characterise the genetic diversity of D. sapinea in Europe in order to improve the understanding of the pathogen's spread across the continent. In our experience, existing species-specific DNA primers fail to identify all isolates of D. sapinea; therefore, we started this research by developing a new primer pair and testing it with an extensive set of isolates obtained from a geographically wide range. Isolates identified as D. sapinea using this new primer pair were characterised using microsatellite markers and mating type determinations. Diversity was determined both at the country level (1-11 sites per country) and within single trees. Genetic characteristics of subpopulations were compared to the present disease situation in each country.

Sample Collection and Disease Severity
Pine needles, cones and shoots with and without Diplodia sapinea pycnidia were collected from one or several locations in each of 15 countries: Belarus, Estonia, Finland, Georgia, Germany, Italy, Latvia, North Macedonia, Norway, Poland, European part of Russia, Serbia, Slovakia, Switzerland and Ukraine (Figure 1 and Suppl. Tables S1 and S3). Sampled hosts included nine Pinus species, one Pseudotsuga species and one vector (insect) species (Pityogenes quadridens) (Suppl. Table S1). Samples were collected from 2011 to 2020. Cones, needles and shoots were collected from forests and urban greeneries (Suppl. Tables S1 and S3). Five isolates of D. sapinea were obtained from the exoskeleton of the bark beetle Pityogenes quadridens collected in a Norwegian Pinus sylvestris forest. To study the population structure on a European scale, only one fungal isolate per sampled tree was used in the analyses. To determine the genetic diversity of the pathogen at a small spatial scale, several isolates were obtained from single trees and small groups of nearby trees in Estonia and Slovakia. Additionally, several isolates per tree were used for analyses of first arrival haplotypes into Estonia.
One batch of isolates was obtained from a single P. nigra tree in Järvselja, south-east Estonia, from which the first D. sapinea record was documented in the Baltic states in 2007 [25]. In total, 16 isolates originate from this tree; five of them were isolated in 2012, three in 2013 and eight in 2018. Another batch of 14 isolates was obtained in 2012 from a small (0.7 ha) P. nigra stand in Muhu island, western Estonia. The third batch (10 isolates) was obtained in 2012 from six P. sylvestris trees on Vormsi island (on the west coast of Estonia) in a private garden, from where D. sapinea was found for the first time in the Baltic states on a native host, P. sylvestris [23]. The Estonian isolates were divided into two groups according to sampling time: (a) 2011-2012 as first arrivals of the pathogen; (b) 2013-2018 as the second wave of arrivals of the pathogen [23,25].
In Slovakia, 23 trees from 10 locations were sampled. Two to four isolates were obtained from each tree, yielding a total of 62 isolates. All the isolates were obtained in 2019 from cones of P. nigra or P. sylvestris trees. These and additional sampling sites are detailed in Suppl. Table S1.
Disease severity on native pine species during sampling was assessed in every country. Disease severity was indexed as follows: 1 = endophytic presence only, i.e., no disease outbreaks, 2 = weak local outbreaks, 3 = moderate local outbreaks (lethal for single mature trees) (Suppl. Text S1).

Fungal Isolations, DNA Extraction and Isolate Identification
Fungal isolations were performed according to the protocols of Mullett and Barnes [50]. Approximately 0.04 g of mycelium from the colony edge was transferred into 2.0 mL microcentrifuge tubes and stored at −20 °C for DNA extraction. The DNA of the German samples was extracted according to Keriö et al. [51] with some modifications. Mycelium was homogenised with a Retsch MM400 homogeniser (Retsch GmbH, Haan, Germany) using metal beads (∅ 2.5 mm). DNA was extracted using the GeneJET Genomic DNA Purification Kit (Thermo Scientific, Vilnius, Lithuania).
For the detection of D. sapinea, species-specific conventional PCR was performed with primers DiSapi-F and Diplo-R targeting mtSSU DNA (developed in this study, see Section 2.4). PCR reactions were carried out in 20 µL volumes: 1 µL DNA template, a final concentration of 0.4 µM of each forward and reverse primer, 4 µL 5× HOT FIREPol Blend Master Mix Ready to Load with 10 mM MgCl2 (Solis BioDyne, Tartu, Estonia). Cycling conditions were as follows: initial activation at 95 °C for 15 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 61 °C for 30 s and elongation at 72 °C for 1 min, and a final elongation at 72 °C for 10 min. All PCRs were carried out on a TProfessional Thermocycler (Biometra, Göttingen, Germany). PCR products were visualised on a 1% agarose gel (SeaKem® LE Agarose, Lonza, Rockland, ME, USA) under UV light using a Quantum ST4-system (VilberLourmat SAS, Marne-la-Vallée, France).

ITS Sequencing
The identity of Diplodia sapinea, D. africana, D. mutila, D. seriata, D. scrobiculata, Botryosphaeria dothidea, Lasiodiplodia gonubiensis, L. theobromae and Trichoderma paraviridescens isolates used for the D. sapinea species-specific primer design (described in Section 2.4) was confirmed by sequencing the internal transcribed spacer (ITS) region. ITS-PCR was performed using the fungal-specific PCR primers ITS1-F [52] and ITS4 [53]. PCR reactions were carried out as described by Drenkhan et al. [54]. PCR products were sent for sequencing to the Estonian Biocentre in Tartu. The ITS region of samples was sequenced using the primer ITS5 [53]. The sequences were edited using BioEdit version 7.2.5 [55]. BLAST searches for fungal taxa confirmation were performed in GenBank (NCBI). ITS sequences of the isolates were deposited in GenBank (Suppl. Table S2).

Species-Specific Conventional PCR Primer Design
Primers specific to D. sapinea were designed in silico based on mitochondrial small subunit ribosomal DNA (mtSSU rDNA) sequences of D. sapinea and other related species present in the International Nucleotide Sequence Database Collaboration (INSDC) database. The sequences were downloaded and aligned using MAFFT. Aligned sequences were then scanned for regions that are conserved in all sequences belonging to D. sapinea but contain mismatches in comparison to sequences from other species of Diplodia. Specificity of suitable primer sequences was validated using BLASTn searches against the INSDC nucleotide database. Primer pairs fully complementary to species other than D. sapinea were discarded. IDT OligoAnalyzer 3.1 was used to select primers with melting temperatures differing by less than 4 °C, and to check the stability of potential homodimers, heterodimers and hairpin structures.
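The screening rule described above (keep a primer pair only if no non-target species is fully complementary to both primers) can be illustrated with a minimal sketch. It is not the procedure actually run in this study, which relied on MAFFT alignments and BLASTn searches. The two primer sequences are those reported in this study; the template sequences, the spacer and the function names are hypothetical and serve only to show the logic.

```python
# Primer sequences as reported in this study (5'->3').
FORWARD = "CCCTTATATATCAAACTATGCTTTGT"   # DiSapi-F
REVERSE = "TTACATAGAGGATTGCCTTCG"        # Diplo-R

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def min_mismatches(site, template):
    """Smallest Hamming distance between `site` and any same-length window."""
    k = len(site)
    if len(template) < k:
        raise ValueError("template shorter than binding site")
    return min(
        sum(a != b for a, b in zip(site, template[i:i + k]))
        for i in range(len(template) - k + 1)
    )

def pair_is_specific(forward, reverse, non_targets):
    """Reject the pair if any non-target matches BOTH primers perfectly."""
    return not any(
        min_mismatches(forward, t) == 0
        and min_mismatches(revcomp(reverse), t) == 0
        for t in non_targets
    )

def mutate(site, positions):
    """Copy of `site` with a different base at each listed position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    bases = list(site)
    for p in positions:
        bases[p] = swap[bases[p]]
    return "".join(bases)

# HYPOTHETICAL templates (top strand), not real mtSSU sequences.
mutated_forward_site = mutate(FORWARD, [3, 10, 17, 24])      # 4 substitutions
target = FORWARD + "AAAA" + revcomp(REVERSE)                 # both sites intact
non_target = mutated_forward_site + "AAAA" + revcomp(REVERSE)

forward_mismatches = min_mismatches(FORWARD, mutated_forward_site)
specific = pair_is_specific(FORWARD, REVERSE, [non_target])
```

In this toy case the reverse primer site is intact in the non-target template, yet the pair is retained because the forward primer site carries mismatches, which mirrors the situation reported for D. scrobiculata in the Results.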
All PCR experiments were carried out using the following reaction mixture: 5 µL of PCR master mix with 10 mM MgCl2 (5× HOT FIREPol® Blend Master Mix Ready to Load; Solis BioDyne, Tartu, Estonia); 0.5 µL each of forward and reverse primer (20 µM; 0.4 µM final conc.), 18 µL of PCR grade water, and 1 µL of template DNA. The following thermal cycling program was adopted after optimisation using an annealing temperature gradient: 95 °C for 15 min; 35 cycles at 95 °C for 30 s, 61 °C for 30 s, and 72 °C for 1 min; a final step of 72 °C for 10 min. For primer specificity confirmation, DNA from pure cultures of D. sapinea, D. africana, D. mutila, D. seriata, D. scrobiculata, Botryosphaeria dothidea, Lasiodiplodia theobromae and Trichoderma paraviridescens was used (Suppl. Table S2 and Figure S1), as well as DNA from environmental samples of soil, wood and needles infected by D. sapinea (data not shown). In total, we used more than 500 samples from 19 countries covering five continents: North and South America, Europe, Asia and Oceania. The detection limit of the primers was determined using 10-fold serial dilutions of a D. sapinea pure culture DNA sample, whose concentration was determined using a Qubit fluorometer (Thermo Fisher Scientific, Waltham, MA, USA).
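As a quick consistency check of the reaction mixture above, the final concentrations follow from the dilution relation C1·V1 = C2·V2. The short sketch below uses only the volumes and stock concentrations stated in the text; the dictionary keys and the function name are illustrative.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Solve C1*V1 = C2*V2 for C2; units are whatever the caller passes in."""
    return stock_conc * stock_volume / total_volume

# Volumes in µL, as listed for the specificity PCR above.
components = {
    "master_mix_5x": 5.0,
    "forward_primer": 0.5,
    "reverse_primer": 0.5,
    "water": 18.0,
    "template_dna": 1.0,
}

total_volume = sum(components.values())                       # µL
primer_final_um = final_concentration(20.0, 0.5, total_volume)  # 20 µM stock
master_mix_final_x = final_concentration(5.0, 5.0, total_volume)  # 5x stock

print(f"total volume: {total_volume} µL")
print(f"primer final concentration: {primer_final_um} µM")
print(f"master mix final strength: {master_mix_final_x}x")
```

The listed volumes sum to 25 µL, which gives the stated 0.4 µM final primer concentration and a 1× master mix.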

Haplotype Determination
For multilocus haplotyping, 13 microsatellite markers [35,46] were used with fluorescently labelled forward primers (Table 1). Different fluorescent labels (FAM, ATTO532, ATTO550, ATTO565) allowed fragment analysis to be run in a single panel containing amplicons from all 13 loci. PCR reactions were performed in 20 µL reaction volumes, consisting of 2 µL template DNA, a final concentration of 0.3 µM forward and reverse primer, 4 µL 5× HOT FIREPol Blend Master Mix Ready to Load with 10 mM MgCl2 and 13 µL PCR grade water. PCR was done as described by Burgess et al. [46] and Bihon et al. [35] with some modifications (see Table 1).
PCR products for fragment analysis were pooled into a single panel and run on an Applied Biosystems 3130XL genetic analyser along with LIZ 500 size standard (Applied Biosystems). Alleles were scored using GENEMAPPER v5.0 (Applied Biosystems, Carlsbad, CA, USA).

Genetic Diversity and Differentiation of Populations
Individuals with identical alleles at all microsatellite loci were considered clones. Individuals from the same country were considered to represent one subpopulation. Two datasets were generated: one containing all individuals, i.e., the non-clone-corrected (non-cc) dataset; and one containing only one individual of each multilocus haplotype per subpopulation, i.e., the clone-corrected (cc) dataset. The cc dataset was used to calculate the total number of different multilocus haplotypes using GENALEX 6.5 [56]. The non-cc dataset was used for calculating the clonal fraction index CF = 1 − [(number of different multilocus haplotypes)/(total number of isolates)] [57]. The cc dataset was also used to calculate mean haploid genetic diversity (h), total number of alleles, private alleles, mean number of different alleles (Na), and mean unbiased diversity (uh) for each subpopulation using GENALEX 6.5. Allelic richness (AR, the number of distinct alleles in a population) and private allelic richness (PAR, the number of alleles unique to a particular population) were calculated with ADZE 1.0 using the rarefaction approach, with subpopulation sizes standardised to the smallest sample size of 6 [58].
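The clone correction and the clonal fraction index defined above are simple enough to sketch directly. The example below is illustrative only and does not reproduce GENALEX or ADZE. The function names and haplotype labels are hypothetical, and the rarefaction function implements the standard expectation for the number of distinct alleles in a subsample of g gene copies, which we assume is the kind of quantity ADZE reports rather than a verified reimplementation of it.

```python
from math import comb

def clone_correct(isolates):
    """Keep one isolate per multilocus haplotype within each subpopulation.

    `isolates` is an iterable of (subpopulation, haplotype) pairs.
    """
    seen, corrected = set(), []
    for subpop, haplotype in isolates:
        if (subpop, haplotype) not in seen:
            seen.add((subpop, haplotype))
            corrected.append((subpop, haplotype))
    return corrected

def clonal_fraction(haplotypes):
    """CF = 1 - (number of different haplotypes / total number of isolates)."""
    return 1 - len(set(haplotypes)) / len(haplotypes)

def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles at one locus in a subsample of size g.

    `allele_counts` holds the number of copies of each allele at the locus.
    """
    n = sum(allele_counts)
    if g > n:
        raise ValueError("subsample size exceeds sample size")
    return sum(1 - comb(n - c, g) / comb(n, g) for c in allele_counts)

# Hypothetical labels; the counts (10 isolates, 4 haplotypes, the most common
# found six times and the next twice) mirror the Vormsi batch in the Results.
vormsi_like = ["hA"] * 6 + ["hB"] * 2 + ["hC", "hD"]
cf = clonal_fraction(vormsi_like)

cc = clone_correct([("EST", "hA"), ("EST", "hA"), ("GER", "hA")])
richness = rarefied_allelic_richness([3, 3], 2)
```

With ten isolates and four haplotypes the index is 1 − 4/10 = 0.60, in agreement with the value reported for that batch. Clone correction is applied per subpopulation, so the same haplotype is retained once in each subpopulation where it occurs.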
An analysis of molecular variance (AMOVA) was used on the cc dataset to test the significance of differentiation between subpopulations. Subpopulation RUS was omitted from AMOVA analyses due to its small sample size which did not meet the requirements of the test. Additionally, an AMOVA was conducted to test differentiation of the subpopulations on the two main hosts, P. nigra and P. sylvestris. Other host taxa were discarded from the analysis due to their small sample size.
The software STATISTICA was used to test whether mean haploid genetic diversity (h), mean unbiased diversity (uh), allelic richness (AR) or private allelic richness (PAR) correlate significantly with northern latitudes or eastern longitudes or with disease severity index (Suppl. Text S1).
EDENetworks v2.18 was used for minimum spanning network visualisation on the cc and non-cc dataset [59]. Analyses were carried out on Fst fixation indexes.

Mating Type Determination and Random Mating
To determine the mating type of D. sapinea isolates, mating type primers were used to amplify the MAT genes. The 20 µL PCR reaction mix consisted of 4 µL of 5× HOT FIREPol Blend Master Mix Ready to Load with 10 mM MgCl2, each mating type primer at a final concentration of 0.5 µM (DipM1f and DipM1r, or DipHMGf and DipHMGr [37]), 1 µL template DNA and 13 µL PCR grade water. PCR conditions followed Bihon et al. [37], with the adjustment of the initial denaturation step to 95 °C for 12 min.
In order to investigate the possibility of sexual recombination, two tests were carried out on both cc and non-cc datasets for subpopulations represented by at least six isolates. Firstly, an exact binomial test, using two-tailed p-values, was used to test whether mating type ratios deviated from a 1:1 ratio. Secondly, the index of association (IA) was used to test for haploid linkage disequilibrium of the 13 microsatellite loci in GENALEX 6.5 [56].
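For readers unfamiliar with the first test, a two-tailed exact binomial test against a 1:1 ratio can be sketched in a few lines. This is a generic illustration, not the software used in this study; the function name is hypothetical and the counts in the example are invented. The two-tailed p-value is taken here as the summed probability of all outcomes that are no more likely than the observed one, which is one common convention.

```python
from math import comb

def binomial_two_tailed_p(k, n, p=0.5):
    """Exact two-tailed binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are at most as likely as
    the observed outcome (small relative tolerance for float comparison).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(x for x in pmf if x <= threshold))

# Invented counts: number of MAT1-1-1 isolates out of all isolates typed.
p_balanced = binomial_two_tailed_p(5, 10)   # 5 : 5
p_skewed = binomial_two_tailed_p(0, 6)      # 6 : 0 in the other direction
p_too_small = binomial_two_tailed_p(0, 5)   # most extreme result with n = 5
```

A property of the test worth noting is that, with five isolates, even the most extreme outcome gives p = 0.0625, so a deviation from 1:1 can only reach p < 0.05 from six isolates upwards.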

Isolation by Distance
Mantel tests, conducted in GENALEX 6.5, were used to test for isolation by distance on the cc dataset using Nei's genetic distance [60,61] and geographic distance. Only subpopulations with a sample size of six or higher were included in the analysis. For visualisation of Nei's genetic distance and geographic distance, principal coordinates analysis (PCoA) was carried out in GENALEX 6.5 using the covariance standardised method on the cc dataset. STRUCTURE [62] was used to determine the most likely number of population clusters (K). Twenty independent runs were carried out for each value of K = 1-20, each with 100,000 burn-in iterations followed by a run of 500,000 iterations. The optimum number of clusters (K) was determined using the ln(Pr(X|K)) method [63,64] in CLUMPAK [65].
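The logic of a Mantel test can be sketched as a permutation test on the correlation between two distance matrices. The example below is a generic, minimal version and should not be read as the GENALEX implementation; the function names, the number of permutations and the toy coordinates are all assumptions made for illustration.

```python
import random
from statistics import mean

def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def mantel(genetic, geographic, permutations=999, seed=0):
    """One-tailed Mantel test: returns (observed r, permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel subpopulations in one matrix only
        if pearson(upper_triangle(genetic, perm), geo_vec) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# TOY DATA: five hypothetical sites on a line; genetic distance is set to be
# exactly proportional to geographic distance.
positions = [0.0, 1.0, 2.0, 4.0, 7.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]

r_obs, p_value = mantel(gen, geo)
```

Rows and columns of one matrix are permuted together so that the dependence among pairwise distances sharing a subpopulation is preserved under the null hypothesis.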

Identification of Diplodia sapinea with Conventional PCR Primers
As a result of in silico screening for inclusivity, specificity, melting temperatures and stability of homodimers, heterodimers and hairpin structures, the following D. sapinea specific primer pair was selected: DiSapi-F (5′-CCCTTATATATCAAACTATGCTTTGT-3′) and Diplo-R (5′-TTACATAGAGGATTGCCTTCG-3′). The forward primer is fully complementary only to the sequences of D. sapinea (4 bp differ from D. scrobiculata), whereas the reverse primer is also complementary to D. scrobiculata. Together, these primers amplify a 546 bp fragment of D. sapinea mtSSU rDNA (Suppl. Figure S1). The detection limit of D. sapinea DNA with these primers in DNA extracts of pure cultures was 2.4 pg. The species-specific PCR primers were tested on several isolates of D. sapinea from five continents (North America, South America, Europe, Asia and Oceania) (Suppl. Table S2 and Figure S1). In agarose gel electrophoresis, the primers produced a visible band of the expected size only from DNA extracted from D. sapinea pure cultures and wood infected by D. sapinea. No bands were observed when using DNA from other species or from the 10 analysed soil samples [66].
The new primers DiSapi-F and Diplo-R were able to discriminate D. sapinea from D. africana, D. mutila, D. seriata, D. scrobiculata, Botryosphaeria dothidea, Lasiodiplodia gonubiensis, L. theobromae and Trichoderma paraviridescens isolates (Suppl. Table S2 and Figure S1). A total of 425 European isolates were identified as D. sapinea using the species-specific primer pair DiSapi-F/Diplo-R. Isolates which were not D. sapinea were excluded from population analyses.
Fifty-two different multilocus haplotypes (MLH) were found in the 425 European isolates of D. sapinea (Table 2). The most frequent haplotype, MLH29, was represented by 185 isolates from 13 subpopulations. The second and third most frequent haplotypes were represented by 37 isolates from nine subpopulations (MLH33) and by 32 isolates from two subpopulations (MLH8) (Table 2). Of the 52 haplotypes, 27 were private haplotypes, i.e., found in just a single subpopulation. Subpopulations GER, GEO, EST and UKR had the highest number of haplotypes (13, 10, 9 and 9, respectively) and GER had the highest number of private haplotypes (7) (Table 3). In general, D. sapinea subpopulations in Europe are characterised by a high clonal fraction index, which fell in the range 0.45-0.90 in 13 out of 15 subpopulations (Table 3). The lowest clonal fraction index was found in the FIN (0.45) and ITA (0.46) subpopulations, while the highest was recorded in the SLO (0.85) and POL (0.88) subpopulations (Table 3).
Note to Table 2: Haplotypes of first arrivals of D. sapinea into Estonia are highlighted in bold and underlined. If several isolates were obtained from a single tree or a small group of trees, only one randomly chosen haplotype was used in the main study; the additional isolates from the same tree were used in the small spatial scale analyses and are indicated in brackets. Percentages were calculated from the haplotypes used in the main study.
The highest number of shared haplotypes (5) was found between the EST and GER subpopulations, followed by the FIN and POL subpopulations with four shared haplotypes (Table 2).
In the Estonian subpopulation of first arrivals (several isolates per tree) (EST) of D. sapinea, haplotypes MLH4, MLH8 and MLH10 were found only in subpopulations EST and GER, while haplotypes MLH29 and MLH33 were found in most examined subpopulations, and haplotype MLH15 was found in two countries neighbouring Estonia (Finland and Latvia) and in SER (Table 2). In the Estonian subpopulation of the second wave of arrivals (EST), haplotypes MLH2 and MLH16 were found in the EST and UKR subpopulations, MLH9 in the EST and GER subpopulations, MLH15 in the EST, FIN, LAT and SER subpopulations, MLH40 in the EST, GER, ITA, SLO and SWI subpopulations and haplotype MLH50 in most of the analysed subpopulations (Table 2). Haplotypes MLH1, MLH13, MLH27 and MLH48 were found only in the Estonian (EST) subpopulation. According to minimum spanning network analyses, the betweenness and connectivity between the EST and GER subpopulations is high on both the cc and non-cc datasets (Figure 2). The GEO subpopulation is only weakly connected to the rest of the European subpopulations in the minimum spanning network.

Population Differentiation
Most European subpopulations of D. sapinea did not differ from each other according to the AMOVA of their haplotype variance. Only GEO and GER differed from some other subpopulations (Table 4). According to the AMOVA, 97% of the total molecular variance was ascribed to within-population variation and 3% to among-population variation. No significant differentiation (p = 0.427) was found between isolates from the two main host species, P. nigra and P. sylvestris. Calculation of Nei's genetic distances revealed that subpopulations GEO and ITA clearly differ from other subpopulations in Europe (Table 5, Figure 3). The SER subpopulation is distant from EST and LAT.

Isolation by Distance and Clustering Analysis
The Mantel test on Nei's genetic and geographical distances revealed strong isolation by distance (p = 0.001) when the 15 subpopulations of D. sapinea in Europe and Georgia were considered (Figure 4).

The STRUCTURE results indicated that all of the isolates fell into a single cluster (probability of 0.468) (Figure 5).
Figure 5. The optimum number of clusters determined using the ln(Pr(X|K)) method [63,64] in CLUMPAK [65].

Genetic Diversity and Population Statistics
In the 15 subpopulations analysed with 13 microsatellite markers, 13 (RUS) to 22 (GEO, GER and ITA) alleles were recorded per subpopulation (Table 6). Private alleles were observed in seven subpopulations: one allele in each of BEL, EST, FIN and UKR, two alleles in GER, three in GEO and four in ITA. The rest of the subpopulations (LAT, MAC, NOR, POL, RUS, SER, SLO and SWI) did not have private alleles. The highest allelic richness (AR) was recorded in subpopulation ITA (1.560) followed by GEO (1.478) and SLO (1.472), and the highest private allelic richness (PAR) was also found in subpopulation ITA (0.321), followed by GEO (0.152) and GER (0.077) (Table 6). The lowest allelic richness was observed in EST (1.224), while the lowest private allelic richness occurred in SER (0.022). The highest mean number of different alleles (Na) occurred in SLO (1.538), followed by GEO, LAT and ITA (1.462 for each), while the lowest value was observed in EST (1.231). The highest mean unbiased diversity (uh) was found in SLO (0.185), followed by ITA (0.179) and SER (0.174). The lowest mean unbiased diversity was observed in the EST and UKR subpopulations (Table 6). Similarly, the highest mean haploid genetic diversity (h) was documented in the SLO (0.154), ITA (0.150) and SER (0.143) subpopulations (Table 6), while the lowest values were found in the EST and UKR subpopulations.
No statistically significant correlations between northern latitudes, eastern longitudes or disease severity index and mean haploid genetic diversity (h), mean unbiased diversity (uh), allelic richness (AR) or private allelic richness (PAR) were found (data not shown).

Mating Type Distribution and Haploid Linkage Disequilibrium
Both mating type idiomorphs (MAT1-1-1 and MAT1-2-1) were represented in 12 out of 15 subpopulations (Table 7). In subpopulation MAC only the MAT1-2-1 mating type idiomorph was found, while in subpopulations NOR and RUS only the MAT1-1-1 mating type idiomorph was found (Table 7). An unequal distribution of mating type idiomorphs was registered in the GEO, ITA, MAC, NOR and SLO subpopulations in the non-cc dataset (p < 0.05), whereas in the cc dataset both mating type idiomorphs were present in equal proportion (p > 0.05) in all subpopulations.
Random mating was not supported by the index of association (IA) test in the MAC subpopulation (p = 0.045) using the cc dataset, and in the GEO, ITA, MAC, POL, SER and SLO subpopulations using the non-cc dataset (p < 0.05). Using both datasets (cc and non-cc), most subpopulations had low linkage disequilibrium, except the MAC subpopulation (IA = 4.168 on the non-cc and 6.210 on the cc dataset) (Table 7). The significant IA for many subpopulations, together with the balanced ratio of mating types, suggests that sexual reproduction is likely occurring in these subpopulations, albeit at a low level. The high clonal fraction of many subpopulations demonstrates the predominance of asexual reproduction.

Haplotypic Diversity at Small Spatial Scale
In 2007 D. sapinea was documented for the first time in the Baltic region on cones of a single P. nigra tree in Järvselja nursery [25]. Sixteen isolates of D. sapinea were obtained from this tree in three sampling years (2012, 2013 and 2018) (Table 8).
The first finding of D. sapinea on a native pine in Estonia was on Vormsi island, where six P. sylvestris trees were sampled in 2012. Ten isolates comprising four different haplotypes were obtained from these six trees, giving a clonal fraction of 0.60. The most abundant haplotypes were MLH48, with six representatives, and MLH8, with two representatives (Table 8).
On Muhu island, western Estonia, D. sapinea has been found since 2008 in a P. nigra stand of 0.7 ha. In 2012, 14 isolates comprising four different haplotypes were obtained from c. 10 trees, giving a clonal fraction of 0.71. The most abundant haplotypes were MLH29 and MLH45, each occurring six times (Table 8).
In Slovakia, 23 trees from 10 sites were sampled (Table 8). From each tree two to four isolates were obtained, each from a different cone. Each tree yielded only one haplotype, with the exception of a single tree in Galanta, where two haplotypes (MLH39 and MLH40) occurred among four isolates (Table 8).

Allele Polymorphism in Different Loci in North America and Europe
To date, of the SSR markers used on isolates from both Europe and North America, only marker SS6 has been found to be monomorphic on both continents (Table 9). Within Europe, markers SS1 and SS2 appear monomorphic. Among loci that have been analysed on both continents, higher polymorphism is observed in North America than in Europe, even though the sample size in North America (N = 67) is roughly 10 times smaller than in Europe (N = 623) (Table 9). In other words, allelic diversity is considerably lower in Europe than in North America.

Discussion
Population analyses, based on 342 isolates of D. sapinea from 14 European countries and 1 western Asian country, revealed that the subpopulations are dominated by one microsatellite haplotype (MLH29), which comprised 45.3% of the isolates (Table 2). Nonetheless, 27 of the 52 recorded multilocus haplotypes were found only once. Only two subpopulations (GEO and ITA) out of the 15 differed significantly from other subpopulations. Significant isolation by distance was found in the European subpopulations (Figure 4), yet they have relatively low Nei's genetic distance (0.005-0.080) (Table 5) compared with, for example, Dothistroma septosporum (0.055-0.857) [67,68]. Because the clustering analyses grouped all isolates into a single cluster (Figure 5) and private alleles were relatively rare (Table 6), we conclude that the European D. sapinea population is homogeneous and little differentiated, with the exception of subpopulations ITA and GEO.
The Italian (ITA) and Georgian (GEO) subpopulations clearly deviated from the others according to Nei's genetic distance (Figure 3) which may reflect their biogeographic isolation from Central and Northern Europe due to the Alps and Caucasus mountains, respectively. These subpopulations are characterised by having a relatively high number of private haplotypes and alleles (Tables 3 and 6). Additionally, the ITA and GEO subpopulations were characterised by a high allelic and private allelic richness, mean number of different alleles, mean unbiased diversity and mean haploid genetic diversity compared to other subpopulations.
Based on systematic annual forest disease monitoring (since 2007) on more than 60 permanent sampling sites, D. sapinea was distributed randomly around Estonia until 2012 [69]. Since 2013 the pathogen has spread widely by moving from south to north across Estonia [23]. Among the first arrived haplotypes (isolates obtained up to 2012, Table 2) only MLH4, MLH8 and MLH10 were found in subpopulations EST and GER. Haplotypes MLH29 and MLH33 were found in most of the analysed subpopulations in Europe, and haplotype MLH15 was found in neighbouring countries (Finland and Latvia). Additionally, a strong link between Estonian and German subpopulations was demonstrated by minimum spanning network analyses. Therefore, it is likely that the first introductions of D. sapinea to Estonia were of German origin or from another, unsampled, population. That the movement of D. sapinea infected plant material to Estonia from other countries is possible is demonstrated by the finding of imported seedlings testing positive for the pathogen (R. Drenkhan, unpublished data). For example, these seedlings (Tsuga canadensis and Picea pungens) were imported from the Netherlands and Germany in November 2015 and displayed no clear symptoms of D. sapinea infection, indicating that asymptomatic plants can harbour and facilitate spread of the pathogen.
A previous population study of D. sapinea with isolates from Sweden, Italy, Estonia, Spain and Turkey demonstrated the distribution of one haplotype in all investigated countries [24]. Similarly, in the present study, 25 out of 52 haplotypes occurred in at least two different subpopulations (Table 2, Figure 3), across distances of up to 2800 km (GER-GEO) (Table 5). Burgess et al. [47] documented the occurrence of a single haplotype (MS1) in North America, Europe and New Zealand. In the current study, 45.3% of European isolates were represented by haplotype MLH29. According to McDonald and Linde [38], the dominance of a few (virulent) genotypes can evolve through a mixed reproduction system, which poses the highest risk to the host. During sexual recombination new genotypes are produced, and the fittest ones are widely and rapidly dispersed by subsequent asexual reproduction. If a haplotype with high fitness is dispersed over a wide area, an epidemic can arise [38]. For a long time, D. sapinea was thought to be an asexual fungus [70], but both this study (1:1 distribution of mating types and loci that are not linked) and previous ones demonstrate that D. sapinea may be reproducing sexually, at least to some extent [37,45].
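One common way to check whether mating types depart from the 1:1 ratio mentioned above is a chi-square goodness-of-fit test with one degree of freedom. The sketch below is only an illustration under that assumption: the test actually applied in this study is not restated in this section, and the function name and the counts are hypothetical.

```python
import math


def chi_square_equal_ratio(count_a, count_b):
    """Chi-square test of a 1:1 ratio between two mating types.

    Returns (chi2, p_value). With one degree of freedom the p-value
    equals erfc(sqrt(chi2 / 2)), so no external library is needed.
    """
    total = count_a + count_b
    if total <= 0:
        raise ValueError("at least one isolate is required")
    expected = total / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of the two mating types (illustration only).
chi2_balanced, p_balanced = chi_square_equal_ratio(20, 20)
chi2_skewed, p_skewed = chi_square_equal_ratio(30, 10)
print(f"balanced: chi2={chi2_balanced:.2f}, p={p_balanced:.3f}")
print(f"skewed:   chi2={chi2_skewed:.2f}, p={p_skewed:.4f}")
```

A large p-value means the counts are compatible with a 1:1 ratio, as expected under random mating; a small p-value points to a skewed ratio, which is more typical of predominantly clonal reproduction.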
D. sapinea is native to pine forests of the Northern Hemisphere [47]. In general, plant pathogens are thought to be more diverse in their native area than in areas where they have been recently introduced [71]. Nonetheless, the highest microsatellite diversity was observed in South Africa, although this is due to multiple introductions since 1909 [45,72]. Bihon et al. [45], using the same set of 13 microsatellites as used in this study, found polymorphism in the South African population of D. sapinea, while no polymorphism was found among isolates from South America, Europe, Australia, New Zealand and part of Africa [36,42,47,48]. In our study four loci (SS1, SS2, SS11 and SS16) were monomorphic (Table 9; Suppl. Table S3) among 425 isolates. Of the 16 microsatellite loci that have been described for D. sapinea, 10 are polymorphic in European subpopulations while 15 are polymorphic in North American populations (Table 9) [42,43,47,48]. In addition, the loci which were analysed in European and North American populations were found to be more diverse in North America, even if the sample size was roughly 10 times smaller than that of Europe (Table 9). These results support the view that D. sapinea is more likely indigenous to North America than Europe, while almost nothing is known about the population diversity of the fungus in Asia and thus the exact origin of the pathogen remains unknown.
Somewhat surprisingly, high haplotype diversity was observed in the Estonian subpopulation on a single tree in Järvselja nursery, southeast Estonia, where 10 haplotypes were found among 16 isolates (Table 8). The P. nigra tree was moderately to heavily infected by D. sapinea, depending on the year. For comparison, in Slovakia just nine different haplotypes were found among 70 isolates from 23 trees. Only once were two different haplotypes observed on the same tree in Slovakia (Table 8). Notably, two haplotypes (MLH5 and MLH42) isolated from the single tree in Järvselja were not found anywhere else in Estonia, and haplotype MLH48 was found only in Estonia (Table 8). Additionally, the first record of D. sapinea for Estonia and for all of northern Europe was on this particular P. nigra tree [25]. It can be surmised that different haplotypes of D. sapinea have been sporadically imported into this area, probably with infected nursery stock. Another possibility is that sexual recombination has occurred on the tree, altering the haplotypic composition of its isolates. The occurrence of both mating types on the tree since 2012/2013 strengthens the hypothesis of sexual proliferation.
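The contrast between the Järvselja tree and the Slovak subpopulation can be made explicit with the clonal fraction. The sketch below assumes the commonly used definition CF = 1 - (number of haplotypes / number of isolates); the exact formula applied in this study is not restated in this section, and the function name is hypothetical. The counts are those given in the paragraph above.

```python
def clonal_fraction(n_isolates, n_haplotypes):
    """Clonal fraction, assuming CF = 1 - (haplotypes / isolates)."""
    if n_isolates <= 0 or not 0 < n_haplotypes <= n_isolates:
        raise ValueError("need 0 < n_haplotypes <= n_isolates")
    return 1 - n_haplotypes / n_isolates


# Counts taken from the text: 10 haplotypes among 16 isolates (single tree,
# Järvselja) and 9 haplotypes among 70 isolates (Slovakia, 23 trees).
cf_single_tree = clonal_fraction(16, 10)
cf_slovakia = clonal_fraction(70, 9)
print(f"Järvselja tree: {cf_single_tree:.3f}")
print(f"Slovakia:       {cf_slovakia:.3f}")
```

Under this definition the single tree yields a clonal fraction of 0.375, against roughly 0.87 for Slovakia, which is why the tree stands out as unusually diverse.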
The relatively high genetic diversity observed in newly established subpopulations in the Baltics and Finland may be explained by a long existence of the fungus in endophytic or asymptomatic modes. In the autumn of 2014, a survey was carried out in the northern Baltics including 85 asymptomatic pine needle and bud samples from 14 stands across Estonia and northern Latvia. The samples were analysed with D. sapinea species-specific primers and none of them was positive for D. sapinea (R. Drenkhan, unpublished data). Consequently, the endophytic or asymptomatic existence of D. sapinea in Estonia was possible only in restricted or unsampled areas. Terhonen et al. [73] recently found that D. sapinea is an endophyte in healthy P. sylvestris trees in Finland. Separate introductions of different strains of the fungus, as happened in South Africa where the highest diversity of the pathogen is documented [45], are the most likely means of increasing the genetic diversity of such an invasive pathogen. The distribution of identical haplotypes in remote geographic areas indicates that movement of haplotypes across large distances is possible.
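The weight of the negative survey result can be gauged with a simple calculation. The sketch below estimates the highest prevalence still compatible with 85 negative samples, assuming the samples are independent and random and the assay detects every infected sample; these assumptions, as well as the function name, are introduced here for illustration and are not part of the original analysis.

```python
def upper_prevalence_bound(n_negative_samples, confidence=0.95):
    """Upper confidence bound on prevalence when all samples are negative.

    Solves (1 - p) ** n = 1 - confidence for p, i.e. the prevalence at which
    observing n consecutive negatives would happen only (1 - confidence)
    of the time.
    """
    if n_negative_samples <= 0:
        raise ValueError("at least one sample is required")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_negative_samples)


# 85 asymptomatic samples, none positive (figure taken from the text).
bound = upper_prevalence_bound(85)
print(f"95% upper bound on prevalence: {bound:.1%}")
```

Under these assumptions, a prevalence of up to about 3.5% in asymptomatic tissue cannot be excluded, which is consistent with the view that endophytic occurrence, if present at the time, was restricted or lay outside the sampled stands.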
Several molecular assays have been developed for the detection of D. sapinea in environmental samples [13-16]. Each of these previously published methods was tested on isolates originating from 3-6 countries on one or two continents [13-16], while samples from most of Asia and South America were missing. In this study, novel species-specific PCR primers were designed, and samples from five continents were used to test the detection of D. sapinea in environmental samples such as imported and planted stock, seed and seedling lots. The new species-specific primers provide more rapid and reliable detection of D. sapinea than the previously published conventional or nested conventional PCR primers.
During testing of the new PCR protocol with various isolates of Diplodia spp., we identified D. africana (MW332343) from a cone collected in California, USA, which is, to our knowledge, the first record of this fungus on Pinus canariensis in North America. We also found D. africana (MW332342) on a cone of P. canariensis collected in the Canary Islands, which is, to our knowledge, the first record of D. africana in Spain.

Conclusions
In the European D. sapinea population, the dominance of a single haplotype and high genetic similarity between different regions were found, suggesting high gene flow between countries, except for Italy and Georgia. The first established subpopulation of D. sapinea in northern Europe (Estonia) was likely introduced from central Europe (Germany), as indicated by the occurrence of several shared haplotypes. Furthermore, the relatively new subpopulations in northern Europe seem to have a mixed mode of reproduction (both asexual and sexual) and have levels of genetic diversity similar to those of older subpopulations in central and southern Europe. This situation may suggest that the pathogen populations are evolutionarily fit and could pose an increasing risk to pine forests, particularly in the face of a changing climate. Despite c. 200 years of documented presence in European coniferous forests, D. sapinea is less diverse in Europe than in North America, which may suggest that the pathogen is not native to Europe. Further inferences about the origin of D. sapinea require population studies to be extended to Asia and Central America.