Invasive Everywhere? Phylogeographic Analysis of the Globally Distributed Tree Pathogen Lasiodiplodia theobromae

Fungi in the Botryosphaeriaceae are important plant pathogens that persist endophytically in infected plant hosts. Lasiodiplodia theobromae is a prominent species in this family that infects numerous plants in tropical and subtropical areas. We characterized a collection of 255 isolates of L. theobromae from 52 plants and from many parts of the world to determine the global genetic structure and a possible origin of the fungus using sequence data from four nuclear loci. One to two dominant haplotypes emerged across all loci, none of which could be associated with geography or host; and no other population structure or subdivision was observed. The data also did not reveal a clear region of origin of the fungus. This global collection of L. theobromae thus appears to constitute a highly connected population. The most likely explanation for this is the human-mediated movement of plant material infected by this fungus over a long period of time. These data, together with related studies on other Botryosphaeriaceae, highlight the inability of quarantine systems to reduce the spread of pathogens with a prolonged latent phase.


Introduction
The health of both native and planted forests is under increasing pressure from rapid changes in the environment (many related to the growing impacts of human society) or the introduction of non-native, invasive pathogens and pests [1,2]. The rise in the number of invasive pathogens and pests is thought to be driven primarily by increasing international movement and trade in plants and plant products [2,3]. This problem might be even more severe than previously realized, because quarantine mechanisms designed to reduce such movement are oblivious of the multitude of cryptic and endophytic microbes that occur asymptomatically within plants [3,4]. A prominent group of fungi that reflect this threat is the Botryosphaeriaceae.
The Botryosphaeriaceae includes many important plant pathogens such as well-known species in Botryosphaeria, Diplodia, Dothiorella, Lasiodiplodia, Macrophomina, and Neofusicoccum [5]. These fungi can persist endophytically within apparently asymptomatic plant material, from where they can cause disease when the host is stressed [4,6]. Many Botryosphaeriaceae species infect multiple plant hosts and commonly occur on both native and non-native hosts in a region [7-11]. Consequently, they can easily be spread when plants or plant material are moved between regions [3,4].
The majority of the Botryosphaeriaceae have relatively limited distributions [12-15]. This is perhaps not surprising given that their spread is closely linked with rainfall and associated wind dispersal, and is consequently expected to be relatively local [6,16]. While stepwise, long-distance spread would be possible, a continuous distribution of available hosts would be needed, making spread across oceans or other major physical barriers unlikely. A few species, however, have very broad global distributions, including Botryosphaeria dothidea, Diplodia sapinea, D. seriata, Dothiorella sarmentorum, Neofusicoccum parvum, and Lasiodiplodia theobromae [4,11,17-20]. These species are commonly associated with agriculture, forestry, or urban environments, and it is thought that human-assisted dispersal has played a significant role in their distributions [15,18,19].
A number of previous studies have suggested that human-assisted dispersal of the Botryosphaeriaceae might in some cases occur on a large scale. For example, D. sapinea has been introduced to all areas where Pinus species have been planted in the southern hemisphere [21]. Population genetic studies on this fungus suggest that, in most areas, these introductions have been so extensive that the diversity of the non-native populations exceeds that of some local native populations of the fungus [22,23]. Another example is N. parvum, which is also highly genetically diverse, with 12 lineages identified using microsatellite markers, many of which are shared between different countries and on different continents [18]. In the case of Macrophomina phaseolina, Sarr et al. [24] identified three lineages using DNA sequence data for six loci, also with shared geographic ranges. Analyses of a global collection of isolates of B. dothidea using two DNA sequence markers showed that isolates grouped into two main haplotypes, with no structure based on either host genus or country of origin [19].
Lasiodiplodia theobromae is one of the most commonly reported species in the Botryosphaeriaceae. This fungus has been associated with at least 500 plant hosts from many tropical and subtropical regions globally [17,25]. However, many of these host associations and disease reports for L. theobromae predate the use of DNA sequencing for species identification, and at least some could be attributed to cryptic species related to L. theobromae [12,17]. In recent years, many cryptic species have been described for isolates previously treated as L. theobromae due to their morphological similarity, but that are distinct based mainly on DNA sequence data from two loci, the internal transcribed spacer ribosomal DNA (ITS) and translation elongation factor 1α (tef1α) [26-28]. At present, the genus Lasiodiplodia comprises 31 species [20], mostly distinguished using sequence data. Furthermore, Cruywagen et al. [27] recently showed that four species of Lasiodiplodia represent hybrid species, based on more complete isolate collections or sequence data of more loci than originally used. In view of all of these studies, there is no overall clarity on the host or geographic distribution of what can be considered L. theobromae sensu stricto, based on current DNA-based definitions of this taxon. It is also not clear where the fungus might have originated, where it is invasive, or to what extent humans have facilitated the dispersal of this fungus globally.
The first aim of this study was to screen a global collection of isolates putatively identified as L. theobromae and thus to identify a collection that represented L. theobromae sensu stricto based on DNA sequence data. Sequence data from four nuclear loci were then used to determine whether there was genetic structure amongst this global collection of L. theobromae isolates. Finally, we considered whether the data revealed a possible area of origin for the fungus.

Isolate Collections and DNA Extractions
A total of 426 fungal isolates designated as Botryosphaeria sp. or L. theobromae were obtained from the culture collection (CMW) of the Forestry and Agricultural Biotechnology Institute (FABI) at the University of Pretoria, Pretoria, South Africa. These isolates originated from collections made in Australia, Benin, Brazil, Cameroon, China, Colombia, Ecuador, Indonesia, Madagascar, Mexico, Oman, Peru, South Africa, Thailand, Uganda, the United States of America (USA), Venezuela, and Zambia (Figure 1). Several isolates identified as L. theobromae were also sourced from the culture collection of the Westerdijk Fungal Biodiversity Institute (previously known as the Centraalbureau voor Schimmelcultures), Utrecht, the Netherlands. In addition, sequences were sourced from GenBank for taxa labeled as "Botryosphaeria rhodina" or "Lasiodiplodia theobromae" and were included in datasets for analyses (Table 1).
Isolates assembled for this study were purified by transferring single hyphal tips to clean culture plates following the method described in Mehl et al. [30]. DNA was extracted from isolates using the method described by Wright et al. [31] with pellets suspended in 50 µL Tris Ethylenediaminetetraacetic acid (TE) buffer. DNA concentrations were determined using a NanoDrop® ND-1000 and accompanying software (NanoDrop Technologies, DuPont Agricultural Genomics Laboratories, Wilmington, DE, USA).
Isolate identities were confirmed as L. theobromae using data from four loci: the ITS rDNA (including the ITS1, 5.8S nuclear ribosomal RNA (nrRNA) and ITS2), tef1α, β-tubulin-2 (tub2) and RNA polymerase II (rpb2) loci. Preliminary identification was done for all isolates using maximum likelihood phylogenetic analysis of sequence data from the tef1α locus, which was then supported by data for the other three loci. The dataset for tef1α included all other Lasiodiplodia species known at the time of the analyses.
For PCR amplifications, the following primer sets were used: ITS1 and ITS4 [32] for the ITS locus; EF1F and EF2R [33], and EF688F and EF1251R [34], for the tef1α locus; Bt-2a and Bt-2b [35] for the tub2 locus; and RPB2-LasF and RPB2-LasR [27] for the rpb2 locus. PCR mixes were the same as those containing KAPA Taq and MyTaq DNA polymerases described by Mehl et al. [36], and PCR cycling conditions and product visualization were the same as those used by Mehl et al. [37]. PCR product purification and sequencing were done as described by Mehl et al. [30] and sequences were examined and edited using MEGA 6 [38].
Sequence datasets were aligned using MAFFT 6 [39] with the G-INS-I algorithm selected and alignment errors corrected visually. For the tef1α dataset that included isolates of species other than L. theobromae, the best nucleotide substitution model was determined using JMODELTEST 2.1.3 [40] with the corrected Akaike Information Criterion selected. The dataset was analyzed with PHYML 3.0.1 [41] using the same model parameters as determined by JMODELTEST and the robustness of the generated tree was evaluated using 1000 bootstrap replicates. Sequences generated in this study were deposited in GenBank (Table 1).
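As a minimal illustration of model selection by the corrected Akaike Information Criterion, the sketch below ranks candidate substitution models by AICc. The function names, the example log-likelihoods and the use of alignment length as the sample size are assumptions made for illustration; they are not values or settings reported in this study.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected AIC for a model with k free parameters and sample size n."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    # Small-sample correction term added to the ordinary AIC.
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates: dict, n: int) -> str:
    """candidates maps a model name to (log-likelihood, number of free parameters)."""
    return min(candidates, key=lambda name: aicc(*candidates[name], n))


# Hypothetical log-likelihoods for two hypothetical models; n is assumed
# to be the alignment length (340 characters for the tef1α dataset).
example = {"model_A": (-100.0, 2), "model_B": (-99.5, 5)}
print(best_model(example, n=340))
```

The model with the lowest AICc is preferred, so a more heavily parameterized model is chosen only when its gain in likelihood outweighs the penalty.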

Haplotype Assignment and Networks
To ascertain the number of haplotypes for each dataset and to identify where haplotypes occurred, sequence datasets were generated for each locus separately, along with one combined dataset for the ITS and tef1α regions. The combined dataset was generated because it included the majority of isolates and provided a better representation of the diversity inherent in the populations and regions. For each dataset, isolates were assigned to different haplotypes using the map program in Mobyle SNAP Workbench [42]. Sites that violated the infinite sites model, as well as indels, were removed prior to assigning haplotypes. Median joining haplotype networks were then constructed for each dataset, as well as for the combined dataset using NETWORK 4.6.1.3 [43,44].
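The idea of collapsing aligned sequences into haplotypes can be sketched as follows. This is a simplified illustration, not the SNAP Workbench procedure: it removes only gap-containing columns and does not test for sites that violate the infinite sites model. The isolate names and sequences are hypothetical.

```python
from collections import defaultdict


def drop_gap_columns(alignment: dict) -> dict:
    """Remove every alignment column in which any sequence has a gap ('-')."""
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = [i for i in range(length) if all(alignment[n][i] != "-" for n in names)]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


def assign_haplotypes(alignment: dict) -> dict:
    """Group isolates with identical sequences; returns {haplotype_id: [isolates]}."""
    groups = defaultdict(list)
    for name, seq in alignment.items():
        groups[seq.upper()].append(name)
    # Number haplotypes from most to least frequent (ties broken by sequence).
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return {f"H{i}": members for i, (_, members) in enumerate(ordered, start=1)}


# Hypothetical toy alignment of four isolates.
toy = {"iso1": "ACGT-A", "iso2": "ACGTTA", "iso3": "ACGA-A", "iso4": "ACGTCA"}
haplotypes = assign_haplotypes(drop_gap_columns(toy))
print(haplotypes)
```

In the toy alignment the only differences among iso1, iso2 and iso4 fall in the gapped column, so after that column is removed they collapse into a single haplotype.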

Population and Regional Structure and Diversity
To determine whether there was genetic structure present in the datasets and to test for potential population subdivision, haplotype assignments for all four loci, as determined by Mobyle SNAP Workbench, were analyzed using the program STRUCTURE 2.3.4 [45,46]. STRUCTURE uses a Bayesian clustering algorithm to evaluate the possibility of multiple lineages being present. Two sets of analyses were made, the first of which evaluated whether there was genetic structure in the dataset for all isolates. The second set of analyses involved grouping isolates into five populations based on the continent of origin (North America, South America, Africa, Eurasia, and Australasia) and then running STRUCTURE analyses on pairs of populations to determine whether there was genetic structure between any of the populations (10 pairs including every possible combination).
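The count of 10 pairwise comparisons follows from choosing two of the five continental populations, 5 × 4 / 2 = 10. A short sketch, with the helper name chosen here for illustration, enumerates them:

```python
from itertools import combinations


def population_pairs(populations: list) -> list:
    """Return every unordered pair of populations."""
    return list(combinations(populations, 2))


continents = ["North America", "South America", "Africa", "Eurasia", "Australasia"]
pairs = population_pairs(continents)
print(len(pairs))
for first, second in pairs:
    print(first, "vs", second)
```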
For all analyses, burn-in was set at 300,000 and the number of Markov Chain Monte Carlo (MCMC) repeats was set at 900,000, so that more than 1,000,000 repeats were done to generate robust results. Initially, lambda was computed based on five runs at K = 1. The model selected entailed admixture with independent allele frequencies and the lambda value computed. Twenty iterations were done for each value of K = 1 to K = 10. Results were parsed through STRUCTURE HARVESTER [47] and the DeltaK [48] output was used to identify possible subpopulations.
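For readers unfamiliar with the DeltaK statistic, the sketch below shows one common way of computing it from replicate log-likelihoods: the absolute second-order difference of the mean Ln P(D) across successive K values, divided by the standard deviation of Ln P(D) at that K. It is an illustrative calculation on hypothetical values, not a reproduction of the STRUCTURE HARVESTER implementation.

```python
from statistics import mean, stdev


def delta_k(lnp: dict) -> dict:
    """lnp maps K to a list of Ln P(D) values from replicate runs; returns {K: DeltaK}."""
    means = {k: mean(values) for k, values in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # both neighbours are needed, so the smallest and largest K are skipped
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # undefined when replicate runs give identical values
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / sd
    return result


# Hypothetical Ln P(D) values for two replicate runs per K.
runs = {1: [-100, -102], 2: [-80, -82], 3: [-79, -81], 4: [-78.5, -80.5]}
print(delta_k(runs))
```

Because the statistic needs values at both K - 1 and K + 1, it cannot be evaluated at K = 1, so a peak in DeltaK does not on its own demonstrate structure and is best read alongside the corresponding barplots.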
Population statistics, including gene and nucleotide diversities, were inferred using ARLEQUIN 3.5.1.2 [49] on the ITS, tef1α, combined ITS and tef1α, and tub2 sequence datasets for every geographic country and region assigned. Pairwise population differentiation (ΦST) comparisons were computed for all populations and regions using ARLEQUIN on the same dataset.
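The two diversity measures can be illustrated with their standard estimators: gene diversity as n/(n - 1) × (1 - Σp²) over haplotype frequencies p, and nucleotide diversity as the mean number of pairwise differences per site. The sketch below applies these textbook formulas to hypothetical data; it is not claimed to match the exact ARLEQUIN implementation, and it assumes aligned, gap-free sequences of equal length.

```python
from collections import Counter
from itertools import combinations


def gene_diversity(haplotypes: list) -> float:
    """Unbiased gene (haplotype) diversity for a sample of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    frequencies = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in frequencies))


def nucleotide_diversity(sequences: list) -> float:
    """Mean number of pairwise differences per site among aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    pairs = list(combinations(sequences, 2))
    length = len(sequences[0])
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * length)


# Hypothetical sample: three isolates share one haplotype, one differs at one site.
print(gene_diversity(["H1", "H1", "H1", "H2"]))
print(nucleotide_diversity(["ACGT", "ACGT", "ACGT", "ACGA"]))
```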

Putative Geographic Origin of Lasiodiplodia theobromae
To determine the possible centre of origin for L. theobromae, scenarios of how populations could have arisen were simulated and the summary statistics of these compared to those of the observed dataset using DIYABC 2.0.4 [50]. For these analyses, the sequence datasets of isolates (with data from all four loci) were grouped according to continent of origin, similar to the arrangements for the second set of analyses using STRUCTURE. To determine whether any of the populations could be ancestral, pairs of populations were evaluated using three possible scenarios (Figure 2): scenario 1 (the first population is ancestral to both), scenario 2 (the second population is ancestral to both), and scenario 3 (both populations diverged from an unknown ancestral population). For each scenario, 1,000,000 datasets were simulated. Posterior probabilities of scenarios for each analysis step were computed using polychotomous logistic regression on 1% of the simulated datasets closest to the dataset provided. The best scenario was the one having the highest probability and with 95% confidence intervals that did not overlap with those of the other scenarios tested.
Forests 2017, 8, 145

Isolate Collections and Confirmation of Species Identity
The tef1α sequence dataset that included all isolates, as well as representatives of other Lasiodiplodia species, consisted of 340 characters (151 parsimony informative, 22 parsimony uninformative, 167 constant). The model selected by JMODELTEST was HKY (transitions:transversions (ti/tv) = 1.719, γ = 0.407). The resulting tree contained a clade of 255 isolates, from 26 countries, that was considered to represent L. theobromae sensu stricto as it included authentic isolates of this species (Figure S1). Of these, 95 isolates represented a global collection assembled over many years and stored in the CMW culture collection. The other isolates sampled from this collection grouped with Botryosphaeria dothidea, D. pseudoseriata, L. brasiliense, L. crassispora, L. gilanensis, L. gonubiensis, L. hormozganensis, L. iraniensis, L. laeliocattleyae, L. mahajangana, L. margaritacea, L. parva, L. pseudotheobromae, L. viticola, Neofusicoccum parvum, and N. vitifusiforme (data not shown) and were thus excluded. Four isolates were from the collection of the Westerdijk Fungal Biodiversity Institute. The remaining sequences for 156 additional isolates were sourced from GenBank (Table 1, Figure 1). Thus, all subsequent analyses were based on data for this core group of 255 isolates from 52 plant hosts.
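The character counts reported for each alignment partition the sites into three classes: constant, parsimony informative (at least two states, each present in at least two sequences) and variable but parsimony uninformative. A minimal sketch of this classification, using hypothetical sequences and ignoring gaps and ambiguity codes, is:

```python
from collections import Counter


def classify_sites(sequences: list) -> dict:
    """Count constant, parsimony-informative and uninformative alignment columns."""
    counts = {"constant": 0, "informative": 0, "uninformative": 0}
    for column in zip(*sequences):
        # Only unambiguous bases are counted; gaps and ambiguity codes are ignored.
        states = Counter(base for base in column if base in "ACGT")
        if len(states) <= 1:
            counts["constant"] += 1
        elif sum(1 for count in states.values() if count >= 2) >= 2:
            counts["informative"] += 1
        else:
            counts["uninformative"] += 1
    return counts


# Hypothetical four-sequence alignment.
print(classify_sites(["AACA", "AACT", "AGTA", "AGTA"]))
```

Because every site falls into exactly one class, the three counts sum to the alignment length, as in 151 + 22 + 167 = 340 for the tef1α dataset above.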
Countries considered in the analyses were grouped into eight geographic regions, including North America (Hawaii, Mexico, Puerto Rico, United States of America-USA), western South America (Colombia, Ecuador, Peru, Venezuela), eastern South America (Brazil, Uruguay), western Africa (Benin, Cameroon), southern and eastern Africa (Madagascar, South Africa, Uganda, Zambia), Middle East and Europe (Egypt, Iran, Italy, Oman), Asia (China, Indonesia, Korea, Thailand), and Australasia (Australia, Papua New Guinea) (Tables 1 and 2).

Haplotype Assignment and Networks
The ITS dataset (252 isolates) consisted of 333 characters (two parsimony informative, 23 parsimony uninformative, 308 constant) and yielded 11 haplotypes with 17 fixed single nucleotide polymorphisms (SNPs) (Table S1, Figure 3a). The tef1α dataset (255 isolates) consisted of 216 characters (five parsimony informative, 11 parsimony uninformative, 200 constant) and yielded eight haplotypes with 14 SNPs (Table S1, Figure 3b). The tub2 dataset (153 isolates) consisted of 309 characters (six parsimony informative, nine parsimony uninformative, 294 constant) and yielded 12 haplotypes with 15 SNPs (Table S1, Figure 3c). The rpb2 dataset (73 isolates) consisted of 535 characters (zero parsimony informative, zero parsimony uninformative, 535 constant) and yielded a single haplotype. The combined ITS and tef1α dataset consisted of 549 characters (seven parsimony informative, 34 parsimony uninformative, 508 constant) and yielded 17 haplotypes (Figure 4).
Table 2. Standard genetic and nucleotide diversity measures for isolates collected in each country and region, for the ITS, tef1α, combined ITS and tef1α, and tub2 sequence datasets. Included are sample size (N), number of haplotypes found (H), gene diversity (HE) and nucleotide diversity (π). Sample sizes are also recorded for the tub2 dataset as sequence data for this locus were not available for all isolates. Totals for each region are also listed. Columns: Region, Country, N, ITS, tef1α, ITS + tef1α, tub2.
There was no clear grouping of isolates based on region of origin. Analyses of the ITS and tub2 loci (Figure 3) showed that one haplotype was most common. The rpb2 dataset was not analyzed further as it constituted only one haplotype. For the tef1α dataset and the combined dataset of ITS and tef1α, two closely related (separated by a single mutation) haplotypes were most common. These common haplotypes represented isolates sourced from all eight regions sampled (Figures 3 and 4, Table S2).
An analysis of haplotypes (Table S3) showed that Asia and North America had the greatest number of unique haplotypes (10 and four, respectively) across all three loci (ITS, tef1α, and tub2). For the remaining regions, one to three unique haplotypes were detected. When considering the individual loci, three unique ITS haplotypes and six unique tub2 haplotypes were observed amongst isolates from Asia. For all other regions, two or fewer unique haplotypes were found. Upon closer examination, these unique haplotypes were confined to specific countries. Two of the five isolates collected from the USA (North America) had unique haplotypes, while 15 isolates collected from three locations in China over a period of four years had unique haplotypes.
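Counting unique (private) haplotypes per region amounts to finding haplotypes recorded in exactly one region. A minimal sketch, using hypothetical region and haplotype labels rather than the study data, is:

```python
from collections import defaultdict


def private_haplotypes(records: list) -> dict:
    """records holds (region, haplotype) pairs; returns {region: haplotypes seen nowhere else}."""
    regions_by_haplotype = defaultdict(set)
    for region, haplotype in records:
        regions_by_haplotype[haplotype].add(region)
    private = defaultdict(set)
    for haplotype, regions in regions_by_haplotype.items():
        if len(regions) == 1:
            private[next(iter(regions))].add(haplotype)
    return dict(private)


# Hypothetical records: H1 is shared, the other haplotypes are region-specific.
records = [
    ("Asia", "H1"), ("Africa", "H1"), ("Asia", "H3"),
    ("North America", "H4"), ("Asia", "H5"),
]
print(private_haplotypes(records))
```

Regions whose haplotypes are all shared with other regions do not appear in the result.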

Population and Regional Structure and Diversity
There was no evidence of sub-populations present in either set of the STRUCTURE analyses. In the first set of analyses that considered all isolates, the significantly highest DeltaK value was at K = 8 populations, but the corresponding barplot showed that no structure was present (Figure 5). Similarly, in the second set of analyses that evaluated genetic structure between the pairs of populations, the highest DeltaK values obtained differed for each population pair tested and varied from K = 2 to K = 8. However, the corresponding barplots for these values of K all showed that no structure was present in the data (Figure S2a-j).
Gene diversity was low for most countries and regions sampled. High gene diversity (>0.4) was detected for individual loci in countries including the USA, Peru, Uganda, China, Indonesia, and Thailand, and in North America (Table 2). High nucleotide diversity was detected in the above-mentioned countries, as well as in Ecuador, Venezuela, Brazil, Cameroon, South Africa, Oman and Australia, and in several regions including western and eastern South America, western Africa, and Australasia (Table 2).

Putative Geographic Origin of Lasiodiplodia theobromae
Posterior probabilities for all of the scenarios tested for the pairs of populations were low (Table S3) when a posterior probability of 0.7 or more was considered high. Ninety-five percent (95%) confidence intervals for different scenarios for the same pairwise comparison often overlapped (Table S4), indicating a lack of resolution in choosing one specific scenario over the others. These results are likely due to the lack of variation in the markers. However, they support the conclusions of other analyses reported above that did not identify any specific region as an evolutionary origin of the fungus over others.
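The logic of comparing scenarios by approximate Bayesian computation can be illustrated with a simpler direct estimate than the logistic regression used in this study: simulated datasets are ranked by the distance of their summary statistics from the observed values, and the posterior probability of each scenario is approximated by its share among the closest fraction. The sketch below is an assumption-laden illustration (unscaled Euclidean distance, hypothetical statistics and scenario labels) and not the DIYABC procedure.

```python
import math
from collections import Counter


def scenario_posteriors(observed, simulations, fraction=0.01):
    """simulations holds (scenario_id, summary_statistics) pairs.

    Returns each scenario's share among the closest `fraction` of simulations.
    """
    def distance(stats):
        # Unscaled Euclidean distance; real analyses usually normalise statistics first.
        return math.sqrt(sum((s - o) ** 2 for s, o in zip(stats, observed)))

    ranked = sorted(simulations, key=lambda sim: distance(sim[1]))
    n_keep = max(1, int(len(ranked) * fraction))
    kept = Counter(scenario for scenario, _ in ranked[:n_keep])
    return {scenario: count / n_keep for scenario, count in kept.items()}


# Hypothetical simulations from three scenarios; a large fraction is used
# only because the toy example has four simulated datasets.
sims = [(1, (0.1, 0.0)), (2, (0.2, 0.0)), (1, (0.3, 0.0)), (3, (5.0, 5.0))]
posteriors = scenario_posteriors((0.0, 0.0), sims, fraction=0.75)
print(posteriors)
print(max(posteriors.values()) >= 0.7)
```

In this toy case the best-supported scenario falls below the 0.7 threshold, which corresponds to the kind of unresolved outcome reported above.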

Discussion
Results of this study suggest that isolates associated with L. theobromae collected from many different hosts and countries of the world represent a single globally distributed species, with no obvious phylogeographic structure. This was evident from various analyses on sequence datasets for four loci (only three of which were variable) in 255 isolates from 52 hosts from all continents other than Antarctica. We thus contend that the only likely explanation for this result is the large-scale human dispersal of this fungal species.
The lack of population structure in L. theobromae on a global scale is in contrast to studies on other broadly distributed fungi that infect commercially cultivated plants or are medically important (e.g., [52-54]). These previous studies have typically revealed phylogeographic structure within species, with multiple cryptic lineages linked to geographic regions, leading to the conclusion that, for fungi, "nothing is generally everywhere" [54,55]. Subsequent studies have shown that lineages in some of these fungi (e.g., Fusarium graminearum and Histoplasma capsulatum) represent cryptic species [56,57]. An exception to this rule is Aspergillus fumigatus, which has very small (2-3 µm), wind-dispersed conidia. This special case is hypothesized to possibly arise from human influence, especially through environmental impact, which has created ideal habitats for the fungus [58,59].
Amongst the Botryosphaeriaceae, the shared genetic diversity across continents is not unique to L. theobromae. Neofusicoccum parvum also appears to have a similar global distribution of diversity [18]. Recently, Marsberg et al. [19] reported a similar lack of structure amongst a global collection of B. dothidea isolates. All three of these species have exceptionally broad host ranges across many plant families, and this has no doubt facilitated their broad distribution. Furthermore, N. parvum was reported to be more common in human-associated and disturbed environments, such as plantations, orchards, and urban environments [15], which could facilitate invasion (similar to A. fumigatus). Lasiodiplodia theobromae, B. dothidea, and N. parvum are ideal systems in which to further test these hypotheses regarding the role of host and human association in facilitating invasions.
The absence of phylogeographic structure amongst global collections of Botryosphaeriaceae such as L. theobromae is surprising in the light of their spore dispersal mechanism.Spores of the Botryosphaeriaceae, including those of L. theobromae, emerge in a sticky matrix and are relatively large (the most common spores, conidia, range between 10-35 × 8-15 µm; [12]) and are naturally dispersed by wind and rain splash [6,16,[60][61][62].Consequently spores are not expected to be spread over large distances or across geographic barriers and certainly not between continents.The limited ability of these fungi to disperse over long distances would be expected to result in a vicariant population structure with differences at a regional level between populations.The lack of population structure and dominance of identical multilocus haplotypes on distant continents can only be explained by assisted dispersal.In this case, human-mediated movement of plant material [1,3,63] has most likely facilitated this global dispersal.
A large number of the plant hosts from which isolates of L. theobromae were obtained for the present study are commercially important and traded globally as part of the nursery trade, or cultivated either for agriculture (e.g., Carica papaya, Mangifera indica, and Vitis vinifera) or forestry (e.g., Acacia mangium, Eucalyptus species).The Botryosphaeriaceae, including L. theobromae, are common endophytes in such plants and plant products, including fruits [4,64].Endophytic infections by these fungi are typically invisible and are thus not detected by quarantine systems [3,19,65].The present study highlights how widely species of the Botryosphaeriaceae, specifically L. theobromae, can be spread as a consequence of such human-assisted movement.
Results of this study were consistent with those of previous studies that used microsatellite markers to study populations of L. theobromae [66][67][68].These previous studies considered populations of isolates from Mexico, South Africa, Venezuela, India, and Cameroon, and detected extensive gene flow and shared genotypes from different hosts [66][67][68] and from different countries [66].Our analyses provide a broader representation with consistent results, including publicly available data combined with data from our own collection of L. theobromae isolates.
No clear centre of origin for L. theobromae emerged from this study based on gene diversity.The greatest cumulative diversity obtained by combining the diversities for the individual loci was detected for the North American collections.Population differentiation tests highlighted the North American and west African populations as being moderately to fairly distinct from the rest.The North American and Asian regions had higher numbers of unique haplotypes (four and ten respectively), but these haplotypes were present only in some countries (USA and China, respectively).
The diversity of L. theobromae in the USA was especially noticeable given that only a few isolates were available for that country.Further sampling would be needed to confirm whether this reflects a possible native population or is the result of introductions through trade with various other regions [55].It has been shown for other organisms, for example lizards, that the invasive populations could be more diverse than native populations if introduced multiple times and from various isolated native populations [69].This has also been observed in fungi such as D. sapinea in parts of its invasive range (e.g., in South Africa; [22,23]).
This study provides a valuable foundation for future studies that will investigate the genetic structure, movement, and origins of L. theobromae and other important species of the Botryosphaeriaceae. The loci used were chosen to allow for the inclusion of publicly available sequence data so as to obtain a more comprehensive global perspective. We excluded cryptic lineages based on previous studies that have resolved the taxonomy of Lasiodiplodia spp. and have defined these lineages as distinct species, including hybrid species [12,27]. As such, the current collection is a valuable resource representing a sensu stricto definition of the species. This information can now serve as a basis for further collections targeted at more isolated areas that could reveal the potential origin of the fungus. Other markers, such as microsatellite markers, would also provide further insights into origins and patterns of spread of this fungus. However, this will require greater numbers of isolates and ideally a more structured sampling regime than was possible for this study [18].

Conclusions
The results of this study, together with other recent investigations on diversity amongst global populations of Botryosphaeriaceae, have highlighted the fact that human-mediated movement of plant material infected by these fungi can facilitate their movement globally. The extent of movement of this serious pathogen around the world suggests a major shortcoming in the ability of quarantine systems to inhibit or stop its movement. These fungi, and their hosts, are also likely to increasingly be influenced by global climate change. As the earth is subjected to more extreme weather events, plants are likely to become increasingly stressed and more susceptible to disease caused by pathogens [70], including opportunistic and generalist pathogens such as the Botryosphaeriaceae. Consequently, the Botryosphaeriaceae, including L. theobromae, will become increasingly prominent and important for the management of health in both native and commercially cultivated woody plants. Serious attention should be given to strategies that could reduce the extent of such movement. Such management strategies are likely to also be relevant to the numerous other endophytes and potential latent pathogens that inhabit plants and plant material traded around the world.

Supplementary Materials:
The following are available online at www.mdpi.com/1999-4907/8/5/145/s1. Figure S1: Maximum likelihood tree of the tef1α sequence dataset for the initial identification of isolates for inclusion in this study. Included were type and paratype strains of other Lasiodiplodia species; Figure S2: STRUCTURE output from pairwise comparisons of populations. Each plot includes the DeltaK analysis from STRUCTURE HARVESTER (top) and the corresponding barplot for the highest value of K. Pairwise comparisons as follows: (a) North America and South America, (b) North America and Africa, (c) North America and Eurasia, (d) North America and Australasia, (e) South America and Africa, (f) South America and Eurasia, (g) South America and Australasia, (h) Africa and Eurasia, (i) Africa and Australasia and (j) Eurasia and Australasia; Table S1: Polymorphic sites for the respective haplotypes for the ITS, tef1α and tub2 datasets; Table S2: Haplotype assignments for every isolate used in this study, based on the sequence datasets; Table S3: Summary of haplotypes obtained and unique haplotypes (listed in brackets) found for each locus; Table S4: Posterior probabilities (with 95% confidence intervals in parentheses) of pairwise comparisons for three scenarios to test for possible ancestry between populations done in DIYABC. In scenario 1, population 1 is ancestral to both. In scenario 2, population 2 is ancestral to both. In scenario 3, both populations diverged from an unknown source population.

Figure 2. Scenarios evaluated to determine possible ancestry between any of the pairs of populations tested. In scenario 1, population 1 is ancestral to both. In scenario 2, population 2 is ancestral to both. In scenario 3, both populations diverged from an unknown source population.

Figure 3. Haplotype networks generated for the (a) internal transcribed spacer rDNA (ITS), (b) translation elongation factor 1α (tef1α), and (c) β-tubulin-2 (tub2) loci. Only one haplotype resulted from analysis of the RNA polymerase II (rpb2) locus and is not included. Colours represent the different regions isolates were obtained from.

Figure 4. Haplotype network generated for the combined ITS and tef1α dataset. Colours represent the different regions isolates were obtained from. Haplotypes are designated by Roman numerals (I-XVII). Open circles represent inferred haplotypes.

Figure 5. STRUCTURE output on the combined dataset of all four loci. The output from the DeltaK analysis from STRUCTURE HARVESTER (top) resulted in the highest peak at K = 8 populations, but the corresponding barplot (bottom) showed no structure.

Table 1. List of isolates used for genetic analyses. Isolates are ordered geographically, moving from North America eastwards to Australia. Countries in each region are arranged alphabetically. Sequences from GenBank are italicized.

Table 3. Pairwise population differentiation (ΦST) comparisons between the regions that isolates were obtained from, based on the combined ITS and tef1α dataset.