Managing Genetic Diversity and Representation in Banksia marginata (Proteaceae) Seed Production Areas Used for Conservation and Restoration

Landscape degradation is a major threat to global biodiversity that is being further exacerbated by climate change. Halting or reversing biodiversity decline using seed-based restoration requires tons of seed, most of which is sourced from wild populations. However, in regions where restoration is most urgent, wild seed sources are often fragmented, declining and producing seed with low genetic diversity. Seed production areas (SPAs) can help to reduce the burden of collecting native seed from remnant vegetation, improve genetic diversity in managed seed crops and contribute to species conservation. Banksia marginata (Proteaceae) is a key restoration species in south-eastern Australia but is highly fragmented and declining across much of its range. We evaluated genetic diversity, population genetic structure and relatedness in two B. marginata SPAs and the wild populations from which the SPA germplasm was sourced. We found high levels of relatedness within most remnants and that the population genetic structure was best described by three groups of trees. We suggest that SPAs are likely to be important to meet future native seed demand but that best practice protocols, including genetic analyses to guide the selection of germplasm, are required to help land managers design and manage these resources.


Introduction
The Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) has highlighted that human-mediated land degradation is negatively impacting some 3.2 billion people, pushing the planet towards a sixth mass species extinction and costing more than 10% of annual global gross product in lost biodiversity and ecosystem services [1]. IPBES also indicates that investments to avoid land degradation and restore degraded land are economically sound and essential for reaching many Sustainable Development Goals [1]. Halting and/or reversing ecosystem decline is a global challenge that will, in part, require tons of seed to be collected from native vegetation. However, in regions where significant vegetation loss has occurred, native seed for restoration initiatives is likely to be limited [2,3], and there are also concerns about low genetic diversity and high inbreeding in this seed [4]. In addition, climate change effects including reduced rainfall, changes in rainfall seasonality and increased temperature are likely to negatively influence seed crops in the future [5], further limiting restoration activities.
Seed production areas (seed orchards, seed increase; SPAs), i.e., cultivated stands established specifically for native seed production, can help to reduce the burden of seed collection from remnant vegetation, improve genetic diversity in managed seed crops and contribute to species conservation [6]. However, SPAs established for restoration face many challenges including provision of a broad genetic base to ensure that the adaptive capacity of populations and species is maintained [7][8][9] and to reduce potential inbreeding effects [10,11] in seed crops. Another consideration is that many restoration projects involve planting into novel and/or degraded environments [12], where native seed may be unable to cope with the conditions. To overcome this situation some US land management agencies are genetically improving key restoration species to maximize plant establishment and growth in greatly modified rangeland habitats [13,14]. A further concern is whether SPAs should be producing seed for local or future conditions. A typical approach to SPA design for future conditions is to include selections from populations that currently experience the likely future climate along with local seed sources. This brings some potential for negative outbreeding effects if divergent genomes are brought together [15,16], although outbreeding depression is generally a minor concern in forest tree species. However, inbreeding depression caused by asynchronous flowering among divergent populations is a significant risk [17] that requires careful management. These issues can be especially pressing for SPAs of long-lived species, which may take several years to produce a seed crop and even longer to reveal whether negative genetic issues have arisen.
When selecting germplasm to establish SPAs it is important to ensure that the seed crops produced will support proposed restoration activities. For example, seed collected from small and/or fragmented populations are likely to have low genetic diversity and may also be highly inbred [18]. Genetic bottlenecks can also be created during SPA establishment if germination or clone failure skews the SPA population towards germplasm from particular mothers or sites [6,19]. Though bringing inbred germplasm from divergent remnants together creates the opportunity for outcrossing, it is important to understand the levels of relatedness of selections within the SPA to actively manage inbreeding. Despite SPAs regularly producing seed for restoration, remarkably few of these crops have been evaluated to determine whether these seed are leading to the long-term persistence of restored populations. Studies of SPA seed have identified problems such as low genetic diversity [20], the use of commercial varieties whose genotypes do not exist in local remnant populations [21], unidirectional genotypic selection [22] and extreme bottlenecks [23]. However, other assessments have found no differences in genetic diversity between SPAs and the germplasm used for establishment [6,19]. Another key component is ensuring that the SPAs are producing seed crops that are genetically compatible with the area to be restored. Generalized patterns of genetic diversity and population structure based on life history traits have been determined [24][25][26] where, due to an ability to disperse pollen and seed both spatially and temporally, long-lived, outcrossing, animal- and wind-pollinated trees have less among-population genetic structure than other species [27,28].
An understanding of both genetic diversity and population genetic structure of wild populations can assist in establishing SPAs to meet the expectations of restoration projects but this level of detail is rarely sought, and therefore is rarely available for restoration species.
The deteriorating state of the Australian environment, vegetation fragmentation and climate change have recently been cited as critical issues for the Australian native seed sector [29]. Native seed production across south-eastern Australia is largely environmentally driven, making supply erratic and leading to inappropriate species substitutions to meet the objectives of restoration projects [30]. Consequently, SPAs are likely to be part of the solution to overcome predicted native seed shortages and to reduce further degradation of wild remnants from seed harvesting. This is especially topical following the Australian Black Summer (2019-2020) wildfires that burnt extremely large tracts of vegetation which will be unable to supply seed for restoration for many years. Some forward-thinking land managers have been establishing SPAs in south-eastern Australia since the late 1990s with various levels of success in meeting their seed demands (Martin Driver, pers. comm.). These restoration-based SPAs are primarily producing grass and wildflower seed, with far fewer being established with shrubs and trees [29]. This study focused on two Banksia marginata SPAs, established in 2001 at Euroa and in 2005 at Benalla within the Goulburn-Broken Catchment (Victoria, Australia), to produce seed for local restoration projects. Banksia marginata (Proteaceae) is a highly fragmented, long-lived woody tree distributed across south-eastern Australia. We investigated genetic diversity and population genetic structure of remnants close to the SPAs, including those from which seed were collected to establish the SPAs. We also determined whether the selections included in the SPAs were representative of these remnant populations.
To do this we were firstly interested in understanding how much genetic diversity existed in each remnant and its associated SPA trees. We then investigated differences in genetic diversity among the remnants and the two SPAs. We also assessed population genetic structure to determine whether the SPAs were representative of any structure observed. As the source remnant population could not be determined for all SPA trees, we further determined the probable origin of these unidentifiable trees. As a key determinant of SPA success is ensuring that inbreeding is avoided, we also assessed the relatedness of all trees and those in the SPAs to determine the likelihood of mating among close relatives.

Species Background, Study Sites and Sampling
Species background: Banksia marginata (Proteaceae) is a highly variable species, ranging from a small shrub (<1 m) to a tall tree (>10 m; [31]), that is broadly distributed across south-eastern Australia from northern New South Wales to Victoria, South Australia and Tasmania [32]. This species occurs across a range of ecological communities (shrubland, woodland forest, swamps and coastal dunes) and a variety of soils (sandy, clay loam, peaty loam, rocky soil, and soils developed on quartz sandstone, limestone and granite) within a 400-1000 mm mean annual rainfall range [33,34]. While B. marginata can self-pollinate, these seed are smaller and less likely to survive [35], suggesting preferential outcrossing and that the species, like most other tree species, is subject to inbreeding depression [36]. Variability in seed mass among individuals, populations and years has been attributed to nutrient availability during seed provisioning [37]. This species can also resprout after fires [38]. B. marginata is an important resource for many vertebrates [39][40][41] as well as having invertebrate relationships [42] and fungal associations [43].
Study sites: Seed for the Euroa SPA were collected from wild populations (Figure 1) then germinated, grown and randomly planted at the Euroa Arboretum. Seed for the Benalla SPA were also collected from local wild populations and similarly germinated and planted, with the addition of 10 cuttings derived from the Ruffy and Marraweeney populations, some of which did not survive planting out (Figure 1). Leaf material was collected from all trees in the Euroa (36.768838° S, 145.548482° E) and Benalla (36.747551° S, 145.743524° E) SPAs as well as the remnants that had been used as seed sources for these SPAs (Figure 1). The material was individually dried on silica gel and transported to CSIRO Black Mountain Science and Innovation Park in Canberra for analyses. Trees at Euroa were originally planted with identification tags but over time some of these tags were lost, leaving a small proportion of trees that could not be ascribed to their original remnant populations. In contrast, remnant sources were known for the Benalla SPA but none of the trees could be individually attributed to these remnants. Consequently, all SPA trees whose source populations could not be identified were pooled into a single 'Unknown' group.

DNA Extraction and Genotyping
Each leaf was subsampled and genomic DNA extracted [44]. A subset of 16 samples from several of the remnants was used to test 30 microsatellite (SSR) primer pairs developed for other Banksia species [45][46][47][48][49][50] (Supplementary Table S1). Ten of these primer pairs either amplified poorly or failed to amplify and were eliminated. A further eight primer pairs amplified well but exhibited little to no polymorphism across the 16 samples and were also eliminated. The remaining twelve primer pairs were amplified across all samples but a review of the data generated revealed poor amplification of some loci in a small number of populations, leaving eight primer pairs for analysis (BO22, Br3 and Br13 [45], Bint02, Bint07, Bint05 and Bint24 [47] and BH-B8 [46]). PCR reactions (5 µL) comprising 1× PCR buffer (Invitrogen, Grand Island, NE, USA), 2 mM MgCl2 (Invitrogen), 0.2 mM of combined dNTPs (Sigma, Perth, Australia), 0.25 µM M13 tagged fluorescent primer (tagged with FAM from Thermo Scientific (Scoresby, Victoria, Australia) or NED, PET or VIC from Applied Biosystems (Mulgrave, Australia)), 0.05 µM forward primer, 0.25 µM reverse primer, 10% BSA, 10% PVP-360, 5 U/µL Platinum Taq (Invitrogen) and 25 ng template DNA were amplified using an Eppendorf Mastercycler with a step down program as previously outlined [51]. Amplicons were visualized on a 3130XL sequencer (Applied Biosystems, Mulgrave, Australia) using a LIZ 600 bp internal standard (Applied Biosystems) and scored with GeneMapper Version 4.0 (Applied Biosystems). The eight primer pairs were assessed for null alleles using micro-checker [52] and for significant deviations from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) using Genepop on the Web (https://genepop.curtin.edu.au/) [50,51].

We also generated SNP data where each leaf was again subsampled and sent to Diversity Arrays Technology Pty Ltd (DArT) in Canberra, Australia for DNA extraction, quantification and DArTseq™ genotyping which uses a combination of complexity reduction with restriction enzymes, implicit fragment size selection and next-generation sequencing [52,53], as outlined by [54].
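The LD screen of the eight retained SSR loci involves one test per pair of loci, and the results are later reported against a Bonferroni-corrected threshold. The short sketch below only illustrates that arithmetic; the familywise alpha of 0.05 and the helper name `bonferroni_threshold` are assumptions made for illustration and are not taken from the Genepop analysis itself.

```python
from itertools import combinations

# The eight SSR loci retained for analysis, as listed in the text.
LOCI = ["BO22", "Br3", "Br13", "Bint02", "Bint07", "Bint05", "Bint24", "BH-B8"]


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold after a Bonferroni correction.

    alpha = 0.05 is an assumed familywise error rate.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Every unordered pair of loci is one LD comparison.
locus_pairs = list(combinations(LOCI, 2))
threshold = bonferroni_threshold(len(locus_pairs))
print(f"{len(locus_pairs)} locus pairs, per-test threshold = {threshold:.4f}")
```

Eight loci give 28 pairwise comparisons, and 0.05 / 28 rounds to the corrected p-value of 0.002 quoted with Supplementary Table S3.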

Genetic Diversity and Relatedness in Remnant and SPA Populations
We considered SPA trees to be extensions of each remnant and first sought to determine the total genetic diversity ascribed by the SSRs in these remnants and the associated SPA trees. To do this we pooled SPA trees with their source remnant (hereafter called "remnants") and estimated the mean number of alleles (Na), observed heterozygosity (Ho) and unbiased gene diversity (UHe) as well as the inbreeding coefficient (FIS) using GenAlEx Version 6.51b [55]. To account for bias in Na due to sample size differences, we used FSTAT version 2.9.3.2 [56] to generate allelic richness (Rs); however, these data should be interpreted cautiously as the minimum sample size was two. We then compared the combined genetic diversity from the remnants (hereafter called "wild") to that in each of the SPAs as previously described. The SNP dataset consisted of 7954 loci. R libraries including adegenet [57] and dartR [58] were used for SNP data manipulation and filtering. We filtered loci on call rate (loci with call rates <80% removed), minor allele frequency (MAF; threshold = 0.05) and repeatability (RepAvg; threshold = 0.99), leaving 3161 loci with a mean call rate of 95.2%. Genetic diversity measures of Rs, Ho, He and FIS were then estimated using the dartR, poppr [59] and hierfstat [60] R libraries.
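To make the SNP filtering and diversity statistics concrete, the following minimal sketch applies call-rate and MAF filters to a toy biallelic genotype table and then computes mean Ho, Nei's unbiased He and FIS = 1 - Ho/He. It is not the dartR or hierfstat implementation: the function names, the 0/1/2 genotype coding with `None` for missing calls, and the toy data are all assumptions, and the RepAvg repeatability filter (which relies on DArT metadata) is not shown.

```python
from typing import List, Optional, Tuple

Call = Optional[int]  # alternate-allele count (0, 1 or 2); None = missing call


def locus_stats(calls: List[Call]) -> Tuple[float, float, float, float]:
    """Return (call_rate, maf, Ho, unbiased He) for one biallelic locus."""
    called = [g for g in calls if g is not None]
    n = len(called)
    if n == 0:
        return 0.0, 0.0, 0.0, 0.0
    p = sum(called) / (2 * n)                      # alternate-allele frequency
    ho = sum(1 for g in called if g == 1) / n      # observed heterozygosity
    he = (2 * n / (2 * n - 1)) * 2 * p * (1 - p)   # Nei's unbiased gene diversity
    return n / len(calls), min(p, 1 - p), ho, he


def filter_and_summarise(
    geno: List[List[Call]], min_call_rate: float = 0.80, min_maf: float = 0.05
) -> Tuple[List[int], float, float, float]:
    """Filter loci, then return (kept locus indices, mean Ho, mean He, FIS)."""
    kept, ho_vals, he_vals = [], [], []
    for idx, calls in enumerate(zip(*geno)):       # iterate over loci (columns)
        call_rate, maf, ho, he = locus_stats(list(calls))
        if call_rate >= min_call_rate and maf >= min_maf:
            kept.append(idx)
            ho_vals.append(ho)
            he_vals.append(he)
    if not kept:
        raise ValueError("no loci passed the filters")
    ho_mean = sum(ho_vals) / len(ho_vals)
    he_mean = sum(he_vals) / len(he_vals)
    return kept, ho_mean, he_mean, 1 - ho_mean / he_mean


# Toy data: 5 trees x 4 loci (invented values, for illustration only).
toy = [
    [0, 1, 2, 0],
    [1, 1, None, 0],
    [2, 0, None, 0],
    [1, 2, None, 0],
    [0, 1, None, 0],
]
kept, ho, he, fis = filter_and_summarise(toy)
print(kept, round(ho, 3), round(he, 3), round(fis, 3))
```

In the toy table the third locus fails the call-rate filter and the fourth is monomorphic, so only the first two loci contribute to the summary statistics. Averaging Ho and He across loci before taking the ratio is one common convention; other software may average per-locus FIS instead.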
Relatedness within remnants and within the pooled wild and SPA populations was explored in the SSR dataset using GenAlEx with the Lynch and Ritland [61] and Queller and Goodnight [62] measures, based on 10,000 permutations and 10,000 bootstraps for each analysis. The Coancestry package [63] was used to estimate relatedness among individual trees from the SNP data. The Wang (2007) [64] maximum likelihood method was selected from the several estimators available, as it returns values that are within the theoretical space (unlike method-of-moments estimators, which often return negative values) and can account for inbreeding. A figure using the SNP-based genomic relationship matrix (G) was produced by hierarchical cluster analysis so that groups of closely related trees are sorted into blocks, with individual inbreeding estimates (1 + f) on the diagonal.
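The text does not state how G was computed, so the sketch below assumes one widely used construction (VanRaden's first method), in which allele counts are centred on twice the allele frequency and scaled by 2Σp(1 - p). Under that assumption the diagonal estimates 1 + f, so an individual inbreeding estimate is the diagonal minus one. The function name and toy genotypes are hypothetical, missing data are not handled, and the hierarchical clustering used to order the figure is not shown.

```python
from typing import List

Matrix = List[List[float]]


def genomic_relationship_matrix(geno: List[List[int]]) -> Matrix:
    """G = Z Z' / (2 * sum(p * (1 - p))) for 0/1/2 allele counts (assumed estimator)."""
    n_ind, n_loc = len(geno), len(geno[0])
    # Allele frequencies estimated from the sample itself.
    p = [sum(ind[j] for ind in geno) / (2 * n_ind) for j in range(n_loc)]
    denom = 2 * sum(pj * (1 - pj) for pj in p)
    if denom == 0:
        raise ValueError("all loci are monomorphic")
    # Centre each genotype on its expected value 2p.
    z = [[ind[j] - 2 * p[j] for j in range(n_loc)] for ind in geno]
    return [[sum(a * b for a, b in zip(zi, zk)) / denom for zk in z] for zi in z]


# Toy genotypes (invented): trees 0 and 1 are identical, as clones would be.
toy = [
    [0, 2, 0, 2],
    [0, 2, 0, 2],
    [1, 1, 1, 1],
    [2, 0, 2, 0],
]
G = genomic_relationship_matrix(toy)
inbreeding = [G[i][i] - 1 for i in range(len(G))]  # f = diagonal - 1
for row in G:
    print([round(x, 3) for x in row])
```

Identical genotypes produce identical rows of G, with the off-diagonal value equal to the diagonal, which is one way putative clones could be flagged in a matrix of this kind. With only four trees and four loci the values are extreme and purely illustrative.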

Population Genetic Structure
The SSR data were subjected to a principal co-ordinate analysis (PCord) based on genetic distances among plants calculated using the GenAlEx covariance standardized method, followed by visualization of the first two PCord dimensions. For the SNP dataset, we carried out individual-level PCord analyses using adegenet R functions based on Euclidean distance matrices. STRUCTURE v2.3.4 [65] was used for both SSR and SNP datasets to infer the number of genetic clusters (K) across all of the remnants without prior knowledge of remnant affinities, using a burn-in of 25,000 followed by 250,000 MCMC iterations for K from 1 to 11, replicated 10 times for each K. We assumed no gene flow among remnants (no admixture model), correlated allele frequencies, no location prior and a uniform alpha individually defined for each population. The optimal number of K-clusters was determined with the ad hoc statistic ∆K [66] using Structure Harvester version 0.6.93 [67], and bar charts of mean cluster membership coefficients were generated using CLUMPAK [68], available online at http://clumpak.tau.ac.il/.
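As an illustration of what a principal co-ordinate analysis on a Euclidean distance matrix does, the sketch below double-centres the squared distances and extracts the first two axes by power iteration, reporting the percentage of total variation on each axis. It is a stdlib-only teaching sketch and not the GenAlEx or adegenet routine: the function names and the four toy points are assumptions, and it relies on the distances being Euclidean so that the trace of the centred matrix equals the total variation.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def double_centre(dist: Matrix) -> Matrix:
    """Gower centring of squared distances: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]


def top_eigenpair(b: Matrix, iters: int = 500) -> Tuple[float, List[float]]:
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary, non-constant start vector
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(x * y for x, y in zip(v, bv)), v


def pcoa(dist: Matrix, n_axes: int = 2) -> Tuple[Matrix, List[float]]:
    """Return per-individual coordinates and % variation for the first n_axes."""
    b = double_centre(dist)
    n = len(b)
    total = sum(b[i][i] for i in range(n))  # trace = total variation (Euclidean case)
    coords: Matrix = [[] for _ in range(n)]
    percent: List[float] = []
    for _ in range(n_axes):
        lam, v = top_eigenpair(b)
        lam = max(lam, 0.0)
        for i in range(n):
            coords[i].append(v[i] * math.sqrt(lam))
        percent.append(100.0 * lam / total if total > 0 else 0.0)
        # Deflate so that the next pass recovers the next axis.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords, percent


# Toy example: four points at the corners of a 4 x 2 rectangle (invented data).
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (4.0, 2.0)]
dist = [[math.dist(a, b) for b in points] for a in points]
coords, percent = pcoa(dist)
print([round(x, 1) for x in percent])
```

Power iteration is used here only to avoid external dependencies; a full eigendecomposition from a numerical library would be the usual choice for real datasets, and non-Euclidean distances can produce negative eigenvalues that this sketch does not handle.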

Genetic Diversity and Relatedness in Wild and SPA Populations
There was no evidence for null alleles but some evidence of HWE departure for Bint02, Bint05, Bint24, Br3 and BO22 (Supplementary Table S2) and LD in nine of the 28 comparisons after Bonferroni corrections [69] (Supplementary Table S3). These deviations were inconsistent among sites and loci so all loci were retained for analyses. Levels of genetic diversity based on SSRs and SNPs varied among remnants (Tables 1 and 2). While genetic diversity indices for the small Banksia remnants should be viewed cautiously, the number of alleles (Na) based on SSRs ranged between 1.63 and 6.75. SSR allelic richness (Rs) was 1.62-2.44, although this was based on a diploid sample of two individuals due to the small population sizes, and was 1.
Both SSR-based relatedness measures indicated that all remnants except Omeo and Tooborac had significantly elevated mean relatedness (Figure 2a,b), as did the pooled wild and SPA populations (Figure 2c,d). The SNP-based genomic relationship matrix (G) is shown in Figure 3. Individual inbreeding estimates ranged from zero to 0.8, averaging 0.28. The SNP panel provided excellent power for resolving relationships using the Wang (2007) triadic estimator, revealing strong relatedness within and among the SPA and wild trees (Figure 3). A small number of clones not evident in the SSR dataset were also detected in this analysis.

Population Genetic Structure
Principal co-ordinate analysis using the SSR data indicated that 13.5% of the total genetic variation was accounted for on the first axis and 10.8% on the second axis (Figure 4a). Highlands trees were distributed in negative PCord 1 space and Marraweeney trees in positive PCord 2 space but overall, there was no clear differentiation of trees into discrete groups; trees of unknown origin were distributed across all of the PCord space. While similar levels of total genetic variation were also observed in the first two axes of the SNP PCord (12.5% and 8.5% respectively), the trees in this analysis were clearly differentiated into four groups (Figure 4b). Highlands, Gobur, Kobyboyn and Ruffy trees fell together in negative PCord 1 space while Sandy Creek and some Blue Range trees were in a second group in positive PCord 1 space. The third group was primarily comprised of Marraweeney trees, with a fourth group that consisted of Tooborac, Omeo, Dropmore and the remaining Blue Range trees placed centrally to the three other groups. One tree from each of Ruffy and Gobur and two of the Highlands trees did not fall within their expected groups. Unknown trees were variously distributed among all four groups, allowing inferences to be drawn about their probable origins. The fifteen trees (Unknown A) that grouped with the Highlands cluster were all from the Euroa SPA, as were the six trees (Unknown B) that fell with Marraweeney. Trees from Unknown C (two Euroa and eight Benalla) fell within the central group of Tooborac, Omeo, Dropmore and some Blue Range trees, with the remaining 52 trees from Unknown D (three Euroa and 49 Benalla SPA) grouped with the Sandy Creek group.

Despite no clear groupings in the PCord SSR dataset, STRUCTURE indicated that the most probable number of K was 3 whereas for the SNP data K = 2 was the optimal clustering (Supplementary Figure S1). While the K = 2 assignments were similar for many trees in both the SSR and SNP datasets, some notable differences were present. For example, the SSR analysis indicated that Blue Range trees came from both clusters (Figure 5a) whereas the SNP analysis assigned all these trees to Cluster 1 (Figure 5b). The Marraweeney trees were also assigned to different clusters by the two analyses and the Unknown trees were primarily assigned to Cluster 1 by both analyses. The K = 3 analyses provided a nearly identical result between the SSR and SNP assignments (Figure 5c,d) that is readily interpretable and that is more useful for assigning the unknown individuals to their likely source populations.
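The choice of K from replicate STRUCTURE runs rests on the ∆K statistic, which divides the absolute second-order change in mean ln P(D|K) by the standard deviation of ln P(D|K) across replicates. The sketch below shows that calculation from replicate means, which is an assumed reading of the statistic rather than a reproduction of Structure Harvester; the function name `evanno_delta_k` and the log-likelihood values are invented for illustration and are not results from this study.

```python
from statistics import mean, stdev
from typing import Dict, List


def evanno_delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Delta K for each K that has both neighbours (K - 1 and K + 1) available.

    lnp maps K to the ln P(D|K) values of its replicate runs.
    """
    means = {k: mean(vals) for k, vals in lnp.items()}
    delta: Dict[int, float] = {}
    for k in sorted(lnp):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the smallest and largest K tested
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # undefined when replicates are identical
        second_diff = means[k + 1] - 2 * means[k] + means[k - 1]
        delta[k] = abs(second_diff) / sd
    return delta


# Invented replicate log-likelihoods for K = 1..5 (three replicates each).
toy_lnp = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4600.0, -4605.0, -4595.0],
    3: [-4300.0, -4302.0, -4298.0],
    4: [-4290.0, -4310.0, -4280.0],
    5: [-4285.0, -4320.0, -4270.0],
}
delta_k = evanno_delta_k(toy_lnp)
best_k = max(delta_k, key=delta_k.get)
print(best_k, {k: round(v, 2) for k, v in delta_k.items()})
```

Because the statistic needs both neighbours, it cannot be evaluated at K = 1 or at the largest K tested, which is one reason ∆K is usually read alongside the raw likelihood curves and the biological interpretability of the clusters.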
The SNP markers resulted in assignments of individual trees to populations with 99.9-100% probability whereas SSR markers resulted in proportional assignments to more than one class in most cases. Differences in the assignment of trees between these two analyses included the SSRs again differentiating among the Blue Range trees, which was not reflected in the SNP analysis, and a small number of trees (e.g., Highlands and Sandy Creek) being assigned to different clusters. As expected from the PCord analyses, the Unknown trees were assigned to all three clusters by both analyses; however, Benalla SPA trees were predominantly from Cluster 1 whereas Euroa SPA trees were represented in all three clusters.

Discussion
Our study used both SSR and SNP markers to assess genetic diversity and population genetic structure in remnant and SPA populations of Banksia marginata. We found that genetic diversity and inbreeding varied among the remnants. However, when the trees were pooled into their respective wild and SPA populations, genetic diversity and inbreeding indices were similar among the three groups. High levels of relatedness were also observed within most remnants as well as within the pooled wild and SPA populations. We also found that the population genetic structure was best described by three groups of trees and that each tree of unknown origin could be unambiguously assigned to one of these groups.
Lower levels of genetic diversity and increased inbreeding are globally recognized signatures of landscape fragmentation [18,70] that are also likely to be exacerbated by climate change (see review by [71]). Similar levels of genetic diversity were observed among these B. marginata remnants irrespective of whether SSRs or SNPs were analyzed. Our SSR genetic diversity measures are lower than those found for this species by [72] but this is likely due to the different SSR marker panels used in the two studies and because our study populations were considerably smaller and likely to have lower genetic diversity.
As B. marginata was once a widely distributed, but likely scattered, species in savannah-like ecosystems [32,73], it is unclear whether the genetic diversity patterns presented here are indicative of the species' history, of recent fragmentation, or both. However, the SSR indices reported here are comparable with rare species such as B. mimica, B. vestita and B. arborea [74,75] as well as the widespread but fragmented B. menziesii [76], suggesting that recent fragmentation is likely the driver of low diversity. While few SNP-based comparisons are available for Banksia our diversity indices were generally higher than those observed in the disjunct B. biterax [77] but similar to the more widely distributed B. seminuda [78].
The low genetic diversity observed here is presumed to expose these fragmented Banksia remnants to inbreeding and genetic drift and eventual extinction [11]. However, longevity and large spatial and temporal gene dispersal distances are thought to buffer longer-lived species against the effects of habitat fragmentation [27,28], although this assumption has been recently challenged [79,80]. In general, outcrossing species such as this Banksia [35] are also thought to be able to counter the effects of fragmentation [27,28], but not always, even in the case of wind-pollinated species [81][82][83]. Banksia marginata is a major nectar source for honeyeaters, possums and insects [33], which presumably helps to maintain gene dispersal and preferential outcrossing and to limit the production of small inbred seed that are less likely to survive [35]. As B. marginata populations do not appear to be experiencing recruitment bottlenecks [73], it is possible that sufficient outcrossed progeny are still being produced to maintain demographic processes. The very real challenge for these and other small remnants may be to continue to attract pollinators to maintain gene flow in the face of ongoing vegetation loss and the associated decline of woodland birds [84,85].
Contrasting inbreeding estimates ranging from inbred to outbred were observed for many of our remnants depending on the dataset analyzed. While these data did not provide any clear patterns, the relatedness analyses tell a quite different story for these B. marginata remnants. High relatedness based on SSRs was detected in more than half of the remnants as well as when trees were pooled as wild and SPA populations. This result has also been observed in other remnants of this species [72]. The kinship estimates using the SNP dataset provided a much more nuanced understanding of the relationships among our remnant and SPA trees with many trees determined to be more closely related than full-sibs as a result of selfing or mating between close relatives. This may reflect (i) that trees within the fragmented populations from which the SPA trees were selected are often closely related, (ii) that sampling of trees for inclusion in the SPAs was carried out over short geographical distances resulting in related trees being included, or (iii) both. These findings have very important consequences for the efficacy of the two SPAs, as it increases the likelihood that the seed being produced will be inbred.
For outcrossing species, especially those with self-incompatibility systems, high relatedness reduces mate availability and often results in seed crop failures [86,87]. Several Australian genera are preferentially outcrossing, including banksias [88] and eucalypts [89], allowing these species to continue to set seed through selfing when mate availability is low. However, selfed seed in banksias (e.g., [90]) and eucalypts (e.g., [91,92]) can fail to germinate or can produce seedlings with poor survival. The prognosis for eucalypt seedlings that are the result of selfed or close-relative mating is poor, with inbreeding depression generally resulting in near-complete mortality before reproductive maturity is reached [93]. While kinship among trees is lower in the Euroa SPA (r = 0.09) than in the Benalla SPA (r = 0.16), neither of these is likely to be producing seed with the broad genetic base required for restoration, with the likely effects of inbreeding depression on fitness and survival compounding this problem. Unfortunately, this is a seemingly common issue across SPAs established to meet the seed demands for restoration [6,20-22]. When establishing genetically diverse SPAs for restoration, there is much to be learnt from forestry seed orchards, which are actively designed and managed to achieve high outcrossing rates [91]. In particular, ensuring that close relatives are not included, to minimize inbreeding, is a high priority. Sampling trees within remnants at appropriate densities to avoid clusters of near relatives and/or pre-screening selections using molecular markers is also advisable.
While Miller et al. (2020) observed strong population structuring among fragmented B. marginata populations across the Victorian Volcanic Plains (2.3 M ha), we were somewhat surprised to find similarly strong partitioning at much finer geographic scales. Interestingly, these observations are inconsistent with expectations for outcrossing, longer-lived trees where more extensive gene dispersal should reduce differentiation among populations [27,28]. This expectation is, however, dependent on numerous factors including successful gene dispersal via pollen and seed. Pollen-mediated gene flow in banksias is primarily via insects, birds and mammals [39,94,95]. Birds can potentially transport pollen over large distances, with B. menziesii pollen known to travel from 400 m to <2 km [76,96], although [97] highlight that most B. sphaerocarpa var. caesia pollen travels very short distances (<20 m). Banksia seed dispersal also varies, with B. hookeriana seed moving from 36 m [98] to several kilometers [99], although the latter distance is possibly due to wind vortices and a lack of obstacles following fire. As B. marginata is likely to be bird and insect pollinated and the seed lack adaptations for long distance dispersal [72], it is expected that similar dispersal distances also characterize this species. It is difficult to ascertain from our data whether the strong population genetic structure observed was due to this species being widespread but naturally scattered or whether it reflects fragmentation since European settlement. The three genetic clusters we identified did not relate to any obvious geographical characteristics, making it possible that our populations reflect historic processes such as founder effects, genetic drift and limited gene flow among naturally scattered populations.
The population structure observed has allowed us to determine the probable origins of the Unknown SPA trees, helping to determine if new germplasm is required to alleviate biases towards particular remnants in the SPAs. The Benalla SPA trees were originally and primarily sourced from Sandy Creek with smaller contributions from Marraweeney, Ruffy, Gobur and the Euroa SPA. Most of the Unknown trees in this SPA were identified as being from Sandy Creek with only two coming from either Ruffy or Gobur and none from Marraweeney. While we cannot identify the Unknown Euroa SPA trees to individual remnants, most of these trees were from the Highlands/Gobur/Ruffy Cluster. The SNP dataset provided significantly better resolution of the Unknown trees than the SSRs, demonstrating the value of using SNP markers despite the higher cost. As SPA establishment and management is expensive and time consuming, it is critical to ensure that any investment produces seed that is likely to generate a long-term biodiversity benefit. This includes more careful selection of SPA foundational germplasm, including sampling small and declining remnants at appropriate densities to avoid relatedness rather than selecting as many trees as possible to 'save' these from extinction.

Conclusions
SPAs are likely to be increasingly important to meet the seed demand for restoration. This will be especially so in Australia as a drying climate begins to impact on seed production and remnant populations continue to decline.
There is an urgent need to develop best practice protocols for SPAs that include design and active management practices that ensure outcrossing similar to those developed for forestry seed orchards.
To maximize the return on investment in SPAs, studies that assess genetic diversity and population genetic structure are required to guide the selection of germplasm.
Studies to determine whether SPA-produced seed provide a significant biodiversity benefit are required to ensure that the investment required to produce these seed is justified or can become more cost-effective.
Supplementary Materials: The following are available online at https://www.mdpi.com/1424-2818/13/2/39/s1, Figure S1: ∆K based on STRUCTURE analyses for Banksia marginata remnant and SPA trees (a) SSRs and (b) SNPs, Table S1: Cross amplification of Banksia marginata samples using primer pairs developed for other Banksia species. + initially amplified across all samples; * data suitable for  Table S2: p-values for each locus indicating departure from Hardy-Weinberg equilibrium, Table S3: Significant p-values for each locus pair across all Banksia marginata populations based on Fisher's exact test [100]. * p-value remains significant after Bonferroni correction of p = 0.002 [69].