All Populations Matter: Conservation Genomics of Australia’s Iconic Purple Wattle, Acacia purpureopetala

Abstract: Maximising genetic diversity in conservation efforts can help to increase the chances of survival of a species amidst the turbulence of the anthropogenic age. Here, we define the distribution and extent of genomic diversity across the range of the iconic but threatened Acacia purpureopetala, a beautiful sprawling shrub with mauve flowers, restricted to a few disjunct populations in far north Queensland, Australia. Seed production is poor and germination sporadic, but the species occurs in abundance at some field sites. While several thousands of SNP markers were recovered, comparable to other Acacia species, very low levels of heterozygosity and allelic variation suggested inbreeding. Limited dispersal most likely contributed towards the high levels of divergence amongst field sites and, using a generalised dissimilarity modelling framework amongst environmental, spatial and floristic data, spatial distance was found to be the strongest factor explaining the current distribution of genetic diversity. We illustrate how population genomic data can be utilised to design a collecting strategy for a germplasm conservation collection that optimises genetic diversity. For this species, inclusion of all field sites will capture maximum genetic diversity for both in situ and ex situ conservation. Assisted cross pollination, within and between field sites and genetically structured groups, is recommended to enhance heterozygosity, particularly at the most disjunct sites, and further fragmentation should be discouraged to avoid loss of genetic connectivity.


Introduction
While the distribution of genetic diversity within and between individuals and populations can guide our understanding of past demographic processes and life history characteristics of a species, there has been increased interest in genetic diversity from a conservation perspective, as is reflected in the reference to genetic diversity in the Convention on Biological Diversity, Aichi Biodiversity Targets agreed in 2010 [1]. This is justified by the mounting evidence that higher levels of genetic diversity can facilitate fitness and long-term survival [2,3], providing some form of life-insurance for species in a rapidly changing environment. Population genetic studies have revealed low genetic diversity for some threatened species. As a result, many conservation management plans are turning to inferences of genetic diversity to guide conservation efforts in order to maximise genetic diversity for in situ and ex situ collections as well as translocation [4,5]. This evolutionary approach to conservation has benefited from advances in genomic sequencing technologies and accompanying bioinformatic tools over the last decade, which have facilitated the inclusion of genetic information in conservation management plans [6,7].
The genus Acacia evolved c. 23 million years ago (Mya) with a major radiation at 15 Mya [8,9] that led to its current ubiquitous distribution across the Australian land-habitat specificity plays a critical role in the geographic distribution as well as the extent and distribution of genetic diversity of the species. Thus, to complement the genomic data, we gathered floristic and environmental data to explore the relationship amongst the three variables using a generalised dissimilarity modelling (GDM) approach.
Conservation practitioners often need to make explicit or implicit decisions regarding the selection of sites to be included in targeted management plans that include in situ and ex situ conservation, with trade-offs between various factors (such as stakeholders, threatened status of the species, available information on the species, funding and time [26,27]). The entire extent of A. purpureopetala occurs outside conservation reserves and this can complicate the implementation of in situ management plans across the distribution of the species. We used the genomic data to explore how we can optimise genetic diversity for germplasm conservation collections. As it is unlikely that the exact same individuals used in the genomic study will be sampled, we used an approach of randomly selecting individuals from the available pool of genotyped samples. The aims of this study were (i) to obtain genomic data suitable for population genetic inferences and (ii) to utilise the data and results to provide practical guidelines for conservation practitioners specific to this species.

Materials and Methods
Field sites for this study (Figure 1) were based upon existing herbarium records, which represent the populations and sub-populations as presented in the Species Profile and Threats Database [28]. Acacia purpureopetala occurs in low open woodlands with a well-developed shrub layer (habitat images provided in Figure S1) but is also found along roadsides and abandoned mine sites. These woodlands tend to be very rocky, with substrates varying from granites and rhyolites to metalliferous metamorphic substrates at elevations of 64-1040 m above sea level. Nearly all plants are found at aspects of 10-140 degrees North [20]. All fieldwork was undertaken between November 2017 and May 2018. Three sets of data, environmental, floristic and genomic, were gathered for each field site.

Figure 1. The location of the field sites for Acacia purpureopetala in far north Queensland, Australia. The map was produced in R using Leaflet [29] with an Esri World Street base layer. Site numbering follows a north-to-south latitudinal sequence: 1 = Springmount, 2 = Stannary Hill North, 3 = Stannary Hill South, 4 = Baal Gammon, 5 = Jumna, 6 = Dargo, 7 = Emuford, 8 = Ibis Dam, 9 = Gurrumba, 10 = Mt Misery North, 11 = Mt Misery East, 12 = Mt Misery Central and 13 = Mt Misery South.

Floristic Data
The methodology associated with floristic data collection followed [30]. After the presence of A. purpureopetala was verified at a field site, a transect of 50 m × 10 m was established. One side of the line was assessed for woody species (i.e., 50 m × 5 m line) with each woody species identified and counted into shrub strata, S1 and S2, where S1 is the tallest layer and S2 is the shortest layer. A. purpureopetala plants were counted on both sides of the centre line. Full details on vegetation surveys associated with the transects can be found in [20]. Transect data for all woody species at each site were entered into a two-way table with counts for all species included in the matrix. The data were separated into S1 and S2 per site and a total count per species per site. These data were used for estimating pairwise dissimilarity matrices (see below).
To obtain an estimate of total population size of A. purpureopetala, sightings in areas adjacent to the transect were recorded. Once an individual plant was observed, a count was taken within a 10 m × 5 m area, along with a GPS waypoint. Counting in this fashion was not exhaustive but provided a replicable way to assess the number of individuals at each site.

Single Nucleotide Polymorphic (SNP) Data
Fresh leaf material for DNA extractions was collected at each of the field sites (Figure 1). Samples were taken from along the transects and from widely spaced individuals throughout the remaining site. Some individuals were many hundreds of metres apart from the next sample. Location data (latitude and longitude) were collected for each sample and the spatial distribution of the tissue samples largely reflected that of the population. Leaf material was placed in paper envelopes inside sealed plastic bags, kept cold, and express mailed to the Royal Botanic Gardens, Sydney, where material was lyophilised immediately upon arrival.
Approximately 7 ug of dried leaf material for each of 188 samples was sent to DArT Pty Ltd. (Canberra, Australia) for DNA extractions and medium density DArTseq. This whole genome double-restriction enzyme complexity reduction and high-throughput sequencing method [31] has been successfully applied to several studies of Australian native plants and animals, and more detailed descriptions of the methodology applied by the company can be found in published literature [32,33,34,35].

Genomic Data Analyses
The SNP data set provided by DArT Pty Ltd. was assembled through their proprietary analytical pipelines and was provided with quality-informing statistics for each marker and sample. The SNP data were filtered on several quality criteria using the in-house R package 'RRtools' v. 1 (see [32]): (1) only SNPs with a reproducibility score > 98% were used (the reproducibility score is an index of the precision of genotype calls between technical replicate samples); (2) SNP loci with missing genotype calls for more than 10% of samples were removed; (3) for markers with more than a single SNP, only one SNP was retained in order to minimise the effect of linkage; and (4) samples with a high percentage of missing data (greater than 40%) were excluded from the data. This left 14,565 SNPs and 184 samples for a series of population genomic analyses.
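As an illustration only, the four filtering rules above can be sketched in plain Python. This is not the RRtools implementation: the function name, the data layout (rows are samples, columns are SNPs, `None` marks a missing call), the choice to keep the first passing SNP per marker, and the decision to assess sample missingness after locus filtering are all assumptions made for the sketch.

```python
def filter_snps(genotypes, repro, marker_ids,
                min_repro=0.98, max_locus_missing=0.10,
                max_sample_missing=0.40):
    """Return (kept locus indices, kept sample indices).

    genotypes  : list of samples, each a list of calls (0/1/2 or None)
    repro      : reproducibility score per SNP, as a proportion
    marker_ids : marker (sequence tag) identifier per SNP
    """
    n_samples = len(genotypes)
    kept_loci = []
    seen_markers = set()
    for j in range(len(repro)):
        # (1) reproducibility score must exceed the threshold
        if repro[j] <= min_repro:
            continue
        # (2) drop loci with too many missing genotype calls
        missing = sum(1 for row in genotypes if row[j] is None) / n_samples
        if missing > max_locus_missing:
            continue
        # (3) keep one SNP per marker (assumption: the first that passes)
        if marker_ids[j] in seen_markers:
            continue
        seen_markers.add(marker_ids[j])
        kept_loci.append(j)

    kept_samples = []
    for i, row in enumerate(genotypes):
        # (4) drop samples with a high share of missing calls
        if not kept_loci:
            break
        missing = sum(1 for j in kept_loci if row[j] is None) / len(kept_loci)
        if missing <= max_sample_missing:
            kept_samples.append(i)
    return kept_loci, kept_samples


# Hypothetical toy data: 4 samples, 5 SNPs on 4 markers.
toy_genotypes = [
    [0, 1, 2, 2, 0],
    [1, 1, 2, 0, 1],
    [0, 0, None, 1, 0],
    [2, 1, 1, 1, 2],
]
toy_repro = [1.0, 0.95, 1.0, 1.0, 0.99]
toy_markers = ["m1", "m2", "m3", "m3", "m4"]
loci, samples = filter_snps(toy_genotypes, toy_repro, toy_markers)
```

In the toy data the second SNP fails on reproducibility and the third on missingness, so the other SNP on marker m3 is the one retained.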
A principal component analysis (PCA) was produced using the R package adegenet 2.0.1 [36] to assess genetic similarity at the individual and population level. Genetic structure analyses were performed using LEA (an R Package for Landscape and Ecological Association Studies, [37]), which estimates ancestry coefficients from large genotypic matrices and evaluates the number of ancestral populations. The snmf function (sparse Non-Negative Matrix Factorization algorithms) was implemented to estimate individual admixture coefficients from the genotype matrix. A measure of fit (i.e., the entropy criterion) is evaluated between the statistical model and the data to choose the number of ancestral populations (K) that best explains the data. Ten replicates were run for each value of K (K ranged from 1 to 20). The optimal K was selected as the point at which cross-entropy values stabilised after their steepest decline, and the best replicate for the optimal K was the one with the lowest cross-entropy value among the 10 replicates.
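The bookkeeping behind this selection can be sketched as follows. The sketch assumes the cross-entropy values have already been exported from the snmf runs into a dictionary keyed by K; the function names and the toy values are hypothetical, and the final choice of K still relies on inspecting where the curve stabilises rather than on an automatic rule.

```python
def summarise_cross_entropy(runs):
    """runs maps K -> list of cross-entropy values, one per replicate.

    Returns the minimum cross-entropy per K, the drop in that minimum
    from K-1 to K, and the K reached by the steepest drop.
    """
    best = {k: min(values) for k, values in runs.items()}
    ks = sorted(best)
    drops = {ks[i + 1]: best[ks[i]] - best[ks[i + 1]]
             for i in range(len(ks) - 1)}
    steepest_k = max(drops, key=drops.get)
    return best, drops, steepest_k


def best_replicate(runs, k):
    """Index of the replicate with the lowest cross-entropy for a given K."""
    values = runs[k]
    return values.index(min(values))


# Hypothetical values for illustration (two replicates per K).
toy_runs = {
    1: [0.60, 0.61],
    2: [0.50, 0.49],
    3: [0.47, 0.48],
    4: [0.465, 0.47],
}
best, drops, steepest_k = summarise_cross_entropy(toy_runs)
```

Plotting `best` against K and reading off where the values level out after `steepest_k` mirrors the visual inspection described above.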
Population genetic diversity measures (allelic diversity, population size, observed heterozygosity, expected heterozygosity and inbreeding coefficient) were estimated for each sub-population (or collection site) using the R packages diveRsity [38] and poppr [39]. The R package SNPRelate [40] was used to estimate population pairwise Fst values based on the estimator of [41]. These diversity estimates were also calculated for data sets of three other species of Acacia (A. linifolia, A. longifolia and A. suaveolens) that were comparable in data type (DArTseq SNPs), sample size and sequencing depth and were previously described in [16], allowing for a better understanding of the levels of inferred genetic diversity in the study species.
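For readers unfamiliar with these statistics, a minimal sketch of observed heterozygosity (Ho), expected heterozygosity (He) and the inbreeding coefficient (Fis = 1 - Ho/He) for biallelic SNPs is given below. It uses the simple estimator He = 2p(1 - p) averaged over loci; the R packages cited above may apply sample-size corrections, so values from this sketch are not expected to match their output exactly. The function name and toy genotypes are hypothetical.

```python
def diversity_stats(genotypes):
    """Mean Ho, mean He and Fis for one site.

    genotypes: list of samples, each a list of alternate-allele counts
               per locus (0, 1 or 2; None = missing call).
    """
    n_loci = len(genotypes[0])
    ho_sum = he_sum = 0.0
    used = 0
    for j in range(n_loci):
        calls = [row[j] for row in genotypes if row[j] is not None]
        if not calls:
            continue  # locus has no data at this site
        n = len(calls)
        p = sum(calls) / (2 * n)                         # allele frequency
        ho_sum += sum(1 for g in calls if g == 1) / n    # heterozygotes
        he_sum += 2 * p * (1 - p)                        # Hardy-Weinberg
        used += 1
    ho = ho_sum / used
    he = he_sum / used
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Hypothetical site with four plants and two loci.
site = [[0, 1], [2, 1], [0, 0], [2, 2]]
ho, he, fis = diversity_stats(site)
```

In this toy site both loci have an allele frequency of 0.5, but the first locus has no heterozygotes, which is the kind of deficit that produces a positive Fis.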
The extent of genetic differentiation was tested by hierarchical analysis of molecular variance (AMOVA) in poppr [39]. Significance of variance components, differentiation between clusters, differentiation among locations within clusters, and differentiation among locations was assessed using a permutation test implemented through the randtest function with 9999 permutations.

Environmental Data
The CHELSA climate data set [42,43] was used to provide a representation of the climatic conditions experienced by A. purpureopetala at each field site. The CHELSA data were interpolated from observed weather data spanning 1979-2013. The data have a spatial resolution of 30 arc seconds (approximately 885 m at the mean latitude of the study sites). The downscaling method accounts for the interactions between humidity, air mass movement, and topography to better describe rainfall distribution [42]. The basic 19 bioclimatic variables [44] were used in this study. We also used the following non-climate variables to characterise the environment at each field site: percent sand, silt and clay in the soil, slope, aspect, topographic position index (a measure of a site's position in the landscape: valley, mid-slope or ridge), and topographic wetness index (a measure of the amount of water movement a site may receive from upslope areas). Soil and topographic layers were downloaded from the CSIRO Data Portal [45] and resampled to the same grid as the CHELSA climate data. All GIS processing was performed using the raster package [46] in the R statistical environment.
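The conversion from 30 arc seconds to metres can be checked with a short calculation. The sketch below treats the Earth as a sphere of radius 6371 km and uses a latitude of 17.2 degrees South; that latitude is an assumption chosen for illustration, as the mean latitude of the study sites is not stated here.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def cell_size_m(arc_seconds, latitude_deg):
    """North-south and east-west size (m) of a grid cell at a latitude."""
    degrees = arc_seconds / 3600.0
    north_south = math.radians(degrees) * EARTH_RADIUS_M
    # Meridians converge towards the poles, so east-west width shrinks
    # with the cosine of latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# -17.2 is an assumed latitude, not a value taken from the study.
ns, ew = cell_size_m(30, -17.2)
print(f"north-south: {ns:.0f} m, east-west: {ew:.0f} m")
```

Under these assumptions the east-west width comes out close to the 885 m quoted above, while the north-south extent stays near 927 m regardless of latitude.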

Relationships between Genetic, Environmental, Floristic and Spatial Distances
We fitted a Generalised Dissimilarity Model (GDM) [47] with the Fst matrix as the dependent variable and geographical distance, environmental variables, and floristic distance as explanatory covariates. For spatial distance, we computed the Great Circle Distance (WGS84 ellipsoid) between sample locations with the spDists function in the R package sp [48]. For floristic distances, a Bray-Curtis dissimilarity matrix was generated based on Hellinger-transformed abundance matrices using the function vegdist in vegan [49]. The GDM was fitted using the package gdm in the R statistical environment [50,51].
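The two distance inputs can be illustrated with a small sketch. It is not a substitute for spDists or vegdist: the great-circle distance below uses the haversine formula on a sphere, which is a simpler approximation than an ellipsoidal calculation, and the function names and toy counts are hypothetical.

```python
import math


def hellinger(counts):
    """Hellinger transform: square root of each species' relative abundance."""
    total = sum(counts)
    if total == 0:
        raise ValueError("site has no recorded individuals")
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return (sum(abs(x - y) for x, y in zip(a, b))
            / sum(x + y for x, y in zip(a, b)))


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance on a sphere (approximation to the ellipsoid)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


# Hypothetical counts of two woody species at two sites.
site_a = hellinger([1, 1])
site_b = hellinger([4, 0])
floristic_distance = bray_curtis(site_a, site_b)
```

The Hellinger transform down-weights very abundant species before the dissimilarity is taken, so a site dominated by one shrub does not swamp the comparison.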

Optimising Genetic Diversity for A Germplasm Collection
To be able to provide practical guidance to conservation practitioners regarding a genetically diverse germplasm collection, we used the genomic data to investigate how best to capture genetic diversity. For this analysis we used a SNP data set where all loci with missing data were removed (resulting in 6830 SNPs) along with a minor allele frequency of >3% as the threshold delimiting common SNPs. This represented a trade-off between keeping data for as many SNPs as possible, while removing variation that might be especially subject to genotyping error (especially singletons). We utilised the algorithms available in OptGenMix [35] available at [52] to explore sampling design using different subsets of genotyped individuals. First, we evaluated whether sampling size influences diversity (Figures S2 and S3), and based on the results, we proceeded with sampling sizes of 12, 24, 36, 48, and 60. We then simulated sampling in two ways. The first was a genetics-based approach in which a simulated annealing optimisation algorithm [53] was used to choose a subset of the available plants that maximised the value of gene diversity (expected heterozygosity) [54]. The output from this provided a benchmark of the maximum genetic diversity that can be captured under each sampling regime. In the second, we chose individuals at random from the genotyped individuals for each sampling size. This was repeated 100 times, and outcomes were averaged to inform the level of diversity that could be expected if sampling was random.
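The contrast between the optimised benchmark and random sampling can be sketched with a generic simulated annealing loop. This is not the OptGenMix code: the swap move, the cooling schedule, the parameter values and the toy genotypes are all assumptions, and the sketch presumes a genotype matrix without missing data, as in the data set described above.

```python
import math
import random


def gene_diversity(genotypes, subset):
    """Mean expected heterozygosity, 2p(1-p), across loci for a subset."""
    n_loci = len(genotypes[0])
    total = 0.0
    for j in range(n_loci):
        p = sum(genotypes[i][j] for i in subset) / (2 * len(subset))
        total += 2 * p * (1 - p)
    return total / n_loci


def anneal_subset(genotypes, k, rng, steps=2000, t0=0.05, cooling=0.995):
    """Search for k individuals (k < total) that maximise gene diversity."""
    n = len(genotypes)
    current = rng.sample(range(n), k)
    current_score = gene_diversity(genotypes, current)
    best, best_score = list(current), current_score
    temp = t0
    for _ in range(steps):
        # Propose swapping one chosen individual for an unchosen one.
        outside = [i for i in range(n) if i not in current]
        candidate = list(current)
        candidate[rng.randrange(k)] = rng.choice(outside)
        score = gene_diversity(genotypes, candidate)
        # Accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if (score >= current_score
                or rng.random() < math.exp((score - current_score) / temp)):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = list(candidate), score
        temp *= cooling
    return sorted(best), best_score


def random_benchmark(genotypes, k, rng, reps=100):
    """Average gene diversity over repeated random draws of k individuals."""
    n = len(genotypes)
    scores = [gene_diversity(genotypes, rng.sample(range(n), k))
              for _ in range(reps)]
    return sum(scores) / reps


# Hypothetical toy data: two groups of four plants fixed for opposite alleles.
toy = [[0, 0, 2, 2]] * 4 + [[2, 2, 0, 0]] * 4
rng = random.Random(1)
subset, optimised = anneal_subset(toy, 4, rng)
random_mean = random_benchmark(toy, 4, rng)
```

In this toy case the optimised subset draws evenly from both groups, whereas random draws often over-sample one group and therefore capture less diversity on average.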
From the diversity estimates, Fst and AMOVA, we were already aware that much variation resides between the field sites and we thus explored the effect of sampling from subsets of field sites rather than across all samples on diversity capture. We tested selecting only 5, 6, 7, 8, 9, 10, 11, and 12 sites (for this study we combined Jumna and Dargo as these are close in proximity and only six samples were available from Jumna), specifying that a more or less equal number of individuals should be collected from each of the selected sites to reach a sampling size of 12, 24, 36, 48, and 60 individuals. Each random selection was repeated 100 times. The "more or less equal" sampling regime across sites was set up as we felt that this emulated a field collection best.
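One way to emulate the "more or less equal" regime is sketched below. The allocation rule (spread the remainder one individual at a time across the selected sites), the function names and the toy site labels are assumptions made for illustration. The sketch also assumes every selected site has enough genotyped individuals to fill its quota; otherwise `random.sample` raises an error.

```python
import random


def allocate(total, n_sites):
    """Split a total sample size into near-equal quotas per site."""
    base, extra = divmod(total, n_sites)
    return [base + 1 if i < extra else base for i in range(n_sites)]


def sample_by_site(site_members, n_sites, total, rng):
    """Pick n_sites at random, then near-equal numbers of plants per site.

    site_members maps a site name to the list of genotyped individuals.
    """
    sites = rng.sample(sorted(site_members), n_sites)
    quotas = allocate(total, n_sites)
    rng.shuffle(quotas)  # the larger quotas fall on random sites
    chosen = []
    for site, quota in zip(sites, quotas):
        chosen.extend(rng.sample(site_members[site], quota))
    return chosen


# Hypothetical pool: six sites with ten genotyped plants each.
pool = {f"site{s}": [f"s{s}_{i}" for i in range(10)] for s in range(6)}
picked = sample_by_site(pool, n_sites=5, total=24, rng=random.Random(0))
```

Repeating the call 100 times with fresh random draws, and scoring each draw for genetic diversity, would reproduce the structure of the randomisation described above.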

Results

Floristic Data
A total of 13 transects were established to represent vegetation coverage across the distribution of A. purpureopetala at each accessible field site (Figure 1). Incidentally, two new sub-populations were discovered while conducting the surveys. Woody species diversity varied both between shrub layers at each site and between sites (Table A1). Total diversity ranged from 16 to 32 species and was not associated with the altitude of the transects. The different shrub layers contributed unevenly towards total diversity, with some sites recording higher site diversity from the taller shrub layer (S1) (e.g., field site Mount Misery East), whilst others had a greater diversity within the lower shrub layer (S2) (e.g., field site Emuford). Estimated numbers of A. purpureopetala at each site varied between 67 and 561. These estimates were not reflected in the abundance of the study species along each transect and are likely a better representation of the total number of plants of the species at each site. The total area covered in the genetic sampling largely reflected the total area covered by the site survey.

Genotype Data
Medium density DArTseq analysis of 188 samples of A. purpureopetala resulted in a SNP matrix of 21,659 SNPs. After removal of SNPs according to missing data and the reproducibility score, 14,565 SNPs and 184 samples were retained for genomic data analyses. With the snmf analysis (Figure A1) performed in LEA, K = 2 supported the clustering of all samples from Baal Gammon, Dargo, Jumna, and Springmount as one group and all samples from Mt Misery (Central, East, North, South), Gurrumba and Emuford as the second group, while the samples from Stannary Hill North and South and Emuford were not clearly identified as belonging to either. With K = 3, the three inferred structure groups corresponded to the groups identified on the PC1 vs PC2 plot. The assignments at K = 4 were congruent with the clusters obtained across the three main principal components.
The analysis of molecular variance (Table 1) indicated substantial variation between sites (60%) as well as between the two groups (33%) identified with the principal component analysis.

Figure 2. Three plots illustrating the relationships amongst individuals of Acacia purpureopetala inferred from principal components analysis based on SNP data. PC1 vs PC2 (top), PC1 vs PC3 (bottom left), and PC2 vs PC3 (bottom right). The percentage variation explained by each component is provided along the X and Y axes. PC1 separated two large clusters (indicated with purple and green shapes) while all samples from Emuford separated from these clusters along PC2 (blue arrow). The samples from Stannary Hill South (yellow dots) and North (orange dots) and Springmount (red dots) were further separated into distinct groups along PC3. Samples from each field site are colour coded in the same colours used in Figure 1 and the relative geographic locations of the field sites are illustrated in the inset (top left corner).

Population Genomic Diversity Estimates
Commonly used population genomic estimates are provided in Table 2 and summarized in Figure A2 along with estimates from three other Acacias for which similar data were available to us through the Restore and Renew project [32]. For A. purpureopetala, overall observed heterozygosity was almost always lower than expected heterozygosity, which contributed to the positive Fis values (ranging between 0.1 for Springmount and 0.69 for Dargo). We found that four of the five sites with the lowest allelic richness (Springmount, Baal Gammon, Emuford and Gurrumba) were also the sites most isolated and disjunct from any other known site. Springmount and Emuford both harboured the least allelic richness, the lowest observed and expected heterozygosity, and the highest average kinship (Table S2).
Table 1. Results from the Analysis of Molecular Variance for A. purpureopetala SNP data. Two hierarchical levels were analysed: at the field site level and among groups (green and purple groups as based on the results from the PCA illustrated in Figure 2). d.f. = degrees of freedom, SS = sum of squares, Phi-statistics (Φ) according to [55].
Comparable genomic data for three additional species of Acacia (available to us through the Restore and Renew project [16,32]) allowed us to assess the levels of inferred diversity against other species with a similar phylogenetic background. While the three additional species occur across a much bigger area (estimates of Area of Occupancy and Extent of Occurrence provided in Table S1), similar numbers of SNPs were found in three of the four species with the same sequencing depth as provided by DArT Pty Ltd. Striking differences were found between diversity statistics amongst the four species (Figure A2; Table S1), with A. purpureopetala showing the lowest average Ho and He amongst all species and the highest Fis. The Ho, He, and allelic richness (ar) as well as pairwise Fst estimates amongst field sites (Table S2; 0.2 to 0.9) were most like those of A. suaveolens, a species that has been found to have a mixed mating system with high levels of selfing within some populations [56]. More details of the three additional species of Acacia are available in [16].

Relationships between Genomic Data and Ecological Data
We explored linear relationships between genomic diversity estimates and floristic diversity (abundance of A. purpureopetala and species diversity at each field site; Table S2) but did not find any positive or negative correlations, suggesting that neither population size at a site nor species diversity of the associated vegetation is indicative of genetic diversity of the study species (results not shown and not explored further).
A GDM approach with spatial distance, environmental variables, and floristic distance as predictors of genetic difference between field sites revealed that all three covariate groups contributed significantly to the model. The model quality was found to be good (Figures S4 and S5). The strongest contributor to Fst values between field sites was spatial distance, with a strongly non-linear relationship to Fst (Figure S6). Also making strong contributions to the model were aspect, bioclimatic variables 3 (isothermality), 12 (annual rainfall), and 14 (precipitation of the driest month), and floristic difference. Weak contributions were provided by several other bioclimatic and topographic variables (Figure S6).

A Collecting Strategy to Optimise Genetic Diversity
Using OptGenMix [52], it was found that with very few samples a large amount of the genetic diversity present amongst the genotyped samples can be captured (Figures S2 and S3). In fact, with as few as 24 samples (out of 184), over 90% of the diversity was captured with an optimised sampling strategy (Figure 3 and Figure S3). This is slightly lower than the OptGenMix output for other threatened species used in translocations, where the asymptote was achieved at higher levels of sampling [57]. Exploring the level of genetic diversity captured through a semi-randomised approach (Figure 3), we found that strategies that include more field sites captured a greater depth of genetic diversity (expressed by genetic distance). By including five randomly selected samples from each site, a maximum diversity was captured. This was always higher than when collecting more individuals from 10 or fewer sites. As it is most likely that the sampling for the genetic work did not capture all available diversity, these numbers should not be taken literally as an indication of how many samples should be collected at each site.

Figure 3. The proportion of genetic diversity captured, from that available amongst the genotyped samples, through different collecting strategies with different sample sizes (X-axis) and from a different number of randomly selected sites for A. purpureopetala. With a resulting sampling size of 24 individuals, sampling individuals from 12 sites captured significantly more diversity than from five sites (93 to 96% diversity vs 72 to 83%, p < 0.001). Sites were selected randomly from all available sites and individuals selected randomly from available individuals. Each box whisker plot summarises the sampling event across 100 randomisations. The red open diamond represents the "benchmark" of diversity to be captured across the entire dataset with each specified sample size.

Discussion
Here, using the genome complexity reduction method, DArTseq, we investigated the population genomic structure of the fragmented, geographically-restricted, and threatened species, Acacia purpureopetala. We used environmental, floristic, and spatial information to establish whether the genetic patterns can be explained by variation in habitat, and thus how the genetic results should be interpreted and applied to conservation, because understanding demographic processes is crucial for successful conservation [58].
The matrix of polymorphic loci recovered for A. purpureopetala suggests that the available genetic variation residing in this species is very similar to that of other more widespread Acacias [16]. However, we found very low levels of diversity (Ho, He and ar) at the field sites, suggesting repeated inbreeding through biparental inbreeding and self-fertilisation [59]. Furthermore, the most isolated and disjunct populations are likely experiencing severe bottlenecking as these are the most genetically depauperate. The large number of individual plants found at each site does not reflect the severely low levels of inferred genetic diversity, suggesting that inbreeding depression is not (yet) affecting this generation [60] but could be affecting the next generation. Very high levels of divergence between field sites (high Fst values and 60% of the variation between sites) and low heterozygosity, such as found here for A. purpureopetala, have been associated with organisms with mixed mating systems [61], where in small populations drift can quickly lead to fixation following inbreeding amongst close relatives [22]. Low levels of heterozygosity have not consistently been found in other rare or threatened Acacia species (such as A. sciophanes [17] and A. whibleyana [18]), but we found very similar patterns of genomic diversity indices to those of the widespread A. suaveolens. For this species, genotyping of seedlings and mothers indicated that the low levels of heterozygosity are due to self-fertilisation and biparental inbreeding [56].
Several factors may have contributed towards the inferred pattern of genomic diversity (large divergence amongst populations, low diversity at field sites and high levels of kinship/genetic similarity amongst spatially adjacent individuals) for this species but the absence of long and medium distance pollen and seed dispersal agents is most likely a large contributor [24,62,63]. This species lacks many of the traits associated with seed dispersal, and while speculative, it is likely that the fixed purple flower colour in A. purpureopetala (contrasting with other Acacias with creamy to bright yellow flowers) does not attract the same suite of commonly available pollinators. Thus, the species may have to rely heavily on self-fertilisation and is further subjected to biparental inbreeding as individuals close by are likely to be of close kin due to the lack of seed dispersal.
Clonality can lead to low within-population diversity and high Fst and high estimates of kinship within populations [64], but we did not observe any suckering during extensive field surveys and the species does not resprout after fire. Apomixis could potentially account for the high levels of kinship; however, detailed developmental studies will be needed to confirm this and in general apomixis is often associated with polyploidy. The low levels of heterozygosity suggest that this species is not polyploid. Polyad pollen (clusters of 2-32 pollen grains from one stamen), common in Acacia [65], could also contribute to high levels of kinship within pods, but we do not have data at this fine scale. The reproductive advantage of a self-compatible system (along with the possibility of polyad pollen contributing to multiple ovules fertilised by the same male) may work well for colonisation and range expansion by only one or a few propagules [66], but it leads to an increase of a small number of highly similar genotypes. In the absence of continuous gene flow from source populations, such a strategy will lead to a rapid breakdown of heterozygosity that can lead to genetic erosion, and in the long term this can lead to a lack of adaptability, particularly when conditions change [67]. While it is now well-known that increased levels of homozygosity and inbreeding may lead to inbreeding depression (a lack of fitness), at least for now the ability to successfully produce offspring through self or biparental inbreeding provides A. purpureopetala with a mechanism to survive while conditions are suitable to the available genotypes in each population. However, relatively low seed production and germination have been observed in this species and this already raises questions regarding inbreeding depression. With habitat loss listed as one of the major threats to this species [12], populations may become further disconnected, allowing for even less gene flow.
While A. purpureopetala occupies only a small area, SNP data indicate strong genetic structure across the landscape, with field sites in relatively close proximity not necessarily belonging to the same clusters (Figure 2). These broader clusters may reflect historical patterns of expansion and contraction with divergence between them, which is most likely exacerbated by genetic drift. More detailed experimentation may unravel the relationship between environment and genetic diversity further and will be needed to fully understand the demography of the species [58]. Spatial distance is often an easy variable to visualise and imagine in regard to gene flow and connectivity, but it is often hard (or impossible) to explain genetic breaks that cannot be defined by a measurable attribute. Each field site is likely to be under different selective forces; however, we highlight that sites with low diversity are also those that are spatially disjunct. In the case of Springmount (an environmental and spatial outlier), it is difficult to explore the severe bottlenecking, but it could be due to a small founder event where the selective forces for successful migration are too strong against new arrivals or that selection facilitated rapid genetic purging [68].

Application of Population Genomics to a Conservation Management Plan for the Purple Wattle
As the sampling strategy for this project was aimed at capturing a broad representation of genetic diversity across and within field sites, our results are highly informative for future conservation management of the species. For A. purpureopetala, it is likely that a seed collection will reflect lower levels of diversity than were found in this dataset, because the seed generation may be subject to further inbreeding through self-fertilisation. In addition, practitioners should take note that the sampling numbers in Figure 3 are based on available genotyped samples and should not be seen as the total number of individuals to sample. The amount of genetic diversity in a collection based on vegetative material or seed material will also vary according to factors such as strike rate of cuttings, timing of collecting, conditions at the ex situ site, and seed viability, and these can differ across genotypes and time [69,70]. Conservation practitioners are therefore urged to monitor collections and to augment them where possible and needed. Based on the genomic results, we make the following practical recommendations, which can be applied to seed collecting or to a collection based on vegetative cuttings.
1. All populations matter and material for a germplasm conservation collection (seed or living material) should focus on collecting material from all available sites regardless of low heterozygosity or low allelic richness [71,72].
2. Increasing the distance between sampled individuals at each site will decrease the likelihood of collecting from half-siblings. While the seed from one individual will not necessarily increase the genetic diversity of a germplasm collection due to the low seed viability observed by other researchers, multiple seeds can be collected. To maintain maternal lines, seed collectors should not pool seeds across sites or individuals and should record location details carefully, in line with the recommendations of [73].
3. To increase genetic diversity at a site, outcrossing should be encouraged through assisted gene flow and admixture between field sites [18]. This can be done through hand pollination aimed at cross-fertilisation between individuals with different genotypes. Such crosses could include individuals within specific genetic groups such as the green or purple group in Figure 1. However, since we found little correlation between genetic distance and environmental distance, crosses could include samples across a wide array of genetic clusters; this will result in more diversity and new unique genotypes while also diluting the spatial genetic structure of the species. An alternative option would be to produce crosses in controlled glasshouse conditions. The seed can be used to augment wild populations devoid of heterozygosity and be added to germplasm collections. In the short term, this may lead to higher levels of fitness at seed and seedling level, and in the long term it will reduce the risk of losing even more diversity through drift. In particular, the disjunct populations that displayed exceptionally low levels of allelic variation and observed heterozygosity should be targets for genetic augmentation. Monitoring will be required to ensure that the introduced genotypes are not swamped by the local inbred populations or, conversely, that locally adaptive genes are not lost [74]. Seed collections can be made prior to introductions to ensure local genotypes are not lost due to swamping.
4. In situ management should conserve genetic connectivity amongst sites by aiming to reduce fragmentation in these areas (field sites with lower pairwise Fst, Table S3).
5. As this species has such unique flowers and an attractive sprawling habit, we cannot resist raising the idea of implementing horticulture as a tool for ex situ conservation and genetic rescue for this species, an approach that has been implemented successfully for cycads [75]. However, unlike traditional horticultural practices, where there is often selection for a specific trait, we would encourage active hand crosses and seed harvesting that can augment germplasm collections that originated from in situ field sites. Commercially available purple wattle may also alleviate the pressure of illegal collecting of this species [12].
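Recommendations 1 and 2 above (represent every field site, then favour genetically distant individuals) could be prototyped as a simple greedy selection. The sketch below is not the OptGenMix procedure used in this study; it is a swapped-in heuristic for illustration only. It assumes SNP genotypes coded as 0, 1 or 2 copies of the alternate allele (None for missing calls), and the sample identifiers, site labels and function names are all hypothetical.

```python
def genetic_distance(a, b):
    """Mean allele-count difference (scaled to 0..1) over loci typed in both samples."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("samples share no genotyped loci")
    return sum(abs(x - y) for x, y in pairs) / (2.0 * len(pairs))


def greedy_collection(genotypes, site_of, n_select):
    """Choose n_select sample ids: cover every site first, then maximise distance.

    At each step the candidate with the largest summed distance to the
    samples already chosen is added (ties go to the first id in sorted order).
    """
    ids = sorted(genotypes)
    if n_select > len(ids):
        raise ValueError("n_select exceeds the number of available samples")
    chosen = []
    while len(chosen) < n_select:
        covered = {site_of[i] for i in chosen}
        remaining = [i for i in ids if i not in chosen]
        uncovered = [i for i in remaining if site_of[i] not in covered]
        pool = uncovered or remaining  # unrepresented sites take priority

        def gain(i):
            return sum(genetic_distance(genotypes[i], genotypes[j]) for j in chosen)

        chosen.append(max(pool, key=gain))
    return chosen


def _is_polymorphic(calls):
    """True if both alleles occur among a set of 0/1/2 genotype calls."""
    return 1 in calls or {0, 2} <= calls


def proportion_polymorphic(genotypes, chosen):
    """Share of loci polymorphic in the full data set that stay polymorphic in `chosen`."""
    n_loci = len(next(iter(genotypes.values())))
    total = captured = 0
    for locus in range(n_loci):
        everyone = {g[locus] for g in genotypes.values() if g[locus] is not None}
        if not _is_polymorphic(everyone):
            continue
        total += 1
        subset = {genotypes[i][locus] for i in chosen if genotypes[i][locus] is not None}
        if _is_polymorphic(subset):
            captured += 1
    return captured / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: five samples from three sites, four SNP loci.
    genotypes = {
        "A1": [0, 0, 2, 0], "A2": [0, 0, 2, 0],
        "B1": [2, 0, 0, 0], "B2": [2, 1, 0, 0],
        "C1": [0, 2, 2, 0],
    }
    site_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C"}
    picked = greedy_collection(genotypes, site_of, 3)
    print(picked, proportion_polymorphic(genotypes, picked))
```

A greedy heuristic of this kind is easy to audit in the field, but it does not guarantee an optimal set; a dedicated optimiser would be preferable for a real collecting plan.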

Conclusions
While the current abundance of individuals of this species at each field site indicates that Acacia purpureopetala is likely to be downlisted from Critically Endangered to Vulnerable under IUCN criteria and is therefore unlikely to get conservation priority, the excessively low levels of heterozygosity inferred from the genomic data ring alarm bells for the long-term future of this species. Disjunct populations are valuable for the unique genetic diversity harboured at these sites, while the more connected populations are valuable for higher levels of heterozygosity and recombination amongst genotypes. The low germination and seedling survival observed by researchers may well be the early warning signs of inbreeding depression for this species, and implementation of a germplasm collection from seed should happen rapidly to avoid further loss of diversity through drift and inbreeding. Population genomic studies that cover the distribution of a species are highly informative for conservation and are highly recommended to form part of the conservation management toolbox [76].
Supplementary Materials: The following are available online at https://www.mdpi.com/1424-2818/13/4/139/s1. Figure S1: Four images to illustrate the habitat occupied by A. purpureopetala in the Herberton region, Queensland, Australia. Figure S2: Simulations of genetic diversity captured for different field sampling sizes (X-axis) of Acacia purpureopetala from total available genetic samples. For each sampling size, individuals were selected by optimising on the basis of gene diversity (genetic distance; red diamond symbol) and by choosing individuals at random (blue round symbol, representing means of 100 replicates). For both approaches, we obtained the SNPs that were 'common' in A. purpureopetala (minor allele frequency > 3%) and calculated the proportion of these that were polymorphic for each sampling size (Y-axis). Figure S3: This plot illustrates how the proportion of total diversity captured through increasing sampling from the available sample set in the A. purpureopetala DArTseq data set does not increase beyond a sample size of 60 individuals. We used OptGenMix (Bragg et al. 2020) to estimate the proportion of diversity based on maximising the genetic distance between samples. Figure S4: Plot showing predicted compositional dissimilarity against observed compositional dissimilarity. The close distribution of points around the smooth rising trend curve indicates that the fitted GDM can accurately relate observed genetic dissimilarity to predictor variables. Figure S5: Calibration plot for the fitted GDM. The strongly linear relationship between predicted ecological distance and observed compositional dissimilarity for pairs of sites shows that the model can accurately relate an observed value of Fst between sites to differences in the combined predictor variables. Figure S6: Contributions of predictor variables retained in the final fitted GDM.
Table S1: A summary of genomic diversity estimates based on DArTseq data and geographic distribution size of four Acacia species with comparable sample sizes (given below the species name). Table S2: Average distance in meters between samples of A. purpureopetala (based on latitude and longitude recorded during collecting using a handheld GPS) collected at each field site, along with the average genetic distance amongst all the samples at each field site. Average kinship was estimated from all pairwise estimates across samples in each site. Kinship analysis used a SNP matrix with no missing data. Table S3: Pairwise Fst estimated from a DArTseq SNP matrix across all known field sites. In Supplementary data S2 we provide a .csv file with the SNP calls for each sample as received from DArT Pty. Ltd.

Table A1. Species diversity across two substrata (taller shrub layer, S1 and lower shrub layer, S2) and the species count for A. purpureopetala across each transect (Ap Transect) and across the field site (Ap site). Field sites appear in the order of numbering on the map in Figure 1.

Field Site
Altitude S1 S2 Total