Crown-of-thorns starfish (CoTS; Acanthaster spp.
) naturally occur on coral reefs throughout the Indo-Pacific [1
]. While normally found at low densities [3
], sporadic population outbreaks of CoTS cause significant localised coral loss and are a major contributor to the ongoing degradation of coral reefs throughout the Indo West-Pacific [4
]. Numerous hypotheses have been put forward to explain the occurrence of CoTS outbreaks (reviewed by [1
]), most of which invoke an anthropogenic basis for the purportedly recent and increasing occurrence of outbreaks. While their inherent life-history characteristics (most notably their high fecundity [9
]) predispose CoTS to major fluctuations in abundance [10
], two prominent hypotheses have been proposed to explain what triggers outbreaks: the larval survival hypothesis suggests that anthropogenic eutrophication of nearshore waters dramatically increases the survival of planktonic larvae [11
], whereas the predator removal hypothesis postulates that overfishing of natural predators has allowed more CoTS to reach sexual maturity [1
]. Tests of either hypothesis require improved knowledge of when and where outbreaks start and corresponding research on environmental conditions and population demographics of CoTS at these locations.
Outbreaks of CoTS on Australia’s Great Barrier Reef (GBR) were first documented in 1962 [14
], though there are earlier reports of high densities of starfish (which may or may not have constituted an “outbreak”) on the GBR e.g., [15
]. Since 1962, there have been three additional outbreak episodes on the GBR, starting in 1979, 1993 and 2009. Each outbreak has followed a reasonably consistent pattern in which primary outbreaks were first recorded on mid-shelf reefs between Lizard Island and Cairns (the ‘initiation box’; Fabricius et al. [13
]), followed by a wave of ‘secondary outbreaks’ that tend to propagate southwards [17
]. Biophysical modelling of larval dispersal patterns suggests that reefs within the initiation box are highly connected [19
], thereby explaining why outbreaks that initiate in this region inevitably lead to reef-wide outbreaks. However, limited temporal and spatial resolution of monitoring e.g., [20
], as well as inevitable delays in responding to new outbreaks mean that it is still unclear where exactly outbreaks arise. It is also unknown whether outbreaks start from a small cluster of reefs within this area or arise simultaneously on widely separated reefs [2
]. Resolving the exact timing and location where outbreaks start is important to establish environmental triggers [13
] or changes in population demographics [21
] that cause outbreaks. This could also lead to improved management and containment strategies to stop outbreaks before they spread.
Although native to the GBR, the spatial distribution of CoTS following an outbreak closely resembles the spread of invasive species and infectious diseases [22
]. Historical and observational data have previously identified invasion routes or disease vectors e.g., [23
], but direct observations have proven ineffective for CoTS e.g., [20
], where it is still unclear whether outbreaks start from a single reef or arise simultaneously from separate locations [24
]. The rapid expansion of populations following biological invasions can, however, lead to distinct patterns of genetic structure and diversity. Genetic data have increasingly been used as an indirect method to describe the spread of invasive species [25
] and infer the relationship between discrete populations or possible migration routes [27
]. Elaborating on these patterns, model-based statistics can provide probabilistic estimations of the demographic and genetic history that are necessary to generate observed patterns of genetic structure e.g., [28
]. Approximate Bayesian computation (ABC) approaches that incorporate the divergence and admixture of populations, as well as changes in population size and structure [30
] can provide important information on the likely initiation and spread of species e.g., [31].
This study examines genetic diversity and structure, based on sampling of crown-of-thorns starfish during the current outbreak on the GBR. Over four thousand starfish were sampled at 13 reefs spanning ~1000 km. The spatial genetic structure of a CoTS outbreak will depend on the history of the source population(s), the size of the initial population(s), the dispersal of individuals that led to a primary outbreak and successive secondary outbreaks. While the demographic factors contributing to each stage of an outbreak are unclear, a recent review clarifies many aspects of their population dynamics and life-history characteristics [2
]. A model-based approach would therefore allow us to take into account the stochasticity of these demographic processes and test multiple scenarios that would have generated the observed spatial population structure of CoTS on the GBR. Specifically, we test whether the outbreak was generated from a single source population in the ‘initiation box’ or multiple populations. If a small number of individuals from a single source population caused a localised primary outbreak, we would expect successive secondary outbreaks to be affected by a single bottleneck, be composed of highly related individuals and have low genetic diversity. If the primary outbreak originated from multiple populations, we would expect multiple bottlenecks, founding a widespread admixed population, and high genetic diversity. Using a model-based Bayesian approach, we explore the likelihood of competing outbreak scenarios against the observed spatial genetic structure of CoTS to determine the most parsimonious origin of primary outbreaks on the GBR.
A total of 4082 individual CoTS were collected between April 2013 and May 2015 from 13 reefs between Lizard Island and the Swains reefs (Figure 1
). All individuals were genotyped at 26 microsatellite loci, and the final dataset was curated to focus on five focal regions (Table 1
). Individuals removed from the dataset included 642 individuals that were sampled prior to September 2013 or were less than 80 mm in length, and 121 individuals from reefs where fewer than 50 individuals were sampled.
Poor DNA quality or the possible presence of PCR inhibitors in DNA extractions led to a large amount of missing data and genotyping error. A training dataset [36
] identified four loci (Apl21
) with over 3% genotyping error and five loci (Apl01
) with over 30% missing data, which were all discarded prior to analysis. In addition, 600 individuals with four or more missing loci were discarded from further analyses along with 14 duplicate genotypes associated with the presence of missing data.
Of an initial 4082 samples, our final data included 2705 unique individuals from 13 reefs from Lizard Island (n = 1002), Cooktown (n = 692), Cairns (n = 802), Townsville (n = 137) and the Swains (n = 68). Individuals ranged from 90 mm to 510 mm (mean = 282 mm ± 66 mm SD) with no statistical difference in the size of individuals between regions. Amongst all genotyped individuals, 1925 were identified as either mature male or mature female with 1.34 males for every female. This ratio was used to parameterise the ABC models.
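The filtering steps described above can be tallied in a few lines. The sketch below only re-adds the counts reported in the text; the variable names are illustrative, and the male/female split is a rounded estimate implied by the reported ratio, not a count taken from the data.

```python
# Sample counts reported in the text above.
initial_samples = 4082

# Exclusion steps, in the order they are described.
exclusions = {
    "sampled before September 2013 or under 80 mm": 642,
    "from reefs with fewer than 50 individuals": 121,
    "four or more missing loci": 600,
    "duplicate genotypes": 14,
}

remaining = initial_samples
for reason, count in exclusions.items():
    remaining -= count
    print(f"{reason:<46} -{count:>4} -> {remaining}")

final_individuals = remaining

# Approximate split of sexed individuals implied by 1.34 males per female.
# This is an estimate derived from the ratio, not a reported count.
sexed = 1925
males_per_female = 1.34
females_est = round(sexed / (1 + males_per_female))
males_est = sexed - females_est
print(f"estimated females: {females_est}, estimated males: {males_est}")
```

The running tally ends at 2705, which matches the number of unique individuals reported for the final dataset.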
3.1. Microsatellite Data
Data presented here contained 17 polymorphic loci with 1.7% missing data (Table 2
). The mean number of alleles per locus was 17.4 and ranged from four to 62 alleles. Similarly, the average observed heterozygosity was 0.67 and ranged from 0.21 to 0.95. Each locus was tested for departure from Hardy-Weinberg equilibrium, and nine loci showed non-random association of alleles after correction for multiple testing. A specific test for heterozygote deficiency highlighted 12 loci with a lower than expected frequency of heterozygotes after correction for multiple testing. Estimates of Weir and Cockerham’s FIS
were positively skewed with an average FIS
of 0.017 ± 0.004 SE across all loci, which suggests some degree of inbreeding in these populations. Amongst 136 pairwise comparisons, two locus pairs showed significant linkage disequilibrium after correction for multiple testing, and no locus showed evidence of null alleles.
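As a rough illustration of what a positive FIS measures, the sketch below computes the simple single-locus form FIS = 1 − HO/HE from a list of diploid genotypes. This is not Weir and Cockerham’s estimator (which applies sample-size corrections); the function name and toy genotypes are hypothetical. The last line shows where the 136 pairwise comparisons come from: the number of distinct pairs among 17 loci.

```python
from collections import Counter
from math import comb


def f_is(genotypes):
    """Return (H_obs, H_exp, F_IS) for one locus.

    genotypes: list of (allele, allele) pairs, one pair per individual.
    Uses the simple form F_IS = 1 - H_obs / H_exp, without the
    sample-size corrections of Weir and Cockerham's estimator.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    h_exp = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F_IS is undefined")
    return h_obs, h_exp, 1.0 - h_obs / h_exp


# Toy data (hypothetical): genotype proportions match Hardy-Weinberg expectations.
balanced = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")]
# Toy data (hypothetical): no heterozygotes at all, the extreme deficit.
deficit = [("A", "A"), ("A", "A"), ("B", "B"), ("B", "B")]

print(f_is(balanced))
print(f_is(deficit))

# Number of distinct locus pairs among 17 loci.
n_locus_pairs = comb(17, 2)
print(n_locus_pairs)
```

A value near zero indicates genotype proportions close to random-mating expectations, whereas positive values indicate fewer heterozygotes than expected.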
3.2. Spatial Patterns of Genetic Diversity
The mean allelic richness (AR) and genetic diversity (HS
) within sampled reefs were high (AR = 5.5; HS
= 0.687) and consistent among reefs (Table 1
). There was no evidence for genetic differentiation amongst sampled reefs with global estimates of G’ST
= −0.001 (p
= 0.948) and no genetic variance amongst regions (FCT
= 0.000 ± 0.000 SD) or amongst populations nested in regions (FSC
= 0.000 ± 0.000 SD; Table 3
). Pairwise genetic differences amongst populations were not significantly different from zero (Table 4
), and showed no evidence of isolation by distance (Mantel test: p
There was, however, evidence of low, but significant heterozygote deficiency within the sample. Individual heterozygosity was lower than would be expected in a large, randomly-mating population, both globally (GIS = 0.022, p = 0.001) and amongst individuals within individual reefs (FIS = 0.021 ± 0.004 SD). Such patterns are commonly associated with the mixing of different source populations (Wahlund effect), inbreeding due to the mating of close relatives or the non-random sampling of a limited number of familial pools. The level of heterozygote deficiency also varied amongst reefs, ranging from −0.004 to 0.048, with an average GIS of 0.021 ± 0.012 SD amongst reefs.
Using the whole sample of individuals without prior information of sampling location, the population could not be partitioned into independent populations, indicating that the sample has a common origin. The largest mean log-likelihood values of the data were for K = 1 population and decreased with increasing values of K (Figures S1 and S2
). The analysis of spatial genetic clustering could not unambiguously detect separate groups of individuals in the sample, indicating a homogeneous population, and does not support a Wahlund effect as a source of heterozygote deficiency. The power of the tests suggests that the number of sampled individuals, the number of loci and the allelic diversity at these loci were sufficient to detect genetic structure amongst sampled populations. At a level of differentiation of FST
= 0.0005, the power to detect genetic heterogeneity with 95% confidence was 100%. At the lowest level of differentiation (FST
= 0.0001), the power was reduced to 53.7%. Analyses were repeated after excluding nine loci that did not meet equilibrium assumptions (AP12QS
). Results from these runs were not different from runs with the full dataset.
We measured low, but significant levels of heterozygote deficiency in most screened loci, which could indicate the presence of inbred individuals in our sample. However, the mean multilocus heterozygosity (MLH) correlation was −0.003 (95% CI −0.025 to 0.019), indicating that the heterozygote deficiency was not significantly different across loci. To confirm that inbreeding was not the cause of heterozygote deficiency in sampled CoTS, we compared the MLH between sampled and simulated genotypes. Amongst all CoTS sampled in the GBR, the average MLH was 0.29 ± 0.11 SD, which was not significantly different from simulated genotypes (0.26 ± 0.10 SD, t = −9.41, df = 2704, p-value = 1).
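The comparison between sampled and simulated genotypes can be sketched as follows. This is a minimal illustration, assuming MLH is the proportion of heterozygous loci per individual and that simulated genotypes are built by drawing two alleles independently from each locus’s allele frequencies (random mating); the study’s exact definition and simulation procedure may differ. The function names and the three-locus frequency table are hypothetical.

```python
import random
from statistics import mean


def mlh(genotype):
    """Proportion of loci at which one individual is heterozygous."""
    return sum(a != b for a, b in genotype) / len(genotype)


def simulate_individual(allele_freqs, rng):
    """Draw one multilocus genotype under random mating.

    allele_freqs: list with one {allele: frequency} dict per locus.
    """
    genotype = []
    for freqs in allele_freqs:
        alleles = list(freqs)
        weights = [freqs[a] for a in alleles]
        # Two independent draws per locus correspond to random union of gametes.
        a, b = rng.choices(alleles, weights=weights, k=2)
        genotype.append((a, b))
    return genotype


# Hypothetical allele frequencies for three loci (not taken from the study).
toy_freqs = [
    {"A": 0.5, "B": 0.5},
    {"A": 0.8, "B": 0.2},
    {"A": 0.3, "B": 0.3, "C": 0.4},
]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
simulated_mlh = [mlh(simulate_individual(toy_freqs, rng)) for _ in range(5000)]

# Expected MLH under random mating: mean expected heterozygosity across loci.
expected_mlh = mean(1 - sum(p * p for p in f.values()) for f in toy_freqs)
print(round(mean(simulated_mlh), 3), round(expected_mlh, 3))
```

In practice, the MLH of sampled individuals would be computed with the same `mlh` function and compared with the simulated distribution; a sample mean well below the simulated mean would point towards inbreeding.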
3.4. ABC Framework
The most likely scenario supported by the ABC analyses was Scenario 4 (Figure 2
d), whereby all sampled populations originated sequentially from a single common primary outbreak (Figure S4
). However, the outcomes of this scenario could not be distinguished from Scenario 2, whereby sampled populations originated from a single common primary outbreak at the same time point (Figure 2
b). Both scenarios were more representative of the observed data than alternative scenarios that incorporated a divergence between the Swains and the northern populations (Figure 2
a,c). In all simulated scenarios, ABC analyses were particularly sensitive to the effective population size for the ancestral population (Nanc
). The posterior distribution of Nanc
was however consistent amongst scenarios, and estimates suggest that the effective population size that would have led to a primary outbreak would be ~5000 individuals (Table 5
; Figure S3).
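To illustrate the general logic of ABC, the sketch below uses plain rejection sampling to approximate the posterior of an effective population size from a single summary statistic. It is a toy example and not the software, scenarios, priors or summary statistics used in this study: the simulator is the standard expectation for the decay of heterozygosity under drift, H_t = H_0 (1 − 1/(2N))^t, and the observed value, prior range, number of generations and noise level are all hypothetical.

```python
import random
from statistics import median


def simulate_heterozygosity(n_e, generations, h0, rng):
    """Toy simulator: expected heterozygosity after drift, plus sampling noise."""
    expected = h0 * (1.0 - 1.0 / (2.0 * n_e)) ** generations
    return expected + rng.gauss(0.0, 0.005)  # hypothetical noise level


def abc_rejection(h_obs, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary statistic lies close to h_obs."""
    accepted = []
    for _ in range(n_draws):
        n_e = rng.uniform(10, 5000)  # hypothetical flat prior on N_e
        h_sim = simulate_heterozygosity(n_e, generations=20, h0=0.70, rng=rng)
        if abs(h_sim - h_obs) <= tolerance:
            accepted.append(n_e)
    return accepted


rng = random.Random(42)  # fixed seed so the sketch is reproducible
# h_obs is a made-up value chosen for illustration only.
posterior = abc_rejection(h_obs=0.65, n_draws=20000, tolerance=0.002, rng=rng)
print(len(posterior), "accepted draws; posterior median N_e:", round(median(posterior)))
```

The accepted draws approximate the posterior distribution of the parameter. Comparing scenarios works on the same principle: simulations are drawn from each competing scenario, and the share of accepted simulations belonging to each scenario approximates its posterior probability.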
Assessing a species’ dispersal ability and capacity to colonise new habitats is critical for our understanding of their population biology and ecology [60
], particularly where species have ecological or economic importance. CoTS outbreaks represent the single most important biological disturbance on coral reefs throughout the Indo West-Pacific [62
] and can account for up to 50% of the coral loss recorded on coral reefs over the last few decades [5
]. For these starfish, knowledge of population structure and the movement of individuals among reefs can greatly influence management decisions e.g., [19
], leading to improved detection and understanding of the patterns of outbreaks, as well as prioritisation of reefs for direct intervention (culling) in an attempt to contain outbreaks. In the current study, we investigated the genetic diversity and structure of CoTS on the GBR with the specific intention of identifying the origin and the direction of subsequent spread for current outbreaks apparent at reefs between Cooktown and Townsville, as well as at Swains reefs, in the southernmost portion of the GBR. Identifying this origin would have important management ramifications, because containment of future outbreaks would be most effective if monitoring and control were concentrated on the reef(s) where outbreaks initiate. Our results indicate that populations sampled across the full length of the GBR are genetically homogeneous, highly diverse and have no apparent genetic structure. Furthermore, model-based Bayesian analyses showed that the current outbreaking population of CoTS in the Swains is not independent of the outbreak populations in the northern GBR, but shares a common origin with them.
We found no evidence of genetic structure amongst CoTS genotyped from 13 reefs and five regions spanning over 1000 km along the GBR. While the genetic diversity in our sample was high, there was no variance in diversity among reefs. Power analysis confirms that our data were sufficient to detect even very low degrees of genetic differentiation (FST
> 0.0005) had there been any structure in the sampled population. The results indicate that CoTS from Lizard Island to the Swains are genetically homogenous. There was a small but significant deficiency in heterozygosity amongst individuals that could have arisen from: (i) the successful reproduction among differentiated populations (Wahlund effect); (ii) the mating of close relatives (inbreeding) or (iii) the sampling of close relatives. We could not distinguish independent clusters in our sample, and therefore, a Wahlund effect is unlikely (sensu [63
]). Moreover, multilocus genotypes did not provide consistent evidence of inbreeding. We did find evidence that some individuals were highly related, which would result in a small but significant deficit in heterozygosity (data not shown). The strength and number of relationships between individuals from the same or different reefs could not, however, distinguish between alternative explanations for deficiencies in heterozygosity. Furthermore, a spatial autocorrelation analysis did not identify any relationship between the relatedness of individuals and their spatial distribution. Previous sampling during non-outbreak periods reported small but significant genetic structure among latent CoTS populations on the GBR (FST
= 0.003, [64
]), even though they also found limited structure when sampling each outbreak population.
Our ability to determine a putative source population for the current outbreak of CoTS on the GBR is significantly constrained by the limited genetic structure (sensu [65
]), despite hierarchical sampling and large numbers of CoTS genotyped from individual reefs and regions. Very high levels of homogeneity across the 2705 individual starfish point to a recent and very rapid increase in the population size of CoTS across the GBR, which has arisen from either a single source population or multiple undifferentiated populations. A single origin would be expected to generate some inbreeding and highly related individuals, which the data do not support. A more parsimonious explanation, therefore, is that the current outbreak arose almost simultaneously across a number of reefs, but with largely undifferentiated latent populations. This is consistent with reports during the last major outbreak in 1994, whereby increasing CoTS densities occurred almost simultaneously at Lizard Island and several nearby reefs (including Linnet Reef, North Direction Island and Rocky Islet) before being reported on reefs to the south [20
]. High levels of gene flow amongst these closely-positioned populations would have likely resulted in admixture and high levels of genetic homogeneity, as recorded in this study. However, given the lack of genetic structure between regions, we are unable to unequivocally state whether the current outbreak did or did not originate in the northernmost sector of the initiation box, nor can we establish the directionality in the spread of individuals between regions. It is also likely that the latent population of CoTS is sufficiently large and sufficiently diverse to prevent genetic drift from occurring within regions, thereby maintaining the genetic homogeneity in the latent population. We cannot, therefore, dismiss the possibility that outbreaks arise almost simultaneously and/or independently across the entire area of the initiation box, as suggested by Fabricius et al. [13].
While our genetic analyses did not resolve competing hypotheses about the initiation and spread of CoTS outbreaks on the GBR, these data could be used in conjunction with demographic information and/or fine-scale monitoring data to better resolve the patterns of initiation and spread. Extensive and intensive monitoring of CoTS populations was undertaken across the northern GBR throughout the period of this study, recording the extent and severity of outbreaks at every reef between Lizard Island and Cairns (P. Doherty, unpublished data), as well as documenting the size-structure of CoTS populations at select reefs throughout this range [67
]. However, these surveys were undertaken (in 2014 to 2015) only after outbreaks had become well established throughout the entire area, such that sequential (cf. simultaneous) occurrence of outbreaks will only be apparent based on spatial variation in the size and abundance of CoTS. In the future, systematic and intensive monitoring should be undertaken across a range of reefs within the initiation box to unequivocally establish the sequence and inter-dependence of outbreaks within this area [2
]. It is also possible (but not certain) that next generation sequencing might reveal greater genetic structure among existing samples and thereby provide meaningful differences among sub-populations to explicitly test for directionality in spread. The recent compilation and publication of an entire mitochondrial genome for Acanthaster
collected from Japan [68
] certainly paves the way for much more detailed studies of population genetics for CoTS. Ongoing genetic sampling and re-analyses of existing genetic samples from the GBR are underway.
An unexpected outcome of this study was that outbreak populations of CoTS in the southern GBR (Swains reefs) were not significantly differentiated from, and have a similar origin to, outbreak populations sampled in the northern and central GBR during 2014/2015. For the most part, CoTS outbreaks in the Swains have been thought to occur independently of outbreaks in the northern GBR and have an altogether different origin (e.g., [19
]), though the appearance of high CoTS densities at Swains reefs in 2014 is consistent with continual and progressive southerly spread of the outbreak that started at and near Lizard Island in 1993/1994 [69
]. Model-based Bayesian analyses resulted in a higher posterior probability of selected scenarios that considered a common origin for outbreaks in both the Swains reefs and northern GBR, as opposed to two independent primary outbreaks (sensu [19
]). Both scenarios that represent a single common primary outbreak indicate a strong goodness-of-fit to our genetic data, though we could not distinguish whether secondary outbreaks originated from a single time point or sequentially. It is possible that the limited time (in generations) elapsed between primary and secondary outbreaks, and even between successive waves of outbreaks, would result in minimal observable genetic differences.