Divergence Time Estimation of Aloes and Allies (Xanthorrhoeaceae) Based on Three Marker Genes

Aloes and allies are prominent members of African succulent vegetation and especially of the highly diverse Cape Flora. The main goal of this study was to obtain age estimates for alooids by calibrating a Bayesian phylogenetic analysis based on two chloroplast markers (the trnL-trnF spacer region and the rbcL gene) and one nuclear marker (ITS) using a relaxed molecular clock. Seventy-four species from all succulent genera of alooids were analysed with MrBayes to infer species relationships. We discuss the age estimates to address the question of whether vicariance or dispersal could account for the diversification of Madagascan alooids. In the combined maximum clade credibility tree obtained from BEAST, the root of the asphodeloid outgroups is dated to about 51.8 Mya, and the succulent alooids split from asphodeloids around 22.7 Mya in the Early Miocene. The divergence time estimates for succulent, drought-resistant alooids (late Oligocene to early Miocene) correspond well with dates identified for several other plant lineages in southern Africa and match the start of the dry period in the Miocene, which triggered speciation and evolutionary radiation of these genera and families. All climbing aloes and some tree aloes, which were recently split into new genera, are among the early-diverging groups in alooids, and the crown node of this group diverged around 16.82 (15.5–22.4) Mya. The oldest node age estimate for aloes from Madagascar (5.1 Mya) falls in the early Pliocene, and our findings support the hypothesis that the Africa-Madagascan divergence is best explained by oceanic long-distance dispersal rather than vicariance. This study is one of the first to give age estimates for clades of alooids in Xanthorrhoeaceae as a starting point for future studies on the historical biogeography of this family of succulent plants, which are important in ethnomedicine and as ornamental and horticultural plants.


Introduction
Alooids, or the subfamily Alooideae sensu Dahlgren et al. (1985) [1], are a well-known group of southern African and Cape rosette leaf succulents (ca. 530 species) adapted to life in dry areas; some members of Aloe occur on the Arabian Peninsula, Madagascar and the Mascarene Islands. Alooids are of interest from an ethnomedical, ornamental and horticultural perspective (Smith et al., 2000) [2]. Aloe ferox is the source of anthraquinones, which are used in medicine as a laxative. Aloe vera is likewise used as a laxative and also as a gel for wound healing and skin care (van Wyk and Wink, 2017) [3]. In Africa, harvested species of Aloe come from east and southern Africa. During the 1990s, exports of wild-harvested exudate (Aloe) from Kenya sometimes exceeded 80 tons per annum (Oldfield, 2004) [4].
Alooids are characterised by rosulate and succulent leaves and by synapomorphies such as a bimodal karyotype with four long and three short chromosomes, hemitropous ovules, a parenchymatous, cap-like inner bundle sheath at the phloem poles, 1-methyl-8-hydroxyanthraquinones in the roots and anthrone-C-glycosides in the leaves (Treutlein et al., 2003) [7]. Aloe species are often pollinated by insects and birds but can also be autogamous. Moreover, the widespread occurrence of secondary growth might be added to these characters (Smith and Van Wyk, 1998) [6].
As the largest genus in the Asphodelaceae with approximately 530 species, Aloe has centers of diversity in southern Africa. It occurs widely in Africa, Arabia, and on several islands of the Western Indian Ocean off the east coast of Africa, such as Madagascar and Socotra (Klopper et al., 2010) [13]. The distributions of Haworthia, Astroloba and Gasteria are similar. The berry-fruited Lomatophyllum is limited to the Mascarene Islands.
With regard to species richness, alooids are not alone in the Cape Floristic Region (CFR) (Linder et al., 1992; Sauquet et al., 2009) [19,20], which shows a high degree of endemism and includes 30% of the succulent plants of the world (Schnitzler et al., 2011) [21]. With the exception of the Karoo flora, which diversified as a result of recent radiation during the late Miocene or Pliocene (Verboom et al., 2003) [22], molecular phylogenies indicate that the radiation of several African plant lineages took place over much of the Neogene and had started earlier than the climatic changes in the late Miocene (Bakker et al., 2005; Schrire et al., 2003; Goldblatt et al., 2002) [23][24][25].
Although representatives of the subfamily Asphodeloideae (including Aloeae) are thought to have existed since the early Cretaceous (Smith and Van Wyk, 1991) [26], only a few dated phylogenies have been published for this diverse complex of succulent plants (Grace et al., 2015) [11].
In the current study, we analysed nucleotide sequences of 77 taxa, comprising all genera of alooids and three genera of non-succulent asphodeloids. These data are used to estimate ages for the main clades of alooids (including the diversification of Madagascan aloes), using a "relaxed" molecular clock that permits variation of the molecular rate among lineages in two chloroplast markers (trnL-trnF spacer and rbcL) and one highly repeated nuclear ITS region. Since there are many different hypotheses about the speciation process of aloes, including dispersal and extinction or vicariance and peripheral isolation, we investigated vicariance vs. dispersal as explanations for the origin of Madagascan aloes.

Taxon Sampling
Data were compiled for 74 species from all succulent genera of alooids including Lomatophyllum (with three individuals each), and new sequences for three gene regions were generated for all of these species. Only three outgroups were additionally obtained from GenBank for non-succulent genera (asphodeloids, which are the sister group of alooids) in the subfamily Asphodeloideae (family Xanthorrhoeaceae). Most Aloe samples were collected from plants of wild provenance kept in the collection of Gariep Plants in Pretoria. Other genera came from the Botanical Garden of Heidelberg University and the Palmengarten in Frankfurt. Details of GenBank accession numbers and DNA voucher specimens, which were deposited at IPMB (Heidelberg University), are presented in Table 1.

Molecular Methods
DNA was isolated from fresh leaves based on a modified CTAB method (Doyle and Doyle, 1990) [28]. Only the epidermal part of the leaves was used in DNA extraction due to large amounts of secondary metabolites in the mucilaginous part. Extracted DNA was dissolved in TE buffer and the concentration was measured by UV spectrophotometry.
The internal transcribed spacer (ITS1 & 2 and 5.8S rDNA) regions were amplified with the primers ITS4 and ITS5 of White et al. [30] and the PCR protocol of Adams et al. [31], with the addition of 4% DMSO to the PCR reaction. The following PCR programme was applied: 26 cycles of 97 °C for 1 min, 50 °C for 1 min, and 72 °C for 3 min, followed by a final extension at 72 °C for 7 min.
For sequencing, PCR products were precipitated following Gonzalez et al. [32]. Sequencing was performed using an ABI 3730 automated capillary sequencer (ThermoFisher Scientific, Darmstadt, Germany) with the ABI Prism Big Dye Terminator Cycle Sequencing Ready Reaction Kit version 3.1 and was carried out by STARSEQ GmbH (Mainz, Germany). Accession numbers of plants and DNA sequences are provided in Table 1.

Sequence Editing and Alignment
Non-coding regions such as the trnL_F spacer and ITS are known to contain more substitutions than coding sequences and also carry insertions/deletions (indels). A high occurrence of indel mutations of varying lengths makes sequence alignment problematic (Small et al., 2004) [33]. Because of these problems, all alignments were done manually using BioEdit (Hall, 1999) [34], and gaps corresponding to indels were positioned to minimise the number of nucleotide differences among sequences. To facilitate alignment, most of the regions that were problematic to align were omitted, which resulted in a fragment of 415 bp for the trnL_F spacer and 535 bp for the ITS region; the final aligned matrix for rbcL was 907 bp long. A sequence alignment can be obtained from the first author on request, and sequences are deposited in GenBank (Table 1).

Phylogenetic Analyses
Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 7 (Kumar et al., 2016) [35]. Phylogenetic reconstruction was performed using maximum likelihood (ML) in MEGA v.7 with the Kimura two-parameter model (Kimura, 1980) [36]. MEGA7 was used to estimate the best substitution model: Kimura's two-parameter model corrects for multiple hits, taking into account transitional and transversional substitution rates, while assuming that the four nucleotide frequencies are the same and that rates of substitution do not vary among sites. Under a general time reversible nucleotide substitution model (Tavaré, 1986) [37], one thousand inferences were run using among-site rate variation modelled with a gamma distribution. Subsequently, 1000 non-parametric bootstraps were performed under the partitioned data mode, and bootstrap support values were drawn on the ML tree.
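To make the multiple-hit correction of the Kimura two-parameter model concrete, the following minimal sketch computes the K2P distance between two aligned sequences from the observed proportions of transitions (P) and transversions (Q), using d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function name and the handling of gaps (sites with gaps or ambiguity codes are skipped) are assumptions made for illustration; this is not the MEGA implementation.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Illustrative helper (hypothetical name). Sites where either sequence
    has a gap or an ambiguity code are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy sequences, not data from this study
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

With one transition in ten sites the corrected distance is slightly larger than the raw proportion of differences (0.1), which is the effect of accounting for unobserved multiple substitutions.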
In addition to ML analyses, Bayesian inference (BI) was carried out as implemented in MrBayes v3.2.6 (Ronquist et al., 2012) [38]. To determine the best-fit model of DNA substitution for each locus with the Akaike information criterion, MrModeltest v.2.3 [39] was used (GTR + I + G for both rbcL and trnL_F, and GTR + G for ITS). We used the GTR model in MrBayes and BEAST and K2P in MEGA 7 because we wanted to test the consistency between the models. Moreover, K2P is a nested model, that is, a special case of a more general model such as GTR.
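As an illustration of model selection with the Akaike information criterion, the sketch below ranks nested substitution models by AIC = 2k - 2 ln L. The log-likelihood values are hypothetical placeholders, not results from this study, and the parameter counts cover only the free substitution-model parameters (branch lengths are left out because they are shared by all candidate models).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Free substitution-model parameters (branch lengths excluded):
# K2P: transition/transversion ratio; GTR: 5 relative rates + 3 base frequencies;
# +G adds the gamma shape, +I adds the proportion of invariable sites.
MODEL_PARAMS = {"K2P": 1, "GTR": 8, "GTR+G": 9, "GTR+I+G": 10}


def best_model(log_likelihoods):
    """Return the model with the lowest AIC and the full score table."""
    scores = {m: aic(lnl, MODEL_PARAMS[m]) for m, lnl in log_likelihoods.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one locus, for illustration only
    example = {"K2P": -2510.0, "GTR": -2480.0, "GTR+G": -2465.0, "GTR+I+G": -2464.5}
    winner, table = best_model(example)
    for model, score in sorted(table.items(), key=lambda kv: kv[1]):
        print(f"{model:8s} AIC = {score:.1f}")
    print("best:", winner)
```

In this toy case the extra invariable-sites parameter improves the likelihood too little to offset its penalty, so the simpler GTR+G is preferred; a nested model such as K2P is only selected when the additional GTR parameters do not improve the fit enough.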
Two parallel runs of four chains of the Markov Chain Monte Carlo (MCMC) were executed for 7,000,000 generations, sampled every 1000 generations.
All parameters were stationary after 500,000 generations. All trees prior to the stationary point were discarded as "burn-in" from the compilation of posterior probabilities (PP). Strongly supported clades have posterior probabilities above 0.90. Phylogenetic trees were reconstructed for single and combined gene data and visualised using FigTree v1.3.1 [40].
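The chain settings above translate into sample counts as sketched below. The helper name is hypothetical, and the sketch assumes that the starting state (generation 0) is also written to the sample files, so the exact counts may differ by one per run from what a given program reports.

```python
def mcmc_sample_counts(generations, sample_every, burnin_generations, runs=1):
    """Count sampled, discarded and retained trees for an MCMC analysis.

    Assumes the starting state (generation 0) is sampled as well.
    """
    per_run = generations // sample_every + 1
    # samples taken strictly before the burn-in point (ceiling division)
    discarded = min(per_run, -(-burnin_generations // sample_every))
    kept = per_run - discarded
    return {
        "sampled_per_run": per_run,
        "discarded_per_run": discarded,
        "kept_per_run": kept,
        "kept_total": kept * runs,
    }


if __name__ == "__main__":
    # Settings stated in the text: 7,000,000 generations, sampling every 1000,
    # stationarity after 500,000 generations, two parallel runs
    print(mcmc_sample_counts(7_000_000, 1000, 500_000, runs=2))
```

Under these assumptions roughly 6500 trees per run, about 13,000 in total, contribute to the posterior probabilities.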

Estimating Divergence Times
The estimates of divergence time of alooids were conducted using combined chloroplast and nuclear datasets.
BEAST v1.8. was used for the Bayesian MCMC inferred analyses of the nucleotide sequence data and BEAUti (Bayesian Evolutionary Analysis Utility) v1.6.1 [41] was utilised to generate initial xml files for BEAST.
A Yule (Yule, 1924) [42] process of speciation ('a pure birth' process) was used as a tree prior for all the tree model analyses and a relaxed uncorrelated log-normal clock (Drummond et al., 2012) [43] in BEAST v1.8. was applied.
Two independent simultaneous runs of 20,000,000 generations were completed, sampling one out of every 1000 trees in BEAST v 1.8. Log files were examined for ESS values with Tracer v1.6 (Rambaut & Drummond 2009) [44], and LogCombiner v1.6.1 was used to combine the log files from the independent BEAST runs. Using TreeAnnotator v1.4.8, BEAST trees were summarised with a burn-in value of 25% and mean node heights (BEAUti, LogCombiner, and TreeAnnotator are all part of the BEAST software bundle).
The mean calibration date for the outgroup of alooids in Xanthorrhoeaceae, i.e., the genus Bulbine in asphodeloids (51 Mya), and the date for the root of Xanthorrhoeaceae (61 Mya) used in the current study were taken from Wikström et al. (2001) [45].
Using fossils as calibration points, the ages and error estimates for over 75% of all angiosperm families were calculated in that study, and these estimates were mainly compatible with the fossil record (Wikström et al., 2001) [45]. Moreover, in a recent revision of the age estimation and diversification of angiosperms using BEAST software (Bell et al., 2010) [46], the age estimates in most cases overlapped the range published by Wikström et al. (2001) [45].
Since Bayesian methods produce divergence time estimates that depend on the priors and the model parameters, we tested the impact of using different settings in BEAST. A normal distribution was also applied, with the mean value fixed at 61 Mya and a standard deviation of one.
Moreover, we compared our Bayesian estimates with those made using a penalised likelihood approach previously applied in Hyacinthaceae, which is also a family in Asparagales, as described by Buerki et al. (2012) [47].
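The effective sample size (ESS) checked in this step measures how many independent draws an autocorrelated MCMC trace is worth. The following sketch uses a simple estimator, N / (1 + 2 Σ ρ_k), with the autocorrelation sum truncated at the first non-positive lag. This is a generic textbook-style approximation, not the algorithm used by Tracer, and the simulated trace is a hypothetical stand-in for a column of a BEAST log file.

```python
import random


def effective_sample_size(trace):
    """Approximate ESS: n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first lag whose autocorrelation is <= 0.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)  # constant trace: nothing to estimate
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:
            break
        rho_sum += rho
    return min(float(n), n / (1 + 2 * rho_sum))


def ar1_trace(n, rho, seed=0):
    """Simulated AR(1) trace (hypothetical stand-in for a sampled parameter)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    for rho in (0.0, 0.9):
        ess = effective_sample_size(ar1_trace(2000, rho, seed=1))
        print(f"rho = {rho}: ESS is about {ess:.0f} of 2000 samples")
```

A strongly autocorrelated trace yields an ESS far below the number of stored samples; ESS values above roughly 200 are commonly taken as adequate, although that threshold is a convention rather than something stated in the text.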
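To show how tightly a normal calibration with a mean of 61 Mya and a standard deviation of one constrains the calibrated node, the sketch below computes the central 95% interval of that distribution with the Python standard library. The function name is an assumption for illustration, and the interval describes the prior only, not the posterior age estimate.

```python
from statistics import NormalDist


def normal_prior_interval(mean, sd, mass=0.95):
    """Central interval containing `mass` of a normal calibration prior."""
    dist = NormalDist(mu=mean, sigma=sd)
    tail = (1 - mass) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


if __name__ == "__main__":
    low, high = normal_prior_interval(61.0, 1.0)
    print(f"95% of the prior mass lies between {low:.2f} and {high:.2f} Mya")
```

About 95% of the prior mass falls within roughly two million years either side of 61 Mya, so this setting acts as a fairly narrow constraint on the root age.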

Results
Phylogenetic trees reconstructed from plastid and nuclear data showed almost identical topologies; therefore, the cpDNA and ncDNA datasets were combined. A partition homogeneity test in PAUP* 4.0 Beta was used, and the P-value (P = 0.0571) indicates congruence. Maximum likelihood and MrBayes analyses recovered almost the same phylogenetic relationships for the combined data set. An ML phylogram is shown in Figure 1, and posterior probability values from the MrBayes analysis and ML bootstrap numbers are provided at the nodes in this tree.
A BEAST analysis was used to reconstruct the phylogeny and to estimate divergence times (Figure 2). The mean ages (with 95% HPD intervals) are given for the well-supported nodes; similar estimates have also been reported in previous phylogenetic studies (Table 2). The mean coefficient of variation (σ γ) under the relaxed clock model was greater than 1. This shows that a substantial level of rate heterogeneity exists between lineages (Drummond et al., 2007) [48]. The 95% HPD intervals for the root age of the outgroups were similar to those of the study of Wikström et al. (2001) [45]. In the combined maximum clade credibility tree obtained from BEAST, the mean age of the root of the tree for non-succulent asphodeloid members of the subfamily Asphodeloideae is about 51.8 (47.0-55.5) Mya. The succulent alooids split from asphodeloids much later, around 22.7 (20.3-24.1) Mya in the Early Miocene. The results from the penalised likelihood method were similar to those from BEAST, and the BEAST analyses under the two different settings provided similar node age estimates.
The combined dataset resolved four major clades within alooids, labelled A-D in Figures 1 and 2. (D) True aloes and Lomatophyllum: The rest of the species of aloes, including some tree aloes, were found in one clade (PP = 0.90; BS = 69%), which diverged about 15.8 (11.5-19.6) Mya. Only some internal groups are possibly monophyletic. For example, most samples from Yemen and North East Africa appeared in a single clade with moderate support (PP = 0.90; BS = 59%), which diverged only 4.1 (1.5-6.3) Mya. Moreover, most Madagascan aloes were found in a poorly supported internal clade of aloes with a stem node divergence time of about 5.3 (2.6-7.2) Mya. Lomatophyllum is unlikely to be monophyletic.
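The 95% HPD (highest posterior density) intervals reported with the node ages can be understood as the narrowest interval that contains 95% of the posterior samples for a node. A minimal sketch of that idea follows; the function name is hypothetical, the sample values are toy numbers rather than output from this analysis, and the approach assumes a unimodal posterior.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples.

    Simple sorted-window approach; ties are resolved in favour of the
    window with the smallest lower bound.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples")
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


if __name__ == "__main__":
    # Toy posterior sample of a node age in Mya (not data from this study)
    toy_ages = [20.5, 21.0, 21.8, 22.3, 22.6, 22.7, 22.9, 23.4, 24.0, 27.5]
    low, high = hpd_interval(toy_ages, mass=0.9)
    print(f"90% HPD of the toy sample: {low}-{high} Mya")
```

Unlike an equal-tailed interval, the HPD interval shifts toward the denser part of a skewed posterior, which is why the mean age does not have to lie at its centre.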

Formation of Arid Habitats in Africa
A combination of the post-African I erosion cycle (5-24 Mya), the post-African II uplift event in the Pliocene and the glacial-interglacial cycles in the Pleistocene triggered a rapid speciation of many southern African plants (Siesser, 1978; Goldblatt, 1997) [50,51]. Through the Miocene (5.5-24 Mya) arid habitats became abundant in Africa (Coetzee, 1993; Axelrod and Raven, 1978) [52,53].
Table 2 note a: values represent the lower and upper bounds of the 95% HPD interval, respectively. The 95% HPD is regarded as a Bayesian representation of a confidence interval.
The mid-Miocene Climatic Optimum (ca. 15 Mya) led to the development of wide open ecosystems and the start of the radiation of the present hyperdiverse clades of the Cape flora. Moreover, the aridity of southwestern Africa increased around 14 Mya (Siesser, 1978) [50] through the development of the proto-Benguela current off the coast of SW Africa as a result of the spread of the Antarctic ice sheet. This event led to the radiation of succulent life forms (Goldblatt, 1997) [51], among them alooids.

Divergence in Aloes
An early divergence of shrubby aloes (or their ancestors) around 16.82 (15.5-22.4) Mya had already been suggested by Holland (1978) [54], who supposed that these succulents represent the original ancient lineage for other aloes during the desertification of Africa.
Despite the assumption of an early radiation in southern Africa around the Early Miocene or earlier, most modern African species have radiated in contemporary climatic conditions and evolved during the Pliocene-Pleistocene (Linder, 1992) [19]. A second Pliocene uplift event in Africa (Partridge and Maud, 2000) [62] caused extensive aridification by changing the ocean currents (Krammer et al., 2006) [63]; this was a period of rapid speciation in many clades such as Phylica (Richardson et al., 2001) [64], semi-desert ice plants (Aizoaceae) (Klak et al., 2004) [65] and Gladiolus (Rymer et al., 2010) [66]. The estimated mean crown ages of many nodes within alooids also fall in this period (around 5 Mya).
From 2.5 Mya (i.e., the Quaternary) onwards, the climatic instability associated with glacial-interglacial cycles in the Northern Hemisphere stimulated further diversification in South Africa (Cowling et al., 2009) [67]. The extensive speciation of many plants such as Kniphofia (Bakker et al., 2005) [23] and Haworthia subgen. Haworthia (Bayer, 1999) [17] falls in this period. Although the haworthias diverged in the mid-Miocene, the youngest internal nodes within alooids are found in this group. This may be considered further evidence for the recent speciation within this subgenus of Haworthia as postulated by Ramdhani and co-workers (2011) [8]. The results of Manning et al. (2014) [10] confirm an early separation of the clade.

Divergence on Madagascar
A high diversity of Aloe and Lomatophyllum species was detected on Madagascar, which represents the "hottest hotspot" of plant biodiversity in the world (Myers et al., 2000) [68]. Only grass aloes have not been found there (Reynolds, 1966) [69]. The oldest node age estimate for aloes from Madagascar (5.1 Mya) falls in the early Pliocene and is thus much later than the separation of this island, during the break-up of Gondwana, from both the mainland of Africa (165-121 Mya) and India (88-63 Mya).
The divergence time of most Madagascan aloes corresponds with that of other highly diverse plants in Madagascar such as scaly tree ferns (Janssen et al., 2008) [70] and Indian Ocean Daisy Trees (Psiadia) (Strijk et al., 2012) [71]. Due to climatic fluctuations in the Pliocene (Coetzee, 1993) [52], which resulted in habitat disintegration and the repeated contraction and expansion of limited forest refugia, it has been assumed that these plants experienced rapid, geographically parallel diversification spurts in Madagascar. Our findings in aloes support the hypothesis that the Africa-Madagascan divergence is best explained by oceanic long-distance dispersal rather than ancient vicariance.
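The reasoning behind the dispersal conclusion can be written as a simple age comparison: vicariance would require the lineage split to be about as old as the separation of the landmasses, whereas a split that clearly postdates the separation points to dispersal. The sketch below encodes this check with the ages quoted in the text. The class and function names are hypothetical, and the node interval used is the 95% HPD reported for the stem node of the clade containing most Madagascan aloes (2.6-7.2 Mya).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeRange:
    """An age range in million years ago (Mya)."""
    oldest: float
    youngest: float


def postdates_separation(node: AgeRange, separation: AgeRange) -> bool:
    """True if even the oldest plausible node age is younger than the
    youngest date of landmass separation (i.e., vicariance is ruled out)."""
    return node.oldest < separation.youngest


# Ages quoted in the text
MADAGASCAR_AFRICA = AgeRange(oldest=165.0, youngest=121.0)
MADAGASCAR_INDIA = AgeRange(oldest=88.0, youngest=63.0)
MADAGASCAN_ALOES_STEM = AgeRange(oldest=7.2, youngest=2.6)  # 95% HPD

if __name__ == "__main__":
    for name, sep in (("Africa", MADAGASCAR_AFRICA), ("India", MADAGASCAR_INDIA)):
        verdict = "dispersal" if postdates_separation(MADAGASCAN_ALOES_STEM, sep) else "vicariance possible"
        print(f"Split from {name}: {verdict}")
```

Because the upper HPD bound is more than 50 million years younger than the most recent separation event, the outcome of this comparison does not depend on the exact calibration used.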

Speciation Processes in Aloes
Despite the strong influence of climate on plant diversification, it is very unlikely that climate alone is the cause of these levels of plant diversification, especially in the Cape Flora (Goldblatt and Manning, 2002) [25]. It has been proposed that speciation and endemism in alooids are associated with many other factors. Most species of alooids occur in extremely restricted areas, which are naturally isolated, thus showing a 'mosaic distribution' (Holland, 1978) [54]. It is assumed that specific microclimatic preferences of species enhanced endemism in aloes (Kamstra, 1971) [72]. Therefore, the drivers of high endemism and speciation of alooids were mainly sought in mechanisms that lead to geographically isolated populations, and so to allopatric speciation (Schluter, 2001) [73]. Among several proposed selective forces, speciation in alooids might have been driven by a change of pollinators as well as by a slight differentiation in flowering times, permitting the survival of new forms and enabling a greater number of Aloe species to coexist (Rowley, 1976; Botes et al., 2008) [74,75].
In the Cape flora (including Aloe), parapatric sister species restricted to different territories are known to result from edaphic specialization (Goldblatt et al., 2001; Kurzweil et al., 1991) [76,77], leading to different life forms or ecomorphotypes that are described as different species today (Holland, 1978) [54].
Even though the actual biological traits that have influenced speciation of alooids are unclear at this stage, the hypothesis of contemporary speciation and ongoing hybridization (Ramdhani et al., 2011) [8] in non-monophyletic genera of alooids (such as Haworthia sensu lato) should also be considered as an explanation for the complex taxonomy and the abundance of habitat-restricted species.

Conclusions
In conclusion, although age estimations are dependent on fossil calibrations and monocots do not fossilise well, we hope that this phylogenetic study of alooids, in which we aimed to sample most sections of Aloe and many of its allied genera, will throw light on the causes of the high diversity in alooids and the timing of their speciation. We suggest that future studies focus on increased taxon sampling to obtain more comprehensive age estimates for this important group of African succulents.

Table 1. Origins of samples. List of specimens, distribution of plant samples, morphological information, number in the herbarium of the Institute of Pharmacy and Molecular Biotechnology (IPMB) and GenBank accession numbers (from our own sequence analyses) listed in this order: rbcL, trnL_F and ITS.

Table 2. Node ages with the posterior probability densities are shown for important clades and outgroups (asphodeloids).