Biogeography of Stigmaphyllon (Malpighiaceae) and a Meta-Analysis of Vascular Plant Lineages Diversified in the Brazilian Atlantic Rainforests Point to the Late Eocene Origins of This Megadiverse Biome

We investigated the biogeography of Stigmaphyllon, the second-largest lianescent genus of Malpighiaceae, as a model genus to reconstruct the age and biogeographic history of the Brazilian Atlantic Rainforest (BAF). Few studies to date have focused on the Tertiary diversification of plant lineages in the BAFs, and particularly on Stigmaphyllon. Phylogenetic relationships for 24 species of Stigmaphyllon (18 spp. from the Atlantic Forest (out of 31 spp.), three spp. from the Amazon Rainforest, two spp. from the Caatinga biome, and a single species from the Cerrado biome) were inferred based on one nuclear DNA (PHYC) and two ribosomal DNA (ETS, ITS) regions using parsimony and Bayesian methods. A time-calibrated phylogenetic tree for ancestral area reconstructions was additionally generated, coupled with a meta-analysis of vascular plant lineages diversified in the BAFs. Our results show that: (1) Stigmaphyllon is monophyletic, but its subgenera are paraphyletic; (2) the most recent common ancestor of Stigmaphyllon originated in the Brazilian Atlantic Rainforest/Caatinga region in Northeastern Brazil ca. 26.0 Mya; (3) the genus colonized the Amazon Rainforest at two different times (ca. 22.0 and 6.0 Mya), the Caatinga biome at least four other times (ca. 14.0, 9.0, 7.0, and 1.0 Mya), the Cerrado biome a single time (ca. 15.0 Mya), and the Southern Atlantic Rainforests five times (from 26.0 to 9.0 Mya); (4) at least seven expansion events connected the Brazilian Atlantic Rainforest to other biomes from 26.0 to 9.0 Mya; and (5) a single dispersal event from South America to Southeastern Asia and Oceania via Antarctica is proposed at 22.0 Mya. When compared with a meta-analysis of time-calibrated phylogenies for 64 lineages of vascular plants diversified in the Brazilian Atlantic Rainforests, our results point to a late Eocene origin for this megadiverse biome.


Introduction
Malpighiaceae Juss. is one of the most diverse plant families of Neotropical shrubs, trees, and lianas [1], with most species confined to this region [2]. Its species are easily recognized by their remarkable floral conservatism, with flowers frequently bearing a pair of oil-secreting glands at the base of sepals, petals clawed at the base, and a posterior petal differentiated from the remaining lateral four [1,2]. This family has received broad phylogenetic attention in the past few years [2][3][4][5][6][7][8], including more focused investigations on generic delimitations and the phylogenetic position of

Phylogenetic Analysis
Our combined dataset for ribosomal (ETS, ITS) and nuclear (PHYC) markers contains a total of 2283 characters, of which 1724 are constant, 257 are variable but parsimony-uninformative, and 302 are parsimony-informative. The combined analysis of nuclear and ribosomal markers provides higher support for more clades than analyses based on independent nuclear and ribosomal datasets. Overlapping peaks (paralogous copies) for ETS and ITS were not recorded during electrophoresis and sequencing. The heuristic search for the combined dataset found 15 trees (consistency index, CI = 0.80; retention index, RI = 0.70), and the strict consensus tree includes 17 moderately supported clades (bootstrap percentage, BP > 75; Figure 1). The Bayesian analysis recovered 25 well-supported to moderately supported clades (posterior probabilities, PP > 0.95 and >0.80, respectively; Figure 1).
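As a quick consistency check on the character counts reported above, the short sketch below confirms that the three character classes sum to the matrix length and expresses each as a percentage. The variable names are illustrative and are not taken from the original analysis.

```python
# Character counts reported for the combined ETS + ITS + PHYC matrix.
total_characters = 2283
counts = {
    "constant": 1724,
    "variable_uninformative": 257,
    "parsimony_informative": 302,
}

# The three classes should partition the matrix exactly.
assert sum(counts.values()) == total_characters

# Share of each class, as a percentage rounded to one decimal place.
percentages = {
    name: round(100 * n / total_characters, 1) for name, n in counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Under these counts, roughly 13% of the aligned characters carry parsimony signal.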

Ancestral Area Reconstruction and Divergence Times Estimation
The S-DEC reconstruction suggests that the most recent common ancestor (MRCA) of Stigmaphyllon was widespread in the northern portion of the BAFs ca. 26.0 Mya (Figures 2 and 3, nodes 15-16, Table 1). Both a dispersal and a vicariant event took place ca. 12.0 Mya, giving rise to the MRCA of node 17, which was distributed within the BAFs, Caatinga, and the Amazon rainforest. The lineage of node 18 colonized the Amazon rainforest for the first time, but only diversified ca. 0.5 Mya (Figures 2 and 3, node 18, Table 1).

Meta-Analysis
We identified 113 genera of ferns/lycophytes, gymnosperms, magnoliids, monocots, and eudicots comprising lineages exclusively diversified in the BAF biome (out of a total of 2224 genera currently recorded by the Flora do Brasil Project; Table 2). Only 64 of those genera have estimated diversification ages available from 34 phylogenetic studies published from 2004 to 2020 (Figure 4, Table 2). Most of those studies were published from 2015 to 2020, when molecular clocks and time-calibrated trees were already widespread in the phylogenetic literature.

Phylogenetics of Stigmaphyllon
The topology recovered from the combined dataset (i.e., ribosomal + nuclear markers) shows that the subgenera of Stigmaphyllon proposed by Anderson [10] are paraphyletic. All previous phylogenetic studies of Malpighiaceae sampled mostly Mesoamerican and Amazonian species of the genus [2,3,5,9,44], making it difficult to properly test the monophyly of its subgenera. The only BAF species of Stigmaphyllon sampled in previous studies were S. ciliatum and S. paralias. However, in all these studies S. paralias (a Brazilian Atlantic Rainforest lineage) is consistently recovered as sister to all remaining lineages of Stigmaphyllon. Additionally, this is the first time S. auriculatum, the type species of the genus, is included in a phylogenetic study. On the other hand, our results strongly corroborate the previous topologies recovered for Stigmaphyllon, with the S. paralias group recovered as the first lineage to diverge, followed by the clade comprising the S. ciliatum group + S. subg. Ryssopterys sister to a large clade consisting of species from the S. tomentosum group (core Stigmaphyllon) [2,9]. Additional species sampling, allied to a thorough morphological study of Stigmaphyllon from a phylogenetic perspective, is urgently required to shed light on its infrageneric classification.
Over these last 26.0 Mya, the lineages of Stigmaphyllon went through 13 dispersals and four vicariance events (VE). Notably, VEs in this genus were always followed by a biome shifting event (BSE) from the BAFs to another tropical biome. The first VE followed by a BSE in Stigmaphyllon occurred ca. 15.0 Mya in the MRCA of S. urenifolium, from the BAFs to the Brazilian Cerrado. Several plant lineages present the same diversification pattern, with older ancestors arising in the BAFs and colonizing the Cerrado biome ca. 15 Mya, such as clade 5 of Amphilophium Kunth (Bignoniaceae) [52], Astraea Klotzsch (Euphorbiaceae) [38], Dolichandra Cham. (Bignoniaceae) [33], Fridericia Mart. (Bignoniaceae) [52], and Xylophragma Sprague (Bignoniaceae) [53]. Dispersal events from forest to open habitats are one of the main factors explaining richness in Neotropical biodiversity [16]. Even though those dispersal events occurred mostly among forested biomes in Stigmaphyllon, its single Cerrado lineage seems to point to an older diversification of this biome, such as in those from Vochysiaceae (20.0-15.0 Mya) [51]. Few species of Stigmaphyllon successfully colonized the Cerrado biome, such as S. jobertii, S. occidentale, S. tomentosum, and S. urenifolium. Future studies sampling those species in the molecular phylogeny of Stigmaphyllon are crucial to test whether the genus colonized the Cerrado biome more than once. However, several studies seem to present the same pattern we report, of forest lineages occupying and diversifying in the Cerrado biome [16], with the opposite rarely recorded in biogeographic studies. We hypothesize that Cerrado lineages colonizing forested biomes might be rare due to the recent diversification of several plant lineages of Neotropical savannas [54].

Divergence Times and Biogeography of Stigmaphyllon
The second VE followed by a BSE in Stigmaphyllon occurred ca. 12.0 Mya in the MRCA of the S. sinuatum group, from the BAFs to the Amazon rainforest. The same pattern of diversification was recorded in other plant lineages, with Atlantic rainforest MRCAs colonizing the Amazon rainforest at this time, such as Eugenia L. clade G [55]. At least 50 dispersal events occurred between the Atlantic and Amazon rainforests [16]. The number of lineage exchanges between these biomes fluctuated over the last 60.0 Mya, with its previous increase starting ca. 12.0 Mya and peaking ca. 6.0-3.0 Mya [16], corroborating our results with Stigmaphyllon. The fourth VE followed by a BSE in the genus occurred ca. 6.30 Mya in the MRCA of the S. finlayanum group, from the BAFs to the Amazon rainforest. The same pattern of diversification was recorded in other plant lineages, with Atlantic rainforest MRCAs colonizing the Amazon rainforest at this time, such as clade 3 of Amphilophium Kunth (Bignoniaceae) [52] and the lineages in the abovementioned study by Antonelli et al. [16].
The third VE followed by a BSE in Stigmaphyllon occurred ca. 10.0 Mya in the MRCA of S. subg. Ryssopterys + S. ciliatum group, from the BAFs to the Asian rainforests. This lineage diversified ca. 22.0 Mya, so we hypothesize that its MRCA was widely distributed among dune vegetation in the Atlantic and Asian rainforests via the Antarctic route. Antarctica's glaciation process started ca. 34.0 Mya, during the Eocene/Oligocene transition, in high-altitude regions of this continent [56]. From 34.0 to 12.0 Mya, Antarctica had intermittent ice sheet coverage, leaving intact Tertiary pockets of pre-glaciation fauna and flora that only went completely extinct ca. 12.0 Mya with the formation of permanent ice sheets on this continent [57], the same confidence interval recovered in our study for the MRCA of S. subg. Ryssopterys + S. ciliatum group. The same pattern of vicariance is observed in the Drymophila (Australian) + Luzuriaga (South American) clade (Alstroemeriaceae), in which the MRCA split ca. 23.0 Mya, giving rise to these genera from ca. 10.0 to 4.0 Mya [58].
On the other hand, 14 dispersal events (DE) occurred within Stigmaphyllon, mostly within the North-South corridors of the BAFs, and at least three DEs from the BAFs to the Caatinga biome. Of these three DEs, the first occurred ca. 14.0 Mya in the MRCA of the S. paralias group, the second occurred ca. 9.0 Mya in the MRCA of the S. saxicola group, and the third occurred ca. 1.0 Mya in the MRCA of the S. auriculatum group. Of the remaining DEs that happened within the BAFs, at least five occurred from North to South ca. 22.0, 17.0, 12.0, 11.0, and 9.0 Mya. The same pattern, with Northern BAF lineages colonizing Southern portions of this biome, is observed in Aechmea (Bromeliaceae) [59]. However, the North-South dispersals in Aechmea started only ca. 6.0 Mya [59], being much younger than those in Stigmaphyllon.

Time and Diversification of the Atlantic Rainforest
Several implications for understanding the BAFs' historical biogeography might be postulated from the biogeographical study of Stigmaphyllon. Our results suggest a late-Eocene origin for these forests, which seems to be partially corroborated by the literature. Until the late Paleocene and early Eocene, the Earth's climate was mostly warmer and more humid than today, suggesting that South America was probably covered by continuous rainforests [60,61]. When comparing the mean ages recovered in published studies for several lineages of ferns, gymnosperms, and angiosperms diversified in the BAFs, we bring new evidence that vascular plants started colonizing these forests over the last 60.0 Mya (Figure 4; Table 2), corroborating the abovementioned authors. The oldest lineage to occupy the BAFs might have been the genus Barnebya W.R.Anderson and B.Gates ca. 60 Mya (Malpighiaceae, Eudicots), with the diversification of most Eudicot lineages in these forests occurring from 40.0 to 15.0 Mya (Figure 4; Table 2). During the late Eocene and Oligocene, global episodes of cooling and dryness favored the expansion of grasslands in the southern and central regions of the continent [60,62], which culminated in the formation of a diagonal belt of more open and drier biomes (also known as the "dry diagonal") [63]. The formation of the dry diagonal marked the formation of the Atlantic forest in the east and Amazonia in the west [64]. On the other hand, the colonization and diversification of Magnoliid lineages took place from 18.0 to 3.0 Mya (Figure 4; Table 2), followed by gymnosperm lineages that diversified from 15.0 to 11.0 Mya, and finally by monocot lineages diversifying from 12.0 to 3.0 Mya in these rainforests (Figure 4; Table 2). Fossil records and paleoclimate studies suggest that the BAFs and the Amazon rainforests were re-connected multiple times in the Miocene and Pliocene [64].
Mean ages for the colonization and diversification of ferns in the BAFs are still incipient, with a single study on the family Cyatheaceae indicating its initial diversification in these forests at 30.0 Mya (Figure 4; Table 2). Additionally, of the 113 BAF lineages presented in Table 2, only 64 have mean ages based on time-calibrated phylogenies available in the literature. At least 18 families of angiosperms (Acanthaceae, Amaryllidaceae, Apocynaceae, Araceae, Asparagaceae, Asteraceae, Bignoniaceae, Erythroxylaceae, Fabaceae, Gentianaceae, Lauraceae, Marantaceae, Moraceae, Orchidaceae, Poaceae, Rubiaceae, Rutaceae, and Sapindaceae) with lineages diversified in the Atlantic forest still lack published time-calibrated phylogenies.
Another factor worth mentioning that might have played an important role in the diversification of the BAFs was the uplift of the Serra do Mar and Serra da Mantiqueira Mountain Ranges in Eastern Brazil. Those mountains were previously thought to have uplifted around 120.0 Mya, but geological studies from the past decade have pointed to a more recent uplift age for these mountains, from 60.0 to 30.0 Mya [65]. The Tertiary uplift of these mountains in Eastern Brazil was a direct result of the Andean uplift and coincides with our results for the colonization of the BAFs by 64 lineages of vascular plants [65].

Taxon Sampling and Plant Material
We sampled 24 species of Stigmaphyllon (ca. 1/4 of the genus' Neotropical diversity: 18 spp. from the Brazilian Atlantic Rainforest (out of 31 spp.), three spp. from the Amazon Rainforest, two spp. from the Caatinga, and one sp. from the Cerrado biome), including outgroups Bronwenia W.R.Anderson and C.C.Davis and Diplopterys A.Juss. From this total, 23 species represent S. subg. Stigmaphyllon and a single species [Stigmaphyllon timoriense (DC) C.E.Anderson] represents S. subg. Ryssopterys (Table 3). For DNA extraction, we used mainly field-prepared silica-dried leaves (12-80 mg) and herbarium specimens as necessary (Table 3).

Molecular Protocols
Genomic DNA was extracted using the 2 × CTAB protocol, modified from Doyle and Doyle [66]. Three DNA regions (nuclear PHYC gene, and the ribosomal external and internal transcribed spacers (ETS and ITS)) were selected based on their variability in previous Malpighiaceae studies [2,5,6,8,67]. Protocols to amplify and sequence ETS and ITS followed Almeida et al. [8]. For amplification, we used the TopTaq (Qiagen) mix following the manufacturer's standard protocol, with the addition of betaine (1.0 M final concentration) and 2% DMSO for the ETS region. PCR products were purified using PEG (polyethylene glycol) 11% and sequenced directly with the same primers used for PCR amplification.
Sequence electropherograms were produced on an automatic sequencer (ABI 3130XL genetic analyzer) using the Big Dye Terminator 3.1 kit (Applied Biosystems). Additional sequences for PHYC were retrieved from GenBank (Table 3). Newly generated sequences were edited using the Geneious software [68], and all datasets were aligned using Muscle [69], with subsequent adjustments in the preliminary matrices made by eye. The complete data matrices are available at TreeBase (accession number S21218). This study was authorized by the Genetic Heritage and Associated Traditional Knowledge Management National System of Brazil (SISGEN #A3B8F19).

Phylogenetic Analysis
Analyses were rooted in Bronwenia, according to Davis and Anderson [2]. Individual analyses for each marker were performed, and since no significant incongruences were found, analyses of combined matrices (i.e., nuclear + ribosomal markers) were performed using maximum parsimony (MP) conducted with PAUP 4.0b10a [70]. A heuristic search was performed using tree-bisection-reconnection (TBR) swapping and 1000 random taxon-addition sequence replicates, with TBR swapping limited to 15 trees per replicate to prevent extensive searches (swapping) in suboptimal islands, followed by TBR on the resulting trees with a limit of 1000 trees. In all analyses, the characters were equally weighted and unordered [71]. Relative support for individual nodes was assessed using non-parametric bootstrapping [72], with 1000 bootstrap pseudoreplicates, TBR swapping, simple taxon addition, and a limit of 15 trees per replicate.
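To illustrate what a non-parametric bootstrap pseudoreplicate is, the sketch below resamples the columns of a small alignment with replacement. It is only a conceptual illustration: the toy alignment, taxon labels, and function name are hypothetical, and the study itself relied on the bootstrap routine built into PAUP rather than on custom code.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns resampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, allowing repeats.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }

# Hypothetical toy alignment (three taxa, six aligned sites).
toy = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTTC",
    "taxon_C": "ACCTTG",
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

In a real analysis, a tree search is run on each pseudoreplicate, and the bootstrap percentage of a clade is the share of pseudoreplicate trees in which that clade appears.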
For the model-based approach, we selected the model GTR + I + G using hierarchical likelihood ratio tests (HLRT) in jModelTest 2 [73]. A Bayesian analysis (BA) was conducted with mixed models and unlinked parameters, using MrBayes 3.1.2 [74]. The Markov chain Monte Carlo (MCMC) analysis was performed using two simultaneous independent runs with four chains each (one cold and three heated), saving one tree every 1000 generations for a total of ten million generations. We excluded as 'burn-in' the trees from the first two million generations, and tree distributions were checked for a stationary phase of likelihood. The posterior probabilities (PP) of clades were based on the majority-rule consensus produced with the remaining trees in MrBayes 3.1.2 [74].
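The MCMC settings above imply a specific number of sampled and retained trees. The sketch below works through that arithmetic; the variable names are illustrative, and the count ignores the initial (generation 0) state that some programs also write to the tree file.

```python
# MCMC settings as described for the MrBayes analysis.
n_runs = 2
generations = 10_000_000
sample_every = 1_000
burnin_generations = 2_000_000

# Trees sampled per run (ignoring the generation-0 state).
sampled_per_run = generations // sample_every
burnin_per_run = burnin_generations // sample_every
retained_per_run = sampled_per_run - burnin_per_run

# Trees pooled across both runs for the majority-rule consensus.
retained_total = n_runs * retained_per_run
burnin_fraction = burnin_per_run / sampled_per_run

print(sampled_per_run, burnin_per_run, retained_per_run, retained_total)
print(f"burn-in fraction: {burnin_fraction:.0%}")
```

Under these assumptions, each run contributes 8000 post-burn-in trees, so posterior probabilities are summarized from 16,000 trees.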

Calibration
Estimates were conducted based on a simplified ultrametric Bayesian combined tree generated with BEAST 1.8.4 [75]. This analysis used a relaxed uncorrelated lognormal clock and a Yule process speciation prior to infer trees. The calibration parameters were based on previous estimates derived from a comprehensive fossil-calibrated study of the whole Malpighiaceae [44,76]. We opted for calibrating at the root, using a normal prior with a mean of 40.0 Mya (representing the age estimated for the MRCA of the Stigmaphylloid clade) and a standard deviation of 1.0 [44,67,76]. Two separate and convergent runs were conducted, each with 10,000,000 generations, sampling every 1000 steps, and discarding 2000 trees as burn-in. We checked for ESS values higher than 400 for all parameters in Tracer 1.6 [77]. Tree topology was assessed using TreeAnnotator and FigTree 1.4.0 [78].
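To make the root calibration concrete, the sketch below computes the central 95% interval implied by a normal prior with mean 40.0 and standard deviation 1.0. It assumes the standard deviation is expressed in million years, which the text does not state explicitly, and it uses only Python's standard library rather than BEAST itself.

```python
from statistics import NormalDist

# Root-age prior as described: mean 40.0 Mya, standard deviation 1.0
# (assumed here to be in million years).
root_prior = NormalDist(mu=40.0, sigma=1.0)

# Central 95% interval of the prior.
lower = root_prior.inv_cdf(0.025)
upper = root_prior.inv_cdf(0.975)

print(f"95% prior interval for the root age: {lower:.2f}-{upper:.2f} Mya")
```

Under that assumption, the prior places about 95% of its mass between roughly 38 and 42 Mya, so the root age is tightly constrained by the secondary calibration.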

Meta-Analysis
Ages of BAF lineages of vascular plants (ferns/lycophytes, gymnosperms, magnoliids, monocots, and eudicots) were compiled based on the phylogenetic literature and distribution data from Flora do Brasil [12] and Plants of the World Online [17]. Online repositories such as GBIF, BIEN, and Species Link were not used, since specimen identifications in them are not usually updated or were not made by a Malpighiaceae specialist. Additionally, the main problem in using the above-mentioned repositories is that only Flora do Brasil presents reliable information on the biome distribution of species sampled in our study. We considered a genus or lineage within a genus diversified in the BAF when at least 50% of its total number of species occurred in this biome. Data from BAF lineages of vascular plant species are presented in Table 2, alongside estimated ages and references. A boxplot is also presented in Figure 4, showing the diversification of vascular plants through time in the BAF based on data presented in Table 2.
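The 50% criterion above can be expressed as a simple rule. The sketch below is a minimal illustration of that rule; the function name and the example lineages with their species counts are hypothetical and are not data from Table 2.

```python
def is_baf_diversified(species_in_baf, species_total, threshold=0.5):
    """True when at least `threshold` of a lineage's species occur in the BAF."""
    if species_total <= 0:
        raise ValueError("species_total must be positive")
    if not 0 <= species_in_baf <= species_total:
        raise ValueError("species_in_baf must lie between 0 and species_total")
    return species_in_baf / species_total >= threshold

# Hypothetical lineages: (species occurring in the BAF, total species).
example_lineages = {
    "lineage_A": (12, 20),  # 60% of species in the BAF
    "lineage_B": (5, 10),   # exactly 50%, still counts ("at least 50%")
    "lineage_C": (3, 15),   # 20% of species in the BAF
}

classified = {
    name: is_baf_diversified(in_baf, total)
    for name, (in_baf, total) in example_lineages.items()
}
print(classified)
```

The comparison is inclusive, so a lineage with exactly half of its species in the biome is treated as BAF-diversified, matching the "at least 50%" wording.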

Conclusions
Even though dispersal events from forested to open habitats have been recently identified as one of the main factors explaining richness in Neotropical biodiversity [16], the same pattern was not recovered for Stigmaphyllon. A late-Eocene origin for this genus is suggested, with its MRCA originating in the Northeastern BAFs and several dispersal events taking place to other Neo- and Paleotropical biomes from 22.0 to 1.0 Mya, alongside several dispersals from Northern to Southern portions of the BAFs. When comparing our results with published divergence times for BAFs' vascular plant lineages, a late-Eocene origin for these forests was evidenced. The immense gap in time-calibrated phylogenies focusing on BAFs' vascular plant lineages is still the most significant impediment to a more comprehensive understanding of the plant diversification timeframe in these forests. Additionally, the recently evidenced Tertiary uplift of the Serra do Mar and Serra da Mantiqueira Mountain Ranges might also have played an important role in the diversification of the megadiverse BAFs.