Abstract
The rate of continuous trait evolution plays a crucial role in understanding the pace at which species evolve. Various statistical models have been developed to estimate the rates of continuous trait evolution for a group of related species evolving along a phylogenetic tree. Existing models often assume the independence of the rate parameters; however, this assumption may not account for scenarios where the rate of continuous trait evolution correlates with its evolutionary history. We propose using the autoregressive–moving-average (ARMA) model for modeling the rate of continuous trait evolution along the tree, hypothesizing that rates between two successive generations (ancestor–descendant) are time-dependent and correlated along the tree. We refer to the phylogenetic rate-of-continuous-trait-evolution ARMA(p,q) model in our framework as PhyRateARMA. Our algorithm begins by utilizing the tree and trait data to estimate the rates on each branch, followed by implementing the ARMA process to infer the relationships between successive rates. We apply our method to analyze a primate body mass dataset and a plant genome size dataset and test for the autoregressive effect of the rates of continuous evolution along the tree.
Keywords:
autoregressive–moving-average model; evolutionary rates; phylogenetic comparative method; Brownian motion; trait evolution
MSC:
62P10; 92-10; 60H35
1. Introduction
In evolution, studying the heritable characteristics of the biological population is helpful for understanding the diversity between species on our planet Earth [1]. Macroevolution is expected to occur when selection acts on a trait that has a heritable basis of phenotypic variation. During the evolutionary process, speciation results in new species, and the comparison of traits (e.g., height, weight, size, etc.) among a group of related species can be performed by studying the speed of changes in their characteristics over successive generations [2].
Within a clade, one subgroup of species may evolve at a faster rate and show a larger variation in a trait, whereas another subgroup evolves at a relatively lower rate and shows a moderate variation of the trait. When evolutionary processes such as natural selection (including sexual selection) and genetic drift act on this variation, certain characteristics become more common or rare within a population [3]. For example, Darwin's finches are a group of about 18 species of dull-colored passerine birds on the Galápagos islands [4]. They are well known for their remarkable diversity in the form, size, and function of the beak, which is highly adaptable to different food sources [5]. Another example is angiosperms, which survive and thrive successfully on our planet. The evolution of fruits is one of their most important characteristics, as fruits not only provide a food source for other species, but also protect seeds and contribute to seed dispersal [6]. The survival and success of fruits require adaptation to their environment. For example, while dragon fruit (Selenicereus) endures temperatures of up to 40 °C (104 °F), watermelon (Citrullus lanatus) needs temperatures higher than about 25 °C (77 °F) to thrive. Studying the rates of continuous evolution of traits such as temperature tolerance would help shed light on the evolution of angiosperms and their ecological implications.
The rate of continuous evolution measures the change in an evolutionary lineage over time and can be defined as the ratio of the character displacement to the time interval over which it occurs. The rate of change between two samples is defined using three quantities [7]: the proportional difference between the sample means, the pooled standard deviation of the samples, and the time interval between the samples. For example, suppose that a character has been measured twice, at times $t_1$ and $t_2$, where $t_1$ and $t_2$ are expressed as the time before the present in millions of years. The time interval between the two samples can be written as $\Delta t = t_1 - t_2$, which equals 1 when the samples are taken 1 million years apart. The average value of the character is defined as $x_1$ in the earlier sample and $x_2$ in the later sample. Let $\ln$ be the natural logarithm, taking into account $\ln x_1$ and $\ln x_2$. Then, the evolutionary rate ($r$) can be defined as in Equation (1):
$$r = \frac{\ln x_2 - \ln x_1}{\Delta t}. \tag{1}$$
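As a small numerical illustration of the rate defined above (difference of log sample means divided by the elapsed time), the following sketch may help. The function name `evolutionary_rate` and the sample values are hypothetical and chosen only for illustration; they do not come from any dataset in this study.

```python
import math


def evolutionary_rate(x1, x2, t1, t2):
    """Change in the log of the sample mean per unit time.

    x1, x2: sample means at the earlier and later sampling times.
    t1, t2: sampling times expressed as time before present (t1 > t2),
            in millions of years.
    """
    dt = t1 - t2
    if dt <= 0:
        raise ValueError("t1 must be older (larger) than t2")
    return (math.log(x2) - math.log(x1)) / dt


# Hypothetical example: the mean grows from 10 to 12 units over 1 million years.
r = evolutionary_rate(x1=10.0, x2=12.0, t1=2.0, t2=1.0)
print(round(r, 4))
```

A positive value indicates an increase in the character over the interval, and a negative value a decrease.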
Next, multiplying by the time difference $\Delta t$ on both sides of Equation (1) produces the character difference within a time unit, $\ln x_2 - \ln x_1 = r\,\Delta t$. Conceptually, consider that the character change occurs in infinitesimal time and denote the character displacement by $dy$; then, we have the differential equation $dy = r\,dt$. Given $y(0) = y_0$, the solution is $y(t) = y_0 + rt$, which shows that $y(t)$ increases with time t. Unfortunately, this may not be an appropriate model for describing character change from an evolutionary perspective. Instead, one may consider that the variation in the character change follows a certain dynamic. For example, if one considers that the variation in the character change is proportional to time, then a stochastic term $\sigma\,dW_t$ can be introduced, where $W_t$ is a Brownian motion (BM) process with independent Gaussian increments, and $dW_t$ is a normally distributed random variable with mean 0 and variance $dt$ (i.e., $dW_t \sim N(0, dt)$). Thus, the displacement of the continuous trait variable ($dy_t$) solves the stochastic differential equation shown in Equation (2):
$$dy_t = \sigma\,dW_t. \tag{2}$$
Given $y_0$, integrating both sides of Equation (2) gives
$$y_t = y_0 + \sigma W_t, \tag{3}$$
where $W_0 = 0$, and $W_t \sim N(0, t)$.
Given the initial value $y_0$, Equation (3) describes the dynamic of the trait variable $y_t$ at time t given the rate parameter $\sigma$. Figure 1 presents one hundred trajectories generated using a smaller rate (bottom left) and a larger rate (bottom right), respectively. It can be seen that while the smaller rate yields a narrower range of character values, the larger rate yields a wider range of character values.
Figure 1.
Trait values under Brownian motion dynamics with two different rates (left: smaller $\sigma$; right: larger $\sigma$). Histograms in the top row represent the distribution of final positions, showing greater spread with higher $\sigma$. The bottom row illustrates 100 trajectories in red over time and the mean trajectory in blue, with higher $\sigma$ resulting in more dispersed paths. The time span is set to 100.
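The relationship between the rate parameter and the spread of the trajectories can be checked with a short simulation. The sketch below assumes a Brownian motion of the form "initial value plus rate times a standard Wiener process", discretized with Gaussian increments; the two rate values (0.1 and 0.5), the number of paths, and the function names are illustrative assumptions, not the settings used for Figure 1.

```python
import math
import random


def simulate_bm(y0, sigma, total_time, n_steps, rng):
    """One discretized Brownian motion path: y_t = y_0 + sigma * W_t."""
    dt = total_time / n_steps
    path = [y0]
    for _ in range(n_steps):
        # Each increment of W is Gaussian with mean 0 and variance dt.
        path.append(path[-1] + sigma * rng.gauss(0.0, math.sqrt(dt)))
    return path


def endpoint_variance(sigma, n_paths=2000, total_time=100.0, n_steps=100, seed=1):
    """Sample variance of the final positions over many simulated paths."""
    rng = random.Random(seed)
    ends = [simulate_bm(0.0, sigma, total_time, n_steps, rng)[-1]
            for _ in range(n_paths)]
    mean = sum(ends) / n_paths
    return sum((e - mean) ** 2 for e in ends) / (n_paths - 1)


# Hypothetical rates; theory gives Var(y_T) = sigma^2 * T.
var_small = endpoint_variance(sigma=0.1)   # theoretical value 1.0
var_large = endpoint_variance(sigma=0.5)   # theoretical value 25.0
print(var_small, var_large)
```

Under these assumptions the endpoint variance grows with the square of the rate, which is the widening of the histograms described in the caption.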
For a group of n related species, denote $y_i$ as the trait variable for the ith species. Then, we can apply the Brownian motion with the rate parameter $\sigma$ to explore the dynamic of the trait along the evolutionary history using a phylogenetic tree that represents the relatedness between species. In particular, the estimation of the evolution rate can be performed using phylogenetic comparative methods (PCMs) [8,9,10].
Several models build on Equation (3), motivated by the observation that the constant-rate BM model may not adequately describe evolution in many scenarios [11,12,13,14,15]. Such models assume that the evolution of a species changes over time, where the rates of continuous trait evolution can be modeled as either constants or stochastic variables along time (branch lengths). The trait variable adopts the following dynamics:
$$dy_t = \sigma(t)\,dW_t, \tag{4}$$
where $\sigma(t)$ can be constant (i.e., $\sigma(t) = \sigma$) [16], piecewise constant (i.e., $\sigma(t) = \sigma_k$ for $t \in R_k$, where $R_1$ and $R_2$ are the successive time regimes) [11] or a random variable modeled by another pertinent process (i.e., $\sigma(t) \sim F$, where $F$ is a distribution function for the stochastic variable) [17,18].
A hypothetical tree of four species and the corresponding simulation of the trajectories using Brownian motion along a rooted phylogenetic tree under two different rates is shown in Figure 2.
Figure 2.
Trajectories of trait evolution for 4 species along a rooted phylogenetic tree. Middle panel: A rooted phylogenetic tree with four taxa. Left panel: A set of four dependent trajectories along the tree using a single rate (the same $\sigma$ on all branches). Right panel: A set of four dependent trajectories along the tree using two rates (one $\sigma$ on the blue branches and another on the red branches). The root state is denoted as a parameter of interest (analogous to $y_0$).
These models have been broadly applied in many studies. For example, in the evolution of the morphology of the world's largest flowers (Rafflesiaceae: up to 1 m in diameter), the authors of [19] found that the enormous flowers evolved from ancestors with tiny flowers. In the study of the evolution of the size of the plant genome, the authors of [20] found that the woody lineages had a stochastic motion rate that was nearly five times slower than the rate of the herbaceous lineages. Although the existing framework has produced several rate models, none considers a scenario in which the rate is treated as a time-correlated stochastic variable, which could potentially enhance the study of rate evolution. This highlights the need for our work to apply a time series model [21,22]. Specifically, this approach aims to answer the following key question:
Are the evolutionary rates of biological traits statistically independent, or are they phylogenetically serially autocorrelated?
Note that in modeling the rates of continuous evolution, it is essential to consider their dynamic nature, as the rate can fluctuate, increase, or decrease over time as well as vary gradually and stochastically across a clade, rather than remain constant [18,23,24]. Considering that the implementation of time-correlated rate evolution could provide an alternative way to reveal embedded information about species evolution, in this work we expand the model in Equation (4) within the framework of correlated rate evolution ($\sigma_t = f(\sigma_s; \Theta)$ for $s < t$, where $\Theta$ is a parameter vector) to model the trait evolution for phylogenetic comparative analysis. In particular, we use the autoregressive–moving-average (ARMA) time series model that has been widely applied in econometrics to model the rate parameter [25,26]. The description of the methods can be found in Section 2. The simulations are detailed in Section 3. Empirical analyses are presented in Section 4. The discussions and conclusions are covered in Section 5.
2. Methods
2.1. Trait Evolution with Time-Correlated Rate
We describe our procedure as follows. In Section 2.1.1, we adopt phylogenetic ridge regression [27] to perform the analysis using the phylogenetic tree and trait data as input to obtain rate estimates. In Section 2.1.2, we perform a phylogenetic rate ARMA(p,q) regression, where the rate estimates are incorporated along the tree using a traversal algorithm, treating the data as time series.
The combination of ridge regression and ARMA models allows us to address the challenges of estimating evolutionary rates by leveraging their respective strengths. Specifically, ridge regression is employed to obtain reliable and stable estimates of evolutionary rates, even in scenarios where multicollinearity may arise in the data. Meanwhile, the ARMA model is used to capture the series-correlated rate dynamics, incorporating the dependency structure dictated by the phylogenetic tree topology.
2.1.1. Step 1: Phylogenetic Ridge Regression
Given the tree with its topology and branch lengths, one assumes that the observed trait values are a linear combination of past values, the time elapsed, and the rates of continuous trait evolution. To illustrate, we use the case of a 3-taxon tree shown in Figure 3, where the extant species A, B, and C, and the fossil species D, E, and O have trait values of $y_A$, $y_B$, $y_C$, $y_D$, $y_E$, and $y_O$. Writing $t_X$ and $r_X$ for the length of, and the rate on, the branch leading to node $X$, these trait values can be expressed as a linear combination of the time elapsed and the rates of continuous evolution, as shown in Equation (5):
$$y_E = y_O + r_E t_E,\quad y_D = y_E + r_D t_D,\quad y_C = y_E + r_C t_C,\quad y_A = y_D + r_A t_A,\quad y_B = y_D + r_B t_B. \tag{5}$$
Figure 3.
A rooted phylogenetic tree of three taxa with tip nodes A, B, and C; internal nodes D and E; and root node O, where the branch lengths are $t_A$, $t_B$, $t_C$, $t_D$, and $t_E$, and the rate variables are $r_A$, $r_B$, $r_C$, $r_D$, and $r_E$.
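One can formulate these relations into the system of linear equations in Equation (6):
$$\begin{aligned} y_A &= y_O + r_E t_E + r_D t_D + r_A t_A,\\ y_B &= y_O + r_E t_E + r_D t_D + r_B t_B,\\ y_C &= y_O + r_E t_E + r_C t_C. \end{aligned} \tag{6}$$
Given the empirical data, where the tip values are $y_A$, $y_B$, and $y_C$, and the ancestral values $y_D$ and $y_E$ may be known through fossil records or remain unknown, and given the phylogenetic tree with the known branch-length set $\{t_A, t_B, t_C, t_D, t_E\}$, let $\mathbf{y} = (y_A - y_O,\; y_B - y_O,\; y_C - y_O)^\top$ collect the tip states relative to the root; then the tip states can be written in matrix form as Equation (7):
$$\mathbf{y} = \begin{pmatrix} t_A & 0 & 0 & t_D & t_E\\ 0 & t_B & 0 & t_D & t_E\\ 0 & 0 & t_C & 0 & t_E \end{pmatrix} \begin{pmatrix} r_A\\ r_B\\ r_C\\ r_D\\ r_E \end{pmatrix} = L\mathbf{r}, \tag{7}$$
where L is the design matrix corresponding to species with the associated branch from the root of the tree, as shown in Figure 3, and $\mathbf{r}$ represents the rate vector. Equation (7) can be reformulated and written as the regression model in Equation (8):
$$\mathbf{y} = L\mathbf{r} + \mathbf{e}, \tag{8}$$
where the estimates for the rate vector can be obtained [27] in Equation (9):
$$\hat{\mathbf{r}} = \arg\min_{\mathbf{r}} \left\{ \lVert \mathbf{y} - L\mathbf{r} \rVert^2 + \lambda \lVert \mathbf{r} \rVert^2 \right\}, \tag{9}$$
which can be written as
$$\hat{\mathbf{r}} = (L^\top L + \lambda I)^{-1} L^\top \mathbf{y}. \tag{10}$$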
In general, given the tip data and tree topology of n taxa with the branch-length set $\{t_i : i \in \mathcal{I}\}$, where $\mathcal{I}$ is the node index set with the corresponding ancestral–descendant relationship delineated by the tree topology, we use the R package RRphylo (version 2.8.1) [28] to estimate the phenotypic evolutionary rates using Equation (10), adopting leave-one-out cross-validation (LOOCV) to search for the optimal penalty $\lambda$. In this method, each observation is removed one at a time, and the remaining data are used to make predictions. The error for the excluded observation is calculated, and this process is repeated for all observations. The total LOOCV error is computed for various $\lambda$ values, and the $\lambda$ with the smallest error is selected as the best regularization parameter, improving model performance [27].
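The ridge step can be sketched on the 3-taxon tree of Figure 3. The code below is a minimal illustration in plain Python, not the RRphylo implementation: it builds a design matrix whose rows are tips and whose columns are branches (each entry is the branch length when the branch lies on the root-to-tip path), solves the standard ridge normal equations, and picks the penalty from a small grid by leave-one-out error. The branch lengths, tip values, penalty grid, and all function names are hypothetical.

```python
def design_matrix(parent, branch_length, tips, branches):
    """Rows = tips, columns = branches (named by their descendant node)."""
    rows = []
    for tip in tips:
        on_path, node = set(), tip
        while node in parent:          # climb until the root (no parent entry)
            on_path.add(node)
            node = parent[node]
        rows.append([branch_length[b] if b in on_path else 0.0 for b in branches])
    return rows


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_rates(L, y, lam):
    """Ridge estimate: solve (L'L + lam I) r = L'y."""
    k = len(L[0])
    gram = [[sum(row[i] * row[j] for row in L) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * yi for row, yi in zip(L, y)) for i in range(k)]
    return solve(gram, rhs)


def loocv_error(L, y, lam):
    """Total squared prediction error when each tip is left out in turn."""
    err = 0.0
    for i in range(len(y)):
        r = ridge_rates(L[:i] + L[i + 1:], y[:i] + y[i + 1:], lam)
        err += (y[i] - sum(l * ri for l, ri in zip(L[i], r))) ** 2
    return err


# Hypothetical 3-taxon example: O -> E -> (D -> (A, B), C).
parent = {"A": "D", "B": "D", "D": "E", "C": "E", "E": "O"}
branch_length = {"A": 1.0, "B": 1.0, "C": 2.0, "D": 1.0, "E": 0.5}
branches = ["A", "B", "C", "D", "E"]
L = design_matrix(parent, branch_length, ["A", "B", "C"], branches)
y = [1.2, 0.8, -0.5]                   # tip values relative to the root
grid = [0.01, 0.1, 1.0, 10.0]          # candidate penalties
best_lam = min(grid, key=lambda lam: loocv_error(L, y, lam))
rates = ridge_rates(L, y, best_lam)
print(best_lam, rates)
```

With more branches than tips the unpenalized system is underdetermined, which is why a positive penalty is needed for a unique solution; larger penalties shrink the estimated rates toward zero.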
2.1.2. Step 2: Phylogenetic ARMA(p,q) Rates Model for Continuous Trait Evolution
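Next, we implemented the ARMA model for studying the rates $\mathbf{r}$. Given the tree topology of n taxa, with tip node and internal node index set $\mathcal{I}$, and the corresponding branch-length set $\{t_i : i \in \mathcal{I}\}$ with the ancestral–descendant relationship delineated by the tree topology, by utilizing phylogenetic ridge regression in Equation (10), one can obtain a set of estimated rates $\{\hat{r}_i : i \in \mathcal{I}\}$, where index $i$ corresponds to the node index. Then, the phylogenetic ARMA(p,q) rates model for continuous trait evolution, defined by a mixture of phylogenetic autoregressive (AR) and phylogenetic moving average (MA) models, is given by Equation (11):
$$r_\tau = \sum_{k=1}^{p} \phi_k r_{\tau-k} + \sum_{j=1}^{q} \theta_j \epsilon_{\tau-j} + \epsilon_\tau, \tag{11}$$
where $r_\tau$ represents the time series at time $\tau$, $\phi_1, \dots, \phi_p$ are coefficients for the AR terms, $\theta_1, \dots, \theta_q$ are coefficients for the MA terms, and $\epsilon_\tau$ denotes the white noise error terms following a normal distribution with mean zero and variance proportional to the branch length, as shown in Equation (12):
$$\epsilon_\tau \sim N(0, \sigma_\epsilon^2 t_\tau), \tag{12}$$
where $t_\tau$ is the length of the corresponding branch.
For a phylogenetic ARMA(1,1) rate model, we denote $r_i$ as the rate of the descendant ($i \in \mathcal{I}$) and $r_{a(i)}$ as the rate of its ancestor; thus, the model in Equation (11) can be represented as in Equation (13):
$$r_i = \phi\, r_{a(i)} + \theta\, \epsilon_{a(i)} + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma_\epsilon^2 t_i). \tag{13}$$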
A model example of the 3-taxon case is shown in Figure 4.
Figure 4.
The phylogenetic ARMA rate model for the rates of continuous evolution along the studied tree. The root status O starts with $r_O$, and $\epsilon$ along the tree is the error of the rate estimate. The ARMA rates are bound to the tree topology's ancestral–descendant relationship. For instance, the rate at tip node A has the relationship with the ancestor node D of $r_A = \phi r_D + \theta \epsilon_D + \epsilon_A$, where $\epsilon_A \sim N(0, \sigma_\epsilon^2 t_A)$, while the rate at internal node D has the relationship with the ancestor node E of $r_D = \phi r_E + \theta \epsilon_E + \epsilon_D$, where $\epsilon_D \sim N(0, \sigma_\epsilon^2 t_D)$.
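We are interested in the joint distribution $f(\mathbf{r} \mid \Theta)$, where $\Theta$ is the parameter vector (i.e., $\Theta = (\phi_1, \dots, \phi_p, \theta_1, \dots, \theta_q, \sigma_\epsilon^2)$) for the PhyRateARMA model. On a specific branch $i$ with branch length $t_i$, we assume that the rate $r_i$ follows a normal distribution with mean $\mu_i$, conditioning on the ancestor, and variance $\sigma_\epsilon^2 t_i$. The log-likelihood on a branch is that of a one-dimensional ARMA process, as given in Equation (14):
$$\ell_i(\Theta) = -\frac{1}{2}\log\left(2\pi\sigma_\epsilon^2 t_i\right) - \frac{(r_i - \mu_i)^2}{2\sigma_\epsilon^2 t_i}. \tag{14}$$
Assuming that the branches are independent [24,29], the full likelihood given the tree $\mathcal{T}$ and the rates $\mathbf{r}$ can be written as
$$L(\Theta \mid \mathbf{r}, \mathcal{T}) = f(\epsilon_O) \prod_{i \in \mathcal{I}} f\left(r_i \mid r_{a(i)}, \epsilon_{a(i)}; \Theta\right), \tag{15}$$
so that the full log-likelihood is $\log f(\epsilon_O) + \sum_{i \in \mathcal{I}} \ell_i(\Theta)$, where $\epsilon_O$ is the residual at the root, and the product operator follows the corresponding tree topology $\mathcal{T}$.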
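To make the branch-wise likelihood concrete, the following sketch evaluates a PhyRateARMA(1,1)-style log-likelihood on a small tree. It assumes that each branch rate is normal with mean "AR coefficient times ancestor rate plus MA coefficient times ancestor residual" and variance proportional to its branch length, and that the root rate and root residual are fixed at zero; these conventions, the function name `arma11_loglik`, and the numerical values are assumptions for illustration only.

```python
import math


def arma11_loglik(phi, theta, sigma2, rates, parent, branch_length, order,
                  root_rate=0.0, root_resid=0.0):
    """Sum of branch log-densities; `order` lists nodes parents-first."""
    resid = {}
    loglik = 0.0
    for node in order:
        anc = parent[node]
        # Ancestors that are the root fall back to the fixed root values.
        mean = phi * rates.get(anc, root_rate) + theta * resid.get(anc, root_resid)
        var = sigma2 * branch_length[node]
        resid[node] = rates[node] - mean
        loglik += -0.5 * math.log(2 * math.pi * var) - resid[node] ** 2 / (2 * var)
    return loglik


# Hypothetical 3-taxon tree O -> E -> (D -> (A, B), C) with made-up rates.
parent = {"A": "D", "B": "D", "D": "E", "C": "E", "E": "O"}
branch_length = {"A": 1.0, "B": 1.0, "C": 2.0, "D": 1.0, "E": 0.5}
order = ["E", "D", "C", "A", "B"]      # preorder: each parent before its children
rates = {"A": 0.20, "B": -0.10, "C": 0.30, "D": 0.05, "E": 0.10}

print(arma11_loglik(0.3, 0.1, 0.5, rates, parent, branch_length, order))
```

In practice this function would be passed to a numerical optimizer to obtain maximum likelihood estimates; the traversal order matters because each residual depends on the residual of its ancestor.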
To perform model inference on parameter estimation, given the trait vector $\mathbf{y}$ on the tips and a phylogenetic tree with the known branch-length set $\{t_O\} \cup \{t_i : i \in \mathcal{I}\}$, where $t_O$ is the branch length of the root node and each branch is $t_i$, we use $\mathbf{y}$ and the tree to obtain the estimate at the root from the Brownian motion model [12]:
$$\hat{\mu} = \left(\mathbf{1}^\top C^{-1} \mathbf{1}\right)^{-1} \mathbf{1}^\top C^{-1} \mathbf{y}, \tag{16}$$
where the matrix $C$ is the phylogenetic similarity matrix, in which the element $C_{ij}$ measures the shared branch length between species i and j on the tree [30], $\hat{\mu}$ is the MLE for the ancestral state at the root under the Brownian motion model, and $\mathbf{1}$ is a vector of 1s. Then, we used the maximum likelihood approach for parameter estimation and inference. Refer to Appendix A.2 for our expression and parameter estimation of the first few orders of the PhyRateARMA(p,q) model.
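The root estimate under Brownian motion is the generalized least-squares mean of the tip values, weighted by the inverse of the shared-branch-length matrix. The sketch below computes it in plain Python, together with the standard companion MLE of the BM rate (included here as an assumption, since only the root estimate is described above). The similarity matrix and tip values are hypothetical, and the function names are not taken from any package.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def bm_root_mle(C, y):
    """Return (mu_hat, sigma2_hat) under Brownian motion with covariance sigma2 * C."""
    n = len(y)
    cinv_one = solve(C, [1.0] * n)               # C^{-1} 1
    cinv_y = solve(C, y)                         # C^{-1} y
    mu = sum(cinv_y) / sum(cinv_one)             # (1'C^{-1}1)^{-1} 1'C^{-1}y
    resid = [yi - mu for yi in y]
    cinv_res = solve(C, resid)
    sigma2 = sum(r * c for r, c in zip(resid, cinv_res)) / n
    return mu, sigma2


# Hypothetical shared-branch-length matrix for tips A, B, C of a 3-taxon tree.
C = [[2.5, 1.5, 0.5],
     [1.5, 2.5, 0.5],
     [0.5, 0.5, 2.5]]
y = [1.2, 0.8, -0.5]
mu_hat, sigma2_hat = bm_root_mle(C, y)
print(mu_hat, sigma2_hat)
```

For a star phylogeny, where the off-diagonal entries are zero and the diagonal entries are equal, the root estimate reduces to the ordinary sample mean.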
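For empirical analysis, we analyzed the data using the model up to the order $(p, q) = (2, 2)$. For model selection, the Akaike information criterion $\mathrm{AIC} = -2\ell + 2k$, where $\ell$ is the maximized log-likelihood and $k$ is the number of parameters, its sample size correction version $\mathrm{AICc} = \mathrm{AIC} + 2k(k+1)/(n-k-1)$ [31], and the Akaike weights were calculated using the equation $w_i = \exp(-\Delta_i/2) / \sum_j \exp(-\Delta_j/2)$, where $w_i$ is the weight for model i and $\Delta_i = \mathrm{AICc}_i - \min_j \mathrm{AICc}_j$. The weighted parameter estimate was calculated as $\bar{\Theta} = \sum_i w_i \hat{\Theta}_i$. Here, $\hat{\Theta}_i$ represents the MLE from model i. This method incorporates uncertainty in the selection of the model by averaging parameters based on AIC weights [32].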
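The model-selection quantities can be computed in a few lines. The sketch below uses the standard definitions of AICc and Akaike weights and a weight-averaged estimate; the log-likelihoods, parameter counts, sample size, and coefficient estimates are made-up numbers for illustration, not values from the tables in Section 4.

```python
import math


def aicc(loglik, k, n):
    """Small-sample corrected AIC for k parameters and n observations."""
    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


def akaike_weights(scores):
    """Normalized relative likelihoods exp(-delta/2) of the candidate models."""
    best = min(scores)
    raw = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_estimate(estimates, weights):
    """Model-averaged estimate: sum of weight times per-model MLE."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical fits of three candidate models to n = 60 branch rates.
logliks = [-50.0, -49.5, -49.4]
n_params = [2, 3, 3]
phi_hats = [0.30, 0.25, 0.40]

scores = [aicc(ll, k, 60) for ll, k in zip(logliks, n_params)]
weights = akaike_weights(scores)
phi_bar = weighted_estimate(phi_hats, weights)
print(scores, weights, phi_bar)
```

When a parameter is absent from a model (for example, the MA coefficient in a pure AR model), a convention is needed for its contribution to the average; treating it as zero is one common choice and would be an assumption here.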
2.2. Testing of PhyRateAR(1) Effect
We conducted inference to test for the existence of the autoregressive effect. This is equivalent to testing whether $\phi = 0$, as shown in Section 2.2.1. Additionally, we investigated whether the autoregressive effect varies across different clades of the tree. For this, we posed the null hypothesis $H_0: \phi_1 = \phi_2$, which is discussed in Section 2.2.2.
2.2.1. Test of Correlated Rate Evolution
Given the PhyRateAR(1) process, $r_i = \phi\, r_{a(i)} + \epsilon_i$, where $\epsilon_i$ is white noise, the asymptotic variance of the MLE $\hat{\phi}$ is $(1 - \phi^2)/n$. The null hypothesis is given by $H_0: \phi = \phi_0$, where $\phi_0$ is a hypothesized value. The alternative hypothesis for a two-sided test is $H_1: \phi \neq \phi_0$. We set $\phi_0 = 0$ to test for the PhyRateAR(1) effect. To perform the test, first fit a PhyRateAR(1) model to the data and obtain the estimate $\hat{\phi}$ and its standard error $SE(\hat{\phi})$. The test statistic is then calculated as
$$t = \frac{\hat{\phi} - \phi_0}{SE(\hat{\phi})}. \tag{17}$$
This test statistic is compared with the critical value of the t distribution with $n - 1$ degrees of freedom, where n is the sample size. The standard error estimate can be derived from the Hessian matrix obtained during the optimization process, which is then used to conduct hypothesis testing.
In summary, the test proceeds in three steps: (i) a PhyRateAR(1) model is fitted to the data to obtain the estimate $\hat{\phi}$ and its standard error $SE(\hat{\phi})$; (ii) the test statistic $t = (\hat{\phi} - \phi_0)/SE(\hat{\phi})$ is calculated; (iii) the statistic is compared against the critical value from the t-distribution, where the sample size is $m$, the number of branches of the tree, which corresponds to $m - 1$ degrees of freedom.
This procedure determines whether the hypothesized value $\phi_0$ is plausible for $\phi$ in the PhyRateAR(1) model.
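The test statistic itself is a simple ratio, as the sketch below illustrates. Note one deliberate simplification: the p-value uses the standard normal distribution rather than the t distribution, which is a reasonable approximation only when the number of branches is large; an exact t-based p-value would require a t-distribution routine. The function names and the inputs are hypothetical.

```python
import math


def asymptotic_se(phi_hat, n):
    """Large-sample standard error of an AR(1) coefficient: sqrt((1 - phi^2) / n)."""
    return math.sqrt((1.0 - phi_hat ** 2) / n)


def ar1_wald_test(phi_hat, se, phi0=0.0):
    """Two-sided test of H0: phi = phi0 using a normal approximation."""
    stat = (phi_hat - phi0) / se
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|stat|))
    return stat, p_value


# Hypothetical estimate and standard error (e.g., from the Hessian of the fit).
stat, p_value = ar1_wald_test(phi_hat=0.3, se=0.1)
print(round(stat, 3), round(p_value, 5))
```

The helper `asymptotic_se` shows the alternative of plugging the estimate into the asymptotic variance formula when a Hessian-based standard error is not available.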
2.2.2. Testing Heterogeneity Rates on Subclades
We consider whether a heterogeneous autoregressive effect exists on the two subclades, as shown in Figure 5.
Figure 5.
Test heterogeneity rates for trait evolution on the two subclades of the tree with $n$ tips. Evaluate the model with two $\phi$'s ($\phi_1$, $\phi_2$) vs. a single $\phi$.
The null hypothesis ($H_0$) posits that a single tree with n taxa uses one parameter $\phi$, while the alternative hypothesis ($H_1$) states that two autoregressive parameters, $\phi_1$ and $\phi_2$, would be more appropriate when the tree is divided into two subtrees from the root and the parameters are estimated independently.
The null hypothesis in the likelihood ratio test states that the simpler model (in this case, the single combined tree) is sufficient to explain the variability in the data. The alternative hypothesis suggests that dividing the tree into two subtrees provides a better fit. The likelihood ratio test (LRT) statistic is computed as
$$\Lambda = -2\left(\ell_0 - \ell_1\right), \tag{18}$$
where $\ell_0$ is the log-likelihood under the null hypothesis and $\ell_1$ is the log-likelihood under the alternative hypothesis. The test statistic asymptotically follows a chi-squared distribution with degrees of freedom $df$ equal to the difference in the number of free parameters between the two models.
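A likelihood ratio test of this kind can be sketched as follows. The chi-squared upper tail is computed exactly for integer degrees of freedom using the standard recurrence for the regularized incomplete gamma function, so no external library is needed. The log-likelihood values are hypothetical, and taking the alternative log-likelihood as the sum of the two subclade log-likelihoods is an assumption based on the subclades being fitted independently.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-squared distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)                 # Q(1, x/2)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))      # Q(1/2, x/2)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null, loglik_alt, df):
    """Return the LRT statistic -2 (l0 - l1) and its chi-squared p-value."""
    stat = -2.0 * (loglik_null - loglik_alt)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods: one combined fit vs. two independent subclade fits.
loglik_combined = -120.0
loglik_clade1, loglik_clade2 = -55.0, -61.5
stat, p_value = likelihood_ratio_test(loglik_combined,
                                      loglik_clade1 + loglik_clade2, df=1)
print(stat, p_value)
```

The degrees of freedom passed to the test should equal the number of extra free parameters in the two-subclade model, which depends on whether the noise variance is also estimated separately.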
The procedure in our framework is summarized in Algorithm 1.
Algorithm 1. Procedure of PhyRateARMA(p,q) Model Data Analysis
3. Simulation
We assessed the performance of the model using simulation. Two types of trees, a balanced tree and a birth-and-death tree, were used for the assessment, each with sizes of 16, 32, 64, and 128 taxa. The initial samples were drawn from an independent normal distribution. The initial estimate at the root, given by the formula in Equation (16), was computed under the Brownian motion model using the R package geiger (version 2.0.11) [33]. For each type of tree and each taxon size, 1000 replicates of the sample were generated to assess the performance of the model.
Let the tree have m branches, with each branch representing a transition between an ancestral node $a$ and a descendant node $d$. The tree is traversed in preorder (from root to tips), and the rate values for each node are computed based on the following AR(1) process.
For each branch, the evolutionary rate at the descendant node, $r_d$, is determined by the rate at its ancestor, $r_a$, with some noise (innovation) added. The autoregressive equation is shown in Equation (19):
$$r_d = \phi\, r_a + \epsilon_d, \tag{19}$$
where $\epsilon_d$ is the innovation on the branch leading to $d$.
Initially, the rate value at the root node is set to zero: $r_O = 0$.
After simulating the rates using Equation (19) for all nodes in the tree, the final trait values at the tree tips (taxa) are computed by accumulating the rates along the path from the root to each tip.
Let $P_j$ denote the set of nodes on the path from the root to tip j, and $t_k$ denote the branch length leading to node $k$. The trait value at each tip j is
$$y_j = \sum_{k \in P_j} r_k t_k, \tag{20}$$
where $y_j$ is the final accumulated trait value at tip j.
The final output consists of two arrays: (1) the array of simulated evolutionary rates for each node, $\mathbf{r}$, and (2) the trait values at the tree tips, $\mathbf{y}$, summarized by the following Equation (21):
$$\mathbf{r} = (r_1, \dots, r_m), \qquad \mathbf{y} = (y_1, \dots, y_n). \tag{21}$$
This simulation captures the evolutionary rate changes along a phylogeny under an AR(1) model, accounting for both ancestral states and random innovation at each branch. We implemented the corresponding method and script in R software (version 4.4.2) [34], naming the function simulate_trait_data_given_phi (refer to Appendix A.1.1 for the programming script), which simulates the trait evolution process along a phylogenetic tree using an autoregressive process (AR(1)) for the rates of continuous trait evolution. The simulation setup is shown in Algorithm 2.
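The simulation steps described above can be sketched in Python as follows. This is an independent illustration, not a translation of the R function `simulate_trait_data_given_phi` from the appendix: the tree encoding (a parent map plus a parents-first node order), the innovation variance proportional to branch length, and all names and numerical values are assumptions.

```python
import math
import random


def simulate_ar1_rates(parent, branch_length, order, phi, sigma, rng, root_rate=0.0):
    """Simulate branch rates root-to-tips: r_d = phi * r_a + innovation."""
    rates = {}
    for node in order:                       # preorder: parents before children
        anc_rate = rates.get(parent[node], root_rate)
        sd = sigma * math.sqrt(branch_length[node])
        rates[node] = phi * anc_rate + rng.gauss(0.0, sd)
    return rates


def tip_traits(rates, parent, branch_length, tips):
    """Accumulate rate * branch length along each root-to-tip path."""
    traits = {}
    for tip in tips:
        total, node = 0.0, tip
        while node in parent:                # stop at the root (no parent entry)
            total += rates[node] * branch_length[node]
            node = parent[node]
        traits[tip] = total
    return traits


# Hypothetical 3-taxon tree O -> E -> (D -> (A, B), C).
parent = {"A": "D", "B": "D", "D": "E", "C": "E", "E": "O"}
branch_length = {"A": 1.0, "B": 1.0, "C": 2.0, "D": 1.0, "E": 0.5}
order = ["E", "D", "C", "A", "B"]

rng = random.Random(42)
rates = simulate_ar1_rates(parent, branch_length, order, phi=0.5, sigma=0.2, rng=rng)
traits = tip_traits(rates, parent, branch_length, ["A", "B", "C"])
print(rates)
print(traits)
```

Passing the random generator explicitly keeps replicates reproducible, which is convenient when the same routine is reused inside a power study.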
Figure 6 and Table 1 show how the statistical power to detect an effect changes with different values of $\phi$ and different taxon levels.
Figure 6.
Power curve for varying taxon levels (using balanced tree and birth–death tree cases as demonstration). The horizontal axis represents the value of the autoregressive parameter $\phi$, ranging from 0 to near 1. The vertical axis depicts the statistical power, i.e., the probability of rejecting the null hypothesis when the alternative is true; a value of 0 indicates no detection ability, while 1 signifies certain detection. The four curves represent the different taxon counts (16, 32, 64, and 128), that is, the number of tips of the simulated tree. The horizontal lines present the type I error rate at the nominal significance level $\alpha$.
Table 1.
The type I error (T1 err) rate and the power of the tree AR(1) rate test of the null hypothesis $H_0: \phi = 0$ using 2 types of trees and 4 taxon sizes. The alternatives are set to selected nonzero values of $\phi$. A total of 1000 replicates are used, and the test is carried out at the nominal significance level $\alpha$.
From Figure 6 and Table 1, one can see that the power of the test changes depending on the number of taxa. First, as the number of taxa increases, the power generally increases. Second, for all taxon levels, as $\phi$ increases, the power of the test also increases, meaning that a stronger autoregressive effect is easier to detect. Third, all curves converge at higher values of $\phi$, indicating that the power becomes more consistent across taxon levels at these higher values. This simulation provides evidence that our method has the expected statistical properties in terms of power and type I error. Results using the coalescent tree and pure birth tree are similar, and readers can refer to the online appendix link in Appendix A.1.2.
Algorithm 2. Estimating Type I Error and Power with the PhyRateAR(1) Rate Model
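A Monte Carlo estimate of the type I error and power can be sketched as below. This is a deliberately simplified stand-in for the procedure in the study: it uses a balanced tree with unit branch lengths, works directly with the simulated rates instead of ridge-estimated rates, estimates the autoregressive coefficient by least squares through the origin rather than by maximum likelihood, and uses the normal critical value 1.96. All names and settings are hypothetical.

```python
import math
import random


def balanced_tree(depth):
    """Parent map and parents-first node order; root is 0, children of k are 2k+1, 2k+2."""
    n_nodes = 2 ** (depth + 1) - 1
    parent = {k: (k - 1) // 2 for k in range(1, n_nodes)}
    return parent, list(range(1, n_nodes))


def simulate_rates(parent, order, phi, sigma, rng):
    """AR(1) rates from root (rate 0) to tips with unit branch lengths."""
    rates = {0: 0.0}
    for node in order:
        rates[node] = phi * rates[parent[node]] + rng.gauss(0.0, sigma)
    return rates


def ar1_z_statistic(rates, parent, order):
    """Least-squares estimate of phi divided by its standard error."""
    pairs = [(rates[parent[n]], rates[n]) for n in order]
    sxx = sum(a * a for a, _ in pairs)
    phi_hat = sum(a * d for a, d in pairs) / sxx
    s2 = sum((d - phi_hat * a) ** 2 for a, d in pairs) / (len(pairs) - 1)
    return phi_hat / math.sqrt(s2 / sxx)


def rejection_rate(phi, depth, n_rep, rng, crit=1.96, sigma=1.0):
    """Fraction of replicates in which H0: phi = 0 is rejected."""
    parent, order = balanced_tree(depth)
    hits = 0
    for _ in range(n_rep):
        rates = simulate_rates(parent, order, phi, sigma, rng)
        if abs(ar1_z_statistic(rates, parent, order)) > crit:
            hits += 1
    return hits / n_rep


rng = random.Random(2024)
type1 = rejection_rate(phi=0.0, depth=4, n_rep=200, rng=rng)   # 16-tip tree, null true
power = rejection_rate(phi=0.8, depth=4, n_rep=200, rng=rng)   # 16-tip tree, strong effect
print(type1, power)
```

Repeating the call over a grid of coefficient values and tree depths would trace out power curves of the kind summarized in Figure 6, under the simplifications stated above.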
4. Empirical Analysis
4.1. Genome Size Rate Evolution in Herbaticus and Woody Plant
We collected and analyzed datasets on the genome size (in pg) of herbaticus and woody plants. The primary data source is [35], from which we included 35 herbaticus and 33 woody species in this study. The main objective was to test whether there is a time series relationship in the evolutionary rate of these plants' genome sizes.
Figure 7 shows the phylogenetic tree for herbaticus, woody, and their joint tree.
Figure 7.
Left: Herbaticus tree. Right: Woody species tree. Middle: Combined herbaticus and woody species tree. The herbaticus and woody trees were obtained using TimeTree [36], where species names are entered and the system generates the tree in Newick format. The combined tree was obtained using the bind.tree function of the R ape package (version 5.8-1), with branch lengths estimated by molecular dating with a penalized likelihood approach [29]. The dating was performed using the chronopl function of the same package, taking the herbaticus tree and woody tree as input.
4.1.1. Genome Size: Combined Herbaticus and Woody Species
Table 2 presents the parameter estimates for various autoregressive (PhyRateAR) and autoregressive–moving-average (PhyRateARMA) models. The PhyRateARMA(1,1) estimates of $\phi$ and $\theta$ indicate a moderate correlation between consecutive rates of evolution, with the residual variance $\sigma_\epsilon^2$ reported alongside. The weighted estimates of $\phi$, $\theta$, and $\sigma_\epsilon^2$ combine the values across the models, providing a comprehensive view of the underlying evolutionary process.
Table 2.
MLE parameter estimates for the 68 combined species of herbaticus (35) and woody (33) species.
Using the phylogenetic PhyRateARMA(1,1) rate model described in Equation (13), the equation for the descendant rate $r_i$ shows how the descendant's rate of continuous evolution is influenced by multiple factors. In the weighted estimate, the AR(1) estimate ($\hat{\phi} \approx -0.05$) indicates that −5.0% of the ancestor's evolutionary rate ($r_{a(i)}$) is carried over to the descendant, reflecting a weak and inverse correlation between the two rates. The MA(1) estimate $\hat{\theta}$ (see Table 2) adjusts for the random variation (or noise) in the ancestor's rate. Its positive value suggests that random fluctuations in the ancestor's rate have a small reinforcing effect on the descendant. Finally, $\epsilon_i$, with variance proportional to the branch length $t_i$, represents the new random variation specific to the descendant. The weighted estimates offer a balanced perspective on how both inherited rates and random fluctuations shape evolutionary changes across generations.
In Table 3, PhyRateAR(2) ranks first with the lowest AICc value and the highest Akaike weight, indicating the best fit among the models. PhyRateAR(1) follows with a slightly higher AICc and a lower weight, suggesting a close but slightly less optimal fit than PhyRateAR(2). PhyRateARMA(1,1) ranks third. Models PhyRateARMA(2,1) and PhyRateARMA(2,2) show considerably lower Akaike weights, indicating that increasing model complexity beyond PhyRateAR(2) does not significantly improve the fit.
Table 3.
Model information for the 68 combined herbaticus and woody species.
4.1.2. Genome Size: Herbaticus Species
The parameter estimates in Table 4 for the herbaticus models show that the PhyRateAR(1) estimate of $\phi$, with its residual variance $\sigma_\epsilon^2$, indicates a moderate influence from the previous rates of continuous evolution. PhyRateAR(2) introduces a second autoregressive term ($\phi_2$) and maintains a similar variance. The PhyRateARMA(1,1) model adds the moving-average coefficient $\theta$ and has the same variance. More complex models, such as PhyRateARMA(2,1) and PhyRateARMA(2,2), show mixed $\phi$ and $\theta$ values, but they maintain the same low variance.
Table 4.
Herbaticus species model parameter estimates.
The weighted estimate balances these models through the averaged values of $\phi$, $\theta$, and $\sigma_\epsilon^2$ (Table 4). This suggests that the descendant's rate of continuous evolution is influenced by a fraction $\hat{\phi}$ of the ancestor's rate, with a small adjustment $\hat{\theta}$ due to noise from the ancestor, all within a low variance process.
In Table 5, PhyRateAR(1) ranks first with the lowest AICc value and the highest Akaike weight, making it the best-fitting model for the herbaticus data. PhyRateARMA(1,1) follows in second place, while PhyRateAR(2) ranks fourth. More complex models like PhyRateARMA(2,1) and PhyRateARMA(2,2) have even lower Akaike weights, indicating that they provide little additional explanatory power compared to simpler models.
Table 5.
Herbaticus species model results.
4.1.3. Genome Size: Woody Species
The parameter estimates in Table 6 for the woody models indicate that the PhyRateAR(1) estimate of $\phi$, with its residual variance $\sigma_\epsilon^2$, suggests a very weak negative influence from the previous rates of continuous evolution. PhyRateAR(2) introduces a second autoregressive term ($\phi_2$) and slightly increases the variance. PhyRateARMA(1,1) shows stronger dependence in its $\phi$ and $\theta$ estimates, while maintaining a similar variance.
Table 6.
Woody species model parameter estimates.
The weighted estimate balances the models through the averaged values of $\phi$, $\theta$, and $\sigma_\epsilon^2$ (Table 6). This suggests that the descendant's rate is slightly positively influenced by the ancestor's rate but is slightly negatively impacted by the noise, with very low overall variance ($\sigma_\epsilon^2$), indicating minimal change between generations.
In Table 7, PhyRateAR(1) ranks first with the lowest AICc value and the highest Akaike weight, indicating that it is the best-fitting model for the woody data. PhyRateARMA(1,1) ranks second with a much smaller weight, while PhyRateAR(2) ranks last with a negligible Akaike weight. More complex models, such as PhyRateARMA(2,1) and PhyRateARMA(2,2), also have minimal Akaike weights, suggesting that they provide little additional explanatory power and overcomplicate the fit for the woody data.
Table 7.
Woody species model results.
From the above, one may expect that the woody and the herbaticus genome sizes possess different autoregressive effects for the rates of evolution, and we present the tests we conducted in the following sections.
4.1.4. Testing Autocorrelation Rate
Interpretation of PhyRateAR(1) Test Results
Table 8 shows the PhyRateAR(1) test result.
Table 8.
PhyRateAR(1) test results. $H_0: \phi = 0$, $H_1: \phi \neq 0$.
Table 8 presents the results of the PhyRateAR(1) test under the null hypothesis $H_0: \phi = 0$ (no autocorrelation) and the alternative hypothesis $H_1: \phi \neq 0$. The herbaticus tree shows a significant positive autocorrelation with a large z-value of 5.100 and a p-value below 0.001, strongly rejecting the null hypothesis. This suggests a significant effect of the previous rates of continuous evolution on the current rate in the herbaticus dataset.
In contrast, the woody tree shows no significant autocorrelation: its z-value and p-value (Table 8) indicate that the null hypothesis cannot be rejected for this dataset, suggesting that there is no significant relationship between the past and current rates of evolution in the woody tree. This may be due to the fact that the woody tree has slower rates (5 times slower than the herbaticus tree [20]).
The combined tree also shows no significant autocorrelation according to its z-value and p-value (Table 8). The result suggests that there is no significant relationship between the past and current rates of evolution in the combined tree.
Testing Heterogeneity Rates on Subclades
Table 9 presents the results of the heterogeneity test in evolutionary rates across the herbaticus, woody, and combined trees using various autoregressive models (PhyRateAR(1), PhyRateAR(2), PhyRateARMA(1,1), PhyRateARMA(2,1), PhyRateARMA(2,2)). The log-likelihood values for the herbaticus, woody, and combined datasets are provided, along with the chi-squared statistics and p-values.
Table 9.
Testing heterogeneity rates on subclades. : , : .
All models yield highly significant results (p-values = 0.00), indicating strong evidence for rate heterogeneity between the subclades. The highest chi-squared statistic is observed for PhyRateAR(1) (847.16), suggesting that it explains the greatest amount of variance in the data compared to the other models.
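The text does not spell out how the chi-squared statistic is formed. A minimal sketch, assuming the usual likelihood-ratio construction (twice the gain in log-likelihood from fitting the subclades separately) and degrees of freedom equal to the number of extra parameters, could look as follows. The function names and the example log-likelihoods are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    total, term = 0.0, math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(x / 2.0)) + 2.0 * density * total

def heterogeneity_lrt(loglik_subclades, loglik_combined, k):
    """Assumed LRT: separate fits per subclade versus one fit on the combined tree.

    k is the number of parameters of the rate model fitted to each tree.
    """
    stat = 2.0 * (sum(loglik_subclades) - loglik_combined)
    df = k * (len(loglik_subclades) - 1)  # extra parameters under the alternative
    return stat, df, chi2_sf(stat, df)

# Hypothetical log-likelihoods, not values from Table 9.
stat, df, p = heterogeneity_lrt([-120.0, -95.0], -230.0, k=2)
print(stat, df, p)
```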
4.2. Body Mass Rate Evolution in Diurnal and Nocturnal Primates
We collected and analyzed datasets on the body weights of male and female primates. The primary data sources are from [37] and Animal Diversity Web [38], which include diurnal and nocturnal male and female primates. The phylogenetic tree of 88 primate species was obtained from [37,39]. Figure 8 shows the phylogenetic tree for nocturnal, diurnal, and their joint tree. The main objective was to test whether there is a time series relationship in the evolutionary rate of these primates’ body weights. The original dataset contained 88 primate species, 34 diurnal species, and 54 nocturnal species. Body weight for males, females, or combined was collected. We present the combined analysis of male and female primate data below. For the male-only and female-only analysis cases, see Appendix A.1.2. Note here that we expanded the analysis from [40], where a smaller dataset of the primate was analyzed under the PhyRateAR(1) and PhyRateARMA(1,1) models.
Figure 8.
Left: Tree with 54 nocturnal primates. Right: Tree with 34 diurnal primates. Middle: Combined phylogenetic tree of 88 primate species.
4.2.1. Body Mass: Combined Diurnal and Nocturnal Species
Table 10 presents the parameter estimates for various phylogenetic autoregressive–moving-average (PhyRateARMA) and autoregressive (PhyRateAR) models. PhyRateARMA(1,1) shows the highest value of and of , indicating a significant correlation between consecutive rates of evolution, with residual variance . The weighted estimates combine the values across the models, with , , and providing a balanced view of the underlying process.
Table 10.
MLE parameter estimated for the 88 combined diurnal and nocturnal primate species.
The PhyRateARMA(1,1) rate model in Equation (13) shows how the descendant’s rate of continuous evolution is determined by a combination of factors. In the weighted estimate, the AR(1) estimate suggests that 77.9% of the ancestor’s evolutionary rate () carries over to the descendant, indicating a strong correlation between the two rates. The MA(1) estimate , which has a coefficient of , adjusts for the random variation (or noise) in the ancestor’s rate. This negative value means that random fluctuations in the ancestor’s rate are counterbalanced, effectively reducing the influence of ancestor-specific randomness on the descendant. Finally, has variance proportional to , representing the new random variation specific to the descendant. The weighted estimates provide a balanced view of the process by combining these terms, giving insight into how both inherited traits and random fluctuations influence evolutionary changes across generations.
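One plausible way to form such weighted estimates is to average each parameter across the candidate models using the Akaike weights. The sketch below assumes that construction, and it also assumes that a parameter absent from a model (for example, an MA coefficient in a pure AR model) contributes zero. Both are assumptions, and the model names and numbers are hypothetical.

```python
def weighted_estimate(estimates, weights):
    """Average parameters over models.

    estimates: {model: {parameter: value}}
    weights:   {model: weight}, e.g. Akaike weights
    Parameters missing from a model are treated as 0 (an assumption).
    """
    params = sorted({p for est in estimates.values() for p in est})
    total = sum(weights[m] for m in estimates)
    return {
        p: sum(weights[m] * estimates[m].get(p, 0.0) for m in estimates) / total
        for p in params
    }

# Hypothetical estimates and weights, not values from Table 10 or Table 11.
estimates = {
    "AR(1)": {"phi1": 0.8, "sigma2": 0.02},
    "ARMA(1,1)": {"phi1": 0.4, "theta1": -0.2, "sigma2": 0.01},
}
weights = {"AR(1)": 0.75, "ARMA(1,1)": 0.25}
averaged = weighted_estimate(estimates, weights)
print(averaged)
```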
In Table 11, PhyRateARMA(1,1) ranks first with the lowest AICc value of and the highest Akaike weight (), indicating the best fit among the models. PhyRateARMA(2,1) and PhyRateARMA(2,2) follow, but with considerably lower Akaike weights, suggesting that increasing model complexity beyond PhyRateARMA(1,1) does not produce substantial fit improvements. PhyRateAR(1) ranks the lowest, highlighting its reduced explanatory power compared to more complex models.
Table 11.
Model information for the 88 combined diurnal and nocturnal primate species.
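For readers who want to reproduce the ranking columns, the following sketch computes AICc and Akaike weights from a log-likelihood using the standard formulas. Taking the sample size `n` to be the number of rate observations is an assumption here, and the function names and inputs are hypothetical.

```python
import math

def aicc(loglik, k, n):
    """Small-sample corrected AIC for k parameters and n observations."""
    return -2.0 * loglik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Map {model: AICc} to {model: weight}; the weights sum to 1."""
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

# Hypothetical AICc scores, not values from Table 11.
w = akaike_weights({"ARMA(1,1)": 100.0, "AR(1)": 102.0})
print(w)
```

A difference of 2 AICc units between two candidate models corresponds to weights of roughly 0.73 and 0.27.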
4.2.2. Body Mass: Diurnal Species
The parameter estimates in Table 12 for the diurnal models show that PhyRateAR(1) has a value of with residual variance , indicating a moderate level of influence from the previous rates of continuous evolution. PhyRateAR(2) introduces a second autoregressive term () and slightly increases the variance to . PhyRateARMA(1,1) and more complex models, such as PhyRateARMA(2,1) and PhyRateARMA(2,2), demonstrate negative and values, but these more complex models have similar or slightly lower variances (). The weighted estimate shows that the descendant’s rate is influenced by % of the ancestor’s rate, with a positive adjustment of % from the ancestor’s noise, while the process has a relatively low variance of .
Table 12.
Diurnal species model parameter estimates.
In Table 13, PhyRateAR(1) ranks first with the lowest AICc value of and the highest Akaike weight (), making it the best-fitting model for the diurnal data. PhyRateARMA(1,1) ranks second with an AICc of and a lower weight (), followed by PhyRateAR(2) with an AICc of (). More complex models such as PhyRateARMA(2,1) and PhyRateARMA(2,2) rank lowest, with minimal Akaike weights, indicating that they provide little additional explanatory power.
Table 13.
Diurnal species model results.
4.2.3. Body Mass: Nocturnal Species
The parameter estimates in Table 14 for the nocturnal models indicate that PhyRateAR(1) has a value of and a residual variance of , suggesting a weak negative influence from the previous rates of continuous evolution. PhyRateAR(2) introduces a second autoregressive term () and slightly increases the variance to . PhyRateARMA(1,1) has larger and , with a similar variance of , indicating a stronger dependency on the autoregressive and moving average terms. The weighted estimate shows a balance between models with , , and . It indicates that the descendant’s rate is slightly negatively influenced by the ancestor’s rate (%) and its noise (%), with very low overall variance (), suggesting minimal change between generations.
Table 14.
Nocturnal species model parameter estimates.
In Table 15, PhyRateAR(1) ranks first with the lowest AICc value of and the highest Akaike weight (), indicating that it is the best-fitting model for the nocturnal data. PhyRateAR(2) and PhyRateARMA(1,1) rank much lower, with AICc values of and , respectively, both having a very small Akaike weight (). More complex models such as PhyRateARMA(2,1) and PhyRateARMA(2,2) rank lowest, with minimal Akaike weights, indicating they add little explanatory power and overcomplicate the fit for the nocturnal data.
Table 15.
Nocturnal species model results.
From the above results, one may anticipate that the nocturnal and the diurnal primate body masses possess different autoregressive effects for the rates of evolution, and we provide the test results in the following sections.
4.2.4. Testing Autocorrelation Rate
Interpretation of PhyRateAR(1) Test Results
Table 16 shows the PhyRateAR(1) test result.
Table 16.
, .
Table 16 presents the results of the PhyRateAR(1) test under the null hypothesis (no autocorrelation) and the alternative hypothesis . The combined tree and the diurnal tree show significant positive autocorrelation with large z-values ( and , respectively) and p values of , indicating a strong rejection of the null hypothesis. This suggests a significant effect of the previous evolution rate on the current rate in these datasets.
Interestingly, the nocturnal tree exhibits a significant negative autocorrelation with a z value of and a p value of . Despite the negative z-value, the small p-value suggests a significant relationship. This indicates that while each tree shows a significant effect, the nocturnal tree behaves differently from the combined and diurnal trees, potentially indicating unique evolutionary dynamics in nocturnal species.
Testing Heterogeneity Rates on Subclades
Table 17 presents the results of the test of heterogeneity in evolutionary rates in the diurnal, nocturnal, and combined (dino) subclades using various autoregressive models (PhyRateAR(1), PhyRateAR(2), PhyRateARMA(1,1), PhyRateARMA(2,1), PhyRateARMA(2,2)). The log-likelihood values for the diurnal (_di), nocturnal (_no), and combined (_dino) data are shown, along with the chi-squared statistic and p-values. All models show highly significant results (p-values = 0.00), indicating strong evidence for rate heterogeneity between subclades. The chi-squared statistic is the highest for PhyRateAR(1) (), suggesting that it explains the greatest amount of variance in the data compared to the other models.
Table 17.
Testing heterogeneity rates on subclades. : one , : .
5. Discussion and Conclusions
This study introduces the first model developed for evolutionary rate estimation using the ARMA model. We first assumed that species evolve along a phylogenetic tree, with time length, evolutionary rate, and trait values expressed as a linear combination. Using the given trait values and branch lengths of the phylogenetic tree, we applied ridge regression and used the R package RRphylo to estimate the evolutionary rate. Next, we modeled the evolutionary rate using the PhyRateARMA(p,q) models, fitting these models to ancestral trait values, and then applied these models to estimate present-day trait values.
In the empirical analysis, we tested whether the rates of continuous trait evolution exhibited time series correlation. In the plant dataset, which combines herbaticus and woody genome sizes, the results did not show an autoregressive effect. However, in the diurnal and nocturnal primate dataset, which combines male and female body masses, the results indicated an autoregressive effect in the separate analyses of diurnal and nocturnal populations, suggesting a time-series relationship. To capture the evolutionary dynamics, the branches might have different autoregressive coefficients (). These branches can be analyzed to understand how differences between them affect the overall model. Further analysis could explore the effects of this statistical model in the biological context, supporting a more precise interpretation of evolutionary processes and phylogenetic patterns in biological systems.
Several methods estimate autoregressive (AR) and moving average (MA) model parameters. Maximum Likelihood Estimation (MLE) assumes normally distributed errors, while Least Squares focuses on AR parameters. The Yule-Walker Equations and Burg’s method enhance AR estimation, but the Innovations Algorithm handles both AR and MA. Bayesian Estimation uses priors and MCMC [41], while bootstrapping evaluates differences in PhyRateAR(1) parameters by resampling and comparing empirical distributions of . Recent applications of machine learning (ML) in related fields, such as molecular evolution [42,43], demonstrate the potential to enhance phylogenetic analyses and inference. Developing such approaches for trait evolution could open new avenues for addressing unresolved questions and refining our understanding of evolutionary processes. Advanced approaches, such as a PhyRateARMA framework, could leverage ML to better account for structural dependencies in phylogenetic trees and manage conditional heterogeneity in evolutionary processes. Inspired by the use of ARMA/GARCH models in other domains [22], these advancements would provide a promising direction for future studies, extending the utility and applicability of evolutionary rate models.
Proposed research could extend the phylogenetic autoregressive-moving-average model by examining how regional factors like vaccination rates or population density influence SARS-CoV-2 mutation rates and traits like transmissibility. For land mammals, the model could assess changes in diet breadth, climate niche, and range size over time, incorporating phylogenetic data to study species’ responses to environmental pressures. Additionally, exploring how latitude impacts elevation across biomes and under human activities like deforestation would provide deeper insights into climate-related changes, broadening the model’s applicability.
Funding
This research and APC were funded by the Ministry of Science and Technology, Taiwan (grant No. MOST-112-2118-M-035-003).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data presented in this study are available on request from the corresponding author.
Acknowledgments
I am very grateful to the editors and three anonymous reviewers for their constructive suggestions on improving the early version of this manuscript. I also thank You-Ruei Min for her helpful discussions and Bao-Yuan Huang for his technical assistance on an earlier version of this work.
Conflicts of Interest
The author declares no conflicts of interest.
Appendix A
A.1. Code and Scripts
Script and data can be accessed through the following link https://tonyjhwueng.info/phyarmarate (accessed on 28 December 2024).
A.1.1. Model Script
- Main empirical analysis: https://tonyjhwueng.info/phyarmarate/mainPhyarima.R (accessed on 28 December 2024).
- Model PhyRateARMA(p,q): https://tonyjhwueng.info/phyarmarate/testphyloratesV3.r (accessed on 28 December 2024).
A.1.2. Figures and Table
- Figure 1: https://tonyjhwueng.info/phyarmarate/twosigbmv4.html (accessed on 28 December 2024).
- Figure 2: https://tonyjhwueng.info/phyarmarate/4spetreemap.html (accessed on 28 December 2024).
- Figure 3: https://tonyjhwueng.info/phyarmarate/yLs.pptx (accessed on 28 December 2024).
- Figure 4: https://tonyjhwueng.info/phyarmarate/armarate2.pptx (accessed on 28 December 2024).
- Figure 5: https://tonyjhwueng.info/phyarmarate/tworatetree.html (accessed on 28 December 2024).
- Figure 6, Table 1: https://tonyjhwueng.info/phyarmarate/onerateT1powerSummarizer.html (accessed on 28 December 2024).
- Additional:
  - Male primates: https://tonyjhwueng.info/phyarmarate/primateDiNoBodyMassMale.html (accessed on 28 December 2024).
  - Female primates: https://tonyjhwueng.info/phyarmarate/primateDiNoBodyMassFemale.html (accessed on 28 December 2024).
- Appendix A.3: https://tonyjhwueng.info/phyarmarate/TestPhyArimaExample.html (accessed on 28 December 2024).
A.2. Phylogenetic ARMA(p,q) Model for Rate Evolution
Given tree topology of n taxa, with tip node and internal node , and the corresponding branch-length set with the ancestral-descendant relationship and fitting RRphylo [28], one can obtain a set of estimated rates for the tree. The phylogenetic ARMA() model of rates is a mixture of phylogenetic autoregressive (AR) and phylogenetic moving average (MA) models given by the following equation:
where represents the time series at time , c is a constant, are coefficients for the AR terms, are coefficients for the MA terms, and denotes the white noise error terms with mean 0 and variance .
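Because the displayed equation did not survive in this copy, the following is a reconstruction of the standard ARMA(p,q) form that matches the symbol descriptions above. The symbol names ($r_t$ for the rate, $\phi_i$, $\theta_j$, $\varepsilon_t$, $\sigma^2$) are assumptions and may differ from the original notation.

```latex
r_t = c + \sum_{i=1}^{p} \phi_i \, r_{t-i}
        + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
        + \varepsilon_t,
\qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2)
```

On a tree, the index $t-i$ would be read as the $i$-th ancestor along the lineage leading to node $t$; this reading is also an assumption based on the ancestor–descendant framing used throughout.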
A.2.1. PhyRateAR(1) Model
The PhyRateAR(1) model for rates of continuous trait evolution along the phylogenetic tree is written as
where nodes possess ancestral-descendant relationship, and
The negative log-likelihood is
where is the number of branches, and and are the two successive rates along the branch lengths and , respectively.
To find the maximum likelihood estimates of and for the PhyRateAR(1) rate process, we aim to minimize the negative log-likelihood. This requires setting the derivatives of the negative log-likelihood with respect to and to zero and solving these equations for the parameters.
The first-order conditions for the derivatives are and For , the derivative simplifies to
which leads to
For , the derivative is
leading to
These formulations allow us to estimate the parameters that best fit our PhyRateAR(1) model based on the observed evolutionary rates along a phylogenetic tree.
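The closed-form solution of the first-order conditions can be sketched in a few lines. The sketch assumes a Gaussian AR(1) without intercept and with constant innovation variance, so that the negative log-likelihood over m ancestor–descendant pairs is (m/2) log(2πσ²) plus the residual sum of squares divided by 2σ². The model in the text may weight terms by branch length, which is not reproduced here, and the function name and data are hypothetical.

```python
def fit_ar1(pairs):
    """Closed-form MLE for r_desc = phi * r_anc + eps, eps ~ N(0, sigma2).

    pairs: list of (ancestor_rate, descendant_rate) tuples.
    """
    sxy = sum(a * d for a, d in pairs)
    sxx = sum(a * a for a, _ in pairs)
    phi = sxy / sxx  # from d(NLL)/d(phi) = 0
    resid = [d - phi * a for a, d in pairs]
    sigma2 = sum(e * e for e in resid) / len(pairs)  # from d(NLL)/d(sigma2) = 0
    return phi, sigma2

# Hypothetical noise-free pairs with descendant = 0.5 * ancestor.
pairs = [(0.04, 0.02), (0.10, 0.05), (0.06, 0.03), (0.02, 0.01)]
phi_hat, sigma2_hat = fit_ar1(pairs)
print(phi_hat, sigma2_hat)
```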
A.2.2. PhyRateAR(2) Model
The PhyRateAR(2) model for rates of continuous trait evolution along the phylogenetic tree is written as
where is the rate on the next down successive branch.
Given a phylogenetic lineage with rates , the likelihood function for the PhyRateAR(2) model parameters and is
The log-likelihood function simplifies to
To find the maximum likelihood estimates (MLEs) for , , and , the log-likelihood function’s partial derivatives are taken with respect to each parameter, set to zero, and solved for these parameters.
The equations for partial derivatives are
Solving these equations provides the MLEs for , , and in a phylogenetic context.
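Under the simplifying assumptions of no intercept and constant innovation variance, setting the partial derivatives to zero gives a 2×2 linear system (the normal equations) for the two AR coefficients, followed by the mean squared residual for the variance. The sketch below solves that system with Cramer's rule. The function name and the noise-free example are hypothetical, and the formulation in the text may differ in detail.

```python
def fit_ar2(triples):
    """Conditional MLE for r_desc = phi1 * r_parent + phi2 * r_grandparent + eps.

    triples: list of (grandparent_rate, parent_rate, descendant_rate).
    """
    s11 = sum(p * p for _, p, _ in triples)
    s22 = sum(g * g for g, _, _ in triples)
    s12 = sum(g * p for g, p, _ in triples)
    s1y = sum(p * d for _, p, d in triples)
    s2y = sum(g * d for g, _, d in triples)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("parent and grandparent rates are collinear")
    # Cramer's rule on [[s11, s12], [s12, s22]] @ [phi1, phi2] = [s1y, s2y].
    phi1 = (s1y * s22 - s12 * s2y) / det
    phi2 = (s11 * s2y - s12 * s1y) / det
    resid = [d - phi1 * p - phi2 * g for g, p, d in triples]
    sigma2 = sum(e * e for e in resid) / len(triples)
    return phi1, phi2, sigma2

# Hypothetical noise-free lineage data generated with phi1 = 0.6, phi2 = -0.2.
lags = [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0), (3.0, 0.5)]  # (grandparent, parent)
triples = [(g, p, 0.6 * p - 0.2 * g) for g, p in lags]
phi1_hat, phi2_hat, sigma2_hat = fit_ar2(triples)
print(phi1_hat, phi2_hat, sigma2_hat)
```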
A.2.3. PhyRateARMA(1,1) Model
The PhyRateARMA(1,1) model for rates of continuous trait evolution along the phylogenetic tree is written as
where represents the corresponding white noise error terms with variance associated with the noise of the evolutionary process.
The likelihood function for the PhyRateARMA(1, 1) process, assuming that the errors are normally distributed, is given by the product of the probability densities of the innovations , which are the one-step forecast errors:
The MLEs for , , and are found by taking the partial derivatives of the log-likelihood function with respect to these parameters and setting them to zero:
These equations generally require numerical optimization to solve because the innovations depend on both and in a nonlinear way. To estimate the parameters and , we use iterative numerical optimization methods that handle nonlinearity. Specifically, we apply the L-BFGS-B method [44], initialized with a starting point and boundary constraints, to maximize the likelihood function, using the optim function of the R stats package (version 3.6.2) [34].
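To make the nonlinearity concrete, the sketch below computes the innovations recursively from the root towards the tips and evaluates a conditional Gaussian negative log-likelihood with the variance profiled out. It is not the author's implementation: a coarse grid search stands in for L-BFGS-B so that only the standard library is needed, the root innovation is set to zero, and the tree, rates, and function names are all hypothetical.

```python
import math

def innovations(rates, parent, order, phi, theta):
    """eps[node] = rate[node] - phi * rate[ancestor] - theta * eps[ancestor]."""
    eps = {}
    for node in order:  # ancestors must appear before their descendants
        anc = parent[node]
        eps_anc = eps.get(anc, 0.0)  # the root has no innovation: use 0
        eps[node] = rates[node] - phi * rates[anc] - theta * eps_anc
    return eps

def neg_loglik(rates, parent, order, phi, theta):
    """Profile negative log-likelihood and the implied variance estimate."""
    eps = innovations(rates, parent, order, phi, theta)
    m = len(eps)
    sigma2 = max(sum(e * e for e in eps.values()) / m, 1e-300)
    return 0.5 * m * (math.log(2.0 * math.pi * sigma2) + 1.0), sigma2

def fit_arma11(rates, parent, order, step=0.05):
    """Grid search over (phi, theta) in (-1, 1); a stand-in for L-BFGS-B."""
    best = None
    n = int(round(0.95 / step))
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            phi, theta = i * step, j * step
            nll, sigma2 = neg_loglik(rates, parent, order, phi, theta)
            if best is None or nll < best[0]:
                best = (nll, phi, theta, sigma2)
    return best

# Hypothetical 7-node tree (node 0 is the root) with made-up rates.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
rates = {0: 0.050, 1: 0.043, 2: 0.038, 3: 0.031, 4: 0.036, 5: 0.027, 6: 0.034}
order = [1, 2, 3, 4, 5, 6]
best_nll, phi_hat, theta_hat, sigma2_hat = fit_arma11(rates, parent, order)
print(phi_hat, theta_hat, sigma2_hat)
```

Because each innovation depends on the ancestor's innovation, which in turn depends on both coefficients, the likelihood surface is not quadratic in the parameters; this is the reason a numerical optimizer is needed in place of a closed form.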
A.2.4. PhyRateARMA(2,2) Model
The PhyRateARMA(2,2) model for rates of continuous trait evolution along the phylogenetic tree is written as
where represents the time series at time ; c is a constant; are coefficients for the AR terms; are coefficients for the MA terms; and denotes the white noise error terms.
Similar to PhyRateARMA(1,1), to estimate the parameters , , and , we again apply the L-BFGS-B method [44] to maximize the likelihood function, using the optim function of the R stats package (version 3.6.2) [34].
A.3. Test Example
Figure A1.
A rooted phylogenetic tree of 8 taxa.
Trait values are listed in Table A1.
Table A1.
Trait values.
| y1 | y2 | y3 | y4 | y5 | y6 | y7 | y8 |
|---|---|---|---|---|---|---|---|
Table A2.
Edge lengths obtained by tree with branch lengths using R ape package (version 5.8.1) [45].
| eg1 | eg2 | eg3 | eg4 | eg5 | eg6 | eg7 |
| eg8 | eg9 | eg10 | eg11 | eg12 | eg13 | eg14 |
Table A3.
Rate estimates (here denoted as ) corresponding to the tree in Figure A1 from RRphylo.
| nd1 | nd2 | nd3 | nd4 | nd5 | nd6 | nd7 | nd8 |
| 0.053 | |||||||
| nd9 | nd10 | nd11 | nd12 | nd13 | nd14 | nd15 | |
Table A4.
Maximum likelihood for parameter estimates under each model.
| Model | |||||
|---|---|---|---|---|---|
| PhyRateAR(1) | |||||
| PhyRateAR(2) | |||||
| PhyRateARMA(1,1) | |||||
| PhyRateARMA(2,1) | 0.669 | ||||
| PhyRateARMA(2,2) | |||||
| Weighted Estimate |
Table A5.
Model output: k number of parameters; AICc weight w.
| Model | k | w | Rank | ||
|---|---|---|---|---|---|
| PhyRateAR(1) | 2 | 1st | |||
| PhyRateARMA(1,1) | 3 | 2nd | |||
| PhyRateARMA(2,1) | 4 | 3rd | |||
| PhyRateAR(2) | 3 | 4th | |||
| PhyRateARMA(2,2) | 5 | 5th |
A.4. Trait Dataset for Empirical Analysis
For plant data and primate data, the tree files in Newick format, as well as the trait dataset in csv files, can be accessed at https://tonyjhwueng.info/phyarmarate/dataset (accessed on 28 December 2024).
References
- Pettay, J.E.; Kruuk, L.E.; Jokela, J.; Lummaa, V. Heritability and genetic constraints of life-history trait evolution in preindustrial humans. Proc. Natl. Acad. Sci. USA 2005, 102, 2838–2843.
- Holstad, A.; Voje, K.L.; Opedal, Ø.H.; Bolstad, G.H.; Bourg, S.; Hansen, T.F.; Pélabon, C. Evolvability predicts macroevolution under fluctuating selection. Science 2024, 384, 688–693.
- Scott-Phillips, T.C.; Laland, K.N.; Shuker, D.M.; Dickins, T.E.; West, S.A. The niche construction perspective: A critical appraisal. Evolution 2014, 68, 1231–1243.
- Soons, J.; Herrel, A.; Genbrugge, A.; Aerts, P.; Podos, J.; Adriaens, D.; De Witte, Y.; Jacobs, P.; Dirckx, J. Mechanical stress, fracture risk and beak evolution in Darwin’s ground finches (Geospiza). Philos. Trans. R. Soc. B Biol. Sci. 2010, 365, 1093–1098.
- Podos, J.; Nowicki, S. Beaks, adaptation, and vocal evolution in Darwin’s finches. Bioscience 2004, 54, 501–510.
- Xiang, Y.; Huang, C.H.; Hu, Y.; Wen, J.; Li, S.; Yi, T.; Chen, H.; Xiang, J.; Ma, H. Evolution of Rosaceae fruit types based on nuclear phylogeny in the context of geological times and genome duplication. Mol. Biol. Evol. 2017, 34, 262–281.
- Thorne, J.L.; Kishino, H.; Painter, I.S. Estimating the rate of evolution of the rate of molecular evolution. Mol. Biol. Evol. 1998, 15, 1647–1657.
- Uyeda, J.C.; Zenil-Ferguson, R.; Pennell, M.W. Rethinking phylogenetic comparative methods. Syst. Biol. 2018, 67, 1091–1109.
- Garamszegi, L.Z. Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology: Concepts and Practice; Springer: Berlin/Heidelberg, Germany, 2014.
- Cornwell, W.; Nakagawa, S. Phylogenetic comparative methods. Curr. Biol. 2017, 27, R333–R336.
- O’Meara, B.; Ané, C.; Sanderson, M.; Wainwright, P. Testing different rates of continuous trait evolution using likelihood. Evolution 2006, 60, 922–933.
- Adams, D.C. A method for assessing phylogenetic least squares models for shape and other high-dimensional multivariate data. Evolution 2014, 68, 2675–2688.
- Maddison, W.P.; Midford, P.E.; Otto, S.P. Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 2007, 56, 701–710.
- Uyeda, J.C.; Hansen, T.F.; Arnold, S.J.; Pienaar, J. The million-year wait for macroevolutionary bursts. Proc. Natl. Acad. Sci. USA 2011, 108, 15908–15913.
- Gingerich, P.D. Rates of evolution on the time scale of the evolutionary process. Microevol. Rate Pattern Process 2001, 8, 127–144.
- Felsenstein, J. Phylogeny and the comparative method. Am. Nat. 1985, 125, 1–15.
- Jhwueng, D.C.; Maroulas, V. Adaptive trait evolution in random environment. J. Appl. Stat. 2016, 43, 2310–2324.
- Jhwueng, D.C. Modeling rate of adaptive trait evolution using Cox–Ingersoll–Ross process: An Approximate Bayesian Computation approach. Comput. Stat. Data Anal. 2020, 145, 106924.
- Davis, C.C.; Latvis, M.; Nickrent, D.L.; Wurdack, K.J.; Baum, D.A. Floral gigantism in Rafflesiaceae. Science 2007, 315, 1812.
- Beaulieu, J.; Jhwueng, D.C.; Boettiger, C.; O’Meara, B. Modeling stabilizing selection: Expanding the Ornstein-Uhlenbeck model of adaptive evolution. Evolution 2012, 66, 2369–2383.
- Tsay, R.S. Analysis of Financial Time Series; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 543.
- Pham, H.T.; Yang, B.S. Estimation and forecasting of machine health condition using ARMA/GARCH model. Mech. Syst. Signal Process. 2010, 24, 546–558.
- Sakamoto, M.; Venditti, C. Phylogenetic non-independence in rates of trait evolution. Biol. Lett. 2018, 14, 20180502.
- Martin, B.S.; Bradburd, G.S.; Harmon, L.J.; Weber, M.G. Modeling the evolution of rates of continuous trait evolution. Syst. Biol. 2023, 72, 590–605.
- Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327.
- Engle, R. GARCH 101: The use of ARCH/GARCH models in applied econometrics. J. Econ. Perspect. 2001, 15, 157–168.
- Castiglione, S.; Serio, C.; Mondanaro, A.; Melchionna, M.; Carotenuto, F.; Di Febbraro, M.; Profico, A.; Tamagnini, D.; Raia, P. Ancestral state estimation with phylogenetic ridge regression. Evol. Biol. 2020, 47, 220–232.
- Castiglione, S.; Tesone, G.; Piccolo, M.; Melchionna, M.; Mondanaro, A.; Serio, C.; Di Febbraro, M.; Raia, P. A new method for testing evolutionary rate variation and shifts in phenotypic evolution. Methods Ecol. Evol. 2018, 9, 974–983.
- Sanderson, M.J. Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Mol. Biol. Evol. 2002, 19, 101–109.
- Jhwueng, D.C. On the covariance of phylogenetic quantitative trait evolution models and their matrix condition. Commun. Stat.-Simul. Comput. 2024, 53, 952–971.
- Song, G.; Zhu, L.; Gao, A.; Kong, L. Blockwise AICc and its consistency properties in model selection. Commun. Stat.-Theory Methods 2021, 50, 3198–3213.
- Zhang, J.; Yang, Y.; Ding, J. Information criteria for model selection. Wiley Interdiscip. Rev. Comput. Stat. 2023, 15, e1607.
- Pennell, M.W.; Eastman, J.M.; Slater, G.J.; Brown, J.W.; Uyeda, J.C.; FitzJohn, R.G.; Alfaro, M.E.; Harmon, L.J. geiger v2.0: An Expanded Suite of Methods for Fitting Macroevolutionary Models to Phylogenetic Trees. Bioinformatics 2014, 30, 2216–2218.
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024.
- Henniges, M.C.; Johnston, E.; Pellicer, J.; Hidalgo, O.; Bennett, M.D.; Leitch, I.J. The Plant DNA C-Values Database: A one-stop shop for plant genome size data. In Plant Genomic and Cytogenetic Databases; Springer: Berlin/Heidelberg, Germany, 2023; pp. 111–122.
- Hedges, S.B.; Dudley, J.; Kumar, S. TimeTree: A public knowledge-base of divergence times among organisms. Bioinformatics 2006, 22, 2971–2972.
- Galán-Acedo, C.; Arroyo-Rodríguez, V.; Andresen, E.; Arasa-Gisbert, R. Ecological traits of the world’s primates. Sci. Data 2019, 6, 55.
- Dewey, T.; Shefferly, N.; Havens, A. Animal Diversity Web; University of Michigan Museum of Zoology: Ann Arbor, MI, USA, 2010.
- Dyer, M.A.; Martins, R.; da Silva Filho, M.; Muniz, J.A.P.; Silveira, L.C.L.; Cepko, C.L.; Finlay, B.L. Developmental sources of conservation and variation in the evolution of the primate eye. Proc. Natl. Acad. Sci. USA 2009, 106, 8963–8968.
- Min, Y.R. Application of the Autoregressive Moving Average Model to Study the Rate of Phylogenetic Trait Evolution. Master’s Thesis, Feng-Chia University, Taichung, Taiwan, 2024.
- Spezia, L. Bayesian prior modeling in vector autoregressions via the Yule-Walker equations. Commun. Stat. Theory Methods 2024, 53, 5230–5247.
- Mo, Y.K.; Hahn, M.W.; Smith, M.L. Applications of machine learning in phylogenetics. Mol. Phylogenet. Evol. 2024, 196, 108066.
- Tao, Q.; Tamura, K.; Battistuzzi, F.U.; Kumar, S. A machine learning method for detecting autocorrelation of evolutionary rates in large phylogenies. Mol. Biol. Evol. 2019, 36, 811–824.
- Andrei, N. Modern Numerical Nonlinear Optimization; Springer: Berlin/Heidelberg, Germany, 2022; Volume 195.
- Paradis, E.; Schliep, K. ape 5.0: An environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 2019, 35, 526–528.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).