1. Introduction
Today, the availability of extensive, high-quality genomic data makes it possible to infer the evolutionary history of observed populations. This area of research has significant implications, particularly in human history and in identifying genes associated with diseases [1, 2, 3, 4]. Understanding evolutionary history involves two complementary approaches: forward-in-time and backward-in-time modeling. The backward approach is based on coalescent theory, while the forward approach is traditionally modeled using the Wright–Fisher framework [5, 6, 7, 8, 9, 10].
Natural populations often inhabit heterogeneous environments, where individuals differ in their reproductive success. Therefore, it is important to understand how environmental heterogeneity affects offspring production. This is especially relevant in areas such as cancer research, where the role of heterogeneity in tumor cell evolution, and its impact on allele frequency distributions, remains an open and challenging question. Gaining insight into this could improve our understanding of therapeutic resistance in tumors [11, 12, 13, 14]. Recently, Mashayekhi et al. developed a framework that incorporates environmental heterogeneity into offspring production models [15]. Their approach is based on generalizing the coalescent process using tools from fractional calculus, resulting in what is known as the fractional coalescent. However, the coalescent model typically considers a small number of individuals and does not make full use of allele frequency data.
In the fractional coalescent framework, the fractional derivative order α captures the effects of environmental heterogeneity by linking environmental quality to reproductive success. In scenarios where gene frequency dynamics follow a Markov process, it has been shown that the average coalescent time equals the average fixation time for the population [16, 17, 18]. Since the time to the most recent common ancestor depends on α in the coalescent model, we expect the fixation time in the forward model to reflect a similar dependency. To explore this, we introduce the parameter α into the forward-time framework by using the fractional Taylor series to derive the fixation time.
In population genetics, fixation time refers to the expected number of generations required for a specific allele to reach a frequency of one, effectively becoming the only allele at its locus. Under the classical Wright–Fisher model—which assumes random mating, non-overlapping generations, and constant population size—fixation time is a central quantity in understanding allele dynamics. For neutral alleles (those with no selective advantage or disadvantage), fixation times are typically long, especially in large populations. These time scales and the effective population sizes are driven by genetic drift. In contrast, beneficial alleles tend to fix more rapidly, while deleterious alleles rarely fix and take longer when they do. Thus, fixation time is a key metric in understanding genetic variation and evolutionary dynamics.
Recently, fractional calculus has shown promising results in modeling problems across various fields. Examples of its application to real-world problems include modeling the dynamics of tuberculosis with vaccination [19], analyzing dynamical systems in disease spread [20], and modeling COVID-19 dynamics [21].
Existing models that estimate fixation time rely on classical (integer-order) calculus and assume equal reproductive success among individuals, as in the Wright–Fisher model [22]. In this paper, we introduce a new model for estimating fixation time that employs fractional calculus to explicitly account for environmental heterogeneity in reproductive success. This new approach is formulated using fractional differential equations, where the order of the derivative represents the extent to which heterogeneity influences offspring production.
To study fixation time under this new model, we propose a computational method for solving the resulting fractional differential equations. Due to the complexity of fractional systems, we develop a spectral method based on Eta-based functions, which are basis functions that have proven effective in solving various dynamical systems [23, 24, 25]. This approach is particularly advantageous when the exact solution is unknown, offering flexibility and precision in handling the complexity of fractional population models.
Current spectral methods in population genetics are limited to integer-order equations and are not well-suited for cases involving environmental heterogeneity or gene flow between populations [26]. To address this, we develop a novel spectral method using Eta-based functions for solving fractional differential equations. These functions are effective at approximating highly oscillatory and polynomial behaviors, making them well-suited for the kinds of variation encountered in biological systems [27, 28, 29, 30, 31, 32, 33, 34, 35].
Eta-based functions are especially powerful in approximating solutions to problems where the exact behavior is unknown. When frequencies in the system approach zero, these functions naturally approximate polynomials. This versatility makes them ideal for addressing real-world problems where deriving exact solutions is either infeasible or computationally expensive.
To develop our method for solving the fractional differential equations relevant to fixation time, we first construct the operational matrix of fractional derivatives for the Eta-based functions. We then formulate a computational technique that uses the best approximation via these functions. By employing the operational matrix, we reduce the fractional differential system to an optimization problem. We also analyze the error bound of the method to assess its accuracy and sensitivity to input parameters. Numerical examples are provided to illustrate the strengths of the proposed method and demonstrate its validity.
Finally, we apply the new numerical method to study fixation times in a population under various scenarios. We use different types of Eta-based functions to ensure robustness in the results. Despite employing distinct basis functions, we observe consistent outcomes, underscoring the reliability of our method in capturing the fixation time dynamics. This consistency reinforces the credibility of the approach as a meaningful tool for modeling fixation time under complex environmental conditions.
The structure of the paper is as follows:
Section 2 derives the fixation time model using fractional dynamics in population genetics.
Section 3 provides the necessary background on Eta-based functions and introduces the operational matrix for fractional derivatives.
Section 4 presents the numerical method for solving fractional differential equations.
Section 5 applies the method to study fixation time in populations.
Section 6 concludes the paper.
2. Modeling Fixation Times Using Fractional Dynamics in Population Genetics
In this section, we derive a model for calculating fixation times while incorporating the effect of heterogeneity.
We consider a simple case: a diploid population of fixed size N. Assume there is no selective difference between the two alleles, $A_1$ and $A_2$, at a given locus, and that mutation is absent. Each generation consists of $2N$ gene copies. Let X denote the number of $A_1$ alleles in the population. Clearly, in any generation, X can take values in the set $\{0, 1, \ldots, 2N\}$. We denote the value of X in generation t by $X(t)$. The model assumes that alleles in generation $t+1$ are sampled with replacement from those in generation t. Under this assumption, $X(t+1)$ follows a binomial distribution with parameters $2N$ (number of trials) and $X(t)/2N$ (success probability).
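If $X(t) = i$, then the probability $p_{ij}$ that $X(t+1) = j$ is given by [36]
$$p_{ij} = \binom{2N}{j} \left(\frac{i}{2N}\right)^{j} \left(1 - \frac{i}{2N}\right)^{2N-j}, \qquad i, j = 0, 1, \ldots, 2N. \tag{1}$$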
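The binomial sampling step of the Wright–Fisher model can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `wright_fisher_transition` and the choice of 10 gene copies are assumptions made here, not part of the paper.

```python
from math import comb

def wright_fisher_transition(two_n: int) -> list[list[float]]:
    """Row i holds P(X(t+1) = j | X(t) = i) for j = 0..two_n (binomial sampling)."""
    matrix = []
    for i in range(two_n + 1):
        p = i / two_n  # success probability: current frequency of the tracked allele
        row = [comb(two_n, j) * p**j * (1 - p) ** (two_n - j)
               for j in range(two_n + 1)]
        matrix.append(row)
    return matrix

# Illustrative value only: 10 gene copies in total.
P = wright_fisher_transition(10)
```

Each row of `P` is a probability distribution, and the first and last rows place all mass on the states 0 and 10, which is the matrix form of loss and fixation being absorbing.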
We consider a neutral model in which genetic drift is the sole evolutionary force; there is no selection or mutation. The process of tracking allele $A_1$ over time is modeled as a Markov chain with states ranging from 0 to $2N$, corresponding to the number of $A_1$ alleles in the population. The states 0 (loss of $A_1$) and $2N$ (fixation of $A_1$) are absorbing states, meaning that once the population reaches either of these, it remains there indefinitely. The absorption time is the random variable that represents the number of generations required to reach one of the absorbing states, starting from an initial allele count. Based on the properties of Markov chains, the mean absorption time $\bar{t}_i$ when $X(0) = i$ satisfies the following recurrence relation [36]:
$$\bar{t}_i = 1 + \sum_{j=0}^{2N} p_{ij}\, \bar{t}_j, \qquad \bar{t}_0 = \bar{t}_{2N} = 0. \tag{2}$$
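As a rough illustration of the recurrence for the mean absorption time, the sketch below iterates the relation "one generation plus the expected remaining time" until it stops changing. Fixed-point iteration is a choice made here for brevity (a direct linear solve would work equally well), and the function name and the population size are hypothetical.

```python
from math import comb

def mean_absorption_times(two_n: int, tol: float = 1e-12,
                          max_iter: int = 100_000) -> list[float]:
    """Iterate t_i <- 1 + sum_j p_ij * t_j over interior states until convergence."""
    n_states = two_n + 1
    # Binomial (Wright-Fisher) transition probabilities.
    p = [[comb(two_n, j) * (i / two_n) ** j * (1 - i / two_n) ** (two_n - j)
          for j in range(n_states)] for i in range(n_states)]
    t = [0.0] * n_states  # t[0] and t[two_n] stay 0: absorbing states
    for _ in range(max_iter):
        new_t = [0.0] * n_states
        for i in range(1, two_n):
            new_t[i] = 1.0 + sum(p[i][j] * t[j] for j in range(1, two_n))
        change = max(abs(a - b) for a, b in zip(new_t, t))
        t = new_t
        if change < tol:
            break
    return t

# Illustrative value only: 10 gene copies in total.
times = mean_absorption_times(10)
```

The result is symmetric in the initial count and largest when the two alleles start at equal frequency, which is the classical neutral behavior that the fractional model is later compared against.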
To incorporate the effect of heterogeneity into Equation (
2), we introduce the rescaled variable
and define
. Let
, where
is a parameter that quantifies the degree of heterogeneity. Applying the fractional Taylor series expansion, we obtain
Using the fractional Taylor expansion for
, we have
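where $D^{\alpha}$ is the Caputo fractional derivative and has the following definition [37]:
$$D^{\alpha} f(x) = \frac{1}{\Gamma(n-\alpha)} \int_{0}^{x} \frac{f^{(n)}(s)}{(x-s)^{\alpha+1-n}}\, ds, \tag{5}$$
where α is the order of the derivative and n is the smallest integer greater than α.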
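To make the Caputo derivative concrete, the following sketch approximates it numerically for an order between 0 and 1 and compares the result with the known closed form for a power function. It uses the standard L1 finite-difference scheme, which is a technique swapped in here for illustration and is not the spectral method developed in this paper; the function name `caputo_l1` and all parameter values are assumptions.

```python
from math import gamma

def caputo_l1(f, x: float, alpha: float, n_steps: int = 2000) -> float:
    """Approximate the Caputo derivative of order 0 < alpha < 1 at x (L1 scheme)."""
    h = x / n_steps
    total = 0.0
    for k in range(n_steps):
        # Weight of the k-th backward increment of f.
        weight = (k + 1) ** (1 - alpha) - k ** (1 - alpha)
        total += weight * (f(x - k * h) - f(x - (k + 1) * h))
    return total / (gamma(2 - alpha) * h**alpha)

alpha, x = 0.5, 1.0
approx = caputo_l1(lambda s: s**2, x, alpha)
# Closed form for f(s) = s^2: Gamma(3) / Gamma(3 - alpha) * x^(2 - alpha).
exact = gamma(3) / gamma(3 - alpha) * x ** (2 - alpha)
```

For a linear function the scheme reproduces the closed form up to rounding, while for the quadratic above the two values agree only up to a discretization error that shrinks as `n_steps` grows.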
From Equation (
4), we obtain the following fractional differential equation that governs the mean absorption time:
In Equation (
6), the terms
and
may depend on evolutionary forces. Under different scenarios, these terms take on various forms, but they are always functions of
x. In this paper, we explore different functional forms for
and
to investigate how they influence the behavior of fixation time using Equation (
6). Solving this fractional differential equation requires boundary conditions, which depend on both the population parameters and the heterogeneity parameter α. In this paper, we consider the following two boundary conditions,
where
and
are two known values chosen based on biological interpretation. For example, setting
ensures, through these boundary conditions, that fixation occurs at the boundaries.
The solution demonstrates that the mean absorption (or fixation) time is influenced by
,
, and
, highlighting the role of population heterogeneity in shaping the temporal dynamics of allele fixation. In the following two sections, we develop a new numerical method for solving Equation (
6) to investigate the behavior of fixation time under various evolutionary forces.
Heterogeneity in Population Genetics and Fractional Calculus
In our earlier work [15], we introduced the first application of fractional calculus in population genetics to model the effects of heterogeneity. This approach generalizes the classical coalescent process using tools from fractional calculus, leading to what is known as the fractional coalescent. In classical coalescent theory, pairs of lineages are selected to coalesce based on equivalence classes and exponentially distributed waiting times. In contrast, the fractional coalescent modifies the distribution of waiting times to reflect fluctuations in offspring number caused by environmental heterogeneity.
The key insight from [15] is that when the variance in offspring number is not constant (due, for example, to changing environmental conditions), the waiting time distribution becomes dependent on the order of the fractional derivative, denoted by α.
Figure 1 illustrates how changes in α affect the distribution of the time to the most recent common ancestor (TMRCA). Here, α = 1 corresponds to no fluctuations, while smaller values of α indicate greater variability in offspring number. In this context, α serves as a measure of the intensity of fluctuations arising from heterogeneous environments. The figure assumes a population of size n.
In the fractional calculus framework, real datasets have also been analyzed, covering three species with distinct life history strategies: humpback whales, which are long-lived, highly mobile, and produce few offspring; the malaria parasite Plasmodium falciparum, which must adapt to both mosquito saliva and the vertebrate bloodstream; and influenza viruses, which face highly variable host immune responses, leading to considerable heterogeneity in transmission dynamics [15].
The analysis shows that environmental heterogeneity has minimal effect on the humpback whale and malaria parasite datasets. However, for the influenza dataset, heterogeneity appears to play a significant role and must be considered in population genetic inference. These findings emphasize that estimates of effective population size, and thus species diversity, can be strongly affected by whether heterogeneity is incorporated into the model [15].
In scenarios where gene frequency dynamics follow a Markov process, it has been shown that the average coalescent time equals the average fixation time [16, 17, 18]. Based on Figure 1, since coalescent time depends on the parameter α, which reflects the intensity of fluctuations due to environmental heterogeneity, it follows that fixation time also depends on α. In both the forward- and backward-time processes, α represents the order of the fractional derivative.
In the forward-time framework, this fractional order appears in the differential equation given in Equation (6). In this paper, we investigate how α influences fixation time by introducing a new numerical method for solving Equation (6). In future research, we plan to extend this analysis to real datasets to further explore the role of heterogeneity in evolutionary dynamics.
3. Eta-Based Functions
Eta functions were introduced by L. Gr. Ixaru [27] as a tool for deriving highly accurate CP methods for the Schrödinger equation. However, as was realized later on, they are of equal importance for the derivation by exponential fitting of numerical formulas for operations on highly oscillatory functions (numerical differentiation, quadrature, solving differential equations, interpolation, etc.); see [28, 31, 32, 33, 34, 35, 38, 39] and references therein.
The Eta functions are primarily intended to approximate functions of the forms
where frequency
is a given constant, and
are smooth and slowly varying functions. With them, different categories of functions become tractable on equal footing. Thus, when
is small, these functions are smooth, but when
is increased, their behavior is increasingly oscillatory or with hyperbolic variation.
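Eta functions $\eta_m(Z)$ with $m = 1, 2, 3, \ldots$ and $Z \neq 0$ are generated by recurrence [27, 28]:
$$\eta_m(Z) = \frac{1}{Z}\left[\eta_{m-2}(Z) - (2m-1)\,\eta_{m-1}(Z)\right], \tag{9}$$
where
$$\eta_{-1}(Z) = \begin{cases} \cos\left(|Z|^{1/2}\right), & Z \le 0, \\ \cosh\left(Z^{1/2}\right), & Z > 0, \end{cases} \qquad \eta_0(Z) = \begin{cases} \sin\left(|Z|^{1/2}\right)/|Z|^{1/2}, & Z < 0, \\ 1, & Z = 0, \\ \sinh\left(Z^{1/2}\right)/Z^{1/2}, & Z > 0. \end{cases} \tag{10}$$
For $Z = 0$ these functions have the following values:
$$\eta_m(0) = \frac{1}{(2m+1)!!}, \qquad m = 1, 2, 3, \ldots, \tag{11}$$
where $(2m+1)!!$ is a double factorial [27]. These functions have the following property [24]: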
These functions are used to introduce the set of Eta-based functions in the following way:
where
is the integer part of
, and
in the trigonometric case and
in the hyperbolic case. An important property of the Eta-based functions is that they tend to power functions when
(for more details, please see [
28]).
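The Eta functions themselves are straightforward to evaluate. The sketch below follows the standard definitions from Ixaru's work: the starting pair is built from cosine and sine (negative argument) or their hyperbolic counterparts (positive argument), higher indices come from the forward recurrence, and the values at zero are reciprocal double factorials. The function names are chosen here for illustration. The forward recurrence loses accuracy when the argument is very close to zero and the index is large, so this is a sketch for moderate arguments, not a production routine.

```python
from math import cos, cosh, sin, sinh, sqrt

def double_factorial(k: int) -> int:
    """k!! for odd positive k (product k * (k - 2) * ... * 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def eta(m: int, z: float) -> float:
    """Eta function of index m >= -1 at real argument z."""
    if z == 0.0:
        return 1.0 if m == -1 else 1.0 / double_factorial(2 * m + 1)
    root = sqrt(abs(z))
    prev = cos(root) if z < 0 else cosh(root)                 # index -1
    curr = sin(root) / root if z < 0 else sinh(root) / root   # index 0
    if m == -1:
        return prev
    for k in range(1, m + 1):
        # Forward recurrence: eta_k = (eta_{k-2} - (2k - 1) * eta_{k-1}) / z
        prev, curr = curr, (prev - (2 * k - 1) * curr) / z
    return curr

# As the argument shrinks, eta_1 approaches its value at zero, 1/3.
near_zero = eta(1, -1e-4)
```

Because each Eta function tends to a constant as its argument tends to zero, any product of a power of x with an Eta function behaves like a pure power in that limit, which is consistent with the limiting property stated above.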
3.1. Best Approximation Using Eta-Based Functions
Suppose that
and
be the set of Eta-based functions and
and
f is an arbitrary element in
H. Because
Y is a finite-dimensional vector space,
f has the unique best approximation out of
Y, such as
that is
Since
there exist the unique coefficients
, such that
where
and
Using the properties of the best approximation [
40], because
using Equation (
14), we have
where
and
and
denote inner product.
So we get
where
and
where
D is a dual operational matrix of order
and has the following definition:
In the next section, we use matrix D to introduce the operational matrix of fractional derivatives of Eta-based functions.
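The best-approximation step can be illustrated with a small, exact computation. The sketch below assumes the usual least-squares setting, in which the coefficients are found by solving a linear system whose matrix collects the inner products of the basis functions on [0, 1]. For simplicity it uses monomials as a stand-in basis rather than Eta-based functions, and all names are hypothetical.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Gauss-Jordan elimination in exact arithmetic; a must be square, nonsingular."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def best_l2_coefficients(power: int, n_basis: int):
    """Best L2[0,1] approximation of x**power in span{1, x, ..., x**(n_basis-1)}."""
    # Inner products of monomials: integral of x^i * x^j over [0, 1].
    gram = [[Fraction(1, i + j + 1) for j in range(n_basis)] for i in range(n_basis)]
    # Inner products of the target x**power with each basis function.
    rhs = [Fraction(1, power + i + 1) for i in range(n_basis)]
    return solve_linear(gram, rhs)

coeffs = best_l2_coefficients(power=3, n_basis=3)
```

Here `coeffs` holds the constant, linear, and quadratic coefficients of the quadratic closest to the cubic in the mean-square sense. With Eta-based functions the inner products would generally need numerical quadrature, but the structure of the linear system is the same.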
3.2. Operational Matrix of Fractional Derivatives
In this section, we introduce the fractional derivative matrix of the Eta-based functions. We use the series representation of the Eta functions in Equation (12) to derive the fractional derivative matrix.
Using Equation (
15), the fractional derivative matrix is defined as
where
in the
derivative matrix. Using Equation (
12) and the definition of the Caputo fractional derivative in Equation (
5), the element of the matrix
is defined as
where for the trigonometric and
we have
and for
we have
also, for the hyperbolic case and
, we have
and for
we have
and
where the matrix
D has been introduced in Equation (
17), and
N is the order of the truncated series defined in Equation (
12). In the next section, we use the operational matrix of fractional derivatives to introduce a numerical method for solving the fractional differential equation.
4. Numerical Method
In this section, we use the Eta-based function to solve the general fractional differential equation, which has the following form:
where
is a known function.
Let us assume, using Equation (
14), the best approximation of function
has the following form:
using Equations (
21) and (
18), we get
Substitute Equations (
21) and (
22) in Equation (
20), and we have
We apply the collocation method by requiring the residual of the problem in Equation (
23) to vanish at specific collocation points. These points are chosen as Newton–Cotes nodes
, given by [
41]
To enforce the boundary conditions, we formulate the following optimization problem:
Equation (25) defines a nonlinear programming problem, which can be solved using standard optimization techniques. In our implementation, we use Mathematica to solve for the unknown coefficient vector.
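The overall idea of collocation for a fractional differential equation can be sketched on a toy problem. Several simplifications are assumed here and none of them come from the paper: the basis is monomials rather than Eta-based functions, the nodes are equispaced interior points rather than the Newton–Cotes nodes of Equation (24), the model problem is a single Caputo derivative set equal to a known right-hand side with one condition at zero, and the resulting square linear system is solved directly instead of through nonlinear programming in Mathematica.

```python
from math import gamma

def caputo_monomial(k: int, alpha: float, x: float) -> float:
    """Caputo derivative of x**k for 0 < alpha < 1 (zero for the constant k = 0)."""
    if k == 0:
        return 0.0
    return gamma(k + 1) / gamma(k + 1 - alpha) * x ** (k - alpha)

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def collocate(alpha: float, rhs, degree: int, u0: float):
    """Coefficients c_k of u(x) = sum c_k x**k with D^alpha u = rhs and u(0) = u0."""
    nodes = [(i + 1) / (degree + 1) for i in range(degree)]  # equispaced, interior
    rows = [[1.0] + [0.0] * degree]                          # condition u(0) = c_0
    rows += [[caputo_monomial(k, alpha, x) for k in range(degree + 1)] for x in nodes]
    return solve_linear(rows, [u0] + [rhs(x) for x in nodes])

alpha = 0.5
# Manufactured problem whose exact solution is u(x) = x**2.
coeffs = collocate(alpha, lambda x: 2.0 / gamma(3 - alpha) * x ** (2 - alpha),
                   degree=3, u0=0.0)
```

Because the exact solution lies in the span of the basis, the recovered coefficients are close to (0, 0, 1, 0). In the method described above, the residual conditions are instead combined with the boundary conditions in an optimization problem, which also allows constraints that a plain linear solve cannot express.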
4.1. Error Bound
This section assesses the error norm associated with the numerical scheme outlined in Section 4. We first state the following theorem, which serves as the foundation for the subsequent discussion.
Theorem 1. Suppose that belongs to the Sobolev space and , and are the Eta-based functions defined on the interval . Assume that denotes the best approximation of a function . Then we have [42]
where is a small number depending on Eta-based functions, and
In the next theorem, we conduct an error analysis for the numerical scheme presented in Section 4.
Theorem 2. Suppose that belongs to the Sobolev space and , and are the Eta-based functions defined on the interval . Assume that denotes the best approximation of a function . Then
where is the error bound.
Proof. If
is the best approximation of
, using Equation (
20), we have
and using Equation (
27), we have
We assume
and
, so we have [
43]
Using Equations (
26), (
28), and (
29), the result can be obtained. □
Remark 1. The results of Theorem 2 show that the convergence rate depends not only on the number of basis functions but also on the order of the fractional derivatives, α. This contrasts with the classical spectral method for solving dynamical systems, where the convergence rate depends solely on the number of basis functions.
4.2. Numerical Example
In this section, we apply the numerical method introduced in Section 4 to solve a set of test cases and evaluate the accuracy of the approach. For the computations in this section, we used Mathematica 14.1.0 for Mac OS X (64-bit) (16 July 2024). The reported error is the absolute error, computed for different scenarios of the problems.
4.3. Advantages of Eta-Based Functions
In this section, to demonstrate the advantages of using Eta-based functions, we revisit two important cases from the previous section (Cases 2 and 3). We apply the same numerical method introduced in Section 4, but this time using Legendre polynomials, and compare the results with those obtained using Eta-based functions.
Table 4 and
Table 5 present the absolute errors for both Eta-based functions and Legendre polynomials, using the same number of basis functions (
), across different values of
. The results align with those reported in [
23], highlighting the superior performance of Eta-based functions over traditional basis functions such as Jacobi polynomials, especially in highly oscillatory regimes.
To provide a layman’s interpretation, the absolute errors reported in
Table 4 and
Table 5 measure how far the numerical results deviate from the expected or exact fixation time. For example, an error of
corresponds to a deviation of less than
from the expected value, while an error of
reflects a deviation of less than
. In our results, the Eta-based functions consistently produce errors on the order of
to
, which implies extremely high accuracy—essentially, the results are accurate to more than 12 decimal places. In contrast, Legendre polynomials yield errors ranging from
to as high as
, which can mean deviations of
or more in some cases. This stark contrast illustrates that using traditional basis functions like Legendre polynomials can lead to substantial inaccuracies, particularly for larger values of
, which correspond to more oscillatory behavior in the solution. In practical terms, this means that models relying on Legendre polynomials may significantly misestimate fixation time in complex or rapidly changing environments, while Eta-based functions maintain reliable precision even under such challenging conditions.
5. Fixation Times in Population Genetics
In this section, we apply the numerical method presented in Section 4 to solve Equation (6), which governs the fixation time. This allows us to study the behavior of fixation time in different scenarios. Using Equations (6), (21), and (22), we have
We apply the collocation method by requiring the residual of the problem in Equation (33) to vanish at the Newton–Cotes nodes defined in Equation (24). To enforce the boundary conditions in Equation (7) and ensure that the fixation time remains non-negative, we formulate the following optimization problem:
Equation (34) defines a nonlinear programming problem, which can be solved using standard optimization techniques. In our implementation, we use Mathematica to solve for the unknown coefficient vector.
We explored several scenarios to study the behavior of the fixation time across different values of α, which we interpret as a parameter reflecting the effect of heterogeneity. These scenarios involved considering different functional forms for the two x-dependent terms in Equation (6). Additionally, we examined different forms of the Eta functions, including both trigonometric and hyperbolic cases.
Figure 4 illustrates how fixation time changes with allele frequency
x under the condition
. We consider two specific cases:
and
, evaluating both the trigonometric and hyperbolic forms of the Eta function separately and focusing on small values of
(
).
Figure 5 and
Figure 6 continue with the assumption
under different scenarios, including
,
, and
, again for small
values (
). The results in
Figure 4,
Figure 5 and
Figure 6 show that despite the different scenarios, fixation time consistently decreases for small allele frequencies (
) and increases for large allele frequencies (
).
We also investigated the behavior of fixation time for a larger value of
, specifically
. In this case, we again assumed
and explored several functional forms:
,
,
, and
. The results, presented in
Figure 7, include both trigonometric and hyperbolic forms of the Eta function. These results reveal that fixation time increases for small allele frequencies (
) and decreases for large allele frequencies (
).
Taken together, the results from
Figure 4,
Figure 5,
Figure 6 and
Figure 7 show that when
, the behavior of fixation time is qualitatively similar across different functional forms. For both small (
) and large (
) values of
, fixation time tends to exhibit an extremum around
.
To draw a more general conclusion, we also considered the case where
.
Figure 8 and
Figure 9 present the results for small and large values of
, respectively. These scenarios involve different choices for
and
in both trigonometric and hyperbolic Eta cases. The results again confirm the presence of an extremum in fixation time near
. The main difference in this case is that the transition between having a minimum versus a maximum at the extreme point occurs more gradually compared to the
case.
The numerical results show that fixation time depends on both the heterogeneity parameter α and the initial allele frequency. This contrasts with the classical approach, where heterogeneity is not considered and fixation time depends solely on the initial allele frequency. Specifically, for small values of α, fixation time consistently decreases when the allele starts at low frequencies and increases when it starts at high frequencies. In contrast, for larger values of α, this pattern may reverse, with fixation time increasing for low initial frequencies and decreasing for high ones.
Biologically, this suggests that heterogeneity interacts with the initial genetic composition of the population in shaping allele dynamics. In microbial populations, where beneficial mutations typically arise at low frequencies, a highly heterogeneous environment (corresponding to small α) may lead to more rapid fixation and faster adaptation. In more homogeneous settings, however, such mutations may take longer to fix unless they reach a higher frequency early in the process. Similarly, in tumor evolution, resistance mutations that appear at low frequencies may fix more rapidly in a heterogeneous tumor microenvironment, potentially reducing the window of opportunity for effective treatment. In contrast, a more uniform microenvironment may slow down this process unless resistant clones grow sufficiently large. These results indicate that our framework captures nuanced dynamics that are directly relevant to real-world evolutionary scenarios, particularly in systems where spatial or environmental heterogeneity plays a critical role.
6. Conclusions
In this study, we introduced a novel forward-time framework for estimating fixation time in population genetics by incorporating the effects of environmental heterogeneity using fractional calculus. Traditional models, such as the Wright–Fisher and classical coalescent approaches, assume uniform environments and equal reproductive success, which limits their ability to capture the complexity of real-world biological systems. By introducing a fractional parameter α, our model explicitly accounts for the impact of heterogeneous environments on reproductive dynamics.
To solve the resulting fractional differential equations, we developed a spectral method based on Eta-based functions. These functions proved highly effective in approximating unknown or oscillatory solutions, enabling a computationally efficient and accurate numerical approach. By leveraging the operational matrix of fractional derivatives, we reduced the problem to an optimization framework, making the method scalable and practical for large systems.
Our numerical results demonstrated the consistency and robustness of the method across different parameter settings and basis functions, validating its utility in modeling fixation dynamics under varying degrees of heterogeneity. However, as with all modeling approaches, certain limitations should be noted. The current model assumes constant environmental heterogeneity over time and does not yet account for spatial structure, stochasticity, or gene flow. Addressing these factors could further enhance the model’s applicability and realism.
Despite these limitations, the proposed framework offers promising avenues for real-world applications. In particular, it may be relevant for modeling microbial adaptation in fluctuating environments, understanding clonal dynamics in cancer genomics where selection pressures vary across tissues, or exploring fixation processes in epidemiological contexts involving heterogeneous host immunity or transmission rates.
Overall, this work contributes both theoretical and computational advances toward understanding fixation time in heterogeneous populations, and it opens up several directions for future research in applied evolutionary modeling.