Computation of the Likelihood in Biallelic Diffusion Models Using Orthogonal Polynomials

In population genetics, parameters describing forces such as mutation, migration and drift are generally inferred from molecular data. Lately, approximate methods based on simulations and summary statistics have been widely applied for such inference, even though these methods waste information. In contrast, probabilistic methods of inference can be shown to be optimal, if their assumptions are met. In genomic regions where recombination rates are high relative to mutation rates, polymorphic nucleotide sites can be assumed to evolve independently from each other. The distribution of allele frequencies at a large number of such sites has been called “allele-frequency spectrum” or “site-frequency spectrum” (SFS). Conditional on the allelic proportions, the likelihoods of such data can be modeled as binomial. A simple model representing the evolution of allelic proportions is the biallelic mutation-drift or mutation-directional selection-drift diffusion model. With series of orthogonal polynomials, specifically Jacobi and Gegenbauer polynomials, or the related spheroidal wave function, the diffusion equations can be solved efficiently. In the neutral case, the product of the binomial likelihoods with the sum of such polynomials leads to finite series of polynomials, i.e., relatively simple equations, from which the exact likelihoods can be calculated. In this article, the use of orthogonal polynomials for inferring population genetic parameters is investigated.


Introduction
Population genetics is concerned with the evolution of frequencies of heritable variants (alleles) at specific positions in the genome (loci) in natural and domestic populations.The main forces influencing this evolution of allele frequencies are: mutation, migration, drift, linkage, and selection.
In genomic regions, where recombination rates are high relative to mutation rates, polymorphic nucleotides or sites can be assumed to evolve independently, i.e., linkage can be ignored.The distribution of allele frequencies at a large number of such sites in a sample has been called "allele-frequency spectrum" or "site-frequency spectrum" (SFS).Some classes of sites are assumed not to be influenced directly by selection, e.g., some regions in short introns (non-coding insertions in protein coding genes) and fourfold degenerate sites (sites at third positions in codons that do not influence the amino acid in the polypeptides) [1].With these classes of sites, only mutation and demographic forces, e.g., drift and migration, are assumed to be relevant.
In most organisms studied so far, e.g., fruit flies (Drosophila) or mammals including humans, most sites in moderate samples (up to about 100 individuals) are monomorphic; only at some sites a single allele segregates in the population, while sites with more than two segregating alleles are extremely rare.With such a low proportion of polymorphism, a simple biallelic model is adequate, even though each site may assume four states corresponding to the four bases: adenine, cytosine, guanine, and thymine.
Most naturally, the evolution of allele frequencies is modeled forward in time as a Markovian random walk from one generation to the next.The best known such model, the Wright-Fisher model [2,3], uses diploid individuals and binomial sampling of individuals for the transition to the next generation.For large population sizes and if parameters are scaled appropriately, the Wright-Fisher model and other similar models, e.g., the Moran model [4], converge to the same diffusion equation.In population genetics, the use of diffusion equations is associated with the work of Motoo Kimura (1924Kimura ( -1994) ) [5,6].In the diffusion limit, allele frequency counts are usually replaced by the allelic proportion x, a continuous quantity ranging between zero and one.Often, solutions that are difficult or impossible to derive with the discrete models can be obtained relatively easily with the diffusion approach.The diffusion model can either be taken as an approximation to the discrete models or as a model in its own right.
While occasionally new results are presented, this article is mainly a review.Population genetic parameters are inferred from site frequency spectra with the diffusion approach.Note that sudden changes in parameters, such as the (effective) population size, may lead to discontinuous jumps in the process, which cannot naturally be modeled by diffusion.These are not subject of this article.Neither are alternative approaches, such as branching processes.In particular, orthogonal polynomials (modified Jacobi and Gegenbauer polynomials) are used to solve the diffusion equation.Furthermore, the use of the oblate spheroidal wave function for solving a model with directional selection and drift is presented.Other methods to analyze diffusion models, such as the calculation of moments [7,8], are not covered.Due to the importance of Ewen's book [9] in the field of theoretical population genetics, subjects not covered in the book receive special attention.In particular, data analysis with the diffusion approach is reviewed also for equilibrium data, which generally do not require the use of orthogonal polynomials, but the equilibrium distributions may serve as prior distributions in a Bayesian context.
The currently available tools implementing these approaches are limited: functions for Jacobi and Gegenbauer polynomials as well as the oblate spheroidal wave function are available in the formula manipulation programs "Mathematica" [10] and "Maple" [11] for computation and visualization.Song and Steinrücken [12] provide methods and an implementation to solve the Kolmogorov backward equation using modified Jacobi and Gegenbauer polynomials.

Moran and Diffusion Models
Assume a population of N haploid individuals; each may assume the state of zero or one, corresponding to the two arbitrarily labeled alleles.With the decoupled Moran model [13][14][15], either (i) (mutation) at a rate of µ = µ 0 + µ 1 , a random individual i is picked to mutate to type one with probability α = µ 1 /µ or to type zero with probability β = µ 0 /µ; or (ii) (genetic drift) at a rate of one, a random individual i is replaced by another random individual j.Setting θ = µN , the rate of change of the allelic proportion x of the mean per unit time is caused by mutation: and that of the variance by genetic drift: Scaling space with 1/N and time with 1/N 2 and taking the appropriate limits, the Kolmogorov forward (or Fokker-Planck) generator of the process becomes: The forward diffusion equation: then describes the evolution of the probability of the allelic proportion x forward in time t, i.e., in the same temporal direction as the transitions in the discrete Wright-Fisher and Moran models.By contrast, coalescence theory looks backward in time (e.g., [16]).(In the following, the use of orthogonal polynomials to solve this diffusion equation will be explained.This is necessarily rather technical; however, for most results only rather elementary mathematical manipulations are needed.)While this article focuses on the Kolmogorov forward equation, some results can more easily be derived using the Kolmogorov backward generator: On the interval [0, 1] we are looking for solutions of the Kolmogorov backward equation: of the form φ(x, t) = e −λ i t f i (x): where i indexes the eigenvectors.
The forward equation can be transformed to Sturm-Liouville or self-adjoint form by substituting For Sturm-Liouville equations [17] of the form: it can be shown that all eigenvalues λ i are real and can be ordered such that Corresponding to each eigenvalue λ i is a unique (up to a normalization constant) eigenfunction f i (x), which has exactly i zeros in the interval.The normed eigenfunctions form an orthonormal basis: where δ i,j denotes the Kronecker delta, i.e., δ i,j is zero for i = j and one for i = j, of the Hilbert space L 2 ([a, b], w(x)dx).The function w(x) is called the weight function.
The Kolmogorov backward Equation ( 7) can be obtained from the above Sturm-Liouville equation (the last line of Equation ( 8)): Thus, multiplication with the weight function: transforms a solution of the backward equation into that of the forward equation (see also Formula (1.7) in [18]).

Modified Jacobi Polynomials
The backward Equation ( 7) is closely related to the differential function fulfilled by the classical Jacobi polynomials [19].One can either modify the backward Equation (7) to fit the Jacobi polynomials (e.g., [18]) or the Jacobi polynomials to fit the backward equation (e.g., [12]).We will follow the latter strategy.
Define the modified Jacobi polynomials: where the P (α,β) i (z) are the classical Jacobi polynomials [19].It can be shown that these modified Jacobi polynomials fulfill the backward Equation (7) with the corresponding eigenvalues: With the weight function w (θ,α) (x), the modified Jacobi polynomials are orthogonal: where δ i,j denotes the Kronecker delta, i.e., δ i,j is zero for i = j and one for i = j.The proportionality constant depends on i, θ, and α: The set of R (θ,α) i (x) forms a basis of the Hilbert space L 2 ([0, 1], w (θ,α) (x)dx) [12].For i ≥ 1, the R (θ,α) i (x) satisfy the recurrence relation: while R (θ,α) 0 [12].Recall that multiplication with the weight function w (θ,α) (x) transforms an eigenvector of the backward equation into that of the forward equation.If θ > 0, the forward equation has a stationary beta density beta(x | αθ, βθ) proportional to the weight function:

Series Expansion; Approximation of Functions by Orthogonal Polynomials
We have now established that the backward density is given by an expansion of the form: The constants c i need to be determined such that the initial conditions are met, i.e., a probability density f (x), defined within the interval, is represented by the series expansion: The coefficients c i in an expansion up to order n are determined by minimizing a weighted least squares error function.
Since the following considerations hold generally for all orthogonal polynomials, we now use arbitrary intervals between a and b, the symbol f i (x) for the ith orthogonal polynomial, and w(x) for the weight function associated with the f i (x): Differentiating with respect to c i : and setting equal to zero, we get: From the orthogonality relation, we have b a f 2 i (x)w(x) dx = ∆ i δ i,j .Thus, we set the coefficients for the backward equation to: The forward expansion can be obtained from the backward expansion by multiplication with the weight function (see Equation ( 11)): As in the case of the backward expansion, the constants c i are determined such that the initial conditions are met, i.e., an initial probability density f (x), defined within the interval between a and b, is represented by the series expansion of orthogonal polynomials f i (x) with the weight function w(x): The coefficients are now determined by minimizing the weighted least squares error function: With similar considerations as for the backward case, we find: Returning to the mutation drift diffusion, we note that often an initial density corresponding to a Dirac delta function at a point p in [0, 1], f (x) = δ(x − p), is considered (e.g., [20]); then: Substituting these coefficients into Equation ( 25), we get: This corresponds to Formula (4.68) in [9], where the eigenfunctions are assumed to be normed, such that division by the proportionality constant ∆ i is unnecessary.(Note that exchanging x and p transforms the right side of this equation into that of the corresponding backward equation; compare Formula (5) in [12], where the backward equation is used.) Returning to the modified Jacobi polynomials, we note that, from the orthogonality relation Equation ( 15) and R (θ,α) 0 (x) = 1, it can be deduced for all i ≥ 1 and thus also for all times: Therefore the probability mass over the whole interval [0, 1] comes only from the equilibrium term, i.e., the beta density Equation (18); all other terms R (θ,α) i (x) w (θ,α) (x) with i ≥ 1 shift this mass within the interval.Note that a polynomial times a beta density results in a weighted sum of beta densities.

Example: A Change in the Scaled Mutation Rate with Modified Jacobi Polynomials
As an example, assume that the population had been in equilibrium with parameters α and θ a , to switch to a new mutation bias θ c at time t c .Then the expansion until time t c contains only the equilibrium beta density.The change of the mutation bias necessitates a change in the eigenvectors from w (θa,α) (x) R (θa,α) i (x) to w (θc,α) (x) R (θc,α) i (x).The coefficients for the new eigensystem are (compare Formula (28)): The evolution of the proportion f (x) between t c and the present time is given by the series expansion Equation (25) with the c i from Equation (32).
While one such change may not be too cumbersome to implement in a computer program, approximating rapidly changing population sizes by many piecewise linear changes can be, since then equilibrium has not been reached and for each change a sum over all terms in the expansion is needed, such that Equation (32) needs to be modified to: where the τ i (t) are the time-dependent coefficients.

Equilibrium
For θ > 0, the beta density beta(x | αθ, βθ) is the equilibrium or stationary solution of the forward diffusion process [3].
Given a single sample of size M N with a frequency y of the first allelic type, the likelihood conditional on the population allelic proportion x is naturally a binomial: The joint distribution of y and x after multiplication with the equilibrium beta density Equation ( 18) is: Integrating out x results in the likelihood, a beta-binomial compound distribution: Site frequency spectra (SFS) can be considered samples of identical sample size M from L biallelic loci, indexed by l (1 ≤ l ≤ L), with the allelic proportions x l drawn independently from a beta density with common α and θ.Let L 0 , . . ., L M represent the counts of alleles of the first type in the samples.The likelihood then is a product of beta-binomials: Interest is centered on obtaining (maximum-likelihood) estimates of θ and α given the vector of allelic counts (L 0 , . . ., L M ) or, in a Bayesian context, their posterior distribution given a suitable prior.As a function of α, the distribution is a polynomial; as a function of θ, the distribution is a rational function.A rational function can be integrated by partial fraction decomposition.If auxiliary variables that count the number of mutations in each allelic class conditional on θ, α, y and M are introduced, an expectation maximization algorithm may be used for finding the maximum likelihood estimates [21].

Outside Equilibrium
If the population size or the mutation bias has changed recently, a population will be outside equilibrium.Then instead of the equilibrium beta density Equation (18) the expansion in Equation (32) or Equation (33) needs to be used.Since in both cases, the series conforms to a weighted sum of beta distributions, integration to obtain the likelihood can be performed relatively easily; however, the author is not aware of an implementation of this algorithm.With formula manipulation programs, e.g., "Mathematica" [10] or "Maple" [11], the Jacobi polynomials are readily available, such that it is possible to program these algorithms relatively easily.

Selection and Drift Diffusion with Mutations from the Boundaries
Another tradition in theoretical population genetics follows the fate of a single mutant allele to calculate, e.g., the probability of fixation of the mutant allele or the time until its fixation or loss [9].While the allele is polymorphic, directional selection with a scaled strength of γ and drift are the forces usually considered.Importantly, mutation is assumed to be negligible within the polymorphic region, i.e., in 1/N ≤ x ≤ (N − 1)/N .While only drift (or selection and drift) governs the dynamics within the polymorphic region, mutations may be considered as boundary terms.The fact that no mutations are considered within the polymorphic region means that expansions using the Gegenbauer polynomials instead of the Jacobi polynomials can be used.This change simplifies calculations.
Two different ways of approaching the problem with selection are presented: in the Appendix, the forward equation is transformed to the spheroidal wave equation, for which excellent computation tools are available; in the main text, the strategy of Song and Steinrücken [12] is followed, which will likely be more familiar to population geneticists.
From the corresponding Moran model, e.g., [15], the change in the mean is now inferred to be: After scaling, the corresponding forward diffusion equation is for 1/N < x < (N − 1)/N : Again from the corresponding Moran model, it can be deduced that the flow from x = 1/N to x = 0 consists of drift and that between x = (N − 1)/N and x = 1 of selection and drift; after appropriate scaling: By multiplication with the weight function w(x) = e γx x −1 (1 − x) −1 and substituting the series expansion, the forward Equation ( 39) can be transformed to Sturm-Liouville form: From this the backward equation can be obtained: Again, we are looking for an eigensystem of this Sturm-Liouville problem.We proceed indirectly, by first obtaining a solution for the neutral system, i.e., without selection, and then deriving eigenvectors as linear combinations of this eigensystem.

Pure Drift within the Polymorphic Region
Consider the pure drift forward generator: and the corresponding backward generator: As before, either the generator can be modified to that of the classical Gegenbauer polynomials [19] as in [20], or the Gegenbauer polynomials to fit the generator.Choosing the latter strategy again, the orthogonal polynomials solving the backward equation are the modified Gegenbauer polynomials [12] with the weight function: and the proportionality constant: The first two polynomials are G 0 (x) = −x(1 − x) and G 1 (x) = x(1 − x)(2 − 4x) and the recurrence relation to calculate all other polynomials is: Furthermore, the eigenvalues are: As before, the eigenvectors of the forward series expansion U i (x) can be obtained from those of the backward expansion by multiplication with the weight function w(x): Since all eigenvalues are greater than zero, there is no equilibrium term in this expansion.Without replenishing mutations, probability weight is lost continually towards the boundaries zero and one.It is convenient to already include this behavior into the eigenfunctions by including boundary terms at zero and one [22,23], where the deduction made corresponds to what is expected to fix eventually.One can show that: Therefore the forward eigenvectors H i (x) are defined as: where δ(x) is the Dirac delta function.
A probability density f (x) defined between zero and one can be represented by an expansion of the H i (x): where: have point masses at the boundaries, these are included in this integration.The coefficients c i can be calculated using: where the limit indicates that the integration includes only the polymorphic region, i.e., no point masses at the boundaries.
In contrast to the classical solution of Kimura [20], this solution also accounts for the dynamics at the boundaries.For the case of an initial Dirac delta distribution at a proportion p inside the polymorphic region, Tran et al. [23] derive the analogous eigenexpansion using the classical Gegenbauer polynomials (Equation ( 20) in [23]).
For real data, we do not know the true proportion p and rather have an estimate of p given a sample.Then polynomials are more useful as initial distributions, which will be illustrated with examples below.

Equilibrium of Mutations from the Boundaries and Drift; No Outgroup Information
Assuming no outgroup information and equilibrium, i.e., the same model as used for deriving Equation ( 36), but small scaled mutation rates and the other Poisson Random Field (PRF) assumptions, RoyChoudhury and Wakely [29] derive the distribution of polymorphic sites in a sample of L loci and M haploid individuals to be Poisson with mean: If we set ϑ = 2αβθ, this leads to a maximum likelihood estimator of variability that is identical to that of Ewens and Watterson: The same estimator can also be derived without assuming a PRF, but instead expanding the likelihood (Equation ( 37)) into a power series in ϑ at zero, keeping only terms up to first order in ϑ [21].The likelihood (Equation ( 37)) then becomes proportional to: With a model with mutations from both boundaries, the equilibrium density analogous to that in Equation ( 56) is [30]: The joint distribution of this density and the binomial likelihood is for polymorphic alleles, i.e., 1 ≤ y ≤ (M − 1): Integrating this joint distribution over x in the limit of N → ∞ also results in the marginal distribution Equation (62).
The two monomorphic classes L 0 and L M may be combined to obtain a marginal likelihood, from which the same maximum likelihood estimator as in Equation ( 61) can be derived.As long as 1 (which is usually fulfilled), the Poisson approximation can be derived from the marginal likelihood of L p given L, M , and ϑ [21]: Then Lp H M is a maximum likelihood estimator for Lϑ, which corresponds to the parameter θ of RoyChoudhury and Wakeley [29].
Only with small scaled mutation rates, maximum likelihood estimators of ϑ can be obtained relatively easily.With most real data, small scaled mutation rates, i.e., ϑ < 0.0125, are usually observed.This is also the parameter range, where use of outgroup data would enhance analyses, but the outgroups are never ideal.In fact, data will usually conform to a "joint frequency spectrum", where sample sizes in different populations or species may differ.If data from a second population come from a single diploid individual in Hardy-Weinberg equilibrium, the haploid sample size there will be two.Use of such data requires non-equilibrium approaches as in [31].For small scaled mutation rates, a probabilistic model using orthogonal polynomials is formulated in [30].

Example for the Use of Gegenbauer Polynomials: Evolve and Resequence
As is obvious from the preceding subsections, samples from a single time point from a population assumed to be in an equilibrium of mutations from the boundaries and drift do not require orthogonal polynomials.To demonstrate the use of orthogonal polynomials, data that might occur in an "Evolve and Resequence" experiment, e.g., [32], will be analyzed in this subsection.Assume that a base population of, e.g., N = 200 fruit flies (Drosophila melanogaster) is taken from a wild population that is assumed to be in equilibrium.In a cage, the population evolves without selection for t/N generations.Within the short times customary in such studies, mutations are unlikely and can be ignored.At a certain locus, the initial sample size from the base population is M = 5; y = 3 alleles are of the first allelic type.Conditional on the allelic proportion x, the likelihood is binomial and the joint distribution is given in Equation (35): This is actually a polynomial of degree three and proportional to a beta(x | 3, 2) density, which can be represented exactly by a series of the modified Gegenbauer polynomials H i up to the appropriate degree.The loss of variation from a beta(x | 3, 2) distribution within the polymorphic region over t/N generations is shown in Figure 1.
Consider two time points and an even smaller sample size that allows calculation in the text.In particular, assume that the sample size of the initial sample is M 0 = 3 with two alleles of the first type y 0 = 2. Thus the joint distribution of the sample y 0 and the allelic proportions x is: This polynomial can be represented by the modified Gegenbauer polynomials of degree up to one: c 1 = − 3 4 αβθ and c 0 = − 3 2 αβθ.At time t 1 , before considering the second sample, the probability mass of the joint density has diminished in the polymorphic region: while it has grown at the boundaries: and The likelihood of a second sample at time t 1 of size M 1 = 3 with y 1 = 3 alleles of the first type, i.e., a monomorphic sample, is binomial: x 3 .The joint probability consists of an interior and a boundary part.From the interior part of the joint distribution, x can be integrated out: Summing the interior and the boundary parts, the likelihood of the two samples is obtained: Note that the parameters αβθ pertain to the base population.With respect to drift during the experiment, the single parameter in this likelihood is t 1 , i.e., the time in generations normed by the (effective) population size.Usually, the number of generations is known in "evolve and resequence" studies, such that the (effective) population size N can be estimated.This may or may not coincide with the census population size, which is also usually known.
Note that the above computation is much simpler than the use of the transition probabilities of the Wright-Fisher model with the effective population size N as a parameter [33,34].As the data at the time that those articles were published (about 2000) were usually microsatellites, rather than single-nucleotide polymorphisms, these methods provide for multiple alleles.
Using the statistical language "R" [35] with the high-precision algebra package "Rmpfr", likelihoods for sample sizes of about 50 can be calculated within minutes with this method.

Selection and Drift
In the following subsection, the approach of Song and Steinrücken [12] is followed (their section: "Diffusion with Genetic Selection and No Mutation").While the numerical methods of these authors are less advanced than those implemented in the commercial packages (see the Appendix), their general method based on the modified Jacobi polynomials is also applicable in cases with mutation and dominance.
Our goal is to find eigenfunctions V i (x) and the associated eigenvectors Λ i of the backward generator: The V i (x) are orthogonal with respect to the weight function w(x) = e γx x −1 (1 − x) −1 , such that: x G i (x) into this equation, it can be verified that the K i (x) are also orthogonal with respect to the same weight function: Therefore, even though the K i (x) are not eigenfunctions of the backward generator L b (x), linear combinations of the K i (x) can be used to represent V j (x): where the u j,i are constants to be determined.Substituting K i into the backward operator results in: Using this result together with Equations ( 73) and (76) leads to: For i ≥ 0, with the recurrence relation Equation ( 47), one can show that: where: For j ≥ 0, multiplying this system of equations with G j (x) and integrating with respect to the weight function x −1 (1 − x) −1 yields a system of equations.In matrix form, this system can be written as: The eigenvalues Λ j correspond to the eigenvalues of the operator L b with the associated eigenvectors u j = (u j,0 , u j,1 , u j,2 , u j,3 , u j,4 , . . .).Note that this band-diagonal system can be subdivided into two independent tridiagonal systems for even and odd i and j.While this system has infinite size, it can be truncated at dimension D to obtain an approximation, with little loss [12].The eigenvalues of tridiagonal matrices with real coefficients can be obtained relatively quickly.Furthermore, approximate solutions to the eigenvalues can be improved using continued fractions [36].Nevertheless, so far the only implementation seems to be that in [12], where the backward equation is considered.The eigenvectors of the forward equation can be obtained by multiplication with the weight function e γx x −1 (1 − x) −1 .A solution of the forward diffusion equation has not been implemented yet, as far as the author is aware of.

Conclusions
A biallelic locus subject to the population genetic forces such as mutation, drift and selection can be modeled using diffusion equations.These diffusion equations can be solved using orthogonal polynomials; the case of pure drift using the Gegenbauer polynomials, the case of mutation and drift using Jacobi polynomials, and the case of selection and drift using spheroidal wave functions.The theory for using series of orthogonal polynomials to solve the corresponding diffusion equations has been elaborated in detail.By adjusting the coefficients in the expansion any initial distribution may be approximated.In genomic regions of relatively high recombination rates and low mutation rates, each polymorphic nucleotide can be assumed to evolve independently.Samples from such regions have been called site frequency spectra.Assuming equilibrium, the joint distribution of the allelic proportion x and the data y of each such site can be modeled as a linear combination of eigenvectors of the forward equation up to an order determined by the sample size.With this, it is thus possible to condition on samples from two time points, as with ancient DNA or "evolve and resequence" studies, or use outgroup information.
A major advantage of using diffusion equations and orthogonal polynomials over competing methods, e.g., approximate Bayesian computation [37], which uses summary statistics, or even alternative methods based on the solution of diffusion equations [31], is that the relevant distributions may be calculated exactly and without loss of information, or can at least be approximated very efficiently.Furthermore, the well-developed theory of orthogonal polynomials connects population genetics to other more advanced disciplines, e.g., theoretical physics.
It can be transformed to the Sturm-Liouville form by setting g(z)(1 − z 2 ) −1/2 = v((1 − z)/2), since: The last line is in Sturm-Liouville form (see Equation ( 9)).It also corresponds to which is generally used for spheroidal wave functions ( [19], Chapter 21).As can be seen from Equation (9), the weight function is unity, such that the forward and backward equations are identical.The condition c 2 < 0 actually defines the oblate spheroidal wave functions.For c 2 = 0, corresponding to the case without selection, Equation (A7) reduces to the differential equation of the associated Legendre function ( [19], Chapter 8): While the spheroidal wave functions and the associated Legendre functions solving the above equations are not strictly polynomials, much of the theory of orthogonal polynomials also applies to them, such that any initial function can be approximated by a series of Legendre functions.