Ballou’s Ancestral Inbreeding Coefficient: Formulation and New Estimate with Higher Reliability

Simple Summary: Deleterious recessive alleles causing inbreeding depression may be eliminated from populations through purifying selection facilitated by inbreeding. Providing evidence of this phenomenon (i.e., inbreeding purging) is of great interest for conservation biologists and animal breeders. Ballou’s ancestral inbreeding coefficient ($F_{\text{BAL-ANC}}$) is one of the most widely used pedigree-based measurements to detect inbreeding purging, but the theoretical basis has not been fully established. In this report, the author gives a mathematical formulation of $F_{\text{BAL-ANC}}$ and proposes a new method for estimation based on the obtained formula. A stochastic simulation suggests that the new method could reduce the variance of estimates, compared to the conventional gene-dropping simulation.

Abstract: Inbreeding is unavoidable in small populations. However, the deleterious effects of inbreeding on fitness-related traits (inbreeding depression) may not be an inevitable phenomenon, since deleterious recessive alleles causing inbreeding depression might be purged from populations through inbreeding and selection. Inbreeding purging has been of great interest in conservation biology and animal breeding, because populations manifesting lower inbreeding depression could be created even with a small number of breeding animals, if inbreeding purging exists. To date, many studies intending to detect inbreeding purging in captive and domesticated animal populations have been carried out using pedigree analysis. Ballou’s ancestral inbreeding coefficient ($F_{\text{BAL-ANC}}$) is one of the most widely used measurements to detect inbreeding purging, but the theoretical basis for $F_{\text{BAL-ANC}}$ has not been fully established. In most of the published works, estimates from stochastic simulation (gene-dropping simulation) have been used. In this report, the author provides a mathematical basis for $F_{\text{BAL-ANC}}$ and proposes a new estimate by hybridizing stochastic and deterministic computation processes. A stochastic simulation suggests that the proposed method could considerably reduce the variance of estimates, compared to ordinary gene-dropping simulation, in which whole gene transmissions in a pedigree are stochastically determined. The favorable property of the proposed method results from the bypass of a part of the stochastic process in the ordinary gene-dropping simulation. Using the proposed method, the reliability of the estimates of $F_{\text{BAL-ANC}}$ could be remarkably enhanced. The relationship between $F_{\text{BAL-ANC}}$ and other pedigree-based parameters is also discussed.


Introduction
Inbreeding is defined as a mating between relatives and is unavoidable in small populations [1]. Inbred individuals show a reduction in phenotypic performance, especially in fitness-related traits, which is a phenomenon known as inbreeding depression [1-3]. Inbreeding depression has been documented in various animal and plant species [1,4-6]. It has been considered that inbreeding depression is largely caused by partial dominance, i.e., the existence of partially deleterious recessive alleles, although over-dominance and epistasis may also play a role [3,7].
In the present report, I translate Ballou's definition of $F_{\text{BAL-ANC}}$ into a mathematical expression. Although the deterministic computation of $F_{\text{BAL-ANC}}$ from the expression is limited to simple cases, the expression leads to a new stochastic method for estimating $F_{\text{BAL-ANC}}$ with higher reliability. The performance of the new method is evaluated by stochastic simulation. Finally, I discuss some theoretical aspects of $F_{\text{BAL-ANC}}$ and its relation to other pedigree-based parameters used to detect inbreeding purging.

Notation
Consider the pedigree of an individual X originating from $N_A$ founders (ancestors with unknown parents), each with unique alleles. Founders are described as $A_1, A_2, \ldots, A_{N_A}$, and the two alleles of founder $A_i$ are denoted as $a_{(i,1)}$ and $a_{(i,2)}$. We attach superscripts (0, N, 1) to $a_{(i,j)}$ according to its history of autozygosity: $a^{0}_{(i,j)}$: the allele has experienced no autozygous state in the past; $a^{N}_{(i,j)}$: the allele experiences an autozygous state for the first time in a given individual; $a^{1}_{(i,j)}$: the allele has already experienced an autozygous state at least once. Note that $a^{N}_{(i,j)}$ is a transient notation; it is immediately replaced by $a^{1}_{(i,j)}$ when transmitted to the next generation. The number of inbred ancestors in the pedigree is denoted as $N_B$, and the inbred ancestors are described as $B_1, B_2, \ldots, B_{N_B}$.

The partial inbreeding coefficient [34] of individual k for founder allele $a_{(i,j)}$ is denoted as $F_{(i,j)k}$, which expresses the probability that individual k is autozygous for $a_{(i,j)}$, i.e., the genotype of k is $a_{(i,j)}a_{(i,j)}$. With the autozygous history notations, $F_{(i,j)k}$ is expressed by the sum of three probabilities:

$$F_{(i,j)k} = P_k\left(a^{N}_{(i,j)}a^{N}_{(i,j)}\right) + P_k\left(a^{1}_{(i,j)}a^{N}_{(i,j)}\right) + P_k\left(a^{1}_{(i,j)}a^{1}_{(i,j)}\right),$$

where $P_k(xy)$ is the probability that individual k has the genotype xy. Due to symmetry in the pedigree, $F_{(i,1)k} = F_{(i,2)k}$. Thus, the partial inbreeding coefficient ($F_{(i)k}$) of individual k for founder $A_i$ is $F_{(i)k} = 2F_{(i,1)k}$. By the definition of the partial inbreeding coefficient [34], Wright's [35] inbreeding coefficient of individual k is the sum over all founders:

$$F_k = \sum_{i=1}^{N_A} F_{(i)k} = \sum_{i=1}^{N_A}\sum_{j=1}^{2}\left[P_k\left(a^{N}_{(i,j)}a^{N}_{(i,j)}\right) + P_k\left(a^{1}_{(i,j)}a^{N}_{(i,j)}\right) + P_k\left(a^{1}_{(i,j)}a^{1}_{(i,j)}\right)\right].$$

For the sake of simplicity, we write the above expression as

$$F_k = P_k\left(a^{N}a^{N}\right) + P_k\left(a^{1}a^{N}\right) + P_k\left(a^{1}a^{1}\right). \tag{1}$$

Derivation of Expression
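Denoting the male and female parents of individual X as S and D, respectively, the recurrence equation given by Ballou [10] is

$$F_{\text{BAL-ANC},X} = \frac{1}{2}\left[F_{\text{BAL-ANC},S} + \left(1 - F_{\text{BAL-ANC},S}\right)F_S + F_{\text{BAL-ANC},D} + \left(1 - F_{\text{BAL-ANC},D}\right)F_D\right]. \tag{2}$$

However, using a gene-dropping simulation, Suwanlee et al. [32] showed that Equation (2) overestimates $F_{\text{BAL-ANC},X}$. As a revised version of Equation (2), they proposed [32]:

$$F_{\text{BAL-ANC},X} = \frac{1}{2}\left[F_{\text{BAL-ANC},S} + \left(1 - F_{\text{BAL-ANC},S|F_S}\right)F_S + F_{\text{BAL-ANC},D} + \left(1 - F_{\text{BAL-ANC},D|F_D}\right)F_D\right], \tag{3}$$

where $F_{\text{BAL-ANC},S|F_S}$ is the proportion of the genome of S that has been exposed to autozygosity at least once in the past, given that the genome is in an autozygous state in S (or equivalently, the conditional probability that an allele of S at an arbitrary locus has been exposed to autozygosity at least once in the past, given that the allele is in an autozygous state in S), and $F_{\text{BAL-ANC},D|F_D}$ is the corresponding proportion in D.

The term $\left(1 - F_{\text{BAL-ANC},S|F_S}\right)F_S$ in Equation (3) is the proportion of the genome newly exposed to inbreeding in S, and $\left(1 - F_{\text{BAL-ANC},D|F_D}\right)F_D$ is the corresponding proportion in D. Denoting these terms as $F_{\text{BAL-NEW},S}$ and $F_{\text{BAL-NEW},D}$, respectively, we have

$$F_{\text{BAL-ANC},X} = \frac{1}{2}\left[F_{\text{BAL-ANC},S} + F_{\text{BAL-NEW},S} + F_{\text{BAL-ANC},D} + F_{\text{BAL-NEW},D}\right]. \tag{4}$$

Here we consider the simple pedigree shown in Figure 1. With Equation (4), $F_{\text{BAL-ANC}}$ of individual X in the pedigree can be expanded by applying the same equation to each non-founder ancestor in turn, so that every term in $F_{\text{BAL-ANC}}$ or $F_{\text{BAL-NEW}}$ of an ancestor carries a weight of the form $(1/2)^n$.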
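To make the recursive use of Equation (4) concrete, the following sketch writes out one further substitution step for a generic individual whose parents S and D are not founders. The labels $S_S$, $D_S$ (parents of S) and $S_D$, $D_D$ (parents of D) are introduced here only for illustration and do not refer to the individuals in Figure 1.

$$
\begin{aligned}
F_{\text{BAL-ANC},X} &= \tfrac{1}{2}\left[F_{\text{BAL-NEW},S} + F_{\text{BAL-NEW},D}\right] + \tfrac{1}{2}\left[F_{\text{BAL-ANC},S} + F_{\text{BAL-ANC},D}\right] \\
&= \tfrac{1}{2}\left[F_{\text{BAL-NEW},S} + F_{\text{BAL-NEW},D}\right] \\
&\quad + \tfrac{1}{4}\left[F_{\text{BAL-ANC},S_S} + F_{\text{BAL-NEW},S_S} + F_{\text{BAL-ANC},D_S} + F_{\text{BAL-NEW},D_S}\right] \\
&\quad + \tfrac{1}{4}\left[F_{\text{BAL-ANC},S_D} + F_{\text{BAL-NEW},S_D} + F_{\text{BAL-ANC},D_D} + F_{\text{BAL-NEW},D_D}\right]
\end{aligned}
$$

Each substitution halves the weight once more, which is why the weights that accumulate on an ancestor add up to its genetic contribution to X when the ancestor is reached through several paths.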

Figure 1. Simple pedigree used for explanation of computational procedure of $F_{\text{BAL-ANC}}$.
In this expression, $(1/2)^n$ is the genetic contribution [36] of an ancestor (K, L, O, S, D) to X. Analogously, in any pedigree, we can expand $F_{\text{BAL-ANC}}$ of an individual and express it with $F_{\text{BAL-ANC}}$ and $F_{\text{BAL-NEW}}$ of all ancestors in the pedigree. But we should note that, by the definition of $F_{\text{BAL-ANC}}$, an individual without inbred ancestors should have $F_{\text{BAL-ANC}} = 0$, and a non-inbred individual should have $F_{\text{BAL-NEW}} = 0$. This leads to a considerable simplification of the expanded form. In general, $F_{\text{BAL-ANC}}$ of individual X can be expressed as

$$F_{\text{BAL-ANC},X} = \sum_{k=1}^{N_B} gc_{(B_k)X}\,F_{\text{BAL-NEW},B_k}, \tag{5}$$

where $gc_{(B_k)X}$ is the genetic contribution of inbred ancestor $B_k$ to X. This is an explicit expression of Ballou's definition of the ancestral inbreeding coefficient. Note that $F_{\text{BAL-ANC},X}$ is expressed only with $F_{\text{BAL-NEW}}$ of inbred ancestors and their genetic contributions.
As an application of Equation (5), consider the real pedigree shown in Figure 2, which is a part of the pedigree of a mare (individual X) in a captive population of Przewalski's horse [37]. Such a pedigree is typically found in the early history of a captive population expanded from a limited number of wild-caught founders. From Equation (5), $F_{\text{BAL-ANC}}$ of individual X is expanded as

$$F_{\text{BAL-ANC},X} = gc_{(B_1)X}\,F_{\text{BAL-NEW},B_1} + gc_{(B_2)X}\,F_{\text{BAL-NEW},B_2} + gc_{(B_3)X}\,F_{\text{BAL-NEW},B_3}. \tag{6}$$

To complete the computation of $F_{\text{BAL-ANC},X}$, we need the values of $F_{\text{BAL-NEW},B_k}$. For the pedigree shown in Figure 2, it is trivial that $F_{\text{BAL-NEW},B_1} = F_{B_1} = 0.125$ and $F_{\text{BAL-NEW},B_2} = F_{B_2} = 0.125$, since these ancestors have no inbred ancestors. But the computation of $F_{\text{BAL-NEW},B_3}$ is more complicated. Prior to the computation, we consider the allele-based expression of $F_{\text{BAL-NEW},B_k}$ in general. Since $F_{\text{BAL-NEW},B_k}$ can be alternatively viewed as the expected frequency of the founder allele that is newly exposed to inbreeding in $B_k$, $F_{\text{BAL-NEW},B_k}$ for a founder allele $a_{(i,j)}$ is written, by analogy with the partial inbreeding coefficient $F_{(i,j)}$, as

$$P_{B_k}\left(a^{N}_{(i,j)}a^{N}_{(i,j)}\right) + \frac{1}{2}P_{B_k}\left(a^{1}_{(i,j)}a^{N}_{(i,j)}\right).$$

Summing this expression over all founders and alleles within each founder, and applying an analogy to Equation (1), we obtain the allele-based expression of $F_{\text{BAL-NEW},B_k}$ as

$$F_{\text{BAL-NEW},B_k} = P_{B_k}\left(a^{N}a^{N}\right) + \frac{1}{2}P_{B_k}\left(a^{1}a^{N}\right). \tag{7}$$

To apply Equation (7) to the computation of $F_{\text{BAL-NEW},B_3}$ in Figure 2, we introduce the nine condensed identity states ($S_1$-$S_9$ shown in Figure 3) [38] between the parents ($B_1$ and $B_2$) and their probabilities of occurrence ($\Delta_1$-$\Delta_9$), that is, the condensed identity coefficients [38]. Four identity states ($S_3$, $S_5$, $S_7$ and $S_8$) are relevant to the computation of $F_{\text{BAL-NEW},B_3}$: from $S_3$ and $S_5$, a child with genotype $a^{1}a^{N}$ will be born with the probability 1/2, and from $S_7$ and $S_8$, a child with genotype $a^{N}a^{N}$ will be born with the probabilities 1/2 and 1/4, respectively (Figure 3). I obtained $\Delta_3 = 0.0625$, $\Delta_5 = 0.0625$, $\Delta_7 = 0.1094$ and $\Delta_8 = 0.4688$ from the ribd package [39] in R [40]. Thus, $P_{B_3}\left(a^{1}a^{N}\right) = \frac{1}{2}\Delta_3 + \frac{1}{2}\Delta_5 = 0.0625$ and $P_{B_3}\left(a^{N}a^{N}\right) = \frac{1}{2}\Delta_7 + \frac{1}{4}\Delta_8 = 0.1719$. Substituting these values into Equation (7) gives $F_{\text{BAL-NEW},B_3} = 0.2031$. Finally, substituting the obtained values of $F_{\text{BAL-NEW},B_k}$ (k = 1, 2, 3) into Equation (6), we get $F_{\text{BAL-ANC},X} = 0.2266$. The computational process is summarized in Table 1. The estimated $F_{\text{BAL-ANC},X}$ from a gene-dropping simulation with $10^6$ replicates using GRain (v2.2) [41,42] was 0.2263, while Ballou's original Equation (2) gave an overestimated value.

If $F_{\text{BAL-NEW}}$ of an individual with multiple ($N_B \geq 3$) inbred ancestors is considered, the deterministic computation is complicated since it requires the condensed identity coefficients among the multiple inbred ancestors. In theory, computation of the condensed identity coefficients among multiple individuals is possible [43], but the number ($n_s$) of condensed identity states increases exponentially as the number ($n_d$) of individuals involved increases; e.g., $n_s = 66$ for $n_d = 3$ and $n_s = 712$ for $n_d = 4$ [44]. In addition, we must trace the autozygous history of the alleles involved in each condensed identity state. These requirements make the generalized computation of $F_{\text{BAL-NEW}}$ intractable for a pedigree with multiple inbred ancestors. In fact, for repeated full-sib mating, I found a compact set of recurrence equations which gives $F_{\text{BAL-NEW}}$ and $F_{\text{BAL-ANC}}$ at any generation. But a generalization to arbitrary pedigrees appears infeasible.
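The arithmetic for ancestor $B_3$ in the worked example can be retraced in a few lines. The sketch below assumes that Equation (7) reads $F_{\text{BAL-NEW}} = P(a^{N}a^{N}) + \frac{1}{2}P(a^{1}a^{N})$, a reading that reproduces the reported value of 0.2031; the variable names are illustrative, and the condensed identity coefficients are the rounded values quoted in the text.

```python
# Condensed identity coefficients between parents B1 and B2, as quoted in the text
# (rounded to four decimals there, so results agree only to about 1e-4).
delta = {3: 0.0625, 5: 0.0625, 7: 0.1094, 8: 0.4688}

# From states S3 and S5 a child of genotype a1 aN is born with probability 1/2.
p_a1_aN = 0.5 * delta[3] + 0.5 * delta[5]

# From states S7 and S8 a child of genotype aN aN is born with probability 1/2 and 1/4.
p_aN_aN = 0.5 * delta[7] + 0.25 * delta[8]

# Assumed form of Equation (7): both alleles are new in aN aN, one of two in a1 aN.
f_bal_new_b3 = p_aN_aN + 0.5 * p_a1_aN

print(f"P(a1 aN) = {p_a1_aN:.4f}")
print(f"P(aN aN) = {p_aN_aN:.4f}")
print(f"F_BAL-NEW of B3 = {f_bal_new_b3:.4f}")
```

The factor 1/2 on the $a^{1}a^{N}$ term reflects that only one of the two alleles in such a genotype is newly exposed to autozygosity.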
As an alternative to the deterministic computation of $F_{\text{BAL-NEW}}$, I propose the use of $F_{\text{BAL-NEW}}$ estimated from a gene-dropping simulation. As shown later, estimates of $F_{\text{BAL-ANC}}$ from this method have a favorable property. The simulation process is essentially the same as the ordinary gene-dropping simulation [41,42]. In the simulation, alleles are flagged when they experience an autozygous state, and for an individual (necessarily, an inbred individual), newly flagged alleles ($a^{N}$ in our notation) are counted over replicates of the simulation. The total number of counts divided by the number of replicates of the simulation gives a stochastic estimate of $F_{\text{BAL-NEW}}$ of the individual. Substituting the estimates of $F_{\text{BAL-NEW}}$ for all inbred ancestors into Equation (5) gives an estimate of $F_{\text{BAL-ANC},X}$. This method is referred to as the 'hybrid method', in the sense that it consists of a stochastic process (the gene-dropping simulation) and a deterministic process (the computation of genetic contributions from the inbred ancestors).
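The procedure described above can be sketched as follows. This is a minimal illustration and not the author's Fortran program: the pedigree (three generations of full-sib mating) and all function names are hypothetical, and the count of newly flagged alleles is divided by twice the number of replicates so that the estimate is an allele frequency, which is an assumption made here to match the allele-based definition of $F_{\text{BAL-NEW}}$.

```python
import random

# Hypothetical pedigree (not from the paper): (individual, sire, dam).
# Founders have None parents; parents always precede their offspring.
PEDIGREE = [
    ("A1", None, None), ("A2", None, None),
    ("C1", "A1", "A2"), ("C2", "A1", "A2"),
    ("B1", "C1", "C2"), ("B2", "C1", "C2"),
    ("B3", "B1", "B2"), ("B4", "B1", "B2"),
    ("X", "B3", "B4"),
]


def genetic_contributions(pedigree, target):
    """Expected fraction of the target's genes that passes through each individual."""
    gc = {ind: 0.0 for ind, _, _ in pedigree}
    gc[target] = 1.0
    # Youngest to oldest: each individual passes half of its contribution to each parent.
    for ind, sire, dam in reversed(pedigree):
        if sire is not None:
            gc[sire] += 0.5 * gc[ind]
            gc[dam] += 0.5 * gc[ind]
    return gc


def gene_drop(pedigree, target, n_rep, rng):
    """Return (ordinary gene-dropping estimate of F_BAL-ANC, estimates of F_BAL-NEW)."""
    new_counts = {ind: 0 for ind, _, _ in pedigree}
    flagged_in_target = 0
    for _ in range(n_rep):
        genotype = {}
        for ind, sire, dam in pedigree:
            if sire is None:
                # Founder: two unique alleles, not yet exposed to autozygosity.
                alleles = [[(ind, 1), False], [(ind, 2), False]]
            else:
                # Mendelian segregation: copy one random allele from each parent.
                alleles = [list(rng.choice(genotype[sire])),
                           list(rng.choice(genotype[dam]))]
            if ind == target:
                # Alleles that arrive already flagged were autozygous in an ancestor.
                flagged_in_target += sum(flag for _, flag in alleles)
            if alleles[0][0] == alleles[1][0]:
                # Autozygous: count alleles flagged for the first time, then flag both.
                new_counts[ind] += sum(not flag for _, flag in alleles)
                alleles[0][1] = alleles[1][1] = True
            genotype[ind] = alleles
    # Assumption: divide by 2 * n_rep to express the counts as allele frequencies.
    f_anc_gds = flagged_in_target / (2 * n_rep)
    f_new = {ind: count / (2 * n_rep) for ind, count in new_counts.items()}
    return f_anc_gds, f_new


def hybrid_estimate(pedigree, target, f_new):
    """Equation (5): contributions of ancestors weighted by estimated F_BAL-NEW."""
    gc = genetic_contributions(pedigree, target)
    # Non-inbred ancestors have F_BAL-NEW = 0, so they drop out of the sum.
    return sum(gc[ind] * f_new[ind] for ind, _, _ in pedigree if ind != target)


rng = random.Random(1)
f_anc_gds, f_new = gene_drop(PEDIGREE, "X", 20000, rng)
f_anc_hybrid = hybrid_estimate(PEDIGREE, "X", f_new)
print(f"ordinary gene dropping: {f_anc_gds:.4f}")
print(f"hybrid method:          {f_anc_hybrid:.4f}")
```

Both estimates come from the same replicates; the hybrid value replaces the simulated transmission from the inbred ancestors to X with the deterministic weights returned by `genetic_contributions`.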
As an illustrative example, we apply the hybrid method to the pedigree of Przewalski's horse shown in Figure 2. Estimates of $F_{\text{BAL-NEW}}$ for the inbred ancestors ($B_1$, $B_2$, $B_3$) obtained from the gene-dropping simulation with $10^6$ replicates were 0.1250, 0.1243 and 0.2027, respectively. Weighting these estimates by the genetic contributions and summing them over all inbred ancestors (cf. Equation (5)) gives $F_{\text{BAL-ANC},X} = 0.2262$ as a hybrid estimate. The computational process is summarized in Table 1.

Performance of Hybrid Estimate
The performance of the hybrid estimate was evaluated with a more complicated pedigree (Figure 4), which is a part of the pedigree of the Spanish Habsburg dynasty [30,45]. Pedigree analysis with $F_{\text{BAL-ANC}}$ has suggested that inbreeding purging had been acting in the early history of this dynasty [30]. I estimated $F_{\text{BAL-ANC}}$ of King X (Charles II) from the hybrid method and evaluated the performance of the estimate by comparing it with the estimate from the ordinary gene-dropping simulation. For both methods, 100 trials each with $n_{rep}$ (= 5000, 10,000, 50,000 and 100,000) replicates were carried out. To fairly compare the two methods, I used the same simulation program originally coded in Fortran 95. The estimated $F_{\text{BAL-ANC},X}$ from the gene-dropping simulation with $10^6$ replicates using GRain (v2.2) [41,42] was 0.2415.
The results of the simulation are summarized in Table 2, and for visualization, estimates from 100 trials of each method are plotted in Figure 5, for the case of $n_{rep}$ = 10,000. For a given number of replicates, the hybrid method reduced the variance of estimates to 40-60% of that from the ordinary gene-dropping simulation, implying that, by the use of the hybrid method, the reliability of the estimate could be enhanced.

Discussion
Equations (5) and (7) indicate that $F_{\text{BAL-ANC}}$ of an individual is defined with 'allele frequency', but not with 'autozygosity' in the individual. Thus, as mentioned by several authors [18,22,33], a direct relationship between $F_{\text{BAL-ANC}}$ and Wright's classical inbreeding coefficient (F) is not expected. In many works [13-20], the correlation between $F_{\text{BAL-ANC}}$ and F has been reported, ranging from 0.31 [20] to 0.95 [13]. The wide range of variation can be viewed as a reflection of the definition of $F_{\text{BAL-ANC}}$.
In some cases, $F_{\text{BAL-ANC}}$ and F will show quite different values; for example, if a population is subdivided into several isolated lines and mating between two lines occurs, the offspring will have F = 0, while $F_{\text{BAL-ANC}}$ may show a positive value because of the accumulated $F_{\text{BAL-NEW}}$ within the parental lines. Similarly, when a wild-caught animal with unknown parents is introduced into a captive population, a child of the introduced animal is expected to show F = 0 and $F_{\text{BAL-ANC}} > 0$ due to the accumulated $F_{\text{BAL-NEW}}$ in the native parent of the child. In this context, $F_{\text{BAL-ANC}}$ could be one criterion for selecting breeding animals. If other conditions are the same, an animal with a higher $F_{\text{BAL-ANC}}$ should be preferred as a breeding animal to an animal with a lower $F_{\text{BAL-ANC}}$, because the former is expected to have a smaller number of deleterious recessive alleles that may cause inbreeding depression in the descendants.
Kalinowski's ancestral inbreeding coefficient ($F_{\text{KAL-ANC}}$) is another measurement of purging opportunity induced by ancestral inbreeding [11]. This coefficient is based on a decomposition of F as $F = F_{\text{KAL-ANC}} + F_{\text{KAL-NEW}}$, where $F_{\text{KAL-NEW}}$ is Kalinowski's new inbreeding coefficient [11]. $F_{\text{KAL-ANC}}$ is the probability that alleles are in an autozygous state in the individual while they have been in an autozygous state at least once in the past, and $F_{\text{KAL-NEW}}$ is the probability that alleles are in an autozygous state for the first time in the individual [11,42]. In our notation,

$$F_{\text{KAL-ANC},X} = P_X\left(a^{1}a^{N}\right) + P_X\left(a^{1}a^{1}\right), \qquad F_{\text{KAL-NEW},X} = P_X\left(a^{N}a^{N}\right).$$

Of course, as verified from Equation (1), $F_X = F_{\text{KAL-ANC},X} + F_{\text{KAL-NEW},X}$. Unlike $F_{\text{BAL-ANC}}$, $F_{\text{KAL-ANC}}$ has a direct relation to F. Thus, it is natural that a higher correlation has been found between $F_{\text{KAL-ANC}}$ and F than between $F_{\text{BAL-ANC}}$ and F in many works [14-19].
For the pedigree in Figure 2, $F_{\text{KAL-NEW},B_3}$ is computed from Figure 3 as

$$F_{\text{KAL-NEW},B_3} = P_{B_3}\left(a^{N}a^{N}\right) = \frac{1}{2}\Delta_7 + \frac{1}{4}\Delta_8 = 0.1719.$$

Since $F_{B_3} = 0.25$, we get

$$F_{\text{KAL-ANC},B_3} = F_{B_3} - F_{\text{KAL-NEW},B_3} = 0.0781.$$

The corresponding estimates from GRain (v2.2) [41,42] with $10^6$ replicates are $F_{\text{KAL-NEW},B_3} = 0.1711$ and $F_{\text{KAL-ANC},B_3} = 0.0778$. However, the exact computation of $F_{\text{KAL-NEW}}$ and $F_{\text{KAL-ANC}}$ for an individual with multiple inbred ancestors will be intractable for the same reason as the difficulty in the generalized computation of $F_{\text{BAL-ANC}}$ and $F_{\text{BAL-NEW}}$.
Gulisija and Crow [46] gave parameters to evaluate the potential reduction in an individual's inbreeding load from pedigree data. Assuming that in the same path in a pedigree no two ancestors are autozygous for the same founder allele, they derived a parameter $O_X$ (opportunity for purging) to measure the opportunity for purging by the expected contribution of alleles from inbred ancestors to individual X [46]. The derived expression of $O_X$ can be written in our notation. On the above assumption, it should hold that $F_{\text{BAL-ANC},B_k} = 0$ and $F_{\text{BAL-NEW},B_k} = F_{B_k}$. Thus, from Equation (5) we have

$$F_{\text{BAL-ANC},X} = \sum_{k=1}^{N_B} gc_{(B_k)X}\,F_{B_k}.$$

For a complex pedigree involving several inbred ancestors in the same path, an inbred individual descending from inbred ancestors will be less likely to carry deleterious recessive alleles than when its ancestors have not been inbred [47]. To remove this remote ancestral effect from $O_X$, Gulisija and Crow [46] showed an equation. However, it seems to be too complex to implement in practice [47]. Note that Equation (5) contains only the terms with $F_{\text{BAL-NEW}}$, implying that the remote ancestral effects are completely excluded from $F_{\text{BAL-ANC}}$.
In the present study, the hybrid method was proposed for estimating $F_{\text{BAL-ANC}}$ from pedigree data. Although the examined cases are limited, it was suggested that the method could enhance the reliability of the estimate, compared to the ordinary gene-dropping simulation [41,42]. This favorable property results from the bypass of a part of the stochastic process in the ordinary gene-dropping simulation; in the hybrid method, contributions of the estimated $F_{\text{BAL-NEW}}$ to individual X are computed fully deterministically with the genetic contributions, whereas in the ordinary gene-dropping simulation, the whole transmission of alleles from founders to individual X is subject to stochastic events (Mendelian segregations), which inevitably inflates the sampling variance of the estimates.
Prior to the implementation of the hybrid method, finding inbred ancestors and computing their genetic contributions are required. This extra task is easily handled. Rapid algorithms are now available for computing the inbreeding coefficients [48,49], which allow us to find inbred ancestors in a large pedigree efficiently [50]. Genetic contributions can be systematically obtained by computing a lower triangular matrix $L = [l_{i,j}]$, where $l_{i,j}$ is the genetic contribution of i to j [51]. There is a simple algorithm for computing L, applicable to a large pedigree [52].
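The two preparatory steps can be sketched on a small scale as follows. This is an illustration under stated assumptions, not one of the rapid algorithms cited above: inbreeding coefficients are obtained with the plain tabular method (quadratic in pedigree size), the pedigree is a hypothetical full-sib mating scheme, each individual is assumed to have either both parents known or both unknown, and the matrix is indexed here as `L[j][i]` (contribution of individual i to individual j) so that it is lower triangular when parents precede offspring.

```python
# Hypothetical pedigree (not from the paper): (individual, sire, dam),
# with parents listed before their offspring.
PEDIGREE = [
    ("A1", None, None), ("A2", None, None),
    ("C1", "A1", "A2"), ("C2", "A1", "A2"),
    ("B1", "C1", "C2"), ("B2", "C1", "C2"),
    ("B3", "B1", "B2"), ("B4", "B1", "B2"),
    ("X", "B3", "B4"),
]


def inbreeding_coefficients(pedigree):
    """Wright's F for every individual, from the tabular relationship matrix."""
    index = {ind: i for i, (ind, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    a = [[0.0] * n for _ in range(n)]  # numerator relationship matrix
    for j, (ind, sire, dam) in enumerate(pedigree):
        if sire is None:
            a[j][j] = 1.0  # founders are unrelated and non-inbred
            continue
        s, d = index[sire], index[dam]
        for i in range(j):
            # Relationship with an older individual is the mean over the two parents.
            a[i][j] = a[j][i] = 0.5 * (a[i][s] + a[i][d])
        a[j][j] = 1.0 + 0.5 * a[s][d]
    return {ind: a[i][i] - 1.0 for ind, i in index.items()}


def contribution_matrix(pedigree):
    """L[j][i] = genetic contribution of individual i to individual j (i <= j)."""
    index = {ind: i for i, (ind, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    L = [[0.0] * n for _ in range(n)]
    for j, (ind, sire, dam) in enumerate(pedigree):
        L[j][j] = 1.0
        if sire is not None:
            s, d = index[sire], index[dam]
            for i in range(j):
                # Half of each parent's row is inherited by the offspring.
                L[j][i] = 0.5 * (L[s][i] + L[d][i])
    return L


def inbred_ancestors(pedigree, target):
    """Map each inbred ancestor of the target to its genetic contribution."""
    F = inbreeding_coefficients(pedigree)
    L = contribution_matrix(pedigree)
    index = {ind: i for i, (ind, _, _) in enumerate(pedigree)}
    row = L[index[target]]
    return {ind: row[i] for ind, i in index.items()
            if ind != target and row[i] > 0.0 and F[ind] > 0.0}


F = inbreeding_coefficients(PEDIGREE)
ancestors = inbred_ancestors(PEDIGREE, "X")
print(F)
print(ancestors)
```

The dictionary returned by `inbred_ancestors` supplies the weights $gc_{(B_k)X}$ that multiply the simulated $F_{\text{BAL-NEW}}$ values in the hybrid method.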
Although the exact computation of $F_{\text{BAL-ANC}}$ with the obtained expression is limited to small and simple pedigrees, the expression deepens our understanding of $F_{\text{BAL-ANC}}$. A useful outcome from the expression will be the hybrid method for estimating $F_{\text{BAL-ANC}}$. The performance should be intensively evaluated under various scenarios in future studies.

Conclusions
In this article, the author provided a mathematical basis for $F_{\text{BAL-ANC}}$ and proposed a new method for estimating $F_{\text{BAL-ANC}}$. A stochastic simulation suggested that the new method could remarkably enhance the reliability of estimates, compared to the conventional gene-dropping method.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ani14131844/s1. The R script used for finding inbred ancestors and computing their genetic contributions to individual X in Figure 4. Fortran code used for obtaining the results in Table 2.

Figure 3. Nine condensed identity states ($S_1$-$S_9$) of individuals $B_1$ and $B_2$ in Figure 2. Dots linked with a line are identical by descent, and dots not linked are not identical by descent. The history of autozygosity for each allele is described with the notations 0: not exposed to autozygosity in the past, N: exposed to autozygosity for the first time in the individual, 1: exposed to autozygosity at least once in the past. For the four condensed identity states ($S_3$, $S_5$, $S_7$ and $S_8$) relevant to the computation of $F_{\text{BAL-NEW},B_3}$, the autozygous state of $B_3$ born from each of the four parental identity states is shown with respect to the autozygous history (N, 1), together with the probability ($q_i$) that the autozygous state occurs from each parental identity state.

Figure 5. Plots of estimated $F_{\text{BAL-ANC}}$ of individual X in Figure 4, obtained from 100 trials of ordinary gene-dropping method (GDS) and hybrid method each with 10,000 replicates.

Funding:
This work was funded by JST Grant Number JPMJPF2010 from Japan Science and Technology Agency.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Table 1. Computation of $F_{\text{BAL-ANC}}$ of mare X in pedigree of Przewalski's horse shown in Figure 2, from exact computation and hybrid method. gc: genetic contribution.

Table 2. Summary statistics of estimated $F_{\text{BAL-ANC}}$ of individual X in Figure 4, obtained from 100 trials each with $n_{rep}$ replicates. GDS: ordinary gene-dropping simulation, Hybrid: hybrid method. Figure in parentheses is fraction (%) of variance of estimates from hybrid method to that from GDS.