Changes in Allele Frequencies When Different Genomic Coancestry Matrices Are Used for Maintaining Genetic Diversity

A main objective in conservation programs is to maintain genetic variability. This can be achieved using the Optimal Contributions (OC) method that optimizes the contributions of candidates to the next generation by minimizing the global coancestry. However, it has been argued that maintaining allele frequencies is also important. Different genomic coancestry matrices can be used on OC and the choice of the matrix will have an impact not only on the genetic variability maintained, but also on the change in allele frequencies. The objective of this study was to evaluate, through stochastic simulations, the genetic variability maintained and the trajectory of allele frequencies when using two different genomic coancestry matrices in OC to minimize the loss of diversity: (i) the matrix based on deviations of the observed number of alleles shared between two individuals from the expected numbers under Hardy–Weinberg equilibrium (θLH); and (ii) the matrix based on VanRaden’s genomic relationship matrix (θVR). The results indicate that the use of θLH resulted in a higher genetic variability than the use of θVR. However, the use of θVR maintained allele frequencies closer to those in the base population than the use of θLH.


Introduction
Genetic diversity is a prerequisite for populations to be able to face future environmental changes and to ensure long-term survival [1]. Thus, a common objective in genetic conservation programs is to minimize the loss of genetic variability. This can be achieved using the Optimal Contributions (OC) method that optimizes the contributions of candidates to the next generation by minimizing the global coancestry [2][3][4]. It has been demonstrated that OC maximizes genetic diversity measured as expected heterozygosity [5], which is proportional to the additive genetic variance of quantitative traits [6]. Controlling the loss of genetic diversity also keeps the inbreeding rate under control and therefore the risk of inbreeding depression.
A different objective in genetic conservation programs can be to maintain allele frequencies to preserve the uniqueness of a particular population, since current frequencies are the result not only of genetic drift, but also of previous selection processes [7][8][9]. Selection and drift can lead to a given allele responsible for a desirable trait at a high frequency. Moreover, trying to move the frequency to intermediate values to increase genetic variability would remove the uniqueness of the population. Thus, changes in the genetic composition of populations may be undesirable, particularly when dealing with ex situ conservation programs where the final aim is the reintroduction to the wild [9].
When the OC method is applied using pedigree information to compute coancestries, both objectives (maximum heterozygosity and maintenance of allele frequencies) are achieved [9], but this is not the case when coancestries are computed from molecular marker data. Previous studies have shown that using a coancestry matrix (θ) computed from large numbers of single nucleotide polymorphisms (SNPs) in OC is more efficient for maintaining diversity than using the pedigree-based coancestry matrix [10][11][12]. However, given that the highest expected heterozygosity is obtained at intermediate allele frequencies, a consequence of applying OC using a θ based on SNP genotypes is that the genetic composition of the population is modified [9][10][11]13,14].
Different genomic coancestry matrices have been proposed for being used in OC [10,11,[15][16][17]. They include the matrix that describes deviations of the observed numbers of alleles shared by two individuals from the expected numbers under Hardy-Weinberg equilibrium [18], and those obtained from genomic relationship matrices currently used in genomic predictions [19,20]. In a recent study, Morales-González et al. [16] have shown that the expected heterozygosity retained through OC was higher when using the matrix proposed by Li and Horvitz [18] than when using different genomic relationship matrices (i.e., the VanRaden's matrices based on Method 1 and 2 [19] and the Yang's matrix [20]). However, as mentioned above, the genomic θ used in OC will have an impact not only on the diversity maintained, but also on the trajectory of the change in allele frequencies. Gómez-Romano et al. [21] suggested that while OC using a genomic coancestry matrix that simply measures the proportion of alleles shared by two individuals [22] and that correlates perfectly with Li and Horvitz's matrix favors solutions that tend to move allele frequencies towards 0.5, OC using VanRaden's matrices would lead to solutions that tend to keep allele frequencies closer to those in the original population (i.e., allele frequencies would tend to be unchanged). This has been recently confirmed by Meuwissen et al. [17] in the context of OC aimed at maximizing genetic gain through selection while restricting the increase in inbreeding (i.e., restricting the loss of genetic diversity).
In general, populations under conservation programs are small and genetic drift leads to a loss of diversity and changes in allele frequencies. The magnitude of these drift effects depends on the effective population size (N e ) which can be estimated from genomic coancestry. However, Toro et al. [23] have recently questioned the meaning of N e when genomic matrices are used in OC. In particular, when optimal management is carried out using marker information, genetic diversity can increase in the initial generations implying negative estimates of N e . Moreover, in the long term, N e does not attain an asymptotic value, but it shows an unpredictable behavior. Their findings were based on OC using Nejati-Javaremi s matrix [22] and it is unclear if they hold when other genomic coancestry matrices are used.
The objective of this study was to evaluate, through computer simulations, the genetic variability maintained and the trajectory of allele frequencies when different genomic coancestry matrices are used in OC. Estimates of N e obtained from the change in heterozygosity computed from different genomic matrices were also compared.

Materials and Methods
Scenarios simulated involved the management of populations through the OC method using two different genomic coancestry matrices, for 50 discrete generations. Management started from a base population with family structure. The same base population was used for the 100 replicates run and it was created in two steps. Firstly, a population at mutation-drift equilibrium was generated. Secondly, the population was expanded in order to have enough individuals for sampling the 100 replicates (see below, in Section 2.1). The simulations were carried out with our own Fortran 90 codes.

Generation of the Base Population
The simulation of the base population was done in two steps to simulate a realistic amount of linkage disequilibrium and to ensure independency among the replicates. The Genes 2021, 12, 673 3 of 16 first step was to generate a population in LD using a mutation-drift equilibrium approach. For this, 10,000 discrete generations of random mating for a population of 100 individuals (50 males and 50 females) were simulated. Using a larger population size would have generated an unrealistically low LD. Sires and dams were sampled with replacement and were mated at random. Each mating produced 2 offspring (1 of each sex). Thus, N e was equal to 100. The genome was composed of 20 chromosomes of 1 Morgan each. Two types of biallelic loci (SNP and unobserved loci) were simulated and they differed simply in their subsequent use. SNP loci were used for computing the genomic coancestry matrices used in the management of the population that started after the base population was created. The unobserved loci were used for measuring diversity and changes of allele frequencies, and for estimating N e across generations. Thus, the effect of different management strategies (i.e., using different genomic coancestry matrices) can be evaluated in the rest of the genome and not only on the loci used in the management (i.e., it is sometimes done using SNPs). A total of 500,000 SNPs and 500,000 unobserved loci were simulated per chromosome. At the initial generation, all loci were fixed. The mutation rate per locus and generation (µ) was 2.5 × 10 −6 for all loci. The number of new mutations per generation was sampled from a Poisson distribution with mean 2N e n c µn l " where n c is the number of chromosomes (i.e., 20) and n l is the total number of loci per chromosome (i.e., 1,000,000). Mutations were then randomly distributed across individuals, chromosomes and loci, switching allele 1 to allele 2 and vice versa. When generating the gametes, the number of crossovers per chromosome was drawn from a Poisson distribution with mean equal to 1. Crossovers were randomly distributed without interference. At the end of the process, the expected heterozygosity measured at both types of loci had stabilized (mutation-drift equilibrium). The second step consisted of expanding this population so we could sample the individuals to be used at the first generation of each replicate. The population was expanded during 4 generations with the aim of having enough individuals to sample 100 different replicates. During the 4 generations of expansion, each individual was randomly allocated to 8 different mates and each mating produced 1 offspring. In this way, the number of individuals in the population was multiplied by 4 each generation. After these 4 generations, the population was composed by 25,600 individuals and constituted the base population (t = 0). There were a total of 56,017 SNPs and 55,840 unobserved loci still segregating in t = 0. The expected heterozygosity (H e ) computed with all loci (SNPs and unobserved loci) still segregating was 0.1811 and the linkage disequilibrium (measured as r 2 , the squared correlation between pairs of loci) between consecutive loci was 0.131.

Management Strategies
Management was performed on populations of two different sizes (N = 20 and N = 100 individuals, half of each sex) using the OC method across 50 generations. Population size was kept constant across generations. The founder individuals for each replicate were randomly sampled from the base population. Note that, given that the set of individuals sampled in t = 0 differs across replicates, the number of segregating loci can also differ. In most scenarios (see below, at the end of this section), all loci segregating in t = 0 were used for managing the population, for measuring diversity and changes of allele frequencies, and for estimating N e .
The problem to be solved in the OC method is related to the allocation of contributions, i.e., the number of offspring each candidate should produce the next generation. The pursued strategy is to minimize the global coancestry weighted by those contributions, i.e., minimize c'θ c, where c is a N × 1 vector of proportions of offspring left by each candidate (i.e., the vector of solutions), N is the number of candidates and θ is the coancestry matrix. A restriction was imposed in the optimization such as the sum of the contributions of males and females is the same and equal to 1 2 , i.e., Q'c = 1 2 1, where Q is a (N × 2) known incidence matrix indicating the sex of the candidates with 0s and 1s, and 1 is a (2 × 1) vector of ones. The optimization problem was solved using Lagrangian multipliers [2,24]. Note that with this approach, c can contain negative values for some candidates. The contribution of candidates with c i < 0 was then set to 0 and the optimization was repeated with the remaining candidates until all elements of c were non-negative. Finally, the contribution of individual i (c i ), which is a proportion, was converted to a number of offspring by multiplying c i by 2N and rounding to the nearest integer but ensuring that the number of offspring of each sex equals to N/2. Each parent was randomly allocated to different mates (among the selected individuals) to produce its offspring.
Two management strategies were investigated, and they differed in the genomic coancestry matrix used in the optimization of contributions. Under strategy S O_LH , the coancestry matrix used was matrix θ LH which describes the excess in the observed number of alleles shared by two individuals relative to the expected number under Hardy-Weinberg equilibrium [18,25]. Specifically, the coancestry coefficient between individuals i and j was computed as where f OBS(i,j) is the proportion of alleles shared by individuals i and j, S is the number of SNPs and p k is the frequency of the reference allele (allele B) of SNP k in t = 0. Under strategy S O_VR , the coancestry matrix used was matrix θ VR which is based on the genomic relationship matrix obtained from VanRaden's method 2 [19]. Specifically, the coancestry coefficient between individuals i and j was computed as where x ki is the genotype of individual i for SNP k, coded as 0, 1 or 2 for genotypes AA, AB and BB, respectively, and p k is as defined for f LH . In most scenarios, both coancestry matrices were computed every generation using all SNPs that were segregating in t = 0. However, we analyzed two additional scenarios where two different minor allele frequency (MAF) thresholds were imposed for the SNPs to be used to compute the coancestry matrices: (i) using only SNPs with MAF > 0.05; and (ii) using only SNPs with MAF > 0.25. The first threshold (MAF > 0.05) was considered because it is commonly applied when analyzing real data to reduce the number of potential genotyping errors. The second threshold (MAF > 0.25) was considered to explore the influence of rare alleles on the performance of the coancestry matrices investigated. It is known that with VanRaden's method rare alleles contribute more to the coancestry coefficient than common alleles [21,26]. It is, thus, interesting to determine how the differences between management strategies S O_LH and S O_VR vary in the different MAF scenarios. Management in these additional scenarios was performed for 50 generations.
Furthermore, as a benchmark, we simulated a strategy (strategy S E ) where the contributions of all candidates were equalized (i.e., all individuals contributed with two offspring to the next generation). This is the simplest management strategy that has been proposed to maintain genetic diversity by increasing N e . It should be noticed that when dealing with populations in which the relationships between individuals are homogeneous (all equally related), this strategy leads to a N e close to 2N.

Parameters Evaluated
Management strategies were compared in terms of the genetic variability retained and the trajectory of the allele frequencies across generations for the SNPs and for the unobserved loci. Moreover, strategies were compared in terms of the number of individuals selected to produce the next generation (N S ) and the number of loci still segregating in a given generation, both for SNPs and for unobserved loci. The amount of genetic variability retained was measured as the expected heterozygosity (H e ) computed as 1 − ∑ L k=1 ∑ 2 l=1 p 2 kl , where L is the number of loci (SNPs or unobserved loci) and p lk is the frequency of allele l of locus k. In order to evaluate the 'distance' between frequencies in a given generation t and frequencies in t = 0, we used the Kullback-Leibler (KL) divergence criterion, which measures how different is a particular distribution from a reference distribution [27], which here is the distribution of allele frequencies in t = 0. The KL divergence between current frequencies and frequencies in t = 0 was computed as where p kl is the frequency of allele l of locus k in t = 0, and p kl is the corresponding frequency in the current generation (t > 0). The summation over alleles included only alleles with p kl > 0. Finally, N e was estimated from the change in heterozygosity in SNP loci. Thus, N e in generation t was computed as N e = 1/2 ∆H e , where ∆H e equals H e(t−1) − H e(t) /H e(t−1) . All results presented are averages over the 100 replicates.

Expected Heterozygosity and Kullback-Leibler Divergence for Populations of Size N = 100
For populations of size N = 100, and using all the SNPs segregating in t = 0, strategy S O_LH led to higher genetic variability (measured as H e ) than strategy S O_VR (Table 1) and the difference between both strategies increased across generations. In particular, H e was about 1%, 4% and 11% higher with S O_LH than with S O_VR in t = 1, 10 and 50, respectively. With S O_LH , H e even slightly increased in the initial generations while with S O_VR , H e decreased from the start. Moreover, H e obtained with strategy S O_VR was very similar to H e obtained with strategy S E . Table 1 also shows that S O_VR maintained allele frequencies closer to those in the base population than S O_LH given that the KL values for S O_LH were ≥ 100% higher than for S O_VR . The differences in KL between both strategies increased across generations. Moreover, at later generations, S O_VR was slightly more efficient in maintaining the initial frequencies than S E , a strategy that is expected to maximize N e and, thus, to minimize genetic drift. Table 1. Expected heterozygosity (H e , in %) and Kullback-Leibler divergence for unobserved loci (KL × 10 2 ), number of selected candidates (N S ), and number of SNPs (S) and unobserved loci (U) segregating across generations (t) when contributions are equalized (S E ) and when they are optimized using Li and Horvitz (S O_LH ) and VanRaden (S O_VR ) coancestry matrices computed with SNPs with MAF > 0.00 in a population of 100 individuals. The use of both matrices (θ LH and θ VR ) in OC also led to different numbers of individuals selected as parents of the next generation (N S ). In particular, with S O_LH , between 10% and 30% fewer individuals were selected than with S O_VR (Table 1). In fact, with the latter, almost all individuals were selected in all generations up to t = 10. The difference in N S entailed a difference in the number of loci that remained segregating across generations that was much higher with S O_VR than with S O_LH (Table 1), particularly in the initial generations. As for H e and for KL, strategies S O_VR and S E led to very similar values of N S . Table 2 shows the evolution across generations of the average frequency of the minor allele in t = 0. This average frequency was practically constant with S E and slightly decreased with S O_VR . However, with S O_LH , it increased from~1% in t = 1 to 16-19% in t = 50. Thus, it is clear that S O_LH leads average frequencies upward (ultimately towards 0.5) and S O_VR tends to maintain them. As expected, these patterns were more evident for the SNPs than for the unobserved loci. Table 2. Average frequency of the minor allele in generation 0 (× 10 2 ) across generations (t) for SNPs and unobserved loci when contributions are equalized (S E ) and when they are optimized using Li and Horvitz (S O_LH ) and VanRaden (S O_VR ) coancestry matrices in a population of 100 individuals.

SNPs
Unobserved Loci  2 show the frequency (f ) distribution also for minor alleles in t = 0 in this generation and after 50 generations of management, using different sets of SNPs to compute coancestries. When using all SNPs segregating in t = 0, the distributions for SNPs and unobserved loci were very similar (Figures 1a and 2a). However, when using only SNPs with MAF > 0.05 or MAF > 0.25, the distribution for SNPs was greatly affected. When using SNPs with MAF > 0 or MAF > 0.05 (Figure 1a,b), a greater number of SNPs was fixed with S O_LH than with S O_VR across generations (see class f = 0.00). However, more loci (SNPs and unobserved loci) with low frequencies (0.00 < f ≤ 0.15) were observed with S O_VR than with S O_LH and more loci with higher frequencies (f > 0.4) were observed with S O_LH than with S O_VR . Thus, although more alleles are fixed with S O_LH , those that are kept segregating increase their frequency, while with S O_VR the frequencies tend to be maintained. The highest difference between SNPs and unobserved loci was found when only SNPs with MAF > 0.25 were used to estimate the coancestry matrices (Figures 1c and 2c). These differences are due to the fact that no MAF filtering was done for the unobserved loci. than with SO_VR. Thus, although more alleles are fixed with SO_LH, those th regating increase their frequency, while with SO_VR the frequencies tend to The highest difference between SNPs and unobserved loci was found w with MAF > 0.25 were used to estimate the coancestry matrices (Figures 1c differences are due to the fact that no MAF filtering was done for the unob      Figure 3 shows the trajectories of H e and KL across generations for unobserved loci under strategies S O_LH and S O_VR using the three different sets of SNPs. The heterozygosity maintained with S O_LH decreased as the MAF criterion chosen for the SNPs used to estimate coancestries becomes more restrictive given that the number of SNPs used decreased. In fact, the small increase in H e observed in the initial generations when using all SNPs (MAF> 0.00) was not observed when using only the SNPs with MAF > 0.05 or MAF > 0. 25. In parallel, the KL divergence with S O_LH also decreased when increasing the severity of the restriction imposed on the SNPs used. However, with S O_VR , the changes observed in H e and KL when using a different set of SNPs were very small. R PEER REVIEW 9 of 16 maintained with SO_LH decreased as the MAF criterion chosen for the SNPs used to estimate coancestries becomes more restrictive given that the number of SNPs used decreased. In fact, the small increase in He observed in the initial generations when using all SNPs (MAF> 0.00) was not observed when using only the SNPs with MAF > 0.05 or MAF > 0. 25. In parallel, the KL divergence with SO_LH also decreased when increasing the severity of the restriction imposed on the SNPs used. However, with SO_VR, the changes observed in He and KL when using a different set of SNPs were very small.  Table 3 shows results from the different strategies (SE, SO_LH and SO_VR) for populations of size N = 20, when all SNPs segregating in t = 0 were used in the management. Similar to the results found for populations of N = 100, i) SO_LH led to higher He than SO_VR and SE;  Table 3 shows results from the different strategies (S E , S O_LH and S O_VR ) for populations of size N = 20, when all SNPs segregating in t = 0 were used in the management. Similar to the results found for populations of N = 100, (i) S O_LH led to higher H e than S O_VR and S E ; and (ii) S O_VR maintained allele frequencies closer to those in t = 0 than S O_LH . However, differences among strategies were smaller for populations of N = 20. For instance, for N = 20, H e in t = 10 was less than 1% higher when managing with S O_LH than when managing with S O_VR , while for N = 100 this percentage was about 4%. For KL, the highest difference between strategies was 0.0027 units with N = 20 and 0.0127 units with N = 100. However, with N = 20, contrary to what happened with N = 100, S O_LH managed to keep frequencies closer to the initial frequencies than S E in the last generations (t ≥ 30).  In populations of size N = 20, individuals are more closely related than in populations of size N = 100 and the genetic variability is smaller. Thus, most (if not all) individuals were selected to be parents of the next generation with all management strategies across generations. It should be noted that the number of loci segregating in t = 0, when management started, was substantially smaller when simulating populations of size N = 20. In order to investigate if the differences observed between N = 20 and N = 100 are a consequence of the different number of loci segregating in t = 0, a scenario with N = 100 starting with the same number of SNPs as in the scenario with N = 20 (about 40,000 SNPs) was simulated.

Expected Heterozygosity and Kullback-Leibler Divergence for Populations of Size N = 20
The results indicate that the differences between scenarios with different N were due to the population size and not to the different number of loci (results not shown). Table 4 shows estimates of N e across generations for the different scenarios simulated. For N = 100, estimates of N e were around 200 individuals under strategies S E and S O_VR . This is the expected value for N e when contributions are equalized since N e is approximately equal to 2N. However, under strategy S O_LH , estimates of N e were unreasonable as they took negative values in the initial generations. In later generations, N e became positive but did not reach a stable value. For N = 20, estimates under strategies S E and S O_VR were around 40 individuals, as expected. Estimates of N e under strategy S O_LH were between 6% and 50% higher than under strategy S E .

Discussion
Using computer simulations, this study has compared two different management strategies in terms of two important criteria in genetic conservation programs, i.e., genetic diversity (H e ) maintained and changes in allele frequencies. Both strategies optimize contributions for maintaining diversity but differ in the genomic coancestry matrix used in the optimization (θ LH in strategy S O_LH and θ VR in strategy S O_VR ). Moreover, as a benchmark, the simplest management strategy proposed to maintain genetic diversity that implies equalizing the contributions of all candidates (strategy S E ) was evaluated.
The changes in allele frequencies were evaluated using the KL divergence criterion. The greater the value of KL, the greater the divergence of frequencies with respect to the frequencies in the base population. When the strategies were compared using the KL criterion, it was clear that strategy S O_LH gives higher values than strategy S O_VR , indicating that the latter is able to maintain allele frequencies closer to the original frequencies (lower KL values). On the other hand, with strategy S O_LH , the population evolves differently as it pushes frequencies towards 0.5 and thus changes the genetic composition of the population more than strategy S O_VR .
Pushing frequencies towards 0.5 as strategy S O_LH does leads to higher genetic variability when measured as expected heterozygosity. Thus, the hypothesis raised by Gómez-Romano et al. [21] that using matrix θ LH in OC designed for maintaining genetic diversity better achieves the objective (i.e., higher H e ) than using matrix θ VR , but using the latter maintains allele frequencies closer to the initial frequencies, is confirmed. This was observed both in populations with N = 20 and in populations with N = 100 although the differences between both strategies were smaller with N = 20. This is because individuals in the smaller populations are more closely related and there are less options to choose among individuals and strategies behave more similarly.
Saura et al. [9] showed that the use of the pedigree-based coancestry matrix in OC maintained allele frequencies close to those of the initial population. This is related to the high levels of N e obtained when minimizing pedigree coancestry (close to 2N), leading to reduced drift and little departures to the original frequencies. Additionally, several studies [10,12] have shown that OC based on pedigrees leads to less maintained genetic diversity than the use of genomic coefficients based on Nejati-Javaremi s matrix [22]. This is due to the fact that genomic data provide realized estimates of coancestry, while pedigree data provide expected values. Therefore, results under the management of populations with OC using the pedigree-based coancestry matrix would be similar to those under S O_VR .
Strategy S O_VR was only slightly more efficient for maintaining frequencies than strategy S E . This strategy tends to reduce the change in allele frequencies, which implies a reduced genetic drift [17]. The magnitude of drift is minimized when N e equals approx-imately 2N, and it is well known that, when managing the population using pedigree information (as said before), this is achieved by equalizing contributions [6,28]. The small advantage of S O_VR in terms of maintaining frequencies over S E arises from the fact that the former uses realized relationships and detects real differences between individuals while S E assumes homogeneous relationships. Contrarily, S O_LH does not minimize drift but maximizes H e by shifting frequencies towards 0.5. Thus, results from S O_LH are quite different to those obtained under S E in terms of the number of selected candidates and their optimal contributions.
Given that strategy S O_LH brings the frequencies towards 0.5, H e increased in the initial generations and this led to negative estimates of N e in the largest population (N = 100). As generations go by, N e becomes positive but with unrealistic very high values without attaining an asymptotic value. This was also observed by Toro et al. [23] who questioned the meaning of N e when genomic coancestry matrices are used in OC. They showed an unpredictable behavior for N e when using the similarity genomic matrix of Nejati-Javaremi et al. [22], which has a correlation of 1 with the θ LH matrix used here [5,16,29]. However, our results show that when using θ VR in OC, estimates of N e were close to the expected value when equalizing contributions (approximately 2N). As has been discussed above, the results from strategy S O_VR were very similar to those from strategy S E given that both tend to minimize drift. For the smallest population considered (N = 20), estimates of N e were close to 2N not only with S O_VR but also with S O_LH . In such a small population, there are fewer options to choose among individuals and most of them are selected to contribute (Table 3). Thus, the three strategies investigated led to similar results.
Strategy S O_LH led to higher H e but also to a higher loss of segregating loci than strategy S O_VR . In the largest population (N = 100), the percentage of alleles lost for unobserved loci at t = 1 was 13% and 9% with S O_LH and S O_VR , respectively ( Table 1). The difference in both management strategies in terms of the number of alleles lost could be due to a different number of individuals selected to contribute to the next generation that was lower with S O_LH . It must be emphasized that the mean coancestry of each individual with all the candidates (including the individual), i.e., the marginal of the coancestry matrix, is a useful concept for understanding the different numbers selected with both strategies. This is because the marginal of the coancestry matrix is a measure of the 'relevance' of each individual, in terms of the degree of genetic information shared with the rest, and the optimal solutions will depend on all relationships between candidates. Its value is the same for all candidates when considering θ VR . Then, all candidates are equally useful and should be selected as it was observed minimizing the global coancestry through OC using θ VR (strategy S O_VR ). However, when considering θ LH , the average coancestry of individuals AA (homozygous for the minor allele) is lower than that of individuals BB (homozygous for the major allele), since individuals AA harbor genetic information that is underrepresented (i.e., they carry the rarer allele) and should be favored for selection and contributions. Therefore, OC using θ LH minimize the objective function when selecting the same number of AA and BB candidates. This leads to an increase in the frequency of allele A (actually to 0.5 in a single generation in this example with only one locus) while frequencies stay unchanged when using θ VR .
Fernández et al. [13] claimed that OC management using coancestry matrices based on allele sharing moves frequencies to intermediate values and reduces the probability of losing alleles. In fact, these authors observed that strategies that maximize heterozygosity, by managing contributions from parents, keep levels of allelic diversity as high as strategies that maximize allelic diversity itself. Their results were obtained when applying OC using the similarity genomic matrix of Nejati-Javaremi et al. [22], calculated with up to 40 multiallelic markers, but the same could be expected when using θ LH given that correlation between both matrices is 1. However, we have obtained solutions which maintain genetic diversity (H e ) but result in a higher number of fixed loci and this could be due to the different numbers of markers used in both studies.
To understand these contrasting results, we carried out extra simulations to compare observed with expected values for the number of fixed loci under both management strategies (i.e., S O_LH and S O_VR ). In this extra scenario, a population with N = 20 individuals was managed during four generations, with different numbers of SNPs used for the calculation of the coancestry matrices (20 and 1000). A single chromosome was simulated. The expected number of fixed SNPs (ES f ) was estimated using the solutions that came out of each optimization before generating the offspring, following Fernández et al. [13].
Thus, ES f was computed as prob ki , where prob ki is the probability of individual i not transmitting allele k. If parent i carries a unique type of allele (that is, homozygous for the h allele) and leaves descendants, prob ki is 0 if k = h and 1 if k = h. If it carries two different alleles (that is, heterozygous), the probability is prob ki = (0.5) c i , where c i is the number of offspring to be contributed by parent i. ES f value can be averaged then across loci. Table 5 shows that expected and observed numbers of SNPs becoming fixed each generation were close. When using only 20 SNPs, even though only seven-eight individuals are selected with S O_LH , the expected (observed) number of SNPs that become fixed is lower than with S O_VR . However, when the number of SNPs used was increased, the trend reversed and the expected (and observed) number of fixed SNPs becomes lower for S O_VR than for S O_LH , even when the number of selected individuals increases for S O_LH . The explanation for this performance could be that, with many markers, S O_LH is able to find a solution with higher mean H e by keeping loci with high MAF and allowing SNPs with rare alleles to become fixed. The results show that the differences in maintained diversity (H e ) and divergence from the original frequencies (KL) between strategies S O_LH and S O_VR decreased when using only SNPs with a minimum MAF (MAF > 0.05 or MAF > 0.25) for computing the coancestry matrices. As mentioned above, S O_LH promotes the contribution of individuals carrying rare alleles, as their coancestries with the rest of the population are smaller, and thus increases the frequencies of rare alleles. When the minimum MAF permitted increases, the number of rare alleles decreases, and the differences between the average coancestries between pairs of individuals decrease. In such situation, S O_LH does not prioritize too much the contributions from any individual and leads to solutions that imply a higher number of candidates selected. Consequently, the results are closer to those obtained with strategy S O_VR . Moreover, when using only SNPs with high MAF in t = 0 (i.e., initial frequencies are close to 0.5), the performance of S O_VR (i.e., keeping those initial frequencies) is similar to the performance of S O_LH (moving them to intermediate values). These observations are in agreement with results from Morales-González et al. [16] and Villanueva et al. [29], who found that the correlation between VanRaden's and Li and Horvitz's coefficients increases with increasing the MAF of the SNPs used.
Here, we have optimized contributions of parents for minimizing the loss of variability and then changes in frequencies have been evaluated. On the other hand, Saura et al. [9] optimized contributions of parents for minimizing changes in allele frequencies and then the loss of genetic variability was evaluated. An alternative to both approaches could be to consider simultaneously the control of variability and the allele frequency changes. Similar to the OC algorithm designed for maximizing genetic gain while restricting the rate of inbreeding [2,3,24] or for maximizing the phenotypic level for a trait of interest while restricting the loss in variability when creating base populations [30], one could develop an algorithm for minimizing the loss of variability while restricting the change in frequencies or, alternatively, for minimizing frequency changes while restricting the loss of variability. The specific objective would depend on the particular interest of the managers of the program. This kind of approach was followed by Fernández et al. [31] in the context of optimizing the sampling strategy for establishing a gene bank. In particular, they developed an algorithm that simultaneously allows targeting frequencies for alleles at a particular locus while controlling the genetic diversity of other unlinked loci.
It could be also possible to combine both coancestry matrices (θ LH and θ VR ) in the objective function when the specific objective differs across genomic regions (i.e., in some regions the interest may be to maintain diversity, and in other regions the interest may be to maintain frequencies). Maintaining diversity may be of interest for regions associated with inbreeding depression for fitness-related traits and also for regions that harbor loci involved in general resistance to diseases (e.g., the major histocompatibility complex, MHC) as a high level of genetic diversity is desirable to ensure that the population can deal with potential new disease challenges [21]. Maintaining frequencies may be of interest in regions containing loci that have been under natural or artificial selection, and one wants to keep the genetic progress obtained. Gómez-Romano et al. [21] showed that the OC method using a matrix equivalent to θ LH is efficient in maintaining H e in specific regions and simultaneously restricts the loss of H e in the rest of the genome. Their approach could be extended to include the use of θ VR for minimizing the change in allele frequencies in some genomic regions. However, it has to be kept in mind that the higher the number of different parameters to be controlled, or the more regions to be treated differently, the lower the control of each objective one can expect.
In a conservation program, the maintenance of genetic variability throughout the genome is the general aim because usually there is no information available on the relevance of each genome region and the current or future use of the genetic variability present in particular regions. Therefore, it is better to conserve as much diversity as possible because if alleles are lost in a population, they will be no longer available. However, this strategy can lead to the maintenance or even an increase in the frequency of deleterious alleles. Different methods have been proposed to avoid this when using the OC method, including (i) selection of the best sib from the group of offspring generated by the selected parents [28] and (ii) combining selection with inbred matings [14] to allow for some kind of purging. Sonesson et al. [32] also proposed a model in which they tried to eliminate a disease from a population in different scenarios by explicitly performing selection against this condition. Currently, genomics can provide information on deleterious variability and the loci determining the occurrence of the disease [33], so a strategy where selection is made against these deleterious alleles [17], while you restrict the loss of variability in the rest of the genome, could be possible.
The amount of genetic variability retained was measured as the expected heterozygosity (H e ). However, other measures such as allelic diversity can be used [13,34]. Allelic diversity is essential from an evolutionary perspective, since the limit of selection response is determined by the initial number of alleles [35,36]. It is worth noting that strategy S O_VR would be more efficient than strategy S O_LH , not only to maintain allele frequency but also to maintain diversity when this is measured as the number of unobserved loci segregating. It is thus clear that the coancestry matrix to be used in OC when managing a particular genetic conservation program would be case specific.
Finally, it is worth mentioning that further work is needed to explore how the relaxation of some of the assumptions implicit in our simulations could affect the results obtained. Extra work would be necessary to investigate schemes with overlapping generations, variable population size over the management time frame, and different degrees of relatedness between the founders.

Conclusions
When applying strategy S O_LH , more H e is maintained than when applying strategy S O_VR given that S O_LH moves allele frequencies towards 0.5. However, S O_VR maintained allele frequencies closer to those of the initial generation and more loci segregating than S O_LH . Therefore, considering that conservation programs generally aim to increase genetic diversity, but it is also important to maintain population uniqueness, the choice of which genomic coancestry matrix is used in management may depend on which of these two goals is more important for each particular case. When a subset of SNPs with MAF > 0.05 or MAF > 0.25 is used to estimate coancestry matrices, the differences between both strategies in terms of both H e and KL were reduced. The differences between strategies were smaller for populations of smaller sizes given that in a smaller population it is more difficult to differentiate between individuals.

Data Availability Statement:
The codes to perform the simulations are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare that there are no competing interests with respect to the authorship or publication of this article.