Whole Genome Screening Procures a Holistic Hold of the Russian Chicken Gene Pool Heritage and Demographic History

Simple Summary A collection of native farm animal breeds can be considered as a gene pool and a national heritage. Long-term artificial selection in domesticated animals has certain effects on their genomes, which can be investigated using genome-wide screens for DNA sequence variation, that is, so-called single nucleotide polymorphism (SNP) screens. Here, we looked at the genomes of 19 Russian chicken gene pool breeds, both native and imported, evaluating the contrasting egg, meat and dual-purpose types. Based on genetic diversity statistics, we identified differences between the breeds using many DNA markers (SNPs) that may represent genomic regions that are being selected for, either within a specific breed or shared between breeds. Our research will be helpful for further understanding the genomic diversity and demographic history of Russian domestic chickens. This would be essential for their successful breeding. Abstract A study for genomic variation that may reflect putative selective signaling and be associated with economically important traits is instrumental for obtaining information about demographic and selection history in domestic animal species and populations. A rich variety of the Russian chicken gene pool breeds warrants a further detailed study. Specifically, their genomic features can derive implications from their genome architecture and selective footprints for their subsequent breeding and practical efficient exploitation. In the present work, whole genome genotyping of 19 chicken breeds (20 populations with up to 71 samples each) was performed using the Chicken 50 K BeadChip DNA chip. The studied breed sample included six native Russian breeds of chickens developed in the 17th–19th centuries, as well as eight Russian chicken breeds, including the Russian White (RW), created in the 20th century on the basis of improving local chickens using breeds of foreign selection. Five specialized foreign breeds of chickens, including the White Leghorn (WL), were used along with other breeds representing the Russian gene pool. The characteristics of the genetic diversity and phylogenetic relationships of the native breeds of chickens were represented in comparison with foreign breeds. It was established that the studied native breeds demonstrate their own genetic structure that distinguishes them from foreign breeds, and from each other. For example, we previously made an assumption on what could cause the differences between two RW populations, RW1 and RW2. From the data obtained here, it was verified that WL was additionally crossed to RW2, unlike RW1. Thus, inherently, RW1 is a purer population of this improved Russian breed. A significant contribution of the gene pool of native breeds to the global genetic diversity of chickens was shown. In general, based on the results of a multilateral survey of this sample of breeds, it can be concluded that phylogenetic relationships based on their genetic structure and variability robustly reflect the known, previously postulated and newly discovered patterns of evolution of native chickens. The results herein presented will aid selection and breeding work using this gene pool.


DNA Isolation
The manufacturer's recommendations were followed when extracting DNA using Nexttec columns (Nexttec Biotechnologie GmbH, Leverkusen, Germany). We employed a Qubit 3.0 fluorometer to define the concentration of dsDNA solutions (Thermo Fisher Scientific, Wilmington, DE, USA). Values of OD260/280 ratio were measured using a NanoDrop-2000 to determine the recovered DNA purity (Thermo Fisher Scientific).

SNP Markers and Genotyping Quality Control
Using a Chicken 50 K CobbCons SNP microarray (Illumina, San Diego, CA, USA), individual sample genotyping was performed. Before applying any quality control filters, 53,872 SNP markers were available. Using PLINK 1.9 software [50], the following filters were set up to adjust the quality of the SNP genotypes: at least 90% of loci (--geno 0.1) successfully genotyped in at least 80% of samples (--mind 0.2), with minor alleles being contained at least 5% of the time (--maf 0.05) and the linkage disequilibrium (LD) criterion being greater than 50% (--indep-pairwise 50 5 0.5). A total of 39,778 SNPs on 28 autosomes (GGA1 to GGA28) were selected after filtering for further examination. The R software (version 4.2.3) environment was used to create the input files for the analysis and subsequent visualization of the findings [51].

Genetic Diversity, Population Structure and Split and Admixture
The observed heterozygosity (H O ), expected heterozygosity (H E ), unbiased expected heterozygosity ( U H E ) [52], rarefied allelic richness (A R ) [53] and U H E -based inbreeding coefficient ( U F IS ) were computed in the R package diveRsity [54] to measure genetic diversity of the breeds studied. We also carried out the analysis of molecular variance (AMOVA) for our dataset using the following hierarchical structure: breed group (old Russian, improved Russian and specialized foreign) > breed > sample. This analysis was performed using the R package poppr [55]. Using PLINK 1.9, genetic differences between the examined breeds were determined. The R package ggplot2 was used to conduct the principal component analysis (PCA) visualization [56]. Additionally, PCA plots as well as hierarchical clustering trees using Euclidean distances were generated using the Phantasus web application [57].
The SplitsTree 4.14.5 software [58] and T-REX web server [59] were employed to plot Neighbor-Net and Neighbor-Joining [60] dendrograms using a matrix of pairwise F ST [61] and Reynolds et al. [62] genetic distances. The former distance was the respective statistic also known as the fixation index. The latter distance was calculated based on the appropriate Reynolds et al. [62] algorithm that assumed that genetic differentiation occurs only through genetic drift and without mutations. In addition, the Neighbor-Joining trees were reconstructed using the ADDTREE [63] and Unweighted Neighbor-Joining [64] methods. Moreover, heat maps and trees were generated with the ClustVis web tool [65] using Euclidean distances for both rows and columns of the matrix (with the average option selected for the clustering method).
Using the Admixture v1.3 program [66], model-based clustering (obtained by Bayesian clustering) was completed to refine population structure. The cross-validation (CV) procedure was used to compute numbers (K) of clusters (ancestral populations), with K values ranging from 1 to 30 and its lowest CV error being the optimal number of clusters. The findings of the admixture analysis were visualized using the R package BITE [67].

Runs of Homozygosity Analysis
For the estimation of ROHs, the consecutive runs technique [68] implemented in the R package detectRUNS (version 0.9.6) [69] was used, which is a window-free method for consecutive SNP-based detection. In order to prevent underestimating the number of ROHs longer than 8 Mb, we permitted one SNP with an unknown genotype and up to one potential heterozygous genotype in one run [70]. We set the minimum length of an ROH at 500 kb to account for strong linkage disequilibrium (LD), which usually extends up to about 100 kb [71], and to exclude short and very frequent ROHs. We determined the minimum number of SNPs (l), which was first determined by Lencz et al. [72] and subsequently modified by Purfield et al. [73], in order to reduce false-positive results: where n s is the average number of SNPs that each individual has been genotyped for, n i is the total number of individuals that have been genotyped, α is percentage of false-positive ROHs (equal to 0.05 as the study's cutoff) and het is the average level of heterozygosity across all SNPs. In our situation, a minimum of 23 SNPs were required to be considered.
Each individual's ROH length and number were calculated, and these values were then averaged across all individuals in each breed. Additionally, we calculated the genomic inbreeding coefficient based on ROH (F ROH ), which is the ratio of the total autosomal SNP coverage to the aggregate of all ROHs for each animal (0.94 Gb).
As indicated by other studies [74,75], putative ROH islands were defined as overlapping homozygous regions shared by more than 50% of analyzed individuals within each breed. Given that shorter segments of 0.3-1 Mb are more common in the genome of white egg-laying hens [76], we set the minimum overlapping length size threshold at 0.3 Mb.

Demographic History Inference
The degree and patterns of population divergence (splits) and the level of gene flow between the studied breeds were inferred using the TreeMix 1.12 program [77]. We tested 0 to 5 migration events (edges) with 30 iterations for each event. The optimal number of migration edges was determined using the R package OptM [78] and TreeMix output files. The common quail (Coturnix coturnix (Linnaeus, 1758)), genotyped using the same chicken microarray, was used as an outgroup in reconstructing the maximum likelihood (ML) trees based on~16K validated SNPs. The respective residual matrices were visualized as heat maps.
Using the algorithm built into the SNeP v.1.1 program [79] and based on LD [79][80][81][82], N e was calculated over the previous 50 to 500 generations. Except for the recombination rate modifier computed following Sved and Feldman [83], the default settings were used. In order to study the rate and direction of N e changes that occurred over 50 generations, an analysis of the N e slope (N eS ) was also performed [84]. Using the median of 50 most recent N e values, the slope of each segment connecting pairs of adjacent N e values was defined and the results were normalized. This enabled us to determine the N e change coefficient for the increase of pairwise coancestry (N eC ) [85] in order to estimate the amount of N e change over the previous 50 generations. Using the SNeP program [79], it was also possible to calculate the mean r 2 (LD) and standard deviation r 2 in a bin, the mean distance (dist) between each pair of SNP markers in a cell and the number of SNP markers (items) used to calculate r 2 in the bin.
Statistical data analysis, including calculation of means, mean standard errors and Student's t-statistics, was performed using Excel (Microsoft Corporation, Redmond, WA, USA) and the appropriate web tools [86,87].

Genetic Diversity
As follows from the genetic diversity data presented in Table 2, the H O values varied in the studied breeds from 0.164 ± 0.001 in WL to 0.365 ± 0.001 in CW. Interestingly, the same pattern, i.e., the minimum values for WL and the maximum ones for CW, persisted for other indices of genetic diversity. In particular, in terms of H E , U H E and A R , WL had the lowest values (0.185 ± 0.001, 0.185 ± 0.001 and 1.452 ± 0.002, respectively), and the largest ones were found in CW (0.368 ± 0.001, 0.371 ± 0.001 and 1.876 ± 0.001, respectively; p < 0.001). The rest of the breeds had intermediate diversity indicator values. Collectively, we can see that the studied commercial WL line, purposefully bred for egg performance, has the characteristic least genetic variability. Native and foreign DPBs (of the EMB and MEB subtypes of utility) had a significantly greater genetic variation. Another specialized ETB of RW resulted from crossing local chickens with that WL fell into the same intermediate diversity group. Finally, the maximum variability was observed in CW, a commercial MTB synthetic by origin, derived from the Asiatic GBs and MTBs of Asil, White Malay, Indian Game and Cochin, and subjected to selection for meat production traits contrastingly different from the ETB of WL. One indigenous MEB, YC, was also close to CW in terms of variability (Table 2), probably due to the fact that at least five different breeds, including the Asiatic MTBs of Brahma, Cochin and Langshan, are believed to have been used for its creation. In addition, there was a trend of higher average diversity indices (i.e., heterozygosities and allelic richness) in the old Russian indigenous breeds, intermediate ones in the group of improved Russian breeds and lower ones among the specialized foreign breeds ( Table 2). We also calculated the diversity indices for the three breed groups (i.e., old Russian, improved Russian and specialized foreign) obtained by pooling individual genotypes within each breed group (Supplementary Table S1). As a result, the old Russian breed group had greater mean values of diversity indicators, with intermediate and lower values being in the improved Russian and specialized foreign breed groups, respectively. The results of the AMOVA-based analysis for the three breed groups (i.e., old Russian, improved Russian and specialized foreign) are presented in Table 3. The main share of variation conformed to the differences between individuals within the populations (83.71%, p-value = 0.001). Variation between the breed groups was only 1.36% (p-value = 0.011). The results obtained were convincing since some breeds, including those belonging to different groups, were of mixed origin. A comparable pattern was observed in a study of Chinese gamecocks and native chicken breeds [88]. However, a relatively high proportion of variation between breeds within groups (14.93%, p-value = 0.001; Table 3) may be indicative of the apartness of each breed as a result of selection work. When completing a comprehensive PCA of the genetic variability within this breed sample using the four diversity indicators H O , H E , U H E and A R , we were able to distinguish four breed clusters (Supplementary Figure S1A). The first one was the farthest from the rest of the clusters, located in the lower right corner of the PCA plot and formed by the single ETB of WL that had a pronounced minimum variability. The second cluster was situated in the left extreme position on the graph and included two breeds with the greatest variation, the MTB of CW and the MEB of YC. The other two clusters were located in the central part of the PCA plot and mainly consisted of DPBs. Herewith, the third cluster was composed of breeds with higher values of variability indicators, and the fourth cluster (which also involved the ETB of RW) was represented with breeds that had a lower variation. A similar distribution of the studied breeds based on the same four diversity indices was also observed using the hierarchical clustering procedure (Supplementary Figure S1B).
We also evaluated the diversity of genotyped breeds depending on inbreeding coefficients that characterized the genetic structure of populations from a little different angle ( Table 2). Negative U F IS values indicated a slight excess of heterozygotes in OMF, PC, LMF and NH (−0.001 to −0.067), while other breeds showed no or a slight deficiency of heterozygotes (0.000 to 0.103) than what would be expected under the Hardy-Weinberg equilibrium (Table 2). Therefore, we observed a rather significant scatter in the number of heterozygotes in the studied breeds as tested for a deviation from the Hardy-Weinberg equilibrium. Notably, WL had not only the minimum indicators of genetic variability, but also the highest values of both the generally accepted inbreeding coefficient (F IS = 0.080) and its unbiased derivative, U F IS (0.103). This was indicative of some shifts in the population structure of this selected WL line towards an excessive number of homozygotes. Judging by the F IS coefficient, many other breeds, in contrast, showed some redundancy of heterozygotes, except for RC, Ush, YC, AS, AoB and CW. On the other hand, based on the U F IS values, the population structure of some old indigenous (RBB, RC, Ush and YC), improved native (UshF, ZS, RW, Pm, Kt and AS) and specialized foreign (CW, RIR and AoB) breeds tended toward excessive homozygotes.
The distribution of genotyped breeds based on the inbreeding coefficients F IS and U F IS and using PCA and hierarchical clustering is shown in Supplementary Figure S2A,B, respectively. On both of these graphs, five major clusters can be distinguished. WL that had the maximum F IS and U F IS values formed one well-isolated single cluster. Kt was located separately from other breeds as another single cluster. OMF that had an excess of heterozygotes also formed a separate cluster. One more cluster was composed of five breeds that also demonstrated a certain excessive number of heterozygotes. The largest cluster (No. 3) can be divided into two subclusters: in one (3a), there was some bias towards an excess of heterozygotes, and in the other (3b), a redundancy of homozygotes (Supplementary Figure S2A).

ROH Distribution in Chicken Breed Genomes
According to Table 4, we revealed the lowest mean number of ROH segments in the YC genome (70.11) (Table 4). These observations were highly concordant with the above genetic diversity data for the same chicken breeds studied (Table 2). When pooling individual genotypes to form the respective three breed groups, the old Russian breeds had significantly lower ROH-based statistics than those in the improved Russian breeds (Supplementary Table S1). The respective Manhattan plots of the distribution of ROH islands in individual chicken breeds can be seen in Supplementary Figure S3.

Breed Relationship and Admixture
As one of the approaches to assessing the genetic structure and differentiation of the studied breeds based on SNP genotyping, the PCA method was used. The results of the corresponding clustering of genotyped individuals and breeds are shown in Figure 1 Figure 1 demonstrated a high degree of genetic identification of genotyped individuals, i.e., there was a fairly good match of a single individual to the breed cluster to which this individual belonged. This proved, firstly, a high resolution power of genome-wide genotyping of this sample of breeds using a mediumdensity SNP chip. Secondly, there was a definite consolidation of the genetic structure of most breeds. Remarkably, both the PC1-PC2 (Figure 1a) and PC1-PC3 (Figure 1b) plots well reflected the genetic differentiation of breeds and explained about the same amount of total variance, pointing out their fairly significant and adequate display of clustering patterns. As can be seen on these PCA plots (Figure 1a,b), certain breeds were characterized by a rather pronounced genetic uniqueness. These included a distinctive ETB cluster of compactly displayed WL individuals and two closer, but still differentiated, populations RW1 and RW2. Such old indigenous breeds as Ush and RBB, as well as OMF, were also well distinguished from the other breeds, with Ush and RBB showing a great genetic similarity relative to each other (Figure 1a). An improved native breed of KJ appeared to have some unique genetic features, too (Figure 1b), whereas the other, mostly synthetic by origin, DPBs formed a conglomeration of breeds situated quite close to each other (Figure 1a). The latter embraced clusters of individuals of the breeds that were more similar in genetic structure to each other and were located more crowded in the bottom right ( Figure 1a) and upper right (Figure 1b) parts of the two graphs. Within this conglomeration, we observed a closer vicinity of RIR, NH (derived from RIR) and PC (presumably intercrossed with RIR and NH; Figure 1a). Closer to each other were also Pm and AS (originated from Pm).
The pairwise interbreed genetic distances that were computed using the F ST statistic and the Reynolds et al. [62] algorithm are presented in Supplementary Tables S2 and S3, respectively. Using these obtained distances for the full spectrum of breeds genotyped here, the respective phylogenetic trees were built as shown in Figure 2 and Supplementary  Figures S5 and S6. According to the similar tree topology presented in these figures, we identified the presence of between-breed phylogenetic relationships, e.g., for such pairs/trios of breeds as (1)  In addition, we performed an admixture-based genetic structure analysis, determined the mixing degree in the composition of the given set of individuals and populations (breeds) based on the SNP genotype information and estimated the origin of populations from K hypothetical ancestral populations. As followed from the procedure for calculating CV errors for different K values, the minimum error value corresponded to K = 26 (Supplementary Figure S7). When discriminating the populations using admixture, the corresponding mixing patterns of breed composition were observed (Figure 3, Supplementary Figures S8 and S9). The value of K = 2 appeared to conform to the two main evolutionary lineages of domestic chickens, of the Mediterranean (ETB) and Asiatic (MTB) roots [89]. WL, a typical breed of Mediterranean origin, had a solid dark blue bar coloration on the admixture plot at K = 2. Two ETB populations of RW had a significant proportion of WL genotypes in their genomes, with their respective bars being largely dark blue on this plot. In the genomes of other breeds, the red Asiatic component dominated, while dark blue Mediterranean genotype variants were present to a much lesser extent, as was also reflected on the admixture plot. At K = 3, one more ancient evolutionary Asiatic lineage of GB (of yellow bar color) may have been added, which participated to some extent in the formation of many later breeds [89]. At K = 4, another ETB sublineage (of green bar color) emerged. Accordingly, WL apparently began to break up into two sublineages, whereas RW1 was mostly composed of the second ETB sublineage variants, which could, to some extent, affect further steps in the identification of ancestral populations. The optimal patterns of the breed-specific genetic structure were found at K = 26. On the corresponding bar plot, the unique genomic composition of each breed can be seen (Figure 3, Supplementary Figures S8 and S9), suggesting the distinctive patterns of breed consolidation and stratification in accordance with their origin and breeding history. Although the situation when K is more than the number of populations may seem unusual, such a pattern can be observed when the samples of various breeds studied are of mixed origin. Accordingly, the number of clusters K equal to, or lower than, the number of populations was not enough in our case to differentiate samples between breeds clearly. With an increase in the number K, clusters began forming within breeds, uniting samples not by breed, but by other factors, e.g., belonging to a line or a family.    Table 1.
The pairwise interbreed genetic distances that were computed using the FST statistic and the Reynolds et al. [62] algorithm are presented in Supplementary Tables S2 and S3, respectively. Using these obtained distances for the full spectrum of breeds genotyped here, the respective phylogenetic trees were built as shown in Figure 2 and Supplementary  Figures S5 and S6. According to the similar tree topology presented in these figures, we identified the presence of between-breed phylogenetic relationships, e.g., for such pairs/trios of breeds as (1) Table 1.
In addition, we performed an admixture-based genetic structure analysis, determined the mixing degree in the composition of the given set of individuals and populations (breeds) based on the SNP genotype information and estimated the origin of populations from K hypothetical ancestral populations. As followed from the procedure for calculating CV errors for different K values, the minimum error value corresponded to K = 26 (Supplementary Figure S7). When discriminating the populations using admixture, the corresponding mixing patterns of breed composition were observed (Figure 3, Supplementary Figures S8 and S9). The value of K = 2 appeared to conform to the two main evolutionary lineages of domestic chickens, of the Mediterranean (ETB) and Asiatic (MTB) roots [89]. WL, a typical breed of Mediterranean origin, had a solid dark blue bar coloration on the admixture plot at K = 2. Two ETB populations of RW had a significant proportion of WL genotypes in their genomes, with their respective bars being largely dark blue on this plot. In the genomes of other breeds, the red Asiatic component dominated, while dark blue Mediterranean genotype variants were present to a much lesser extent, as was also reflected on the admixture plot. At K = 3, one more ancient evolutionary Asiatic lineage of GB (of yellow bar color) may have been added, which participated to some extent in the formation of many later breeds [89]. At K = 4, another ETB sublineage (of green bar Figure 2. Phylogenetic relationships established between the studied chicken breeds (populations) based on the F ST genetic distances. Unrooted radial trees were reconstructed using the Neighbor-Net algorithm [58] (a) and the Neighbor-Joining/ADDTREE method [63] (b) Breed codes are given in Table 1. color) emerged. Accordingly, WL apparently began to break up into two sublineages, whereas RW1 was mostly composed of the second ETB sublineage variants, which could, to some extent, affect further steps in the identification of ancestral populations. The optimal patterns of the breed-specific genetic structure were found at K = 26. On the corresponding bar plot, the unique genomic composition of each breed can be seen (Figure 3, Supplementary Figures S8 and S9), suggesting the distinctive patterns of breed consolidation and stratification in accordance with their origin and breeding history. Although the situation when K is more than the number of populations may seem unusual, such a pattern can be observed when the samples of various breeds studied are of mixed origin. Accordingly, the number of clusters K equal to, or lower than, the number of populations was not enough in our case to differentiate samples between breeds clearly. With an increase in the number K, clusters began forming within breeds, uniting samples not by breed, but by other factors, e.g., belonging to a line or a family.   Table 1.

Population Divergence and Gene Flow
One of the demographic history inference aspects explored here was the assessment of breed/population divergence (or split) and gene flow (or migration events) [77]. The respective analysis results are presented in Figure 4 and Supplementary Figure S10A-D. As exemplified in Figure 4a, there was a migration event from the common ancestor of RW and WL to the ancestor of ZS and Kt. This does not contradict the history of breed origin: RW, indeed, was used in developing ZS and Kt, and this explains the discovered gene flow example. We also discovered several other gene flow events that conformed to the known origin and breeding history of the breeds (Supplementary Figure S10A). In addition, maximum likelihood trees in Figure 4a and Supplementary Figure S10A,B reflected the plausible breed/population divergence that was largely in agreement with the PCA plots ( Figure 1) and phylogenetic dendrograms (Figure 2, Supplementary Figures S5 and S6). However, the maximum likelihood tree reconstruction algorithm was more sensitive in placing the MTB of CW as a single basal offshoot not related directly to any other breed.

Population Divergence and Gene Flow
One of the demographic history inference aspects explored here was the assessment of breed/population divergence (or split) and gene flow (or migration events) [77]. The respective analysis results are presented in Figure 4 and Supplementary Figures S10A-D. As exemplified in Figure 4a, there was a migration event from the common ancestor of RW and WL to the ancestor of ZS and Kt. This does not contradict the history of breed origin: RW, indeed, was used in developing ZS and Kt, and this explains the discovered gene flow example. We also discovered several other gene flow events that conformed to the known origin and breeding history of the breeds (Supplementary Figure S10A). In addition, maximum likelihood trees in Figure 4a and Supplementary Figure S10A and S10B reflected the plausible breed/population divergence that was largely in agreement with the PCA plots ( Figure 1) and phylogenetic dendrograms (Figure 2, Supplementary  Figures S5 and S6). However, the maximum likelihood tree reconstruction algorithm was more sensitive in placing the MTB of CW as a single basal offshoot not related directly to any other breed.

Effective Population Size
The assessment results of demographic history in terms of N e values at the present time and their change over recent 50-500 generations among the studied breeds are presented in Figure 5, Supplementary Table S4 and Supplementary Figures S11 and S12. It is known that N e is an important indicator that describes the demographic history of a particular population. Consideration of this indicator also allows us to assume how many individuals could participate in the formation of the breed [90]. Based on the genome-wide analysis, CW had the largest N e value three generations ago (N e = 321; Supplementary Table S4 Table S4, Figure 5b), WL had the smallest N e (694), while CW and YC had the largest N e (1814 and 2279, respectively). Remarkably, there was a sharp N e reduction in YC (from 2279 to 141), suggesting that this breed went through a severe genetic bottleneck, as confirmed by the fact that YC was almost extinct by the end of World War II. It can be assumed that breeds with a low N e , both at present and 50-500 generations ago, apparently descended from a limited number of ancestors or were under significant selection pressure in a number of generations. However, it should be noted that, as a rule, a majority of the studied breeds were developed by hybridization of several breeds, as well as by "blood refreshing" (i.e., introduction of single telic mating to another population or breed), which could generally be reflected in similar dynamics patterns of their demographic history and N e changes in a series of generations. derived from the TreeMix analysis for a single migration event, expressed as the number of standard error deviations for the observations in the respective breeds.

Effective Population Size
The assessment results of demographic history in terms of Ne values at the present time and their change over recent 50-500 generations among the studied breeds are presented in Figure 5, Supplementary Table S4 and Supplementary Figures S11 and S12. It is known that Ne is an important indicator that describes the demographic history of a particular population. Consideration of this indicator also allows us to assume how many individuals could participate in the formation of the breed [90]. Based on the genomewide analysis, CW had the largest Ne value three generations ago (Ne = 321; Supplementary Table S4; Figure 5a Table S4, Figure 5b), WL had the smallest Ne (694), while CW and YC had the largest Ne (1814 and 2279, respectively). Remarkably, there was a sharp Ne reduction in YC (from 2279 to 141), suggesting that this breed went through a severe genetic bottleneck, as confirmed by the fact that YC was almost extinct by the end of World War II. It can be assumed that breeds with a low Ne, both at present and 50-500 generations ago, apparently descended from a limited number of ancestors or were under significant selection pressure in a number of generations. However, it should be noted that, as a rule, a majority of the studied breeds were developed by hybridization of several breeds, as well as by "blood refreshing" (i.e., introduction of single telic mating to another population or breed), which could generally be reflected in similar dynamics patterns of their demographic history and Ne changes in a series of generations.  Table 1.
Using the NeS method [84], we also identified subtle changes in the estimated Ne curves ( Figure S12).
In addition to the obtained plots of Ne change dynamics ( Figure 5, Supplementary  Figures S11 and S12), we characterized the respective breed distribution using PCA (Supplementary Figure S13A) and hierarchical clustering (Supplementary Figure S13B). When combining the Ne data for 3, 46 and 520-525 generations ago, the greatest distinctiveness and distance of the trendline relative to the other breeds was inherent in CW and, to a lesser extent, YC. Further, it was possible to observe the proximity for the following three pairs of breeds: WL-Ush, RW-Pm and OMF-ZS. The rest of the breeds formed a crowded core, which can generally indicate the similarity of their dynamics of Ne changes over the last 520-525 generations. Based on the combined data of r 2 , dist and items for three generations ago (not shown) as assessed using the SNP-based genome-wide analysis, the patterns of breed distribution by LD were also obtained (Supplementary Figure S14). The lowest LD values were characteristic of the ETBs of WL and RW, and the highest one was  Table 1.
Using the N eS method [84], we also identified subtle changes in the estimated N e curves (Figure 5c, Supplementary Figure S12), which are not visually detectable in the N e plots (Figure 5a,b, Supplementary Figure S11) and provided more detailed information about changes in N e . This analysis revealed sharp fluctuations in N e in CW that occurred approximately over the last 15-25 generations (Figure 5c). Less pronounced fluctuations in N e were observed over the last 5-15 generations in some other populations, e.g., OMF, RC, YC, RW1, RW2, ZS and KJ. The earlier demographic history was characterized by approximately similar and uniform N eS in all populations that corresponded to a relative increase in N e change and was represented as a bunch of horizontal lines above zero on the y-axis (Figure 5c and Supplementary Figure S12).
In addition to the obtained plots of N e change dynamics ( Figure 5, Supplementary  Figures S11 and S12), we characterized the respective breed distribution using PCA (Supplementary Figure S13A) and hierarchical clustering (Supplementary Figure S13B). When combining the N e data for 3, 46 and 520-525 generations ago, the greatest distinctiveness and distance of the trendline relative to the other breeds was inherent in CW and, to a lesser extent, YC. Further, it was possible to observe the proximity for the following three pairs of breeds: WL-Ush, RW-Pm and OMF-ZS. The rest of the breeds formed a crowded core, which can generally indicate the similarity of their dynamics of N e changes over the last 520-525 generations. Based on the combined data of r 2 , dist and items for three generations ago (not shown) as assessed using the SNP-based genome-wide analysis, the patterns of breed distribution by LD were also obtained (Supplementary Figure S14). The lowest LD values were characteristic of the ETBs of WL and RW, and the highest one was identified in the MEB of Kt, all three being located separately on the respective PCA plot (Supplementary Figure S14A). The remaining breeds formed two large clusters, one with slightly lower and the other with slightly higher LD values. Finally, a comprehensive assessment of the studied breeds by parameters showing the current N e (three generations ago) and taking into account the three parameters of LD between SNP markers made it possible to identify four clusters using PCA (Supplementary Figure S15A) and hierarchical clustering (Supplementary Figure  S15B) methods. One cluster involved the MTB of CW, the other one combined two ETBs, WL and RW, and all other breeds were distributed between two large clusters.

Discussion
To the best of our knowledge, this is the first thorough examination of the Russian chicken gene pool heritage using whole genome screening and focusing on its overall genetic diversity, phylogeny and demographic history. For this purpose, we employed 39,778 autosomal SNPs, explored genome-wide SNP genotypes amongst Russian native breeds developed in the Russian Empire and Soviet Union, along with the specialized foreign breeds.
Based on the performed analysis of the biodiversity of a large sample of native and foreign breeds, we can suggest the presence of very specific patterns of variability and genetic structure in the surveyed populations. For example, WL chickens were different from other breeds that are basically synthetic by origin. This appears to indicate a stronger and longer selection for egg production traits, which this line and the WL breed as a whole were subjected to, and, possibly, a narrower structural variability in the genomes of its ancestral forms. The greatest U F IS value (0.103) was identified in the WL population, which, along with the lowest diversity indices, evidenced a severe selection pressure in this line of laying hens.
The advantage of examining and contrasting the 19 breeds that are typical examples of important, distinct and, in some ways, opposing evolutionary lineages occurred throughout the process of domesticating and breeding chickens [89] has been used in the current study. In particular, WL and RW were representatives of ETBs selected for fecundity, egg number and other egg production traits, CW for meat traits and the others for dual purpose. We discovered that 47.5 to 87.5% of the variability was attributed to allelic variation within the breeds, while 12.5 to 52.5% of the variability was brought on by genetic variations between the breeds compared in the pairwise mode (F ST = 0.125 in the YC-RC pair to 0.525 in WL-NH). WL was characterized by less genetic variability than the other breeds ( Table 1). The genetic drift that had a place in the WL line, which has undergone extensive selection for egg performance as a closed population, may be one explanation for this. A larger degree of genetic variation in CW chickens, however, may possibly be a result of crossbreeding of few diverse breeds used for developing CW. WL had the greatest F ROH inbreeding coefficient (0.551), and YC and CW the lowest ones (0.154 and 0.162, respectively). This may be an indication that WL descended from a small number of founders (e.g., [91]), whereas YC and CW, in contrast, originated from a larger number of founders. WL may have an excess of homozygotes due to a higher level of selection pressure for egg production traits, while some other breeds, especially DPBs, with an excess of heterozygotes, are subject to selection for more diverse breeding targets in comparison with WL [92]. Another reason why the number of homozygotes or heterozygotes significantly deviated among the chicken breeds studied from what would be predicted under Hardy-Weinberg equilibrium is genetic drift.
The PCA plotting, phylogenetic analysis and admixture clustering results (Figures 1-3, Supplementary Figure S9) suggested that the breeds were to a large extent consolidated and showed peculiar (i.e., breed-specific) admixture patterns. The genome-wide examination convincingly distinguished the breeds and supported their distinct genetic origins and selection history. In particular, two ETBs, WL and RW, demonstrated a close genetic kinship, with RW being previously exposed to crossing with WL and both breeds being bred for egg performance traits. Interestingly, RW1 and RW2 showed a clear differentiation, with RW2 being closer to WL, which suggested a varied genetic background and significant differences in genetic structure of these two RW populations. Previously, Abdelmanova et al. [42] suggested the differences between RW1 and RW2, only assuming what caused them. From the data obtained, it can be seen that RW2 was once again crossed with WL, in contrast to RW1. Inherently, RW1 is a purer population of this improved Russian breed. In contrast, CW developed from the crossing of GBs and MTBs and selected for characteristics related to growth, muscle development and meat production was located in an opposite cluster in Figures 1 and 2. Overall, as shown in Figure 2, there were three major superclusters composed of the 20 breeds/populations studied. The first supercluster involved ETBs (WL, RW1 and RW2) and old Russian indigenous DPBs of OMF, Ush, RBB and YC. The second supercluster embraced one MTB (CW) and most of the DPBs, both of specialized foreign and Russian native origins. The third supercluster was made up by four Russian native DPBs (KJ, Kt, RC and ZS). Interestingly, there was a migration event from the common ancestor of WL and RW2 to Kt and ZS, confirming the known information about the origin of Kt and ZS. Collectively, our findings revealed genetic uniqueness of such Russian breeds as OMF, Ush, RBB, YC, KJ, Kt, RC and ZS, suggesting their further continued preservation and breeding.
Many breeds explored here and manifesting different phenotypic traits were previously included in other phylogenetic studies and showed similar relationship patterns (e.g., [15][16][17][18][19][20]37,89,[93][94][95][96][97][98]). For instance, a thorough comparative phylogenetic assessment of several chicken breeds was conducted by Moiseyeva et al. [89] utilizing two sets of morphological discrete traits, body measurements, biochemical markers and the activity of serum esterase-1 (i.e., carboxylesterase 1 like 1, or CES1L1). In the created dendrograms reflecting evolutionary relationships in chickens, RW was grouped with WL and other ETBs, whereas CW formed shared clusters with GBs and MTBs [89]. Recently, Dementieva et al. [97] localized RW and WC on the opposing branches of an F ST -based Neighbor-Joining tree using the Illumina Chicken 60K SNP iSelect BeadChip. The RW and WC breeds, which represent typical ETBs and MTBs, were also used in our previous investigation [42] based on Chicken 50 K_CobbCons chip-assisted SNP genotypes. The earlier diversity and phylogenetic relationship analyses for RW, WC and other breeds were strongly validated by our present data, supported also by the admixture, gene flow and demographic history patterns we identified here.
Being at the junction between East and West, the vast territory of Russia has historically been a crossroads of trade routes, along which there was an intensive exchange of breeds of domestic animals, including poultry. Having settled in Russia, these breeds adapted to diverse and often harsh climatic conditions (e.g., [99]), and often lost their original breed traits, turning into local ecotypes and populations of mongrel animals or crossbreds, giving rise to new local breeds [100][101][102][103][104]. The degree of polymorphism in Russian indigenous breeds is comparable to that of Asian breeds generally. Russian breeds represent a mixed Eurasian group by origin and, therefore, included the gene pools of the two global regions, with a predominant contribution from the Asian region, which could predetermine their high polymorphism [3]. When considering the reasons for a greater variability of Russian breeds shown here in comparison with foreign breeds (mostly of Western origin), one should not forget about the large territory occupied by the Russian Empire and the USSR in the past and by the Russian Federation at present. Here, as for the whole Asian region, we should recognize the significance of the diversity of climatic, landscape and ethnic conditions for the emergence of many and different ecotypes and breeds of animals, including poultry [3].

Conclusions
Judging from the biodiversity analysis of this large sample of native and foreign breeds, we noted very specific patterns of variability and genetic structure of the surveyed breeds, with WL being especially different from other breeds of synthetic origin. This, apparently, may indicate a stronger and longer selection for egg-laying traits, which this line and the WL breed as a whole were subjected to, and, possibly, a narrower structural variation in the genomes of its ancestral stocks. The phylogenetic analysis results did not contradict the available data on the origin and breeding history of the breeds. For example, WL and RW are typical specialized ETBs selected for egg production; moreover, WL roosters were used for the creation of RW. Ush and RBB are old indigenous breeds that are tolerant to cold, have a brooding instinct and are also used for fancy purposes. RIR was an original breed for developing NH, and both were supposedly involved in the creation of PC. One of the initial breeds in the development of AS was Pm; in addition, when creating both breeds, YC was also used. ZS was used to produce Kt, and both breeds descended from NH and RW among other parent breeds. This historical breed information was in line with the present whole genome screening study.
In general, using the results of a multilateral survey of this sample of breeds, it can be argued that phylogenetic relationships based on their genetic structure and variability reflect the known, previously postulated and discovered patterns of evolution and selection in chickens [33,42,89,[105][106][107]. These findings add to our understanding of genomic diversity, phylogeny and admixture in the genomes of unique Russian gene pool breeds with divergent selection histories and phenotypic features. The outcome of the current study will be beneficial for Russian chicken breed conservation, sustainable breeding and effective selection.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/biology12070979/s1. Table S1: Statistics for diversity and runs of homozygosity when grouping breeds by origin. Table S2: Pairwise F ST -based interbreed genetic distances. Table S3: Pairwise Reynolds et al. [62] distances between the chicken breeds studied. Table S4: Effective population size changes over recent 50-500 generations. Figure S1: Distribution of SNP-genotyped chicken breeds into four clusters depending on their genetic variability. Figure S2: Distribution of SNP-genotyped chicken breeds into five clusters depending on their genetic structure. Figure S3: Manhattan plots of the distribution of runs of homozygosity (ROH) islands in each chicken breed. Figure S4: 3D PCA plot for the studied chicken breeds/populations based on genome-wide SNP genotypes. Figure S5: Unrooted Neighbor-Joining phylogenetic trees reconstructed for the 20 studied chicken breeds/populations based on the F ST genetic distances. Figure S6: Unrooted radial Neighbor-Joining phylogenetic trees for 17 breeds (except KJ and NH) according to the results of SNP genotyping. Figure S7: Plot of crossvalidation errors (CV error). Figure S8: Admixture-derived [66,67] structure of studied populations in the form of a semicircle. Figure S9: Admixture-derived structure of the studied populations with K = 2 to 26 ancestral populations. Figure S10: Assessment of the degree of divergence and the level of gene flow between the studied breeds. Figure S11: Dynamics of changes in the effective population size (N e ). Figure S12: Analysis of the N e slopes [84]. Figure S13: Distribution of the 17 studied breeds (except KJ and NH) when combining the N e data for 3, 46 and 520-525 generations ago. Figure S14: Distribution of the 17 studied breeds (except KJ and NH) when combining the r 2 , dist and items data for 3 generations ago. Figure S15: Distribution of the 17 studied breeds (except KJ and NH) when combining the N e , r 2 , dist and items data for 3 generations ago. Informed Consent Statement: Not applicable.

Data Availability Statement:
The genotyping data presented in this study can be shared with third parties upon approval from the GWMAS Consortium.