Population Structure Analysis of the Border Collie Dog Breed in Hungary

Simple Summary
The appearance of dog breeds is constantly changing, for many reasons. The Border Collie breed has several lines depending on sport, show, or work requirements, with closed breeding practices within these lines in recent decades. The aim of the study was to map the current population in Hungary and determine the possible inbreeding levels in and between the different subpopulations. The main finding of the study was that there is a detectable genetic divergence between the show and working lines. In addition, genetic variability within the breed is decreasing due to a lack of suitable mating plans and of breeder education, as breeders repeatedly choose to breed animals with similar show-related characteristics. The size of the active breeding population has decreased dramatically in the past years. However, there are many dogs in the country without a pedigree. It can be seen that, despite the proportion of registered breeders, dog owners prefer not to buy purebred dogs, and thus most of the pups born in Hungary are exported to other countries.

Abstract
Pedigree data of the Border Collie dog breed were collected in Hungary to examine genetic diversity within the breed and its different lines. The database was based on available herd books dating from the development of the breed (in the late 1800s) to the present day. The constructed pedigree file consisted of 13,339 individuals, of which 1566 dogs (born between 2010 and 2016) made up the living reference population that was active from a breeding perspective. The breed is subdivided by phenotype, showing a thicker coat, harmonious movement, a wide skull, and heavier bones for the show type, and a thinner or sometimes short coat and smaller body for the working line, while the mixed line is quite heterogeneous (a combination of the above). Thus, the reference population was dissected according to the existing lines. The number of founders was 894, but eight individuals were responsible for contributing 50% of the genetic variability. The reference population had a pedigree completeness of 99.6% up to 15 generations and an inbreeding coefficient of 9.86%. Due to the changing breed standards and the requirements of the potential buyers, the effective population size substantially decreased between 2010 and 2016. Generation intervals varied between 4.09 and 4.71 years, where the sire paths were longer due to the later initial age of breeding in males compared to females. Genetic differences among the existing lines calculated by fixation indices are not significant; nonetheless, ancestral inbreeding coefficients are able to show contrasts.


Introduction
The Border Collie is considered one of the most intelligent dog breeds, and originated in the Northumbria region of England. The breed's name is related to the words "border" (between England and

Materials and Methods
The pedigree dataset of the Hungarian Border Collie population was constructed using available electronic herd books and pedigrees from Hungarian breeders. These websites could be searched but their content could not be downloaded, and therefore all genealogy data had to be retyped manually. First, the reference population was defined as the living and active animals (from the perspective of breeding) born between 2010 and 2016. The reference population was composed of 1566 dogs (703 males and 863 females). Then, the available genealogy information of these dogs was traced back and was recorded, creating the pedigree of the whole population from the late 1800s to the present day. The created Hungarian Border Collie pedigree dataset contained 13,339 individuals (5649 males and 7750 females).
The genealogy records used in this study were created using the software "Equihun Pedigree Builder" [4]. The following information was entered: After the pedigree records were exported and their correctness was checked, pedigree analysis was performed using ENDOG software [5]. The structure of the Hungarian Border Collie population was characterized by the following parameters:
• Number of founders (f: ancestors with two unknown parents)
• Effective number of founders (fe: the number of equally contributing founders that would be expected to produce the same genetic diversity as in the population under study)
• Effective number of ancestors (fa: similar to fe but replacing the contributions of founders with the marginal contributions of ancestors)
• Number of ancestors responsible for 50% of the genetic variability (fa50)
• Generation interval (average age of parents at the birth of their progeny kept for reproduction)
• Pedigree completeness (proportion of known ancestors per generation)
• Inbreeding coefficient (the probability that the two alleles at any locus in an individual are identical by descent)
• Average relatedness (the probability that an allele randomly chosen from the whole population belongs to a given animal)
• Effective population size (realized effective population size, estimated from the individual increase in inbreeding) [6].
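As an illustration of how the first two parameters relate, the effective number of founders can be sketched from expected founder contributions as fe = 1 / Σ q_k², where q_k is the expected share of founder k in the reference population. The following is a minimal sketch, not the ENDOG implementation; the toy pedigree, the animal names, and the function names are hypothetical, and every non-founder is assumed to have both parents known.

```python
from functools import lru_cache

# Hypothetical toy pedigree: animal -> (sire, dam); founders have (None, None).
# Assumption: every non-founder has both parents recorded.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": (None, None),
    "D": ("A", "B"),
    "E": ("A", "C"),
    "F": ("D", "E"),
    "G": ("D", "E"),
}


@lru_cache(maxsize=None)
def founder_contributions(animal):
    """Expected share of each founder's genes in `animal`, as (founder, share) pairs."""
    sire, dam = PEDIGREE[animal]
    if sire is None and dam is None:
        return ((animal, 1.0),)  # a founder carries only its own genes
    shares = {}
    for parent in (sire, dam):
        # each parent passes on half of its own founder shares
        for founder, share in founder_contributions(parent):
            shares[founder] = shares.get(founder, 0.0) + 0.5 * share
    return tuple(sorted(shares.items()))


def effective_number_of_founders(reference):
    """fe = 1 / sum(q_k^2), with q_k averaged over the reference animals."""
    totals = {}
    for animal in reference:
        for founder, share in founder_contributions(animal):
            totals[founder] = totals.get(founder, 0.0) + share / len(reference)
    return 1.0 / sum(q * q for q in totals.values())


fe = effective_number_of_founders(["F", "G"])
```

In this toy case three founders are present, but founder A contributes half of the genes of the reference animals, so fe falls below the raw founder count.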
All these parameters were explained in detail by [5]; therefore equations related to the listed parameters will not be repeated here.
Then, the database was dissected into three groups. The inbreeding coefficients were also adjusted in the three lines. In addition, fixation indices (F_IS, F_ST) were calculated to detect the reduction of heterozygosity among subpopulations and individuals for measuring the total population differentiation [7].
For the calculation of the fixation indices, coancestry and kinship distance were used [8,9], using the following equations: where f is the mean coancestry for the metapopulation; F is the mean inbreeding coefficient for the metapopulation; and D is the average genetic distance between the subpopulations. Moreover, ancestral inbreeding coefficients proposed by the authors of [10,11,12] and determined with GRain 2.0 software [13] were used to determine whether these distinct measurements of inbreeding are able to describe the differences between the lines. To determine the cumulative proportion of the genome that was exposed to inbreeding effects, ancestral inbreeding (F_BAL) was calculated in the subpopulations. In addition, the inbreeding coefficient was also calculated by the method of [11], which divides inbreeding into two parts based on whether the identical alleles were already inbred in the past (F_KAL) or became inbred only in recent generations (F_KAL_NEW).
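For orientation, fixation indices in a coancestry framework are commonly written as below. This is a sketch under assumptions rather than a statement of the exact expressions used here: the tilde and bar notation is introduced for illustration, with \tilde{f} and \tilde{F} taken as the mean within-subpopulation coancestry and inbreeding coefficient, \bar{f} as the mean coancestry over the whole metapopulation, and \bar{D} = \tilde{f} - \bar{f} as the average kinship distance between subpopulations.

```latex
\[
F_{IS} = \frac{\tilde{F} - \tilde{f}}{1 - \tilde{f}}, \qquad
F_{ST} = \frac{\tilde{f} - \bar{f}}{1 - \bar{f}} = \frac{\bar{D}}{1 - \bar{f}}, \qquad
F_{IT} = \frac{\tilde{F} - \bar{f}}{1 - \bar{f}}
\]
\[
(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})
\]
```

Under these definitions the second line follows directly, since (1 - F_IS)(1 - F_ST) reduces to (1 - \tilde{F}) / (1 - \bar{f}).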

The Probability of Gene Origin
Trends in the probability of gene origin fe, fa50, and their ratios are presented in Table 1. Compared to a great number of founders, a large part of the genetic variability is maintained based on only eight ancestors (Table 1). Looking at the observed ratio of fa and fe it can be concluded that the Hungarian Border Collie population has suffered very strong gene loss.

Effective Population Size
Trends in the realized effective population size are presented in Figure 1. Due to the unequal contribution of the breeding animals to the next generation, the effective population size is always smaller than the actual population size. Unfortunately, the decreasing trend in effective population size may coincide with the loss of genetic variability and with the appearance of genetic diseases [14]. Similar tendencies were reported in several dog breeds [15].
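The realized effective population size based on the individual increase in inbreeding can be sketched as follows. This assumes the commonly used form ΔF_i = 1 - (1 - F_i)^(1/(t_i - 1)), where t_i is the number of complete generation equivalents of animal i, and Ne = 1 / (2 · mean ΔF). Whether this exact exponent matches the variant applied in the study is an assumption, and the function names and input values are hypothetical.

```python
def individual_increase(inbreeding, generations):
    """Individual increase in inbreeding: 1 - (1 - F)^(1 / (t - 1)).

    Assumption: the (t - 1) exponent form is used; t must exceed 1.
    """
    if generations <= 1.0:
        raise ValueError("complete generation equivalents must exceed 1")
    return 1.0 - (1.0 - inbreeding) ** (1.0 / (generations - 1.0))


def realized_effective_size(records):
    """Ne = 1 / (2 * mean individual increase) over (F, t) pairs."""
    deltas = [individual_increase(f, t) for f, t in records]
    mean_delta = sum(deltas) / len(deltas)
    return 1.0 / (2.0 * mean_delta)


# Hypothetical animals: (inbreeding coefficient, complete generation equivalents)
example = [(0.19, 3.0), (0.19, 3.0)]
ne = realized_effective_size(example)
```

Because the increase is computed per animal and scaled by its own pedigree depth, animals with pedigrees of different depth can be averaged together.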

Generation Interval
The generation interval was calculated as the average age of parents at the birth of their progeny kept for reproduction, and it was computed for all four parent-progeny pathways (Table 2) represented in the total reference population and per line. The length of the generation interval (T) can differ substantially across breeds. In the present study, the sire paths were longer because males were kept in breeding to older ages than females, and because breeders tend to prefer males with more show and sport results over males with few titles, even if the latter could increase the genetic variability within the breed. Collecting titles requires many years, so the preferred breeding males are usually older than the females. Intervals within the show and working lines were similar; however, the mean age of the father when the offspring was born was somewhat lower in the mixed line (sire-son path: 2.71 years, sire-daughter path: 4.33 years).
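The four-pathway calculation described above can be sketched in a few lines. The records, dates, and function name below are hypothetical, and ages are approximated as days divided by 365.25; only offspring kept for reproduction would be passed in.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records of offspring kept for reproduction:
# (offspring sex, offspring birth date, sire birth date, dam birth date)
RECORDS = [
    ("M", date(2014, 1, 1), date(2009, 1, 1), date(2011, 1, 1)),
    ("F", date(2015, 1, 1), date(2010, 1, 1), date(2011, 1, 1)),
]


def generation_intervals(records):
    """Mean parental age (years) at offspring birth for each of the four pathways."""
    ages = defaultdict(list)
    for sex, born, sire_born, dam_born in records:
        child = "son" if sex == "M" else "daughter"
        ages[f"sire-{child}"].append((born - sire_born).days / 365.25)
        ages[f"dam-{child}"].append((born - dam_born).days / 365.25)
    return {path: sum(values) / len(values) for path, values in ages.items()}


intervals = generation_intervals(RECORDS)
```

Running the same function on the records of one line at a time would give per-line intervals of the kind compared in the text.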

Inbreeding and Average Relatedness
The evolution of the inbreeding coefficient and the average relatedness of the reference population is shown in Table 3. As the population size of the Border Collie breed in Hungary is relatively large, the increase of the inbreeding level was relatively small (10% per 23 years). The decrease of the average inbreeding coefficients between 2011 and 2014 can be explained by the intense import of breeding animals. Because inbreeding coefficients always depend on the length and completeness of the pedigree, the inbreeding coefficients were plotted against the complete generation equivalents [16]. Ancestral inbreeding coefficients were added to determine whether alleles inbred in the past may have influenced the characteristics of these different phenotypes (Table 4). Ballou's formula showed that individuals in the working line had a lower probability of inheriting an allele that had undergone inbreeding at least once in the past than individuals in the show and the mixed lines. When the gene dropping method was used to separate the proportion of each dog's genome that was already identical by descent in an ancestor from the alleles that were identical by descent for the first time in that dog's lineage, F_KAL and F_KAL_NEW showed similar results. Calculations for ancestral inbreeding were previously used by the authors of [10,11,12].
The differences between the show and working lines regarding Ballou's formula are 20.4%; in addition the working line also differs from the mixed line by 20%. Furthermore, both Kalinowski's formulas show that the working line suffered less inbreeding in the past few generations.
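Ballou's ancestral inbreeding coefficient can be illustrated with its recursive form, F_a = [F_a(s) + (1 - F_a(s)) F_s + F_a(d) + (1 - F_a(d)) F_d] / 2, where s and d are the sire and dam. The sketch below applies this deterministic recursion to a hypothetical toy pedigree; it is not the gene dropping procedure of GRain, and the pedigree and function names are assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical toy pedigree, listed so that parents precede their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "H": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # offspring of a full-sib mating
    "G": ("E", "H"),
}
ORDER = {animal: index for index, animal in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Coancestry between two animals by the tabular recursion."""
    if a is None or b is None:
        return 0.0
    if a == b:
        sire, dam = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(sire, dam))
    if ORDER[a] < ORDER[b]:
        a, b = b, a  # recurse through the parents of the younger animal
    sire, dam = PEDIGREE[a]
    return 0.5 * (kinship(sire, b) + kinship(dam, b))


def inbreeding(animal):
    """Classical inbreeding coefficient: coancestry of the two parents."""
    sire, dam = PEDIGREE[animal]
    return kinship(sire, dam)


@lru_cache(maxsize=None)
def ancestral_inbreeding(animal):
    """Ballou's recursion; unknown parents contribute zero."""
    total = 0.0
    for parent in PEDIGREE[animal]:
        if parent is not None:
            fa_parent = ancestral_inbreeding(parent)
            total += fa_parent + (1.0 - fa_parent) * inbreeding(parent)
    return total / 2.0
```

In this toy case animal G is not inbred itself, yet half of its genome comes from the inbred parent E, so its ancestral inbreeding is above zero. This is the kind of contrast the ancestral coefficients are meant to capture.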
In the total population, only 2.77% of the matings were highly inbred (0.16% between full-sibs, 1.74% between half-sibs, and 0.87% between parent-offspring). Inbreeding coefficients differed among the studied lines; the average inbreeding coefficient was 4.9% in the reference population of the working line, while it reached 10.51% and 11.03% between 2010 and 2016 in the mixed and show lines, respectively.
The maximum and average number of complete generation equivalents were 25.04 and 4.47, respectively. A slow but continuous increase of the inbreeding coefficient based on the increasing complete generation equivalent was obvious (Figure 2). For the first 15 generations, the pedigree completeness was 99.6%, decreasing to 87.6% by the 40th generation.
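The number of complete generation equivalents is usually defined as the sum of (1/2)^n over all known ancestors of an animal, where n is the number of generations separating the animal from that ancestor. A minimal sketch of this definition follows; the toy pedigree and the function name are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy pedigree: animal -> (sire, dam); None marks an unknown parent.
PEDIGREE = {
    "X": (None, None),
    "Y": (None, None),
    "Z": ("X", "Y"),
    "W": ("Z", None),
    "V": ("Z", "Y"),
}


@lru_cache(maxsize=None)
def complete_generation_equivalents(animal):
    """Sum of (1/2)^n over all known ancestors, computed recursively.

    Each known parent adds 1/2 for itself plus half of its own value,
    which accounts for its ancestors being one generation further back.
    """
    total = 0.0
    for parent in PEDIGREE[animal]:
        if parent is not None:
            total += 0.5 * (1.0 + complete_generation_equivalents(parent))
    return total
```

An animal with both parents known and no grandparents known scores 1, so the measure rewards both the depth and the completeness of the pedigree.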

Fixation Indices in Subpopulations
In the calculation of the subdivision of the lines (Table 5), the within-variety fixation index (F_IS) was 0.36%, showing that mating within lines was not random; this is in contrast to sheep and horse breeds, where this value is negative [17], showing that individuals in farm animal species are less related. In the reference population (fa/fe ratio: 0.17), the overall F_ST was 2.6%, with decreasing heterozygosity at the subpopulation level; however, the genetic differences are still not significant, despite the diversity of the phenotype of the lines in the past 20 years. Analogous fixation indices were measured in a study of native Italian hunting dog breeds with microsatellite markers [3]. Figure 3 represents the effects of the subdivision on the reference population, highlighting that the studied lines have started to separate; nonetheless, this is not statistically proven.

Discussion
In many populations, all estimates related to the probability of gene origin decreased the most during the first years. Unfortunately, in this study, the analyzed period covered more than a century. Therefore, determining annual numbers was not possible. Similar findings were also reported in the French Beauceron and Braque Francais dog populations [18,19]. The presence of preferential breeding can be shown by calculating the ratio of fa to fe. Small values signal the so-called bottleneck effect [20]. If fe is larger than fa, the population suffers from gene loss and consequently a decrease in genetic variation [16]. Comparing this result to other populations, the medium value (0.75) of the fa/fe ratio of the Braque Francais dog population shows a more balanced use of animals for breeding and an absence of a bottleneck in that population [18]. The observed small fa/fe ratio of the Hungarian Border Collie population can be explained by its closed herd book, intense selection for appearance, and by the favoritism of some relevant individuals.
One of the reasons for the decreasing effective population size is that most of the puppies born in Hungary are sold abroad due to the lack of suitable owners for the breed. Although the dog-keeping culture is improving, there are still many backyard breeders who sell puppies, often at half of the registered breeders' price, resulting in a huge Border Collie mix population. In addition, conforming to the breed standards also decreases the effective population size.
For the generation intervals, the results are not surprising, as the reproductive life of sires is usually longer than that of dams. Similar results were found in the Nova Scotia Duck Tolling Retriever and the Lancashire Heeler dog breeds [21], and for several French dog breeds [18,19]. The shorter generation interval of the mixed line can be attributed to the fact that these dogs enter breeding with lower show and working performance.
In dog breeding, mating of close relatives is a common practice, where the objective is to create an outstanding individual [22]; however, after a few generations, the elevated inbreeding level increases juvenile mortality [23]. Moreover, this non-random mating can increase inbreeding depression (by bringing alleles to a homozygous state), affecting the future genetic health of the breed.
For pedigree completeness, the obtained values show that the available electronic herd books register all ancestors, and that the Hungarian Border Collie population has an exceptionally long and complete pedigree. This pedigree quality is only comparable to that of thoroughbred horses and Pannon White rabbit populations [24,25].

Conclusions
Since the final objectives for dogs in shows and sports require different anatomical structures, the importance of these lines is outstanding. However, the contrasts in dog selection may increase the genetic distance. In the long run, continuous selection for different purposes such as for show and work may disrupt the breed. The decreasing tendency of the effective population size points out a trend that dog owners prefer not to buy from registered breeders. In Hungary, the working line is at the greatest risk in terms of the number of breeding animals and the number of litters. However, this line represents the look and the original function of the breed. To maintain variability, the genetic contribution of some preferred males could be limited by mating schemes in order to help the breeders. Importation of breeding dogs could be a solution to this problem; on the other hand, breeding standards are slightly different between countries, and thus a collaboration is required between breeding organizations and scientists to improve the health of the next generation.