1. Introduction
Since domestication, horses have been selectively bred for a variety of pigmentation phenotypes [
1]. Coat color is an important economic trait, and in some cases, it is a breed-defining phenotype. Coat color is also relevant from a health perspective, as some equine pigmentation variants have been connected to genetic disorders. One of the first variants discovered in the horse was a recessive mutation in the melanocortin 1 receptor (
MC1R), which causes the chestnut phenotype, characterized by red-pigmented hair in the body and points (mane, tail, lower legs, and ear rims) [
2]. Since then, over 60 variants contributing to equine pigmentation have been identified [
3].
Coat color phenotypes can be divided into three categories: base coat color, pigmentation dilution, and white patterning. The base coat color for horses is typically characterized as one of three colors: black, bay, or chestnut. Variants in
MC1R (also called the Extension or Red Factor locus) and its antagonist, agouti signaling protein (
ASIP), are associated with the type and location (body and points) of melanin produced—eumelanin (black/brown pigment) or pheomelanin (red/yellow pigment). In the absence of other modifiers, horses that are homozygous for the recessive
ASIP allele (denoted as
a), an 11 bp deletion in exon 2 (ENSECAT00000004772.2:c.191_201del), have an all-black phenotype (black body and black points). Signaling through MC1R allows for pigment switching; the dominant
ASIP allele (
A) causes eumelanin to be restricted to the points, and when in combination with the dominant
MC1R (
E) allele, pheomelanin is produced in the body and results in a bay phenotype (red body with black points) [
4]. Individuals homozygous for the recessive loss-of-function
MC1R variant (denoted as
e, p.S83F) will only produce pheomelanin and will be a shade of chestnut (red body and red points), regardless of the
ASIP genotype [
5]. A second loss-of-function
MC1R mutation, also associated with chestnut base color, was discovered in 2000 and termed
ea (p.D84N) [
6]. To date, this allele has only been detected in Black Forest, Hungarian Coldblood, and Haflinger breeds [
6,
7].
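The interaction between MC1R and ASIP described above can be summarized as a small lookup. The following is a minimal sketch, not a validated genotype-calling tool: the function name and the "E/e"-style genotype strings are illustrative assumptions, and treating e/ea compound heterozygotes as chestnut is an assumption based on both alleles being described as loss-of-function.

```python
def base_coat_color(mc1r: str, asip: str) -> str:
    """Predict base coat color from MC1R (Extension) and ASIP (Agouti) genotypes.

    Genotypes are allele pairs such as "E/e" or "A/a" (hypothetical notation).
    Assumes no dilution or white patterning modifiers are present.
    """
    mc1r_alleles = mc1r.split("/")
    asip_alleles = asip.split("/")

    # Without a dominant E allele (e/e, ea/ea, or, by assumption, e/ea),
    # only pheomelanin is produced, regardless of the ASIP genotype.
    if "E" not in mc1r_alleles:
        return "chestnut"

    # With functional MC1R, the dominant ASIP allele restricts
    # eumelanin to the points, giving a red body with black points.
    if "A" in asip_alleles:
        return "bay"

    # a/a with functional MC1R: eumelanin in body and points.
    return "black"


for genotype in [("E/e", "A/a"), ("E/E", "a/a"), ("e/e", "A/A")]:
    print(genotype, "->", base_coat_color(*genotype))
```

The chestnut check comes first because the recessive MC1R genotype masks ASIP, which mirrors the "regardless of the ASIP genotype" statement in the text.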
Variants in six genes—major facilitator superfamily domain containing 12 (
MFSD12), myosin VA (
MYO5A), premelanosome protein (
PMEL), solute carrier family 36 member 1 (
SLC36A1), solute carrier family 45 member 2 (
SLC45A2), and T-box transcription factor 3 (
TBX3)—have been shown to contribute to a reduction in the amount of melanin produced, resulting in a diluted coat color [
8,
9,
10,
11,
12,
13,
14,
15]. Some of these variants only affect eumelanin or pheomelanin, while others are known to reduce the amount of both pigments. The variant in
MFSD12 (p.Asp201fs), causing the mushroom phenotype in Shetland Ponies, for example, reduces pheomelanin to create a dilute sepia coat [
14]. The silver phenotype, on the other hand, is caused by a missense mutation in the
PMEL gene (p.R625C) and is hypothesized to affect the deposition of eumelanin only, thus diluting the amount of pigment in black horses and diluting the black mane of bay horses [
8,
9]. The breed distribution for the aforementioned dilution alleles has not previously been reported.
White patterning variants are the most numerous among those identified to impact pigmentation in horses, with 49 mutations in seven genes—endothelin receptor type B (
EDNRB), KIT proto-oncogene, receptor tyrosine kinase (
KIT), melanocyte inducing transcription factor (
MITF), paired box 3 (
PAX3), ring finger and WD repeat domain 3 (
RFWD3), syntaxin 17 (
STX17), and transient receptor potential cation channel subfamily M member 1 (
TRPM1)—identified to date [
3,
16,
17,
18,
19,
20,
21,
22]. Of those, 35 were identified in or thought to regulate the
KIT gene (dominant white 1–16, 17a–b, 18–28, 30–33, sabino 1, and tobiano) [
3]. White patterns occur on any base color and/or dilution background. The phenotypic expression of white patterning variants is extremely variable, ranging from minimal white patches on the body to an all-white horse [
3,
20]. Genetic tests are available and routinely performed for 16 of these variants (lethal white overo, sabino 1, tobiano,
W4,
W5,
W10,
W20,
W22,
SW1–
SW6, gray, leopard complex spotting, and Appaloosa Pattern-1) to assist breeders in producing horses with desired white patterns while avoiding potential health concerns [
3]. Studies suggest that homozygosity for some white patterning variants is embryonic lethal, and some white patterning mutations are thought to be restricted to a specific breed, lineage, or a single individual, although more work is needed in both areas. For example, the
KIT dominant white 5 (
W5) variant (p.Thr732GlnfsX9) is thought to be restricted to one Thoroughbred family [
21] and proposed to be homozygous lethal. Similarly, the recently discovered de novo splashed white 6 (
SW6) variant in an American Paint Horse stallion, an 8.7 kb deletion in
MITF (NC_009159.3:g.21551060-21559770del) that causes extensive white markings, is also hypothesized to be homozygous embryonic lethal [
20]. Therefore, further investigation into potential homozygous lethal variants and a more comprehensive characterization of their breed distribution are needed.
Genetic testing for coat color variants is increasingly being utilized by horse breeders; thus, proper use of test results is important to ensure appropriate mate selection to produce the desired phenotypes. Furthermore, several studies have demonstrated the pleiotropic effects of pigmentation variants that lead to congenital disorders in horses, ranging from deafness [
23] to ocular issues [
24] to lethality [
25]. In some cases, only homozygous individuals are affected, as in the case of lethal white overo [
25], congenital stationary night blindness [
26], and potentially other embryonic lethal white variants mentioned above. In other cases, homozygotes are more severely affected than heterozygotes; for example, multiple congenital ocular anomalies (MCOA) are associated with the same mutation in the
PMEL gene that causes silver coat dilution [
8,
9]. Homozygotes have a more severe ocular disease that includes cataracts, iris stromal hypoplasia, abnormal pectinate ligaments, megaloglobus, and iridociliary cysts, whereas heterozygotes typically present only with iridociliary cysts [
8,
9]. Therefore, genetic testing can be a powerful selection tool for producing desirable traits while limiting pleiotropic anomalies in breeds in which such alleles are found. An understanding of breed-specific allelic and genotypic distribution can inform the proper utilization and interpretation of the results for this purpose. Here, we aim to further evaluate potential homozygous lethal variants and investigate the breed distribution and allele frequencies across 28 breeds of variants in 14 genes known to contribute to equine pigmentation phenotypes.
4. Discussion
The base coat color variants for chestnut (
e, in the
MC1R gene) and black (
a, located in
ASIP) were identified in all 28 breeds analyzed in this study (
Supplementary Table S3). The
a allele was found to be nearly fixed in the Percheron (a.f. = 0.99), a breed highly selected for black and gray coat phenotypes. The
MC1R allele
ea [
6], previously reported in chestnut horses in Black Forest, Hungarian Coldblood, and Haflinger breeds [
6,
7], was identified for the first time here in Paint Horse, Percheron, and Quarter Horse. Moreover, since several Knabstruppers with the
ea allele were detected in this study, we were able to report, for the first time, an estimated allele frequency for
ea in this breed (a.f. = 0.035). Knabstruppers are selectively bred for leopard complex spotting patterns, and a particular base color is often desired. Therefore, genotyping that allows for the accurate detection of horses with the
ea allele is important for marker-assisted selection. No homozygous
ea/ea individuals were identified in this study, although they have been reported in the Black Forest breed [
6].
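For readers less familiar with how an allele frequency (a.f.) is derived from genotype counts, the calculation can be sketched as follows. This is a generic illustration and not the pipeline used in this study: the function name and the example counts are hypothetical, and the relatedness filtering applied here is not modeled.

```python
def allele_frequency(n_hom: int, n_het: int, n_total: int) -> float:
    """Frequency of one allele from genotype counts at a single locus.

    n_hom:   horses carrying two copies of the allele
    n_het:   horses carrying one copy
    n_total: all genotyped horses (each contributes two alleles)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_hom < 0 or n_het < 0 or n_hom + n_het > n_total:
        raise ValueError("inconsistent genotype counts")
    return (2 * n_hom + n_het) / (2 * n_total)


# Hypothetical counts, not taken from the study:
# 200 horses, 1 homozygote, 10 heterozygotes -> 12 copies out of 400 alleles.
print(round(allele_frequency(1, 10, 200), 3))  # 0.03
```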
Dilution alleles, on the other hand, were found to be restricted in their breed distribution and show relatively low allele frequencies, with the exception of
Cr. Nonetheless, some interesting observations can be made. Concerning the
Ch allele, it was found at a low frequency in eight breeds, but interestingly, the breed in which the variant was first discovered, Tennessee Walking Horse [
13], is among those breeds with the lowest estimated frequency (a.f. = 0.0080). Conversely, the breed with the highest frequency, Rocky Mountain Horse (a.f. = 0.026), has not previously been reported to have this variant. No known adverse health effects have been reported to be associated with champagne dilution. Additionally, given the high allele frequency of the silver variant in Rocky Mountain Horses (a.f. = 0.32) and its association with MCOA, the use of marker-assisted selection for silver heterozygosity and champagne may yield breed-desirable coat color dilutions with fewer potential ocular issues. Horses homozygous for the
PMEL silver mutation (
Z/Z) are reported to have a more severe form of disease that can impair vision or cause blindness, whereas heterozygotes (
Z/N) are often reported to have a cyst-only phenotype. The
Z allele was identified in 15 other breeds, and in addition to the Rocky Mountain Horse, it was previously reported in Miniature Horse, Missouri Fox Trotter, Icelandic Horse, Shetland Pony, and Morgan Horse. Here, we also identified the
Z allele in Mustang, Tennessee Walking Horse, Gypsy Vanner, Welsh Pony, Dutch Warmblood, Gypsy Cob, Pony of the Americas, Lusitano, Quarter Horse, and Paint Horse for the first time (
Supplementary Table S3). Given the known pleiotropic effects of this variant, breeds in which it occurs are advised to utilize genetic testing for mate selection, as well as a tool to identify horses who should be examined by a veterinary ophthalmologist for MCOA.
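To illustrate how genotype information could inform mate selection for silver, the sketch below enumerates expected offspring genotypes for a single locus under simple Mendelian segregation. It is an illustration only: the function name and genotype strings are hypothetical, each parental allele is assumed to be transmitted with equal probability, and clinical penetrance of MCOA is not modeled.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(sire: str, dam: str) -> dict:
    """Mendelian genotype probabilities for one locus, e.g. sire="Z/N", dam="N/N"."""
    counts = Counter()
    for a, b in product(sire.split("/"), dam.split("/")):
        # Sort alleles so that "Z/N" and "N/Z" count as the same genotype.
        counts["/".join(sorted((a, b), reverse=True))] += 1
    # Each of the four allele combinations is equally likely.
    return {genotype: Fraction(n, 4) for genotype, n in counts.items()}


# Carrier x carrier: one quarter of foals are expected to be Z/Z.
print(offspring_genotypes("Z/N", "Z/N"))
# Carrier x non-carrier: no Z/Z foals are expected.
print(offspring_genotypes("Z/N", "N/N"))
```

Under these assumptions, pairing a Z/N horse with an N/N mate still yields silver heterozygotes in half of the foals while avoiding the Z/Z genotype.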
In the case of
Prl, while identified at a relatively low frequency in seven breeds, the highest frequencies were found in Iberian horses. Similar allele frequencies were estimated for the Andalusian and closely related Pura Raza Española breed (a.f. = 0.12 and 0.11, respectively), as well as the Lusitano (a.f. = 0.064) (
Supplementary Table S3). To our knowledge, this is the first time that this variant has been reported in Lusitano and Pura Raza Española. Given that the highest frequency of
Prl is in Iberian breeds, it is possible that it is undergoing positive selection because of the recent use of genetic testing and/or changes in studbook rules in the Pura Raza Española starting in 2002, which allowed the registration of additional color phenotypes other than black, bay, and gray. The
Prl allele is recessive and is known to dilute only the base color when homozygous or when combined with the cream (
Cr) allele. Iberian breeds were also among those with the highest allele frequencies for
Cr, with the highest being in Lusitano (a.f. = 0.42), which again indicates selection for coat color dilution in these breeds. Given the relatively high frequency of
Cr and
Prl in these breeds, genetic testing for these variants can easily assist in the consistent production of these desirable phenotypes.
The
Mu allele was originally reported in Shetland Ponies with an allele frequency of 0.12 (n = 177), and in that study, it was also identified in Miniature Horses at a low frequency (a.f. = 0.020; n = 129) [
14]. In this study, the
Mu allele was only identified in Shetland Pony, as genotyping data were only available for two Miniature Horses. Here, the
Mu allele frequency in Shetland Pony after filtering for relatedness was estimated at 0.23 (n = 173). This was nearly double that previously reported by us [
14]; however, this is unlikely to reflect a true increase in population frequency, despite mushroom being a favorable trait in the breed. Because the variant was reported only in 2019, the
Mu allele frequency estimated herein more likely represents a sampling bias for horses owned and tested by breeders who breed for this trait. Testing a larger randomized cohort is needed to investigate this further.
Concerning the dun dilution, which is characterized by lightening of body hair and the presence of primitive markings (including a darker dorsal stripe, leg and/or shoulder stripes, and dark marks known as cobwebbing on the forehead), the Norwegian Fjord Horse was found to be fixed for the wild-type
D allele (a.f. = 1.0, n = 121). To the best of our knowledge, this is the first documented molecular investigation of the frequency of this variant in the breed. However, it has been previously suggested that this allele was fixed, or nearly so, given the breed-defining dun phenotype characteristic of Norwegian Fjords [
5]. The
nd1 allele, which leads to the expression of primitive markings without dilution of the coat, was estimated to be at high frequency in Iberian horse breeds as well as in the Arabian: Pura Raza Española (a.f. = 0.79), Andalusian (a.f. = 0.78), Lusitano (a.f. = 0.57), and Arabian (a.f. = 0.68). These findings likely reflect the common ancestry of Iberian horse breeds, which were developed using Arabians brought to the Iberian Peninsula during the Muslim invasions of the 8th century. Given the frequency of
nd1 in Iberian and Arabian breeds, investigating whether this variant contributes to the darker shade or countershading is worthy of further exploration and may help to better understand why the
nd1 allele appears to be under positive selection.
Dominant white phenotypes and their associated alleles were originally named as such because they were believed to be lethal when homozygous. Consistent with this, no homozygotes for
W5,
W10, or
W22 were identified in this study. However, due to their low allele frequencies and limited breed distribution (
Supplementary Table S4), the probability of identifying homozygotes in the population is extremely low, regardless of lethality. Consistent with previous work [
22,
36], homozygotes were identified for the
W20 allele in the 21 breeds included in this study. These breeds represent a variety of coat color selection preferences, ranging from horses with breed-defining phenotypes that include white patterning, such as the Paint Horse (a.f. = 0.21) and the Appaloosa (a.f. = 0.18), to breeds that prohibit excessive white markings, such as the Hanoverian (a.f. = 0.24). The role of
W20 in pigmentation across breeds is not well understood. Previous research supports that
W20 (p.Arg682His) leads only to a minor reduction in
KIT function [
36]. However, it has been documented that
W20, in combination with other dominant white alleles, causes higher amounts of white patterning [
22,
36]. Based on our findings, compound heterozygotes for
W10/W20 (one Quarter Horse) and
W20/W22 (three Quarter Horses) are rare (
Table 2). It is important to note here that the
W22 variant occurs on the sequence background of
W20 [
22]. A detailed investigation of the phenotypic effects of
W20 in combination with other white-patterning alleles has not yet been performed. Here, we identified 93 horses from nine breeds (Appaloosa, Gypsy Cob, Gypsy Vanner, Miniature Horse, Missouri Fox Trotter, Paint Horse, Shetland Pony, Tennessee Walking Horse, and Welsh Pony) with both the
TO allele and
W20 (
Table 4). Thirty-three horses across nine breeds (Gypsy Cob, Gypsy Vanner, Miniature Horse, Missouri Fox Trotter, Mustang, Paint Horse, Quarter Horse, Shetland Pony, and Tennessee Walking Horse) had
SB1 and
W20 (
Table 4,
Supplementary Table S3). Photographic records were not available for most of the horses in this study but given the occurrence of these combinations of
KIT mutations across several breeds, a formal investigation of their potential additive effects on white pattering is warranted. A recent study of American Paint Horses showed that, in the absence of any other known white patterning variants,
W20 was associated with white spotting phenotypes defined by the American Paint Horse Association (APHA) [
37]. However, the possibility that undiscovered variants contribute to the phenotypes, to our knowledge, has not yet been explored. The widespread across-breed distribution of
W20, combined with high estimated allele frequencies in breeds not typically selected for white patterning in this study, provides further justification for evaluating the functional role of this allele in pigmentation across breeds. This knowledge will be essential in order to develop breed-specific recommendations for utilizing
W20 genotypes to maximize breeding potential for desired coat color phenotypes.
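The point made above, that homozygotes for rare dominant white alleles are unlikely to be sampled regardless of lethality, can be illustrated with a back-of-the-envelope calculation. The sketch below assumes Hardy-Weinberg proportions and unrelated horses, neither of which is claimed in this study and neither of which is likely to hold strictly in selectively bred populations; the function names and example values are hypothetical.

```python
def expected_homozygotes(allele_freq: float, n_horses: int) -> float:
    """Expected number of homozygotes under Hardy-Weinberg proportions (q^2 * n)."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    return allele_freq ** 2 * n_horses


def prob_no_homozygotes(allele_freq: float, n_horses: int) -> float:
    """Probability of observing zero homozygotes among n unrelated horses."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    return (1.0 - allele_freq ** 2) ** n_horses


# Hypothetical values, not taken from the study: a.f. = 0.005 in 1000 horses.
q, n = 0.005, 1000
print(f"expected homozygotes: {expected_homozygotes(q, n):.3f}")
print(f"P(no homozygotes):    {prob_no_homozygotes(q, n):.3f}")
```

With an allele this rare, far fewer than one homozygote is expected even if the genotype is fully viable, so the absence of homozygotes by itself does not demonstrate lethality.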
Some of the splashed white variants were predicted to be homozygous lethal, and in this study, we only identified homozygotes for
SW1 (n = 49) and
SW2 (n = 9) (
Table 6). Previous studies have shown that
SW1 homozygotes are viable and have an all-white or nearly all-white phenotype [
35]. Two homozygous
SW2/SW2 horses have been reported in the literature, with only one having a documented phenotype [
37]. Having confirmed genotypes of nine
SW2 homozygotes across two breeds (Paint Horse and Quarter Horse), the most reported to date, and having evaluated photographic records of six of these horses, we confirm that this genotype is not lethal. All six
SW2/SW2 horses had an all-white phenotype, but five of the six also had other white spotting variants (
Figure 2). Thus, further evaluation of
SW2 homozygotes without other known white pattern alleles will substantiate the hypothesis that this genotype alone produces an all-white phenotype. One owner reported that their
SW2/SW2 horse was deaf, but this remains to be clinically evaluated in this individual, as well as in other
SW2 homozygotes.
Another rare non-lethal white patterning variant is
SB1. The sabino 1 phenotype is described as extensive face and leg markings, along with white in the belly and roaning in the flanks [
32]. The causal
KIT variant was first reported in the Tennessee Walking Horse, American Miniature Horse, Paint Horse, Azteca, Missouri Fox Trotter, Shetland Pony, and Spanish Mustang [
32]. A subsequent study investigated this variant in 899 horses across eight breeds and identified
SB1 in three additional breeds (Haflinger, Noriker, and Lipizzan), but no homozygotes were observed [
38]. Here, we report the occurrence of the
SB1 allele in Gypsy Cob, Gypsy Vanner, Pony of the Americas, and Quarter Horse for the first time. Furthermore, we identified the largest number of
SB1 homozygotes to date, with six
SB1/SB1 individuals across three breeds (Paint Horse, Shetland Pony, and Tennessee Walking Horse). Photographic records were available for four of these horses, and consistent with a previous report [
37], they have an all-white phenotype (
Figure 1). Given that
SB1 and
SW2 homozygotes are viable but rare, genetic testing can aid in the identification of these rare mates so that 100% of offspring have a white patterning allele.
Concerning leopard complex spotting, we report the first molecular detection and allele frequency of
LP and
PATN1 in breeds outside of those used to discover the mutations [
26,
34] (
Table 5). Furthermore, in those breeds that are selected for
LP, this is the first time that the frequency of the modifying locus
PATN1 has been concurrently evaluated. Consistent with photographic records and breeding schemes, the allele frequency for
PATN1 is higher in the Knabstrupper than in Appaloosa, Pony of the Americas, or Miniature Horse. Knowing the frequency in these breeds can help guide selection strategies away from homozygosity for
LP, which causes congenital stationary night blindness (CSNB) and an increased risk of insidious equine recurrent uveitis (ERU) [
26,
39,
40]. Horses homozygous for
LP are night blind due to premature polyadenylation of
TRPM1, which in turn is predicted to lead to no functional TRPM1 protein in the bipolar cells of the retina. In the case of ERU, horses homozygous for
LP are at a higher risk of this disease, but the biological mechanism for this increased risk remains unknown [
39,
40]. Genotyping for LP in breeds where the allele is present can help breeders select for
LP heterozygotes, which, when in combination with the
PATN1 variant, display the desirable leopard pattern without being afflicted with night-blindness [
26]. Interestingly,
PATN1 was identified in three breeds where no
LP horses were detected (Connemara Pony, Missouri Fox Trotter, and Rocky Mountain Horse); moreover,
PATN1 allele frequency was estimated to be higher than that of
LP in three breeds: Gypsy Vanner, Shetland Pony, and Welsh Pony (
Table 5). It is currently unknown whether
PATN1 acts as a modifier for other white patterning loci; therefore, investigating white pattern levels in these breeds where
PATN1 but not
LP was identified, specifically in horses with other white patterning alleles, is warranted.
In conclusion, these data represent the largest study to date investigating pigmentation variants in horses across a large number of breeds. We reported estimated allele frequencies for base coat color, dilution, and white patterning genes in 28 breeds. While we restricted our dataset to breeds with 30 or more samples, a limitation of the study is that these samples were originally submitted for coat color testing. Therefore, it is possible that in those breeds with fewer individuals evaluated, estimates may be biased upwards, as noted above for the mushroom variant. We identified the presence of several variants in breeds not previously reported. Additionally, we identified and reported nine SW2/SW2 homozygotes, the most to date, and confirmed that this genotype is not lethal. Here, we also report the largest number of SB1 homozygotes to date (n = 6), with photographic records for most to corroborate its effect on phenotype. Understanding breed distribution and allele frequencies can help guide recommendations on and utilization of genetic testing for marker-assisted selection. Furthermore, these findings will help guide future hypothesis-driven studies toward a better functional understanding of these pigmentation variants in and across breeds.