The Complexity of the Ovine and Caprine Keratin-Associated Protein Genes

Sheep (Ovis aries) and goats (Capra hircus) have, for more than a millennium, been a source of fibres for human use, be it in clothing and furnishings, for insulation, for decorative and ceremonial purposes, or for combinations thereof. While use of these natural fibres has in some respects been superseded by the use of synthetic and plant-based fibres, increased accounting for the carbon and water footprint of these fibres is creating a re-emergence of interest in fibres derived from sheep and goats. The keratin-associated proteins (KAPs) are structural components of wool and hair fibres, where they form a matrix that cross-links with the keratin intermediate filaments (KIFs), the other main structural component of the fibres. Since the first report of a complete KAP protein sequence in the late 1960s, considerable effort has been made to identify the KAP proteins and their genes in mammals, and to ascertain how these genes and proteins control fibre growth and characteristics. This effort is ongoing, with more and more being understood about the structure and function of the genes. This review consolidates that knowledge and suggests future directions for research to further our understanding.


Introduction
The keratin-associated proteins (KAPs) are structural components of the fibres that form the pelage of mammals. They are part of a matrix, which cross-links with the keratin (K) protein-containing keratin intermediate filaments (KIFs), which are the other main structural components of the fibre. In the wool follicles of sheep, the KAPs are produced soon after the synthesis of keratins during the development of fibres in the follicle, and they are thought to cross-link with the KIFs by forming disulphide bonds with cysteine residues in the head and tail domains of the keratins [1]. The nature of this cross-linking is not well understood. Co-immunoprecipitation studies have demonstrated an interaction between the head domain of human keratin K86 and KAP2-1 [2], and Western blot studies have demonstrated an interaction between the head domain of K85 and KAP8-1 [2], but it is not known which K and KAP cysteine thiol groups form disulphide bridges (be they inter- or intra-chain). The bulk of the most readily accessible cysteines in the KAPs are, however, reported to be found close to either the N- or C-terminal domains of these proteins [3]. The KAPs are thought to play a critical role in regulating the physico-mechanical properties of the fibre.
With sheep and goats, the hair and wool fibres are produced by follicles that are located in the skin, but the value of these fibres varies considerably depending on their
* The gene appears to have been identified, but is not well characterised.
Orthologues for all of the human KRTAPs can be found in sheep and goats, with the exception of KRTAP25-1. A recent search for sequences comparable to human KRTAP25-1 in the sheep genome assembly did not reveal any comparable sequences [47], but in the chromosome region, where the human KRTAP25-1 orthologue was expected to be found, there was an apparently unique KRTAP sequence, which has been named KRTAP28-1 [47].
In contrast, several KRTAPs that have not been described in humans are found in
Figure 1. Comparison of the predicted amino acid sequences of two goat "KRTAP6-2" sequences with sheep and human KRTAP6 sequences. The goat "KRTAP6-2" sequences are indicated with the GenBank accession numbers AY316158 and EU145019. The sheep sequences are indicated with the prefix "s", while the human sequences are marked with the prefix "h". The GenBank accession numbers of these sequences are NM_001193399 (sKAP6-1), KT725832 (sKAP6-2), KT725837 (sKAP6-3), KT725840 (sKAP6-4), KT725845 (sKAP6-5), NM_181602 (hKAP6-1), NM_181604 (hKAP6-2), and NM_181605 (hKAP6-3). The numbers on the right of the sequences represent the length of the proteins. Amino acid positions with high levels of homology are coloured, with black indicating 100% homology, red indicating greater than or equal to 75%, and blue indicating greater than or equal to 50%.
Within families, the KRTAPs can exhibit a high degree of nucleotide sequence similarity, particularly in their coding regions, and some KRTAPs even have identical coding sequences. For example, KRTAP6-2 variants B, C, and D have coding sequences that are identical to KRTAP6-5 variants B and D (Figure 2). These genes and their variants can only be differentiated by variation in their 3′ and 5′ flanking sequences.
The high sequence similarity in coding regions can make it difficult (and sometimes impossible) to assign KAP protein sequences or partial gene sequences into families. Equally, it is also difficult to determine whether different KAP protein sequences represent different family members, or just variant sequences of the same family member. This hampers the application of proteomic approaches to KAP research. It also highlights the critical importance of having extended and comprehensive gene sequences for the KRTAPs, along with an idea of their location on chromosomes, if one is to accurately identify and classify both the KAP and the KRTAP sequences.

Variation in the KRTAPs
Nucleotide sequence variation has been explored for many of the ovine and caprine KRTAPs. To date, all of the ovine and caprine KRTAPs that have been investigated are polymorphic, but the extent and nature of the polymorphism varies between the genes. Some KRTAPs exhibit a low level of polymorphism, such as ovine and caprine KRTAP7-1 [34,53] and ovine KRTAP20-2 [41]. Each of these genes has only two sequence variants. On the other hand, some KRTAPs possess a high level of polymorphism, such as ovine KRTAP1-2 [20,21], ovine KRTAP1-4 [23], and caprine KRTAP13-3 [48,59], for which nine or more sequence variants have been identified. The majority of KRTAPs exhibit a moderate level of polymorphism, with the number of sequence variants ranging from three to eight.
The polymorphism described in the KRTAPs includes single nucleotide polymorphisms (SNPs), and insertions and deletions (indels). With the exception of KRTAP6-1, for which all of the SNPs are found either upstream or downstream of the coding region [31,67], the SNPs in the KRTAPs mostly occur in the coding region. The SNP density, and the proportion of non-synonymous SNPs, varies considerably between the KRTAPs. Some, such as ovine KRTAP1-3, KRTAP1-4, and KRTAP20-1, have a density of over 20 SNPs per kb, while others, such as ovine KRTAP7-1, KRTAP8-2, and KRTAP20-2, have a density of less than five SNPs per kb (Figure 3). Overall, the SNP density in the majority of KRTAPs is higher than the average density of 4.9 SNPs per kb that has been suggested to occur across the sheep genome [68], although that estimate is now quite dated. There does not appear to be any obvious pattern with respect to the chromosomal location of the polymorphism, and this suggests that the generation and accumulation of the SNPs in any given KRTAP may be, at least in part, independent of other KRTAPs.
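The density comparison is simple arithmetic: the number of SNPs divided by the length of the sequence surveyed, scaled to 1 kb. A minimal sketch is given here. The function name and the example counts are hypothetical and are not taken from any of the cited studies; only the genome-wide figure of 4.9 SNPs per kb comes from the text.

```python
GENOME_WIDE_DENSITY = 4.9  # SNPs per kb, the sheep genome estimate quoted in the text [68]


def snp_density_per_kb(snp_count: int, region_length_bp: int) -> float:
    """Return the number of SNPs per 1000 bp of sequence surveyed."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    return snp_count / region_length_bp * 1000.0


# Hypothetical example (not real data): 12 SNPs found across a 500-bp region.
density = snp_density_per_kb(12, 500)
exceeds_genome_average = density > GENOME_WIDE_DENSITY
```

Because the KRTAPs are short, a handful of SNPs in a single gene translates into a high per-kb figure, so densities of this kind are best compared across regions of similar length.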
The ratio of non-synonymous SNPs to synonymous SNPs shows no obvious overall pattern, but the KRTAPs that are located close together on the chromosomes appear to have a similar ratio. For example, ovine KRTAP36-1, KRTAP15-1, and KRTAP13-3 are located close together on ovine chromosome 1, and they all have a high proportion of non-synonymous SNPs. The same is true for ovine KRTAP28-1 and KRTAP24-1 (Figure 3). A low proportion of non-synonymous SNPs is observed for ovine KRTAP6-2, KRTAP6-4, KRTAP6-1, and KRTAP22-1, which are clustered in proximity to each other on ovine chromosome 1, and also for ovine KRTAP1-2 and KRTAP1-3 on chromosome 11. Further investigation of variation in other KRTAPs as they are found and characterized may provide more information about this effect. It is also notable that all of the non-synonymous SNPs revealed to date in the ovine KRTAPs are missense, with the exception of a single nonsense SNP in ovine KRTAP20-2 [41].
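The distinction between synonymous, missense, and nonsense SNPs can be made explicit with a short sketch. It assumes an in-frame, intron-less coding sequence (which suits the KRTAPs) and the standard genetic code; the function name and the 12-bp toy sequence are hypothetical and do not represent any real KRTAP.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... and "*" marking stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(coding_seq: str, position: int, alt_base: str) -> str:
    """Classify a single-base change at a 0-based position of a coding sequence."""
    offset = position % 3
    codon_start = position - offset
    ref_codon = coding_seq[codon_start:codon_start + 3].upper()
    if len(ref_codon) != 3:
        raise ValueError("position does not fall within a complete codon")
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


# Toy sequence (hypothetical): ATG TGC TAC GGC encodes Met-Cys-Tyr-Gly.
toy = "ATGTGCTACGGC"
examples = [
    classify_snp(toy, 5, "T"),  # TGC -> TGT, Cys -> Cys
    classify_snp(toy, 3, "C"),  # TGC -> CGC, Cys -> Arg
    classify_snp(toy, 8, "A"),  # TAC -> TAA, Tyr -> stop
]
```

Counting the "missense" and "nonsense" calls against the "synonymous" calls for each gene gives the ratio discussed above.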
Besides SNPs, the KRTAPs also contain indels. In sheep, indels have been described for numerous KRTAPs, including KRTAP1-1, KRTAP5-4, KRTAP6-1, KRTAP6-5, KRTAP20-1, and KRTAP28-1 [19,28,29,31,40,47], and in goats they have been described for KRTAP9-2 and KRTAP28-1 [56,66]. For ovine KRTAP1-1, KRTAP5-4, and KRTAP6-5, and for caprine KRTAP9-2, the indels occur within tandem repeat regions of the coding sequence, and they lead to variation in the number of tandem repeats that are present. In ovine and caprine KRTAP28-1, the indels are located within dinucleotide repeats (microsatellites), while the indels in ovine KRTAP6-1, KRTAP6-3, and KRTAP20-1 are not located in the tandem repeat region, but instead occur in regions that are flanked by sequence repeats (Figure 4). With the exception of the dinucleotide repeats found in ovine and caprine KRTAP28-1, all of the indels identified in the KRTAPs are multiples of three nucleotides in size, and hence they preserve the reading frame.
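The reading-frame argument reduces to a divisibility check, sketched below. The function name is hypothetical. The 57-bp length is the size of the ovine KRTAP6-1 coding-region indel reported in this review [29], and the 2-bp length stands for a single dinucleotide repeat unit.

```python
def indel_preserves_frame(indel_length_bp: int) -> bool:
    """Return True if an indel of this length keeps the downstream codons in frame."""
    if indel_length_bp <= 0:
        raise ValueError("indel_length_bp must be positive")
    # Codons are read three nucleotides at a time, so only lengths that are
    # multiples of three leave the downstream reading frame unchanged.
    return indel_length_bp % 3 == 0


in_frame_57 = indel_preserves_frame(57)  # 57 = 19 codons
in_frame_2 = indel_preserves_frame(2)    # one dinucleotide repeat unit
```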

Mechanisms for the Generation of KRTAP Variation
In the KRTAPs, transition SNPs predominate, and they account for over 70% of all SNPs. Among these transition SNPs, the G/C to A/T transition (52.4% of occurrences) happens at nearly three times the frequency of the A/T to G/C transition (18.1%). This effect is still pronounced when the ratio of A/T (approximately 46%) to G/C (approximately 54%) base pairs (1:1.19) in ovine KRTAPs is taken into account. Such a transitional bias should create a pressure towards ovine KRTAPs having a higher AT content, and so some other pressure must operate in the opposite direction to maintain the GC content.
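The claim that the bias persists after allowing for base composition can be checked with a short calculation. The sketch below uses only the percentages quoted in the text. Dividing each transition share by the fraction of sites at which it can arise is an assumption made here for illustration, and is not necessarily the normalisation used in the studies cited.

```python
# Figures quoted in the text.
GC_TO_AT_SHARE = 52.4  # % of SNPs that are G/C to A/T transitions
AT_TO_GC_SHARE = 18.1  # % of SNPs that are A/T to G/C transitions
GC_FRACTION = 0.54     # approximate G/C content of ovine KRTAPs
AT_FRACTION = 0.46     # approximate A/T content of ovine KRTAPs


def raw_bias() -> float:
    """Ratio of G/C-to-A/T transitions to A/T-to-G/C transitions."""
    return GC_TO_AT_SHARE / AT_TO_GC_SHARE


def composition_adjusted_bias() -> float:
    """The same ratio after dividing each share by the fraction of available sites."""
    per_gc_site = GC_TO_AT_SHARE / GC_FRACTION
    per_at_site = AT_TO_GC_SHARE / AT_FRACTION
    return per_gc_site / per_at_site


raw = raw_bias()                        # about 2.9
adjusted = composition_adjusted_bias()  # about 2.5
```

Under this assumption the bias falls from roughly 2.9-fold to roughly 2.5-fold, so the excess of G/C to A/T transitions is not explained by the GC-richness of the genes alone.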
For over twenty years, there has been a strong belief that biased gene conversion (BGC) is important for shaping the GC content in the genomes of mammals and other eukaryotes [69]. The BGC theory is based on DNA repair processes inside heteroduplexes, the double-stranded DNA segments that form during meiosis at crossover and non-crossover recombination sites. The theory has it that one DNA strand of a heteroduplex has a maternal origin, while the complementary strand is paternal. The heterozygous sites that occur in heteroduplexes create mismatches, and these mismatches are non-randomly resolved in favour of G/C over A/T nucleotides, which leads to an increase in the GC content of the sequence [69,70]. The polymorphic nature of the KRTAPs could potentially increase the chance of forming heteroduplexes, and hence elevate BGC. The effect of BGC, if strong enough, can overcome purifying selection and lead to an increased ratio of non-synonymous SNPs to synonymous SNPs [70], as is found for some of the KRTAPs.
Indels are in abundance in eukaryotic genomes, and in humans they are the second most abundant form of genetic variation, after SNPs [71]. Mechanisms have been proposed to explain the generation of indels, including replication slippage (also known as slipped-strand mispairing) [72], unequal crossing-over (also known as non-reciprocal recombination) [73], and transposition (also known as translocation) [74]. It is thought that replication slippage is the principal mechanism responsible for the majority of small indels in the human genome [75].
Replication slippage is a mutation process that occurs during DNA replication and also during the DNA synthesis step of DNA repair processes. It requires DNA polymerase pausing to occur within a short direct repeat. The paused polymerase dissociates from the DNA, and then the terminal portion of the newly synthesized strand separates from the template and anneals to another direct repeat, after which replication resumes [76]. A slippage event normally occurs when a sequence of repetitive nucleotides (e.g., tandem repeats) is found at the site of replication, and strand misalignment at repeated sequences leads to genetic rearrangements, resulting in the insertion or deletion of nucleotides [77].
Slipped-strand mispairing can also occur with non-continuous repeat sequences, and results in longer insertions or deletions of intervening sequences flanked by the repeats [78].
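The outcome of a slippage event within a tandem repeat can be illustrated with a toy model in which the repeat tract gains or loses whole repeat units. This sketch models only the end result, not the polymerase pausing and re-annealing steps. The function name and the 15-bp sequence are hypothetical.

```python
def slip(sequence: str, unit: str, tract_start: int, copies: int, change: int) -> str:
    """Return the sequence after a tandem repeat tract gains or loses repeat units.

    `change` is the number of repeat units gained (positive) or lost (negative).
    """
    tract = unit * copies
    tract_end = tract_start + len(tract)
    if sequence[tract_start:tract_end] != tract:
        raise ValueError("no such repeat tract at the given position")
    new_copies = copies + change
    if new_copies < 0:
        raise ValueError("cannot remove more repeat units than are present")
    return sequence[:tract_start] + unit * new_copies + sequence[tract_end:]


# Hypothetical sequence carrying three copies of a TGC repeat after the ATG.
original = "ATG" + "TGC" * 3 + "TAA"
expanded = slip(original, "TGC", 3, 3, +1)    # insertion of one repeat unit
contracted = slip(original, "TGC", 3, 3, -1)  # deletion of one repeat unit
```

When the repeat unit is a multiple of three nucleotides long, as in this toy case, every gain or loss of a unit leaves the reading frame intact.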
Many KRTAPs are characterized by having an abundance of nucleotide repeats [4,16,19,28,31]. The KRTAPs are also GC-rich, with an average GC content of over 54% in all of the ovine KRTAPs identified to date. It has been reported that a high GC content results in reduced DNA polymerase processivity and increased DNA polymerase slippage, and consequently it can lead to elevated rates of mutation, including the creation of indels and nucleotide substitutions [79].
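GC content is the proportion of G and C bases among the unambiguous bases of a sequence. A minimal sketch follows; the function name and the test sequences are hypothetical, and ambiguous bases such as N are simply ignored, which is an assumption made here.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of A/C/G/T bases in the sequence that are G or C."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return gc / len(counted)


balanced = gc_content("ATGC")    # two of four bases are G or C
gc_rich = gc_content("GGCCAT")   # four of six bases are G or C
```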
There is an association between the occurrence of indels and nucleotide substitutions, and this association appears to be universal to all prokaryotic and eukaryotic genomes examined so far [80][81][82]. McDonald et al. [82] propose that it is not the indels per se, but instead the sequence in which the indels occur that causes the accumulation of nucleotide substitutions. Repeat sequences can promote an increased probability of replication fork arrest and are prone to causing the stalling of high-fidelity DNA polymerase. This can lead to the recruitment of error-prone (low-fidelity) DNA polymerases to replicate the surrounding sequence with a higher-than-average error rate [82].
Sequence analyses of two of the most variable KRTAPs in sheep (KRTAP1-2 and KRTAP1-4) reveal that short segments of DNA exchange may have occurred and contributed to the generation of the different nucleotide sequences (Figure 5). This suggests gene conversion or non-reciprocal genetic exchange is one of the mechanisms for creating sequence diversity in some KRTAPs. Unique sequence motifs have been postulated to promote genetic recombination, including the crossover hotspot instigator (Chi) sequence (5′-GCTGGTGG-3′) or its complementary sequence, which are abundant in the genomes of bacteriophage and Escherichia coli [83]. Chi and Chi-like sequences, or their complementary sequences, have been reported in KRTAPs [19,61,64,65]. It is proposed that Chi-like sequences, with minor nucleotide variations to the consensus Escherichia coli Chi sequence, may have partial 'hotspot' activity [84]. The presence of Chi and Chi-like sequences, or their complementary sequences, in KRTAPs suggests recombination may play a role in creating sequence diversity.
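A search for Chi and Chi-like sequences can be sketched as a sliding-window comparison against the consensus 5′-GCTGGTGG-3′ and its reverse complement. The function names and test sequences are hypothetical. Treating "Chi-like" as "within a set number of mismatches of the consensus" is an assumption made here for illustration, and is not a definition taken from the cited studies.

```python
CHI = "GCTGGTGG"  # consensus Escherichia coli Chi sequence, 5' to 3'
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_chi_sites(sequence: str, max_mismatches: int = 0) -> list:
    """Return sorted (position, strand, matched_text) tuples for Chi or Chi-like sites.

    Positions are 0-based. Strand "-" means the site matches the reverse complement.
    """
    seq = sequence.upper()
    hits = []
    for strand, motif in (("+", CHI), ("-", reverse_complement(CHI))):
        for i in range(len(seq) - len(motif) + 1):
            window = seq[i:i + len(motif)]
            mismatches = sum(a != b for a, b in zip(window, motif))
            if mismatches <= max_mismatches:
                hits.append((i, strand, window))
    return sorted(hits)


forward_hits = find_chi_sites("AAGCTGGTGGTT")
reverse_hits = find_chi_sites("TTCCACCAGCAA")
chi_like_hits = find_chi_sites("AAGCTGGTGCTT", max_mismatches=1)
```

Raising `max_mismatches` quickly increases the number of chance matches in a short GC-rich sequence, so any Chi-like hits found this way are only candidates for further examination.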

The Chromosomal Clustering of KRTAPs and Evolution
The KRTAPs are clustered and located in several chromosomal regions. In humans, where a full complement of KRTAPs has possibly been identified, there are five clusters of genes located on three chromosomes [10]. Based on the number of KAP genes in each
In sheep and goats, all of the KRTAPs identified to date are from cluster 1 and cluster 2, except ovine KRTAP5-4, a cluster 4 gene on ovine chromosome 21. In these two species, the cluster 1 KRTAPs are located on chromosome 1, but the cluster 2 KRTAPs are located on chromosome 11 in sheep, and on chromosome 19 in goats (Figure 6).
Within clusters, the KRTAPs are unevenly distributed on the chromosome [10,16]. Although the KRTAPs in each cluster are transcribed in different directions, the genes that are near to each other tend to have the same direction of transcription (Figure 6 and [5]). This may suggest that some genes are under a shared form of transcriptional control.
The identification of numerous KRTAPs from cluster 1 and cluster 2 in sheep and goats enables a comparison of these two ruminant KRTAP clusters and the matching human cluster. From this, it can be observed that the overall configuration of KRTAPs is similar across these species, but that sheep and goats are more similar to each other (as might be expected), and have some differences to humans. In cluster 1, a major difference is observed in the region between KRTAP21-2 and KRTAP15-1. In humans, this region is estimated to be 306 kb in size, and it contains four KAP gene families (two KRTAP20s, three KRTAP6s, one KRTAP22, and seven KRTAP19s; [15]). Genes of the KAP19 family have not been identified in sheep and goats, but for the other KRTAPs that have been identified, their locations are quite different to those described in humans ( Figure 6). This region is estimated to be 560 kb in length in sheep and contains two additional KRTAP6s and one new KAP gene named KRTAP36-1 [6,31].
An obvious difference in cluster 2 is located between KRTAP4-3 and KRTAP1-3 ( Figure 6). This region is approximately 120 kb in length in humans, but it is estimated to be 90 kb in size in sheep. At a finer level in humans, KRTAP2-1 is approximately 5 kb away from human KRTAP1-1, which corresponds to ovine KRTAP1-3, but the distance between KRTAP2-1 and KRTAP1-3 is approximately 28 kb in sheep. This region contains numerous human KRTAP4s and KRTAP2s [85], but only one KRTAP4 and one KRTAP2 have been identified in this region in sheep. The identification of more ovine and caprine KRTAPs in this region would assist in providing a better understanding of the similarities and differences between the regions in the different species.
The clustering of genes that produce proteins that are involved in key metabolic pathways has been accepted for many eukaryotes, but the evolutionary causes or benefits of clustering remain controversial [86]. One hypothesis put forward to explain the clustering of genes involved in metabolic pathways is that they have 'arrived' in genomes as a group, following horizontal gene transfers from bacteria [87].
Given the intron-less character of the KRTAPs, the possibility of individual genes or groups of KRTAPs having prokaryotic origins cannot be excluded. Research in humans suggests that the cluster-3 KRTAPs are located within introns of the thrombospondin-type laminin G domain and EAR repeat-containing protein gene (TSPEAR) on chromosome 21 [88]. This rather strange location could suggest that this KRTAP cluster has been inserted into this position, and then subsequently, gene duplication and divergence may have enlarged the cluster.
Gene duplication can arise via several mechanisms, with the major mechanisms including unequal crossing-over, retroposition and chromosomal duplication events (large-scale duplications) [89]. Large-scale duplications, as a consequence of polyploidy, are reported to occur frequently in plants, but are much less frequent in animals [89]. However, duplications of large genomic segments (segmental duplications) are abundant in animals such as primates [90] and rodents [91]. Unequal crossing-over, along with gene conversion, is believed to be the main driver for the generation of gene duplications, but the possibility of retroposition should not be ignored.
Retroposition is an RNA-mediated process that occurs when a messenger RNA (mRNA) is retrotranscribed to complementary DNA (cDNA), and the resulting cDNA is inserted back into the genome. Retrogenes are therefore expected to lack introns and regulatory sequences (which sits well with the nature of the KRTAPs), but to contain poly(A) tracts and flanking short direct repeats [89]. Further bioinformatics analyses of the KRTAPs and their flanking sequences may shed more light on the evolution of the KRTAP clusters.
A preliminary sequence analysis of the sheep chromosome 1 region that contains KRTAPs reveals five long intergenic non-coding RNA (lincRNA) genes within the cluster region (spanning approximately 0.9 Mb). However, no lincRNA genes are found in the approximately 2.9 Mb upstream region, and only two lincRNA genes are found in the approximately 3.2 Mb downstream region (Figure 7). The exact functions of lincRNAs are not well known, but it is proposed that they broadly serve to fine-tune the expression of neighbouring genes with tissue specificity, and with a diversity of mechanisms [92]. Analysis of human lincRNAs reveals one notable feature: a high prevalence of transposable elements (TEs) [93], which are repetitive mobile genetic sequences that are capable of duplicating genes or gene fragments [94]. Whether these lincRNAs play a role in the evolution of KRTAPs awaits further investigation.

The Effect of KRTAP Variation
The proteins encoded by KRTAPs serve as a matrix embedding the KIFs. They form one of the main components of the wool/hair fibre, and thus it is thought that variation in KRTAPs may affect fibre properties, possibly in three ways.
Firstly, non-synonymous SNPs and insertions/deletions in the coding region will alter the protein sequence. This may affect the structure and/or properties of the protein, and consequently its interaction with KIFs and/or other KAP proteins, which may then affect fibre properties. As an example, a nonsense mutation in ovine KRTAP20-2 has been shown to be associated with variation in the curvature of wool fibres [41], and a 57-bp insertion/deletion in the coding region of ovine KRTAP6-1 is associated with variation in the fibre diameter traits [29]. The mean fibre diameter (MFD) of wool is a key determinant of value, with finer wools of a mean diameter of 19 microns or less being considerably more valuable than strong wools of a mean diameter over 36 microns.
Secondly, synonymous SNPs and SNPs upstream and downstream of the coding region may affect gene expression and consequently alter the amount of that protein in the wool/hair fibres. Although synonymous SNPs do not cause amino acid changes in the protein, research has shown that they can affect the stability and structure of mRNA, and also the folding of the protein [95]. In felting lustre mutant wool follicles, Li et al. [96] reported that the expression of KRTAP2-12 and KRTAP4-2 was up-regulated, whereas the expression of KRTAP6-1, KRTAP7, and KRTAP8 was down-regulated. In wool from sheep on a restricted diet, KAP13-1 and KAP6-n protein levels were increased, and this was found to be associated with a decrease in the fibre diameter [97].
Lastly, variation in KRTAPs may also affect the post-translational modification of the protein. A recent proteomic study revealed a differential abundance of some phosphorylated KAPs and keratins when comparing the crimped and straight wool of Tan sheep [96], and provided evidence that phosphorylation of KAPs and keratins can occur. Bioinformatics analyses of ovine KRTAP11-1 and KRTAP13-3 reveal some non-synonymous SNPs that would alter putative phosphorylation sites in the proteins derived from these genes [37,38]. The phosphorylation of the KAPs may alter their structural conformation and interactions with KIFs and/or other KAPs, and consequently affect the properties of the wool fibres [98].

Concluding Remarks and Future Research Directions
To date, 30 KRTAPs from 18 different families have been identified in sheep and 18 KRTAPs from 12 families have been reported in goats. Most of these genes are present in humans, but some are absent. This suggests that sheep and goats may possess more KRTAPs than humans. The ovine and caprine KRTAPs are unevenly clustered on chromosomes and transcribed in different directions. The configuration of the KRTAPs in the sheep and goat genomes is similar to the configuration reported in humans, but differences occur too. All of the sheep and goat KRTAPs are polymorphic, but the extent and nature of the polymorphism vary between the genes.
Our current understanding of KRTAPs is based primarily on their chromosomal location and sequence, but little is known about how the genes have evolved and the mechanisms that underlie the generation of variation in the genes. Further investigation into their sequences, especially of their flanking regions, may shed more light on their evolutionary origin and how natural selection may have created or enhanced their diversity. It also must not be forgotten that for many hundreds of years humans have been selecting and breeding sheep and goats for fibre, meat, and milk traits; hence, there may be evolutionary dead-ends or rare sequences that will be hard, if not impossible, to place in that evolutionary history.
Ongoing investigations are also needed in other important areas. First is the ongoing need to identify and characterize new KRTAPs that are likely present in the sheep and goat genomes. This should, in time, lead to the definition of a full catalogue of KRTAPs in these species, and the complete annotation of the genes in the reference genomes. This will doubtlessly be enhanced by the use of high-throughput and rapid genome-sequencing techniques, whereby hundreds or thousands of sheep can be rapidly sequenced, subjected to bioinformatics analysis, and publicly recorded and indexed.
Second is the pressing need to better understand the temporal and spatial nature of KRTAP expression. Questions about how, when, and why the genes are expressed will need to be addressed, especially if the fibres from sheep and goats are to be improved and better fitted to purpose. Given the potential number of KRTAPs that exist, the number of variants for each KRTAP, and the diploid nature of the sheep and goat genomes, this will be an immense challenge. If all of the KRTAPs are expressed, then there will, by definition, be a large diversity of KAP proteins, and more importantly different permutations of those proteins and the KIFs in the matrix of the fibres. When fully understood, this may enable greater fibre uniformity to be selected for when breeding sheep and goats, and enable us to address one of the bigger constraints on fibre use: its natural variability. This will not be a small task, because while quantitative analytical techniques can be used to investigate whether variation in KRTAPs affects gene expression or post-translational modifications of individual KRTAPs/KAPs, it is still difficult to unravel the effect of individual KRTAPs because of the potentially large numbers of KAP and keratin proteins in fibres. Given the adage, backed by evidence, that 'there is more variation within any given fleece than between fleeces in any given flock', the size of this task should not be underestimated. Current studies describing associations between variation in individual KRTAPs and variation in fibre traits are a start, but much more research and new multiplex analytical approaches will be needed.