Maternal Relationships among Ancient and Modern Southern African Sheep: Newly Discovered Mitochondrial Haplogroups

Simple Summary The genetic diversity of southern African sheep remains under-studied. We present here the complete mitochondrial genomes of archaeological southern African sheep, as well as the genomes from three indigenous southern African breeds—Damara, Namaqua Afrikaner, and Ronderib Afrikaner. We show that southern African sheep exhibit limited genetic diversity which is consistent with our understanding of their migration south from northernmost Africa. Intriguingly, many of the modern sheep show close relationships with the archaeological sheep, implying an ancestor-descendant relationship. Similarly, the sheep that do not exhibit a close relationship with the archaeological sheep nonetheless cluster closely with each other and do not show a close relationship with European and Asian sheep. This suggests that they too are descendants of indigenous sheep and not the product of historic introductions of exotic breeds. Abstract We investigated the genetic diversity and historic relationships among southern African sheep as well as the relationships between them and sheep outside the continent by sourcing both archaeological and modern sheep samples. Archaeological sheep samples derived from the site Die Kelders 1, near Cape Town, date to approximately 1500 years ago. The modern samples were taken as ear snips from Damara, Namaqua Afrikaner, and Ronderib Afrikaner sheep on a farm in Prieska in the Northern Cape. Illumina sequencing libraries were constructed for both ancient and modern specimens. Ancient specimens were enriched for the mitochondrial genome using an in-solution hybridization protocol and modern specimens were subjected to shotgun sequencing. Sequences were mapped to the Ovis aries reference genome, assigned to haplogroups and subhaplogroups, and used to calculate a phylogenetic tree using previously published, geographically dispersed mitochondrial genome sheep sequences. Genetic diversity statistics show that southern African sheep have lower diversity than sheep in other regions. Phylogenetic analysis reveals that many modern southern African sheep are likely descended from prehistoric indigenous sheep populations and not from sheep imported from Europe during the historic period.


Introduction
More than 400 million sheep (Ovis aries) representing approximately 170 breeds live in Africa [1], but they are not indigenous to the continent. They were likely domesticated from the European mouflon (Ovis aries musimon) and the Asiatic mouflon (Ovis orientalis) in southwestern Asia about 10,000 years ago [2][3][4][5][6]. Northern Africa is home to the Barbary sheep (Ammotragus lervia, and not technically a sheep but a caprine), however, despite archaeological evidence that they were likely confined and even fed sedating fodder [7][8][9][10], there is no evidence that they were ever domesticated, nor evidence that they interbred have coarse hair, rather than wool, and a significant fatty deposit at the base of the tail and are regarded as drought tolerant and hardy [6,50,51,[54][55][56][57]. Mitochondrial diversity in domesticated sheep clusters in to five haplogroups designated A-E [58][59][60][61][62][63][64][65][66][67][68]. Haplogroups A and B exhibit almost worldwide distribution, with haplogroup A being found at particularly high frequencies on the Indian subcontinent and haplogroup B at high frequency in Europe [69]. Haplogroup C has a much more restricted distribution, being found primarily in southwestern Asia, northern China, and Mongolia. Members of haplogroups D and E are relatively rare and have so far been found only in southwestern Asia [69]. Within Africa, haplogroup B is the most common lineage reported in Algeria [70], Egypt [71,72], Ethiopia [73,74], Kenya [75], Morocco [76], and South Africa [32,77]. All southern African sheep so far reported are members of haplogroup B, but short fragment recovery has prevented their assignment to subhaplogroups.

Modern Samples
The sheep analyzed in this study were members of a flock farmed by Dawie du Toit at his Damara Stud Farm in Prieska, Northern Cape, South Africa (see Figure 2). De Toit took ear snips from 31 Damara, 15 Namaqua Afrikaner, and 9 Ronderib Afrikaner sheep. Ear snips were stored in 90% ethanol at room temperature while in the field, and then at −4 °C once they arrived at the laboratory. DNA was extracted using the DNeasy Blood & Tissue kit (QIAGEN, Hilden, Germany) following the manufacturer's protocol, with a small modification. The kit specifies up to 8 h of incubation at 56 °C with ATL buffer and proteinase K. We found that 8 h of incubation was insufficient and extended incubation times to 12-18 h. Following incubation, hair and cartilage were centrifuged for collection and the lysate was transferred to a new tube before proceeding with the protocol. Mitochondrial diversity in domesticated sheep clusters in to five haplogroups designated A-E [58][59][60][61][62][63][64][65][66][67][68]. Haplogroups A and B exhibit almost worldwide distribution, with haplogroup A being found at particularly high frequencies on the Indian subcontinent and haplogroup B at high frequency in Europe [69]. Haplogroup C has a much more restricted distribution, being found primarily in southwestern Asia, northern China, and Mongolia. Members of haplogroups D and E are relatively rare and have so far been found only in southwestern Asia [69]. Within Africa, haplogroup B is the most common lineage reported in Algeria [70], Egypt [71,72], Ethiopia [73,74], Kenya [75], Morocco [76], and South Africa [32,77]. All southern African sheep so far reported are members of haplogroup B, but short fragment recovery has prevented their assignment to subhaplogroups.

Modern Samples
The sheep analyzed in this study were members of a flock farmed by Dawie du Toit at his Damara Stud Farm in Prieska, Northern Cape, South Africa (see Figure 2). De Toit took ear snips from 31 Damara, 15 Namaqua Afrikaner, and 9 Ronderib Afrikaner sheep. Ear snips were stored in 90% ethanol at room temperature while in the field, and then at −4 • C once they arrived at the laboratory. DNA was extracted using the DNeasy Blood & Tissue kit (QIAGEN, Hilden, Germany) following the manufacturer's protocol, with a small modification. The kit specifies up to 8 h of incubation at 56 • C with ATL buffer and proteinase K. We found that 8 h of incubation was insufficient and extended incubation times to 12-18 h. Following incubation, hair and cartilage were centrifuged for collection and the lysate was transferred to a new tube before proceeding with the protocol. Total DNA concentration was measured by spectrophotometry using a NanoDrop 2000 (Thermo Fisher Scientific, Waltham, MA, USA). Approximately 500 ng of DNA was used to construct barcoded Illumina sequencing libraries using the NEBNext Ultra II DNA Library Prep Kit for Illumina and the NEBNext Multiplex Oligos for Illumina (New England BioLabs, Ltd., Ipswich, MA, USA) following the manufacturer's protocol. Barcoded libraries were quantified using the NEBNext Library Quant Kit for Illumina on the Bio-Rad CFX96 Touch Real-Time PCR Detection System. Libraries were pooled in equimolar ratios and sequenced on one PE150 S2 NovaSeq6000 Flowcell (Maryland Genomics, Baltimore, MD, USA).

Archaeological Samples
In an attempt to extract additional information from already analyzed archaeological sheep specimens without undertaking further destructive analyses, we re-extracted DNA from twelve freezer-stored bone digests from earlier work (see Table 1 for specimen details, [77]). All specimens were excavated from the deeply stratified cave site, Die Kelders 1, located approximately 120 km southeast of Cape Town (see Figure 1). The sheep analyzed here were excavated from Layer 2 of the Later Stone Age deposits [78][79][80][81][82][83] which date to AD 630-857 (SHCal13 Curve, [84,85]). Specimens were selected to ensure that the same individual was not sampled twice. Most of the specimens are right half-mandibles and in the few cases in which left half-mandibles were selected, their morphology was examined by an expert zooarchaeologist (R. Klein, Stanford University) to ensure that they could not have derived from the same individual as the right half-mandibles. DNA extraction and the preparation of Illumina sequencing libraries were performed in the ancient DNA lab of the Southern Methodist University Molecular Anthropology Laboratories. DNA was extracted following Allentoft et al. [86], with only minor modifications of

Archaeological Samples
In an attempt to extract additional information from already analyzed archaeological sheep specimens without undertaking further destructive analyses, we re-extracted DNA from twelve freezer-stored bone digests from earlier work (see Table 1 for specimen details, [77]). All specimens were excavated from the deeply stratified cave site, Die Kelders 1, located approximately 120 km southeast of Cape Town (see Figure 1). The sheep analyzed here were excavated from Layer 2 of the Later Stone Age deposits [78][79][80][81][82][83] which date to AD 630-857 (SHCal13 Curve, [84,85]). Specimens were selected to ensure that the same individual was not sampled twice. Most of the specimens are right half-mandibles and in the few cases in which left half-mandibles were selected, their morphology was examined by an expert zooarchaeologist (R. Klein, Stanford University) to ensure that they could not have derived from the same individual as the right half-mandibles. DNA extraction and the preparation of Illumina sequencing libraries were performed in the ancient DNA lab of the Southern Methodist University Molecular Anthropology Laboratories. DNA was extracted following Allentoft et al. [86], with only minor modifications of the protocol. Samples were not pre-digested with 5 mL of 0.5 M EDTA for 15 min to remove surface contamination because samples had been previously extracted. We were unable to achieve a 5 M solution of sodium acetate and substituted a 3 M solution. Extracted DNA was used to construct double-barcoded Illumina sequencing libraries using the SRSLY PicoPlus Kit (Claret Bioscience, Santa Cruz, CA, USA). The molecules in each library were quantified using the NEBNext Library Quant Kit for Illumina on the Bio-Rad CFX96 Touch Real-Time PCR Detection System. These amplified libraries were visualized under UV with ethidium bromide on 2% agarose gels. Libraries were visually assessed for the presence of a spread of DNA fragments. Libraries with evidence of only adapters and primer dimers were not processed further.
MyBaits probes (MYcroarray, Ann Arbor, MI, USA) for sheep were used to enrich the sequencing libraries for the mitochondrial genome following the manufacturer's protocol to low quantity and quality targets. The libraries were quantified using the NEBNext Library Quant Kit for Illumina on the Bio-Rad CFX96 Touch Real-Time PCR Detection System, pooled in equimolar ratios, and sequenced on a PE150 S2 NovaSeq6000 Flowcell (Maryland Genomics).

Data Analysis
For the modern samples, paired end sequence reads in a FASTQ format were aligned to a complete Ovis aries mitochondrial genome (Genbank accession: AF010406.1) using the maximal exact matches (MEM) command of the Burrow's Wheeler Aligner (BWA) (Version 0.7.17) [87]. PCR duplicates were removed from the resulting BAM files using Picard's MarkDuplicates (V.2.26) [88]. For the ancient specimens, the raw FASTQ files were first processed using AdapterRemoval2 (Version 2.3.2) [89]. This tool removed short reads (<25 bp), removed stretches of Ns and bases that had low-quality scores (<30), and finally merged the paired-end sequence reads which had an overlap of at least 11 nucleotides. These collapsed reads were then aligned to the complete Ovis aries mitochondrial genome (Genbank accession: AF010406.1) using the BWA aln command [87] using settings recommended for ancient DNA (specifically, seeding was disabled (−l 1014), the number of gap opens was set to 2 (−o 2) and the maximum edit distance was set to 0.03 (−n 0.03)). PCR duplicates were removed from the resulting BAM files using DeDup, a tool that has been specifically designed for ancient DNA reads. MapDamage (Version 2.0.2) was used to assess damage signatures associated with ancient DNA, specifically the transitions from C to T and G to A at the ends of the sequencing reads.
After these initial alignment steps, the ancient and modern samples were treated identically. Read groups were added using Picard's AddOrReplaceReadGroups and Samtools was then used to exclude reads not mapping to the Ovis aries reference genome. Variant call-Biology 2022, 11, 428 6 of 12 ing was performed using the GATK (Version 4.2.3) HaplotypeCaller. An in-house script was used to mask regions of coverage regions (minimum coverage 5×) before the GATK FastaAl-ternateReferenceMaker was used to generate FASTA sequences for downstream analyses. Additional Ovis aries complete mitochondrial genome were downloaded from Genbank (see Supplementary Materials Table S1) and aligned with the sequences from samples that had 100% DNA coverage using Mafft (Version 7). SeqMagick was used to generate the PHYLIP format files for uploading to PhyML 3.0 [90] (http://www.atgc-montpellier.fr/phyml/, accessed on 12 December 2021) to generate Maximum Likelihood phylogenetic trees. Smart Model Selection using the Akaike Information Criterion (AIC) was used [91] and the R package PopGenome was used to calculate genetic diversity statistics.

Results
Complete, or almost complete, mitochondrial genomes were recovered from six of the archaeological specimens. The archaeological sheep are all members of subhaplogroup B1. Previous work on the same assemblage [77] also found the Die Kelders sheep to be members of haplogroup B, but with only 506 bp of sequence, that study was unable to determine subhaplogroup membership. Supplementary Materials Table S1 provides a complete list of specimens, percentage of the genome covered, the depth of coverage, and haplogroup assignments; Supplementary Materials Figure S1 shows the damage patterns indicative of ancient DNA. The supplementary materials of Lv et al. [69] list the mutations that define membership in each haplogroup and subhaplogroup. The modern sheep are dominated by subhaplogroup B1 (n = 42) but also include A1b (n = 13), previously unreported in southern African sheep. Supplementary Materials Figure S2 shows a phylogenetic tree including all the specimens analyzed here and previously geographically diverse representatives of haplogroups A and B; Supplementary Materials Figure S3 shows the network. The GenBank accession numbers, subhaplogroup assignments, geographical origin, and citations for all included specimens can be found on sheet 4 of Supplementary Materials Table S1. Figure 3 shows the tree with collapsed clades displaying the phylogenetic relationships.
As can be seen on the phylogenetic tree, the southern African sheep of the A1b lineage include members of all the sampled breeds-Damara, Namaqua Afrikaner, and Ronderib Afrikaner. European sheep that are members of haplogroup A cluster separately from the African haplogroup A sheep, suggesting that these A lineage sheep in southern Africa are not the direct and recent descendants of historically imported European sheep. Subhaplogroup A1b sheep are found in east Asia, but as can be seen on the phylogeny, the South African A1b sheep form a cluster separate from those of east Asia. It is therefore likely that these southern African sheep are not the recent descendants of imports from east Asia. None of the ancient specimens, however, are members of subhaplogroup A1b. Interestingly, these sheep possess fat-tails and are members of either haplogroup A or B, but outside of Africa, haplogroup C has been found to possess the highest frequencies of sheep with the fat-tail phenotype [69].
All the archaeological specimens and the majority of the modern ones are members of subhaplogroup B1. Interestingly, many of the modern B1 individuals are very closely related to the ancient B1 sheep, likely signaling that those modern B1 sheep share an ancestor-descendant relationship with the ancient sheep from Die Kelders 1.
Supplementary Materials Table S1, sheet 3 shows nucleotide diversity and haplotype diversity statistics for the sheep in this study and the comparative cattle and sheep data discussed here. Consistent with the patterns found in cattle in southern Africa [92,93], southern African sheep show less genetic diversity than sheep further north on the continent. There is approximately an order of magnitude lower within-haplogroup nucleotide diversity in the southern African sheep than has been reported in ancient and modern Turkish sheep [4]. Presumably, serial founder effects as populations moved south from their northerly origins resulted in a reduction of genetic diversity. While it is very unlikely that the same archaeological sheep were sampled twice, it is possible that the Die Kelders 1 sheep are very recently descended from the same maternal lineage.  Table S1).  Table S1).

Discussion
The genetic diversity of southern African sheep studied so far is markedly low compared with other sheep populations. Sheep took approximately 7000 years to move from northern to southern Africa, and in doing so, encountered a wide variety of environments from Mediterranean North Africa, across the equatorial regions, to subtropical southern Africa. The low level of genetic diversity in southern African sheep is consistent with models of serial founder effects as sheep populations moved across these diverse ecosystems.
Historical records document the importation of sheep to southernmost Africa from England and Persia (now Iran) during the late 17th and early 18th centuries [48]. Curiously, the sheep we have sequenced here show no evidence of these introductions of exotic sheep. The modern B lineage sheep are quite closely related to each other, not widely distributed through the B lineage clade, and many are especially closely related to the archaeological B lineage sheep. The deposits from which the archaeological sheep were excavated date to the middle of the first millennium AD, many hundreds of years before the first European explorers arrived in southern Africa. It seems likely that the modern B lineage sheep are descendants of some of the original southern African sheep.
The A lineage sheep among the modern specimens are phylogenetically distinct from the A lineage sheep so far reported from Europe and eastern Asia. This pattern suggests that these sheep are descendants of early southern African sheep as well, and not those introduced during the historic period. No A lineage sheep have been found among sequenced archaeological sheep from southern Africa, but given the small sample sizes studied so far, the pattern is more likely a consequence of small sample sizes than a genuine absence of A lineage sheep in prehistoric southern Africa. To date, sequenced archaeological sheep include only 20 individuals from Die Kelders 1 in the western Cape [77] and 1 from Blydefontein in the Karoo [32,41]. With such small samples, it is very likely that we have not yet captured an accurate picture of precontact diversity.

Conclusions
Little is known about the mitochondrial diversity of southern African sheep. Herein, we have made some progress towards addressing this gap in knowledge by sequencing the complete mitochondrial genomes of sheep from three indigenous southern African breeds, as well as a small sample of archaeological sheep. Close phylogenetic clustering of the archaeological sheep with some of the modern sheep suggests that many of southern Africa's modern sheep are the direct descendants of the region's prehistoric sheep. Further, those modern sheep that do not show particularly close phylogenetic relationships with the archaeological sheep cluster closely together, to the exclusion of European and Asian sheep. This pattern suggests that a larger sampling program of both archaeological and modern individuals will reveal additional diversity in southern African sheep across both time and space.