Quaternary DNA: A Multidisciplinary Research Field

The purpose of this Milankovitch review is to explain the significance of Quaternary DNA studies and the importance of the recent methodological advances that have enabled the study of late Quaternary remains in more detail, and the testing of new assumptions in evolutionary biology and phylogeography to reconstruct the past. The topic is wide, and this review is not intended to be an exhaustive account of all the aDNA work performed in the last three decades on late-Quaternary remains. Instead, it is a selection of relevant studies aimed at illustrating how aDNA has been used to reconstruct not only environments of the past, but also the history of many species including our own.


Origin and Significance of Quaternary DNA Research
Ancient DNA (aDNA) research is the study of degraded DNA that originates from prehistoric remains. Because the rate of DNA degradation varies widely and depends strongly on local environmental conditions, the timeframe covered by aDNA studies is very broad. It includes sub-fossil remains from the late Quaternary (around a hundred thousand years old), museum and archeological specimens only a few thousand years old, and herbarium and museum specimens only a few decades old.
Ancient DNA research began in the 1980s following the development of the polymerase chain reaction (PCR) technique, a method now widely used in molecular biology to make many copies of a specific DNA segment, and initial reports suggested that the time period open to investigation was vast (millions of years). Even then, however, there was much controversy surrounding the authenticity of DNA results from million-year-old remains, and several of the initial studies were later shown to be non-authentic. The first studies reporting successful extraction and amplification of aDNA came from the quagga, an extinct member of the horse family, and from an ancient Egyptian mummy [1,2]. Subsequent studies examined maize seeds a few thousand years old found in a Huari tomb in Peru [3] and much older plant material found in fossil deposits in northern Idaho, North America [4][5][6]. These deposits, known as the Clarkia fossil beds, consist of an exposed sequence of lake sediments dating from the early Miocene (circa 15 million years ago), where the lake's cold water and rapid sedimentation created ideal conditions for fossil formation (Figure 1). The first Clarkia report came in 1990 and described DNA sequences retrieved from the leaves of a Magnolia species dating back to the Miocene, apparently in an excellent state of preservation, with intact chloroplasts still visible [4,5]. Two years later, an independent attempt was made on samples of a Miocene Taxodium (bald cypress) from the same fossil beds, and the results seemed to support the earlier Magnolia findings [6]. However, both of these pioneering studies were performed without adequate contamination controls, and later attempts to obtain chloroplast DNA sequences from similar remains produced conflicting results. Today, multiple lines of evidence strongly suggest that contamination, probably PCR related, was involved in these early studies [7].
Nevertheless, in the 1990s, the Magnolia study seemed to break the million-year barrier and researchers turned their attention to more charismatic creatures such as dinosaurs [8], amber-preserved specimens [9,10] (Figure 2), mammoths [11] (Figure 3), and humans.

DNA Preservation
Today, there is no longer controversy surrounding the authenticity of these initial million-year-old DNA reports, and we know that they were artifacts caused by contamination from modern DNA. This is because after death, DNA is heavily affected by hydrolytic and oxidative damage [12]. After a few years, depending on preservation conditions, it can be highly fragmented and damaged and present only in small amounts in fossil remains. As a consequence, the retrieval and analysis of DNA sequences older than a few hundred thousand years is very difficult to achieve at present.
Research suggests that after circa one or two million years, DNA degrades to a point where it is no longer usable for analysis [13], even when preserved under optimal conditions such as frozen, dry, and anoxic environments, which are not very common in nature. Even in relatively young fossils, DNA can disappear very quickly if preserved in warm and wet environments such as the tropics. More recently, a model has been proposed for post-mortem DNA decay in bones in which fragmentation rapidly reaches a threshold and then slows down [14]. In contrast, nucleotide damage (cytosine deamination) follows a conventional thermal age model, and this type of damage increases with age [14]. At the moment, the oldest DNA samples ever recovered are from 700,000-year-old horse bones extracted from frozen soils in northern Canada [15] and from insects and plants in Greenland ice cores that are 450,000 to 800,000 years old [16].
At present, researchers cannot estimate exact rates of DNA decay because temperature, oxygenation, and other environmental factors like precipitation and soil quality vary widely between substrates, making it difficult to establish a common baseline rate of degradation.
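The sensitivity of DNA survival to the decay rate can be illustrated with a deliberately simple calculation. The sketch below assumes that strand breaks occur independently at a constant rate per site per year, so that the fraction of templates of a given length that remain unbroken declines exponentially with age. This is a textbook first-order simplification, not the bone decay model of [14], and the two rate values are hypothetical numbers chosen only to contrast a cold and a warm setting.

```python
import math


def surviving_fraction(fragment_length_bp, rate_per_site_per_year, age_years):
    """Fraction of templates of this length that are still unbroken.

    Assumes independent random strand breaks at a constant per-site rate
    (a first-order simplification; real decay rates vary with environment).
    """
    return math.exp(-rate_per_site_per_year * fragment_length_bp * age_years)


def half_life_years(fragment_length_bp, rate_per_site_per_year):
    """Age at which half of the templates of this length are broken."""
    return math.log(2) / (rate_per_site_per_year * fragment_length_bp)


# Hypothetical per-site break rates, for illustration only.
COLD_RATE = 1e-7  # assumed value for a frozen, dry deposit
WARM_RATE = 1e-5  # assumed value for a warm, wet deposit

for label, rate in (("cold", COLD_RATE), ("warm", WARM_RATE)):
    fraction = surviving_fraction(100, rate, 10_000)
    print(f"{label}: 100 bp half-life {half_life_years(100, rate):,.0f} years, "
          f"fraction intact after 10,000 years {fraction:.3g}")
```

Because the rate appears in the exponent, a hundredfold difference in rate between deposits changes the expected yield by many orders of magnitude, which is one reason a single universal decay rate is so hard to define.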

High-Throughput Sequencing Technologies
What has significantly changed the scope of aDNA research in the last decade is the development of faster and more efficient high-throughput sequencing (HTS) technologies [17,18], which require only small amounts of template DNA, dramatically reduce the cost of sequencing, and increase throughput [19]. This has allowed the sequencing of entire genomes from an increasing number of ancient individuals and extinct species, mainly animals [20]. Genomic research in plants still lags behind equivalent work in animals, mainly because it is difficult to find good-quality DNA in charred seeds or wood, which represent 95% of the plant archeological record. However, it has recently become possible to study DNA extracted from mixed substrates like sediments. This, together with the combination of HTS with new DNA extraction methods, better knowledge of the origin of aDNA and of the processes involved in DNA preservation in fossil substrates (taphonomy), and technical improvements in bioinformatic analyses, has considerably increased the temporal and geographical range and the types of organisms that can be studied. In this way, we have moved rapidly from the analysis of a few organelle loci (mitochondrial in animals and chloroplast in plants) to the study of entire nuclear and organelle genomes in several iconic ancient species. It became possible to sequence the entire genome of a ~40,000-year-old woolly mammoth (Mammuthus primigenius) [21], the genome of a ~4000-year-old Paleo-Eskimo [22], and the Neanderthal genome (Homo neanderthalensis) [23]. Today, the sequencing of whole genomes from late Quaternary specimens dating to thousands and even hundreds of thousands of years before present is possible, though still challenging.
Many of these studies have allowed inferences about when a genotype of a living population or species appeared in a certain environment. They have also redefined the time and area where the extinction and divergence of species and lineages took place and have provided more insight into postglacial migration routes [15,24]. It has also been possible to combine datasets of modern and ancient genotypes, phenotypes, and environmental conditions across space and time, and to investigate how these interacted and how such combinations shaped the evolution of our own species [25]. We can thus obtain a better understanding of what happened at the genetic level within human populations when environments changed and populations mixed in the past.
Together, these advances mean that we are on the eve of an explosion of new information on the identification of fossils at the species level and on the exact linking of species, populations, and communities with their modern equivalents, within a tree of life that includes fossils as well as living organisms through periods of environmental and climatic change.
Unfortunately, the field of aDNA has some limitations caused by contamination introduced during fieldwork. Authenticity can nevertheless be assessed by following strict laboratory and fieldwork protocols and by planning robust experimental designs (see [26][27][28]). Moreover, a recently improved understanding of the effects of damage on aDNA templates [14,28,29] has started to provide a more robust basis for drawing sound ecological and evolutionary conclusions in this fast-growing research field.

A Multidisciplinary Research Field
Quaternary DNA is a research field at the intersection between paleoecology and evolutionary biology as it allows scientists to directly study evolutionary events of the past in ecological contexts. The field is not isolated or alone; it requires coordinated and interdisciplinary work between several disciplines. Indeed, what has been particularly beneficial in the last few decades of study has been the multidisciplinary approach implemented by many research teams around the world. It has been through collaborative work between evolutionary biologists, paleoecologists, archeologists, geologists, anthropologists, and conservation scientists that robust and reliable results have been produced in several groups of organisms including both plants [30] and animals [31].
There are several examples of questions addressed by combining multiple disciplines with aDNA studies, mainly focusing on animals. Here, only three studies that address important issues in evolutionary biology are discussed.
The first study concerns human history, a discipline that in the last decade has been strongly influenced by the introduction of modern and ancient genome sequencing. Recently, studies of human aDNA have helped to clarify the proportions in which we are related to Neanderthals and Denisovans. From analyses of Neanderthal and Denisovan genomes from fossils discovered in Europe and Asia, scientists concluded that most Europeans and Asians carry on average 2% Neanderthal DNA [23,32], while some Austronesians, in particular people from Papua New Guinea, carry up to 5% Denisovan DNA [33]. Previous archeological and genetic evidence has shown that Neanderthals were thinly dispersed across Europe and Asia. The new aDNA evidence suggests that humans mated with both Denisovans and Neanderthals more than 50,000 years ago and that approximately 20% of the whole Neanderthal genome is present in the modern human gene pool, implying that the average 2% of Neanderthal DNA present in Europeans and Asians is not the same in all populations [34][35][36]. Other aDNA studies have tried to understand the effects that these different portions of archaic genomes have on the phenotypes of modern humans. Dannemann and Kelso combined data on living people from the UK Biobank with phenotypic and genomic data from Neanderthal individuals and showed that Neanderthal DNA in modern Europeans affects a large number of behavioral, metabolic, immune, and skin traits, such as skin tanning and burning, sleeping patterns, mood, and tobacco use [25]. Contrary to what was previously believed, genetic loci associated with red hair were very rare or almost non-existent in Neanderthals. However, knowing how much modern and archaic humans have in common and which parts of the genome are shared does not tell us why Neanderthals disappeared ca. 40,000 years ago, nor does it provide information on their health and way of living.
It is possible that they were killed by modern humans or that they were infected with novel diseases, or instead, Neanderthals could not reproduce fast enough to keep up with modern humans. It is also possible that a combination of all these factors, with climate change playing a more important role than previously thought, caused the Neanderthal populations to gradually decrease in size during the repeated cold periods of the late Quaternary.
Other recent aDNA studies that have shed light on human evolution come directly from soils. By extracting and shotgun sequencing the DNA of ancient animals (cave bear, woolly rhinoceros, woolly mammoth) from cave sediments dating up to 240,000 calibrated years before the present (cal. year BP), scientists have recently inferred the presence of Neanderthals and Denisovans without the need to find their bones [37]. Shotgun sequencing is a very powerful approach that aims to sequence all DNA fragments present in an aDNA mix (metagenomics) and uses complex bioinformatic analyses to assign the short DNA reads to different taxonomic groups. In sediment, however, the vast majority of DNA fragments come from bacteria and only a small fraction come from animals and plants [38,39]. Therefore, DNA capture-enrichment methods combined with HTS should be used to sequence informative genome subsets of interest out of the DNA mix [40].
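One step in assigning short reads to taxonomic groups can be sketched in a few lines. A short, damaged read often matches reference sequences from several related species equally well, and a common conservative strategy is to assign it to the lowest taxonomic rank shared by all of its matches (a lowest-common-ancestor rule). The example below is a minimal illustration under that assumption and is not the pipeline used in the cited studies; the function name and the simplified lineages are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages.

    Each lineage is a sequence of taxon names ordered from the root
    towards the species. Returns None if nothing is shared.
    """
    if not lineages:
        return None
    shared = None
    # zip(*lineages) walks all lineages rank by rank, from the root down.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            shared = taxa_at_rank[0]
        else:
            break
    return shared


# Simplified, illustrative lineages (not a complete taxonomy).
CAVE_BEAR = ("Eukaryota", "Mammalia", "Ursidae", "Ursus", "Ursus spelaeus")
BROWN_BEAR = ("Eukaryota", "Mammalia", "Ursidae", "Ursus", "Ursus arctos")
MAMMOTH = ("Eukaryota", "Mammalia", "Elephantidae", "Mammuthus",
           "Mammuthus primigenius")

# A read matching both bears is only reported at genus level.
print(lowest_common_ancestor([CAVE_BEAR, BROWN_BEAR]))
# A read matching a bear and a mammoth is only informative at class level.
print(lowest_common_ancestor([CAVE_BEAR, MAMMOTH]))
```

The rule trades resolution for reliability: ambiguous reads are pushed up the tree rather than being forced onto a single species, which matters when reference databases are incomplete.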
A second example of Quaternary DNA discoveries comes from plants and concerns Reid's paradox of the rapid migration of plants [41]. This is the observation, made by Reid more than a hundred years ago, that forest recolonization after the last Quaternary glaciation (ca. 12 cal. kyr BP) was much faster (several km/yr) than would be expected from the life histories and the seed and fruit dispersal abilities of the species involved [42]. Fast recolonization can be explained by very rare long-distance dispersal events mediated by birds and wind. An alternative, though controversial [43,44], explanation is that cold-tolerant species survived at high latitudes in small isolated microrefugia and recolonized locally once the ice retreated (e.g., [45,46]). Recently, several lines of evidence based on fossil analysis, species distribution modeling, and phylogeographical surveys have been used to propose the existence of these small northern microrefugia [47,48]. In Scandinavia, the finding of Norway spruce (Picea abies (L.) Karst) megafossil remains, together with the presence of clonal spruce trees with plant material collected under their roots dated up to 11.5 cal. kyr BP (Figure 4), suggests the possible survival of spruce in this region during the last glaciation [45]. This hypothesis was tested using both modern DNA and aDNA from lake sediments, and the genetic findings supported the hypothesis that spruce trees were present in Scandinavia during the last part of the glacial period and the early Holocene [49]. These results, together with modern DNA results from North America showing that populations were closer to modern range limits than previously thought (e.g., [50]), question traditional views on the survival and spread of trees in response to climate change and strongly suggest that postglacial migration rates may have been slower than those inferred from fossil pollen.
The last example comes from environmental DNA extracted from lake sediments, which has transformed the accepted hypotheses on how human populations first reached North America. The scientific consensus was that prehistoric humans arriving from Asia via the now-vanished land bridge connecting Siberia and Alaska (Beringia) migrated south after the North American ice sheet had melted and created an ice-free corridor. Using aDNA extracted from lake sediments from Alaska in a bottleneck portion of the ice sheet, Pedersen et al. reconstructed how the ecosystem evolved within the corridor after the ice began to melt [51]. Using shotgun sequencing, they found evidence of bison, mammoth, and steppe vegetation by ca. 12.6 cal. kyr BP, followed by evidence of open forest with moose and elk at ca. 11.5 cal. kyr BP. The Clovis culture, however, first appears in the archeological record south of the ice sheet corridor more than 13,000 years ago. The aDNA therefore revealed that the first Americans were unlikely to have travelled through the corridor into the Americas, because at that time the ice-free corridor could not supply the biotic resources (bison, mammoth, elk, and moose) that human foragers needed to cross it. The most likely scenario is that they travelled earlier down the Pacific Coast by sea and reached the southern continent by boat.

Environmental DNA Yields Clue to Past Biodiversity
All of these discoveries highlight the potential of Quaternary DNA studies to assess past biodiversity and reconstruct the history of human and plant species, as well as the importance of using environmental DNA [52,53]. In the last decade, environmental DNA extracted from peat, lake sediments, permafrost, and ice cores has provided a massive amount of information on the history of past flora and fauna and its development in relation to climate change [15,16,20,24,37,49,54-60]. In particular, studies from high-latitude lakes, where DNA preservation seems to be optimal [61], combined with information from traditional proxies like pollen and macrofossils, are now improving our capacity to assess changes in floral biodiversity through time in relation to climate, not only qualitatively but also quantitatively [62]. Even with a number of biases related to taphonomy, contamination, and incomplete reference databases for taxonomic assignments, metagenomics is a powerful technique, since it theoretically allows all of the fragmented aDNA molecules present in a sample to be assessed, from microorganisms [59,63] to plants [38,64], insects, and mammals [16,39], including extinct humans [37]. In addition, the large data output obtained through HTS metagenomic sequencing permits statistical analyses to detect specific damage patterns in aDNA molecules (C to T and G to A substitutions) that are only present in ancient reads and that can prove their ancient origin [65,66]. In the future, coupling target capture technology with metagenomics will significantly extend the scope of Quaternary DNA research in sediments, particularly as full-genome reference databases are being constructed.
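The damage-pattern test mentioned above can be illustrated with a simple count. The sketch below takes reads already aligned to a reference (ungapped, equal-length pairs) and computes, for each of the first few positions, how often a reference C is read as T. Authentic ancient reads typically show this frequency rising towards the read ends, whereas modern contaminants show a flat, low profile. The function name and the toy alignments are hypothetical, and the published methods [65,66] fit statistical models rather than relying on raw counts like these.

```python
def substitution_rate_by_position(alignments, ref_base, read_base,
                                  n_positions=5, from_3prime=False):
    """Per-position frequency of ref_base being read as read_base.

    alignments: iterable of (reference, read) strings of equal length,
    aligned without gaps. Positions are counted from the 5' end, or from
    the 3' end when from_3prime is True. Positions where the reference
    base never occurs are reported as None.
    """
    ref_counts = [0] * n_positions
    sub_counts = [0] * n_positions
    for reference, read in alignments:
        if from_3prime:
            reference, read = reference[::-1], read[::-1]
        pairs = zip(reference[:n_positions], read[:n_positions])
        for i, (r, q) in enumerate(pairs):
            if r == ref_base:
                ref_counts[i] += 1
                if q == read_base:
                    sub_counts[i] += 1
    return [s / r if r else None for s, r in zip(sub_counts, ref_counts)]


# Toy alignments (reference, read), invented for illustration.
toy_alignments = [
    ("CACGT", "TACGT"),  # C read as T at the first position
    ("CCGTA", "TCGTA"),  # C read as T at the first position only
    ("CGGCA", "CGGCA"),  # no substitutions
]

c_to_t = substitution_rate_by_position(toy_alignments, "C", "T", n_positions=3)
print("C to T frequency from the 5' end:", c_to_t)
```

Calling the same function with `ref_base="G"`, `read_base="A"`, and `from_3prime=True` gives the complementary G to A profile at the other end of the reads.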

Conclusions
Despite early challenges, the field of aDNA has lately experienced significant advances in methods, technologies, and in understanding the taphonomy of DNA molecules present in ancient remains (bones, macro- and micro-plant remains, and sediments from caves, lakes, permafrost, and ice). If appropriate sequencing techniques are used, Bayesian estimates of ancient DNA damage parameters now provide a robust basis for correctly identifying authentic sequences from these ancient remains. At the same time, HTS-based sequencing methods are becoming more accessible and less expensive, and the genomic reference databases used for taxonomic assessments are improving. The expectation is that in the coming years these new developments will bring an explosion of aDNA studies in many laboratories worldwide. These aDNA studies will likely pose some problems for researchers from complementary disciplines such as archeology, history, and paleoecology. Researchers therefore need to be collaborative, share information, and carefully walk together over the shared landscape of the past [67]. The important thing to understand is that the aDNA field cannot work alone: acknowledgment and discussion of the results from complementary disciplines is not only a recommendation, but a requirement in this rapidly growing research field.
Funding: This research received no external funding.