Back to the Future: Using Long-Term Observational and Paleo-Proxy Reconstructions to Improve Model Projections of Antarctic Climate

Abstract: Quantitative estimates of future Antarctic climate change are derived from numerical global climate models. Evaluation of the reliability of climate model projections involves many lines of evidence on past performance combined with knowledge of the processes that need to be represented. Routine model evaluation is mainly based on the modern observational period, which started with the establishment of a network of Antarctic weather stations in 1957/58. This period is too short to evaluate many fundamental aspects of the Antarctic and Southern Ocean climate system, such as decadal-to-century time-scale climate variability and trends. To help address this gap, we present a new evaluation of potential ways in which long-term observational and paleo-proxy reconstructions may be used, with a particular focus on improving projections. A wide range of data sources and time periods is included, ranging from ship observations of the early 20th century to ice core records spanning hundreds to hundreds of thousands of years to sediment records dating back 34 million years. We conclude that paleo-proxy records and long-term observational datasets are an underused resource in terms of strategies for improving Antarctic climate projections for the 21st century and beyond. We identify priorities and suggest next steps to address this.


Introduction
Making quantitative projections of how Antarctic climate may change over the 21st century and beyond involves the use of numerical climate models. Trust in the climate models used to make projections is based largely on evaluation against observable features of the climate system [1]. In this context it is important to consider a range of model-observation metrics across different components of the climate system. Although rigorous evaluation of many key aspects of climate models' performance is possible based on modern observational datasets, limiting factors in model evaluation for Antarctica and the Southern Ocean are the short time period spanned by modern in-situ instrumental datasets and the lack of sustained observations of some key processes [2]. In particular, Chapter 9 of the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) [1] states that 'In many cases, the lack or insufficient quality of long-term observations, be it a specific variable, an important process, or a particular region (e.g., polar areas, the upper troposphere/lower stratosphere (UTLS), and the deep ocean), remains an impediment' to model evaluation.
A key challenge when interpreting projections of 21st century climate change is the wide range of estimated changes, even when the same anthropogenic emissions forcing scenario is used [3]. Over many parts of the planet these changes scale with global warming rate, i.e., climate models that have a high sensitivity in terms of their global response to anthropogenic forcing also have a high sensitivity in a given region or continent. However, this is not the case for Antarctica and the Southern Ocean, where, for projections to 2100, a high sensitivity model globally is not necessarily a high sensitivity model over Antarctica (e.g., [4,5]). This points to a major role for regional processes in modulating climate and ice sheet changes over Antarctica. The implication is that our estimates of future Antarctic climate change and its global impacts are strongly reliant on good representation of the atmosphere-ocean-ice system of Antarctica and the Southern Ocean [6].
Here we define the start of the modern observational era as the introduction of a network of continuous meteorological observation stations over Antarctica during the International Geophysical Year (IGY) in 1957/58. However, recognizing that a major improvement occurred subsequently with the introduction of high-quality continuous satellite remote sensing data in 1978/79, we further identify the post 1978/79 period as the modern satellite era. We explore the potential to use longer-term pre-modern (i.e., pre-1957/58) instrumental climate reconstructions and paleoclimate proxies to help improve the reliability of model projections of Antarctic climate change under future emission scenarios to the year 2100. This is a broad time span that incorporates a range of data sources, from early twentieth century in-situ instrumental observations (terrestrial and ship-based platforms) to paleo-climate proxies of different past periods of relevance to projections. In the context of using longer-term datasets to contribute to climate model evaluation, here we outline the current state-of-the-art datasets, identify the key current challenges relevant to model evaluation, and then suggest next steps to addressing these challenges. This paper is novel in the sense that we consider a wide range of different types of long-term datasets, and in each case consider specifically the relevance for informing climate model projections.
This paper draws from discussions at the 2018 Past2Projections workshop of Antarctic Climate Change in the 21st Century (AntClim21), a scientific research program of the Scientific Committee on Antarctic Research (SCAR). AntClim21 is focused on improving estimates of how Antarctic and Southern Ocean climate may change over the 21st century. Due to the wide range of timescales and scientific priorities discussed at the workshop, this paper is divided into three main sections. Preceding these, Section 2 introduces key phenomena and processes to provide background for the main sections. Section 3 focusses on past climatic states and transitions back to the presumed onset of the Antarctic glaciations, 34 million years before present. Section 4 covers the last two millennia, considered as the baseline climate of modern conditions, since this matches the approximate range of a number of prominent reconstruction and modelling initiatives. Section 5 looks at the emergence of anthropogenic climate signals and the potential for using what we have observed about the response of the climate system to present levels of anthropogenic forcing to help inform estimates of future change. Sections 3 and 4 follow a similar structure, whereby they include an outline of the current state-of-the-art datasets, the identification of key current challenges relating to model evaluation, followed by suggested next steps. Section 5 focusses on the specific topic of the emergence of anthropogenic climate change signals.

Key Phenomena and Processes Relating to Past Reconstructions and Future Projections of Antarctic Climate
There are a number of phenomena and processes that are important in both the design and interpretation of many longer-term datasets of the Antarctic ocean-atmosphere-ice system. These will be mentioned widely in this paper, and are therefore introduced only briefly here.
From an atmospheric point of view, variability in circulation is widely characterized by indices of the Southern Annular Mode (SAM), which is the leading mode of Southern Hemisphere (SH) atmospheric variability in sea-level pressure [7]. The SAM captures variations in the latitude and/or speed of the Southern Ocean westerlies with more positive index values associated with more poleward and/or stronger westerlies. More regionally, an important feature of atmospheric variability is the Amundsen Sea Low (ASL) [8]. The ASL is a climatological low-pressure anomaly adjacent to West Antarctica, the depth of which is in fact related to the SAM [8]. However, it is a key part of understanding atmospheric circulation influences on regional weather and climate over large parts of West Antarctica and the Antarctic Peninsula and is important in the context of understanding the effects of both stratospheric ozone depletion and tropical variability. With regard to effects of ozone depletion, some model studies indicate that the ozone hole has acted to deepen the ASL [9,10]. The mechanisms responsible for simulated ASL sensitivity to the ozone hole are not well known, but one suggestion is that ozone-depletion-related circumpolar storm track changes cause an increase in cyclonic activity over the Amundsen Sea and a deeper ASL [10]. In relation to the remote influence of tropical variability and trends, the ASL is situated to the poleward end of a pattern of high and low atmospheric pressure anomalies that connects the Tropical Pacific and the Amundsen Sea. This pattern, known as the Pacific South American (PSA) pattern, is strongest in austral winter and emerges as the second mode of SH atmospheric variability. The PSA is generally explained as a Rossby wave train propagating from the tropical Pacific towards West Antarctica [11] and therefore provides a key mechanism for explaining Antarctic teleconnections to El Niño-Southern Oscillation (ENSO). 
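As an illustration of what "leading mode of variability" means in practice, the following minimal sketch extracts the leading empirical orthogonal function (EOF) from a (time, space) anomaly matrix using a singular value decomposition. It runs on synthetic data and is not the procedure used for any specific SAM index in the cited literature; the function name `leading_eof` and all input values are hypothetical, and latitude (area) weighting, which a real analysis on a latitude-longitude grid would normally include, is omitted for brevity.

```python
import numpy as np

def leading_eof(field):
    """Return the leading EOF pattern, its principal component (PC) and the
    fraction of variance it explains, for a (time, space) data matrix."""
    # Remove the time mean at every spatial point to obtain anomalies.
    anomalies = field - field.mean(axis=0)
    # SVD of the anomaly matrix: rows of vt are spatial patterns (EOFs).
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    variance_fraction = s[0] ** 2 / np.sum(s ** 2)
    pattern = vt[0]            # unit-length spatial pattern
    pc = u[:, 0] * s[0]        # time series (index) of the leading mode
    return pattern, pc, variance_fraction

# Synthetic example: one dominant pattern plus weak noise (illustrative only).
rng = np.random.default_rng(0)
n_time, n_space = 200, 30
true_pattern = np.sin(np.linspace(0.0, np.pi, n_space))
amplitude = rng.standard_normal(n_time)
field = 3.0 * np.outer(amplitude, true_pattern)
field += 0.3 * rng.standard_normal((n_time, n_space))

pattern, pc, frac = leading_eof(field)
print(f"variance explained by leading mode: {frac:.2f}")
```

The sign of an EOF and its PC is arbitrary, so an index derived this way is usually oriented by convention (for the SAM, so that positive values correspond to more poleward and/or stronger westerlies).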
The PSA, SAM and ASL have been found to be sensitive to external climate forcing (both natural and anthropogenic), with stratospheric ozone depletion/recovery, greenhouse gases, orbital parameters, solar forcing and volcanic activity all potentially important depending on the time period and season under consideration [10,12-16].
In addition to the dynamic influence of atmospheric circulation patterns described above, climate conditions over the Antarctic continent are also influenced thermodynamically by changes in surface conditions over the Southern Ocean (i.e., sea-surface temperature (SST) and sea ice extent/concentration), particularly on centennial and longer timescales [6,17,18]. This is particularly apparent for Antarctic-wide averages of precipitation/surface mass balance (SMB) and surface temperature.
From an oceanic point of view, Antarctica is surrounded by the eastward-flowing Antarctic Circumpolar Current (ACC). Branches of the ACC yield two main Southern Ocean polar gyres, the Weddell Sea gyre (Atlantic sector), and the Ross Sea gyre (Pacific sector). At depth, the ACC is composed of Circumpolar Deep Water (CDW), a warm (~0.5 to 1.5 °C close to the margins) and relatively salty water mass (~34.8 PSU). Part of the CDW is geostrophically adjusted and forms the westward-flowing Antarctic Slope Current (ASC), which impinges on the continental shelf edge around Antarctica [19]. On the continental shelf edge, brine rejection from sea ice formation and cooling by cold air flowing off the continent yields cold dense bottom waters that contribute to Antarctic Bottom Water (AABW) formation. In the Bellingshausen and Amundsen Seas, the ASC is weak and CDW can flow onto the continental shelf. On the shelf, this relatively warm water, mixed with local continental shelf waters, flows under the Antarctic ice shelves and causes sub-shelf melting that forms the local Ice Shelf Water (ISW). The ISW flows out to the continental shelf edge as one of the contributors to AABW formation. ISW, CDW/ASC, bottom waters and AABW all have distinct signatures in terms of fauna and mineral composition of sediments. Thus, sediment cores can provide information on water mass pathways in past climates.

Past Climates and Transitions (34 Ma to the Last 2 Ka)
The potential of paleoclimate studies in the context of climate model evaluation and improving projections resides in the necessity to simulate projected climate states substantially different from present-day [20]. Long-term paleoclimate records can thus be used to frame valuable tests for climate models across a range of climate states back to the presumed onset of glaciation of the Antarctic ice sheet during the Eocene/Oligocene Transition (~34 million years ago, Ma, EOT) [21]. They can help to improve understanding of mechanisms that drive variability and changes in Antarctic climate. The question of how to bound a conceptual future with realistic but uncertain paleoclimate modeling and observational reconstructions is central to the evaluation of the climate models used to carry out projections [22].

Current State-of-the-Art
Key periods of both direct and indirect relevance to improving climate projections are (Table 1): the Holocene (from ~11.7 thousand years ago, ka), the Last Glacial Maximum (LGM) (21 ka), the Last Great Interglacial (LIG, 115-130 ka, marine isotope stage (MIS) 5), MIS 11 (374-424 ka), the mid Pliocene Warm Period (mPWP) (3-3.3 Ma), the mid Miocene Climatic Optimum (MMCO) (14-17 Ma) and the EOT (~34 Ma). These periods are characterized by a gradual increase in atmospheric CO2 looking back through time (Figure 1) and are relevant to Antarctic projections specifically in relation to the impact of oceanic warmth on the stability of marine-based Antarctic ice sheets; the seasonality of sea ice cover; bottom water formation and related oceanic heat transport; and changes in pole-to-equator thermal gradients and atmospheric jet streams.
Simulating past climates implies, as for present climates, a data-model comparison approach to evaluate the climate reconstructions and identify potential biases that do not arise in short-term 20th century simulations [20]. Over the past decades various model inter-comparison initiatives, most of them still on-going, have been actively supported by the paleoclimate community:
• The Paleoclimate Model Intercomparison Project (PMIP, now in phase 4, and endorsed in the current World Climate Research Programme's Coupled Model Intercomparison Project Phase 6 (CMIP6) [23]): equilibrium simulations of the mid-Holocene (~6 ka, [24]), the LGM [25] and the LIG [26]. The last two phases also include the last millennium and the current phase is aimed at simulating transient evolution of particular time windows such as the last deglaciation or the LIG.
• [29]): is on-going and tests the sensitivity of models to very high atmospheric CO2 concentrations and/or different paleo-geographies, as suggested by proxy reconstructions for times older than the EOT.
Figure 1 (caption, beginning not recovered): [39]; Pleistocene CO2 reconstruction is from EPICA Dome C (East Antarctica, [36]); the last 2k CO2 reconstructions are from Law Dome (East Antarctica, [55]), Dronning Maud Land [56] and West Antarctic Ice Sheet (WAIS) Divide [57]; historical CO2 concentration is from Law Dome (East Antarctica); 20th century CO2 concentrations are from the Mauna Loa observatory; projected CO2 pathways correspond to the new Shared Socioeconomic Pathways (SSP, [58]) as used in CMIP6. The main warm periods of interest are indicated on the frames (see Table 1). Blue bars indicate the onset of Antarctic glaciations at the EOT boundary and the intensification of Northern Hemisphere (NH) glaciations at the Plio-Pleistocene transition, respectively.
These initiatives have triggered the development of paleoclimate proxy databases focused on those specific periods. For example, a multi-proxy database of sea surface temperature for the LGM was compiled as part of the Multiproxy Approach for the Reconstruction of the Glacial Ocean surface (MARGO) project [30]. For the LIG, two main databases have been compiled: the first includes terrestrial and marine sites [31] and the second includes high-latitude ice cores and marine sediment cores [32]. The most complete global reconstruction of climate is currently the Pliocene Research, Interpretation and Synoptic Mapping (PRISM) dataset [33]. For the deep past, Miocene, EOT and before, no similarly comprehensive databases exist, mainly due to a scarcity of data. For past global to regional sea level changes, paleo-sea level markers for all the periods mentioned above are being compiled as part of the PALeo constraints on SEA level rise working group (PALSEA).
Table 1 (caption, beginning not recovered): ... Figure 1 for a timeline. LGM: Last Glacial Maximum; LIG: Last Great Interglacial; mPWP: mid-Pliocene Warm Period; MMCO: mid Miocene Climatic Optimum; EOT: Eocene/Oligocene Transition. Orbit is determined based on [35]; atmospheric CO2 concentration is from the EPICA Dome C ice core record [36] for the Pleistocene, and from sediment cores for the mPWP [37], MMCO [38] and EOT [39]; eustatic sea level comes from [40] for the Holocene, [34] and references therein up to the mPWP, from [41] for the MMCO and from [42] for the EOT. WAIS: West Antarctic ice sheet and EAIS: East Antarctic ice sheet.

• The last deglaciation represents a transient climate warming (from glacial to interglacial) with atmospheric CO2 remaining lower than present day (< 300 ppm, Figure 1).
• The LIG and MIS 11 can be used to evaluate the impact of warm orbits on meridional and zonal thermal gradients, polar amplification and the behaviour of teleconnections (SAM, PSA, ENSO). Both periods are suitable for reconstructing the oceanic conditions leading to Marine Ice Sheet Instability in Antarctica under low atmospheric CO2 concentration [34].
• mPWP, MMCO and EOT provide opportunities for out-of-sample evaluation of the ability of models to simulate rapid retreat of ice sheets (freshwater feedbacks) as well as sea ice expansion, polar amplification, teleconnection behaviour and bottom water formation. They have similarities with emission scenarios used by the IPCC for future climate projections.

Current Challenges
In order to accurately simulate past climates, it is important to have reliable reconstructions of atmospheric greenhouse gas concentrations on a range of time scales. Atmospheric concentrations of CO2 can be retrieved from Antarctic ice core records back to about 850 ka (e.g., [36]) (Figure 1). However, a recent study shows that uncertainties in CO2 reconstruction of the Late Pleistocene are non-negligible [59], leading to large differences between models, especially in the case of ice sheet glaciation and deglaciation thresholds [60]. For pre-ice-core records, CO2 reconstructions are retrieved from sediment cores, with large uncertainties (Figure 1, e.g., [39]). The low temporal resolution of CO2 records, ice cores included, is therefore a major challenge in terms of accurately modeling past millennial to sub-millennial time-scale CO2-forced climate variability.
In addition to the importance of a reliable greenhouse gas forcing history, climate models also need to be set up with accurate boundary conditions. A current challenge in this respect is paleo-bathymetry, which is critical to the understanding and reconstruction of past ocean circulation and marine-based ice sheet dynamics. For ice sheets, improved knowledge of bathymetry can help to reconstruct the occurrence of pinning points favoring the stability of the ice sheet and formation of ice shelves [43,61]. For the ocean, it helps to more realistically simulate regional circulations, the formation of AABW and heat exchange across the continental shelf edge (e.g., [62]). Paleo-morphological features of current bathymetry can be used to infer the type of ice sheet flow, the occurrence and pathways of release of meltwater [63], pathways of bottom water formation and feedbacks with surrounding oceanic water masses. The scarcity of marine deep drilling sites hampers accurate dating of the paleo-bathymetry reconstructions based on seismic stratigraphy (e.g., [64]) and thus limits the availability of reliable reconstructions to be used as boundary conditions in models [61].
Challenges relating to forcing history and boundary conditions are relevant to the long-standing issue of climate model spin-up, which remains difficult to solve. A climatic state at a given time results from the superposition of long-term (several millennia to tens of millennia, mostly influenced by astronomical forcing) and short-term climate evolution prior to that moment. From a modelling perspective, simulating the long-term transient evolution of the entire system is only doable for a limited range of simplified climate models, such as Earth System Models of Intermediate Complexity (EMICs) [65,66], or very low-resolution Earth System Models (ESMs) (e.g., [67]). Some studies suggest that ~5,000 years of free running, initialized with fixed conditions (e.g., a climatology of ocean state variables), might be required to adequately spin up the deep ocean, particularly for climates which are significantly different to present (e.g., [68]). This is especially true for much colder periods like the LGM or much warmer periods such as the mPWP, the MMCO or the EOT. Therefore, depending on the model used and the spin-up method and duration, different equilibria can be reached for a single climate state (e.g., [69]). For example, in some models, more than 3,000 years of spin-up are necessary to trigger the realistic formation of AABW under glacial climatic conditions [70].
Turning to challenges of data-model comparisons in specific phenomena or variables, key issues have been revealed in analyses of the simulations conducted by the various model intercomparison projects. Comparison of PMIP3 simulations of the LIG with reconstructions reveals that the models are unable to capture the reconstructed warmth at high latitudes, particularly in the Southern Ocean (e.g., [26,71]). Simulations suggest that models overestimate the sea ice extent as well as the upwelling in some cases. Conversely, models are found to underestimate the Antarctic cooling in LGM simulations [72]. This has implications for the reliability of simulated westerlies, which are however difficult to assess in models due to uncertainties in available proxies for westerlies in the LGM [5,73,74]. Similar issues have been noted in PlioMIP experiments of the mPWP for which model disagreements in Southern Ocean SST, sea ice extent, westerly winds and the broader thermohaline circulation are again difficult to assess due to uncertainty in proxy reconstructions across these variables [75]. However, looking in more detail at processes, PlioMIP simulations show that there is a mismatch between proxy and simulated deep water ventilation in the Southern Ocean. Only a few models are able to explain the reconstructed changes in AABW fluctuations during the mPWP [70]. In general, data-model mismatches in the Southern Ocean are likely due, at least in part, to the omission of key processes within the climate models, such as the discharge of Antarctic meltwater into the ocean [71] or incorrect formation mechanisms for Antarctic bottom water [76].
Another current challenge potentially responsible for model-data differences is temporal consistency between paleo-proxy data and model output. For example, the PRISM dataset collated proxies over the entire mPWP [33,50] while models participating in the PlioMIP project had to impose specific sub-periods for their simulations [77]. This discrepancy contributed to a mismatch between the PRISM proxy database and the simulations [78]. Model intercomparison projects now attempt to focus on specific time slices (e.g., PMIP, PlioMIP2 and MioMIP). There is therefore a requirement to retrieve geological archives of at least sub-orbital to millennial temporal resolution and calibrate the large range of age models across the various types of proxies and regions for the Pliocene (e.g., [78,79]) and also other periods of interest. Another timing-related issue is the seasonal nature of many proxies. Some proxies have inherent biases toward a specific season, while others are representative of mean annual conditions.
Relevance for improving projections:
• Uncertainty in paleo-bathymetry is hampering progress in the ability to realistically simulate past and future freshwater flux pathways and related ice sheet-ocean interactions
• Differences in the model spin-up approach used by different modelling centres affect inter-comparisons of paleoclimate model simulations and potentially the rate of projected ice sheet change
• Current challenges in reconstructing and modelling sea ice, SST gradients, the ocean thermohaline circulation, westerlies and glacier conditions for past climates affect confidence in model projections
• A caveat for informing model projections based on past warm periods is that warming today is different in terms of drivers and rates of change (orbital versus greenhouse gas)

Next Steps
Paleoclimate model intercomparison projects (PMIPs) have increasingly provided data and stimulated the development of approaches for using paleoclimate simulations to improve future projections [20,80]. Examples of promising approaches that combine output from PMIP simulations and long-term climate reconstructions to constrain Antarctic climate projections include: (i) exploiting the relationship between Antarctic and global warming [81], (ii) utilizing cross-model correlations in simulated future and past sea ice extent (SIE) changes [81], and (iii) assessing links between SIE and the circumpolar surface westerlies as simulated by models in a range of different past climates [73]. A key next step would be to increase the number of modelling groups participating in future PMIPs (only nine models were available for use in recent PMIP3 studies [73,81]); this would benefit the above interpretations by, for example, increasing the statistical significance of relationships across the multi-model ensemble.
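To make approach (ii) concrete, the sketch below shows one simple way a cross-model relationship between simulated past and future changes could be combined with a reconstructed past change. It is a minimal illustration under stated assumptions, not the method of the cited studies: the function name `constrain_projection`, the nine-model values and the "reconstructed" value are all hypothetical, units are arbitrary, and a real application would also propagate uncertainty in both the regression and the reconstruction.

```python
import numpy as np

def constrain_projection(past_change, future_change, reconstructed_past):
    """Fit a least-squares line to future versus past change across models and
    evaluate it at the reconstructed past change.

    Returns the constrained estimate and the cross-model correlation."""
    past = np.asarray(past_change, dtype=float)
    future = np.asarray(future_change, dtype=float)
    slope, intercept = np.polyfit(past, future, 1)   # linear fit across models
    correlation = np.corrcoef(past, future)[0, 1]
    return slope * reconstructed_past + intercept, correlation

# Hypothetical ensemble of nine models (illustrative numbers, not PMIP output).
past = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]             # past SIE change
future = [-0.6, -0.7, -1.1, -1.2, -1.6, -1.7, -2.1, -2.2, -2.6]  # future SIE change

estimate, r = constrain_projection(past, future, reconstructed_past=3.0)
print(f"cross-model correlation: {r:.2f}, constrained estimate: {estimate:.2f}")
```

With only nine ensemble members the correlation is estimated from very few points, which is why a larger number of participating models matters for the statistical significance of such relationships.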
Recent studies have used Antarctic ice sheet contributions to mPWP and LIG global mean sea levels as tuning targets for ice sheet models to reduce the uncertainties on sea level projections [82,83]. Further investigations of Antarctic ice sheet sensitivity to past warm climate conditions (e.g., MMCO, [51]; mPWP, [28,84]; LIG, [85], and the last deglaciation, [86]) are needed to better constrain its tipping points under different levels of atmospheric CO2 concentration (Figure 1).
Increases in computing power now make it more feasible to run transient simulations of a few thousand years with fully coupled Earth System Models, as for example for the LIG in the framework of PMIP4 [24]. An on-going German initiative, Paleoclimate Modeling (PALMOD), is aimed at simulating the entire last glacial cycle (130-0 ka) with ESMs. Implementation of adaptive meshes and unstructured grids, which optimize the use of computing resources while reaching high resolution in areas of interest, is on-going. Improvements in computing facilities are nevertheless needed to increase the spatial resolution of ESMs or to allow for high resolution coupled regional downscaling. Evaluation of such models for the past will require increases in temporal resolution and better characterization of age model uncertainty in marine and ice core proxies to better identify and constrain local polar processes and feedbacks that are still missing in models, for example processes at the interface between ice dynamics, sea ice, ocean, and atmosphere [61,87]. One on-going project, Pliocene climate variability over glacial-interglacial timescales (PlioVAR), is aimed at collecting and referencing sediment cores with potential for millennial to sub-millennial time resolution.
In addition to increased temporal resolution, it is necessary to select key variables that can ideally be both easily retrieved in the available proxy or observational datasets and are also representative of the main features of a given climate state such as global mean seasonal temperature, inter-hemispheric temperature gradients, polar amplification, cryosphere conditions, and the carbon cycle. As an example, ocean temperature, in particular SST and mid-level depth temperatures (400-800 m), is critical to many processes at the ice-ocean-atmosphere interface. Regional SST reconstructions, such as the recent twelve-thousand-year proxy-based index of the South Atlantic Subtropical Dipole [88], are important for improving our knowledge of spatial variability and trends. In combination with such efforts, new emerging proxies, i.e. marine biomarkers, will better constrain past polar ocean temperature (e.g., [53]).
Along with improved ocean temperatures, the goal of a more accurate model representation of the ice-ocean-atmosphere interface is a major motivation for improving reconstructions of paleo-bathymetry and related processes. Key next steps, which are currently being coordinated by the Past Antarctic Ice Sheet (PAIS) group of SCAR, involve both (i) developing and improving several paleo-bathymetric reconstructions of Antarctic margins and sub-glacial topography and (ii) Antarctic community participation in sediment core drilling expeditions and seismic surveys of the International Ocean Discovery Program (IODP). With regard to (i), these reconstructions will make it possible to locate AABW formation pathways in past climate time intervals from mapping bottom-controlled  (2019)) have highlighted differences, for the same time intervals, in the mode of ice streams waxing and waning versus climate changes and the mode of AABW formation and its role in the source-to-sink sediment transfer process. Data from these recent IODP expeditions will provide new sediment core records of paleo-productivity that will reveal the fluctuations of the polar front and the location of preferential CDW upwelling pathways onto the continental shelves, as well as toward ice shelf cavities during those critical past warm periods. A further step will be to identify other potentially important drilling sites around the Antarctic margin in key areas still lacking direct information to constrain models to study those mechanisms and understand local versus regional response of the environment to climate change. The use of high resolution seismic data is crucial to look for suitable drilling sites for sedimentary archives with expanded sections that will potentially provide information on orbitally-driven environmental records [89].
Relevance for improving projections:
• Larger multi-model ensembles in PMIP simulations would help to inform potential constraints on future projections
• Higher resolution paleoclimate records with better age characterisation would allow model evaluation over a wider range of climate variability timescales
• Improved bathymetry will help develop more realistic paleoclimate simulations, which is valuable for testing climate models outside their modern-era development focus

The Last Two Millennia
This period represents a transition between the present modern instrumental period and the period where climate information comes from a range of different proxy datasets. To help develop datasets for the purpose of climate model evaluation, an important step is to compile regional to continental-scale products that can be directly compared with model outputs. For in-situ weather station observations, large-scale patterns of atmospheric variability can be exploited to develop an Antarctic-wide picture from data gathered/recovered both from Antarctica and further equatorward from land masses along the northern boundary of the Southern Ocean [90]. With regard to ice cores, of the numerous variables measured, many vary with aspects of local, regional, and global climate. The creation of continental-scale ice core climate proxy data networks is complicated by the fact that, in some cases, a single ice core variable (e.g., methane sulfonic acid (MSA)) can represent a distinctly different climate variable depending on where the ice core site is located. In some geographies the year-to-year variability is dominated by onshore-offshore transport, and at others reflects biological production in the sea ice zone. For reasons such as this, great care and objectivity must be exercised when creating a continental-scale ice core climate proxy dataset to be related to a climate variable. As a consequence, this section is arranged into subsections focusing on two groups of variables: (i) temperature, precipitation and surface mass balance and (ii) surface pressure and teleconnections.

Temperature, Precipitation and Surface Mass Balance
There have been significant efforts made to gather Antarctic ice core arrays at a regional scale, e.g., ITASE [91]. Such arrays have been useful for reconstructing past climate over specific regional areas of Antarctica, including West Antarctica [92], the Antarctic Peninsula [93,94], and Wilkes Land [95]. The usefulness of these regional-scale arrays to climate modelers has been enhanced in recent years through efforts to compile multiple regional arrays into continental-scale datasets.
Examples of international efforts to compile and distribute multiple regional-scale ice core datasets include IceReader [http://www.icereader.org/icereader/] and PAGES 2k (e.g., [96][97][98]). This effort should be continued under the auspices of AntClim21 and the PAGES 2k project Climate Variability in Antarctica and Southern Hemisphere in the Past 2000 years (CLIVASH2k). Ultimately, the goal is to compile enough data points (core sites) to create reasonably representative gridded datasets of Antarctica for multiple ice core proxy variables that will be useful for the Antarctic climate modeling community. A key motivation for this approach is that it helps to overcome the issue of spurious (non-climate related) signals in individual records.
A comprehensive evaluation of stable water isotopes from ice cores was compiled by Stenni et al. [98]. Stable water isotopes from the high southern latitudes have long been used as a proxy for past surface temperature. Stenni et al. [98] evaluated 112 ice core records to produce regional and continental-scale temperature reconstructions back to 0 CE. Temperature scaling was performed in two ways: using the δ18O-temperature relationship output from an isotope-enabled climate model (ECHAM5-wiso), and by scaling the normalized isotopic anomalies to the variance of the regional reanalysis-estimated temperature. Significant warming trends during the past century were identified in West Antarctica, the Dronning Maud Land coast, and the Antarctic Peninsula.
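The variance-scaling step mentioned above can be illustrated with a minimal sketch. It is not the procedure of Stenni et al. [98]: the function name, the toy isotope values and the target standard deviation are all assumptions made for illustration, and the whole series is used as the calibration period.

```python
from statistics import mean, pstdev

def scale_to_temperature(d18o, target_temp_std):
    """Normalize an isotope series and rescale it to a target temperature variance.

    d18o: isotope values (per mil) for one regional composite (hypothetical input).
    target_temp_std: standard deviation (deg C) of a reference temperature
        series, e.g. from a reanalysis (hypothetical input).
    Returns temperature anomalies in deg C.
    """
    mu = mean(d18o)
    sigma = pstdev(d18o)
    if sigma == 0:
        raise ValueError("isotope series has no variance")
    # Normalized anomaly (unit variance) multiplied by the target spread.
    return [(x - mu) / sigma * target_temp_std for x in d18o]

# Toy values, not taken from any real ice core.
d18o = [-30.2, -29.8, -30.5, -29.5, -30.0]
anoms = scale_to_temperature(d18o, target_temp_std=0.8)
```

By construction the rescaled anomalies have zero mean and the same standard deviation as the reference temperature series, so any mismatch in variance between proxy and target is removed rather than evaluated.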
For surface mass balance (SMB), the PAGES Antarctica2k working group compiled a database of 79 ice core snow accumulation records spanning the last millennium [96]. Snow accumulation, determined by annual layer counting of seasonally deposited chemical species, is the sum of precipitation, sublimation/solid condensation, runoff/refreeze, and erosion/deposition by the wind. Unlike other ice core proxies, such as stable water isotopes, snow accumulation is comparable to modelled SMB from regional atmospheric circulation models such as RACMO2 or MAR [99]. To first order, high resolution and quality controlled SMB datasets [100] compare satisfactorily with precipitation minus evaporation (P−E) from global reanalysis products in areas where runoff is small, which is the case for most of the ice sheet. As such, the regional composites compiled in Thomas et al. [96] were compared directly with SMB output from the RACMO2 atmospheric model [101] to estimate changes in SMB (Gt per year) in seven geographically-distinct regions of Antarctica since 1800 AD. Moreover, the ice core databases have been converted to a gridded reconstruction of annual P−E based on three global atmospheric reanalysis products, specifically the European Centre for Medium-Range Weather Forecasts "Interim" (ERA-Interim), the NASA Modern Era Retrospective Analysis for Research and Applications version 2 (MERRA-2), and the National Centers for Environmental Prediction Climate Forecast System Reanalysis (CFSR) [102].
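The mass budget described above, and why it reduces to P−E where runoff and wind redistribution are small, can be written out as a short sketch. The sign convention and the numbers are illustrative assumptions, not values from [96] or [100].

```python
def surface_mass_balance(precip, sublimation, runoff=0.0, erosion=0.0):
    """Net SMB in the same units as the inputs (e.g., mm water equivalent per year).

    Assumed sign convention: precipitation is a gain; net sublimation, net runoff
    (runoff minus refreeze) and net wind erosion (erosion minus deposition) are losses.
    """
    return precip - sublimation - runoff - erosion

def p_minus_e(precip, evaporation):
    """Precipitation minus evaporation, as available from reanalysis output."""
    return precip - evaporation

# Illustrative interior site: no runoff and negligible wind redistribution.
smb = surface_mass_balance(precip=120.0, sublimation=15.0)
pe = p_minus_e(precip=120.0, evaporation=15.0)
```

With the runoff and erosion terms at zero the two quantities coincide, which is the situation assumed for most of the ice sheet in the comparison with reanalysis products.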
The ice core snow accumulation records have been used in a number of recent studies to evaluate the performance of climate models, including the regional atmospheric climate model (RACMO2) [103] and the longer reanalysis datasets such as ERA 20C [104]. Another recent study confirms that the observed temporal variability in SMB (from ice cores) is also captured by Global Circulation Models (GCMs) [105].

Relevance for improving projections
• Improved estimates of natural climate variability, and therefore its potential impact on projected change over the 21st century and beyond

Current Challenges
For Antarctic near-surface air temperature trends there are discrepancies between reconstructions, which indicate limited, non-significant continent-scale temperature trends, and the warming trends simulated by historical (1850-2005) climate simulations [90,106,107]. The Antarctic-wide picture masks contrasting behavior in different regions of Antarctica, such as the significant warming trends in West Antarctica and on the Antarctic Peninsula shown in the regional ice core isotope composites compiled by Stenni et al. [98]. Although this provides potentially important information on regional-scale climate processes, significant features such as mountains on spatial scales smaller than a climate model grid box (typically ~100 km for most global climate and earth system models) may make it more challenging to robustly evaluate climate models [81]. However, identifying specific modes of atmospheric variability, such as the changing Southern Annular Mode (SAM) and Pacific South American (PSA) pattern, and focusing on relevant regions and datasets, has the potential to provide a useful approach to model evaluation [108,109]. Last Millennium simulations from the 3rd Paleoclimate Model Intercomparison Project (PMIP3) have also been compared to long-term Antarctic temperature reconstructions. These Last Millennium simulations were conducted as a subset of CMIP5 runs, forced by key drivers of change over this period: volcanic eruptions, vegetation changes, orbital parameters and some changes in greenhouse gas concentrations [110]. These simulations do not exhibit a clear difference from Antarctic temperature reconstructions in terms of multi-decadal variability [109]. However, on shorter decadal timescales the PMIP3 models exhibit larger variability than the reconstructed record.
The reasons for this difference are difficult to identify and could include:
• uncertainty in forcings (e.g., vegetation, volcanic eruptions, solar variability) used to drive the models;
• misrepresentation or omission of physical processes within the models (e.g., uncertainty in representing anthropogenic aerosol processes and challenges in simulating extreme air-sea interaction conditions at high latitudes);
• the sparse and uneven availability of proxy data;
• biases in the reconstructions due to post-deposition effects, non-climatic influences on the records, or the complex relationship between some proxies (in particular δ18O) and climate variables (see for instance [111]);
• non-stationarity in relationships between variables of interest (such as changing SAM-temperature relationships [112], and PSA-SAM pressure relationships [113]); or
• incorrect assessment and interpretation of ice sheet surface mass balance from ice cores, for instance due to omission of high precipitation events resulting from maritime air intrusions [114].
Relevance for improving projections
• A 'good' model could exhibit biases in variability due to uncertainty in the amplitude of natural forcing of variability such as volcanic eruptions
• Key processes currently omitted or misrepresented in climate models may significantly bias estimates of future climate change impact
• Non-stationarity in relationships such as the SAM-temperature relationship is an important consideration when comparing models and reconstructions

Next Steps
The physical variables simulated by climate models are often not directly comparable with the biological or environmental variables derived from natural archives (e.g., [115]). For example, in the case of past climatic proxy data, the temporal resolution of natural archives often does not resolve the seasonal cycle. Thus, the traditional data-model comparison is biased by misinterpretation of the time averages to which specific proxies relate (from mean annual to mean seasonal). Ice core isotope records are not as simple a proxy for air temperature as previously thought. A better understanding is therefore required of the detailed processes recorded in oxygen and deuterium isotopes in various types of natural archives globally. This also highlights the need to develop alternative techniques for data-model comparison, including the incorporation of proxy system models [116] and isotopic tracers into climate models. Alternative approaches that can be employed in the interim include the use of pseudo-proxy approaches in comparisons of proxy data and model output, and the careful selection of the most appropriate model variables for comparison.
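A pseudo-proxy comparison of the kind mentioned above can be sketched in a few lines: model output is degraded to the temporal resolution and noise level of a proxy before being compared with it. The function name, the white-noise model, and the chosen signal-to-noise ratio and block length are assumptions for illustration only.

```python
import random
from statistics import pstdev

def make_pseudo_proxy(model_series, snr, block=1, seed=0):
    """Degrade a model series so that it resembles a proxy record.

    model_series: annual model values at a proxy site (hypothetical input).
    snr: assumed signal-to-noise ratio (ratio of standard deviations).
    block: number of years averaged per proxy sample (assumed resolution).
    """
    rng = random.Random(seed)
    # Average into non-overlapping blocks to mimic coarse temporal resolution.
    n_blocks = len(model_series) // block
    coarse = [
        sum(model_series[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]
    # Add white noise scaled to the requested signal-to-noise ratio.
    noise_std = pstdev(coarse) / snr
    return [x + rng.gauss(0.0, noise_std) for x in coarse]

# Synthetic stand-in for 100 years of model temperature anomalies.
rng = random.Random(1)
model_temp = [rng.gauss(0.0, 1.0) for _ in range(100)]
pseudo = make_pseudo_proxy(model_temp, snr=0.5, block=5)
```

Applying a reconstruction method to such pseudo-proxies, for which the underlying model "truth" is known, gives one estimate of how much skill is lost to proxy noise and resolution alone.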
In the case of past surface temperature, the most commonly used proxy is stable water isotopes. The process of air-mass cooling as moisture is transported to the poles, and the progressive loss of heavy water molecules along the condensation pathway, is well understood [117]. However, the relationship between stable water isotopes in snow and surface temperatures is also dependent on a number of other factors and may vary spatially and temporally. For example, changes in evaporation conditions and moisture origin, shifts in atmospheric transport pathways, and precipitation seasonality will all influence the δ18O-temperature conversion [97]. A recent study highlighted the degree to which Antarctic precipitation is dominated by extreme precipitation events, especially in regions such as West Antarctica [114], which will undoubtedly have a strong influence on the δ18O record preserved in ice cores. Thus, it is preferable to compare stable isotope records from ice cores with stable water isotope enabled climate models, such as ECHAM-wiso, following the approach of [98,118]. However, a recent study using GCM simulations advocated the use of SMB as a proxy for past surface temperature over the past 2000 years ([105], this issue). They demonstrated a stronger link between surface temperature and SMB in model simulations than that of δ18O and surface temperature at a regional scale.
Limitations still exist in properly modeling the surface energy balance, and hence heating and/or melt rate of Antarctic snow and ice, saltation of snow, and sublimation and albedo in ESMs [99,119,120], as well as the polar ocean and sea ice [87]. This is especially difficult due to the complexities of conducting accurate field observations, especially across large geographic domains. Additionally, within Antarctica, little effort has been made to robustly organize and standardize physical snow observations, which is important for closing the surface energy balance, as well as for interpreting ice core records. Furthermore, the radiative forcing of impurities in the cryosphere, such as black carbon, is currently reported with 90% uncertainty bounds [121] and there is a need for further observations to refine models of snow physical and optical properties (e.g., [122,123]) and of their impact on absorption and reflectance [124]. Future work should include efforts to address these limitations in representing the surface energy balance, with fieldwork being particularly important for providing the necessary data to help refine numerical models.
Because the computational costs of paleoclimate simulations remain prohibitive, only a small number of simulations are typically conducted. This is a key limitation, since ensembles of multiple simulations with each model are needed to quantify and assess the uncertainty arising from unforced internal climate variability. An exception, which demonstrates the potential for progress in this area, is the CESM1 Last Millennium Ensemble [125].
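The value of larger ensembles can be illustrated with a minimal sketch that separates an estimate of the forced response from the spread due to internal variability. The trend values are invented for illustration and do not come from any PMIP or CESM simulation.

```python
from statistics import mean, stdev

def ensemble_summary(members):
    """Split an ensemble of trends into a forced estimate and an internal spread.

    members: one trend value per ensemble member of a single model, all run
        with identical forcing (hypothetical input).
    """
    forced = mean(members)    # ensemble mean approximates the forced response
    spread = stdev(members)   # member-to-member spread reflects internal variability
    # The standard error of the ensemble mean shrinks with the number of members.
    stderr = spread / len(members) ** 0.5
    return forced, spread, stderr

trends = [0.12, 0.05, 0.20, 0.09, 0.14]   # illustrative values, deg C per decade
forced, spread, stderr = ensemble_summary(trends)
```

With a single simulation neither the spread nor the standard error can be estimated at all, which is why small paleoclimate ensembles limit the assessment of internal variability.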
Relevance for improving projections
• Moving more towards a 'like-for-like' comparison between models and climate proxies will likely help reduce uncertainties in model evaluation
• Increased intra-model ensemble sizes in PMIP simulations would help to assess projection uncertainty relating to internal climate variability

Surface Pressure and Teleconnections
Current State-of-the-Art
A number of modern re-analysis systems have been used to develop products that go back before the modern observational and satellite eras (ERA-20C, CERA20C, 20CR). In regions with a sufficient number of long-term weather stations, such as over Europe, they have reasonable reliability even in the early 20th century. However, in data-sparse regions, such as the high-latitude SH, reliability before the mid-20th century is poor (e.g., [126,127]). Rescue of historical observations as undertaken by projects such as Atmospheric Circulation Reconstructions over the Earth (ACRE) can contribute towards reducing these errors.
A number of studies have used the teleconnections between pressure at mid and high latitudes in the SH to produce reconstructions of atmospheric circulation and pressure back to the early 20th century. Jones, et al. [128] produced seasonal reconstructions of the SAM using mid latitude sea-level pressure (SLP) records; using these reconstructions Fogt, et al. [129] found that global climate models underestimate natural variability in the SAM in the 1930s and 1960s. SLP records from mid-latitude stations were also used to reconstruct surface pressure at Antarctic stations back to 1905 [130]. Reconstruction skill was found to be better in austral summer than in winter. These reconstructions were extended to produce a gridded surface pressure reconstruction poleward of 60° S in austral summer, and forcing mechanisms of the reconstructed pressure changes were explored through comparison with simulations with the Community Atmospheric Model, version 5 (CAM5) [131]. These reconstructions offer the possibility of both further use for model evaluation, as well as for comparison with, and calibration of, proxy records.
Various proxy networks and methodologies have been used for reconstructing SAM during the last millennium. These include annual and summer reconstructions based on temperature-sensitive proxies from across the SH [15], annual average reconstructions using SAM-temperature correlation patterns across the Antarctica-S. America sector [132] and summer reconstructions derived from S. America or New Zealand tree ring records [133]. The reconstructions show differences in scaling but all suggest that the most negative SAM conditions of the last millennium occurred about 1400 CE, and that SAM is characterized by large multi-decadal variability.
Relevance for improving projections
• Multi-proxy circulation reconstructions are best suited for evaluation of model-simulated hemispheric-scale SAM variability
• SAM reconstructions (both proxy and instrumental) are most consistent in summer, which is therefore most relevant to projections in this season

Current Challenges
Current key challenges in developing reconstructions of past atmospheric circulation variability and trends are (i) improving the reliability for non-summer months (and the annual mean), (ii) developing more reliable regional (non-annular) reconstructions and (iii) extracting more detail about whether variability and trends in the SAM are related more to shifting or strengthening/weakening of the Southern Ocean westerlies or indeed changes in non-annular features such as the ASL (e.g., [134]). Reconstructions of the SAM show large multi-decadal variability, but robustness of the reconstructions at this time-scale is difficult to assess because of the potential non-stationarity of the teleconnections [113]. Models generally display less variability at multi-decadal timescales than the reconstructions [109] and the mechanisms potentially responsible for such low frequency changes need a deeper investigation. Reconstructing circulation trends in greater detail is important since, for example, shifting and/or strengthening of the westerly jet have different drivers and different impacts on, for example, sea ice [5,135,136].
Relevance for improving projections
• Current reconstructions limit the scope for evaluation of non-annular circulation patterns and/or jet stream shifting or strengthening associated with SAM variability
• There is a need to better understand model-reconstruction differences in multi-decadal variability in order to increase confidence in using such information to inform projections

Next Steps
Early 20th-century observations retrieved through data rescue projects have the potential to improve historical reanalyses, but also to be used to evaluate these long-term reanalyses and help to develop specific southern mid-high latitude gridded circulation reconstructions. Rescue and use of observations located in the high latitude SH is particularly important, as existing reconstructions are based on mid-latitude data. There is also the potential to use such observations, or the resulting reconstructions, for comparison with proxy records (e.g., ice cores), and for model evaluation for non-Antarctic regions [137]. Longer-term pre-modern-observational-era observations have been little used for model evaluation so far. This may partly be because it is only relatively recently that large-scale data rescue and digitization efforts have been undertaken (e.g., through ACRE, and citizen science in initiatives such as Old Weather). Another possible reason is that modelers may be unaware that such data were collected using methods and instruments very similar to those used today, and are therefore more trustworthy than might be thought. Gridded datasets would make these observations suitable for use by climate modelers, particularly as they can provide monthly and seasonal information, and are direct measurements of the variables of interest (SST, SLP, wind, and air temperature).
The available reconstructions of circulation changes covering several centuries are generally independent of reconstructions for other variables, but atmospheric circulation variability is intrinsically coupled to changes in surface temperature, precipitation and surface mass balance. Taking advantage of the covariance between those variables offers the possibility to reduce the uncertainties on the reconstructions for each of them individually and to provide consistent reconstructions that allow a better interpretation of the potential origin of the changes. This is the basis of the data assimilation approach, leading to the long-term reanalyses that have been carried out very recently [105,113,138]. A systematic evaluation of existing reanalyses offers many opportunities, but new reanalyses are needed that take account of the proxies in the high latitudes of the Southern Ocean, in particular those directly related to circulation changes.
Relevance for improving projections
• New gridded datasets utilising rescued data would have potential for improved model evaluation of atmospheric circulation through the early 20th century
• Improving the robustness of paleo-proxy reconstructions of atmospheric circulation and the coherency with the reconstructions of other variables (e.g., through data assimilation) would help to improve the evaluation of climate variability in models and its role in projections

The Emergence of Anthropogenic Climate Signals
Looking back at past climate variability and change can potentially help to determine the extent to which humans are responsible for recent and current observed trends in Antarctic climate. This is relevant to estimating future climate change since the amplitude of detectable anthropogenic signals in historical data is potentially useful for informing such estimates (e.g., [139]). In the climate science community quantitative 'detection and attribution' (D&A) methods are widely used to establish the human contribution to recent/current trends. D&A refers to a two-stage process: first, detecting a statistically significant signal that cannot be explained by internal climate variability and, second, attributing that signal by identifying its cause (or causes). All known potentially important forcings, both natural and anthropogenic, should ideally be considered.
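The detection stage can be illustrated with a minimal sketch in which an observed trend is compared against the distribution of trends of the same length drawn from an unforced control simulation. The function names, the 95% threshold and the synthetic data are assumptions for illustration; the attribution stage is not covered.

```python
import random

def linear_trend(series):
    """Ordinary least-squares slope of a series against its index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def is_detected(observed, control, quantile=0.95):
    """Return True if the observed trend magnitude exceeds the chosen quantile
    of same-length trends drawn from an unforced control simulation."""
    n = len(observed)
    obs_trend = abs(linear_trend(observed))
    # Non-overlapping control segments with the same length as the observations.
    ctrl_trends = sorted(
        abs(linear_trend(control[i:i + n]))
        for i in range(0, len(control) - n + 1, n)
    )
    threshold = ctrl_trends[int(quantile * (len(ctrl_trends) - 1))]
    return obs_trend > threshold

# Synthetic example: white-noise "control run" and a 50-year record with a trend.
rng = random.Random(0)
control = [rng.gauss(0.0, 1.0) for _ in range(2000)]
observed = [0.1 * t + rng.gauss(0.0, 1.0) for t in range(50)]
detected = is_detected(observed, control)
```

In regions of large internal variability the control-run threshold is high, so the same forced trend takes longer to be detected, which is the situation described for Antarctica in the following paragraph.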
The climate system over Antarctica exhibits the largest internal variability on the planet (e.g., [140,141]), therefore externally-driven anthropogenic climate change signals can emerge later relative to lower latitudes. However, where anthropogenic signals can be identified in observations, there is potential for constraining estimates of future change based on the extent to which climate models over- or under-estimate the strength of a given signal [139]. This can help in regions of large internal climate variability such as around Antarctica. Two defining questions for this section are: (i) in what variables in the modern era have effects from anthropogenic forcing been detected? and (ii) how can longer-term datasets help to improve Antarctic D&A analyses? To answer the first question, a comprehensive suite of model simulations, robust and appropriate statistical methods, and long-term reliable observations are all required. The latter point clearly links to the second question, whereby extensions to the modern observational era are of potential benefit in the detection and attribution of Antarctic climate and environmental change.
In this section we first outline the different ways in which anthropogenic signals have been detected across a range of variables. The role, or potential role, of long-term datasets in improving D&A for Antarctica is then highlighted along with relevance to improving climate change projections to 2100.

Anthropogenic Signals in Antarctic Variables
For many parameters the emergence of a detectable anthropogenic signal in Antarctica may not occur until the second half of the 21st century [142,143]. However, careful analysis of a range of variables has revealed a detectable human influence in many aspects of Antarctic and Southern Ocean climate. A number of these studies involve the use of a powerful methodology to consider differences in the spatial patterns of climate response to different forcings, known as 'optimal fingerprinting' [144,145].
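The core idea behind fingerprinting, estimating the amplitude of a model-simulated response pattern in the observations, can be sketched as a simple regression. This is a deliberately simplified illustration: the noise is assumed uncorrelated with equal variance, so the inverse-covariance weighting that makes the full method 'optimal' is omitted, and the fingerprint and control samples are synthetic.

```python
import random

def scaling_factor(obs, fingerprint):
    """Least-squares amplitude of a model fingerprint in the observations.

    Simplification: noise is treated as uncorrelated with equal variance, so the
    inverse-covariance weighting used in full optimal fingerprinting is omitted.
    """
    num = sum(x * y for x, y in zip(fingerprint, obs))
    den = sum(x * x for x in fingerprint)
    return num / den

def scaling_range(obs, fingerprint, control_samples):
    """Best-estimate scaling factor with a 5-95% range derived by projecting
    unforced control-run samples onto the same fingerprint."""
    beta = scaling_factor(obs, fingerprint)
    noise = sorted(scaling_factor(c, fingerprint) for c in control_samples)
    noise_lo = noise[int(0.05 * (len(noise) - 1))]
    noise_hi = noise[int(0.95 * (len(noise) - 1))]
    # Observed amplitude = true amplitude + noise, so subtract the noise range.
    return beta - noise_hi, beta, beta - noise_lo

# Synthetic example: observations are 1.5 times the model fingerprint.
fingerprint = [0.2, 0.5, 1.0, 0.8]
obs = [1.5 * f for f in fingerprint]
rng = random.Random(0)
control_samples = [[rng.gauss(0.0, 0.1) for _ in fingerprint] for _ in range(200)]
lower, beta, upper = scaling_range(obs, fingerprint, control_samples)
```

A range that excludes zero indicates a detectable signal, and a range that excludes one suggests that the model over- or under-estimates the amplitude of the response.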

Temperature
In the modern instrumental era an anthropogenic signal has been detected in observed spatial patterns of surface temperature change in terrestrial Antarctic observation stations [146]. Another variable showing success is Southern Ocean subsurface temperatures (warming), for which a response has been detected and recently attributed to GHG and stratospheric ozone change from the 1950s to now [145].
The reconstruction of Stenni et al. [98] complements Gillett et al. [146] by providing a longer-term perspective on different regions of Antarctica. They found that the Antarctic Peninsula is the only region in which the most recent century-scale trend is unusual in the context of natural variability over the last 2000 years. An ice core melt history from the Antarctic Peninsula also identifies the unusual acceleration in surface melting associated with regional warming, and validates expectations of a non-linear threshold response of surface melt to temperature change [147].

SMB
Regional composites [96] and spatial reconstructions [102] highlight the large regional differences in SMB across Antarctica. The Antarctic Peninsula has experienced dramatic increases in snow accumulation during the 20th century, related to changes in large-scale modes of atmospheric variability (especially SAM), tropical teleconnections and local sea ice conditions. This region also exhibits the largest multi-decadal variability over the past two centuries.
Simulations covering the past millennium display an increase in SMB over the past century similar to that reported in [96] for the whole continent but generally fail to reproduce the regional differences. This may be due to the large internal variability observed and simulated at regional scale. Models also simulate a strong link between large scale temperature and SMB variability over the past centuries, as derived from modern observations [105].
Although the large internal variability makes detection of a statistically-significant signal in SMB difficult, modeling suggests that stratospheric ozone depletion acting alone would drive an increase in Antarctic SMB [148]. However, sea ice extent appears to be an important factor in Antarctic precipitation trends [149] and the response of sea ice to stratospheric ozone depletion in models has been found to be highly model dependent [150]. Therefore, it will be important to evaluate the link between ozone depletion (and indeed other forcings) and SMB across a wider range of Earth System Models. Nevertheless, the anthropogenic impact on Antarctic surface mass balance will likely be masked by natural variability until mid-century [151].

Atmospheric Circulation
Some success has been documented in detecting and attributing change for a number of Antarctic variables. For sea-level pressure (SLP), it has been shown that decreases and associated geostrophic wind increases can be attributed to greenhouse gas, sulfate aerosol, stratospheric ozone, volcanic aerosol and solar forcing [10,152,153]. It should be noted that since these papers were published, reliable reconstructions of Antarctic surface pressure now exist back to 1905 [130] although the reconstruction does not extend over the surrounding oceans. Extending beyond regional circulation to hemispheric-scale trends in the SAM, robust and consistent changes are observed across observations, models and reconstructions since the mid-20th century [15,154]. These exceed model-generated internal variability, with the most prominent trends in summer attributed to stratospheric ozone depletion (e.g., [155,156]).

Sea Ice
Attempts to determine the Southern Ocean sea ice response to anthropogenic forcing have been hampered by the relatively short modern satellite record and the weak signal compared to simulated internal variability. There is also an apparent contrast between observed slight increases (a spatially-heterogeneous overall increase, most intense in the austral autumn) in Antarctic SIE and overall declines simulated by the majority of climate models in response to anthropogenic forcing [157][158][159][160]. Much of the observed trend in sea ice extent adjacent to West Antarctica is thought to be due to wind changes related to tropical Pacific variability [161,162], compounding the issue of identifying the true physical response to anthropogenic forcings. Ocean-sea ice feedbacks may be important for explaining the observed trends [163], and there is evidence that inadequate representation of the subsurface ocean could be a major model deficiency [164,165]. The lack of year-round ocean observations in the sea ice zone is a challenge for validating these model processes.
A number of approaches have been used to overcome these challenges. Proxy reconstructions have been used to extend the sea ice record throughout the 20th century, but an anthropogenic forcing response is still not formally detectable. Reliable proxies are not available for the region of greatest change (the western Ross Sea) or the seasons of greatest change (summer and autumn) [166]. A number of ice core proxies exist and the most widely used is methane sulfonic acid (MSA), which gives information about winter maximum sea ice extent [147,167]. MSA-based reconstructions have revealed a 20th century sea ice decline in the Bellingshausen Sea [168], while sea ice has increased in the Amundsen-Ross Sea over the same period [169]. Although these contrasting trends suggest a strengthening of the dipole between the Bellingshausen and the Ross Sea, comparisons with early satellite records suggest that proxy records from this region may not be reliable [166]. The meridional circulations in West Antarctica that drive much of the observed trends and variability are prone to east-west shifts, which means that the source location of proxy chemicals also shifts, leading to a confounding factor in reconstructing local sea ice changes; combining proxies from multiple sites may allow a reconstruction of the evolution of these zonal shifts. Seasonally, the most reliable proxy for inter-annual sea ice variability (MSA) is a proxy for winter sea ice extent and hence an alternative proxy would be required to reconstruct warm season sea ice variability.
A potentially useful approach to characterize sea ice trends is to evaluate trends in factors that may drive these trends. An example where the current generation of climate models shows promise, but also offers large areas for improvement, is in attributing the role of stratospheric ozone depletion on recent historic regional Antarctic sea ice trends. Ozone depletion has been demonstrated to be the largest driver of SH atmospheric circulation changes in the twentieth century [13,14], so the expectation is that its effect on Antarctic sea ice should be detectable. The largest regional trends in sea ice are located to the east and west of the ASL [8,9]. This regional low pressure feature has been offered as a potential explanation of the regional pattern of SIE trends [170] although the relationship between the ASL and sea ice is complex [171]. Despite the large internal variability in this region, stratospheric ozone depletion has been shown to cause a deepening of the ASL in austral summer [10]. Models are able to capture the inter-annual relationship between the ASL and autumn sea ice and yet the observed sea ice trends nearly diametrically oppose what the models project in response to stratospheric ozone depletion [172]. It is possible that internal variability overwhelms the sea ice response to the ozone hole [157,173,174] but another option is that climate models are lacking in their representation of ocean-atmosphere-sea ice coupling. Improvements in understanding of the climate variability in West Antarctica could be key for placing the response to stratospheric ozone depletion in appropriate context.

Incorporating Longer-Term Datasets in Antarctic D&A
The survey of different variables in the previous section shows that longer-term datasets are key to setting recent trends in a longer-term context. However, there are challenges in terms of developing proxy and station-based reconstructions with a small enough uncertainty for effective use in formal D&A methods, particularly in relation to sea ice where reconstructions require indirect relationships linking SIE to the proxy signal. If these challenges can be overcome, the most direct link to projections is the ability to quantify the degree to which a given climate model may over- or under-estimate historical responses to climate forcings (or a specific forcing). This can help to inform whether a given model may over- or under-estimate projected future change [139]. Important considerations are a changing mix of forcings, as stratospheric ozone begins to recover, and a range of response timescales (e.g., [175]). A key caveat is also that historical model trend biases may not be constant into the future.
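The link between historical scaling and projections described above can be written as a short sketch, in which a model's projected change is multiplied by a range of scaling factors estimated from the historical period. The function name and numbers are hypothetical, and the sketch adopts exactly the assumption flagged as a caveat above, namely that the historical bias persists unchanged.

```python
def constrain_projection(raw_change, beta_low, beta_best, beta_high):
    """Scale a model's projected change by observationally derived scaling factors.

    Assumes the historical over- or under-estimate persists unchanged into the
    future, which may not hold (e.g., as the mix of forcings changes).
    """
    candidates = [raw_change * b for b in (beta_low, beta_best, beta_high)]
    return min(candidates), raw_change * beta_best, max(candidates)

# Illustrative case: a model whose historical response appears about 20% too strong.
low, best, high = constrain_projection(
    raw_change=2.0, beta_low=0.6, beta_best=0.8, beta_high=1.0
)
```

The width of the constrained range depends directly on the uncertainty of the scaling factors, which is why reconstructions with smaller uncertainty would tighten the resulting projections.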
There are a number of possibilities going forward with regard to better integrating long-term datasets in Antarctic D&A studies. (i) The use of better methods to link large-scale and small-scale processes, for example using regional high-resolution atmospheric numerical models (including water isotopes in regional circulation models) to assess regional signals, in particular for interpolation of sparse data/signals. (ii) Atmospheric regional downscaling may prove especially useful in assessing how spatially representative local (ice/sediment core) signals are [98], especially in regions of strong climate gradients (e.g., the Antarctic Peninsula or West Antarctica). (iii) Better quantification of the contributions from fast and slow components of the Antarctic climate system (ocean intermediate and deep circulation, ice sheet dynamics and related glacio-isostatic adjustment), since there is a need to determine the extent to which observed temperature trends, and other variables such as ice-sheet velocities, calving rates and basal melt rates, are the result of anthropogenic climate change rather than long-term climate evolution inherited from the last deglaciation.
Relevance for improving projections
• Anthropogenic signals are still not detectable in sea ice and therefore it is difficult to evaluate robustness of projections
• Improved quantification of the role of the ozone hole in West Antarctic climate variability is a key foundation for improving projections of broader environmental change
• Improved detection and attribution scaling would provide a basis for scaling climate model projections

Conclusions
This paper presents an analysis of how long-term observational and paleoclimate data may help to improve Antarctic climate model projections for the 21st century. A key novelty of this study is the wide range of timescales, from instrumental reconstructions of the early 20th century back through different climate regimes to the EOT, 34 Ma. The rationale for this is to synthesize and compare the wide range of different ways in which longer-term datasets may be of value to climate model evaluation.
Trust in future projections of Antarctic climate relies to a large extent on whether climate models are able to reproduce past observable behavior of the climate system. However, this alone does not guarantee a realistic response to future forcing, since it is also important to include processes known to be, or likely to become, important for the aspect of climate being evaluated. Figures 2-4 show schematics that summarize the main points presented in the text, focusing on relevance for improving projections in the context of the state of the art (Figure 2), current challenges (Figure 3) and next steps in making improvements (Figure 4). Across the different time periods considered, key ways in which long-term climate reconstructions may help improve projections are:

• Reconstructions of past conditions are being used to identify climate model variants that best match past conditions and therefore provide the potential to narrow uncertainty in projections; wide participation in multi-model paleo-focused MIPs such as CMIP6-PMIP is encouraged.
• Improved paleo bathymetric data has the potential to better constrain past reconstructions and future simulations of freshwater fluxes from ice sheet melting, oceanic heat exchange between regional polar oceans and the open ocean, and impacts of freshwater release on the Southern Ocean.
• Recent progress in compiling long-term extended instrumental and paleo-proxy records is providing improved information on decadal-to-centennial variability of the Antarctic climate system, which may help to provide insight into the realism of the pronounced variability generated internally within the latest earth system models. There are both opportunities and challenges in assessing how drivers of variability (such as ENSO) influence climate indices such as the SAM and climatic conditions over Antarctica.
• To date, formal Antarctic D&A studies have focused on the modern instrumental era, but there is potential to incorporate longer-term datasets to help narrow the uncertainty range on detected signals.
• An overall recommendation that is not specific to the time periods or processes considered in this paper is for communities working on long-term Antarctic climate reconstructions to produce datasets for use in routine climate model evaluation. In this way, the paleoclimate information could be a more prominent part of the standard model development and testing cycle and feed directly into improving and developing the next generation of climate and earth-system models. A prominent example of a repository for observational data for use in model evaluation is the Obs4MIPs project (https://esgf-node.llnl.gov/projects/obs4mips/). Many aspects of the gridded reconstructions of Antarctic climate that have been generated as part of synthesis projects, such as PAGES Antarctica2k, could be adapted to conform to the formatting and uncertainty estimation requirements.
Looking to the future, our overall conclusion is that closer integration between the climate reconstruction and climate modelling communities would be of great benefit for improving the reliability of projections of future Antarctic climate change. This should be achieved through a combination of: (i) extending the range of climate model outputs to better match key climate proxies, (ii) feeding long-term reconstruction datasets into community standard climate modelling metrics, (iii) encouraging participation in PMIP or other project simulations as a high priority for model centers and (iv) regular workshops and meetings to encourage collaboration and sharing of tools and data across the modern and paleo climate communities. This approach is encouraged by SCAR, whose future scientific plan under development will follow an integrated research approach across the existing SCAR working groups to encourage multi-disciplinary research of past to future polar sciences and provide more reliable projections of Antarctic climate and ice sheet contribution to global mean sea level rise.

Table 1 and Figure 1. The vertical blue lines are used to indicate relevance across a range of time periods.