Intercomparison of Ground- and Satellite-Based Total Ozone Data Products at Marambio Base, Antarctic Peninsula Region

This study compares ground-based Brewer spectrophotometer total ozone column measurements with Dobson spectrophotometer measurements and various satellite overpass data available at Marambio Base during the period 2011–2013. This station provides a unique opportunity to study ozone variability near the edge of the southern polar vortex; therefore, many institutions, such as the National Meteorological Service of Argentina, the Finnish Meteorological Institute and the Czech Hydrometeorological Institute, have been carrying out various scientific activities there. The intercomparison was performed using total ozone column data sets retrieved from the ground-based instruments and from Ozone Monitoring Instrument (OMI) Total Ozone Mapping Spectrometer (TOMS), OMI Differential Optical Absorption Spectroscopy (DOAS), Global Ozone Monitoring Experiment 2 (GOME2), and Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY) satellite observations. To assess the quality of the selected data products, they were compared with single observations of the Brewer spectrophotometer, which served as the reference. The performance of the satellite observational techniques was assessed against the solar zenith angle (SZA) and effective temperature, as well as against the actual shape of the vertical ozone profiles, which represent an important input parameter for the satellite ozone retrievals. The ground-based Dobson observations showed the best agreement with the Brewer data set (R² = 1.00, RMSE = 1.5%); however, a significant SZA dependency was found. The satellite overpass data agreed well with the Brewer observations but overestimated them in all cases except for the OMI(TOMS); the mean bias ranged from −0.7 DU for the OMI(TOMS) to 6.4 DU for the SCIAMACHY. The differences in satellite observational techniques were further evaluated using statistical analyses adapted for depleted and non-depleted conditions over the ozone hole period.


Introduction
Stratospheric ozone is an important gas that attenuates harmful ultraviolet (UV)-B radiation and therefore protects life on Earth from damage to DNA structures and related negative health effects, e.g., [1,2]. However, within the area of the southern polar vortex, severe stratospheric ozone losses have been observed since the late 1970s, especially in September and October (e.g., [3][4][5]). The enhanced depletion has been explained by the chemical reaction between the ozone molecules and man-made chemicals, such as chlorofluorocarbons (CFCs), e.g., [6,7]. The use and manufacture of
Along with the above-mentioned global validations, there are far fewer detailed, site-based studies covering the Antarctic continent. For example, Evtushevsky et al. [26] and Bian et al. [27] assessed the performance of a selected satellite product against a Dobson spectrophotometer at the Vernadsky Station used ground-based Dobson and Brewer data from the Syowa and Zhongshan Stations, respectively, and compared different satellite products with the Brewer spectrophotometer installed at the Zhongshan Station [28]. A long-term comparison of two selected satellite data products with various ground-based instruments located in Antarctica was provided in [29]. Although these studies each cover only one site, they can be highly useful for explaining the behavior of satellite instruments under the extreme ozone and climatic conditions of Antarctica.
Ozone depletion can be well studied not only in continental Antarctica, but also on its coast, for example, on the Antarctic Peninsula. Climatologically, the Antarctic Peninsula Region is a unique environment, because its complex topography and the frequent alternation of ozone-rich subpolar air masses and ozone-poor air masses originating within the polar vortex lead to considerable variation not only in the total ozone column (TOC), but also in mean sea-level pressure and other variables. This great variability is especially difficult to represent using current chemistry-climate models, e.g., [30][31][32]. For this reason, and because ozone depletion and current climate changes are complex, interlinked processes, e.g., [33], both ground-based and satellite ozone observations are very important in this area. On the Antarctic Peninsula, only three stations have been equipped with a Brewer spectrophotometer: San Martin (data publicly available from 2002 to 2011), King Sejong (since 1999), and Marambio (since 2010).
This study aimed to compare the ground-based Brewer spectrophotometer ozone data with Dobson spectrophotometer measurements and various satellite data products in the highly specific Antarctic Peninsula Region, and to assess the performance of the selected data products with reference to the Brewer spectrophotometer TOC observations. The evaluation of various data products including the ground-based Dobson spectrophotometer is important with respect to other validation studies which use either Dobson or Brewer spectrophotometers as reference instruments. The intercomparison presented in this study was performed using the ozone data from the Marambio Base, Antarctic Peninsula Region, over the period 2011-2013. The performance of the satellite instruments was assessed not only against the solar zenith angle (SZA), the effective temperature (Teff), and the TOC, but also against the actual shape of the vertical ozone profiles, which represent an important input parameter for the satellite TOC retrievals. This study offers a comprehensive assessment of the different TOC monitoring methods, using all ground-based data sources available at the given location, including the highly precise measurements from a double-monochromator Brewer spectrophotometer.

Study Site
The Marambio Base (64.233° S, 56.623° W) is a permanent Argentinean research station founded in 1969 on Seymour Island, an approximately 100 km² island situated in the northeastern part of the Antarctic Peninsula Region at the edge of the Weddell Sea (Figure 1). The Marambio Base, located on a plateau at an altitude of 196 m a.s.l., provides an exceptional opportunity to study ozone variability near the edge of the southern polar vortex. Therefore, many institutions, such as the National Meteorological Service of Argentina, the Finnish Meteorological Institute, and the Czech Hydrometeorological Institute, have been carrying out various scientific activities there. These include monitoring of the TOC, vertical ozone distribution, and ultraviolet radiation by both ground-based instruments and ozone soundings [21,34].

Data and Instrumentation
The B199 double-monochromator Mk-III Brewer spectrophotometer (Kipp & Zonen, the Netherlands) was installed at the Marambio Base in February 2010 by the Czech Hydrometeorological Institute (CHMI). The instrument is regularly maintained by the CHMI and calibrated against the world traveling standard B017. TOC observations are available each year from mid-August to mid-April, because, in May to July (Antarctic winter), the SZA is too high to perform spectrophotometric measurements [21]. The fully automated double-monochromator Mk-III Brewer spectrophotometer currently provides some of the most accurate TOC measurements, which are widely accepted as the reference observations in intercomparison studies, e.g., [20,35]. Therefore, for this study, the B199 spectrophotometer was chosen as the reference instrument. In order to maintain the quality of the time series, only the most precise direct sun observations were considered for further analysis [36].
The second ground-based instrument used in this study was the D099 Dobson spectrophotometer, operated by the National Meteorological Service of Argentina. It has been providing TOC measurements at the Marambio Base since 1987. The instrument has been regularly calibrated against the Dobson standards in Boulder and Buenos Aires [34]. Due to the high SZA during the Antarctic winter, the D099 TOC measurements are only available from August to April. The precision of the direct sun Dobson observations is comparable to that of the Brewer, within ±1% [37], but the instrument is manually controlled and might suffer from temperature dependency [15]. Therefore, using only post-processed Dobson data is strongly recommended for the validation of satellite measurements [38], and the D099 ozone data were corrected for Teff based on B199 measurements. The suitability of the Teff calculation method and other relevant information can be found in [39]. In this study, the comparison of TOC measured by the B199 and the D099 spectrophotometers is used as a reference for the satellite TOC performance assessed against the B199 Brewer spectrophotometer observations. The studied satellite instruments include the Ozone Monitoring Instrument (OMI), the Global Ozone Monitoring Experiment 2 (GOME2), and the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY).
The OMI flies aboard the NASA Earth Observing System Aura satellite, which was launched in 2004. It is a visible and ultraviolet spectrometer which measures trace gases including nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone-depleting substances, and ozone (O3). The ground pixel size is 13 × 24 km² at nadir and 13 × 128 km² at 57°, which is the outermost swath angle [40]. The TOC can be derived from the OMI radiation measurements using several different algorithms, two of which are covered in this study. The OMI Total Ozone Mapping Spectrometer product, or the OMI(TOMS), is based on the TOMS algorithm [41]. Its principle is similar to that of ground-based spectrophotometers, in that the TOC is retrieved using a wavelength pair [28,42]. The OMI(TOMS) data were retrieved using the TOMS v8.5 algorithm, which is an extension of the TOMS v8 algorithm with an improved treatment of the effective cloud height. We have used the overpass data that have been filtered using a quality parameter to eliminate row anomaly problems occurring in the original data set [41,42]. A detailed description of the data product used (OMTO3) can be found on the following website: https://avdc.gsfc.nasa.gov/index.php?site=830165109. The second retrieval algorithm considered, OMI Differential Optical Absorption Spectroscopy (OMI(DOAS)), was developed by the Royal Netherlands Meteorological Institute and is described in detail by [43]. This retrieval algorithm differs from both the Brewer and the Dobson spectrophotometer principles, as it is based on the entire absorption spectrum. Using DOAS fitting, the amount of ozone along an average photon path is determined, which is then converted to a vertical ozone column via the air mass factor [28]. It should be noted that the OMI(DOAS) retrieval algorithm has strong dependencies between Teff or the SZA and the ozone cross sections [38].
Another difference between the TOMS and DOAS algorithms is the treatment of aerosols and clouds: the former uses an empirical correction, whereas the latter applies spectral fitting. A detailed comparison of the OMI(TOMS) and the OMI(DOAS) ozone algorithms is described in [42]. The overpass data product used in this study (OMDOAO3) is described in detail on the following website: https://avdc.gsfc.nasa.gov/index.php?site=962428764. All OMI data are corrected for temperature, while records affected by the row anomaly have been screened out with a time-dependent screen and quality parameter [42,43]. For the Marambio Base, the TOC from the OMI is available for the 2011-2013 study period from the beginning of August to the end of April (TOMS), or over the entire year with a limited number of daily TOC observations during the Antarctic winter (DOAS).
The GOME2 is an ultraviolet and visible spectrometer aboard MetOp-A, a satellite of the Meteorological Operational (MetOp) series that was launched in 2006. A GOME2 instrument is also installed aboard the MetOp-B satellite launched in September 2012, but, given the chosen study period, only data from the MetOp-A satellite were included in this study. The GOME2 provides information not only about ozone, but also about other gases such as water vapor, NO2, SO2, or ozone-depleting substances. The instrument's typical nadir spatial resolution is 80 × 40 km² [44]. TOC data from the GOME2 instrument were retrieved by the GDP 4.4 algorithm, which uses a two-step DOAS methodology [23,45]. In this study, the GOME2 overpass data from the following website were used: http://www.temis.nl/protocols/O3total.html. At the Marambio Base, TOC data measured by the GOME2 are available for the entire 2011-2013 study period from mid-August to the end of April.
The SCIAMACHY is an ultraviolet, visible, and near-infrared spectrometer designed to measure trace gases in the troposphere and the stratosphere, such as ozone, ozone-depleting substances, O2, and CH4. Based on the intensity of earthshine radiance, the pixel size varies between 26 × 30 km² and 32 × 930 km², being much larger in the case of high latitudes, especially in the winter months [46]. The instrument was installed aboard the ESA ENVISAT satellite and was operational from its launch in March 2002 until April 2012. In the Antarctic Peninsula Region, the measurements are available from the beginning of August to mid-May. As the instrument's measurements were discontinued in 2012, data are not available for the entire study period. TOC data from the SCIAMACHY instrument have been retrieved using the TOSOMI algorithm, which is based on the DOAS technique and is described in detail in [14]. The SCIAMACHY overpass data from the following website were used: http://www.temis.nl/protocols/O3total.html.

Methods of Data Analysis
TOC data for the 2011-2013 period, obtained by various instruments (see Section 3), were analyzed and intercompared. B199 single ozone observations were paired with the closest D099 and OMI measurements. In the case of the GOME2 and the SCIAMACHY, six-hour aggregated overpass data were used and coupled with the closest B199 ozone observation. The maximum lag between the B199 and other TOC observations was set to 30 min, because, within this threshold, no significant dependency in terms of TOC difference and lag time was found.
According to [47], the most commonly used distance limit between the ground-based instrument and the satellite overpass can reach up to 150 km; however, Kuttippurath et al. [29] stress the high variability of ozone in polar regions, so a shorter distance threshold was chosen. Therefore, only the satellite overpass data within a distance of up to 100 km from the Marambio Base were further processed. No significant dependencies were found between the performance of the various instruments and the distance within the given threshold, which also applies to the ozone hole period (see Appendix A). The number of pairs available for the intercomparison with the B199 instrument differed according to the data source, ranging from 195 observations for the SCIAMACHY to 499 pairs for the GOME2 (Table 1). The TOC observations obtained by the five different sources were compared with the B199, which was chosen as the reference instrument. Since the individual data products were not compared with each other, but only with the B199 Brewer spectrophotometer, all the available pairs of TOC values (Table 1) were used for this comparison. Various aspects of the TOC measurement performance have been studied, such as the general agreement of the studied TOC data sets, the relationship between this agreement and the variables that may affect it, the differences between ozone hole and non-ozone hole conditions, including the shape of the vertical ozone profiles, and the extreme cases of TOC variability between the selected data sets. For all statistical testing, α = 0.05 was chosen as the level of significance.
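The collocation criteria described above (a time lag of at most 30 min and an overpass distance of at most 100 km) can be illustrated with a minimal Python sketch. The function names, the tuple layout, the spherical-Earth haversine distance, and the choice to look up the closest Brewer observation for each overpass are assumptions made for illustration; they are not taken from the processing chain used in this study.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0        # mean Earth radius (spherical approximation)
MARAMBIO = (-64.233, -56.623)   # station latitude and longitude in degrees

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def collocate(brewer, overpasses, max_lag=timedelta(minutes=30), max_dist_km=100.0):
    """Pair each overpass with the closest-in-time Brewer observation.

    brewer:     list of (time, toc_du)            (hypothetical layout)
    overpasses: list of (time, toc_du, lat, lon)  (hypothetical layout)
    Returns (brewer_toc, satellite_toc) pairs that satisfy both limits.
    """
    pairs = []
    for t_sat, toc_sat, lat, lon in overpasses:
        if haversine_km(MARAMBIO[0], MARAMBIO[1], lat, lon) > max_dist_km:
            continue  # overpass centre too far from the station
        if not brewer:
            continue  # no reference observations to pair with
        t_ref, toc_ref = min(brewer, key=lambda obs: abs(obs[0] - t_sat))
        if abs(t_ref - t_sat) <= max_lag:
            pairs.append((toc_ref, toc_sat))
    return pairs

# Made-up example values, for illustration only.
brewer_obs = [(datetime(2011, 10, 3, 12, 0), 300.0),
              (datetime(2011, 10, 3, 13, 0), 310.0)]
sat_obs = [(datetime(2011, 10, 3, 12, 20), 305.0, -64.5, -56.0),    # kept
           (datetime(2011, 10, 3, 14, 0), 320.0, -64.5, -56.0),     # lag too long
           (datetime(2011, 10, 3, 12, 10), 299.0, -62.233, -56.623)]  # too far away
example_pairs = collocate(brewer_obs, sat_obs)
```

Filtering on distance before searching in time keeps the two thresholds independent, so either limit can be tightened without touching the other.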
The overall and monthly agreement of the TOC measured by the B199 and the selected TOC data sets was first studied using Student's t-test (α = 0.05), and by the determination coefficient R², which gives the proportion of variability shared by the studied variables. Bias, the mean absolute error (MAE), and the root mean square error (RMSE) for each of the data sets were also calculated for every month (Equations (1)-(3), respectively), and their changes throughout the year were analyzed.
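Equations (1)-(3) are not reproduced in this text, so the following Python sketch uses the common textbook definitions of the bias, the MAE, and the RMSE. Because the RMSE is reported in percent in this study while the bias and the MAE are given in DU, the sketch assumes a relative RMSE normalized by the reference TOC; this normalization, like the function name and the dictionary keys, is an assumption.

```python
import math

def agreement_stats(reference, candidate):
    """Bias and MAE in DU, relative RMSE in percent, and R² for paired TOC values."""
    n = len(reference)
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    # Assumed form: RMSE of the differences relative to the reference TOC.
    rmse_pct = 100.0 * math.sqrt(sum((d / r) ** 2 for d, r in zip(diffs, reference)) / n)
    # Determination coefficient as the squared Pearson correlation.
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    sxy = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    sxx = sum((r - mean_r) ** 2 for r in reference)
    syy = sum((c - mean_c) ** 2 for c in candidate)
    r2 = sxy ** 2 / (sxx * syy)
    return {"bias_du": bias, "mae_du": mae, "rmse_pct": rmse_pct, "r2": r2}

# Made-up values, for illustration only.
stats = agreement_stats([200.0, 300.0, 400.0], [202.0, 297.0, 404.0])
```

Grouping the pairs by calendar month before calling the function would give the monthly statistics discussed in the text.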
Next, the ratios between the TOC from the available data sources and the TOC measured by the B199 were calculated for each available pair of values. These ratios and their variability over the year were assessed using basic statistical characteristics, and their mean value was tested against 1 (the ideal ratio) using the t-test.
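The test of the mean ratio against 1 can be sketched as a one-sample t-statistic. To stay within the Python standard library, the p-value below uses the normal approximation to the t distribution, which is reasonable for the several hundred pairs available here but is an assumption of this sketch rather than the procedure documented in the study; the function name is hypothetical.

```python
import math
from statistics import NormalDist, fmean, stdev

def ratio_test_against_one(reference, candidate, alpha=0.05):
    """One-sample test of the mean candidate/reference TOC ratio against 1.

    Returns (mean_ratio, t_statistic, approximate_p_value, is_significant).
    """
    ratios = [c / r for r, c in zip(reference, candidate)]
    n = len(ratios)
    mean_ratio = fmean(ratios)
    standard_error = stdev(ratios) / math.sqrt(n)
    t_stat = (mean_ratio - 1.0) / standard_error
    # Two-sided p-value from the standard normal distribution (large-sample approximation).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return mean_ratio, t_stat, p_value, p_value < alpha

# Made-up values: one product that overestimates by about 5%, one that is unbiased.
ref = [300.0] * 5
biased = ratio_test_against_one(ref, [312.0, 315.0, 318.0, 315.0, 315.0])
unbiased = ratio_test_against_one(ref, [303.0, 297.0, 306.0, 294.0, 300.0])
```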
Further, the relationship between the TOC data sets and different variables, which can affect the performance of the satellite instruments, was studied. The dependencies of the TOC ratios on the SZA, the TOC measured by the B199, and Teff were considered using linear regression, Pearson correlation, and the determination coefficient R². In order to analyze the individual roles of explanatory variables, partial correlation with the exclusion of other variables' effects was computed.
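A partial correlation that excludes the effect of the other explanatory variables can be computed with the standard recursive formula, as in the minimal Python sketch below. The function names are hypothetical, and the example uses constructed, mutually orthogonal series rather than real SZA, TOC, or Teff data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, controls):
    """Correlation of x and y with the linear effect of each control removed.

    Applies r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    recursively, removing one control variable at a time.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Constructed example: x and y are related only through the shared drivers z and w.
z = [1, 1, 1, 1, -1, -1, -1, -1]
w = [1, -1, -1, 1, 1, -1, -1, 1]
e1 = [1, 1, -1, -1, 1, 1, -1, -1]
e2 = [1, -1, 1, -1, 1, -1, 1, -1]
x = [a + b + c for a, b, c in zip(z, w, e1)]
y = [a + b + c for a, b, c in zip(z, w, e2)]
raw_r = pearson(x, y)                  # clearly positive
partial_r = partial_corr(x, y, [z, w])  # vanishes once z and w are controlled for
```

In the example, the raw correlation is driven entirely by the shared variables, which is the situation the partial correlation is meant to expose.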
According to the TOC measured by the B199, the available observations were divided into non-ozone hole (TOC measured by the B199 > 220 Dobson Units (DU)) and ozone hole conditions (TOC measured by the B199 ≤ 220 DU), and the TOC ratio differences were analyzed. Nevertheless, this division was arbitrary and provided no information about the actual shape of the ozone profile, which is an important parameter that can, via the assumed a priori profiles, affect the performance of satellite instruments, e.g., [16].
In order to address the limitations of the standard ozone hole definition, all measurements within the ozone hole period (the three-month period between September and November) were classified as having either a standard ozone profile (non-depleted) or a depleted profile shape. A depleted ozone profile was defined as having less ozone in the 15-20 km layer than in the underlying 10-15 km layer. As stated in [48], depleted profiles do not necessarily occur only on ozone hole days with a TOC below 220 DU. For example, no case of depleted-shape ozone profiles was found in the early Antarctic spring (August), but they were fairly common towards the end of the ozone hole period. At this point, the ozone in the upper parts of the profile starts to recover, but its amount remains low in the underlying layers, giving the entire profile its non-standard, depleted shape. Therefore, within the ozone hole period, profiles with a depleted shape can sometimes present higher TOC values than profiles with a standard shape which are depleted in their entire range. As seen in Appendix B, throughout the ozone hole period, depleted ozone profiles tend to occur at a lower SZA and a higher Teff than non-depleted ones, meaning they are most likely to occur during the later ozone hole phases. In order to distinguish between depleted and non-depleted ozone conditions, a methodology based on the artificial neural network classification of potential vorticity [48] was applied.
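The two classifications used in this section can be written down directly from their definitions, as in the short Python sketch below. It encodes only the 220 DU threshold and the layer-based definition of a depleted profile; it does not reproduce the neural network classification of potential vorticity from [48]. The function and argument names, and the use of partial ozone columns per layer, are assumptions.

```python
OZONE_HOLE_THRESHOLD_DU = 220.0

def is_ozone_hole(toc_b199_du):
    """Ozone hole condition: TOC measured by the reference Brewer at or below 220 DU."""
    return toc_b199_du <= OZONE_HOLE_THRESHOLD_DU

def is_depleted_profile(ozone_10_15_km, ozone_15_20_km):
    """Depleted shape: less ozone in the 15-20 km layer than in the 10-15 km layer.

    Both arguments are partial ozone amounts for the layer, in the same unit.
    """
    return ozone_15_20_km < ozone_10_15_km

def classify(toc_b199_du, ozone_10_15_km, ozone_15_20_km):
    """Return the two labels side by side, since they do not always coincide."""
    hole = "ozone hole" if is_ozone_hole(toc_b199_du) else "non-ozone hole"
    shape = "depleted" if is_depleted_profile(ozone_10_15_km, ozone_15_20_km) else "non-depleted"
    return hole, shape

# Made-up values: a depleted profile shape on a day with a TOC above 220 DU.
example_label = classify(240.0, 45.0, 30.0)
```

Keeping the two labels separate reflects the point made above: a depleted profile shape can occur on a day that does not meet the 220 DU criterion.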
The last part of the data analysis focused on the extreme TOC ratios, which were defined as the 10% of the ratios that most differed from 1 on the logarithmic scale. Therefore, in the cases of D099, OMI(TOMS), OMI(DOAS), GOME2, and SCIAMACHY, there were 39, 44, 46, 50, and 20 extreme ratios defined, respectively. The distribution of the extreme TOC ratios over the year and their values were examined with respect to the shape of the vertical ozone profile.
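The selection of the extreme ratios can be sketched in a few lines of Python: each ratio is ranked by the absolute value of its logarithm, so that an underestimation and an overestimation by the same factor count equally, and the top 10% are kept. Rounding the 10% share to the nearest integer is an assumption, although it reproduces the reported counts for the data sets whose pair numbers are given in the text (394, 444, 499, and 195 pairs give 39, 44, 50, and 20 extreme ratios). The function names are hypothetical.

```python
import math

def n_extreme(n_pairs, percent=10):
    """Number of extreme ratios: the given percentage of pairs, rounded to the nearest integer."""
    return (n_pairs * percent + 50) // 100

def extreme_ratios(ratios, percent=10):
    """Return the ratios that differ most from 1 on the logarithmic scale."""
    k = n_extreme(len(ratios), percent)
    ranked = sorted(ratios, key=lambda r: abs(math.log(r)), reverse=True)
    return ranked[:k]

# Made-up ratios: 0.90 lies further from 1 than 1.10 on the logarithmic scale.
sample = [1.00, 1.01, 0.99, 1.02, 0.90, 1.05, 1.00, 0.98, 1.03, 1.10]
sample_extremes = extreme_ratios(sample)
```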

Basic TOC Characteristics
The mean daily TOC for the period 2011-2013, as measured by the B199 Brewer spectrophotometer, was 264 ± 54 DU, with an absolute recorded minimum of 117 DU on 3 October 2011 (20:13 UTC) and an absolute recorded maximum of 402 DU on 6 November 2013 (18:18 UTC). As seen from Figure 2, ozone hole conditions with a TOC below 220 DU occurred at the Marambio Base at least once a year, usually during September, October, and November. However, on several days, the TOC also dropped close to the ozone hole values in August and December. Of the three studied years, the ozone hole of 2011 was the deepest and the longest-lasting (the last ozone hole day was recorded on 25 November 2011), but its onset was slow. On the other hand, the ozone holes of 2012 and 2013 had an earlier onset but were not as long-lasting (Table 2).

Comparison of the B199 TOC and the Selected Data Products
In this section, the agreement between the B199 and the selected data products is analyzed for each of the data products separately. Basic statistical characteristics were considered, and the relationships between the studied ratios and the SZA, the TOC measured by the B199, and Teff were examined. Ozone hole (TOC recorded by the B199 ≤ 220 DU) and non-ozone hole conditions were compared, and the shape of the ozone profiles (non-depleted and depleted) was considered. Depleted ozone profiles, which can also occur under non-ozone hole conditions with a TOC > 220 DU, were defined as those having less ozone in the layer between 15 and 20 km above the surface than in the underlying layer of 10-15 km above the surface. In the last part of this analysis, the extreme ratios that are most different from 1 on the logarithmic scale (see Section 4) and their relation to depleted or non-depleted conditions were assessed. The statistical distribution of the studied ratios is shown in Figure 2. The basic statistical characteristics of the individual comparisons are included in Appendices B and C. All statistical tests were performed at a significance level of α = 0.05.

D099 Dobson Spectrophotometer Observations
TOC measurements from both the B199 and the D099 spectrophotometers were available for the period 2011-2013 in 394 cases. The agreement was very good (Figure 3): the TOC difference did not exceed 7%, the overall bias was 0.5 DU, the MAE was 2.9 DU, the RMSE was 1.5%, and the data sets shared over 99% variability. Although the 0.5 DU bias was rather small, it was statistically significant, but the mean D099/B199 ratio did not differ from 1. A significant mean TOC overestimation was found in February, March, November, and December, while an underestimation was observed in January. However, the ratio between the studied data sets was different from 1 in all months except for April and September (Figure 4b). The monthly error statistics and the TOC correlation (Figure 5a-d) do not point to any clear pattern, with a good agreement between the B199 and D099 data sets in the Antarctic spring, summer, and autumn.
Throughout the ozone hole period, the mean D099/B199 ratio differed significantly under ozone hole and non-ozone hole conditions, being significantly lower than 1 under ozone hole conditions (TOC ≤ 220 DU) and higher than 1 when the TOC measured by the B199 was higher than 220 DU. There was no significant difference in the mean D099/B199 ratio between the depleted and non-depleted ozone profiles; in both cases, the mean ratio was not different from 1 (Figure 6a). The D099/B199 ratio showed a significant negative relationship with the SZA and a positive correlation with the TOC measured by the B199 (Figure 7a-b), which explained 8% and 12% of the ratio variability, respectively. No relationship between the ratio and Teff was found (Figure 7c), which can be explained by the use of D099 data corrected for Teff using the B199 measurements. However, when the individual role of these variables was assessed, a significant correlation was found between the ratio and all three studied variables; the role of the SZA was greatest, followed by Teff. Therefore, the D099 data correction for Teff probably obscured the dependency, which appeared again when excluding the effect of the TOC and the SZA. Nevertheless, in the non-ozone hole period, with the exclusion of other variables' effect, the ratio was only significantly correlated with the SZA.
Of the 39 most extreme D099/B199 ratios, 59% were lower than 1 (a large underestimation of the TOC measured by the B199), and 41% were greater than 1. Most of the D099/B199 extreme ratios were recorded during the Antarctic spring, especially in September and October, but mostly when the ozone layer was not depleted. Some extreme ratios were also found in the Antarctic summer and autumn (Figure 8). This is likely linked to the relationship between the D099/B199 ratio, the SZA, and the TOC, with the agreement being worse on days with a high SZA and a low TOC.
The worse agreement between the Brewer and Dobson spectrophotometers at a high SZA was already described in the 1980s and 1990s, e.g., [49,50]. As explained in [51], the increased bias in high SZA observations is partly due to increased radiation scattering in the atmosphere and partly due to the contamination of measured spectra by diffuse radiation within the instrument, whose relative proportion increases as the sun approaches the horizon. This then leads to ozone underestimation, because, when the SZA is high, ozone affects diffuse radiation less than direct radiation. Moreover, simplifications within the standard retrieval algorithm lead to a systematic error, causing TOC underestimation to be most pronounced when the SZA is high and under low-ozone conditions, which are specific to the Antarctic region. Moeini et al. [25] also stressed the effect of a long ozone slant path in the case of high SZA direct sun measurements. The temperature dependency of the Brewer and Dobson spectrophotometers has also been described, e.g., in [49,52], where a relatively higher underestimation of the TOC at lower Teff was observed for the Dobson spectrophotometer compared to the Brewer instrument. In this study, this phenomenon, which was clearly visible when using raw D099 measurements, was addressed by the Teff correction. However, even the Brewer spectrophotometer bias during the TOC observations increased by up to 1.5% at low effective temperatures [52], which could be a possible reason for the negative relationship found between the D099/B199 ratio and Teff when excluding the effect of other variables. The differences found between the B199 and the D099 TOC observations reinforce the choice of the reference instrument for satellite data intercomparison.

OMI(TOMS) Total Ozone Satellite Retrievals
Throughout the period 2011-2013, there were 444 cases with a daily TOC available from both the B199 and the OMI(TOMS). The general agreement was slightly worse than in the case of the D099 (Figure 3), although the overall bias remained lower than 1 DU in absolute value. The TOC difference reached up to 10%, the bias was −0.7 DU, the MAE was 4.3 DU, the overall RMSE was 2.4%, and the data sets shared 99% variability. The mean TOC measured by the OMI(TOMS) was significantly lower than the TOC recorded by the B199, and the mean ratio differed from 1. However, systematic underestimation was only significant in February and September, while, in other months, the difference was not significant (Figure 4c). A period of increased ratio variability can be observed in October and November, which, at the Marambio Base, are the months with the highest variation in TOC records. In the ozone hole period, the MAE and the RMSE were also higher than in the Antarctic summer, but the error statistics increased again in April (Figure 5; Appendix C). The variability shared between the two data sets only dropped below 95% in April (Figure 5d), but, in this month, there were only 10 available pairs of OMI(TOMS) and B199 measurements.

No significant difference was found between the ratios under ozone hole and non-ozone hole conditions, but a difference was observed between depleted and non-depleted profiles (Figure 6). The ratio did not differ from 1 when the ozone profile was depleted, but it was significantly lower under non-depleted conditions. There was a negative correlation of the OMI(TOMS)/B199 ratio with the SZA (Figure 7), and, since the mean SZA was higher under non-depleted conditions (Appendix B), SZA dependency is a likely explanation of the difference between the ratios under depleted and non-depleted conditions. The correlation between the ratio and the TOC measured by the B199 was significant but weak; the same applies to the correlation with Teff. Considering the individual role of the explanatory variables (Table 3), only the effect of the SZA was statistically significant. The relationship between the ratio and Teff was especially weak, even when excluding the effect of other variables. This can be explained by the nature of TOMS-like algorithms, which are designed so as to be not particularly sensitive to Teff [43].
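Assessing the individual role of each explanatory variable amounts to a partial correlation: the correlation between the ratio and one variable after the linear effect of the others has been removed. The sketch below uses the standard recursive formula; it is an illustration only, and the function names `pearson` and `partial_corr` are hypothetical rather than taken from the study.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, controls):
    """Correlation of x and y with the linear effect of `controls` removed.

    Recursive formula: r_xy.zW = (r_xy.W - r_xz.W * r_yz.W)
                                 / sqrt((1 - r_xz.W^2) * (1 - r_yz.W^2)),
    where W is the remaining list of control variables.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For example, `partial_corr(ratio, sza, [toc, teff])` would give the SZA effect with the TOC and Teff held fixed, where `ratio`, `sza`, `toc`, and `teff` are assumed to be equal-length lists of paired values. The formula is undefined when a control variable is perfectly correlated with `x` or `y`.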
Of the 44 studied OMI(TOMS)/B199 extreme ratios, 73% were lower and only 27% were higher than 1. Most of the extreme values were recorded during the ozone hole period, documenting the increased variability in the instruments' agreement during these three months. Due to the SZA dependency, more extreme ratios were observed during non-depleted conditions (Figure 8).
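The shares of extreme ratios above and below 1 can be tallied as in the following sketch. The rule used to select the extremes is not stated in this section; the count quoted (44 of 444 pairs) is consistent with roughly 10% of the pairs, so the sketch assumes that the extremes are the 10% of ratios lying farthest from 1. Both this rule and the function name `extreme_ratio_summary` are assumptions made for illustration.

```python
def extreme_ratio_summary(ratios, fraction=0.10):
    """Summarize the most extreme satellite/Brewer ratios.

    Assumption: 'extreme' means the `fraction` of ratios farthest from 1.
    """
    if not ratios:
        raise ValueError("ratios must not be empty")
    k = max(1, round(fraction * len(ratios)))
    # Sort by distance from 1 (largest first) and keep the k most distant ratios.
    extremes = sorted(ratios, key=lambda r: abs(r - 1.0), reverse=True)[:k]
    above = sum(1 for r in extremes if r > 1.0)
    below = sum(1 for r in extremes if r < 1.0)
    return {
        "n_extreme": k,
        "share_above_1": above / k,
        "share_below_1": below / k,
        "min_ratio": min(extremes),
        "max_ratio": max(extremes),
    }
```

Grouping the selected extremes by month or by profile shape before counting would give the kind of breakdown shown in Figure 8.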
Other studies [15,18,28] also found that, in southern mid-latitudes, OMI(TOMS) agreement with ground-based instruments was in the range of 0 to −2%, with a large variability of ±6%. On the other hand, the OMI(TOMS) results reported by [29] revealed much spatial variation, with an overestimation of the ground-based measurements at the Marambio Base. However, in [29], the D099 Dobson spectrophotometer was used as the reference instrument, so the overestimation might have been caused not only by OMI(TOMS) long-term instability, but also by the D099 spectrophotometer SZA and Teff dependencies (see Section 5.2.1). When raw D099 data, which were not corrected for Teff, were used for comparison purposes for the period 2011-2013, an OMI(TOMS) overestimation of ozone of about 2% on average was found. Correction for Teff significantly reduced the differences between the TOC measured by the B199 and the D099; therefore, when using D099 ozone data corrected for Teff, as measured by the B199, the OMI(TOMS) did not present a significant overestimation of the TOC.
Similar to this study, McPeters et al. [18] and Zhang et al. [28] found no dependency between the ground-based TOC and the performance of the OMI(TOMS), especially under low-ozone conditions. However, since the vertical ozone profile shape is not fully determined by the ozone amount [48], differences between the agreement under depleted and non-depleted conditions were found in this study. As no correlation was found between the OMI(TOMS) data and Teff, the differences in agreement under depleted and non-depleted conditions were more likely due to the SZA dependency of the OMI(TOMS) ozone retrievals, as also described in [15,53].

Table 3. Partial correlation between the explanatory variables and the ratio between the B199 and the studied data sources at the Marambio Base for the period 2011-2013. Asterisks indicate statistically significant partial correlations.
Column headings: All Data; Non-OHP.

OMI(DOAS) Total Ozone Satellite Retrievals
Although the OMI(DOAS) data were obtained using the same instrument as the OMI(TOMS) time series, there are many differences between these two data sets, as described, e.g., by [15], which also lead to a different agreement with the B199 observations. In the case of the OMI(DOAS), there were 456 pairs available for TOC comparison with the B199 data. The general agreement between these two data sets was characterized by TOC differences reaching up to 18%, a bias of 3.1 DU, an MAE of 6.3 DU, an overall RMSE of 3.5%, and 98% shared variability. Compared to the B199 ozone retrievals, the mean OMI(DOAS) observations were significantly overestimated over the entire period, as well as in all individual months except for February and December (Figure 4d). The error statistics (Figure 5a-c) were rather consistent, except for increased errors in April. The correlation between the B199 and the OMI(DOAS) data sets was very high in the Antarctic spring, but declined in summer (especially in December and January), probably due to the rather low total variability of the measured TOC.
The OMI(DOAS)/B199 ratio differed significantly both between ozone hole and non-ozone hole conditions and between depleted and non-depleted profiles (Figure 6). In both cases, the ratio was higher than 1 (1.014 ± 0.033 for non-depleted conditions, and 1.025 ± 0.035 for depleted conditions), and the ratio difference was even larger between a TOC under and over 220 DU. The OMI(DOAS)/B199 ratio was only correlated with the TOC measured by the B199, not with the SZA or Teff (Figure 7g-i). The negative relationship between the ratio and the TOC explains the ratio difference between depleted and non-depleted conditions, because, throughout the ozone hole period, the mean TOC was significantly lower when the ozone profiles were depleted (Appendix B). When the individual roles of explanatory variables were assessed, the correlation between the ratio and the TOC measured by the B199 remained the strongest. However, a weaker, but still significantly negative, correlation emerged between the ratio and the SZA, which was likely caused by the negative correlation between the SZA and the TOC measured by the B199.
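Whether two groups of ratios (for example, pairs with a TOC under and over 220 DU) differ in their means can be checked with a two-sample test. This section does not name the test used in the study, so the sketch below assumes Welch's unequal-variance t-test; the function names and the default threshold argument are hypothetical.

```python
import math
from statistics import mean, variance


def split_by_threshold(ratios, toc, threshold_du=220.0):
    """Split ratios into pairs with a reference TOC below / at or above the threshold."""
    low = [r for r, t in zip(ratios, toc) if t < threshold_du]
    high = [r for r, t in zip(ratios, toc) if t >= threshold_du]
    return low, high


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two values")
    # Squared standard errors of the two group means (sample variances / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df
```

The returned statistic would then be compared with the critical value of the t distribution for the returned degrees of freedom at the chosen significance level.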
The 46 analyzed extreme values of the OMI(DOAS)/B199 ratios were spread across a rather large range (between 0.87 and 1.18), but the extreme values were higher than 1 in 87% of cases (Figure 8). This underlines the significant OMI(DOAS) overestimation of the TOC values obtained by the B199, even within the range of extreme values. Most OMI(DOAS)/B199 extremes were recorded over the ozone hole period, especially in October. Moreover, in October and November, more than 50% of the recorded extreme ratios occurred under depleted conditions.
An intercomparison of the TOC measured by ground-based instruments and the OMI(DOAS), including a site in high southern latitudes, was also performed by [15,27], who came to comparable conclusions, stating that the TOC obtained by the OMI(DOAS) was overestimated by 2-4% in relation to the ground-based instruments. The overestimations tend to be higher at an SZA of approximately 60° and larger, and, similar to the findings of this study, at an extremely low TOC. The dependency of the OMI(DOAS) algorithm on the TOC and Teff was also described by [52], who offered the SZA relationship as a possible explanation, along with the sensitivity of the algorithm to differences between real and assumed ozone profiles. Zhang et al. [28] also refer to the importance of the a priori ozone and temperature selection. In this study, this was confirmed by the significant differences in agreement under non-depleted and depleted ozone layer conditions. Another possible reason for the differences between the TOC measured by ground-based and satellite instruments could be the different ozone cross sections for satellite and ground-based retrievals, or the precision of the used absorption cross sections and their dependency on temperature [16]. However, the satellite retrievals take temperature dependency into account [16] and the ground-based data have been corrected for Teff, so ozone cross-sectional temperature dependency was likely not the primary reason for the data set differences. Moreover, Damiani et al. [16] suggest another potential source of disagreement between OMI(DOAS) and ground-based measurements, i.e., the problems with accurately computing the air mass factor, which is used to calculate the vertical column ozone from the slant observation (see Section 3).
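The role of the air mass factor (AMF) can be illustrated with a minimal sketch: the vertical column is the slant column divided by the AMF, so any relative error in the AMF carries over directly into the retrieved TOC. The geometric, plane-parallel AMF used below (the sum of the secants of the solar and viewing zenith angles) is a textbook simplification assumed here for illustration; operational DOAS retrievals derive the AMF from radiative transfer calculations, and the approximation degrades at the high SZA typical of polar observations. The function names are hypothetical.

```python
import math


def geometric_amf(sza_deg, vza_deg=0.0):
    """Plane-parallel geometric air mass factor: sec(SZA) + sec(VZA).

    Assumption: no scattering and no Earth sphericity, so the value is only
    indicative and becomes unreliable as either angle approaches 90 degrees.
    """
    if not (0.0 <= sza_deg < 90.0 and 0.0 <= vza_deg < 90.0):
        raise ValueError("angles must lie in [0, 90) degrees")
    return 1.0 / math.cos(math.radians(sza_deg)) + 1.0 / math.cos(math.radians(vza_deg))


def vertical_column(slant_column_du, amf):
    """Convert a slant column (DU) into a vertical column (DU)."""
    if amf <= 0.0:
        raise ValueError("air mass factor must be positive")
    return slant_column_du / amf
```

With an SZA of 60° and a nadir view, `geometric_amf(60.0)` is about 3, so a slant column of 900 DU corresponds to a vertical column of about 300 DU; an AMF that is 2% too low would raise the vertical column by about 2%.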

GOME2 Total Ozone Satellite Retrievals
The TOC measured by both the GOME2 and the B199 was available in 499 cases for the period 2011-2013. The relative TOC differences reached up to 21%, while the bias and the MAE were 4.2 and 6.2 DU, respectively. The overall RMSE was 3.6%, and the data sets shared 99% variability. The mean TOC, as retrieved from GOME2 observations, was higher than the TOC measured by the B199, and the overestimation was significant for all months except April and August. The TOC difference was most pronounced during high SZA months and the ozone hole period, with the fit being much better in the Antarctic summer (Figure 4e). The error statistics were also largest in October and April, i.e., months with a high SZA or with large ozone variability due to ozone depletion and recovery. The shared variability between the TOC measured by the B199 and the GOME2 stayed at or above 93% in all months except January and April, when it was lower (Figure 5).
There was a significant difference between the mean ratios under ozone hole and non-ozone hole conditions, with the ratio being higher when the TOC dropped below 220 DU. However, there was no significant difference between the mean ratio under depleted and non-depleted conditions (Figure 6); in both cases, the ratio was about 1.026. Similar to the case of the OMI(DOAS), the GOME2/B199 ratio was only correlated with the TOC measured by the B199, not with the SZA or Teff. A high SZA and a low Teff only resulted in greater variability of the studied ratio. As seen from Figure 7, the GOME2/B199 ratio was often very high under low-ozone conditions, in the case of both depleted and non-depleted ozone profile shapes. When assessing the roles of the explanatory variables separately (Table 3), only the TOC proved to be correlated with the GOME2/B199 ratio. By contrast, the effect of Teff or the SZA was weak and insignificant.
The 50 studied GOME2/B199 extreme ratios showed a similar pattern to the OMI(DOAS)/B199 extreme ratios, albeit more pronounced. It was found that 94% of the extreme ratios were larger than 1, with most of them recorded during the ozone hole period, especially in October. However, in the case of the GOME2, the majority of extreme ratios were recorded under non-depleted conditions, which can be explained by the fact that depleted profiles occur about 50% less frequently than non-depleted ones.
These results are in agreement with [24], who also showed that, around 60-65° S, the GOME2 overestimates the ground-based TOC observations in the Antarctic spring and autumn. However, they found up to 3% underestimation in the Antarctic summer, which was not detected at the Marambio Base in 2011-2013. In our study (e.g., Figure 4e), a lower mean ratio was reported in summer, but it was still higher than 1. In [24], a D099 was used as the reference instrument, but, even when using both raw and Teff-corrected data, no underestimation of the summertime TOC in the case of the GOME2 was seen. Therefore, a different study period or a different approach for correcting the raw D099 data was the most likely reason for this disagreement. Loyola et al. [23] also noticed the overestimation of ground-based TOC data, which was significant at an SZA > 60°. Similar to our results, Loyola et al. [23] also observed increased variability under low-ozone conditions, while [28] found an overestimation of the TOC on days with a low TOC. The reason for the GOME2's systematic inconsistencies with the B199 data is likely similar to the case of the OMI(DOAS), because the GOME2 retrieval algorithm is also based on the DOAS principle [23,45].

SCIAMACHY Total Ozone Satellite Retrievals
The last studied TOC data source was the SCIAMACHY, for which 195 cases overlapping with the B199 measurements were available in 2011-2012 (Figures 2 and 3). The year 2013 was not included because the SCIAMACHY data set was discontinued in the first half of 2012, when contact with the ENVISAT satellite was lost. The relative difference between the SCIAMACHY and the B199 ozone data sets reached up to 21%, while the bias was 6.4 DU, the MAE was 8.6 DU, the overall RMSE was 5.6%, and the data sets shared 98% variability.
Similar to the OMI(DOAS) and the GOME2, the mean TOC measured by the SCIAMACHY was significantly higher than in the case of B199 observations; the SCIAMACHY/B199 ratio was significantly higher than 1 for all months except February, April, and August (Figure 4f). Out of all the studied data sources, the error statistics for the SCIAMACHY (Figure 5) showed the clearest annual cycle, with a better fit in the Antarctic summer and a worse agreement among the studied data sets in high SZA months and over the ozone hole period (i.e., September, October, and April). The determination coefficient, however, was very similar to other DOAS-type algorithm data products assessed within this study, being high in the Antarctic spring and decreasing over the summer, especially in January.
The mean SCIAMACHY/B199 ratios were higher than 1 and significantly different from each other under ozone hole and non-ozone hole conditions, and also when the ozone profile had a depleted or non-depleted shape. When the TOC dropped below 220 DU, the mean SCIAMACHY/B199 ratio increased to approximately 1.08, whereas it was about 1.03 when the TOC was higher. By contrast, the SCIAMACHY/B199 ratio was significantly higher under non-depleted rather than depleted conditions (Figure 6). This is most likely due to the fact that only one ozone hole season was captured by both the SCIAMACHY and the B199, and, in this season, the difference between the mean TOC for depleted and non-depleted profile shapes was not statistically significant. These ratio differences can be explained by the rather strong negative dependency of the SCIAMACHY/B199 ratio on the TOC measured by the B199 (Figure 7, R2 = 57%). Under depleted conditions, a significant relationship was found between the ratio and the SZA; otherwise, no significant correlation with the explanatory variables was found. The same pattern emerged when assessing the effects of each variable separately (Table 3); in most cases (except under depleted conditions over the ozone hole period), only the TOC measured by the B199 affected the SCIAMACHY/B199 ratio.
All of the 20 assessed SCIAMACHY/B199 extreme ratios over the studied period were higher than 1 (Figure 8), indicating that the SCIAMACHY overestimated the TOC measured by the B199, even in extreme cases. Most of the SCIAMACHY/B199 extreme ratios were recorded over the ozone hole period; only one was observed in April. Regarding the shape of the vertical ozone profile, the extreme ratios mostly occurred under non-depleted conditions, albeit in November.
Eskes et al. [14] also assessed the SCIAMACHY performance at several Antarctic stations, with the RMSE ranging from 5.3% to 7.1%, similar to the results presented in this study. A significant overestimation of SCIAMACHY retrievals from across Antarctica, especially under low-ozone conditions in the Antarctic spring, was also found [14,28]. Similarly, Koukouli et al. [24,54] documented a 0.5-2.0% overestimation of the ground-based TOC in the Antarctic spring by the SCIAMACHY. The inconsistency with the B199 TOC was likely caused by similar reasons as in the case of the OMI(DOAS), because SCIAMACHY TOC retrieval employs the DOAS method [14]. The differences between the SCIAMACHY and the other satellite instruments in their consistency with the B199 (e.g., a very pronounced annual cycle of error statistics) presumably also originated from the different study period applied for this data source. Nevertheless, the general characteristics of SCIAMACHY and B199 agreement, with a better fit in the Antarctic summer and large variability across the ozone hole period, are in accordance with previous studies and also with the other studied satellite DOAS-based data products.

Summary and Conclusions
This study aimed to compare TOC data sets from different platforms available at the Marambio Base: the ground-based Brewer B199 spectrophotometer, the Dobson D099 spectrophotometer, and OMI(TOMS), OMI(DOAS), GOME2, and SCIAMACHY satellite overpass data. Based on the B199 observations, a full, comprehensive comparison of all available data products at the Marambio Base was performed for the first time. In this study, the overall consistency with the B199 measurements for the 2011-2013 period was considered, as well as the relationship between the instruments' performance and the SZA, B199 ozone data, effective temperature (Teff), and the shape of vertical ozone profiles.
The agreement between the B199 ozone observations and the selected data products was very good, with mean differences of up to ±2-3%. The ground-based D099 spectrophotometer showed the best agreement with the B199 data, likely owing not only to the similarity between the measurement principles, but also to the temperature correction that used Teff data retrieved by the B199 spectrophotometer. However, the agreement of D099 data with the B199 depended significantly on the ozone amount and the SZA. When the effect of other variables was excluded, a relationship between the agreement and Teff emerged.
Among the available satellite data products, the OMI(TOMS) was in the best agreement with the B199 Brewer spectrophotometer measurements. The likely reason for this very good agreement could be the similarity between the OMI(TOMS) total ozone retrieval algorithm and the Brewer spectrophotometer principle for ozone observations. This instrument generally had a good fit with a mean difference of less than 1%. Nevertheless, there was still a significant difference between the fit under depleted and non-depleted conditions, which could be explained by the dependency of the OMI(TOMS) agreement with B199 data on the SZA.
The other satellite data products in this study were retrieved using algorithms based on DOAS, in contrast to those used by ground-based spectrophotometers. All these data products (i.e., the OMI(DOAS), GOME2, and SCIAMACHY) showed a systematic overestimation of the satellite ozone retrievals in relation to the B199 measurements over the entire period, and this overestimation became significantly stronger with a decreasing amount of ozone. No, or few, significant SZA or Teff dependencies were found in the case of the DOAS algorithms, although this retrieval algorithm is known to be highly sensitive to both parameters. Therefore, the large overestimation of satellite-based data during low-ozone conditions could not be attributed to the SZA or Teff. This means that there is more likely another cause of B199 and DOAS disagreement, such as the air mass factor, ozone cross-sectional accuracy in polar regions, especially in Antarctica, or the difference between the assumed and actual vertical ozone and temperature profiles used for TOC retrievals. This opens up possible space for future research, along with, for example, the relationship between the instruments' performance and surface albedo or the precise overpass location with regard to the edge of the southern polar vortex.

Acknowledgments:
The authors would like to thank the NASA AURA Validation Data Center for providing the ozone data products retrieved from OMI measurements, and R.D. McPeters for advising on the characteristics of the used OMI data sets. The authors would also like to thank: R. J. van der A and M. A. F. Allaart for providing the assimilated ozone column data available at http://www.temis.nl/protocols/; the WMO-GAW, WOUDC, and Marambio Base operators for providing the ozone column data at http://www.woudc.org/; and R. Sanchez for the single D099 measurements. The GOME2 O3 level 2 data products used in the assimilated GOME2 ozone data were delivered by DLR/ACSAF/EUMETSAT. Furthermore, the authors would like to thank I. Petropavlovskikh and H. Rieder for discussing our ideas and motivating us in the course of this study.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Atmosphere 2019, 10
Figure A1. Relationship between (a) OMI(TOMS) and (b) OMI(DOAS) overpass distance from the Marambio Base and the ratio to the B199 ozone column records in 2011-2013; please note that none of these relationships was statistically significant (R2 for OMI(TOMS) was 0.00, 0.03, and 0.00 for non-depleted, depleted, and non-OHP conditions, respectively; and for OMI(DOAS) it was 0.03, 0.03, and 0.00 for non-depleted, depleted, and non-OHP conditions, respectively).