AATSR: Product Development and Intercomparison

Abstract: Models and observations show that the Arctic is experiencing the most rapid changes in global near-surface air temperature. We developed novel EASE-grid Level 3 (L3) land surface temperature (LST) products from Level 2 (L2) AATSR and MODIS data to provide weekly, monthly and annual LST means over the pan-Arctic region at various grid resolutions (1–25 km) for the past decade (2000–2010). In this paper, we provide: (1) a review of previous validation of MODIS/AATSR L2; (2) a description of the processing chain of L3 products; (3) an assessment of the 25 km products uncertainty, and; (4) a quantification of the bias introduced by over-representing clear-sky days in MODIS L3 products. In addition, we generated uncertainty maps by comparing L3 products with LST from passive microwave sensors (AMSR-E and SSM/I) and the North American Regional Reanalysis (NARR). Results show a close correspondence between MODIS and AATSR monthly products with a mean-difference (MD) of −1.1 K. Comparing L3 products with NARR indicates a close agreement in summer and a systematic bias in winter, which is entirely negative with respect to MODIS L3 (MD: −3.6, Min: −6.8, Max: −1 K). Comparing monthly averaged MODIS L3 to NARR clear-sky to quantify over-representing clear-sky days indicates a decrease of winter and an increase of summer difference compared to NARR all-sky. Finally, we provide suggestions to improve LST retrieval over Arctic regions.


Introduction
Land Surface Temperature (LST) is a key surface parameter needed to study energy and matter exchange near the land surface [1,2]. The successful retrieval of sea surface temperature from thermal infrared (TIR) observations has led to the development of LST retrieval algorithms [3]. For example, LST retrieval algorithms have recently advanced for the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor aboard the Terra and Aqua satellites and for the Advanced Along-Track Scanning Radiometer (AATSR) sensor on board the ENVISAT satellite, despite challenges related to variations in emissivity within pixels over heterogeneous landscapes [4]. The availability of satellite-derived LST allows for detecting changes in surface (skin) temperature on a regular basis (i.e., on weekly, monthly and annual time steps) [5] and for monitoring decadal trends [6,7].
Evidence of increased near-surface air temperature in the Arctic, at almost twice the rate of the global average [8,9], emphasizes the importance of monitoring the surface temperature of this vast and remote region from space. Although TIR data have been used to retrieve LST operationally [3,10], they have not been widely used to map and monitor LST in high-latitude regions [11,12]. In addition to the relevance of monitoring LST in the context of Arctic climate change, there is a growing interest by the permafrost community to integrate gridded LST data into spatially-distributed permafrost models for basin-scale and Arctic-wide simulations in order to evaluate the impact of changes in LST on near-surface permafrost temperature and the thickness of the active layer [13]. There is also interest by the climate modeling community in evaluating climate model output of surface temperature with LST products derived from satellite remote sensing [14,15].
Therefore, there is a clear need, identified by both the permafrost and climate modeling communities, for the production of LST datasets aggregated both spatially and temporally to meet these requirements [16]. Furthermore, international initiatives such as the newly established EarthTemp [17] are a further indication of the interest in acquiring temperature measurements on a regular basis for climate monitoring, including those retrieved from satellite remote sensing and their comparison with more conventional screen-height surface air temperature measurements from meteorological stations and atmospheric reanalysis data.
One recently completed initiative is the European Space Agency (ESA) Data User Element (DUE) Permafrost project aimed at building an Earth Observation (EO) system to monitor permafrost regions [18]. Although permafrost is a subsurface phenomenon, LST was identified, in addition to surface soil moisture, freeze and thaw status, surface water, land cover and topography, as a proxy to monitor the near-surface thermal state of permafrost [18,19]. Based on a survey involving the International Permafrost Association [20], two critical scales for LST products, regional (1 km) and pan-Arctic (25 km) scales at weekly, monthly and annual intervals, were identified. Maximum and minimum temperatures, temperature amplitude, and the number of satellite observations used to calculate mean values were requested as ancillary data to provide an indication of the quality of the LST products. It was also suggested to develop a scheme for the validation of LST products over permafrost areas, as most validation efforts of L2 MODIS or AATSR products have been conducted in mid-latitude regions over homogeneous agricultural fields [21–23] or large lakes [1]. A few recent studies, conducted within the context of the DUE Permafrost project, have evaluated existing L2 and L3 MODIS products in selected regions of the Arctic [12,24–26]. Although these studies have been useful in estimating the uncertainty of L2 and L3 MODIS (Collection 5) 1-km products, they were limited spatially to a few locations.
Through ESA's DUE Permafrost project, we developed novel LST L3 products using readily available L2 AATSR and MODIS unprojected products [27]. The new L3 products span the past decade (2000–2010) and are produced at the pan-Arctic scale, above 50 degrees north, on weekly, monthly and annual time steps. The specific objectives of this paper are to: (1) provide a brief review of the validation of L2 MODIS/AATSR products from previous investigations; (2) provide a description of the processing chain developed to produce L3 pan-Arctic products with a critical analysis of the aggregation method used; (3) describe the spatial and temporal bias patterns associated with the new L3 MODIS and AATSR pan-Arctic products at a 25-km grid resolution through an intercomparison with available near-surface temperature geophysical products; and (4) quantify the temperature bias of using only clear-sky observations to produce pan-Arctic LST products from MODIS.

Accuracy of Retrieved LST from MODIS
MODIS sensors were launched on 18 December 1999 and 4 May 2002 onboard NASA's Terra and Aqua satellites, respectively. The Terra satellite descends at the equator around 10:30 am, while Aqua's descending time is lagged by three hours [28]. Level 2 LST of MODIS (MOD11_L2/MYD11_L2) Collection 5 is retrieved from clear-sky pixels by considering the differential atmospheric absorption in two TIR channels (31 and 32) for images obtained at the same time [29]. The algorithm coefficients, which describe the relation between top-of-the-atmosphere brightness temperature and LST, are estimated by interpolating look-up table values. Band emissivity is assigned to various land cover classes as given in the MOD12Q1 product and with the aid of emissivity libraries [30]. LST retrieval from MODIS brightness temperature channels (MOD021KM) requires other products, namely the MODIS cloud mask (MOD35_L2) to define clear-sky pixels, the MODIS atmospheric profile (MOD07_L2) to determine the coefficients of the split window algorithm, and the MODIS land cover (MOD12Q1) and MODIS snow cover (MOD10_L2) products to assign emissivity values to pixels. A complete description is given in [31,32].
The reported accuracy of L2 MODIS LST varies between validation methods [32–34], land cover types, and observation periods (Table 1). The MODIS-derived LST product MOD11 (Collection 5), which is used in this project, has been validated using a radiance-based approach in the mid and south US states. The bias of L2 MODIS LST observations from LST estimates obtained with a radiative transfer model ranged from −0.8 to 0.1 K, with a general tendency of underestimation of LST values from MODIS [33].
Recently, more attention has been given to the validation of L2 and L3 MODIS products at high latitudes. For example, a temperature-based validation was conducted to assess the bias in weekly average L2 MODIS LST at two high Arctic tundra sites: one on Svalbard, Norway [25], and another at a polygon tundra site in northern Siberia, Russia [24]. In these two studies, L2 MODIS products were validated against thermal images taken using a video camera mounted on a 10-m mast. In summer, the bias was found to be ±2 K at the Svalbard site and between −0.5 and +2 K at the northern Siberia site. The bias was mostly attributed to a combination of a positive (warm) bias due to overrepresentation of warm clear-sky days and erroneously cold MODIS observations due to the presence of undetected clouds. However, in winter, Westermann et al. [26] reported a systematic negative bias of −3 K (from −6 to −1.5 K), which was attributed to overrepresentation of cold clear-sky days in winter in addition to contamination by undetected cold top-of-the-cloud temperatures.

Accuracy of Retrieved LST from AATSR
ENVISAT is a sun-synchronous satellite that descends at the equator at 10:00 am local time. Launched in March 2002, the satellite has a revisit period of two days at 70° latitude [35]. The satellite stopped operating in April 2012, after exceeding its life expectancy. AATSR is part of the payload of ENVISAT. This sensor is a multi-channel radiometer that images the earth surface in the thermal infrared wavelength at 1 km spatial resolution. AATSR is a continuation of the ATSR-1 and ATSR-2 sensors to monitor sea surface temperature (SST) from space. LST from AATSR brightness temperature data is derived operationally using a nadir split-window algorithm [3].
The split window algorithm coefficients are determined by regression of simulated datasets for each land cover class as given in Coll et al. [23]. The coefficients are retrieved for two scenarios, full coverage and bare surface, and the values for any intermediate surface coverage are interpolated. The land cover type is derived from the Dorman and Sellers [36] land cover map at 0.5 degree resolution, which influences the accuracy of the retrieved LST values [22]. Recently, Zeller et al. [37] showed that using the GlobCover biome map (300 m resolution) aggregated to a 1-km pixel dimension improves LST estimates. Emissivity is implicitly included in the retrieval algorithm and depends on the quality of the BIOME map classes and the vegetation cover fraction maps.
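The interpolation between the two coefficient scenarios can be illustrated with a minimal sketch. It assumes a linear weighting by vegetation cover fraction, which the text does not specify; the function name, the dictionary layout and the coefficient values are hypothetical and are not taken from the operational AATSR algorithm.

```python
def interpolate_coefficients(full_cover, bare_surface, veg_fraction):
    """Blend two coefficient sets for an intermediate vegetation cover fraction.

    Assumption: linear interpolation between the full-coverage and
    bare-surface scenarios (the source only states that values are interpolated).
    """
    if not 0.0 <= veg_fraction <= 1.0:
        raise ValueError("veg_fraction must lie between 0 and 1")
    return {
        name: veg_fraction * full_cover[name]
        + (1.0 - veg_fraction) * bare_surface[name]
        for name in full_cover
    }


# Hypothetical coefficient values, for illustration only.
full = {"a": 2.0, "b": 1.0}
bare = {"a": 4.0, "b": 3.0}
mixed = interpolate_coefficients(full, bare, 0.25)
```

With a vegetation fraction of 0.25, each blended coefficient sits three quarters of the way towards the bare-surface value.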
The accuracy of AATSR retrieved LST has been evaluated mostly at low latitudes and over homogeneous regions (Table 2). L2 AATSR observations were found to suffer from biases. For example, Noyes et al. [38] found that the operational L2 AATSR product tends to be warmer (colder) in summer (winter). A comparison of AATSR LST with average radiometric temperature measured in situ (corrected for emissivity and incoming long wave radiation) over flat homogeneous rice crop fields during the summers of 2002-2004 in Valencia, Spain, revealed that the operational LST product has an average bias of 3 K [23].
Adjusting land cover type and vegetation cover fraction to correct for misclassification, due to the utilization of the current coarse resolution land cover map (0.5 degree), improved the average bias to −0.9 K with a standard deviation of 0.9 K, while using an emissivity-dependent split window algorithm gave the most accurate results with an average bias of 0.3 K and a standard deviation of 0.9 K [23]. Similar results have been obtained over a longer period, 2002-2008, in the same study area [39]. Results indicate that the operational L2 AATSR 1-km product overestimates LST by 2 to 5 K when compared to ground radiometer measurements. A sensitivity analysis conducted by Coll et al. [1] indicated that misclassification of land cover is the main source of bias in the L2 AATSR product, followed by uncertainty in vegetation cover fraction. For example, changing vegetation cover fraction by ±10% resulted in an LST bias ranging from ±0.4 to ±1.2 K depending on the land cover class. Finally, a change of emissivity of 0.005 was found to cause LST to vary by ±1 K [1].

Pan-Arctic LST Products Development
The L3 pan-Arctic products (referred to hereafter as UW-L3) are generated from unprojected AATSR ATS_NR_2P products available through ESA's online archive MERCI [40]. Some L2 AATSR (ATS_NR_2P) products were found to suffer from an error consisting of a sequence of identical temperature regions. These images were identified and excluded from the calculation of the final products. In addition, images that have fewer than 100 pixels were excluded and flagged as "insufficient" data files. The UW-L3 products of MODIS are generated using Collection 5 L2 data acquired by both the Terra (MOD11_L2.5) and Aqua (MYD11_L2.5) satellites (Terra only, 2000-2002; Terra and Aqua, 2002-2010) [41]. Although MODIS quality information indicates the probability of "unseen" thin clouds, we decided to include all MODIS observations labeled as clear sky by the MODIS cloud mask. The spatial averaging over 25 km pixels (625 L2 observations per scan) during a month is likely to counterbalance the effect of MODIS cloud mask failures. However, we did evaluate this assumption using the MODIS products of year 2009.
We selected the Northern Hemisphere EASE-grid, Lambert's Equal Area Azimuthal projection, based on a sphere datum with a radius of 6,371.228 km, as the standard projection for the UW-L3 products. This projection was selected to match ESA's GlobSnow archive [42], snow and ice datasets available from the National Snow and Ice Data Center (NSIDC) [43] and DUE Permafrost products [18], thereby facilitating future combined use of snow and ice products with UW-L3 LST products over the Arctic. The initial grid was assigned a 1-km spacing similar to the observation intervals of L2 AATSR and MODIS products. Bilinear interpolation was used to interpolate irregular L2 observations to the center of the EASE-grid pixels (Figure 1). Local time for each EASE-grid pixel is calculated using the UTC acquisition time and longitude, extracted from the Annotation Data Set (ADS) information in the case of AATSR data and from the file name in the case of the L2 MODIS products. This method is limited to an accuracy of ±15 minutes, which is sufficient for the creation of weekly and monthly products.
The pan-Arctic UW-L3 products are generated through aggregation of interpolated L2 1-km data into a 25-km EASE grid. In the current version of the processing chain, spatial aggregation requires a minimum of 32 1-km observations (5% of total observations), which was found to be reasonable for estimating a clear-sky LST mean. Pixels with fewer than 32 1-km observations are discarded (assigned a "no data" value). Valid aggregated pixels are separated into either a day-time bin (from 6 am to 6 pm local time) or a night-time bin (from 6 pm to 6 am of the next day). The definition of day and night is not determined by the solar angle (i.e., number of hours of daylight and darkness). The day (night) LST mean is calculated as an arithmetic average of all pixels that fall in the day (night) bin during the week or month of interest. The final LST products are created by calculating a mid-range (balanced) mean of the day and night bins for the time period of interest in order to ensure that the calculated LST values are not biased towards the time period of the day with the greatest number of observations. Day average, night average, daily average and the number of clear-sky L2 observations (counts) are recorded in separate images for the period of interest (Figure 1).
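The aggregation steps described above can be sketched for a single 25-km cell as follows. This is an illustrative sketch rather than the processing chain itself: the function names are hypothetical, local time is approximated as UTC plus longitude divided by 15 degrees per hour, and returning no value when one of the two bins is empty is an assumption not stated in the text.

```python
from statistics import mean

MIN_OBS = 32  # minimum number of 1-km observations per 25-km cell (from the text)


def local_solar_hour(utc_hour, lon_deg):
    """Approximate local time in hours from UTC time and longitude (east positive)."""
    return (utc_hour + lon_deg / 15.0) % 24.0


def aggregate_cell(lst_1km):
    """Average the valid 1-km LST values of one 25-km cell, or return None."""
    valid = [t for t in lst_1km if t is not None]
    if len(valid) < MIN_OBS:
        return None  # "no data": too few clear-sky observations
    return mean(valid)


def balanced_mean(overpasses):
    """Mid-range mean of the day and night bins for one cell.

    overpasses: list of (local_hour, aggregated_lst) pairs within the
    week or month of interest.
    """
    day = [t for h, t in overpasses if t is not None and 6.0 <= h < 18.0]
    night = [t for h, t in overpasses if t is not None and not 6.0 <= h < 18.0]
    if not day or not night:
        return None  # assumption: both bins must contain data
    return 0.5 * (mean(day) + mean(night))


# Three day-time overpasses and one night-time overpass.
sample = [(10.0, 280.0), (11.0, 282.0), (13.0, 284.0), (22.0, 270.0)]
result = balanced_mean(sample)
```

In this example the balanced mean is 276.0 K, whereas a plain average of the four overpasses would give 279.0 K, which shows how the balanced mean avoids weighting the result towards the better-sampled day-time bin.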
The calendar month is used for calculating monthly UW-L3 products. For the weekly products, a sliding window (average of seven days) consisting of the date of interest and the six previous days is used. This follows the same convention adopted by the GlobSnow project for the creation of 25-km weekly snow water equivalent (SWE) products. The mean annual surface temperature (MAST) is then calculated as the average of the twelve UW-L3 monthly mean products per year for computational efficiency.
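A minimal sketch of the two temporal conventions follows. For brevity it averages one value per day, whereas the processing chain averages the day and night bins; the function names are hypothetical. The tolerance of two missing months is the rule stated for AATSR in the caption of Figure 11, applied here as a default.

```python
from statistics import mean


def weekly_mean(daily_lst, day_index):
    """Mean over the date of interest and the six previous days, ignoring gaps."""
    start = max(0, day_index - 6)
    window = [t for t in daily_lst[start:day_index + 1] if t is not None]
    return mean(window) if window else None


def mean_annual_surface_temperature(monthly_means, max_missing=2):
    """Average of the twelve monthly means, or None if too many months are missing."""
    if len(monthly_means) != 12:
        raise ValueError("expected twelve monthly means")
    valid = [m for m in monthly_means if m is not None]
    if 12 - len(valid) > max_missing:
        return None
    return mean(valid)


daily = [270.0, 271.0, 272.0, 273.0, 274.0, 275.0, 276.0, 277.0]
week_ending_day_7 = weekly_mean(daily, 7)
mast = mean_annual_surface_temperature([260.0] * 6 + [280.0] * 6)
```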

Quality Assessment of the UW-L3 25-km Pan-Arctic Products
Validating coarse-resolution (km to tens of km) pan-Arctic products is a challenging task [44]. Ground-based LST measurements are limited spatially. In addition, up-scaling of ground measurements (e.g., thermal radiometer measurements) is also challenging due to surface heterogeneity and the number of temperature instruments that can be deployed in remote high-latitude regions. An alternate approach for the evaluation of coarse resolution satellite products is through intercomparison with existing products of the same or similar physical quantities [45,46]. Uncertainty in UW-L3 pan-Arctic (monthly and annual) products was therefore estimated by intercomparing UW-L3 of AATSR and MODIS to each other and against existing surface and near-surface temperature products derived from the Advanced Microwave Scanning Radiometer for EOS (AMSR-E), the Special Sensor Microwave/Imager (SSM/I), and from the North American Regional Reanalysis (NARR). Products (summarized in Table 3) were projected to the same EASE-grid projection with 25-km spacing to match that of the UW-L3 products. Interpolation of NARR 32-km data to the 25-km EASE-grid resulted in some contaminated pixels along the shorelines of oceans and large lakes. A binary mask was therefore applied to remove all pixels within 25 km of the shorelines.
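One way to build such a shoreline mask on a 25-km grid is to discard every land cell that touches a water cell. The sketch below is only an approximation of the 25-km buffer: it assumes an 8-neighbour test (a diagonal neighbour is slightly farther than 25 km away), treats the grid edge as land, and uses a hypothetical function name and grid layout.

```python
def shoreline_mask(is_land):
    """Return a grid where True marks land cells with no water in their 8-neighbourhood.

    is_land: list of rows of booleans on the 25-km grid (True = land).
    Assumption: cells beyond the grid edge are not counted as water.
    """
    rows, cols = len(is_land), len(is_land[0])
    keep = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not is_land[r][c]:
                continue  # water cells are never kept
            neighbourhood = [
                is_land[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            keep[r][c] = all(neighbourhood)
    return keep


# First column is water, the remaining three columns are land.
grid = [[False, True, True, True] for _ in range(4)]
mask = shoreline_mask(grid)
```

Here the land column adjacent to the water is removed and the two inland columns are kept.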

Estimating Bias Introduced by Considering only Clear-Sky Observations
In order to explore the effect of including only clear-sky LST observations in the generation of UW-L3 products, in contrast to AMSR-E Ta, SSM/I and NARR products that include all-sky (clear- and cloudy-sky) observations, we first identified cloudy-sky observations at daily intervals in UW-L3 using the original cloud flags of L2 MODIS. The identified cloudy days were then used to produce a mask, which was applied to LST products from passive microwave (AMSR-E and SSM/I) and reanalysis (NARR) data to create clear-sky only monthly averages of these products.
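The masking step can be sketched for one pixel and one month as follows. The function name and the sample values are hypothetical; the cloudy flags stand in for the daily cloud information derived from L2 MODIS.

```python
from statistics import mean


def all_sky_and_clear_sky_means(daily_values, cloudy_flags):
    """Monthly all-sky mean and clear-sky-only mean of an all-sky product.

    daily_values: daily temperatures from an all-sky product (None = missing).
    cloudy_flags: True where MODIS flagged the day as cloudy for this pixel.
    """
    all_sky = [t for t in daily_values if t is not None]
    clear = [
        t for t, cloudy in zip(daily_values, cloudy_flags)
        if t is not None and not cloudy
    ]
    return (
        mean(all_sky) if all_sky else None,
        mean(clear) if clear else None,
    )


# Hypothetical winter values (K): the cloudy days are the warmer ones.
values = [250.0, 260.0, 252.0, 262.0]
flags = [False, True, False, True]
all_sky_mean, clear_sky_mean = all_sky_and_clear_sky_means(values, flags)
```

With these illustrative numbers the clear-sky mean (251.0 K) is colder than the all-sky mean (256.0 K), the situation expected in winter when clouds warm the surface.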

Statistical Estimation of Bias
UW-L3 weekly, monthly and annual products were evaluated against the AMSR-E Ta, SSM/I, and NARR temperature datasets by calculating the mean difference (MD) and root-mean-square-difference (RMSD) statistics. MD is an uncertainty measure that takes into account the direction of the difference, whether positive or negative, while RMSD is a measure that is sensitive to outliers and considers the magnitude of the difference without considering the sign. Both uncertainty measures were calculated by averaging the difference obtained at all pixels between the different datasets on corresponding time periods. We deliberately use the term difference rather than error to indicate that each dataset contains its own level of error. We prefer to reserve the term "error" for the validation of satellite-derived LST against LST measured with a thermal radiometer deployed in the field. Such values are reported in Tables 1 and 2, and in the corresponding Sections 2 and 3 for MODIS and AATSR, respectively. Differences of UW-L3 products with AMSR-E Ta, SSM/I and NARR were calculated over North America above 50 degrees latitude, while comparison between UW-L3 of MODIS and AATSR products was possible for all pixels included in the entire Arctic region.
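The two statistics can be written as MD = (1/N) Σ (A_i − B_i) and RMSD = sqrt((1/N) Σ (A_i − B_i)²), where A is the UW-L3 product, B the comparison dataset and N the number of pixels with data in both. The sketch below assumes that pixels missing in either dataset are skipped; the function name and sample values are hypothetical.

```python
import math


def md_rmsd(product, reference):
    """Mean difference and root-mean-square difference over co-located pixels.

    Differences are product minus reference, so a negative MD means the
    product is colder than the reference.
    """
    diffs = [
        p - r for p, r in zip(product, reference)
        if p is not None and r is not None
    ]
    if not diffs:
        raise ValueError("no co-located pixels with data")
    md = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return md, rmsd


# Hypothetical LST values (K) for three pixels.
md, rmsd = md_rmsd([270.0, 272.0, 268.0], [273.0, 273.0, 273.0])
```

Differences of opposite sign cancel in MD but not in RMSD, which is why the two measures are reported together.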

UW-L3 Products
Examples of UW-L3 products are presented in Figure 2. Corresponding UW-L3 products of AATSR and MODIS were found to follow similar LST patterns. Yet, LST products of MODIS were more spatially and temporally complete than those of AATSR (Figure 2) because the number of L2 MODIS observations is approximately 10-fold that of AATSR. MODIS sensors have the advantages of being onboard two satellite platforms, Aqua and Terra, in addition to having a larger swath than AATSR. Although at the moment the density of AATSR observations is low, in the future, the SLSTR (Sea and Land Surface Temperature Radiometer), which is a successor of AATSR, is to be included in the payload of the twin Sentinel-3 satellites. The satellite pair will enable a short revisit time of less than one day, which will restore the ratio between MODIS and AATSR/SLSTR observations.
The maximum count of L2 MODIS observations was found in the April to October period. Unexpectedly, this period is known to have the highest cloud cover in the high and low Arctic [52]. The minimum count occurred during the period from October to March (Figure 3), which is characterized by the existence of snow on the ground. Despite the high cloud fraction during the snow-free period (June-August), clouds can be easily detected during summer because of the large difference in reflectance between land and cloud. In contrast, snow cover interferes with the cloud masks [53,54] and causes more L2 observations to be rejected, and therefore the total number of observations to drop.
AATSR observations were found to drop significantly during the snow-free period, suggesting that the AATSR cloud mask tends to reject more observations in summer (Figure 3). The total number of L2 AATSR observations increased during the year 2006 compared to the average number of L2 observations found between the years 2005-2009. The most likely cause of the "false" clear-sky observations was the development of a thin coating film on the visible channel calibration. This thin film caused a drift and poor performance of the sensor and a higher probability of cloud contamination during that year [55].
A sudden drop in LST was identified around 60°N in Eurasia and North America during the cold months, particularly evident during the months of December and January (Figure 4). The drop in LST was found to correspond to a significant loss of available L2 MODIS observations [56]. This problem
The average bias between MODIS and NARR is entirely negative. In contrast, the observed bias between monthly UW-L3 of AATSR products and NARR took both positive and negative values, with a MD of −2.5 K (Min: −5.8 K, Max: 2.4 K) and a RMSD of 5.1 K (Min: 2.9 K, Max: 7.2 K). The positive difference during summer (AATSR warmer) could be traced to the overestimation of LST by L2 AATSR observations [38].
The bias between monthly UW-L3 of MODIS and AATSR products and NARR (all-sky), depicted in Figure 6, is systematic during the period 2000-2010 with a maximum difference around mid-winter and a minimum around July/August of every year. However, the maximum cloud coverage during the warm period (June-August) corresponds to the minimum difference. Remarkably, the systematic bias pattern is opposite to the interannual variation of cloud cover at high latitudes (see Figure 3). The observed negative difference during the cold months (snow-on-ground season) suggests that the admixing of top-of-the-cloud temperatures influences the winter LST estimates more than the summer ones. The reported bias is not absolute. The accuracy of LST (NARR) is influenced by how accurately the land surface scheme utilized in NARR captures energy partitioning in higher latitudes. In addition, the number of assimilated weather stations in higher latitudes is smaller than in mid latitudes. For example, it has been reported that NARR data tend to be warmer than LST derived from SSM/I data [49].

The Bias towards Clear-Sky Observations
Different sources of bias could influence UW-L3 products. We will discuss two major types of errors: the bias towards the clear-sky average and the erroneous observations of undetected clouds. The bias towards clear-sky days will result in a colder (warmer) LST average during winter (summer) than the average LST for all-sky conditions (true LST mean). Cloud cover increases LST during winter by re-emitting long-wave radiation to the earth's surface, while decreasing LST during summer by blocking direct solar radiation (increasing reflection back to space). In contrast, the bias introduced by erroneous observations of undetected clouds will always result in a negative offset of average LST because top-of-the-cloud temperatures are much colder than that of the land surface regardless of the season.
The results (Figure 7) clearly show that contamination with top-of-the-cloud temperatures (as indicated by applying a conservative quality control filter) is reduced in monthly LST averages, but it is not the main source of bias. Using a less restrictive quality control filter resulted in a negative bias between 0.5 and 1 K (relative to NARR). The bias towards clear-sky days contributed much more to the overall error, as can be seen by comparing the left and right panels of Figure 7. Adding to that, the summer bias between MODIS and NARR all-sky (Figure 7(b)) was found to be less than the summer bias between MODIS and NARR clear-sky (Figure 7 analyzed, followed by the bias patterns between LST of SSM/I and NARR. However, the close resemblance between the SSM/I and NARR products is expected given the fact that the diurnal temperature cycle of hourly NARR data was used to normalize the SSM/I measurements; see details in Royer and Poirier [49]. The bias (MD) patterns between UW-L3 of MODIS LST and NARR indicate that MODIS is colder than NARR by 2 to 4 K (MD of −2 to −4 K).
Apart from that, comparing monthly near-surface air temperature of AMSR-E Ta with NARR shows a significant positive bias in mountainous regions. The high-altitude bias can be attributed to the coarse resolution of passive microwave data, which does not resolve areas with complex topography. A negative bias (monthly AMSR-E Ta is colder than NARR) was observed at northern latitudes. This negative bias was limited to the month of July and did not show up in the month of August (data not presented). Melting of snow and ice, which still continues to happen at high latitudes between July and August, could interfere with passive microwave signals [59]. Spatially averaged bias (RMSD) between satellite LST products (UW-L3, SSM/I and AMSR-E) during mid-summer (July and August 2007) was found to be less than the RMSD between the satellite products (SSM/I, AMSR-E and MODIS) and NARR, indicating a higher agreement among satellite products than with the reanalysis (Table 4). Comparing spatially averaged bias between MODIS and clear-sky/all-sky monthly-averaged datasets during summer time indicates a background MD of −1.3 K and a RMSD of 2.9 K for comparison with clear-sky products and a MD of −1 K and a RMSD of 2.7 K for comparison with all-sky products. The background error indicates that overrepresentation of clear-sky days did not add much bias to the summer UW-L3 estimates.
Mean annual surface temperature (MAST) is an important variable in studying different cryospheric systems such as permafrost. We derived uncertainty maps over North America showing that a RMSD of 4 K exists between UW-L3 of MODIS and NARR, while UW-L3 of AATSR differed from NARR by a RMSD of 3.5 K, which is higher than the bias between the MAST of both sensors, 2.1 K for the period 2005-2009 (Table 5). The distribution of bias in MAST (MODIS/AATSR-NARR) indicates an association with cold temperature, dry climate and snow-covered conditions (Figure 11). Warm regions with relatively high atmospheric humidity and temperature on average (i.e., Western North America) were found to deviate by 0 to −2 K on an annual basis, with NARR being warmer. Regions that experience a colder and drier atmosphere during the Arctic winter had a negative bias of −4 to −6 K (see medium blue in Figure 11). Regions that maintain a dry and cold atmosphere year-round (i.e., the high Canadian Arctic and central Greenland) were found to deviate the most from NARR, with a bias that could reach below −6 K.
Knowing the exact atmospheric column properties (i.e., water vapor concentration and air temperature) is crucial to estimate the empirical regression coefficients (used in the split window algorithms) that relate LST to brightness temperature recorded by the TIR sensor. However, the number of Arctic radiosonde profiles available for the regression analysis in MODIS and AATSR retrieval libraries is relatively low. In addition, studies on SST retrieval have shown that split window algorithms sometimes perform poorly in high latitudes. During winter, ice fog and ice clouds, which are difficult to detect, interfere with IR absorption and emission [60]. Therefore, we speculate that misestimating Arctic atmosphere properties (by using atmospheric profiles from lower latitudes with relatively higher temperature and water vapor concentration) could result in the high negative bias observed during the winter. This hypothesis is consistent with the minimum bias observed during the Arctic summer because the atmosphere starts to get warmer and the water vapor reaches a higher level compared to the winter. However, a careful testing of this hypothesis is beyond the scope and focus of this paper.

Conclusions
Novel pan-Arctic Land Surface Temperature (LST) products were aggregated from Level 2 (L2) unprojected observations of AATSR and MODIS, separately. The new products (UW-L3) were binned on a 25 km EASE-grid at weekly, monthly and annual intervals, thereby meeting the requirements of the permafrost and the regional climate modeling communities. An intercomparison was conducted between UW-L3 products and skin temperature derived from passive microwave data (AMSR-E and SSM/I) and from the North American Regional Reanalysis (NARR) to assess product uncertainty. Results indicate that UW-L3 products are closely related to all datasets during the summer months with a Mean Difference (MD) of −1 K and a Root Mean Square Difference (RMSD) of 2.7 K, averaged over all geographic locations above 50 degrees north. Since other datasets represent land temperature under all weather conditions, the observed difference is partially caused by biases in original datasets and by interpolation and aggregation of UW-L3.
A systematic winter difference was found between UW-L3 of MODIS/AATSR and monthly average skin temperature derived from NARR. The bias increased gradually from summer to winter months and reached a maximum RMSD value in the middle of the Arctic winter (7.7 K for MODIS and 7.2 K for AATSR products). One should not consider the reported bias as absolute. It is recognized that NARR performance in the Arctic is influenced by the number of assimilated weather stations, which is relatively small at high latitudes. Nevertheless, the winter bias was attributed to the inclination of UW-L3 of MODIS towards clear-sky observations and admixing with top-of-the-cloud temperatures. Furthermore, we speculate that the difference between winter and summer bias range could be caused by a reduction in the efficiency of the split window algorithm (of both AATSR and MODIS) at high latitudes during Arctic winter. A careful test of this hypothesis is beyond the scope of this study. Finally, we present the following four topics, based on our reported results here, that merit further investigation to improve the quality of LST products from AATSR and MODIS over the Arctic: (1) The impact of the improvement in the upcoming MODIS Collection 6 LST products in relation to the identified artifact at 60 degrees North needs to be quantified and compared to current products from Collection 5. (2) The bias between UW-L3 LST products at 1-km and ground-based station measurements of both near-surface air temperature and radiometric LST is unknown. Further studies are needed to quantify the magnitude and various sources of uncertainty of the 1-km products. (3) The quality of cloud masks used in L2 MODIS and AATSR LST products is a topic that merits further investigation. The influence of polar darkness and snow cover on the quality of the operational cloud mask needs to be studied and more robust algorithms need to be developed.

* Comparison with a reconstructed LST using a classified LANDSAT image and radiometric temperature for each class. ME: mean error, SD: standard deviation and RMSE: root mean square error.

Figure 1. Flow diagram of the processing chain used to produce pan-Arctic LST L3 products from original L2 unprojected LST 1-km products.

Figure 6. MD and RMSD between monthly UW-L3 of MODIS and AATSR against monthly 0-m height surface temperature of NARR (all-sky average) for the 2000-2010 period.

Figure 10. Bias maps derived from intercomparison between monthly UW-L3 of MODIS and clear-sky only monthly SSM/I, AMSR-E Ta and NARR temperature for the month of July 2007. SSM/I and MODIS were found to have the least contrasting bias zones when compared to reanalysis.

Figure 11. Mean annual surface temperature (MAST) calculated from monthly 0-m height surface temperature of NARR and MODIS/AATSR LST, and difference between NARR and MODIS/AATSR calculated MAST (2005–2009). For AATSR, any pixel found to have more than two missing months during one year was discarded from the calculation of MAST.

Table 1. Summary of accuracy assessment of L2 MODIS land surface temperature (LST) products (Collection 5) using the operational generalized split window algorithm.

Table 2. Summary of accuracy assessment of L2 AATSR LST products using different retrieval algorithms.

Table 3. Summary of coarse resolution datasets.

Table 4. Summary of bias statistics during the snow-free period (July and August 2007) between MODIS and other products for clear-sky and all-sky conditions.

Table 5. Difference between mean annual surface temperature (MAST) estimated from monthly UW-L3 of MODIS and AATSR LST, and NARR 0-m surface temperature over North America.