Determining the Pixel-to-Pixel Uncertainty in Satellite-Derived SST Fields

: The primary measure of the quality of sea surface temperature (SST) fields obtained 16 from satellite-borne infrared sensors has been the bias and variance of matchups with co-located 17 in-situ values. Because such matchups tend to be widely separated, these bias and variance 18 estimates are not necessarily a good measure of small scale (several pixels) gradients in these 19 fields because one of the primary contributors to the uncertainty in satellite retrievals is 20 atmospheric contamination, which tends to have large spatial scales compared with the pixel separation of infrared sensors. Hence, there is not a good measure to use in selecting SST fields 22 appropriate for the study of submesoscale processes and, in particular, of processes associated 23 with near-surface fronts, both of which have recently seen a rapid increase in interest. In this study, two methods are examined to address this problem, one based on spectra of the SST data 25 and the other on their variograms. To evaluate the methods, instrument noise was estimated in Level-2 VIIRS and AVHRR SST 27 fields of the Sargasso Sea. The two methods provided very nearly identical results for AVHRR: 28 along-scan values of approximately 0.18 K for both day and night and along-track values of 0.21 K 29 also for day and night. By contrast, the instrument noise estimated for VIIRS varied by method, 30 scan geometry and day-night. Specifically, daytime, along-scan (along-track), spectral estimates 31 were found to be approximately 0.05 K (0.08 K) and the corresponding nighttime values of 0.02 K 32 (0.03 K). Daytime estimates based on the variogram were found to be 0.08 K (0.10 K) with the 33 corresponding nighttime values of 0.04 K (0.06 K). Taken together: AVHRR instrument noise is significantly larger than VIIRS instrument noise, along-track noise is larger than along-scan noise 35 and daytime levels are higher than nighttime levels. Given the similarity of results and the less 36 stringent preprocessing requirements, the variogram is the preferred method although there is a suggestion that this approach overestimates the noise for high quality data in dynamically quiet regions. Finally, simulations of the impact of noise on the determination of SST gradients show that on average the gradient magnitude for typical ocean gradients will be accurately estimated with VIIRS but substantially overestimated with AVHRR.


47
To date, a great deal of attention has been focused on the accuracy of satellite-derived sea surface temperature (SST) fields. By contrast, their local precision 1 has only been addressed by 112 5 In the case of 'buoy' temperature L2 and L3U datasets, the error in extrapolating from the skin temperature, the quantity actually measured by the satellite, to the temperature at the depth of the buoy, generally 1 m below the surface, additional contributions to the local precision may result from the horizontal variability in the vertical temperature step, the orange block in the figure. Only L2 skin temperature SST fields are considered in this study, hence horizontal gradients in the surface to buoy depth temperature difference do not contribute to the uncertainty in retrievals discussed herein. SST dataset obtained from VIIRS radiances and one L2 SST dataset obtained from AVHRR and Hamilton Harbor, Bermuda (Figure 2). Thermosalinograph temperature measurements were 120 obtained from two thermistors, one from the seawater intake in the interior of the ship and the 121 second directly at the intake; i.e., "external" to the hull. The exterior measure (referred to as TEX for Laboratory. The quality control procedures used to screen these data are described in Sch16.

138
Only quality level 1 data, the 'best' quality level, were used. Although screening at this level ideally 139 removes all cloud contaminated pixels, some are still included in the analysis, leading to the 140 misclassification error discussed above.       Stream, and the Sargasso Sea. In that the focus of this analysis is on the spatial resolving power of 165 satellite-derived SST datasets, it is important to select a region in which the geophysical variability 166 of the SST field does not overwhelm the uncertainty associated with the SST retrievals, be they 167 driven by misclassification errors (the green block in Figure 1) or instrument/calibration issues (the 168 yellow block). Specifically, this means selecting a dynamically "quiet" region in the ocean. The

169
Sargasso Sea portion of the Oleander track between 32 • N and 36 • N meets this requirement. In order 170 to increase the amount of satellite-derived data with what we believe to be similar statistics to those The analyses presented here are based on SST fields from the summer of 2012 only -June, July suggesting that the latter would be more appropriate for the evaluation proposed here, but the the increased spectral energy is likely due in part to diurnal warming, the effect of which may be 181 mitigated by selecting nighttime fields only as shown in Section 4.2. This raises a concern with 182 regard to the TEX data because TEX sections are not synoptic, taking approximately 20 hours to 183 cross the study area. However, since the TEX samples between 5 and 6 m below the surface, diurnal warming is not thought to be a significant problem [6].

191
The FFT requires equally spaced, gap-free data; i.e., gaps, if they exist in the original series, must be 192 filled and the data must be interpolated to equal spacing, if not already equally spaced, prior to

201
Of importance to the analysis presented here is that interpolation, either to fill gaps or to 202 regularize the spacing of samples on a section, impacts the resulting spectrum, with the impact 203 generally increasing as the wavenumber increases; i.e., in the spatial range of most importance to 204 the analysis here. Furthermore, the impact is a function both of the fraction of "good" values 205 (defined as by Sch16), and the degree to which the "missing" data are clustered (referred to as 206 cohesion and assigned the symbol by Cayula and Cornillon [8]). Sch16 found that "…spectral 207 slopes are increasingly biased low as decreases and increases, and this effect becomes more 208 pronounced as the true spectral slope increases". Based on this they only considered VIIRS spectra 209 for − > 0.1 and > 0.5 in their analysis. We found these thresholds to be too permissive for 210 our purposes; the impact of interpolation on spectra in the 1 to 10 pixel range can overwhelm the 211 underlying spectrum as will be shown below. We therefore chose more stringent constraints on ,

212
generally resulting in > 0.9. At this level, the cohesion of the data has a relatively small impact 213 on the spectra for slopes in the range of those observed in the Sargasso Sea (Sch16), so we did not 214 impose an additional constraint on cohesion.
size impacts along-track spectra, again at these scales. Although the pixel spacing of along-track 228 sections is virtually independent of the distance from nadir, the size of the pixel is not; i.e., the SST 229 values associated with pixels is averaged over increasingly larger areas away from nadir. This is 230 similar to smoothing along-track with a moving average, which in turn depresses the power 231 spectral density at small scales, this, independent of the preprocessing performed on the data and it 232 affects along-track and along-scan spectra equally. Along-track interpolation (discussed below) to 233 address the change in pixel spacing in the along-scan direction (Figure 3) also impacts the resulting 234 spectra. In order to reduce the impact of both of these effects, only sections within 500 km of nadir 235 are used for this analysis.

236
The final criterion used to select sections from the L2 fields relates to the gappiness of the data.

237
For clarity, we combine this step with the interpolation to fill missing pixel values in the study area.

238
The actual implementation of the algorithm is slightly different to reduce processing time but the

262
7 Pixel area is approximately the along-track spacing, 741 m for VIIRS and 1,115 m for AVHRR, times the along-scan spacing 8 Gap filling was still possible in that adjacent pixels were left as is; i.e., not set to missing values. 9 Selection of temperature sections with maximum sample spacing in excess of 150 m resulted in a significant steepening of the spectral slope for wavelengths smaller than approximately 1 km. This is due to the nearest neighbor interpolation to 75 m spacing, which repeats samples for these large separations.  in Table 2 based on the mean pixel spacing of the subgroup. All of the temperature sections falling 274 in a given group were then interpolated to the same pixel spacing, also shown in Table 2. This pixel 275 spacing was determined from the mean pixel spacing determined from the contributing 276 temperature sections for the given group. This, together with the relatively small size of the ranges,

277
tended to eliminate problems associated with different spatial sampling and with an interference 278 between the sampling frequency along the original section and that along the interpolated section.

279
Nearest neighbor interpolation was used. Figure 4 shows the effective transfer function of three 280 different interpolation algorithms available in Matlab: linear, nearest neighbor and cubic spline 10 .

281
To determine the most appropriate resampling strategy, SST values on the VIIRS sections were 290 Table 2. Grouping of along-scan sections based on mean pixel spacing of the temperature section.

291
The values indicated correspond to the lower limit on the range -the value to which temperatures 292 sections in this range are interpolated -the upper limit on the range.

315
Oleander spectra were ensemble averaged over all of the selected sections.

349
In order to determine the instrument noise, i.e., to separate it from the other factors cited above,

350
we defined a two steps process based on the following three assumptions:   3. The instrument noise for both sensors is white; i.e., that it contributes equally at all 359 wavenumbers associated with the given temperature sections. This is not quite the case for VIIRS 360 hence one has to take a bit more caution with the results presented herein.

361
In the first step, the slope, intercept and noise level of a hypothetical spectrum yielding the best 362 fit to the satellite spectrum is determined in a least squares sense. This is done by minimizing gamma, the sum of the squared difference between the hypothetical spectrum and the satellite 364 spectrum: where slope and intercept define the straight line portion of the best fit spectrum in log-log space 367 (assumption 1 above), noise is the noise level (assumption 3) also in spectral space, ki is the 368 wavenumber of the i th spectral component and where ! ! , referred to as the nugget, is the variance of the difference in the retrieval at a given 397 location from that at a neighboring location as the separation between the two locations goes to 398 zero; i.e., the instrument noise in this case, ! , referred to as the sill, is the variance associated with 399 the variability for a spatial separation of L, the decorrelation scale. Note that the sill is a measure of 400 the geophysical variance of the field plus the 'large' scale retrieval variance, which depends on the 401 variance in the atmosphere, the variance of the surface emissivity, instrument noise, etc. So,
directions separately for much the same reasons presented in the discussion of the preliminary processing of the data,

416
As in Tan14 we use the formulation given by Cressie to estimate the variogram [13]: where ! is at location ! , Δ x or y is the spatial separation in kilometers of ( ! , ! ) pairs

433
The local precision of satellite-derived SST retrievals, the noise resulting from processes in the 434 yellow and green boxes of Figure 1, which we refer to as instrument noise here, is shown in Table 3 435 for each of the along-scan/along-track, day/night combinations. The first row for each sensor 436 (labeled Spectra) corresponds to the estimates obtained from the spectral method. Only subgroups 437 consisting of five or more temperature sections and with a spectral slope steeper than -1 were used.

438
The instrument noise for subgroups with shallower spectral slopes tended to dominate the If the noise is not white, for example, the actual level of noise may, in fact, be larger than the 'upper limit'.
Furthermore, although somewhat larger the variogram estimates are quite close to the spectral 457 estimates and all of the estimates are close to the 'upper' limit for the given 458 sensor/day-night/scan-track combination suggesting that the instrument noise is white. It is 459 possible that the pixel noise is correlated at small scales but, again, the mechanism for this is not The along-scan AVHRR spectra are shown in Figure 6 for a daytime subgroup and a nighttime 466 subgroup. Also shown in the figure are the best-fit linear spectra with noise, obtained as discussed 467 in Section 3.2. Figure 7 shows the corresponding along-track AVHRR spectra. In all four cases, noise 468 is seen to impact the spectrum for wavelengths (wavenumbers) up (down) to approximately 25 km 469 (0.04 km -1 ). Also apparent from these plots is that the approximately linear portion of the AVHRR 470 spectrum corresponds to a small fraction (~10%) of the 129 spectral values. This means that 471 relatively small changes in the low wavenumber end of these spectra will have a more significant 472 impact on the estimated background slope than for spectra less impacted by noise. However, the 473 spectral method for determining instrument noise is relatively insensitive to this; significant 474 changes in slope and intercept result in virtually identical values of instrument noise. For example,

475
for the spectrum shown in the left panel of Figure 7, a slope, offset combination of (-1.7570, -6.2730) 476 yields the same level of instrument noise. This is because the instrument noise is one to two orders 477 of magnitude larger that the assumed geophysical signal, the straight line portion of the spectrum, 478 over a significant fraction of the spectrum (remember the fits are in regular, not log-log space) so 479 changes in the slope do not result in a significant difference in the squared sum of the differences 480 between the model and the observed spectrum. For spectra that level off substantially at large 481 wavenumbers, the noise is effectively determined by the power spectral density level at these 482 wavenumbers. This is readily seen in Figure 6 and 7; the high wavenumber end of the simulated 483 spectra with noise are at a similar level for the along-scan sections and at a slightly higher level for or smaller, in magnitude to the geophysical signal at these wavenumbers, as will become clear in

516
The third significant difference between AVHRR and VIIRS spectra relates to the daytime 517 spectra compared with the nighttime spectra. Specifically, there is a statistically significant 518 difference between daytime and nighttime VIIRS spectra, with the daytime spectra being more 519 energetic at wavelengths smaller than approximately 100 km. This is likely due to diurnal warming, 520 which occurs frequently in the Sargasso Sea in summer months [6,10]. Also note that the slope of 521 nighttime spectra for both along-scan and along-track sections is closer to that of the TEX spectrum 522 than the daytime spectra. Surprisingly, the level of instrument noise is also larger at daytime than at 523 nighttime as is evident both from the figures and from Table 3. This may result from the sensitivity 524 of the banding to the energy in the SST field. Banding is difficult to correct for because it is not the 525 entire scan line that has higher values than its neighbors, but rather, what appear to be randomly 526 located segments of a given scan line. Furthermore, the magnitude of the difference in these regions 527 appears to be related to the magnitude of the retrieved temperature.

528
Finally, the level of instrument noise estimated with the spectral approach is substantially 529 smaller than (as much as one half) that estimated based on the variogram. The reason for this is not

540
Tan14 estimated the nugget in the L3 Meteosat AVHRR data set produced by the O&SI SAF They found ! ≈ 0.14 K for the study area. This is larger than would be expected if instrument 544 noise of the full resolution Meteosat AVHRR data is similar to that found for NOAA-15 AVHRR (on 545 the order of 0.20 K) and if this noise is uncorrelated from pixel-to-pixel, the assumption made in the 546 analyses presented herein. Specifically, we would expect the noise for the L3 product to be 547 approximately 0.05 K since order 25 pixels are averaged for each 0.05° × 0.05° SST estimate. It is 548 possible that the level of instrument noise (elements in the yellow block of Fig. 1) associated with 549 the AVHRR on Meteosat is higher than that of NOAA-15. More likely however is that the difference 550 results from misclassification errors associated with cloud flagging (the most significant element in 551 the green block). Specifically, Tan14) processed all of the data for one year, 2008; i.e., they did not 552 constrain their analysis to relatively cloud free fields as we did. Cloud-contaminated L2 pixels were, 553 of course, excluded from the production of the L3 fields and Tan14 also excluded pixels flagged as 554 cloud-contaminated. However, the likelihood of misclassification, cloud-contaminated pixels not 555 being flagged as such, increases as the fraction of cloud cover increases. Furthermore, classification 556 errors tend to be small-scale errors, a small number of pixels here, a small number of pixels there, as 557 opposed to large regions, which are misclassified. This means that such errors will likely contribute 558 to noise at small spatial scales. A histogram of Tan14 nuggets (not shown) shows a broad 559 distribution ranging from ! in the 0.05 K range to order 0.3 K with a peak around 0.14 K. If the 560 nugget resulted primarily from instrument errors (those in the yellow block), one would expect a 561 relatively narrow peak; the instrument noise is unlikely to vary substantially for the region. Thus 562 the broad ! range suggests that it is a combination of classification errors and instrument noise.

563
Because our analysis required long sections of cloud-free pixels the data were likely much more 564 clear, on average, than those of Tan14. Also contributing to the difference between our estimate of 565 local noise and that of Tan14 is that noise may be added through the combination of L2 fields to 566 obtain the L3 product. Using nighttime only data, as Tan14 have done, will minimize, but not model. Tan14 used the exponential form. This will likely also contribute to an overestimate of the 570 instrument noise in regions in which a mixed form is more appropriate.

572
Of interest is how levels of noise, typical of the values found thus far, impact gradients and 573 fronts. In order to address this, we simulated 10,000 3 × 3 pixel squares for a given gradient in x,

574
added Gaussian white noise to each of the elements, applied the 3 × 3 Sobel gradient operator in x 575 and y to these squares and then determined the mean gradient and the standard deviation of the 576 gradient. This was done for gradients ranging from 0.001 to 0.01 K km -1 , values typical in the ocean,

577
and for levels of instrument noise ranging from 0.001 K to 0.02 K. Figure 10    gradient, but that the gradient magnitude will be substantially overestimated in AVHRR fields. The 596 uncertainty of the estimated gradient magnitude increases with the imposed noise, nearly doubling 597 from the value associated with a zero imposed gradient to an imposed gradient of 0.1 K/km. These 598 observations do not mean that a front with a gradient of this magnitude (0.05 K/km) is undetectable 599 in a field with an AVHRR noise level but detection will be problematic. Simulations using front 600 detection algorithms need to be undertaken to evaluate this. Although none of this is surprising, we 601 are not aware of any studies involving the gradient magnitude of satellite-derived SST fields 602 accounting for this -including many of our own.  found to differ by method, scan geometry and day-vs-night -ranging from 0.021 K for the 621 nighttime, along-scan spectral estimate to 0.097 K for the daytime, along-track variogram estimate.

622
Day and night along-scan estimates based on the spectral approach are close to one half those based 623 on the variogram. For both methods, the nighttime estimates are also roughly one half the 624 corresponding daytime estimates. Finally, the along-track estimates are roughly 50% larger than the 625 along-scan estimates for the spectral approach but only about 25% larger when based on the 626 variogram. In all cases, the estimates were smaller than the 'upper' limit.

627
In summary: VIIRS instrument noise is substantially smaller than AVHRR instrument noise, 628 with levels as low as 0.02 K in the along-scan direction at nighttime. In fact, VIIRS instrument noise 629 under these conditions is near the level of the geophysical signal in the dynamically quietest 630 regions in the ocean.