Optimization of Sensitivity of GOES-16 ABI Sea Surface Temperature by Matching Satellite Observations with L4 Analysis

Monitoring of the diurnal warming cycle in sea surface temperature (SST) is one of the key tasks of the new generation geostationary sensors, the Geostationary Operational Environmental Satellite (GOES)-16/17 Advanced Baseline Imager (ABI), and the Himawari-8/9 Advanced Himawari Imager (AHI). However, such monitoring requires modifications of the conventional SST retrieval algorithms. In order to closely reproduce temporal and spatial variations in SST, the sensitivity of retrieved SST to SSTskin should be as close to 1 as possible. Regression algorithms trained by matching satellite observations with in situ SST from drifting and moored buoys do not meet this requirement. Since the geostationary sensors observe tropical regions over larger domains and under more favorable conditions than mid-to-high latitudes, the matchups are predominantly concentrated within a narrow range of in situ SSTs >285 K. As a result, the algorithms trained against in situ SST provide a sensitivity to SSTskin as low as ~0.7 on average. An alternative training method, employed in the National Oceanic and Atmospheric Administration (NOAA) Advanced Clear-Sky Processor for Oceans, matches nighttime satellite clear-sky observations with the analysis L4 SST, interpolated to the sensor's pixels. The method takes advantage of the total number of clear-sky pixels being large even at high latitudes. The operational use of this training method for ABI and AHI has increased the mean sensitivity of the global regression SST to ~0.9 without increasing regional biases. As a further development towards improved SSTskin retrieval, the piecewise regression SST algorithm was developed, which provides optimal sensitivity in every SST pixel. The paper describes the global and the piecewise regression algorithms trained against analysis SST and illustrates their performance with SST retrievals from the GOES-16 ABI.


Introduction
Diurnal variations in sea surface temperature (SST) play an important role in the energy exchange between the ocean and the atmosphere (e.g., [1,2]). The key advantage of the infrared radiometers onboard the geostationary satellites is that they enable continuous monitoring of the diurnal cycle in SST (see Table 1 for the list of abbreviations used in the paper). The capabilities of such monitoring have expanded with the launch of a new generation instrument, the Advanced Baseline Imager (ABI) onboard the Geostationary Operational Environmental Satellite (GOES)-16 and -17 (launched on 19 November 2016 and on 1 March 2018, respectively), and the ABI's twin sensor, the Advanced Himawari Imager (AHI) flown onboard the Japanese Himawari-8 and -9 satellites (launched on [...]). The ABI/AHI offers five infrared atmospheric transparency window bands (centered at 3.9, 8.4, 10.3, 11.2, and 12.3 µm) suitable for SST, with high spatial resolution (2 km at nadir, which degrades to ~12 km at a satellite view zenith angle, VZA, of ~67°), frequent scans (every 15/10 min for ABI/AHI; note that NOAA also considers "Mode 6" for ABI, with a 10 min refresh rate), and superior radiometric performance [3][4][5]. The NOAA Advanced Clear-Sky Processor for Oceans (ACSPO) system, initially developed to retrieve SST from polar-orbiting sensors, such as NOAA and MetOp AVHRRs; S-NPP and NOAA-20 VIIRS; and Terra and Aqua MODIS [6], was modified with the launch of Himawari-8 to process data of new generation geostationary sensors [7][8][9]. SSTs retrieved from GOES-16 ABI and Himawari-8 AHI reveal a clear and smooth diurnal cycle. However, accurate quantification of the diurnal cycle in SST requires modifications to the retrieval algorithms currently employed with polar-orbiting sensors.
The magnitude of the diurnal cycle (DCM) is measured as the difference between the warmest daytime and the coldest nighttime SSTs. Under the conditions of strong insolation and low wind speed, the local DCM may reach several kelvins [10][11][12], whereas the DCM averaged over larger areas (including the full observed SST domain) is typically estimated at ~0.3-0.5 K [7,13]. To correctly measure the DCM, the retrieval algorithm should be able to accurately reproduce both temporal variations and spatial contrasts in the retrieved SST. Furthermore, there may be a substantial difference between the DCM in the upper ~10 µm skin layer of the sea surface (T_SKIN), which forms the thermal infrared emission of the ocean, and T_DEPTH, measured at 0.2-1 m depth by drifting and moored buoys, respectively, and customarily used for training regression SST algorithms [14,15]. This calls for algorithms more specifically targeted at T_SKIN rather than T_DEPTH retrievals.
Here, we present the SST retrieval algorithms, developed in ACSPO for the ABI and AHI, to specifically improve the estimation of DCM in T_SKIN. The algorithms are evaluated with the emphasis on sensitivity, µ: the scale at which variations in true SST are reproduced in the retrieved SST, T_S [16]. It should be noted that µ is not a measured quantity. Rather, it is calculated by replacing observed brightness temperatures in the regression equation with simulated derivatives of brightness temperatures with respect to SST and zeroing the terms that do not depend on brightness temperatures. The radiative transfer simulations in ACSPO, including calculations of brightness temperature derivatives, are performed using the Community Radiative Transfer Model (CRTM) [17]. The input data for the CRTM are the analysis SST, produced by the Canadian Meteorological Center (CMC) [18], and atmospheric profiles of temperature and humidity from the NCEP Global Forecast System, GFS [19]. Since the CRTM treats the input SST as T_SKIN, the sensitivity characterizes the response of T_S specifically to T_SKIN. We assume that errors of CRTM-based sensitivity calculations are much smaller than typical deficits in sensitivity (~0.1-0.5) for T_S retrieved from geostationary data with conventional algorithms. Optimization of sensitivity (i.e., bringing it as close to 1 as possible) is a prerequisite for accurate DCM estimation, as well as for reproduction of spatial contrasts in T_S. The importance of optimal sensitivity for the analyses of diurnal SST variations was recently stressed in, e.g., [20].
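The sensitivity calculation described above can be sketched as follows. This is a minimal illustration, assuming a linear regression of the form T_S = a + C^T R whose regressors are linear in the brightness temperatures; the regressor set, the coefficient values, and the derivative values are hypothetical and are not the ACSPO ones.

```python
import numpy as np

def regressors(tb):
    """Hypothetical regressor vector built from four brightness temperatures (K)."""
    t8, t10, t11, t12 = tb
    return np.array([t11, t11 - t8, t11 - t10, t11 - t12])

def retrieve(offset, coeffs, tb):
    """Regression SST: offset plus coefficients times regressors."""
    return offset + coeffs @ regressors(tb)

def sensitivity(coeffs, dtb_dsst):
    """Replace each brightness temperature by its simulated derivative with
    respect to SST; the offset does not depend on brightness temperatures
    and therefore drops out."""
    return float(coeffs @ regressors(dtb_dsst))

coeffs = np.array([1.0, 0.3, 0.2, 0.8])         # illustrative values only
dtb_dsst = np.array([0.60, 0.70, 0.72, 0.65])   # illustrative dTb/dSST per band
mu = sensitivity(coeffs, dtb_dsst)
```

Because the assumed equation is linear in the brightness temperatures, the value of `mu` equals the change in retrieved SST per unit change in T_SKIN when the brightness temperatures respond according to `dtb_dsst`.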
The most detailed satellite studies of the diurnal cycle to date [11][12][13]21] utilized the multiyear dataset of SST produced from the geostationary Spinning Enhanced Visible and Infrared Imager (SEVIRI) onboard METEOSAT-8 [22] with the Non-Linear SST (NLSST) algorithm [23], using two split-window bands at 10.8 and 12 µm. It was shown, however, that the SEVIRI NLSST may include significant regional biases [24] and that the sensitivity of the NLSST may be suboptimal [16,25]. Minimization of regional biases, inherent in the SEVIRI NLSST, has been the objective of developing incremental algorithms, aimed at retrieval of SST increments (i.e., "true minus first guess" SST) from brightness temperature increments (i.e., "observed minus simulated" brightness temperatures). The physical Optimal Estimation method [26] was applied to the SST retrievals from SEVIRI [24] and, recently, from Himawari-8 AHI [27]. In the algorithm [28], currently used in the reprocessing of SEVIRI data at the EUMETSAT Ocean and Sea Ice Satellite Application Facility [29], a regression equation with coefficients derived by matching absolute SSTs with absolute brightness temperatures is applied to the retrieval of SST increments from brightness temperature increments. At the early stage of preparations for the GOES-16/17 mission, the concept of the incremental regression was further elaborated [30] by deriving regression coefficients directly from SST increments matched with brightness temperature increments. Overall, the incremental algorithms were shown capable of reducing regional biases and adjusting the mean sensitivity of retrieved SST [31,32]. However, the weakness of the incremental approach is that, due to the limited accuracy of the existing radiative transfer models in conjunction with the numerical weather prediction data, the incremental algorithms require correction of biases between observed and simulated brightness temperatures. The latter biases can be estimated for specific geographic regions [24,28] or as functions of certain physical variables [25,30] by averaging brightness temperature increments over prolonged time periods. As a consequence, the T_S biases are reduced in the statistical sense rather than suppressed in every single image. For this reason, we eventually decided not to implement the incremental method for AHI and ABI within ACSPO, in favor of global and piecewise regression algorithms enhanced by using extended sets of radiometric bands and advanced methods of training regression coefficients.
Optimization of the T_S sensitivity has been the most challenging aspect of the adaptation of ACSPO to geostationary data. The processing of Himawari-8 AHI at NOAA started in April 2015 with the initial set of global regression coefficients produced, consistently with the polar-orbiting sensors, by the least-squares fit to in situ SSTs (T_IS) within the dataset of matchups (MDS) of clear-sky satellite brightness temperatures with T_IS from the NOAA in situ SST Quality Monitor (iQuam) system [33,34]. However, the sensitivity of the AHI global regression SSTs was found to be much lower than that for the polar-orbiting sensors (VIIRS, AVHRR, and MODIS). The difference in sensitivities between geostationary and polar-orbiting sensors was due to the peculiarities of the observed SST domains. Figure 1 shows the domains observed by the S-NPP VIIRS and the GOES-16 ABI. Compared to VIIRS, ABI observes mostly low-latitude regions with relatively warm SSTs, whereas the mid-to-high latitudes with colder SSTs are underrepresented in the ABI images. Moreover, those regions are observed under less favorable conditions (larger VZA and lower spatial resolution). As a result, the MDSs for geostationary sensors are dominated by matchups from low-latitude regions with relatively warm SST, resulting in a narrower distribution of matchups in terms of SST. The global regression coefficients for geostationary satellites, derived from such MDS, are mainly optimized for low-latitude regions, and the mean sensitivity of retrieved SST, averaged over the range of view zenith angles 0° ≤ VZA < 67°, is as low as ~0.7, compared to a typical mean sensitivity of ~0.85-0.90 for VIIRS [25].
In order to increase the sensitivity of the global regression SST, the Constrained Least-Squares Method for training regression coefficients was tested, which fits T_IS under a predefined value of mean sensitivity over the MDS [8]. This way, the mean sensitivity of AHI global regression SST was raised to ~0.95, which, however, came at the expense of larger regional T_S biases. An alternative method for training the regression coefficients, based on matching nighttime satellite observations with CMC analysis of SST, was explored after ABI thermal IR data became available in January 2017. Note that the CMC SST is a foundation level 4 product, derived on a daily basis on a 0.1° grid from nighttime satellite SSTs and anchored to in situ SST measurements [18]. ACSPO interpolates the gridded CMC SST to every pixel of the sensor. The advantage of the newly developed training method is that the number of clear-sky pixels supplied with CMC SST is much larger than the number of matchups with in situ SST, even in near-polar regions. Using the regression coefficients, calculated with the least-squares method from matchups with CMC SST, the mean sensitivity of the global regression SST was raised to ~0.9, without a noticeable increase in regional biases. The next step towards optimization of T_SKIN estimates has been the development of the piecewise regression (PWR) algorithm, which provides optimal and uniform sensitivity in each SST pixel. In this paper, we compare the performance of the global regression algorithms trained against in situ and CMC SSTs (GR-IS SST and GR-L4 SST, respectively), and explore the potential of further sensitivity optimization by employing a PWR algorithm trained against the CMC SST (PWR-L4).

Regression SST Equation
ACSPO derives SST from four longwave IR ABI/AHI bands centered at 8.4, 10.3, 11.2, and 12.3 µm, within the range 0° < VZA < 67°. The shortwave band, centered at 3.9 µm, is currently not used because during the day it is affected by the reflected solar radiation, and using it at night only might introduce a discontinuity in the observed diurnal signal. The four-band SST equation takes the following form:

$$T_S = a + \mathbf{C}^{T}\mathbf{R} \tag{1}$$

Here, R is a vector of regressors [...]; C is a vector of regression coefficients; a is the offset; T_8, T_10, T [...]
The extended set of regressors in Equation (2) enables efficient fitting of the matched SST (i.e., T_IS in GR-IS SST or T_S^0 in GR-L4 SST and PWR-L4 SST) within the corresponding MDSs. The method for stable estimation of regression coefficients [8] minimizes or avoids potential instabilities caused by correlations between the regressors within the MDS, with minimum loss of information.

Training Global Regression Algorithms
The dataset of matchups with in situ SST (MDS-IS), used in this study, includes N = 977,439 matchups of ABI clear-sky observations (both day and night) with high-quality in situ SSTs from iQuam drifting and moored buoys. The matchups were collected from April 2017 to March 2018, with a time/space window of 15 min and 10 km, respectively. Every matchup was supplied with simulated brightness temperature derivatives with respect to SST, required for sensitivity calculations, and with GFS data, including wind speed near the sea surface, V, and total column precipitable water vapor content in the atmosphere, W. The difficulty of using in situ SST in the context of T_SKIN retrievals is that T_IS represents T_DEPTH, which may significantly deviate from T_SKIN. The discrepancy between T_SKIN and T_IS reaches its maximum during the daytime under the conditions of strong insolation and low wind speeds near the sea surface [14,15]. To minimize the effect of this discrepancy on the trained GR-IS SST coefficients, the daytime matchups with V < 6 m/s were excluded from the training MDS (TMDS-IS). This reduced the number of matchups within the TMDS-IS to N = 795,877, i.e., by 18.6%.
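The selection rule for the TMDS-IS can be written as a boolean mask. The sketch below is illustrative only: the array names and the helper `training_mask` are hypothetical, while the 6 m/s threshold and the matchup counts are those quoted above.

```python
import numpy as np

def training_mask(is_day, wind_speed, v_min=6.0):
    """Keep every nighttime matchup, and daytime matchups only when the
    GFS near-surface wind speed is at least v_min (m/s)."""
    return ~(is_day & (wind_speed < v_min))

# Toy matchups: (day, 3 m/s), (day, 8 m/s), (night, 3 m/s), (night, 8 m/s)
is_day = np.array([True, True, False, False])
wind = np.array([3.0, 8.0, 3.0, 8.0])
mask = training_mask(is_day, wind)

# Fraction of matchups removed, from the counts given in the text
n_all, n_train = 977_439, 795_877
removed_pct = 100.0 * (n_all - n_train) / n_all
```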
The dataset used for training the GR-L4 SST and the PWR-L4 coefficients (TMDS-L4) was composed of nighttime clear-sky ABI L2P SST pixels, supplied with CMC SSTs. Only nighttime data were used because the foundation CMC SST is produced from nighttime SST data (for which the foundation SST is closest to T_SKIN) and does not capture the daytime variations in T_SKIN. Therefore, using daytime SST pixels for training the GR-L4 SST coefficients would result in additional errors. The TMDS-L4 was formed from 744 ABI images taken every hour from 15 December 2017 to 15 January 2018. A full ABI L2P SST image contains on average 3.2 × 10^6 clear-sky pixels, with approximately half of those being nighttime. The total number of nighttime pixels within the TMDS-L4 reaches 1.19 × 10^9, which is 1.5 × 10^3 times larger than the number of matchups within the TMDS-IS.
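The pixel counts quoted above follow from simple arithmetic, reproduced here as a check. The only assumption is that the hourly sampling covers 31 full days (end date exclusive), which yields the stated 744 images.

```python
from datetime import datetime

# Hourly images over 31 days (assumed end-exclusive window)
span = datetime(2018, 1, 15) - datetime(2017, 12, 15)
n_images = int(span.total_seconds() // 3600)

clear_per_image = 3.2e6      # average clear-sky pixels per full image
night_fraction = 0.5         # approximately half are nighttime
n_night = n_images * clear_per_image * night_fraction

n_tmds_is = 795_877          # matchups in TMDS-IS
ratio = n_night / n_tmds_is  # roughly 1.5e3
```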
The large number of pixels within the TMDS-L4 allows the performance of the GR-L4 SST to be improved compared with the GR-IS SST. Curves 1 and 2 in Figure 2 show the normalized histograms of matchups within TMDS-IS and TMDS-L4, as functions of CMC SST. For both distributions, the majority of matchups is concentrated at warm SSTs > 285 K, and the histogram for TMDS-L4 is even narrower than for TMDS-IS. The advantage of TMDS-L4, however, is that the absolute number of clear-sky pixels in the high-latitude regions is much larger than the number of matchups with in situ SST. It is possible, therefore, to improve the performance of the GR-L4 SST by accounting for the pixels from the under-represented regions with larger weights. When training the GR-L4 SST and the PWR-L4 SST algorithms, the weights of the pixels within each 5° × 5° lat/lon box were selected in inverse proportion to the total number of clear-sky pixels within the box. Curve 3 in Figure 2 shows the modified normalized histogram, which accounts for the weights of the clear-sky pixels within the TMDS-L4. The weighting expands the histogram and increases the contribution of cold pixels to the overall statistics.
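The box weighting can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `box_weights`, the box indexing, and the final normalization to unit sum are hypothetical choices, not details taken from ACSPO.

```python
import numpy as np

def box_weights(lat, lon, box_deg=5.0):
    """Weight each pixel in inverse proportion to the number of clear-sky
    pixels in its lat/lon box, then normalize the weights to sum to 1."""
    lat_idx = np.floor((lat + 90.0) / box_deg).astype(int)
    lon_idx = np.floor(((lon + 180.0) % 360.0) / box_deg).astype(int)
    n_lon = int(round(360.0 / box_deg))
    box_id = lat_idx * n_lon + lon_idx
    _, inverse, counts = np.unique(box_id, return_inverse=True,
                                   return_counts=True)
    w = 1.0 / counts[inverse]
    return w / w.sum()

# Three pixels share a tropical box; one pixel sits alone at high latitude
lat = np.array([1.0, 2.0, 3.0, 62.0])
lon = np.array([-31.0, -32.0, -33.0, -20.0])
w = box_weights(lat, lon)
```

With these weights every populated box contributes equally to the weighted statistics, so the single high-latitude pixel in the example carries as much weight as the three tropical pixels together.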
Remote Sens. 2018, 10, x FOR PEER REVIEW 6 of 16
The GR-IS SST algorithm was trained with the least-squares method, which minimizes the global standard deviation of T_S from T_IS. The GR-L4 SST was trained by minimization of the weighted standard deviation of T_S from T_S^0. After training, the offsets of the GR-IS and GR-L4 equations were adjusted to zero the bias between retrieved SSTs and T_IS, averaged over matchups within the TMDS-IS from 12 am to 7 am local solar time.
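A weighted least-squares fit followed by an offset adjustment can be sketched as below. This is not the stable estimation method of [8]; it is a plain weighted fit with hypothetical function names, shown under the assumption that the equation has the form T_S = a + C^T R. When the offset is fitted together with the coefficients, minimizing the weighted sum of squared residuals also minimizes their weighted standard deviation.

```python
import numpy as np

def fit_weighted(R, target, w):
    """Weighted least squares for T_S = a + C^T R; returns (a, C).
    R has shape (n_pixels, n_regressors)."""
    A = np.column_stack([np.ones(len(target)), R])
    sw = np.sqrt(w)
    sol, *_ = np.linalg.lstsq(A * sw[:, None], target * sw, rcond=None)
    return sol[0], sol[1:]

def adjust_offset(a, C, R_night, t_is_night):
    """Shift the offset so that the mean of T_S - T_IS is zero over the
    nighttime in situ matchups."""
    bias = np.mean(a + R_night @ C - t_is_night)
    return a - bias

rng = np.random.default_rng(0)
R = rng.normal(size=(50, 2))
target = 2.0 + R @ np.array([1.0, 0.5])   # synthetic, noise-free
w = rng.uniform(0.5, 1.5, size=50)
a, C = fit_weighted(R, target, w)
a_adj = adjust_offset(a, C, R, target - 0.1)
```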

The PWR-L4 SST Algorithm
As will be shown in Section 5, the GR-L4 SST increases the sensitivity compared with the GR-IS SST but leaves it suboptimal and non-uniform. The goal of the PWR-L4 SST is to further optimize the sensitivity in every pixel. The algorithm is constructed and performs as follows.
During off-line training, the vector of GR-L4 coefficients, C_GR-L4, is derived from TMDS-L4, as described in Section 3, and the GR-L4 sensitivity, µ_GR-L4 = (C_GR-L4)^T K, is calculated for all pixels (K is defined in Equation (4)). The whole TMDS-L4 is subdivided into 9 subsets in terms of µ_GR-L4:

$$i = 1:\quad \mu_{GR\text{-}L4} < 0.6 \tag{5a}$$
$$i = 2, 3, \ldots, 8:\quad 0.6 + 0.05(i-2) \le \mu_{GR\text{-}L4} < 0.6 + 0.05(i-1) \tag{5b}$$
$$i = 9:\quad \mu_{GR\text{-}L4} \ge 0.95 \tag{5c}$$

Here, i is the number of the subset. The first guess of the PWR-L4 coefficients for the i-th subset, C^1_i, is derived with the Constrained Least-Squares Method [8], by minimization of the weighted standard deviation of T_S − T_S^0 under the constraint on the mean sensitivity:

$$(\mathbf{C}^{1}_{i})^{T}\langle\mathbf{K}\rangle = 1,$$

where <*> denotes averaging over the pixels belonging to a given TMDS-L4 subset. The offsets a^1_i are defined in order to zero the bias of T_S with respect to in situ SST over the subset of TMDS-IS, which includes matchups with the corresponding sensitivities, taken from 0 to 7 am local solar time:

$$a^{1}_{i} = \langle\langle T_{IS} - (\mathbf{C}^{1}_{i})^{T}\mathbf{R} \rangle\rangle \tag{6}$$

<<*>> in Equation (6) denotes averaging over a given TMDS-IS subset. Similarly, the particular offsets of the GR-L4 SST equations for the i-th subset, b_i, are defined as

$$b_{i} = \langle\langle T_{IS} - \mathbf{C}_{GR\text{-}L4}^{T}\mathbf{R} \rangle\rangle \tag{7}$$

The values of C^1_i, a^1_i, b_i and the mean values of the GR-L4 SST sensitivities, µ_i = <µ_GR-L4>, i = 1, 2, ..., 9, are stored in the look-up table.
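The subset assignment and a sensitivity-constrained fit can be sketched as follows. The binning follows the 0.05-wide intervals between 0.6 and 0.95 given above, with the first and last subsets assumed to collect everything below 0.6 and from 0.95 upward. The constrained fit is a generic equality-constrained least squares solved with a Lagrange multiplier; it is an assumption made for illustration and not necessarily the implementation of the Constrained Least-Squares Method in [8]. All function names are hypothetical.

```python
import numpy as np

def subset_index(mu_gr):
    """Map GR-L4 sensitivity to a subset number 1..9."""
    edges = 0.6 + 0.05 * np.arange(8)      # 0.60, 0.65, ..., 0.95
    return np.digitize(mu_gr, edges) + 1

def constrained_fit(R, target, w, k_mean, mu_target=1.0):
    """Minimize the weighted variance of (C^T R - target) subject to
    C^T k_mean = mu_target, via the KKT linear system."""
    w = w / w.sum()
    Rc = R - w @ R                 # remove weighted means, so the
    yc = target - w @ target       # offset plays no role in the fit
    S = Rc.T @ (Rc * w[:, None])
    s = Rc.T @ (w * yc)
    n = R.shape[1]
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2.0 * S
    kkt[:n, n] = k_mean
    kkt[n, :n] = k_mean
    rhs = np.concatenate([2.0 * s, [mu_target]])
    return np.linalg.solve(kkt, rhs)[:n]

rng = np.random.default_rng(1)
R = rng.normal(size=(200, 3))
target = R @ np.array([0.5, 0.2, 0.1]) + 0.01 * rng.normal(size=200)
k_mean = np.array([0.8, 0.1, 0.05])        # illustrative mean derivatives
C1 = constrained_fit(R, target, np.ones(200), k_mean)
idx = subset_index(np.array([0.55, 0.62, 0.78, 0.93, 0.97]))
```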
During processing, every SST pixel is attributed to a specific subset depending on the pixel value of µ_GR-L4, and the PWR-L4 SST equation for this pixel is modified with two sequential iterations. The second iteration of the PWR-L4 coefficients and offsets, {a^2, C^2}, ensures their continuity in terms of µ_GR-L4:
If µ_GR-L4 ≤ µ_1: [...]
If µ_i < µ_GR-L4 ≤ µ_{i+1}, i = 1, 2, ..., 8: [...]
If µ_GR-L4 > µ_9: [...]
The third iteration brings the PWR-L4 sensitivity exactly to 1 by extrapolation/interpolation between the second iteration of the PWR-L4 sensitivities, µ^2 = (C^2)^T K, and µ_GR-L4: [...]
Considering that µ_GR-L4 = (C_GR-L4)^T K, the sensitivity of the T_S produced with coefficients C^3 from Equation (11) is (C^3)^T K ≡ 1. Finally, the PWR-L4 SST is calculated as [...]
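One reading of the two iterations, for the coefficient vectors only, is sketched below. Both functions are assumptions made for illustration: the first interpolates the look-up table linearly in µ_GR-L4 and holds the end values outside the table range, which is one way to obtain continuity; the second moves along the line through C_GR-L4 and C^2 to the point where the sensitivity equals 1. The treatment of the offsets is not shown, and the function names are hypothetical.

```python
import numpy as np

def second_iteration(mu_gr, mu_nodes, c1_table):
    """Assumed continuity step: per-coefficient linear interpolation of the
    table in mu_gr, constant beyond the first and last nodes.
    c1_table has shape (n_subsets, n_coeff); mu_nodes must be increasing."""
    return np.array([np.interp(mu_gr, mu_nodes, c1_table[:, j])
                     for j in range(c1_table.shape[1])])

def unit_sensitivity_coeffs(c_gr, c2, k):
    """Assumed third step: interpolate or extrapolate between c_gr and c2
    so that the result has sensitivity exactly 1 for this pixel.
    Undefined when c2 and c_gr have equal sensitivities."""
    mu_gr = c_gr @ k
    mu2 = c2 @ k
    t = (1.0 - mu_gr) / (mu2 - mu_gr)
    return c_gr + t * (c2 - c_gr)

mu_nodes = np.array([0.5, 0.7, 0.9])                    # toy 3-node table
table = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]])
c2_mid = second_iteration(0.6, mu_nodes, table)

k = np.array([0.9, 0.3])                                # illustrative derivatives
c3 = unit_sensitivity_coeffs(np.array([0.9, 0.2]), np.array([1.1, 0.3]), k)
```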

Validation Against In Situ SST
In this section, the explored algorithms are evaluated and compared using matchups from the MDS-IS. In order to minimize the contribution of the discrepancies between T_IS and T_SKIN to the validation statistics, biases and standard deviations of T_S − T_IS, as well as mean sensitivities, are calculated from TMDS-IS, which, as described in Section 3, excludes daytime matchups with GFS wind speeds V < 6 m/s. In contrast, the DCM estimates are produced from the full MDS-IS, taking into account both daytime and nighttime matchups at all wind speeds. It should be noted that since the TMDS-IS was used for training the GR-IS coefficients, it can be viewed as a dependent MDS in terms of validating the GR-IS SST. However, it is independent in terms of validating the GR-L4 and the PWR-L4 SSTs. This suggests that the conditions of the comparison are more favorable for the GR-IS SST than for the two other algorithms.
Absorption by the atmospheric water vapor is a major factor modifying the responses of brightness temperatures, observed in the atmospheric transmission window, to variations in SST (e.g., [37,38]). The performance of the regression algorithms (including sensitivity) usually degrades with the increase of the water vapor content along the sensor's slant line of sight, STPW = W × sec(θ), where θ is the view zenith angle [25]. It is important, therefore, to determine the ranges of STPW suitable for retrieval of T_SKIN with the explored algorithms. The statistical factor which should also be considered when comparing different training techniques is the distribution of matchups within the training MDS. Figure 3 shows the normalized histograms of matchups within TMDS-IS and TMDS-L4 as functions of STPW. The histogram for TMDS-L4 accounts for weighting the pixels during training the GR-L4 and PWR-L4 coefficients, as described in Section 3. The latter histogram shows a wider maximum and an increased contribution of matchups with STPW < 30 kg/m², typical for mid- and high latitudes. Figure 4 shows mean sensitivities, biases, and standard deviations as functions of STPW. All algorithms perform best near the maxima of the corresponding matchups' distributions at 30 < STPW < 50 kg/m², rather than at the minimum of STPW, as one could expect from physical considerations. At smaller STPWs, the biases and the standard deviations increase for all three algorithms, and the sensitivities for the GR-IS and GR-L4 SSTs reduce due to reduced relative numbers of corresponding matchups within the training MDSs. At larger STPWs, the statistics are affected by a combination of increasing atmospheric absorption and decreasing relative numbers of matchups. In Figure 4a, the GR-IS SST sensitivity is as low as 0.75 at its maximum near STPW = 30 kg/m² and reduces to 0.40 at STPW = 130 kg/m². The sensitivity of the GR-L4 SST reaches 0.95 at STPW = 20 kg/m² but reduces to 0.56 at STPW = 130 kg/m². The sensitivity of the PWR-L4 SST, by design, remains constant and optimal at all STPWs. In Figure 4b, the biases for all three SSTs do not exceed 0.1 K within the range 20 < STPW < 80 kg/m². At STPW < 20 kg/m², the large bias in the GR-IS SST is caused by the insufficient number of matchups within the TMDS-IS, whereas the biases in the GR-L4 SST and, especially, the PWR-L4 SST increase to a lesser extent. In Figure 4c, the standard deviations for all three SSTs increase at STPW > 40 kg/m², consistently with the sensitivities in Figure 4a. Thus, we conclude that the sensitivity of the GR-IS SST is suboptimal for estimations of the DCM in T_SKIN within the whole range of STPW. The DCM estimates from the GR-L4 SST may be appropriate at STPW < 60 kg/m² but become suboptimal at larger STPWs because of degraded sensitivity. On the other hand, the T_SKIN retrievals and DCM estimates made from the PWR-L4 SST may be inefficient at large STPWs due to degraded accuracy and precision. Considering the above, we restrict the range of STPW for T_SKIN retrieval to STPW < 100 kg/m². This excludes nearly 0.6% of the MDS-IS and TMDS-IS matchups from the further analysis.
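The slant-path water vapor and the retrieval restriction can be computed as in the short sketch below. The function name and the sample values are hypothetical; W is taken in kg/m² and the view zenith angle in degrees.

```python
import numpy as np

def slant_tpw(w, vza_deg):
    """STPW = W * sec(VZA): water vapor along the slant line of sight."""
    return w / np.cos(np.radians(vza_deg))

w = np.array([40.0, 50.0, 50.0])     # total column water vapor, kg/m^2
vza = np.array([0.0, 30.0, 65.0])    # view zenith angle, degrees
stpw = slant_tpw(w, vza)
usable = stpw < 100.0                # range retained for T_SKIN retrieval
```

The last sample shows how a moist tropical column viewed at a large zenith angle exceeds the 100 kg/m² limit even though its vertical water vapor content is moderate.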
Figure 5a illustrates the effect of sensitivity on the reproduction of the diurnal cycle in T_S with the three algorithms by showing the hourly biases of T_S − T_S^0 as functions of local solar time. The latter biases were estimated from the MDS-IS, using 970,371 daytime and nighttime matchups at all wind speeds and with STPW < 100 kg/m². Figure 5a also shows the hourly biases in T_IS − T_S^0. The diurnal cycles are well expressed in all three retrieved SSTs as well as in T_IS, although with substantially different DCMs. The maxima and the minima in T_IS − T_S^0 take place later than in T_S − T_S^0, consistent with the fact that the "skin" layer cools down and warms up faster than the "depth" layer, at which the in situ SST is measured. Figure 5b shows the hourly biases in T_S − T_IS. The biases in T_S − T_IS in Figure 5b are smaller than those in T_S − T_S^0 in Figure 5a. One can also notice that the maxima and the minima of biases in T_S − T_IS shift to a later time when the mean sensitivity of an SST product increases.
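Hourly biases as functions of local solar time, and a DCM estimate derived from them, can be sketched as follows. Two assumptions are made here and are not confirmed by the text: local solar time is approximated from UTC and longitude (15° per hour), and the DCM is taken as the difference between the largest and the smallest hourly mean bias. Function names are hypothetical.

```python
import numpy as np

def local_solar_hour(utc_hour, lon_deg):
    """Approximate local solar time in hours: 15 degrees of longitude per hour."""
    return (utc_hour + lon_deg / 15.0) % 24.0

def hourly_mean(delta, lst_hour):
    """Mean of delta (e.g., T_S - T_S^0) in each of 24 local-solar-time bins;
    bins without data are returned as NaN."""
    hours = np.floor(lst_hour).astype(int) % 24
    sums = np.bincount(hours, weights=delta, minlength=24)
    counts = np.bincount(hours, minlength=24)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts

def dcm(hourly):
    """Assumed DCM estimate: maximum minus minimum hourly mean."""
    return float(np.nanmax(hourly) - np.nanmin(hourly))

delta = np.array([0.1, 0.5, -0.1])    # toy biases, K
lst = np.array([2.5, 14.2, 5.9])      # local solar time, hours
magnitude = dcm(hourly_mean(delta, lst))
```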
Table 2 summarizes mean sensitivities, standard deviations (SDs), and DCMs, estimated from the corresponding subsets of matchups. Overall, training the GR coefficients against CMC increases all three statistics compared with training against in situ SST, and so does the adjustment of sensitivity in the PWR-L4 SST. More specifically, the 25% increase in mean sensitivity between the GR-IS SST and the GR-L4 SST results in a comparable (22%) increase in DCM and a disproportionally small (9%) increase in SD. On the other hand, the 11% increase in mean sensitivity between the GR-L4 and PWR-L4 SSTs causes a 12% increase in DCM and a larger (14%) increase in SD. This result, consistent with Figure 4b,c, suggests that estimation of the PWR-L4 coefficients with the Constrained Least-Squares (CLS) method may cause amplification of regional biases compared with those in the GR-L4 SST.
Table 2. Mean sensitivities, standard deviations of retrieved SST minus in situ SST, and average magnitudes of the diurnal cycle (DCMs), estimated from MDS-IS, using matchups with STPW < 100 kg/m².

Processing GOES-16 ABI Data
The three algorithms have been implemented within the experimental version of ACSPO and tested on GOES-16 ABI data from 1 to 31 March 2018. Since the operational ACSPO Clear-Sky mask [39] uses retrieved SST as one of the clear-sky predictors, the clear-sky SST domains may be somewhat different for different retrieval algorithms. To ensure that the algorithms are compared on the same domain, a single clear-sky mask, generated with the GR-L4 SST, was consistently employed with all three algorithms. Figure 6 shows geographical distributions of sensitivities for the three SSTs retrieved from the ABI data on 1 March 2018 at 20:00 UTC (close to the maximum of the diurnal warming signal within the GOES-16 ABI domain). Figure 7 shows deviations of the retrieved SSTs from CMC SST for the same ABI data. The statistics of sensitivities for the images in Figure 6 and of TS − TS0 for the images in Figure 7 are given in Table 3. In Figure 6, the GR-IS SST has the lowest sensitivity, which is also spatially variable. The GR-L4 SST sensitivity is higher than that of the GR-IS SST and is also variable. For both the GR-IS SST and the GR-L4 SST, the sensitivities are especially low in the low latitudes at large VZAs, due to large atmospheric absorption along the line of sight, consistent with the behavior of sensitivities at large STPWs in Figure 4a. The GR-IS sensitivity is also reduced in the near-polar zones, which are poorly represented within TMDS-IS. The PWR-L4 SST, by design, provides optimal and uniform sensitivity, µ ≡ 1, over the full SST domain.
Overall, the three products show similar diurnal warming patterns, but with different amplitudes. The exceptions are unrealistically warm biases in the GR-IS SST in the North Atlantic Ocean and smaller biases at the south-western edge of the image, which are not reproduced (or are less noticeable) in the two other SSTs. Differences between the GR-L4 and the PWR-L4 SSTs are more noticeable in regions with relatively low GR-L4 sensitivities: for instance, the PWR-L4 SST is warmer in the Equatorial Pacific Ocean and in the Southern Atlantic and colder in the Equatorial Atlantic at the eastern edge of the image, due to the effect of the Saharan dust outbreak.
Figure 8 shows the time series of biases in retrieved SST − CMC SST for 20-30 March 2018. The DCMs increase from GR-IS to GR-L4 and to PWR-L4 SST. Table 4 shows the mean sensitivities and DCMs for the three algorithms averaged over the month from 1 to 31 March 2018. The DCMs in Table 4 are roughly proportional to the mean sensitivities.
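The proportionality between the DCMs and the mean sensitivities can be checked directly from the Table 4 values (0.73, 0.91, and 1.00 for sensitivity; 0.34, 0.43, and 0.47 K for DCM). The short sketch below is only an arithmetic illustration; the variable names are hypothetical.

```python
# Table 4 values: algorithm -> (mean sensitivity, DCM in K).
table4 = {
    "GR-IS SST": (0.73, 0.34),
    "GR-L4 SST": (0.91, 0.43),
    "PWR-L4 SST": (1.00, 0.47),
}

# If DCM were strictly proportional to sensitivity, this ratio would be constant.
ratios = {name: dcm / sens for name, (sens, dcm) in table4.items()}
mean_ratio = sum(ratios.values()) / len(ratios)
spread = (max(ratios.values()) - min(ratios.values())) / mean_ratio

for name, ratio in ratios.items():
    print(f"{name}: DCM / sensitivity = {ratio:.3f} K")
print(f"relative spread of the ratio: {spread:.1%}")
```

The ratio stays close to 0.47 K for all three algorithms, which is the DCM reported for the PWR-L4 SST with unit sensitivity.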

Algorithm      Mean sensitivity   DCM
GR-IS SST      0.73               0.34 K
GR-L4 SST      0.91               0.43 K
PWR-L4 SST     1.00               0.47 K

Figure 9 shows the mean values of DCMs, averaged from 1 to 31 March 2018, as functions of STPW. The corresponding dependencies for sensitivities are similar to those produced from matchups and shown in Figure 4a. The unrealistically large DCM in the GR-IS SST at STPW < 10 kg/m² confirms that this algorithm is inefficient at small STPWs. Other than that, the changes in DCM between the algorithms are consistent with the differences in the sensitivities. The difference between the DCMs estimated from the GR-IS SST and the GR-L4 SST is the largest at STPW < 60 kg/m². In contrast, the difference between the DCMs for GR-L4 SST and PWR-L4 SST is small at 10 < STPW < 40 kg/m² and increases as STPW grows beyond 40 kg/m², due to the increased difference in sensitivities. Assuming that the maximum of the distribution of clear-sky pixels takes place at 30 < STPW < 50 kg/m², consistent with the maximum of the distribution of matchups in Figure 3, the adjustment of the DCM estimates in the PWR-L4 SST at larger STPWs may substantially exceed the difference between the average DCM values for GR-L4 and PWR-L4 SSTs shown in Table 4.

Training SST Algorithms Against Different L4 Analyses
L4 SST analyses, available from different producers, are not identical and may report different SST values for the same time and geographical regions. Table 5 illustrates this statement by showing biases and standard deviations between the CMC SST and the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) [40] as well as between CMC and the Optimal Interpolation SST (OISST) [41]. The statistics were produced for 31 December 2017 by averaging over the ABI SST domain within the range of VZAs from 0 to 67°. In both cases the standard deviations between the compared L4 analyses are significant. The difference between CMC and OISST is larger than between CMC and OSTIA in terms of both bias and standard deviation, and may be considered an extreme case.
In this section, we evaluate possible variations in TS retrieved with the GR-L4 and the PWR-L4 algorithms, which may be caused by training against different SST analyses. Table 6 shows the statistics, averaged over the clear-sky GOES-16 ABI SST domain, for the difference in TS caused by training the same algorithm against different L4 analyses (i.e., CMC vs. OSTIA or CMC vs. OISST). In all cases, the algorithms were trained using ABI and L4 SST data from 15 December 2017 to 15 January 2018. The training procedures were similar to those described in Sections 3 and 4, with the exception that the offsets of the SST equations were selected to fit the nighttime L4 SST rather than in situ SSTs. The statistics are shown for two ABI images taken on 31 December 2017 at 9:00 and 20:00 UTC, close, respectively, to the minimum and the maximum of the diurnal cycle in the GOES-16 ABI SST domain. The difference between the daytime and nighttime biases caused by training against different L4 analyses is negligible in all cases. This suggests that the estimates of DCM, averaged over the full ABI SST domain, are practically insensitive to the selection of the L4 analysis for training. The standard deviations of variations in the GR-L4 SST do not exceed 0.06 K. The corresponding standard deviations for the PWR-L4 SST are larger, especially in the case of switching between CMC and OISST. This suggests that the PWR-L4 SST algorithm requires more careful selection of the analysis SST for training.
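Statistics of the kind reported in Table 6 can be sketched as follows. This is an illustrative computation on synthetic fields, not the ACSPO code: the function name, the array shapes, the clear-sky fraction, and the 0.02 K offset with 0.05 K scatter between the two hypothetically trained retrievals are assumptions.

```python
import numpy as np

def difference_stats(sst_a, sst_b, clear_sky):
    """Bias and standard deviation of sst_a - sst_b over clear-sky pixels."""
    mask = np.asarray(clear_sky, bool)
    diff = (np.asarray(sst_a, float) - np.asarray(sst_b, float))[mask]
    diff = diff[np.isfinite(diff)]          # drop any missing retrievals
    return float(diff.mean()), float(diff.std(ddof=1))

# Synthetic example: one algorithm, two training analyses, two images.
rng = np.random.default_rng(1)
shape = (100, 120)
clear_sky = rng.random(shape) > 0.4         # common clear-sky mask (assumed)

stats = {}
for label in ("night", "day"):
    sst_trained_a = 290.0 + rng.normal(0.0, 3.0, shape)
    # Assumed effect of switching the training analysis: small offset plus scatter.
    sst_trained_b = sst_trained_a + 0.02 + rng.normal(0.0, 0.05, shape)
    stats[label] = difference_stats(sst_trained_a, sst_trained_b, clear_sky)

# The quantity relevant for DCM: how much the bias changes between day and night.
day_minus_night_bias = stats["day"][0] - stats["night"][0]
for label, (bias, sd) in stats.items():
    print(f"{label}: bias = {bias:+.3f} K, SD = {sd:.3f} K")
print(f"day - night bias difference = {day_minus_night_bias:+.4f} K")
```

A constant offset between two trainings cancels in the day-minus-night difference, which is why a domain-averaged DCM can remain insensitive to the choice of analysis even when the standard deviations differ.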

Summary and Conclusions
The sensitivity of retrieved SST to TSKIN is a characteristic of an SST retrieval algorithm that determines how well true spatial and temporal SST variations are reproduced in retrieved SST. The sensitivity directly affects the magnitude of diurnal SST variations estimated from geostationary sensors and is therefore of particular importance for the quantitative monitoring of the diurnal cycle in SST.

Figure 3. Normalized histograms in terms of total precipitable water vapor content along slant line of sight (STPW) of (1) matchups with in situ SSTs within TMDS-IS and (2) weighted matchups of nighttime ABI pixels with CMC L4 SSTs within the MDS-L4.

Table 4. Mean sensitivities and magnitudes of the diurnal cycle, averaged from 1 to 31 March 2018, for the three explored algorithms.

Figure 8. Time series of the bias of TS − TS0 derived from GR-IS SST, GR-L4 SST, and PWR-L4 SST for 24-30 March 2018.
11, and T12 are brightness temperatures observed in the 8.4, 10.3, 11.2, and 12.3 µm bands, respectively; Sθ = 1/cos(θ) − 1; θ is VZA; TS0 is the CMC SST (in °C), interpolated from the original 0.1° grid to the sensor's pixels. Derivatives of brightness temperatures in terms of SST, D8, D10, D11, and D12, are calculated with the CRTM, and sensitivity in every SST pixel is calculated as follows:

Table 3. Statistics of sensitivities and deviations of ABI SST from CMC SST corresponding to the images in Figures 6 and 7.

Table 5. Bias and standard deviation of the differences between L4 analyses for 31 December 2017.

Table 6. Bias and standard deviation of variations in the GR-L4 and PWR-L4 SSTs caused by training on different L4 analyses, averaged over the GOES-16 ABI clear-sky SST domains on 31 December 2017 at 9:00 and 20:00 UTC.