The role of Advanced Microwave Scanning Radiometer 2 channels within an optimal estimation scheme for sea surface temperature

Abstract: We present an analysis of information content for sea surface temperature (SST) retrieval from the Advanced Microwave Scanning Radiometer 2 (AMSR2). We find that SST uncertainty of ∼0.37 K can be achieved within an optimal estimation framework in the presence of wind, water vapour and cloud liquid water effects, given appropriate assumptions for instrumental uncertainty and prior knowledge, and using all channels. We test all possible combinations of AMSR2 channels and demonstrate the importance of including cloud liquid water in the retrieval vector. The channel combinations, with the minimum number of channels, that carry most SST information content are calculated, since in practice calibration error drives a trade-off between retrieved SST uncertainty and the number of channels used. The most informative set of five channels is 6.9 V, 6.9 H, 7.3 V, 10.7 V and 36.5 H and these are suitable for optimal estimation retrievals. We discuss the relevance of microwave SSTs and issues related to them compared to SSTs derived from infra-red observations.


Introduction
Sea surface temperature (SST) is a geophysical quantity of fundamental importance in the Earth system, since it is a controlling factor in air-sea fluxes [1,2] and therefore profoundly influences atmospheric and oceanographic thermodynamics [3], dynamics [4,5] and coupled interactions [6]. Near-real time estimation of global SST at adequate spatial resolution is crucial to weather forecasting by numerical weather prediction (NWP, [7]) and errors in knowledge of SST can materially degrade weather forecast skill [8,9]. SST is used as the measure of Earth's surface temperature over oceans [10–12] and is therefore a key metric of climatic variability and change whose global evolution can be estimated back to the mid-19th century [12]. Historic observations of SST are relatively sparse prior to the satellite era [13], and centennial-scale reconstructions draw heavily on the relative completeness and detail of remotely sensed SST [14]. The series of Advanced Very High Resolution Radiometers (AVHRRs) have been operated since 1979 with channels supporting SST estimation, using differential-absorption-based techniques to account for the influence of the atmosphere on infra-red (IR) brightness temperatures [15–18]. Thus, reprocessing of multi-decadal satellite SST datasets has concentrated on IR sensors, namely, the AVHRRs [19] and Along Track Scanning Radiometers (ATSRs; [20]). Merchant et al. [21] more recently used both AVHRRs and ATSRs jointly to develop a blended, gap-filled analysis for climate applications, analogous to the SST analyses produced operationally for NWP [9,22], but with more attention to long-term stability.
Microwave (MW) observations of SST were first attempted with the Scanning Multichannel Microwave Radiometer (SMMR) launched in 1978, and in 1999 the Tropical Rainfall Measuring Mission's (TRMM's) Microwave Imager began delivering SSTs of useful accuracy across the tropics. The record of globally SST-capable microwave radiometers is shorter, having commenced with the Advanced Microwave Scanning Radiometer-E (AMSR-E) in 2002. MW radiometry for SST has strengths and weaknesses relative to IR records. The primary advantage is coverage [23]: MW SSTs are available over the open ocean under non-precipitating cloud cover, while both precipitation and cloud cover strongly limit the sampling available in the IR. MW SST is not available near coasts, near sea-ice and in areas of persistent radio-frequency interference (RFI). The spatial resolution of MW SST is typically 50 km [24] compared to 1 km for IR, limiting the precision with which thermal ocean fronts can be located in MW imagery. The potential for confounding of SST signals by wind variability (via emissivity effects) is greater for MW SSTs than for IR SSTs. Nonetheless, since cloud cover is persistent in some seasons in climatologically significant regions, the coverage advantage of MW radiometry is such that the blending of MW and IR SSTs for climate data records should be considered.
AMSR2 is a microwave radiometer instrument flying on board the Japan Aerospace Exploration Agency's (JAXA) Global Change Observation Mission 1st-Water (GCOM-W1) satellite, launched in 2012. This forms part of the "A-train" [25] series of satellites that fly in the same orbit separated by a few minutes. It observes at 6.9, 7.3, 10.65, 18.7, 23.8, 36.5 and 89.0 GHz in both H and V polarizations. The 7.3 GHz channel is an addition compared to the predecessor AMSR-E instrument on Aqua and improves detection of radio frequency interference (RFI) from artificial sources.
This paper provides an information content analysis for the AMSR2 radiometer. Our aims are to establish the fundamental limits of retrieval uncertainty for AMSR2 SST retrieval in the framework of optimal estimation (OE), and to inform strategies about channel selection for developing a new MW SST product, ultimately intended for joint use with IR products in a climate data record. A previous study with similar objectives [26] neglected the importance of variable cloud liquid water in MW SST retrieval and did not address the prioritisation of channels; both are addressed here.
In Section 2, we review some of the underlying physics relevant to MW SST retrieval, noting and contrasting the MW case with the IR case. Section 3 reviews some background theory relating to information content analysis and OE. These are applied to SST retrieval from the AMSR2 instrument in Sections 4 and 5.

Physical Considerations
Microwave thermal emission from the ocean surface occurs in the Rayleigh-Jeans tail of the Planck function. This is in contrast to the thermal IR, where the peak of the Planck function is in the 10.5-12.5 µm window that is often used for SST remote sensing. The ocean surface emissivity (ε) for the low-frequency AMSR2 channels is ∼0.5 compared to an emissivity of ∼1 in the IR. The intensity of MW radiation at the top of atmosphere (TOA) is low, which is mitigated somewhat by the ability to use large (∼m) antennae for microwave instruments. Despite this, the effective noise equivalent temperature difference (NEdT) is larger in the MW region than in the IR. The longer wavelengths involved also give rise to diffraction effects that limit the spatial resolution of AMSR2 to ∼50 km. The MW emissivity of land and ice is significantly higher than that of the ocean. With contemporary instruments, this leads to side-lobe contamination of the ocean MW signal close to coasts and ice edges and prevents accurate SST retrievals in these areas. There is also a larger change in emissivity with polarisation over ocean compared to ice. This can be exploited for ice detection and classification [27].
A significant advantage of using MW measurements when attempting to achieve global coverage of SST is that microwaves can penetrate cloud, so they can observe the surface signal under cloudy conditions where IR instruments cannot. This is useful, in particular, in persistently cloudy regions such as winter high-latitudes. Here, the restriction of IR instruments to clear-sky conditions decreases the temporal frequency of the observations and thus increases sampling errors.
This study utilises simulations of AMSR2 brightness temperatures by the fast radiative transfer model "Radiative Transfer for TOVS" (RTTOV; whose acronym has evolved into a name). We use the v11.3 software package [28–31] to carry out the simulations in Sections 4 and 5. In the MW region, this uses the FAST EMissivity (FASTEM) code to calculate the surface emissivity which, for version 4, is described by Liu et al. [32]. In this study, we use the latest version, FASTEM-6. The MW emissivity model involves a complex calculation, which we summarise below.
There are several models for the emissivity and permittivity of seawater [33–39]. FASTEM-6 uses a method that starts from a formulation for the permittivity based on Ellison et al. [33]. This describes the complex permittivity with a double Debye model:

$$\epsilon = \epsilon_\infty + \frac{\epsilon_s - \epsilon_1}{1 + i\,2\pi\nu\tau_1} + \frac{\epsilon_1 - \epsilon_\infty}{1 + i\,2\pi\nu\tau_2} + \frac{i\alpha}{2\pi\epsilon_0\nu}. \qquad (1)$$

Here, $\epsilon_0$ is the permittivity of free space and $\nu$ the frequency of the electromagnetic wave. The other parameters have been derived by fitting to measurements: $\epsilon_\infty$ has a linear dependence on temperature; $\epsilon_s$, $\epsilon_1$, $\tau_1$ and $\tau_2$ are represented by polynomial fits to temperature ($T$) and salinity ($S$); and $\alpha$ has a mixed polynomial and exponential dependence on temperature and salinity.
The modelled permittivity is used to calculate Fresnel reflectivities ($R_p$, where $p$ is $v$ or $h$ for vertical and horizontal polarization components respectively) from the standard Fresnel equations. These are subsequently modified to effective values that account for other factors such as foam and surface roughness. In general, these factors add a dependency of the final emissivity on the wind vector ($U$). Surface roughness causes MW energy to be scattered both into and out of the direct line of sight of the surface by quasi-specular reflection events. FASTEM represents these with a two-scale model [32,40]. The small-scale waves have a size close to the wavelength of the emitted radiation. These small waves ride on the large-scale undulations of gravity waves. The correction to $R_p$ for the small-scale features takes the form of a multiplicative factor $\exp(-y\cos^2\theta)$, where $y$ is a polynomial fit to wind speed and frequency and $\theta$ is the zenith angle of the observation. The large-scale correction ($L_p$) takes the form of an additive term with a polynomial fit to frequency, wind speed and $\sec\theta$. The wave orientation is accounted for by adding three cosine harmonics for the relative azimuth angle ($\phi$) between the observation and wind vectors. The wind-speed factors here act as a proxy for what is in reality the mechanical stress on the ocean due to the wind. This drives the creation of small-scale waves and thus changes the effective surface area.
Above wind speeds of a few metres per second, foam begins to form on the sea surface [38]. This is principally a mixture of water with air bubbles. FASTEM-6 calculates the fraction of the surface covered by foam ($f$) using the expression of Monahan et al. [41], where $f \sim |U|^{2.55}$. (An alternative form, $f \sim |U|^{3.231}$ by Tang [42], is used in FASTEM-4.) The model then computes area-weighted mean values of foam emissivities ($\varepsilon_{p,f}$) and the modified sea water emissivities. The foam emissivities are calculated using a combination of the zenith angle polynomial fit of Kazumori et al. [43] with the linear frequency dependence from Stogryn [44]. The final form relates the effective emissivities ($\varepsilon_p$) to the Fresnel reflectivities and the correction factors, with the functional dependencies $f(|U|)$, $R_p(T, S)$, $y(|U|, \nu)$, $L_p(|U|, \nu, \theta)$ and $\varepsilon_{p,f}(\theta, \nu)$.

Cosmic Microwave Background
The cosmic microwave background (CMB) is radiation from the recombination era of the early universe that has subsequently cooled due to the expansion of the universe and now forms a near isotropic source of background photons [45,46]. Its spectrum is characterised by an effective temperature of ∼2.73 K [47]. We can make a simple estimate of the relative intensity of this source to emission from the Earth from the ratio of the black-body functions $B_\nu$ for the two sources:

$$\frac{F_{CMB}}{F_\oplus} \approx \begin{cases} 0.02 & \text{at 6.9 GHz} \\ 0.008 & \text{at 89 GHz} \end{cases} \qquad (4)$$

for $T_\oplus = 290$ K and emissivity $\varepsilon = 0.5$. Although we have neglected surface roughness and atmospheric effects, this demonstrates that the contribution of the CMB to the observed TOA flux, although small, is not negligible and must be included in MW radiative transfer modelling.
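The estimate in Equation (4) can be checked with a few lines of Python. This is only a sketch: it assumes the ratio is $B_\nu(T_{CMB}) / [\varepsilon B_\nu(T_\oplus)]$, which is our reading of the text, and the function names are illustrative.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

def planck_nu(freq_hz, temp_k):
    """Black-body spectral radiance B_nu(T) in W m^-2 Hz^-1 sr^-1."""
    x = H * freq_hz / (K_B * temp_k)
    return 2.0 * H * freq_hz**3 / C**2 / math.expm1(x)

def cmb_to_earth_ratio(freq_ghz, t_cmb=2.73, t_earth=290.0, emissivity=0.5):
    """Assumed form of the ratio: B_nu(T_cmb) / (emissivity * B_nu(T_earth))."""
    f = freq_ghz * 1e9
    return planck_nu(f, t_cmb) / (emissivity * planck_nu(f, t_earth))

for freq in (6.9, 89.0):
    print(f"{freq:5.1f} GHz: ratio = {cmb_to_earth_ratio(freq):.4f}")
```

Under these assumptions the ratios round to the values quoted in Equation (4). The ratio falls with frequency because the CMB spectrum departs from the Rayleigh-Jeans limit well before the 290 K spectrum does.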

Skin Depth
There is typically a cooling of order 0.2 K from a depth of ∼1 mm at the top of the ocean (the sub-skin) to the interface where the atmosphere and ocean meet. At IR wavelengths, electromagnetic waves are absorbed in a distance of order 10 µm and sample the ocean at the top of the skin layer and are thus sensitive to "SST-skin". In contrast, microwaves have a frequency-dependent penetration depth measured in millimetres and so observations here are sensitive to SST-sub-skin. To compare or harmonise measurements made in the two wavelength regions with those from in situ sources, retrievals must be corrected to the depth of in situ measurements, typically 10 cm to 1 m. This requires a model for the skin effect and the diurnal warming.
Robinson [48] gives an expression for the apparent temperature ($T_{app}$) seen by a radiometer assuming an exponential form for the temperature profile in the skin layer. This temperature profile can be written as

$$T(z) = T_{ss} + (T_0 - T_{ss})\,e^{-z/d_T}, \qquad (5)$$

where $z$ is depth, $d_T$ is the e-folding depth of the temperature profile, $T_0$ is the surface (interface) temperature and $T_{ss}$ is the sub-skin temperature. Using an e-folding distance $d_\mu$ for the absorption of radiation at the surface results in

$$T_{app} = T_{ss} + \gamma\,(T_0 - T_{ss}), \qquad (6)$$

where $\gamma = d_T / (d_\mu + d_T)$. If the cooling across the skin layer is due to molecular conduction, we might expect the temperature profile through the skin layer to be linear. A similar derivation using a total skin thickness $\delta$ and such a linear assumption for $T(z)$ yields an expression of the same form with a different weighting factor.
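The weighting factors can be illustrated numerically. The sketch below assumes that the apparent temperature is the absorption-weighted mean $\int_0^\infty T(z)\,e^{-z/d_\mu}\,dz / d_\mu$, that the exponential profile has e-folding depth $d_T$, and that the linear profile runs from $T_0$ at the interface to $T_{ss}$ at depth $\delta$ and is constant below. The closed form for the linear case, $\gamma = 1 - (d_\mu/\delta)(1 - e^{-\delta/d_\mu})$, follows from these assumptions and is not quoted from the source. All depths used are illustrative.

```python
import math

def gamma_exponential(d_mu, d_t):
    """Weight of the interface temperature for an exponential T(z)."""
    return d_t / (d_mu + d_t)

def gamma_linear(d_mu, delta):
    """Weight of the interface temperature for a linear T(z) of thickness delta."""
    return 1.0 - (d_mu / delta) * (1.0 - math.exp(-delta / d_mu))

def apparent_temperature(t0, t_ss, gamma):
    """T_app = T_ss + gamma * (T_0 - T_ss)."""
    return t_ss + gamma * (t0 - t_ss)

def gamma_numeric(shape, d_mu, n=100_000, z_max_factor=60.0):
    """Midpoint-rule integral of shape(z) * exp(-z/d_mu) / d_mu over depth.

    shape(z) is the normalised profile (T(z) - T_ss) / (T_0 - T_ss).
    """
    dz = z_max_factor * d_mu / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * dz
        total += shape(z) * math.exp(-z / d_mu) / d_mu * dz
    return total

# Illustrative (assumed) depths in metres: IR ~10 um, MW ~2 mm, skin scale ~0.3 mm.
t_ss, t0 = 290.0, 289.8
for label, d_mu in (("IR", 1e-5), ("MW", 2e-3)):
    g = gamma_exponential(d_mu, 3e-4)
    print(label, round(g, 3), round(apparent_temperature(t0, t_ss, g), 3))
```

With these illustrative depths, the IR weighting factor is close to 1 (the radiometer sees nearly the interface temperature) while the MW factor is close to 0 (it sees nearly the sub-skin temperature), consistent with the SST-skin and SST-sub-skin distinction made above.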

Salinity
Salinity has a negligible effect on emissivity in the IR region but can be significant at MW wavelengths. Figure 1 shows the change in brightness temperature with salinity for a given atmospheric profile. For the most SST-sensitive, low-frequency channels, the effect is relatively small across the typical range of global oceanic salinity (33–37 PSU). The effect is more significant, however, for the higher frequency channels and is temperature dependent. Including this effect in modelling would be more important in areas with a strong freshwater influence.
Version December 18, 2017 submitted to Remote Sens.

Figure 1.
The change in the top-of-atmosphere brightness temperature from the value at 35 PSU as a function of salinity. The data were modelled by RTTOV for the AMSR2 instrument using the same atmospheric profile and with surface emissivities calculated by FASTEM. All channels are included ranging from 6.9 GHz (red) to 89 GHz (purple) with V-polarized channels indicated by solid lines and H-polarized channels by dashed lines.


Emissivity Dependence on Wind
As noted at the start of this section, the ocean emissivity in the MW region is affected by wind speed through the generation of foam and large- and small-scale waves. Accurate modelling of these processes is difficult, particularly at low frequencies, and is an ongoing area of research. Figure 2 shows the change in emissivity with wind speed for each of the channels for an SST of 297 K. The deviation from this azimuthal-mean emissivity value at a given wind speed is displayed against the separate wind-speed components in Figure 3. The lack of azimuthal symmetry means that it is possible, in principle, to derive some information about the separate wind components from MW observations. The small size of the deviation, however, implies that this is a weak constraint.

Figure 3.
The lack of symmetry implies that the observation contains some information about the individual wind-speed components. The satellite azimuth angle has been chosen to be 37° here and is indicated by the arrow. A value of 0° would align the pattern along the v-axis.

Top-of-Atmosphere Radiance Dependence on Total Column Water Vapour
Water vapour acts as an additional source of absorption for radiation travelling through the atmosphere at both MW and IR wavelengths. There are interesting differences between the two regions, however. For illustrative purposes, consider radiative transfer for microwaves using a simple slab model of the atmosphere with absorptivity $a$ (equal to its emissivity $\varepsilon_a$) and temperature $T_a$. Being in the Rayleigh-Jeans tail, $B_\nu \propto T$ and, for convenience in this section, we absorb the constants of proportionality into the temperature units. The radiance of the upward emission by the atmosphere at temperature $T_a$ is then

$$I_\uparrow = a T_a \qquad (9)$$

and, similarly, the downward emission by the atmosphere is

$$I_\downarrow = a T_a. \qquad (10)$$

The radiance from the surface emission at temperature $T_s$ that is transmitted through to the top of the atmosphere is

$$I_s = (1 - a)\,\varepsilon T_s. \qquad (11)$$

Including the fraction $(1 - \varepsilon)$ of the downward emission that is reflected at the surface and transmitted back through the slab, the total outward radiance is thus

$$I_{TOA} = I_\uparrow + I_s + (1 - a)(1 - \varepsilon)\,I_\downarrow = \varepsilon T_s + a\left\{\left[1 + (1 - \varepsilon)(1 - a)\right] T_a - \varepsilon T_s\right\}. \qquad (12)$$

For a given column with fixed $T_a$ and $T_s$, $I_{TOA}$ can either increase or decrease with atmospheric absorption according to the sign of the final bracket. For $\varepsilon \approx 1$ (as in the IR part of the spectrum), $I_{TOA}$ will always decrease as the absorption in the atmosphere increases. In the MW region, however, where $\varepsilon \approx 0.5$, $I_{TOA}$ can increase with increasing absorption.
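The sign change described above can be reproduced with a minimal sketch of the slab model. The code assumes a single isothermal slab, a specular surface that reflects a fraction $(1 - \varepsilon)$ of the downward atmospheric emission, and radiances expressed in temperature units. The function name and the temperatures are illustrative.

```python
def toa_radiance(absorptivity, t_atm, t_surf, emissivity):
    """Top-of-atmosphere radiance (temperature units) for an isothermal slab."""
    a = absorptivity
    upward = a * t_atm                           # emitted upward by the slab
    downward = a * t_atm                         # emitted downward by the slab
    surface = emissivity * t_surf                # emitted by the surface
    reflected = (1.0 - emissivity) * downward    # downward emission reflected upward
    # Surface-leaving radiance is attenuated by (1 - a) on its way to the TOA.
    return upward + (1.0 - a) * (surface + reflected)

T_ATM, T_SURF = 260.0, 290.0   # illustrative values, K
for eps in (1.0, 0.5):
    values = [toa_radiance(a, T_ATM, T_SURF, eps) for a in (0.0, 0.05, 0.1)]
    print(f"emissivity {eps}: {[round(v, 1) for v in values]}")
```

For the IR-like case (emissivity 1) the radiance falls as absorption increases, whereas for the MW-like case (emissivity 0.5) it rises, because the added atmospheric emission more than offsets the attenuated surface signal.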
In reality, the situation is more complex. Not only is the atmosphere not isothermal, but, across the global ocean, there is a large-scale correlation between the total column water vapour (TCWV) and $T_a$. This sign of the relationship, however, does occur and is counter to the behaviour at IR wavelengths.
Top-of-Atmosphere Radiance Dependence on Total Cloud Liquid Water
At IR wavelengths, clouds are largely opaque, thus rendering observations of the surface impossible except perhaps in instances of thin cirrus. Microwaves penetrate non-precipitating clouds, although measured radiances are sensitive to the cloud liquid water content, which must be included in any radiative transfer modelling. Figure 4 shows the change in modelled brightness temperature for the same conditions but with the cloud liquid water profile scaled to achieve different total cloud liquid water (TCLW) values. There is a significant effect on all of the channels as well as clear differences in the sensitivity between channels. Not only does this emphasise the importance of including these effects in any modelling, but it also suggests that TCLW can be retrieved to some degree.

Figure 4.
The data were modelled by RTTOV for the AMSR2 instrument using the same atmospheric profile but for scaled total cloud liquid water (TCLW). All channels are included ranging from 6.9 GHz (red) to 89 GHz (purple) with V-polarized channels indicated by solid lines and H-polarized channels by dashed lines.

Information Content and Optimal Estimation
OE provides a means to combine measured values from an instrument with initial a priori estimates of physical quantities of interest to provide a best estimate of the true value of the physical quantities. It does this by weighting the observations and a priori values via the appropriate covariance matrices of their uncertainties. The solution is always an optimised (minimised) function of the squares of residuals between observation and solution.
From Rodgers [49], the optimal estimate of the physical quantities in the state vector $x$ is given by

$$\hat{x} = x_a + \left(K^{T} S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^{T} S_\epsilon^{-1} \left(y - y_a\right). \qquad (13)$$

This is the solution with maximum a posteriori probability given the a priori information and its uncertainty. In Equation (13), $y$ is a vector containing the observations, $K$ is the Jacobian matrix describing the sensitivity of each of the measurements to each physical quantity, $S_a$ is the uncertainty covariance matrix of the a priori values for the physical quantities and $S_\epsilon$ is the uncertainty covariance matrix for the measurements. The quantity $y_a$ is the observation vector that would result from the a priori state $x_a$. This must be calculated using a forward model, and $y_a(x_a)$ is treated as linear in the region of $x_a$. This equation can be interpreted as a form of multi-dimensional "weighted average" between the a priori values for the retrieved quantities and the values of the retrieved quantities that would give rise to the observations. Consider very small values of the a priori uncertainties. Here, the second term vanishes and the best estimate of the retrieval vector is the initial a priori values. Conversely, for large a priori uncertainties or very low measurement uncertainties, the best estimate is dominated by the observation vector. The degree to which observations and modelled values in the final bracket differ is translated from observation space into physical-quantity space by the preceding matrices. No assumption about the Gaussianity or otherwise of the uncertainty distributions is required in the derivation of this equation. In the particular case of Gaussian uncertainty distributions, the maximum a posteriori solution is also the solution with minimum error variance.
The expected uncertainty covariance matrix for the retrieved variables is

$$\hat{S} = \left(K^{T} S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}. \qquad (14)$$

In principle, this approach allows all sources of information about a problem to be combined with the correct weighting no matter how weak their sensitivity to the variables we are interested in. In practice, imperfect forward modelling and the lack of exact knowledge of appropriate covariance matrices limit the degree to which additional observations improve the accuracy of the retrieved quantities.
Without performing any retrievals, we can calculate the degrees of freedom for signal in a measurement system from

$$d_s = \mathrm{tr}\left[\left(K^{T} S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^{T} S_\epsilon^{-1} K\right]. \qquad (15)$$

Here, $d_s$ gives an estimate of the number of distinct quantities that may be inferred from the measurements. It is not, in general, an integer because usually retrieved variables are only partially constrained rather than precisely determined. A fuller description of optimal estimation as applied to retrieval of SST is given by Merchant et al. [50].
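The quantities in this section can be computed together in a few lines. The following is a sketch only: it assumes the standard linear forms from Rodgers [49] (the retrieval, its expected covariance, and the trace of the averaging kernel for $d_s$), and the function name `oe_retrieval` is illustrative rather than taken from any retrieval package.

```python
import numpy as np

def oe_retrieval(y, y_a, x_a, K, S_a, S_eps):
    """Linear optimal estimation step.

    Returns the retrieved state, its expected covariance and the
    degrees of freedom for signal.
    """
    S_eps_inv = np.linalg.inv(S_eps)
    S_a_inv = np.linalg.inv(S_a)
    S_hat = np.linalg.inv(K.T @ S_eps_inv @ K + S_a_inv)   # expected covariance
    gain = S_hat @ K.T @ S_eps_inv                         # maps obs space to state space
    x_hat = x_a + gain @ (y - y_a)                         # retrieved state
    d_s = float(np.trace(gain @ K))                        # trace of averaging kernel
    return x_hat, S_hat, d_s

# Toy case: two directly observed quantities with equal prior and
# measurement variance, so each estimate is the plain average.
x_hat, S_hat, d_s = oe_retrieval(
    y=np.array([2.0, 4.0]), y_a=np.zeros(2), x_a=np.zeros(2),
    K=np.eye(2), S_a=np.eye(2), S_eps=np.eye(2),
)
print(x_hat, np.diag(S_hat), d_s)
```

In the toy case, the retrieval lies halfway between prior and observation, the variance is halved, and each variable contributes half a degree of freedom. This illustrates why $d_s$ is generally not an integer.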
In the following sections, these techniques are applied to simulations using 2680 profiles over ocean taken from the EUMETSAT Satellite Application Facility on Numerical Weather Prediction (NWP SAF) 91-level dataset [51] sampled for specific humidity. The RTTOV simulation code is used as the forward model to generate $y_a$ and $K$ appropriate to the AMSR2 instrument. A constant salinity of 35 PSU is assumed for all the profiles.
Prigent et al. [26] carried out a similar analysis for a new mission concept, Microwat, simulating retrievals based on AMSR-E channel sensitivities. They retrieved SST and wind speed assuming initial uncertainties on these two quantities of 3.31 K and 1.33 m·s⁻¹, respectively. They also carried out an information content analysis including water vapour content uncertainties of 10% on model levels. To provide comparability, we conduct an analysis below based on this specification using, as did Prigent et al. [26], a retrieval vector containing the four variables SST ($T_s$), the natural logarithm of TCWV ($W$) and the two wind-speed components ($u$, $v$):

$$x = \left[T_s, W, u, v\right]^{T} \qquad (16)$$

with an assumed-diagonal $S_a$ populated with a priori uncertainties of 3.31 K in SST, 10% in TCWV and 0.94 m·s⁻¹ for each wind component. We also extend this approach using a retrieval vector with five variables:

$$x = \left[T_s, W, u, v, L\right]^{T} \qquad (17)$$

that includes the logarithm of TCLW ($L$). With this formulation, we use a priori uncertainties of 1 K in SST, 10% in TCWV, 1.41 m·s⁻¹ in each wind component and 10% in TCLW. Retrieving the logarithm of the integrated column values avoids retrieving unphysical negative estimates for quantities bounded at zero. The fractional uncertainties expressed on the quantities TCWV and TCLW transform into absolute uncertainties when expressed in log-space since, for a fractional uncertainty $f$ on a quantity $a$,

$$\delta a = f a, \qquad (18)$$

where $L = \ln a$ and the absolute uncertainty in $L$ is

$$\delta L = \frac{dL}{da}\,\delta a = \frac{\delta a}{a} = f. \qquad (19)$$

$S_\epsilon$ is also assumed to be diagonal with values filled by the NEdT for each AMSR2 channel. In ascending order of frequency, these are (0.34, 0.43, 0.7, 0.7, 0.6, 0.7, 1.2) K with both H- and V-components having the same value [52].
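As a concrete illustration, the diagonal covariance matrices for the five-variable case could be assembled as below. The ordering of the state vector and the interleaving of V and H channels are assumptions made for this sketch. The numerical values are those quoted in the text.

```python
import numpy as np

# Assumed state order: [SST (K), ln TCWV, u (m/s), v (m/s), ln TCLW].
# A 10% fractional uncertainty becomes an absolute uncertainty of 0.10 in log-space.
prior_sigma = np.array([1.0, 0.10, 1.41, 1.41, 0.10])
S_a = np.diag(prior_sigma**2)

# NEdT in ascending order of frequency; V and H share the same value,
# so each entry is repeated to give 14 channels (ordering assumed).
nedt_per_frequency = np.array([0.34, 0.43, 0.7, 0.7, 0.6, 0.7, 1.2])
nedt = np.repeat(nedt_per_frequency, 2)
S_eps = np.diag(nedt**2)

# Check of the first-order log-space approximation for f = 10%.
f = 0.10
exact_log_shift = np.log1p(f)   # ln(a * (1 + f)) - ln(a)
print(S_a.shape, S_eps.shape, round(float(exact_log_shift), 4))
```

The exact log shift for a 10% perturbation differs from $f$ by less than 0.005, so the first-order approximation is adequate at this level of uncertainty.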

Information Content Analysis
The degrees of freedom for signal $d_s$, using all 14 channels, for each of the considered profiles, is shown in Figure 5. The mean value is $\bar{d}_s = 2.86$ for the four-variable retrieval vector and $\bar{d}_s = 3.09$ when using the five-variable vector. These values are lower in both cases than the number of retrieved quantities and likely reflect the weak constraint that the observations place on the separate wind-speed components. There is also a noticeably wider spread of $d_s$ values for the five-variable retrievals compared to the four-variable cases.
The estimated retrieval uncertainty matrix was calculated from Equation (14) for every profile for all possible channel combinations. For a given channel combination, we define the estimated average SST retrieval uncertainty ($s$) as the root mean squared expected uncertainty for SST across the profile set, i.e.,

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \hat{S}_{11,i}}, \qquad (20)$$

where $n$ is the number of profiles (2680) that are indexed by $i$ and $\hat{S}_{11,i}$ is the SST variance element of the expected covariance matrix for profile $i$. Figure 6 shows $s$ for the single-channel-only retrievals, illustrating which channels make the greatest individual contribution to reducing uncertainty in retrieved values of SST.
Figure 6.
Estimated SST retrieval uncertainty for a single-channel retrieval from information content analysis. The upper set of channels and solid line assume a four-variable retrieval vector and prior SST uncertainty of 3.31 K. The lower set and dashed line assume a five-variable retrieval vector, in which cloud liquid water is additionally accounted for, and a priori SST uncertainty of 1.0 K. The prior uncertainty values in each case are shown in long-dashed lines.
Figure 7 shows the smallest value of s when a given number of channels is included in the observation vector along with the best channel to add.This is summarised in Tables 1 and 2.
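The exhaustive search over channel combinations can be sketched as follows. This is not the code used for the analysis: it evaluates a single toy profile with a randomly generated Jacobian, whereas the analysis above takes the root mean square over 2680 RTTOV-derived profiles, and the helper names are hypothetical. With 14 channels there are $2^{14} - 1 = 16383$ non-empty combinations, so a brute-force search is tractable.

```python
from itertools import combinations

import numpy as np

def sst_uncertainty(K, S_a, nedt, channels):
    """Expected SST uncertainty (K) for one profile and one channel subset."""
    idx = list(channels)
    K_sub = K[idx, :]
    S_eps_inv = np.diag(1.0 / nedt[idx] ** 2)
    S_hat = np.linalg.inv(K_sub.T @ S_eps_inv @ K_sub + np.linalg.inv(S_a))
    return float(np.sqrt(S_hat[0, 0]))   # SST assumed to be state element 0

def best_subset(K, S_a, nedt, size):
    """Exhaustive search for the subset of `size` channels minimising uncertainty."""
    candidates = combinations(range(K.shape[0]), size)
    return min(candidates, key=lambda c: sst_uncertainty(K, S_a, nedt, c))

# Toy problem: 6 hypothetical channels, 3 state variables (values are made up).
rng = np.random.default_rng(1)
K_toy = rng.normal(size=(6, 3))
S_a_toy = np.diag(np.array([1.0, 0.1, 1.41]) ** 2)
nedt_toy = np.full(6, 0.5)

results = []
for size in range(1, 7):
    subset = best_subset(K_toy, S_a_toy, nedt_toy, size)
    results.append((subset, sst_uncertainty(K_toy, S_a_toy, nedt_toy, subset)))
    print(size, subset, round(results[-1][1], 3))
```

Because every added channel contributes non-negative information in this idealised setting, the best achievable uncertainty can only stay the same or decrease as the subset grows.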
Figure 7.
Estimated SST retrieval uncertainty against the number of AMSR2 channels used, for the best combination of the given number of channels, based on information content analysis. The new channel added to the set is indicated at each step and is chosen on the basis of minimizing the SST retrieval uncertainty. The upper set of channels and solid line assume a four-variable retrieval vector and the lower set and dashed line assume a five-variable retrieval vector.


Simulated Retrieval
Simulated retrievals were carried out by randomly perturbing the NWP SAF profiles according to the $S_a$ uncertainties for the two cases. A 10% variation was also applied to the total cloud liquid water (TCLW) profiles for the four-variable case even though this was not a retrieved variable. The water vapour and CLW values on each level of the profiles were uniformly scaled to give the perturbed TCWV and TCLW values. These perturbed profiles were treated as the unknown true values and corresponding simulated observations $y$ were generated using RTTOV with random noise added consistent with $S_\epsilon$. The unperturbed profiles were used both as the a priori state and linearisation point from which $y_a$ and $K$ were generated, again using values obtained from RTTOV.
The simulated retrieval error was calculated for every profile for all possible channel combinations. For a given channel combination, we define the simulated uncertainty ($\sigma$) as the standard deviation of the SST retrieval errors ($e$) across the profile set. Thus, for any retrieval

$$e = \hat{x}_1 - x_1 \qquad (21)$$

and, for a given channel combination,

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(e_i - \bar{e}\right)^2}, \qquad (22)$$

where $\bar{e}$ is the mean retrieval error across the profile set. Figure 8 shows the values of $\sigma$ for single-channel-only retrievals, again illustrating which channels make the greatest individual contribution to a retrieval of SST. Figure 9 shows the smallest value of $\sigma$ for a given number of channels included in the observation vector along with the best new channel to add to the existing set. These results are also summarised alongside the information content analysis in Tables 1 and 2.
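The simulation procedure can be mimicked with a purely linear toy forward model, for which the simulated uncertainty should converge on the value predicted by the information content analysis. The sketch below replaces RTTOV with a random Jacobian and uses made-up dimensions and uncertainties, so it illustrates the method only and does not reproduce the results reported here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_profiles, n_state, n_chan = 20000, 3, 6

K = rng.normal(size=(n_chan, n_state))      # toy Jacobian (assumed, not from RTTOV)
prior_sigma = np.array([1.0, 0.1, 1.41])    # element 0 plays the role of SST
noise_sigma = np.full(n_chan, 0.5)
S_a = np.diag(prior_sigma**2)
S_eps_inv = np.diag(1.0 / noise_sigma**2)

# Expected covariance and gain matrix of the linear retrieval.
S_hat = np.linalg.inv(K.T @ S_eps_inv @ K + np.linalg.inv(S_a))
gain = S_hat @ K.T @ S_eps_inv

# "True" states are the prior perturbed according to S_a; observations are
# the forward-modelled truth plus noise consistent with S_eps.
dx_true = rng.normal(size=(n_profiles, n_state)) * prior_sigma
dy = dx_true @ K.T + rng.normal(size=(n_profiles, n_chan)) * noise_sigma

dx_hat = dy @ gain.T                        # retrieved departure from the prior
errors = dx_hat[:, 0] - dx_true[:, 0]       # retrieval error for element 0

sigma = float(errors.std())                 # simulated uncertainty
s = float(np.sqrt(S_hat[0, 0]))             # predicted uncertainty
print(round(sigma, 3), round(s, 3))
```

Agreement between the two numbers is expected only when the retrieval vector captures all of the variability present in the simulated observations, which is the condition examined in the Discussion.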

Figure 9.
The standard deviation of the SST retrieval error for the best combination of the given number of channels from simulated retrievals. The additional channel added to the set is indicated at each step. The upper set of channels and solid line assume a four-variable retrieval vector and the lower set and dashed line assume a five-variable retrieval vector.
Table 1.
The standard deviation of the sea surface temperature (SST) retrieval error and root-mean-squared predicted SST uncertainty from an information content analysis across all profiles for varying numbers of channels. The channel indicated on each row is the best one to add to the existing channel set for retrieving SST.

Table 2.
The standard deviation of the retrieval error and root-mean-squared predicted uncertainties from an information content analysis across all profiles using the best channel combinations for simulated SST retrieval, i.e., 10 channels for the 4-variable retrievals and all 14 channels for the 5-variable retrievals.

Discussion
The OE framework provides a mechanism for combining all available information relating to an inverse problem with appropriate weighting. Since each channel brings some information, adding more channels to the observation vector results in progressively improving retrieval uncertainties if all sources of uncertainty are well-described by the error covariances used, and if the retrieved variables account for all significant variability in the observations. This is the behaviour that we see in the information content analyses summarised in Table 1 and Figure 7, where the predicted uncertainty monotonically decreases to the all-channel value at a declining rate as less informative channels are added.
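The monotonic decrease of predicted uncertainty can be illustrated with a short sketch. It assumes a linear forward model and uncorrelated channel errors, in which case adding one channel is a rank-one update of the state error covariance. It also orders channels greedily rather than by the exhaustive search over all combinations used in this study. The two-variable state, the Jacobian rows and the noise variances are invented for illustration and are not values from the analysis.

```python
def add_channel(S, k, noise_var):
    """Update the state error covariance S after adding one channel.

    k is the channel's Jacobian row and noise_var its error variance.
    For uncorrelated channel errors this rank-one update is equivalent to
    recomputing (K^T S_e^-1 K + S_a^-1)^-1 with the extra row included.
    """
    n = len(S)
    Sk = [sum(S[i][j] * k[j] for j in range(n)) for i in range(n)]
    denom = sum(k[i] * Sk[i] for i in range(n)) + noise_var
    return [[S[i][j] - Sk[i] * Sk[j] / denom for j in range(n)] for i in range(n)]


def greedy_channel_order(S_a, jacobian, noise_var, target=0):
    """Repeatedly add the channel that most reduces the target variable's variance."""
    S = [row[:] for row in S_a]
    remaining = dict(jacobian)
    order = []
    while remaining:
        best = min(
            remaining,
            key=lambda c: add_channel(S, remaining[c], noise_var[c])[target][target],
        )
        S = add_channel(S, remaining.pop(best), noise_var[best])
        order.append((best, S[target][target] ** 0.5))
    return order


# Toy two-variable state [SST, ln(TCWV)]; all numbers are invented for illustration.
S_a = [[1.0, 0.0], [0.0, 0.25]]
jacobian = {"6.9 V": [0.5, 0.3], "7.3 V": [0.5, 0.35], "36.5 H": [0.05, 2.0]}
noise_var = {"6.9 V": 0.25, "7.3 V": 0.25, "36.5 H": 0.25}

order = greedy_channel_order(S_a, jacobian, noise_var)
```

Each entry of `order` pairs a channel with the predicted SST uncertainty after that channel has been added. Because every update subtracts a positive semi-definite matrix from the covariance, the sequence of uncertainties cannot increase, which is the behaviour expected when all sources of variability are represented.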
The simulated uncertainty using the four-variable retrieval vector and $S_{a,\mathrm{4var}}$ shows different behaviour, with the uncertainty increasing with added channels after the 10th. This arises because TCLW is missing from the retrieval vector. The OE method uses a forward model run with the a priori values for the quantities in the state vector, $x_a$, to generate simulated observations. The differences between the simulated and observed values are ascribed to deviations between the a priori values in $x_a$ and their true values. However, if the observed radiances additionally include variability due to TCLW (which is not in the 4-variable state vector), the scheme can only interpret any observational differences in terms of the other four state-vector variables. This misattribution is naturally largest for those channels that are most sensitive to TCLW, where the "observed" values are most affected and which therefore result in the largest retrieval errors. These channels thus drop down the ranking of the best channel to add to the scheme. This effect is most obviously demonstrated in that adding the four least-favoured channels actually increases the SST retrieval error.
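The misattribution argument can be illustrated with a scalar toy problem. This is a sketch under strong assumptions, not the scheme used in this study: the state is SST alone, the forward model is linear, and the omitted variable fluctuates about the value assumed in the forward model with zero mean and independently of SST and of the channel noise. All numbers are invented.

```python
def retrieval_variances(prior_var, channels, omitted_var):
    """Predicted and actual SST error variance for a scalar linear OE retrieval.

    channels is a list of (k, b, noise_var): k is the sensitivity to SST and
    b the sensitivity to a variable that is omitted from the state vector.
    """
    predicted = 1.0 / (1.0 / prior_var + sum(k * k / r for k, _, r in channels))
    # Gain of each channel is predicted * k / r; the omitted variable leaks
    # into the retrieved SST through the gain-weighted sum of b.
    leakage = sum(predicted * k / r * b for k, b, r in channels)
    actual = predicted + leakage ** 2 * omitted_var
    return predicted, actual


# Invented numbers: channel A responds to SST only,
# channel B responds mostly to the omitted variable.
channel_a = (0.5, 0.0, 0.25)
channel_b = (0.1, 2.0, 0.25)

pred_a, actual_a = retrieval_variances(1.0, [channel_a], omitted_var=0.25)
pred_ab, actual_ab = retrieval_variances(1.0, [channel_a, channel_b], omitted_var=0.25)
```

In this toy case, adding channel B lowers the predicted variance while raising the actual error variance, mirroring the divergence between estimated and simulated uncertainty once channels sensitive to an unrepresented variable are included.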
When TCLW is included in the retrieval vector, the estimated uncertainty (s) and the simulated uncertainty (σ) behave consistently. Figure 5 bears out the above interpretation. Here, the degrees of freedom for signal of the four-variable retrieval have a lower mean value across the profile set, while those of the five-variable retrieval have a larger mean and a spread of values. In the five-variable retrieval case, the degrees of freedom for signal steadily increase with TCLW up to approximately 0.3 kg·m$^{-2}$ (which includes 90% of the profiles) before plateauing. They then slowly decline again above about 1 kg·m$^{-2}$ (4% of the profiles). The five-variable results indicate that a retrieval uncertainty for SST of ∼0.37 K may be achievable if TCLW is explicitly accounted for, whereas neglecting that aspect of variability would limit the achievable SST uncertainty to ∼0.45 K.
To check that the above difference is a result of including TCLW in the vector rather than merely an effect of the different a priori error covariance assumptions in the two case studies, we calculated results for a third configuration (not shown). This used the 5-variable retrieval vector with the error covariance assumptions used in the 4-variable case study. The error covariance assumption for L was $0.1^2$, as used in the 5-variable case. When including all 14 channels, the values of s = 0.383 K and σ = 0.395 K are comparable to the 5-variable case. The value of σ also decreases monotonically as channels are added to the scheme. This comparison indicates that expanding the retrieval vector matters more than the choice of error covariance assumptions.
The analyses suggest a preferential ordering of channels for inclusion in the observation vector. We can interpret the channel ordering through Figures 10 and 11 for low- and high-TCLW profiles, respectively. In these figures, the axes represent the brightness temperature in pairs of channels in the order suggested by the five-variable information content analysis. The sensitivity of the two channels with respect to the retrieved quantities is scaled to a "typical" change in brightness temperature by multiplying by the a priori uncertainty on the quantities. In Figure 10, panel (a) shows that the leading two channels (6.9 V and 7.3 V) are principally sensitive to SST and TCWV, with only small contributions from the other variables. In this case, the difference between the modelled and observed retrieval vectors is interpreted in proportion to the a priori uncertainties expressed as radiances. Panel (b) shows the next pair (7.3 V and 36.5 H) with very different responses for SST and TCWV. The 36.5 H channel is largely insensitive to SST in comparison to large changes due to TCWV, and it is consequently possible to remove the previous ambiguity and separate the two variables in the retrieval. It is not until the third pair (36.5 H and 6.9 H) that it begins to be possible to resolve wind speed effects and thus refine the small contributions they made to brightness temperature changes in the earlier channel combinations. The fact that the two wind-speed components are largely co-linear suggests that it is difficult to discriminate their individual contributions. This is the main reason that $d_s$ is less than the number of state-vector variables.
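The construction of the sensitivity arrows can be sketched as follows: each arrow is the pair of Jacobian entries for two channels multiplied by the a priori uncertainty of the variable, and the absolute cosine of the angle between two arrows gives a rough measure of how co-linear two variables appear in that channel pair. The co-linearity measure and all numerical values are assumptions made for illustration; they are not taken from Figures 10 and 11.

```python
import math


def sensitivity_arrow(dy_dx, prior_sigma):
    """Arrow for one variable: Jacobian entries for two channels times the prior sigma."""
    return (dy_dx[0] * prior_sigma, dy_dx[1] * prior_sigma)


def colinearity(a, b):
    """Absolute cosine of the angle between two arrows (1 = co-linear, 0 = orthogonal)."""
    dot = a[0] * b[0] + a[1] * b[1]
    return abs(dot) / (math.hypot(*a) * math.hypot(*b))


# Invented sensitivities (K per unit of each variable) for a hypothetical channel pair.
sst_arrow = sensitivity_arrow((0.50, 0.05), prior_sigma=1.0)
tcwv_arrow = sensitivity_arrow((0.30, 2.00), prior_sigma=0.5)
u_arrow = sensitivity_arrow((0.02, 0.04), prior_sigma=2.0)
v_arrow = sensitivity_arrow((0.03, 0.06), prior_sigma=2.0)
```

With these made-up numbers, the SST and TCWV arrows point in clearly different directions, so the channel pair can separate them, whereas the u and v arrows are parallel and their individual contributions cannot be distinguished from this pair alone.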
The high-TCLW profile shown in Figure 11 suggests significant ambiguity for brightness temperature changes between SST, TCWV and TCLW for the 6.9 V and 7.3 V pair of channels. While the remaining panels show the effect of SST now being distinguishable from both TCWV and TCLW, these latter two variables remain largely co-linear. This figure also shows different sensitivities for the two wind-speed components, largely indicative of the change in wind-speed sensitivity with wind speed. The u-component of the wind speed in this case is significantly smaller than the v-component. Consequently, the u-component sensitivity arrow is barely visible, whereas the v-component shows changes in some of the channel combinations comparable to TCLW.
As alluded to in Section 2, modelling the emissivity in the MW region, and particularly the wind speed dependency, is a difficult task. In an effort to assess the effect of any shortcomings of the forward model in this respect, the information content and retrieval analyses were rerun with the sensitivity of brightness temperature to each of the wind components doubled in the Jacobian matrix K. The results are summarised in Table 3. From the information content analysis, the expected SST uncertainties for both retrieval vectors with all channels included change by around 0.01 K and, although there is some slight reordering of the channels, the top five remain the best five to include. For the simulated retrievals with a five-variable retrieval vector, the 10.7 H channel has been promoted into the top five, but the best 14-channel retrieval changes by only 0.001 K. In the four-variable simulated retrieval case, the change in the best retrieval error value is similarly small. Here, though, there are no changes to the channel order down to 7th place, perhaps reflecting that the absence of TCLW from the retrieval vector dominates the ordering.
As mentioned in relation to the increasing retrieval errors for the four-variable retrieval, including all channels in the retrieval is not necessarily the best approach in practice, since there may be unrepresented physical processes or error sources (such as calibration errors) or poorly estimated covariance matrices. Given the reasonable consistency of the channel ordering for the five-variable retrievals, we conclude that including the top five or six channels here is the optimum approach in practice when estimating SST using AMSR2.
Table 3. The standard deviation of the SST retrieval error and root-mean-squared predicted SST uncertainty from an information content analysis across all profiles for varying numbers of channels with the sensitivity to wind speed doubled in the Jacobian matrix.

Conclusions
This information content analysis and simulated retrieval study for AMSR2 provides an ordered list of the best channels to use to retrieve SST in OE. In practice, the top five or six would be used in an OE scheme to minimise the accumulation of poorly understood errors (such as from calibration). The recommended channel set is 6.9 V, 6.9 H, 7.3 V, 10.7 V and 36.5 H. The 6.9 V and 7.3 V channels contribute the greatest SST sensitivity to the retrieval, and the contribution of TCWV is separated out with the addition of the 36.5 H channel. The 6.9 H and 10.7 V channels add discrimination of the wind speed effects. These results will govern our approach in future work applying OE to real AMSR2 data for SST retrieval.

Figure 2. The change in emissivity as a function of wind speed using the FASTEM model for sea surface temperature (SST) = 297 K. All channels are included, ranging from 6.9 GHz (red) to 89 GHz (purple), with V-polarized channels indicated by solid lines and H-polarized channels by dashed lines.

Figure 3. Deviation of emissivity from the azimuthal mean as a function of wind speed components. The lack of symmetry implies that the observation contains some information about the individual wind-speed components. The satellite azimuth angle has been chosen to be 37° here and is indicated by the arrow. A value of 0° would align the pattern along the v-axis.

Figure 4. The change in measured brightness temperature as a function of total cloud liquid water. The data were modelled by RTTOV for the AMSR2 instrument using the same atmospheric profile but for scaled total cloud liquid water (TCLW). All channels are included, ranging from 6.9 GHz (red) to 89 GHz (purple), with V-polarized channels indicated by solid lines and H-polarized channels by dashed lines.

Figure 5. Degrees of freedom for signal across the profile set used in the information content analysis using all 14 AMSR2 channels. Results assuming a four-variable retrieval vector are shown with open red bars and the five-variable retrieval vector with hatched blue bars.

Figure 8. The standard deviation of the SST retrieval error for a single-channel retrieval from information content analysis. The upper set of channels and solid line assume a four-variable retrieval vector and prior SST uncertainty of 3.31 K. The lower set and dashed line assume a five-variable retrieval vector, in which cloud liquid water is additionally accounted for, and prior SST uncertainty of 1.0 K. The prior uncertainty values in each case are shown with long-dashed lines.


Figure 10. Sensitivity diagrams for an example atmospheric profile with low total cloud liquid water (SST = 296 K, TCWV = 36.2 kg·m$^{-2}$, u = 4.8 m·s$^{-1}$, v = −8.3 m·s$^{-1}$, TCLW = 0.017 kg·m$^{-2}$). Each arrow indicates, for two channels, the product of the brightness temperature sensitivity to a variable and the a priori uncertainty in that variable, i.e., each arrow shows $\frac{\partial y}{\partial x_i}\sigma_i$ for the channels in y indicated on the figure axes. This is a measure of the magnitude of the response of the observations to the range of uncertainty in each retrieved variable. The variables are SST (black), ln(TCWV) (red), u (green), v (blue) and ln(TCLW) (cyan). The pairs of channels in each diagram are ordered (a-d) in accordance with the information content analysis for a five-variable retrieval vector in Table 1.

Figure 11. Sensitivity diagrams for an example atmospheric profile with high TCWV and TCLW (SST = 292 K, TCWV = 40.8 kg·m$^{-2}$, u = −2.7 m·s$^{-1}$, v = 14.8 m·s$^{-1}$, TCLW = 0.99 kg·m$^{-2}$). Each arrow indicates, for two channels, the product of the brightness temperature sensitivity to a variable and the a priori uncertainty in that variable, i.e., each arrow shows $\frac{\partial y}{\partial x_i}\sigma_i$ for the channels in y indicated on the figure axes. This is a measure of the magnitude of the response of the observations to the range of uncertainty in each retrieved variable. The variables are SST (black), ln(TCWV) (red), u (green), v (blue) and ln(TCLW) (cyan). The pairs of channels in each diagram are ordered (a-d) in accordance with the information content analysis for a five-variable retrieval vector in Table 1.