Assessment of the Hyperspectral Infrared Atmospheric Sounder (HIRAS)

The hyperspectral infrared atmospheric sounder (HIRAS), the first Chinese hyperspectral infrared instrument, was launched in 2017 on board the fourth polar orbiter of the Feng Yun 3 series, FY-3D. The instrument is a Fourier transform spectrometer with 2275 channels covering three spectral bands (650–1136, 1210–1750, and 2155–2550 cm −1) with 0.625 cm −1 spectral resolution. The first data quality assessment of HIRAS observations at full and normal spectral resolutions is presented. Comparisons with short-range forecasts from the Met Office numerical weather prediction (NWP) global system have revealed biases (standard deviations) generally less than 2.6 K (2 K) in the spectral regions mostly unaffected by trace gases, where the confidence in the NWP model is largest. Of particular concern, HIRAS detector 3 seems to suffer from sunlight contamination of its calibration towards the end of the descending node. This, together with an obstruction of the detector field of view by an element of the platform, results in accentuated bias and noise in the observations from this detector. At normal spectral resolution, a background departure double difference analysis has been conducted between HIRAS and the NOAA-20 crosstrack infrared sounder (CrIS). The results show that HIRAS and CrIS are in good agreement, with a mean difference across the three bands of –0.05 K (±0.26 K at 1σ) and 75.2% of the channels within CrIS radiometric uncertainty, although HIRAS is noisier than CrIS with, on average, a standard deviation 0.34 K larger.


Introduction
Hyperspectral infrared instruments such as the atmospheric infrared sounder (AIRS) and the crosstrack infrared sounder (CrIS) on board the U.S. platforms Aqua, Suomi-National Polar-Orbiting Operational Environmental Satellite System Preparatory Project (SNPP), and NOAA-20, and the infrared atmospheric sounding interferometer (IASI) on board the European Metop platforms have been a fundamental part of the Earth observation system since the early 2000s [1][2][3]. Their high spectral resolution provides information on atmospheric temperature and humidity, sea and land surface temperature, cloud amount, water content and top height, precipitation and atmospheric composition, and complements the historical records of legacy infrared instruments of lower spectral resolution. The data coming from these instruments have become of foremost importance for numerical weather prediction (NWP) centres [4][5][6][7][8] as well as the climate community (e.g., [1]).
The launch of a new instrument is accompanied by a validation phase aimed at assessing the geometric, spectral, and radiometric calibrations upon which the data quality depends. While frequency stability and radiometric uncertainty can be estimated from pre-launch laboratory measurements and on-board calibration systems, methods based on collocated observations from an imaging radiometer, or on the detection of high contrast targets like coast lines, are needed to assess the geolocation calibration [9][10][11][12].
Strategies for the assessment of the data quality can be categorised into three main groups. First, the matchup of satellite data with conventional observations is produced on a regular basis at observation sites or with high intensity over short periods of time for dedicated campaigns. This strategy has, for example, been adopted for the comparison of IASI radiances with radiance measured by airborne interferometers obtained during the satellite overpasses [13,14].
Second, satellite-to-satellite cross validation techniques involve simultaneous nadir overpasses (SNO) or double differences. SNO have been used to compare, for example, IASI and AIRS [15,13], CrIS, IASI, and AIRS [16], or more recently, SNPP CrIS and NOAA-20 CrIS [17]. However, this technique requires stringent collocation criteria to reduce the collocation uncertainty and, depending on the orbit patterns, is often limited to high latitudes [18]. Double differences, on the other hand, allow for the indirect comparison of two sensors through a comparison with a unique reference that acts as a radiative transfer medium. The reference can be observations from an independent sensor (e.g., [15,19] used the geostationary operational environmental satellite (GOES) imagers for the comparison of AIRS and IASI) or short-range forecasts from an NWP system, as described by [20]. The working assumption behind double differences is that comparisons should be free from the biases in the reference and only reflect the biases in the data sets that are being compared, such as radiometer nonlinearity, spectral response function inaccuracy, or calibration error.
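The working assumption behind double differences can be illustrated with a minimal Python sketch. It uses synthetic numbers only: the function name double_difference, the bias values, and the simplification that both sensors see the same scenes and the same reference are assumptions made for illustration, not part of the method described in the cited works.

```python
import numpy as np

def double_difference(obs_a, ref_a, obs_b, ref_b):
    """Per-channel mean (A - reference) minus mean (B - reference)."""
    dep_a = np.mean(np.asarray(obs_a) - np.asarray(ref_a), axis=0)
    dep_b = np.mean(np.asarray(obs_b) - np.asarray(ref_b), axis=0)
    return dep_a - dep_b

# Synthetic brightness temperatures (K): 1000 scenes, 3 channels.
rng = np.random.default_rng(0)
truth = 250.0 + rng.normal(0.0, 5.0, size=(1000, 3))

reference_bias = 0.8         # hypothetical bias of the common reference
bias_a, bias_b = 0.30, 0.10  # hypothetical sensor biases

reference = truth + reference_bias
obs_a = truth + bias_a
obs_b = truth + bias_b

# The reference bias cancels; only the inter-sensor bias (0.30 - 0.10) remains.
dd = double_difference(obs_a, reference, obs_b, reference)
```

In practice the two sensors observe different scenes at different times, so the cancellation only holds to the extent that the reference bias is the same for both samples.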
Third, satellite observations are assessed against analyses and short-range forecasts from NWP models. The physically constrained, continuous, global, and homogeneous representation of the atmosphere in NWP models offers a practical reference comparator for the assessment of satellite observations, and this method has become common practice in NWP centres (e.g., [21,20,8]). Note that the assessments are carried out in radiance space (often expressed as brightness temperature) thanks to the radiative transfer models used by data assimilation systems [22]. Although this introduces spectroscopy-related uncertainty that adds to the uncertainty in the forecast fields and to the scale mismatch between observation and model resolution, [23,24] reported that the forecast uncertainty is sufficiently low (potentially as low as 0.08 K in the spectral domain where observations are sensitive to temperature) to characterise subtle biases in satellite observations.
In this study, we propose an assessment of the radiometric data quality of the new Chinese hyperspectral infrared atmospheric sounder (HIRAS) on board the FY-3D platform through a comparison with short-range forecasts from the Met Office NWP global system and a double comparison with NOAA-20 CrIS. Launched in 2017, FY-3D is on a sun-synchronous quasi-polar low Earth orbit (836 km) with an equator crossing time (ECT) of 14:00 on the ascending node. The FY-3D orbit closely follows that of NOAA-20, which crosses the equator on the ascending node at 13:25. HIRAS, the first Chinese hyperspectral infrared instrument in polar orbit, has been entirely designed and manufactured in China by the Shanghai Institute of Technical Physics and the Chinese Academy of Sciences. It is a Michelson interferometer of 2275 channels covering three spectral bands (650-1136, 1210-1750, and 2155-2550 cm −1) with 16 km nominal resolution at nadir [25,26]. At full spectral resolution (FSR), the spectral sampling is 0.625 cm −1 in all three bands, but a lower resolution product (referred to as normal spectral resolution, NSR) is also available, with 0.625 cm −1 in band 1, 1.25 cm −1 in band 2, and 2.5 cm −1 in band 3. The instrument has 116 pixels per scan arranged in 2 × 2 arrays, with two adjacent observation positions separated by 3.6°, and a full scan spans 50.4° on either side of the nadir. The combined 2 × 2 fields of view (FOVs) across each of the three focal planes define the field of regard (FOR) of the instrument, as illustrated in Figure 1. At nadir, the FOVs are aligned with the scan direction, but away from nadir, the pattern is rotated by an angle equal to the difference in scan angle (as is the case for CrIS). At each FOR, twelve double-sided interferograms (four FOVs per band) are recorded simultaneously with a maximum optical path difference (MPD) of 0.8 cm in the three bands.
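The instrument figures quoted above are mutually consistent, which a short Python check makes explicit. The sampling relation 1/(2 × MPD) is the standard one for an unapodized Fourier transform spectrometer; the function names are hypothetical, and the scan span is assumed here to be measured between the centres of the first and last FOR.

```python
def spectral_sampling(mpd_cm):
    """Unapodized spectral sampling interval (cm^-1) for a given MPD (cm)."""
    return 1.0 / (2.0 * mpd_cm)

def scan_geometry(pixels_per_scan, fovs_per_for, step_deg):
    """Number of Earth-view FORs and half-width of the scan (degrees)."""
    n_for = pixels_per_scan // fovs_per_for
    # Assumption: span measured between the first and last FOR centres.
    half_span_deg = (n_for - 1) * step_deg / 2.0
    return n_for, half_span_deg

sampling = spectral_sampling(0.8)               # MPD of 0.8 cm
n_for, half_span = scan_geometry(116, 4, 3.6)   # 116 pixels, 2 x 2 FOVs, 3.6 deg step

print(f"sampling = {sampling:.3f} cm^-1")
print(f"FORs per scan = {n_for}, half span = {half_span:.1f} deg")
```

With the values from the text, the sampling evaluates to 0.625 cm −1, and 116 pixels in 2 × 2 arrays correspond to 29 FORs spanning 50.4° on either side of nadir.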
A typical HIRAS cross-track scan takes about 10 seconds and consists of 33 interferometer sweeps composed of 29 Earth scenes, two deep space, and two internal calibration target (a high-precision black body) measurements with mirrors moving in both forward and reverse directions. The deep space measurements provide information on the internal emission of the spectrometer since the space radiance is close to zero in the infrared. Tables 1 and 2 summarize the instrument characteristics.
This paper is structured as follows. The methodology for the data quality assessment is explained in Section 2. Results are presented in Section 3 and discussed in Section 4. Section 5 concludes this study.

Methodology
Two months (February and March 2019) of global unapodized FSR HIRAS observations have been provided by CMA for assessment and experimental assimilation. The assessment of these observations has been carried out using two parallel methodologies: 1) FSR observations have been compared to short-range forecasts from the Met Office global model using the radiance simulator. The radiance simulator is a tool developed at the Met Office for the NWP satellite application facilities (NWP SAF) that enables the collocation in space and time of NWP model variables and their mapping in the observation spectral domain [27]; 2) HIRAS observations have been passively processed in an off-line version of the operational system. The latter methodology however allows for more efficient and sophisticated quality controls, as well as bias correction, but requires the data to be at NSR. Both methodologies are further described in the following sections.

Assessment of HIRAS FSR
For the assessment of HIRAS FSR observations, short-range forecasts from the Met Office global system were extracted from the operational archive. The operational forecast model has a N1280L70 resolution (about 10 km grid length in mid latitudes and 70 levels with model top at 80 km, with a 3-hour temporal resolution) while the underlying data assimilation system, a hybrid incremental four-dimensional variational analysis (4D-Var), runs at a N320L70 resolution (about 40 km in mid latitudes) with a 6-hour time window centred on nominal analysis time (0, 6, 12, 18 UTC) [28,29]. Daily errors are provided by the global ensemble every 6 hours. Assimilated satellite radiances are corrected with a variational bias correction similar to the scheme described by [30].
Observation latitude, longitude and time, as well as the satellite zenith angle and azimuth, are provided to the radiance simulator to compute the collocation and simulation. Input variables taken from the Met Office short-range forecasts are model pressure, temperature and specific humidity on pressure levels, and surface pressure, 2 m temperature and specific humidity, skin temperature, 10 m wind (u and v components), land-sea mask, sea-ice fraction, and orographic height. Note that for this assessment only observations over the ocean are considered (land and sea-ice are discarded due to the low confidence in surface emission and properties in these areas). The fast radiative transfer model RTTOV version 12 [22] is used to simulate top-of-the-atmosphere brightness temperature from the model fields. For these simulations, 54-level RTTOV coefficients have been used (available from https://www.nwpsaf.eu/site/software/rttov/download/coefficients/coefficient-download/). The emissivity model used by RTTOV for the ocean surface is the InfraRed EMISsivity model (IREMIS) [31].
HIRAS observations have been apodized with a Hamming function as described by the NOAA/STAR CrIS SDR Team [32] with the Hamming coefficient a = 0.23 and maximum optical path difference MPD = 0.8 cm, consistent with that used in the RTTOV coefficients. For the purposes of this assessment, cloudy pixels were screened out using a cloud mask developed by CMA specifically for HIRAS based on collocated observations from the medium resolution spectral imager-2 (MERSI-2) that operates on board the same platform.
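For reference, a Hamming apodization with coefficient a can be written either as a weighting of the interferogram, (1 − 2a) + 2a cos(πx/MPD), or, for a spectrum sampled at 1/(2 × MPD), as a three-point running filter with weights (a, 1 − 2a, a). The Python sketch below shows both forms with a = 0.23 and MPD = 0.8 cm. The function names are hypothetical, and leaving the two edge channels unchanged is a simplification of this sketch, not a statement about the processing used in the study.

```python
import numpy as np

def hamming_apodize(spectrum, a=0.23):
    """Three-point Hamming filter applied to an unapodized spectrum."""
    s = np.asarray(spectrum, dtype=float)
    out = s.copy()
    # Interior channels: weights (a, 1 - 2a, a) = (0.23, 0.54, 0.23).
    out[1:-1] = a * s[:-2] + (1.0 - 2.0 * a) * s[1:-1] + a * s[2:]
    # Edge channels are left as they are (simplification of this sketch).
    return out

def hamming_interferogram_weight(x_cm, mpd_cm=0.8, a=0.23):
    """Equivalent weighting as a function of optical path difference x (cm)."""
    x = np.asarray(x_cm, dtype=float)
    return (1.0 - 2.0 * a) + 2.0 * a * np.cos(np.pi * x / mpd_cm)

# A spectrally flat scene is unchanged by the filter.
flat = np.full(10, 250.0)
apodized_flat = hamming_apodize(flat)

# The weighting is 1 at zero path difference and 1 - 4a at x = MPD.
w0 = hamming_interferogram_weight(0.0)
w_mpd = hamming_interferogram_weight(0.8)
```

Because the three weights sum to one, the filter preserves the mean level of the spectrum while damping the side lobes of the unapodized instrument line shape.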

Assessment of HIRAS NSR
The collocation of observations and model fields, quality control, and retrieval of surface and cloud parameters are carried out in the Met Office observation processing system (OPS). In operations, OPS runs a N1280L70 resolution one-dimensional variational analysis (1D-Var) used to quality control and thin the observations, and derive physical parameters for the subsequent main 4D-Var. The background used for the comparison and the 1D-Var retrieval is the short-range forecast from the previous assimilation cycle, interpolated at the observation location and time. The fast radiative transfer model RTTOV version 12 and coefficients on 44 levels are used to map model variables in the observation spectral domain. Surface emissivity over ocean is calculated from the infrared surface emissivity model (ISEM) [31]. Note that both the RTTOV coefficients and the surface emissivity model differ from those used in the evaluation of HIRAS FSR, the implications of which are discussed further below. Cloud-cleared observations are obtained by retaining for each pixel only the channels where 90% of the Jacobian is above the cloud top (if any). Cloud top pressure and cloud fraction are retrieved using the minimum residual method [33] and used as a first guess for the 1D-Var cloud retrieval. In this assessment, only the clear pixels where all the channels have passed the cloud detection are retained. Note that CrIS benefits from an additional cloud detection test carried out in the ATOVS and AVHRR pre-processing Package (AAPP) [34]. This consists of a threshold applied to a scattering index based on the advanced technology microwave sounder (ATMS) window channels and the 89 GHz channel that is sensitive to ice cloud scattering effects, combined with an emission threshold applied over sea for ATMS 23.8 and 31.4 GHz channels. This test is not currently available for HIRAS.
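The channel screening rule (keep a channel only if 90% of its Jacobian lies above the cloud top) can be sketched as follows. This is an illustration under stated assumptions, not the OPS implementation: the function name is hypothetical, and the fraction is assumed here to be the sum of the absolute temperature Jacobian on levels above the cloud top divided by the sum over all levels.

```python
import numpy as np

def channels_above_cloud(jacobian, pressure_hpa, cloud_top_hpa, threshold=0.9):
    """Boolean mask of channels whose Jacobian lies mostly above the cloud top.

    jacobian     : array (n_channels, n_levels), e.g. dBT/dT per level
    pressure_hpa : array (n_levels,), pressure of each level
    """
    weight = np.abs(np.asarray(jacobian, dtype=float))
    above = np.asarray(pressure_hpa, dtype=float) < cloud_top_hpa
    total = weight.sum(axis=1)
    # Fraction of the (absolute) Jacobian located above the cloud top.
    fraction = np.divide(weight[:, above].sum(axis=1), total,
                         out=np.zeros_like(total), where=total > 0)
    return fraction >= threshold

# Two synthetic channels on five levels, cloud top at 400 hPa.
pressure = np.array([100.0, 300.0, 500.0, 800.0, 1000.0])
jac = np.array([[0.50, 0.45, 0.05, 0.00, 0.00],   # stratospheric-type channel
                [0.00, 0.10, 0.20, 0.40, 0.30]])  # lower-tropospheric channel
keep = channels_above_cloud(jac, pressure, cloud_top_hpa=400.0)
```

In this example only the first channel is retained; a pixel would count as clear in the sense used here when every channel passes.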
To be OPS-compatible, HIRAS observations have been pre-processed as follows. The HIRAS test data analysed in this report were prepared by CMA in HDF5 format, with a spectral sampling of 0.625 cm −1 in all three bands. These data are compatible with version 8.4 of the AAPP. The data are ingested and converted to AAPP "level 1c" binary format using the tool hiras_sdr. A Hamming apodization similar to that described in the previous section is applied at this stage. Next, the data are degraded to normal spectral resolution using the AAPP tool hiras_degrade_fsr. In brief, this works by performing a fast Fourier transform (FFT) on the spectra from bands 2 and 3, modifying the amplitudes in the interferogram domain (according to the properties of the desired Hamming functions), and then performing an inverse FFT. To eliminate discontinuities at the band edges, before carrying out any FFT each spectrum is padded to a power of 2 (1024 points) then further padded by appending its mirror image. Next, a channel selection is done. We use a channel selection (399 channels) that is identical to that currently used at the Met Office in CrIS operations [8,35,36]. Finally, the data are converted to BUFR format using a locally defined BUFR sequence (available through AAPP) and stored in the Met Office's observation database (MetDB).
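The padding and mirroring step described above can be sketched in Python. This is not the code of hiras_degrade_fsr: the function name, the choice of repeating the edge value when padding, and the weighting function passed in are all assumptions, and the final resampling onto the coarser NSR channel grid is left out.

```python
import numpy as np

def filter_via_mirrored_fft(spectrum, weight_fn, n_pad=1024):
    """Weight a spectrum in the interferogram domain using a mirrored FFT."""
    s = np.asarray(spectrum, dtype=float)
    n = s.size
    if n > n_pad:
        raise ValueError("spectrum is longer than the padded length")
    # Pad to a power of two. Repeating the edge value is an assumption;
    # the text does not state which padding values are used.
    padded = np.pad(s, (0, n_pad - n), mode="edge")
    # Appending the mirror image makes the sequence periodic without a jump.
    mirrored = np.concatenate([padded, padded[::-1]])
    interferogram = np.fft.rfft(mirrored)
    # Normalised position along the interferogram (0 = zero path difference).
    x = np.arange(interferogram.size) / (interferogram.size - 1)
    filtered = np.fft.irfft(interferogram * weight_fn(x), n=mirrored.size)
    return filtered[:n]

# 865 points, e.g. a band sampled from 1210 to 1750 cm^-1 every 0.625 cm^-1.
band = np.linspace(220.0, 280.0, 865)

# With unit weights the round trip returns the input spectrum.
unchanged = filter_via_mirrored_fft(band, lambda x: np.ones_like(x))

# A hypothetical taper that equals 1 at zero path difference.
smoothed = filter_via_mirrored_fft(band, lambda x: 0.54 + 0.46 * np.cos(np.pi * x))
```

The unit-weight case is a useful sanity check that the padding and mirroring do not alter the spectrum before any weighting is applied.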
For this assessment, HIRAS data are used passively (i.e., not assimilated in 4D-Var); the background is therefore independent from HIRAS when compared to raw observations. In contrast, CrIS (shown below for comparison) is actively assimilated into the system, resulting in analyses and subsequent forecasts being constrained by the value of its observations (depending on the weight given to the measurements, which is determined by the observation errors). Consequently, the difference between CrIS and model background is expected to be slightly lower than the difference between HIRAS and model background.

Results
Focusing first on the FSR data set, background departures from the Met Office short-range forecasts (O-B) have been calculated for each of the four HIRAS detectors over the two-month period of the data set using only clear sky observations over ocean. The global mean is shown in Figure 2. The 1σ standard deviation in O-B is shown in Figure 3. The background departure for temperature sensitive channels in the carbon dioxide long-wave band (CO2; channel wavenumbers less than 760 cm −1) varies within ±2.6, ±1.6, and ±1.7 K for detectors 1, 2, and 4, respectively. Detector 3 stands out with a bias varying from -1.6 to 3.5 K, larger in the uppermost peaking channels. The standard deviation is also the largest for detector 3 (≤ 2 K) and, to a lesser extent, detector 4 (≤ 1.7 K) for these channels. The window channels (760-980 and 1095-1135 cm −1) present a greater consistency with the background, with O-B on average within ±0.2 K and standard deviation less than or equal to 1.3 and 1.5 K before and after the ozone (O3) band, respectively. One exception is the last channels (1135 cm −1) of the long-wave band, for which a 4 K O-B is seen for all four detectors. This is likely caused by the presence of a chemical component in the atmosphere that has not been accounted for in the generation of RTTOV coefficients, such as phosphine (PH3), which has a strong absorption line located at 1135.5 cm −1. The O3 band (980-1095 cm −1) is subject to larger departures, up to 11 K (3.8 K standard deviation), principally because the model uses a zonally symmetric O3 climatology, which makes the forward modelling in this portion of the spectrum inaccurate [37].
In the mid-wave band, channels of wavenumbers less than 1300 cm −1 are predominantly sensitive to temperature, and to a lesser extent humidity and methane (CH4). A cold bias (≥ -2 K), consistent across the detectors, is found in the mid-tropospheric peaking channels (~1250-1300 cm −1) and is likely due to the misrepresentation of trace gases such as CH4 in the model, which assumes these gases to be well mixed and uses fixed (or time-varying) concentrations [37]. Similarly, between 1300 and 1370 cm −1, several trace gases have significant absorption lines, such as CH4, dinitrogen monoxide (N2O), or nitric acid (HNO3), resulting in large negative departures (≥ -5.5 K) and peaks in the standard deviation (≤ 1.4 K). Beyond 1370 cm −1, observations are driven by the sensitivity to mid-tropospheric temperature and humidity all the way to 1750 cm −1. These channels are noisier, with O-B on average 0.4 K (from -1.2 to 2.2 K across detectors 1, 2, and 4) and a standard deviation varying from 0.5 to 1.8 K. Detector 3 stands out once again with significantly lower O-B (from -2 to 1.7 K) and larger standard deviation (≤ 2.4 K).
In the short-wave band, channels with wavenumber less than 2200 cm −1 are sensitive to the surface and carbon monoxide (CO). Unlike other window channels, the impact of CO lines causes the O-B to oscillate between 0 and 2.8 K (the bias being larger closer to the line center). However, the standard deviation remains stable between 0.7 and 1.3 K. Up to 2267 cm −1, the peaking height of the channels increases with wavenumber, from lower to upper troposphere, as does a cold bias that reaches -4.7 K. This is also seen in the standard deviation, which increases sharply from 0.6 to 6.8 K. In the 2267-2385 cm −1 spectral region, observations are sensitive to CO2 and upper stratospheric temperature, and are subject to well-known solar contamination affecting day-time observations (see, e.g., [38]). This results in ±3.7 K O-B and standard deviation peaking at 8.2 K. Interestingly, detector 3 again shows the lowest O-B, but in the standard deviation detectors 1 and 2 stand out with average values up to 1 K larger than detectors 3 and 4. Window channels are found at larger wavenumbers and the statistics are consistent with the long-wave window channels, with O-B within ±0.2 K and standard deviations less than or equal to 2 K.
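The per-detector statistics discussed above (mean and standard deviation of O-B over clear-sky ocean observations) can be computed along the following lines. The array layout, the function name, and the use of the population standard deviation are assumptions of this sketch.

```python
import numpy as np

def omb_by_detector(obs_bt, bkg_bt, detector, clear_ocean):
    """Mean and standard deviation of O-B per channel for each detector.

    obs_bt, bkg_bt : arrays (n_obs, n_channels) of brightness temperature (K)
    detector       : array (n_obs,) of detector numbers
    clear_ocean    : boolean array (n_obs,), True for clear pixels over ocean
    """
    obs = np.asarray(obs_bt, dtype=float)
    bkg = np.asarray(bkg_bt, dtype=float)
    det = np.asarray(detector)
    mask = np.asarray(clear_ocean, dtype=bool)
    stats = {}
    for d in np.unique(det[mask]):
        sel = mask & (det == d)
        departures = obs[sel] - bkg[sel]
        stats[int(d)] = (departures.mean(axis=0), departures.std(axis=0))
    return stats

# Tiny synthetic example: four observations, two channels, two detectors.
obs = np.array([[281.0, 250.0], [283.0, 251.0], [270.0, 240.0], [300.0, 300.0]])
bkg = np.array([[280.0, 250.5], [281.0, 250.5], [270.5, 240.0], [0.0, 0.0]])
det = np.array([1, 1, 3, 3])
clear = np.array([True, True, True, False])  # last pixel is screened out

stats = omb_by_detector(obs, bkg, det, clear)
bias_det1, sigma_det1 = stats[1]
```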
To further evaluate the instrument's performance, we investigated the noise equivalent differential temperature (NEΔT) and compared it to the calculated standard deviation in O-B. Figure 4 shows a 90-min average from the internal warm calibration target (IWCT) counts for each detector in the three bands. Large NEΔT values, comparable to the size of the standard deviation obtained from the background departures, are found at wavenumbers less than 700 cm −1 in the long-wave band and in the window channels of the short-wave band (wavenumbers greater than 2385 cm −1). For these channels, the random component of the observation error is mainly driven by the radiometric internal noise. Furthermore, detectors 3 and 4 and detectors 1 and 2 have the largest NEΔT in the low wavenumbers of the long-wave band and in the short-wave band, respectively. This is consistent with the statistics shown in Figure 3. In the mid-wave band, however, all four detectors have a similar NEΔT, which suggests that the larger standard deviation (and lower O-B) obtained for detector 3 in this band may be related to a contamination of this detector by a source unaccounted for in the calibration.
Background departures and associated standard deviation for the NSR data set are shown in Figures 5 and 6, respectively. In the long-wave, the biases in the temperature channels vary between -1 and 0.8 K across detectors 2-4 and between -1.2 and 2.5 K for detector 1, with a standard deviation less than or equal to 1 K. The O-B in the window channels varies from -0.8 to 0.2 K (detectors 2-4), with detector 1 again showing larger biases from -1.3 to 0.8 K. The standard deviation is less than or equal to 1.2 K across all detectors. Note that while only a few channels of detector 1 (a dozen in total, mostly below 800 cm −1 wavenumber) drive the detector statistics to larger extremes, detector 3 tends to have a slightly lower bias with respect to the others in the temperature and window channels, as already noted in the FSR data set. In the O3 band, the bias reaches 7.5 K with a standard deviation less than or equal to 5 K.
In the mid-wave, the background departures are mostly negative below 1370 cm −1, down to -2.2 K, and positive thereafter, varying between 0.2 and 1.7 K, 0.4 and 2 K, and 0 and 1.5 K for detectors 1, 2, and 4, respectively. The biases for detector 3 are found between -0.5 and 1 K. The standard deviation across all detectors is less than or equal to 1.8 K, although often larger for detector 3 with respect to the others.
In the short-wave, the passage from FSR to NSR has significantly smoothed the channels below 2250 cm −1. In this segment, the background departures vary from -0.9 to 0.8 K, except for detector 3, which varies from -1.5 to 0.5 K. The standard deviation is less than or equal to 1.1 K. In the upper stratospheric channels, O-B rises in a range from 0.6 to 5.7 K (-0.7 to 4.7 K for detector 3) with a standard deviation less than or equal to 5.9 K. Beyond 2385 cm −1, the biases are consistent across all four detectors and range between -0.3 and 1.7 K with a standard deviation less than or equal to 3.2 K.
At equivalent channels in the long-wave band where HIRAS FSR and NSR have the same spectral resolution, the FSR data set is on average 0.1 K warmer than the NSR at channels less than 757 cm −1, 0.5 K warmer between 757 and 1000 cm −1, and up to 3.5 K warmer in the O3 band. This warm bias emerges partly from a sampling effect caused by the CMA cloud mask, which tends to remove more observations at high latitudes than the cloud detection scheme used in OPS for the NSR data sets (not shown). The remaining differences between FSR and NSR results can be traced back to processing differences, including the spectroscopy and the choice of emissivity model. These are further discussed in the next section.
The HIRAS NSR data set has been compared to NOAA-20 CrIS. The selected CrIS data set covers the same period of February-March 2019 and the spectral resolution is NSR. Only night-time clear pixels over ocean are used in this comparison. The comparison is a double difference, i.e., HIRAS background departure minus CrIS background departure, hereafter Δ(O-B), and HIRAS 1σ standard deviation in O-B minus CrIS 1σ standard deviation in O-B, hereafter Δ1σ. For consistency, the background used in this double comparison has been taken from an experiment where CrIS was not assimilated, so that both HIRAS and CrIS are passively processed.
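Given per-channel O-B statistics for each instrument, Δ(O-B) and Δ1σ follow directly, as does the fraction of channels whose Δ(O-B) lies within a given radiometric uncertainty. The Python sketch below uses made-up numbers; the function name and the per-channel uncertainty array are placeholders for illustration.

```python
import numpy as np

def compare_sensors(bias_a, sigma_a, bias_b, sigma_b, uncertainty_b):
    """Per-channel double differences and fraction of channels within uncertainty."""
    delta_bias = np.asarray(bias_a, dtype=float) - np.asarray(bias_b, dtype=float)
    delta_sigma = np.asarray(sigma_a, dtype=float) - np.asarray(sigma_b, dtype=float)
    within = np.abs(delta_bias) <= np.asarray(uncertainty_b, dtype=float)
    return delta_bias, delta_sigma, within.mean()

# Synthetic per-channel statistics (K) for four channels.
bias_hiras = np.array([0.30, -0.50, 0.10, 1.20])
sigma_hiras = np.array([0.60, 0.80, 0.50, 1.50])
bias_cris = np.array([0.20, -0.10, 0.15, 0.20])
sigma_cris = np.array([0.40, 0.50, 0.30, 0.90])
uncertainty = np.full(4, 0.3)  # placeholder radiometric uncertainty

d_bias, d_sigma, frac_within = compare_sensors(
    bias_hiras, sigma_hiras, bias_cris, sigma_cris, uncertainty)
```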
In operations, only CrIS detector 5 (the central detector of a 3 × 3 array) is retained in the instrument pre-processing. For this assessment, CrIS detector 5 is compared to HIRAS detector 4. Detector 4 was selected by evaluating the root mean square (rms) error across the 399 channels for each of the four HIRAS NSR detectors. Figures 7 and 8 show Δ(O-B) and Δ1σ, respectively. Figure 7 also shows CrIS radiometric uncertainty (dotted lines) as estimated by [10] for the instrument on board SNPP. Here, we assume that the radiometric uncertainty of NOAA-20 CrIS is equivalent.
In the long-wave band, Δ(O-B) averages to -0.12 K in the CO2 lines, 0.06 K in the window channels, and -0.07 K in the O3 band. The Δ1σ ranges from 0.1 to 0.5 K. Note that 6 channels (666.875, 668.75, 718.75, 720.0, 741.25, and 741.875 cm −1) present a significantly larger bias (±1.07 K). In total, 17 channels (9.2%) have a bias larger than the CrIS radiometric uncertainty.
In the mid-wave band, the average bias is 0.12 K at wavenumber less than 1300 cm −1 and 0.03 K beyond, with a range of ±0.7 K. The standard deviation difference ranges from 0.01 to 0.34 K. A total of 47 channels (36.7%) have a bias larger than CrIS uncertainty.
In the short-wave band, the mean bias in the window and low peaking channels each side of the CO2 band is -0.08 K, while it is mostly negative in the CO2 band between 2267 and 2385 cm −1 with Δ(O-B) ranging from -1.01 to 0.75 K. The standard deviation difference ranges from 0.19 to 3.64 K with a peak at 2382.5 cm −1 . HIRAS cold bias in the CO2 band remains unexplained and will require further work. In this band, 35 channels (40.2%) have a bias larger than CrIS uncertainty, most of them in the CO2 band.
In summary, HIRAS compares generally well to CrIS, with a mean bias across the 399 channels of -0.05 K (±0.26 K at 1σ), down to -0.02 K (±0.22 K at 1σ) if the short-wave CO2 band is excluded, and 75.2% of the channels are within CrIS radiometric uncertainty. However, the Chinese instrument is also noisier, with a standard deviation systematically larger than CrIS standard deviation (0.34 K on average across all channels, and 0.25 K excluding the short-wave CO2 band). Note also that this analysis does not account for the radiometric uncertainty of HIRAS, which was not provided by CMA. Assuming an uncertainty equal to that used for CrIS, the total uncertainty would be 0.48, 0.28, and 0.28 K in the three bands, respectively (expressed as the root sum square of CrIS and HIRAS uncertainties), and 84.2% of HIRAS channels would be within the uncertainty.
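The summary figures can be cross-checked with a few lines of arithmetic. The channel counts are those quoted for each band; the per-instrument uncertainties printed at the end are back-calculated here from the quoted totals under the equal-uncertainty assumption and are not values given in the text.

```python
import math

# Channels whose bias exceeds CrIS radiometric uncertainty, per band.
exceeding = {"long-wave": 17, "mid-wave": 47, "short-wave": 35}
n_channels = 399

n_within = n_channels - sum(exceeding.values())
within_pct = 100.0 * n_within / n_channels

def root_sum_square(u_a, u_b):
    """Combined uncertainty of two independent contributions."""
    return math.hypot(u_a, u_b)

# With equal uncertainties u, the combined value is sqrt(2) * u, so the
# quoted totals imply the per-instrument values below (back-calculated).
totals = (0.48, 0.28, 0.28)
implied_single = [t / math.sqrt(2.0) for t in totals]

print(f"{n_within} of {n_channels} channels within uncertainty ({within_pct:.1f}%)")
print("implied per-instrument uncertainty (K):",
      [round(u, 2) for u in implied_single])
```

The band counts add up to 99 channels outside the CrIS uncertainty, leaving 300 of 399 channels (75.2%) inside it, in agreement with the figure quoted above.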

Discussion
In the previous section, we highlighted several discrepancies between HIRAS and CrIS, but also between the HIRAS NSR and FSR data sets. Here, we investigate these discrepancies further through the analysis of three possible causes, namely the difference in surface emissivity model, the differences in spectroscopy, and solar contamination.

Surface emissivity
As explained by [31], ISEM has been the historical sea emissivity model used by RTTOV (since version 6). This model is still optionally available in the current RTTOV version (12) and is used in the operational system at the Met Office. The emissivity is parameterised in terms of satellite zenith angle and assumes zero wind speed. The parameterisation uses coefficients based on a wave slope model derived from [39], refractive indices from [40], and salinity correction from [41]. IREMIS, on the other hand, was introduced as the default sea emissivity model in RTTOV 12. This is a more advanced model that uses zenith angle, wind speed, and skin temperature to parameterise the emissivity. Additionally, a linear dependence of refractive index on skin temperature has been introduced. The waves are now derived from [42]. The radiance simulator uses IREMIS by default. The validation carried out by [31] shows little difference between the models except for high wind speeds, which degrade the performance of ISEM with respect to IREMIS since wind speed is not accounted for in the former model.
For this study, we investigated the impact of switching from one model to another in the radiance simulator. One day of the model background has been reprocessed setting the surface emissivity model to ISEM in the radiance simulator and the simulated brightness temperature compared to those obtained in the first instance with IREMIS. The difference between ISEM-based and IREMIS-based simulations has been found to be on average 0.15, 0.09, and 0.04 K in the long, mid, and short-wave bands, respectively. Note that, as expected, the window and low peaking channels are more affected than the others. These results are in line with and of the same order as the comparison provided by [31].
Although not negligible, the impact of using different surface emissivity models cannot alone explain the observed difference between the NSR and FSR data sets, which is an order of magnitude larger.

Spectroscopy
As noted above, in this study we have used two different versions of RTTOV coefficients. For simulations with the radiance simulator, the radiative transfer model is used with official HIRAS FSR coefficients as provided on NWP SAF (https://www.nwpsaf.eu/). These coefficients were generated in 2018 with the LBLRTM v12.2 line-by-line model (http://rtweb.aer.com/lblrtm.html) and 83 training profiles. The optical depth calculation is carried out on a fixed set of either 54 or 101 pressure levels. Here, we have selected the coefficient file on 54 levels. The generation of RTTOV coefficients is further detailed by [43].
Simulations in the off-line operational system have been carried out with a different set of coefficients to remain consistent with operational practices. These coefficients were originally generated in 2012 for CrIS NSR. The line-by-line model used at that time was LBLRTM v12.1. The number of training profiles was similar, but trace gas values have since been updated (in 2016). The number of levels is 44. Note that official HIRAS NSR coefficients are not yet available, but the application of the CrIS channel selection in the pre-processing, as well as the similar characteristics of the two instruments, allows for the use of CrIS coefficients as a proxy.
To characterise the impact of using different versions of RTTOV coefficients (with different characteristics), we have used the 83 training profiles to simulate and compare brightness temperatures. Two scenarios have been investigated. First, we have compared the results of simulations based on 54 levels FSR coefficients (54FSR) to simulations based on 54 levels NSR equivalent (54NSR) generated with the same underlying spectroscopy. This aims to highlight the difference between full and normal spectral resolution. Second, we have compared simulations carried out with 54-level NSR coefficients with new spectroscopy and training profiles to 44-level NSR (44NSR) with old spectroscopy and training profiles. This second set of simulations aims to evaluate the impact of both the reduced number of levels and the update in the line-by-line model and trace gas profiles.
The degradation from FSR to NSR affects the mid- and short-wave bands only; as expected, no difference has been observed in the long-wave band when comparing 54FSR to 54NSR. In the mid-wave band, we have found an RMSE of 2.3 K, with channel-to-channel values as low as 0.07 K and up to 11.3 K. In the short-wave, the RMSE is smaller, 0.9 K, and varies between 0.004 and 2.9 K.
The combined impact of the number of levels and updated spectroscopy and training profiles (54NSR versus 44NSR) is much smaller in comparison and yields an RMS difference between 0.02 and 0.3 K in all three bands.
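The coefficient comparison amounts to a per-channel root mean square error over the training profiles, followed by a summary over a band. A minimal Python sketch is given below. It assumes the two sets of simulated brightness temperatures are already on matching channels; the function name and the synthetic offsets are illustrative only.

```python
import numpy as np

def channel_rmse(bt_a, bt_b):
    """Per-channel RMSE over profiles and overall RMSE over all channels.

    bt_a, bt_b : arrays (n_profiles, n_channels) of simulated brightness
                 temperature (K) on the same channels.
    """
    diff = np.asarray(bt_a, dtype=float) - np.asarray(bt_b, dtype=float)
    per_channel = np.sqrt(np.mean(diff ** 2, axis=0))
    overall = np.sqrt(np.mean(diff ** 2))
    return per_channel, overall

# Synthetic example: 83 profiles, two channels, fixed offsets of 0.3 and -0.4 K.
bt_first = np.full((83, 2), 250.0)
bt_second = bt_first + np.array([0.3, -0.4])

rmse_per_channel, rmse_overall = channel_rmse(bt_first, bt_second)
```

Reporting the minimum and maximum of the per-channel values alongside the overall figure gives the kind of range quoted above for each band.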

Solar contamination
There are two types of solar-related biases that affect the background departures: biases in the forward model and the solar contamination that directly affects the instrument.
FY-3D is an afternoon platform, meaning that during nearly half of its orbit (the ascending node) the instruments see the daytime side of the Earth. During that time, the reflected solar radiance at the surface of oceans significantly affects the short-wave infrared window channels. Additionally, non-local thermodynamic equilibrium (NLTE) emission occurring in the daytime upper atmosphere also affects observations in the short-wave CO2 lines. Although the correct representation of these effects has been a long-standing issue for the radiative transfer models [44], the latest version of RTTOV uses a sun-glint model to provide an estimation of the reflectance distribution for the direct surface-reflected light, as well as an optional correction for the NLTE emission [43]. In practice, however, residual biases remain in the short-wave, and channels in this band are generally not assimilated. Note that, in line with operational practices, the RTTOV NLTE correction has not been used in this study.
Figure 9 illustrates HIRAS NSR background departures as a function of the solar zenith angle (SZA) for all four detectors. An observed 10 K warm daytime (solar zenith angle less than 90°) bias in the short-wave CO2 lines and a 3 K warm daytime bias in the short-wave window channels are consistent with the NLTE and solar reflection-related biases expected in the forward model, respectively. Similarly, the positive-negative-positive pattern seen in the long-wave O3 lines results from deficiencies in the ozone distribution around the globe in the background, as discussed earlier.
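Background departures as a function of SZA can be obtained by averaging O-B in SZA bins, for example as in the Python sketch below. The 10° bin width, the function name, and the array layout are assumptions of this sketch and not necessarily the choices behind Figure 9.

```python
import numpy as np

def mean_omb_by_sza(omb, sza_deg, bin_width=10.0):
    """Mean O-B per channel in solar zenith angle bins from 0 to 180 degrees.

    omb     : array (n_obs, n_channels) of background departures (K)
    sza_deg : array (n_obs,) of solar zenith angles (degrees)
    """
    omb = np.asarray(omb, dtype=float)
    sza = np.asarray(sza_deg, dtype=float)
    edges = np.arange(0.0, 180.0 + bin_width, bin_width)
    n_bins = edges.size - 1
    # Bin index of each observation; 180 degrees falls into the last bin.
    index = np.clip(np.digitize(sza, edges) - 1, 0, n_bins - 1)
    means = np.full((n_bins, omb.shape[1]), np.nan)
    for b in range(n_bins):
        sel = index == b
        if sel.any():
            means[b] = omb[sel].mean(axis=0)
    return edges, means

# Synthetic example: four observations of a single channel.
sza = np.array([5.0, 8.0, 95.0, 175.0])
omb = np.array([[1.0], [3.0], [10.0], [-2.0]])
edges, binned = mean_omb_by_sza(omb, sza)
```

Bins without observations are left as NaN so that gaps in SZA coverage remain visible when the result is plotted.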
These model-based biases are also observed in CrIS datasets (not shown). While the biases are relatively similar across the four detectors for most SZA, there is a noticeable difference for detector 3 between 120° and 100° SZA. In the mid-wave band in particular, a large negative bias (down to -7.1 K) is visible. As shown in Figure 10, this bias consistently appears towards the end of the descending orbit, where the instrument is still looking at the night side of Earth but the platform is leaving the shadow of Earth. A possible explanation is that, for a brief period, the angle at which the satellite moves into the light allows sunlight to contaminate the deep space view of detector 3 during its calibration cycle. Sunlight contamination of the cold calibration view would result in elevated cold-view observation counts and therefore colder temperatures for the calibrated Earth scene observations. Note that the direct illumination of one or more components of the instrument can result in scene contamination [38], but, unlike calibration contamination, this would result in a cross-track bias that is not observed in the HIRAS data set. From private communications with CMA, we know that detector 3 has an element of the platform covering a fraction of its field of view. However, it is not clear whether this covering element is related to the contamination of the deep space view. It is possible that, at the angle at which the satellite crosses the terminator from shade to sunlight, this element reflects sunlight while the instrument is scanning deep space for calibration.
Except for a few channels, the comparison between HIRAS and CrIS is consistent with previous work comparing hyperspectral instruments. [16] reported SNO-based differences between SNPP CrIS and the two IASI instruments of less than 0.2 K in the long- and mid-wave bands (with standard deviation ranging from 0.2 to 0.7 K), increasing up to 1 K in the short-wave band (standard deviation up to 4.5 K). Similar results were reported against AIRS in the first two bands. The authors pointed out that CrIS is generally slightly warmer than IASI and AIRS.
Additionally, we note that [45] presented preliminary results from SNO of HIRAS and CrIS at the 2019 Global Space-based Inter-Calibration System (GSICS) working groups annual meeting. Their analysis is based on 20 days (April 25 to May 15, 2018) of SNO at FSR and shows biases generally less than 0.5 K in the long-wave band (up to 1 K at wavenumbers below 700 cm−1) and less than 0.7 K in the mid-wave band, with a standard deviation ranging from 0.2 to 1 K in these two bands, which is also consistent with our findings.
In our study, HIRAS is matched to NOAA-20 CrIS; it is therefore interesting to note how NOAA-20 CrIS compares to its predecessor on SNPP. For their evaluation of the CrIS instruments using an NWP model (from the European Centre for Medium-Range Weather Forecasts, ECMWF), [17] faced the same limitations and highlighted the same types of biases as those detailed in our study, such as the negative bias in long-wave window channels (related to possible cloud contamination of the O-B), the positive bias in the short-wave CO lines (due to sub-optimal representation of trace gases in the model), or the positive bias in the high-peaking CO2 lines (linked to the difficult simulation of this part of the atmosphere, potentially subject to NLTE). It should be noted that the authors also conducted double differences using ECMWF and IASI as transfer media to compare NOAA-20 and SNPP CrIS. The two instruments agree within ±0.1 K in the long- and mid-wave bands, while the difference is reported to be slightly larger in the short-wave band, with notably a 0.1 K warm bias in the CO2 lines for the instrument on NOAA-20 [17].
Finally, it is worth noting that, although an advanced evaluation of the geolocation calibration has not been conducted in our study, no significant change in bias or increase in standard deviation has been detected in coastal areas. A geolocation bias would mainly result in a displacement of the observed brightness temperature with respect to its actual position and would affect the background departures (increased bias and standard deviation) near coastlines and areas of strong thermal contrast, especially in window and low-peaking channels.
This simple sanity check seems consistent with the preliminary information communicated by CMA. CMA has investigated HIRAS geolocation through comparisons with MERSI-2 (the imager on board FY-3D) where MERSI-2 channels overlap with the HIRAS spectrum, i.e., at 4.05, 7.2, 10.8, and 12 μm. The correlation coefficients between the scene brightness temperatures observed by MERSI-2 and HIRAS (convolved to the MERSI-2 spectral resolution) reach 0.990, 0.995, 0.990, and 0.991, respectively. When a uniformity threshold of 0.2 K on the MERSI-2 standard deviation is imposed, the correlation coefficients increase to 0.998, 0.999, 0.999, and 0.999, respectively. These results have not yet been published, although they were presented by Chengli Qi at the 2019 GSICS working groups annual meeting [45].
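The effect of a scene uniformity threshold on such a correlation can be sketched as follows. This is an illustrative example on synthetic data only: the function `filtered_correlation`, its arguments, the use of the Pearson coefficient, and the inclusive threshold test are assumptions, and none of this reproduces the CMA processing.

```python
import numpy as np

def filtered_correlation(bt_sounder, bt_imager_mean, bt_imager_std, max_std=None):
    """Pearson correlation between collocated brightness temperatures.

    If max_std is given, only scenes whose imager standard deviation
    (within the sounder footprint) is <= max_std are kept.
    """
    bt_sounder = np.asarray(bt_sounder, dtype=float)
    bt_imager_mean = np.asarray(bt_imager_mean, dtype=float)
    bt_imager_std = np.asarray(bt_imager_std, dtype=float)
    keep = np.ones(bt_sounder.shape, dtype=bool)
    if max_std is not None:
        keep = bt_imager_std <= max_std
    if keep.sum() < 2:
        raise ValueError("not enough scenes left after filtering")
    return np.corrcoef(bt_sounder[keep], bt_imager_mean[keep])[0, 1]

# Synthetic scenes: the sounder/imager mismatch grows with scene non-uniformity.
rng = np.random.default_rng(1)
n = 5000
truth = rng.uniform(220.0, 300.0, n)
scene_std = rng.exponential(0.5, n)
imager = truth + rng.normal(0.0, 0.1, n)
sounder = truth + rng.normal(0.0, 1.0, n) * scene_std * 5.0

r_all = filtered_correlation(sounder, imager, scene_std)
r_uniform = filtered_correlation(sounder, imager, scene_std, max_std=0.2)
print(r_all, r_uniform)
```

In this synthetic setting the correlation rises once non-uniform scenes are excluded, which is the qualitative behaviour described above.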
Although it is beyond the scope of this study to carry out a geolocation assessment of HIRAS observations based on, e.g., MERSI-2, we strongly recommend that CMA produce a public report or publish the results in a peer-reviewed journal.

Conclusions
In this paper, we have provided the first assessment of the first Chinese hyperspectral infrared instrument on board a polar orbiter. HIRAS was launched as part of the FY-3D payload in 2017. CMA has made available two months of unapodized global observations at full spectral resolution that have been evaluated through comparisons with short-range forecasts from the operational Met Office global NWP system and the CrIS instrument on board NOAA-20.
Both the native FSR and a degraded NSR version with characteristics similar to those of NOAA-20 CrIS, as used in operations at the Met Office, have been assessed. While both data sets have been considered over clear-sky ocean, their processing is notably different. The FSR observations have been apodized consistently with the RTTOV coefficients used for the simulation of the model background interpolated to the observation location and time. Clear-sky scenes have been ensured by the application of a MERSI-2-based cloud mask developed by CMA. In parallel, HIRAS observations have been ingested into the Met Office's observation database after going through a process of spectral resolution degradation from FSR to NSR, including the apodization of the observations, and a channel selection identical to that used for NOAA-20 CrIS. The degradation to NSR is intended to allow for the processing of data in the current Met Office operational system, offering insights into the instrument quality in an operational configuration.
In the FSR data set, and away from the well-documented model biases in the O3 band and trace gas lines and the forward model deficiencies in the short-wave band, the background departures vary within ±2.6, ±1.6, ±3.5, and ±1.7 K for detectors 1 to 4, respectively, with a standard deviation less than or equal to 2 K, in the temperature-sensitive channels of the long-wave band. In the humidity-sensitive channels of the mid-wave band, biases vary from -1.2 to 2.2 K, with a standard deviation less than or equal to 2.4 K. The window channels across both the long- and short-wave bands show similar characteristics, with background departures mostly within ±0.2 K and a standard deviation less than or equal to 2 K. Consistently, across the three bands, HIRAS detector 3 shows biases differing from the other three (either higher or lower), as well as a larger standard deviation (except in the short-wave band).
For the NSR data set, the degradation of the resolution has smoothed the biases in the mid- and short-wave bands, but the characteristics remain broadly similar to those of the FSR data set. The O-B in the long-wave band varies from -1 to 0.8 K and from -0.8 to 0.2 K for most temperature and window channels, respectively, with a standard deviation less than or equal to 1.2 K. The biases in the mid-wave band humidity channels range from 0 to 2 K, with the exception of detector 3, which is biased low with respect to the others (-0.5 to 1 K), for a standard deviation across all detectors not greater than 1.8 K. The short-wave window channels beyond 2385 cm−1 yield background departures ranging from -0.3 to 1.7 K and a standard deviation of up to 3.2 K.
The origin of biases, especially affecting detector 3, has been investigated. We have highlighted a probable contamination of the deep space view of detector 3 by sunlight, either directly or via reflection, localized at the end of the descending node when the platform moves from the shadow of Earth into the light. It was also confirmed by CMA that an element of the platform is present in the field of view of detector 3, which may point towards the hypothesis of a contamination by reflection although this has not been demonstrated. This element in the field of view of detector 3 also potentially contributes to the inter-detector bias and standard deviation heterogeneity.
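The sign of the bias expected from a contaminated deep space view can be checked with a simplified two-point calibration. The sketch below assumes a linear, real-valued relation between counts and radiance and uses arbitrary illustrative numbers; the actual HIRAS calibration operates on complex interferometric spectra and is not reproduced here. The function name `two_point_calibration` and all values are hypothetical.

```python
def two_point_calibration(c_scene, c_cold, c_warm, rad_cold, rad_warm):
    """Linear two-point calibration of scene counts into radiance.

    c_cold / c_warm are the counts measured on the cold (deep space) and
    warm (blackbody) views, and rad_cold / rad_warm their known radiances.
    """
    gain = (rad_warm - rad_cold) / (c_warm - c_cold)
    return rad_cold + gain * (c_scene - c_cold)

# Arbitrary units, illustration only.
nominal = two_point_calibration(
    c_scene=600.0, c_cold=100.0, c_warm=1100.0, rad_cold=0.0, rad_warm=100.0
)
# Same scene, but stray sunlight raises the deep space view counts.
contaminated = two_point_calibration(
    c_scene=600.0, c_cold=120.0, c_warm=1100.0, rad_cold=0.0, rad_warm=100.0
)
print(nominal, contaminated)
```

For any scene colder than the warm target, raising the cold-view counts lowers the calibrated radiance, which is consistent with the negative bias described for detector 3.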
Of the four HIRAS detectors, detector 4 has been found to have the lowest RMS errors and has been compared to NOAA-20 CrIS detector 5 (the detector used in operations at the Met Office) in a double difference analysis at NSR using similar data processing and channel selection. On average, across the three bands, the double difference is as low as -0.05 K (±0.26 K at 1σ), but the HIRAS standard deviation has been found to be systematically larger (0.34 K mean difference) than that of CrIS. In the high-peaking temperature-sensitive channels of the long-wave CO2 lines, the mean double difference is -0.12 K, with a few individual channels reaching ±1 K. The humidity channels of the mid-wave band average to 0.03 K with a range of ±0.7 K. In the short-wave CO2 band, the double difference highlights a cold bias in the HIRAS data set, down to -1.01 K, of unknown origin. Further, 75.2% of HIRAS channels were found to be within the radiometric uncertainty estimated for SNPP CrIS and applied to the instrument on NOAA-20. When the HIRAS radiometric uncertainty, speculatively taken equal to that of CrIS, is accounted for, the number of channels within the total radiometric uncertainty increases to 84.2%. Looking forward, the addition of a new hyperspectral infrared sounder to the Met Office NWP system could potentially have a beneficial impact on forecast errors and make the system more resilient.
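For readers wishing to reproduce this type of diagnostic, a background departure double difference and the fraction of channels within a radiometric uncertainty can be sketched as below. The function names, the array layout (observations by channels), the tiny synthetic values, and in particular the root-sum-square combination of the two uncertainties are assumptions made for illustration; the combination rule is not specified in the text above.

```python
import numpy as np

def double_difference(omb_test, omb_ref):
    """Per-channel mean (O-B) of the test instrument minus that of the reference.

    Inputs have shape (n_observations, n_channels); NaNs are ignored.
    """
    return np.nanmean(omb_test, axis=0) - np.nanmean(omb_ref, axis=0)

def fraction_within(dd, u_ref, u_test=None):
    """Fraction of channels with |double difference| within the uncertainty.

    If u_test is given, it is combined with u_ref in quadrature
    (an assumption of this sketch).
    """
    u_ref = np.asarray(u_ref, dtype=float)
    total = u_ref if u_test is None else np.sqrt(u_ref**2 + np.asarray(u_test) ** 2)
    return float(np.mean(np.abs(dd) <= total))

# Tiny synthetic departures (K) for four channels, illustration only.
omb_hiras = np.array([[0.1, 0.5, -0.3, 1.0],
                      [0.3, 0.7, -0.1, 1.2]])
omb_cris = np.array([[0.1, 0.1, 0.0, 0.0],
                     [0.1, 0.3, 0.0, 0.2]])
u_cris = np.full(4, 0.3)

dd = double_difference(omb_hiras, omb_cris)
frac_ref = fraction_within(dd, u_cris)
frac_total = fraction_within(dd, u_cris, u_test=u_cris)
print(dd, frac_ref, frac_total)
```

Using the NWP background as a transfer medium in this way removes, to first order, the model biases common to both instruments, so that the residual reflects inter-instrument differences.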