The NCAR Airborne 94-GHz Cloud Radar: Calibration and Data Processing

Abstract: The 94-GHz airborne HIAPER Cloud Radar (HCR) has been deployed in three major field campaigns, sampling clouds over the Pacific between California and Hawaii (2015), over the cold waters of the Southern Ocean (2018), and characterizing tropical convection in the Western Caribbean and Pacific waters off Panama and Costa Rica (2019). An extensive set of quality assurance and quality control procedures was developed and applied to all collected data. Engineering measurements yielded calibration characteristics for the antenna, reflector, and radome, which were applied during flight to produce the radar moments in real time. Temperature changes in the instrument during flight affect the receiver gains, leading to some bias. Post project, we estimate the temperature-induced gain errors and apply gain corrections to improve the quality of the data. The reflectivity calibration is monitored by comparing sea surface cross-section measurements against theoretically calculated model values. These comparisons indicate that the HCR is calibrated to within 1–2 dB of the theory. A radar echo classification algorithm was developed to identify “cloud echo” and distinguish it from artifacts. Model reanalysis data and digital terrain elevation data were interpolated to the time-range grid of the radar data to provide an environmental reference.
Dataset: The data for the three field campaigns are available at https://doi.org/10.5065/D6CJ8BV7 (CSET campaign), https://doi.org/10.5065/D68914PH (SOCRATES campaign), and https://doi.org/10.26023/V9DJ-7T9J-PE0S (OTREC campaign).


Summary
The High-performance Instrumented Airborne Platform for Environmental Research (HIAPER) aircraft, which is operated by the National Center for Atmospheric Research (NCAR) for the National Science Foundation (NSF), is a state-of-the-art observational platform available to the scientific community. HIAPER is a Gulfstream V business jet that has been highly modified to carry up to 2500 kg of scientific instruments. It can fly at altitudes up to 15 km and with its range exceeding 11,000 km it can reach many remote locations.
One of the instruments deployed on the aircraft is the HIAPER Cloud Radar (HCR, [1]), a W-band, dual-polarization (vertical, V; horizontal, H) Doppler radar that is mounted in an underwing pod (Figure 1a,b). A single lens antenna is used to both transmit and receive. The transceiver uses a two-stage up- and down-conversion super-heterodyne design. A waveform generator creates the transmitted waveform, which passes through the two-stage up-conversion to the transmission frequency of 94.4 GHz. It is then amplified by an extended interaction klystron amplifier (EIKA) to 1.6 kW peak power.
The HCR's unique design, in which a lens antenna illuminates a rotatable reflector, allows for 240° cross-track scanning (considering fuselage blocking) as well as staring, e.g., at zenith or nadir. In staring mode, the beam is stabilized in real time for changes in roll and pitch angles due to platform motion. The scanning/staring capability, together with the HCR's high sensitivity, allows the precise detection of drizzle, liquid, and ice clouds and provides unique observations of the formation and evolution of clouds, aiding our understanding of the effects of clouds on regional and global weather and climate.
The HCR has collected data in one minor and three major field campaigns [2][3][4] in four distinct locations, ranging from the tropics to 62° S in the Southern Ocean. Data processing and quality control procedures were developed, which are now consistently applied to all collected data. The goal of this publication is to provide a detailed description of the data itself and to document the data processing and quality control procedures that have been developed specifically for the HCR.

HCR Deployments 2015-2019
The HCR has been deployed in four field campaigns. The first consisted of one flight in the Nor'easter project, where the HCR collected data across the comma head of a strong Nor'easter cyclone over the northeastern United States in February 2015 [5]. The HIAPER aircraft flew at ~12 km altitude for most of the flight, and the HCR was operated in nadir pointing mode. Substantial improvements to the HCR, such as the mitigation of significant gear backlash that caused errors in the radial velocity field [6], were made after this first deployment. Because of its short duration and the improvements made thereafter, we consider Nor'easter a test case and focus on the three later major field campaigns in this study.
During the Cloud Systems Evolution in the Trades (CSET) study, the HCR was deployed in 16 research flights, which took place in July and August 2015 between the west coast of California and Hawaii (Figure 1c). CSET was "designed to describe and explain the evolution of the boundary layer aerosol, cloud, and thermodynamic structures along trajectories within the North Pacific trade winds" [7]. The flight patterns consisted of higher-altitude (~6-10 km) ferry legs at the beginning and end of each flight to reach the target area, during which the HCR was generally operated in nadir pointing mode. Frequent sea surface calibration events (Section 3.2), in which the HCR scanned to 20° off nadir on each side, were also conducted during the ferry legs. Once the target area was reached, the aircraft descended to lower altitudes and flew legs as low as 150 m above the sea surface, below the cloud base, with the HCR pointing at zenith (up); legs at 2-3 km altitude, just above the clouds, with the HCR pointing at nadir (down); and so-called "saw-tooth vertical patterns" through the clouds, with the HCR alternating between nadir and zenith modes. An example of a typical CSET flight pattern is shown in Figure 2a.
The Southern Ocean Clouds, Radiation, Aerosol Transport Experimental Study (SOCRATES) took place in January and February 2018, in the Southern Ocean [8]. Based in Tasmania, Australia, HIAPER flew 15 research flights south over the Southern Ocean (Figure 1d) to improve the understanding of clouds, aerosols, air-sea exchanges, and their interactions. The flight patterns again consisted of higher altitude ferry flights to and from the target area with the HCR in nadir pointing mode, and lower-level maneuvers above, below, and through the clouds once the target area was reached, with the HCR frequently transitioning between zenith and nadir pointing (see Figure 2b for a typical flight altitude pattern).
The Organization of Tropical East Pacific Convection (OTREC) field campaign took place in the East Pacific and the extreme SW Caribbean (Figure 1e), August-October 2019, to study the large-scale environmental factors that control convection over tropical oceans [9]. The flight patterns were designed differently from CSET or SOCRATES: During OTREC the aircraft did not fly to the weather but rather flew a number of pre-determined flight patterns laid out in a grid format either over the Pacific Ocean or the Caribbean Sea. The aircraft generally flew at very high altitudes of over 14 km, at the extreme end of (and in rare cases exceeding) the maximum range of the HCR (Figure 2c).

Radar Data
During signal processing, the radar fields (the so-called "moments") are calculated from the measured I/Q time series data. The derived radar moments are listed in Table 2. All data are available in CfRadial (version 1.4) format. The raw I/Q time series data are saved to disk so that all quality control and data processing can be repeated after the flight. These are high-rate data that can exceed 2 TB in size per flight. The radar fields are computed from the I/Q data, both during and after the flight, using the standard pulse-pair and dual-polarization techniques [10]. Additional derived radar data products, such as a melting layer field [11], are sometimes added to the data set, but their description is beyond the scope of this study.
The radar fields, except for the primary power fields DBMVC and DBMHX, are censored (i.e., set to a missing value) when there is not sufficient signal to yield useful information. This censoring is done using thresholds applied to the SNR and NCP fields on a gate-by-gate basis. It is a one-dimensional operation, performed along a single beam. The logic is as follows: if the SNR is less than −10 dB and the NCP is less than 0.1, the non-power fields are set to missing. These values are set conservatively so as not to remove any valid data. After this, one extra censoring step is applied: we check along the beam for contiguous regions of non-missing data that are surrounded by missing values. If they consist of only one or two data points, they, too, are set to missing. This eliminates some of the "speckle" features in the data fields.
A global positioning system (GPS) and an inertial navigation system (INS) unit are mounted in the nose of the radar pod. The GPS/INS combination provides data on the position, the speed and direction of movement, and the orientation of the radar in space, referenced to earth coordinates. The GPS data allow the antenna pointing to be controlled relative to earth coordinates rather than aircraft coordinates, which is especially important for the vertical pointing operations (zenith and nadir).
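The two censoring steps described in this section can be sketched for a single beam as follows. This is a minimal illustration, not the operational HCR code: the function name `censor_beam`, the use of NaN as the missing value, and the choice to leave short runs at the beam edges untouched (because they are not bounded by missing data on both sides) are assumptions made for the example. The thresholds are the ones quoted in the text.

```python
import math

MISSING = float("nan")  # assumption: NaN stands in for the missing value


def censor_beam(values, snr_db, ncp,
                snr_thresh=-10.0, ncp_thresh=0.1, max_speckle=2):
    """Censor one beam of a non-power field (one list entry per range gate)."""
    # Step 1: gate-by-gate test. A gate is censored only when BOTH the
    # SNR and the NCP fall below their thresholds.
    out = [
        MISSING if (s < snr_thresh and c < ncp_thresh) else v
        for v, s, c in zip(values, snr_db, ncp)
    ]

    # Step 2: remove runs of one or two valid gates that have missing
    # data on both sides (1D speckle removal along the beam).
    n = len(out)
    i = 0
    while i < n:
        if math.isnan(out[i]):
            i += 1
            continue
        j = i
        while j < n and not math.isnan(out[j]):
            j += 1
        # The valid run covers gates [i, j).
        bounded = i > 0 and j < n  # assumption: edge runs are kept
        if bounded and (j - i) <= max_speckle:
            for k in range(i, j):
                out[k] = MISSING
        i = j
    return out
```

In this sketch a gate with low SNR but high NCP is kept, which reflects the conservative "and" condition in the text.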
The GPS system models the earth according to the World Geodetic System (WGS84, see https://gisgeography.com/wgs84-world-geodetic-system/ (accessed on 8 December 2019)). However, the variability of the influence of gravity over the globe means that the sea surface height does not accurately follow the WGS84 ellipsoid, with deviations of over 70 m in some places. To correct for these deviations, the measured GPS altitude is corrected using the Earth Gravitational Model (EGM2008, see https://earth-info.nga.mil/GandG/wgs84/gravitymod/egm2008/ (accessed on 8 December 2019)). In addition, the radar system reports the pressures, temperatures, and voltages of various components. This metadata is added to the data stream and is used extensively in the calibration correction procedures carried out in the data quality phase during post-processing.
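The altitude correction can be illustrated with the standard geodetic relation between ellipsoidal height and height above the geoid. The sketch below assumes that the geoid undulation at the aircraft position has already been looked up from an EGM2008 grid; the function and parameter names are hypothetical, and the text does not state that the correction is implemented exactly this way.

```python
def altitude_above_geoid(gps_altitude_m, geoid_undulation_m):
    """Convert a WGS84 ellipsoidal height to a height above the geoid.

    gps_altitude_m:      height above the WGS84 ellipsoid (from the GPS)
    geoid_undulation_m:  geoid height above the ellipsoid at the same
                         location (assumed to come from an EGM2008 lookup)
    """
    # Standard relation: ellipsoidal height = geoid height + undulation.
    return gps_altitude_m - geoid_undulation_m
```

For example, where the geoid lies 30 m below the ellipsoid (an undulation of −30 m), a GPS altitude of 12,000 m corresponds to 12,030 m above the geoid, i.e., approximately above mean sea level.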

The FLAG Fields
The HCR receives data not only from clouds but also from targets that are not necessarily of primary interest to scientists. We developed an algorithm that classifies all HCR echoes into different categories and add the resulting 2D field, with dimensions of time and range, to the data, where it is referred to as the FLAG field. The intention of the FLAG field is to make it easy for the user to filter out unwanted echo by masking the data using the flag values. We also add a second reflectivity field (DBZ_MASKED) to which the FLAG field has been applied, i.e., echo that is not classified as "cloud" has been removed.
The different categories in the FLAG field are:
• Cloud: Echoes that are not classified as one of the categories below are flagged as cloud.
• Speckle: Contiguous echoes with fewer than 100 data points (in 2D, time × range) are flagged as speckle. These are mostly echoes that slightly exceed the noise threshold (see Section 2.2). Occasionally, very small cloud echoes are also flagged as speckle.
• Extinct: When the HCR observes thick clouds with high liquid water content (e.g., in convection), sometimes the signal is unable to penetrate through the entire cloud depth because it becomes completely attenuated. In nadir pointing mode, we try to identify the echo of the ocean or land surface (see below), and if the surface echo is too weak (i.e., below a certain threshold) or not found at all, the region from the lower edge of the cloud (i.e., the last range gate with valid echo) to the end of the radar beam is classified as extinct. A flag of extinct implies that it is likely, but not certain, that cloud or precipitation is present in that region.
• Backlobe: When the HCR is pointing at zenith and the aircraft is flying low, within ~2 km of the ocean or land surface, there is often an echo that results from the backlobe of the radar reflecting off the surface. This backlobe contamination is typically characterized by a band of low reflectivity and high spectrum width. The backlobe appears only during zenith pointing, at a range equal to the altitude of the radar, i.e., at a height of twice the aircraft altitude above the surface. As the aircraft ascends or descends, the backlobe contamination will recede or approach in range, respectively. We flag data as backlobe echoes when they are at the expected altitude, have reflectivity values of less than −20 dBZ, and have spectrum width values higher than 1.4 m s⁻¹. Not all backlobe echo is flagged with these thresholds, and sometimes cloud echo is erroneously flagged. Still, these thresholds generally yield a good estimate of the presence of backlobe echo.
• Out of range: During OTREC, the aircraft sometimes flew higher than the unambiguous range of the radar (the last valid HCR range gate is at ~14.5 km). This can cause second trip echoes, i.e., signal reflected by the surface still reaches the receiver, but because of its late timing it is erroneously placed in range gates close to the radar. These echoes are classified as out of range.
• Transmitter pulse: The timing of the receiver digitization relative to the transmit pulse can be set differently for different radars. For the HCR, the receiver starts taking data before the transmitter fires. As a result, the first 12 gates are assigned a negative range. Generally, they will contain just noise, but sometimes they will contain second trip echo. In addition, as the transmitter fires, some of the power from the transmit pulse is coupled to the receiver circuitry and thus is manifest in the data. We refer to this as the "burst echo" or the "bang". The measured radial velocity of the burst will always be zero since there is no relative motion involved. Approximately five gates are affected by the burst echo. Such contaminated data, i.e., the first 17 range gates (12 gates with negative range and 5 gates with burst) of each beam, are classified as transmitter pulse.
• Water surface: In nadir pointing mode, echo from the ocean or land surface is received in several range gates. We identify the surface by searching for the highest reflectivity value in specific range gates, which are calculated from the altitude of the radar and the topography data. A set number of range gates below and above the gate with the maximum reflectivity are classified as surface. If the topography height is zero, it is classified as water surface. Note that, currently, lakes are classified as land surface (see below), as data over large lakes were not collected in any of the field campaigns.
• Land surface: As for water surface above, but for topography heights greater than zero.
• Below surface: Echo from below the surface to the last range gate is classified as below surface.
• Noise source calibration: To aid with radar calibration, noise source calibration events are conducted during each flight (see Section 3.1.4 for details). The radar is not transmitting, and no scientific data is collected.
• Antenna in transition: We flag beams for which the antenna is moving very fast, e.g., when transitioning from nadir to zenith pointing or vice versa.
• Missing: If the radar is not transmitting (for reasons other than a noise source calibration event), the data is classified as missing.
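Two of the simpler rules in the list above, the transmitter pulse and the backlobe, can be sketched for a single zenith-pointing beam, together with the masking that produces a DBZ_MASKED-like field. This is only an illustration under stated assumptions: the string labels, the `no_echo` label for gates without valid reflectivity, the use of `None` as the missing value, and the range tolerance `RANGE_TOLERANCE_M` that defines "at the expected altitude" are all hypothetical. The gate count and the reflectivity and spectrum width thresholds are taken from the text.

```python
N_TX_GATES = 17              # 12 negative-range gates + 5 burst gates (text)
BACKLOBE_MAX_ALT_M = 2000.0  # backlobe expected within ~2 km of the surface
BACKLOBE_MAX_DBZ = -20.0     # reflectivity threshold (text)
BACKLOBE_MIN_WIDTH = 1.4     # spectrum width threshold in m/s (text)
RANGE_TOLERANCE_M = 100.0    # hypothetical tolerance around the expected range


def flag_zenith_beam(ranges_m, dbz, width, altitude_m):
    """Return one flag per gate for a zenith-pointing beam.

    dbz entries are None where the gate holds no valid echo.
    """
    flags = []
    for gate, (r, z, w) in enumerate(zip(ranges_m, dbz, width)):
        if gate < N_TX_GATES:
            flags.append("transmitter_pulse")
        elif z is None:
            flags.append("no_echo")
        elif (altitude_m <= BACKLOBE_MAX_ALT_M
              and abs(r - altitude_m) <= RANGE_TOLERANCE_M
              and z < BACKLOBE_MAX_DBZ
              and w > BACKLOBE_MIN_WIDTH):
            # The backlobe appears at a range equal to the radar altitude.
            flags.append("backlobe")
        else:
            flags.append("cloud")
    return flags


def mask_reflectivity(dbz, flags):
    """Keep reflectivity only where the echo is classified as cloud."""
    return [z if f == "cloud" else None for z, f in zip(dbz, flags)]
```

The remaining categories (speckle, extinct, surface, and so on) need 2D context or topography data and are not covered by this sketch.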
An example of the FLAG field from a flight during the SOCRATES field campaign is shown in Figure 3. Figure 3a shows the measured reflectivity, while Figure 3b shows the corresponding classification in the FLAG field. In Figure 3c, all echo that was classified as something other than cloud was removed, which results in a cleaned-up reflectivity field.
A field, ANTFLAG, is added to help with data processing and analysis. It is a 1D field on the time dimension and it flags the antenna pointing status:
• The antenna is scanning, e.g., for a sea surface calibration event (Section 3.2).
• Transition: As "Antenna in transition" in the FLAG field classification above.

Model and Topography Data
To aid users in their research, and also for calibration monitoring purposes (Section 3.2), we interpolate 3D data from numerical weather prediction models onto the HCR observed time-range grid. For the three field campaigns discussed in this publication, we use ERA5 model data, which is available in 1-h time steps on a 0.25° latitude × 0.25° longitude grid [12]. We use 100 hPa to 1000 hPa model levels for pressure, temperature, relative humidity, and geopotential height, interpolated in four dimensions (4D, three spatial dimensions and the time dimension) onto the HCR time-range grid. Model results of surface fields are used to extend the interpolation to the surface. Surface model data are also used for surface U and V wind components and sea surface temperature (SST), which are interpolated in three dimensions (3D, two horizontal spatial dimensions and one time dimension) onto the HCR time dimension. We also interpolate GTOPO30 digital elevation model (DEM) data [13] with 30 arc-seconds spacing onto the HCR time dimension.
The interpolation of the model data onto the HCR grid is carried out in several steps, on a flight-by-flight basis. First the model times that encompass the flight are identified. In theory, it is possible to directly interpolate from the 4D (or 3D) model data onto the 2D HCR grid. However, because the HCR data has very high temporal (0.1 s) and spatial (~20 m) resolutions, a direct one-step 4D (or 3D) interpolation is computationally expensive. To speed up the process, we split the interpolation into two steps. First only the model longitude, latitude, and time dimension data are interpolated onto the HCR longitude, latitude, and a thinned out (1 s) time dimension, i.e., onto an intermediate 2D (or 1D) HCR track. Before we perform the second interpolation, we compare the altitude from the model surface data with the pressure levels to see where they intersect. Pressure level data with altitudes below the surface altitude is removed such that the lowest model data level always represents the surface model data. In the second step we interpolate to the full HCR time resolution and also to the HCR range grid, if applicable. An example of model data is shown in Figure 4. The model and topography data are then added to the CfRadial files (Table 2).
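The handling of the surface level and the vertical part of the second interpolation step can be sketched for a single model column. This is a simplified illustration, not the processing code itself: the function names are hypothetical, linear interpolation in height is an assumption (the text does not name the interpolation method), and the horizontal and temporal interpolation of the first step is taken as already done.

```python
def build_profile(level_heights_m, level_values,
                  surface_height_m, surface_value):
    """Merge pressure-level data with surface data for one column.

    Pressure levels at or below the model surface are dropped, so that
    the lowest entry of the profile is always the surface value.
    """
    above = sorted(
        (h, v) for h, v in zip(level_heights_m, level_values)
        if h > surface_height_m
    )
    return [(surface_height_m, surface_value)] + above


def interp_to_gates(profile, gate_heights_m):
    """Linearly interpolate a (height, value) profile onto gate heights.

    Gates below the surface or above the top level get None.
    """
    out = []
    for z in gate_heights_m:
        value = None
        for (h0, v0), (h1, v1) in zip(profile, profile[1:]):
            if h0 <= z <= h1:
                frac = (z - h0) / (h1 - h0)
                value = v0 + frac * (v1 - v0)
                break
        out.append(value)
    return out
```

With a surface at 150 m, a pressure level whose geopotential height is 100 m lies below the ground and is discarded before the interpolation onto the range gates.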

Engineering Calibration in the Laboratory
To ensure proper calibration, prior to and after each field campaign, a standard engineering-type calibration is performed on the HCR receiver in the laboratory at NCAR. A signal of known power from a signal generator is injected into the waveguide just on the receiver side of the connection to the antenna. Because of the high frequency at W-band, it is not straightforward to perform the calibration automatically using a controllable signal generator. Instead, a variable attenuator is placed into the circuit between the signal generator and the injection point, and the value of the attenuation is adjusted manually. The digital receiver is used to measure the received power for each injected power value.
As an example, Figure 5 shows the engineering calibration results from 19 June 2019, prior to the OTREC campaign. The individual points show the power as measured by the receiver: red for the H channel and green for the V channel. The last three points to the left are used to estimate the noise power, in this case −60.03 dBm for H and −60.85 dBm for V. The noise-corrected signal power is then computed as the measured power minus the noise power. The noise-corrected powers are shown as solid lines: light blue for H and magenta for V. Ideally, the receiver should be perfectly linear; however, some minor deviations from a straight line are evident in Figure 5. These small deviations could be caused by the manual calibration technique.
Figure 5. Calibration curves for the H channel (red/blue) and V channel (green/magenta). Crosses: measured received power. Lines: measured received power minus estimated noise power. X-axis: input power from signal generator. Y-axis: received power as measured by the digital receiver. Since the receive path is fixed for the HCR, the H and V calibrations apply to both co- and cross-polar measurements.
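The noise correction can be illustrated with a short sketch. It assumes that "measured power minus the noise power" refers to a subtraction in linear units (mW), since powers quoted in dBm cannot be subtracted directly; the function names are hypothetical.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def mw_to_dbm(p_mw):
    """Convert a power from milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)


def noise_corrected_dbm(measured_dbm, noise_dbm):
    """Subtract the noise power from the measured power in linear units.

    Returns None when the measurement does not exceed the noise estimate,
    because the logarithm of a non-positive power is undefined.
    """
    signal_mw = dbm_to_mw(measured_dbm) - dbm_to_mw(noise_dbm)
    if signal_mw <= 0.0:
        return None
    return mw_to_dbm(signal_mw)
```

Far above the noise floor the correction is negligible, whereas a measurement about 3 dB above the noise floor corresponds to a signal roughly equal to the noise power. This is why the noise-corrected lines separate from the measured points only in the lower left of the calibration curves.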
For the calibration shown in Figure 5, an SNR value of 0 dB at a range of 1 km corresponds to a calibrated reflectivity of −24 dBZ. The extension of the linear region of the magenta line, below the noise value of −61 dB in Figure 5, shows that the radar can reliably measure power down to an SNR of about −12 dB. These values are supported by the SNR histogram in Figure 6, which shows that the number of measurements made below −12 dB drops significantly. Using this information, the sensitivity of the V-channel at various ranges is estimated in Table 3.
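The range dependence of the sensitivity can be sketched from the two numbers quoted above. The example assumes the usual range-squared scaling of the radar equation for distributed targets and neglects atmospheric attenuation; it is not meant to reproduce Table 3, which remains the reference. The function name is hypothetical.

```python
import math

DBZ_AT_SNR0_1KM = -24.0  # reflectivity at SNR = 0 dB and 1 km range (text)
MIN_SNR_DB = -12.0       # lowest reliably measurable SNR (text)


def min_detectable_dbz(range_km):
    """Estimate the minimum detectable reflectivity at a given range.

    Assumes the received power from distributed targets falls off with
    range squared, i.e., the threshold rises by 20*log10(range) dB.
    """
    return DBZ_AT_SNR0_1KM + MIN_SNR_DB + 20.0 * math.log10(range_km)
```

Under these assumptions the estimate is −36 dBZ at 1 km and rises by 20 dB per decade of range, to −16 dBZ at 10 km.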
It is important to properly characterize the antenna system to ensure that accurate parameters are provided to the calibration computations. The HCR front-end antenna assembly (Figure 7) was tested in a near-field anechoic chamber in order to characterize the antenna pattern, yielding estimates of the gain and half-power beam width. These parameters are typically provided by the antenna manufacturer; however, with the unique, custom steerable reflector and a cone-shaped radome, the antenna patterns needed to be re-established. NCAR contracted a commercial vendor (Custom Microwave of Longmont, CO, USA) to characterize the antenna, reflector, and radome. The radome is an outer cover protecting the antenna and reflector system, ideally designed to be transparent to microwave energy. The characterization process was performed in three stages, with each stage adding a component: (a) the 12-inch lens antenna only; (b) the lens antenna plus the reflector assembly; and (c) the lens, the reflector, and the radome. The measurements were carried out at two different resolutions: one with a fine grid spacing of 0.5 × wavelength over a 24 × 24 inch scan window (Figure 7) and one with a grid spacing of 1 × wavelength over a 16 × 16 inch scan window.
The test results provide both the shape of the antenna radiation pattern and the power levels and losses associated with the various configurations. An example of the V antenna pattern and amplitude from step (c), i.e., the configuration that is used in the field, is shown in Figure 8 for the principal plane, indicated by the red line. The pattern of the main beam is largely unchanged by the reflector and the radome; however, we do see significant signal loss through the radome (Table 4).
Table 4 shows the measured gains for the various antenna/reflector/radome configurations. The loss from the reflector alone appears to be within the uncertainty of the measurements, so we consider it to be negligible. The loss from the radome is significantly higher than that based on attenuation estimates from the manufacturer.
Table 5 summarizes the laboratory calibration results for the HCR before and after OTREC, plus an estimate of the uncertainty of each quantity. The receiver mismatch loss is computed from theory [14]. All other items are determined by measurement. The values in the "Uncertainty" column indicate an estimate of the uncertainty for each item.
As an external, pod-mounted system, the HCR experiences large temperature variations. During OTREC, the aircraft took off and landed in hot and humid tropical conditions with air temperatures exceeding 30 °C but climbed to altitudes over 14 km and air temperatures below −65 °C during flights. Therefore, in order to maintain accurate receiver gain calibrations, the receiver temperature is monitored and test signals from a noise source with a pre-calibrated Excess Noise Ratio (ENR Q, [15]) are injected into the vertical channel during flight and on the ground. By injecting a known noise power into the vertical receiver channel (termed an NScal event), the gain of the V co-polar channel (DBMVC) is monitored and recorded.
The total V-receiver gain depends on two components: The LNA gain stability is critical to accurate receiver performance; the LNAs are equipped with thermostatically controlled heaters to keep their temperature and corresponding gain as constant as possible. The LNA heater circuit is set to maintain temperatures between 30 °C and 32 °C. During operations, the heaters cycle on and off and the LNA temperature is correlated with the received power (DBMVC). Below we describe how temperature and power data collected during NScal events can be used to establish the temperature vs. power relationships, which are then used to correct the power-related HCR data fields.

Laboratory Calibration Summary and Sensitivity Assessment
It is important to note that not all NScal events are suitable for LNA temperature dependency corrections. When the HCR pod is subjected to very low temperatures over a long period of time, the LNA heaters are not always powerful enough to keep the LNA temperature stable and we see LNA temperatures dropping significantly, sometimes by more than 10 °C. During these times, the heaters do not cycle but are on all the time, and the NScal events performed during such times cannot be used to establish an LNA gain vs. temperature relationship. However, once the relationship has been established from events performed at other times (which we will call "qualifying" events), it can be used to correct the gain during time periods when the LNA temperature is low. An example of a qualifying NScal event from SOCRATES is shown in Figure 9, and we will use it to explain the correction procedure.
As a first step, the LNA temperature data (light blue line in Figure 9a), which is available on a 1 s temporal resolution, is smoothed by applying a 20 s moving average filter (dark blue line in Figure 9a). The 2D (time-range) DBMVC field is averaged in the range dimension to get one power value for each time step (red line in Figure 9a) and then resampled onto the 1 s LNA temperature time (red line in Figure 9b). Inspection of Figure 9a reveals that the dark blue LNA temperature curve lags behind the red power curve by a few seconds. In order to establish a valid relationship between these two curves, we need to correct for this lag. We find both the peaks and the valleys in each curve and calculate the average temporal difference between matching peaks and valleys for each NScal event. We then shift the LNA temperature curve in time by this difference (dark blue line in Figure 9b). After the lag has been corrected, a geometric mean regression [16], which is preferable to a least squares linear regression in situations when both variables contain random errors, is performed between the LNA temperature and the power for each qualifying NScal event (Figure 9c). The regression coefficients are averaged over all qualifying NScal events, and this relationship is used to correct the power for LNA temperature dependency for all qualifying and non-qualifying NScal events, while also taking the time lag between the power and LNA temperature curves into account. The power curve corrected for LNA temperature dependency (pink line in Figure 9b) clearly demonstrates how the LNA temperature correction removes the power fluctuations caused by the cycling of the LNA heater. Time lag and regression coefficients for the different field campaigns are listed in Table 6.
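The regression and correction steps can be sketched as follows. The sketch assumes that smoothing and lag estimation have already been carried out, so that the lag is passed in as a number of 1 s samples. The geometric mean (reduced major axis) regression uses its standard definition, slope = sign(r) · σ_y / σ_x. The reference temperature `ref_temp_c` and all function names are hypothetical, since the text does not state the temperature to which the power is normalized.

```python
import math
from statistics import mean, pstdev


def shift_in_time(series, lag_samples):
    """Advance a lagging series by lag_samples, padding with the last value."""
    if lag_samples <= 0:
        return list(series)
    return list(series[lag_samples:]) + [series[-1]] * lag_samples


def geometric_mean_regression(x, y):
    """Geometric mean regression of y on x.

    Unlike ordinary least squares, the slope is the ratio of the standard
    deviations, signed by the covariance, which treats errors in x and y
    symmetrically.
    """
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    slope = math.copysign(pstdev(y) / pstdev(x), cov)
    intercept = my - slope * mx
    return slope, intercept


def correct_power(power_dbm, lna_temp_c, slope, ref_temp_c):
    """Remove the temperature-dependent part of the power.

    ref_temp_c is a hypothetical reference temperature at which the
    correction is defined to be zero.
    """
    return [p - slope * (t - ref_temp_c)
            for p, t in zip(power_dbm, lna_temp_c)]
```

In the procedure described above, the slope would be averaged over all qualifying NScal events before being applied, and the temperature series would be shifted by the estimated lag before both the regression and the correction.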

At first glance, it may seem counterintuitive that the amplifier gain increases with increasing temperature, which is contrary to what one might expect from a typical amplifier. We conducted several experiments in the lab to confirm the sense of the temperature correction (not shown) and concluded that it is correct as presented in this study. As we collect more data during future field experiments and in the lab, the correction coefficients listed in Table 6 may change.
After the NScal events have been corrected for LNA temperature fluctuations, a relationship between IF-stage gain and pod temperature can be established. Note that for estimating "pod temperature", we average data from four temperature sensors placed at different locations within the pod. To quantify the pod temperature vs. power relationship, we follow the procedure laid out in [15] and first correct $\mathrm{ENR}_Q$, which is 20.84 dB for the HCR, for fluctuations in pod temperature. This yields the corrected $\mathrm{ENR}_{\mathrm{corr}}$ (Equation (1)), where $T_0$ = 290 K. Taking the Boltzmann constant ($K_B$ = 1.38 × 10^−23 J K−1) and the pulse width ($\tau$ in s) into account, $\mathrm{ENR}_{\mathrm{corr}}$ is converted to logarithmic units, $\mathrm{ENR}^{\mathrm{dB}}_{\mathrm{corr}}$ (Equation (2)). We then calculate the difference $P_{\mathrm{diff}}$ between $\mathrm{ENR}^{\mathrm{dB}}_{\mathrm{corr}}$ and the measured power DBMVC (Equation (3)), where C is a constant (30 dBZ for the HCR). Note that the temperature dependency in Equations (1)-(3) is very weak and $\mathrm{ENR}^{\mathrm{dB}}_{\mathrm{corr}}$ is therefore almost constant. $P_{\mathrm{diff}}$ vs. pod temperature for the OTREC campaign is plotted in Figure 10, where panels a and b show the dependency before and after the LNA temperature correction, respectively. During OTREC, NScal events were mostly carried out on the ground before take-off, in the first flight hour once altitude was reached, and at the end of flights during descents. Data from the ground NScal events cluster at high temperatures in the lower right corner of Figure 10a, while the events conducted early in the flight show temperatures decreasing to ~15 °C. The NScal events from the descents show temperatures between 0 and 7 °C and diverge significantly from the expected linear relation. They are exactly the non-qualifying events mentioned before (crosses in Figure 10), where the LNA heater could not keep the temperature at the desired level after several hours of flight at air temperatures below −60 °C. Comparing Figure 10a,b shows that these outliers could be corrected by using the power vs. LNA temperature correction, promising a significant improvement of reflectivity, especially in the later parts of the flights when the pod is very cold.
For the IF-stage gain correction based on pod temperature, we again calculate geometric mean regression coefficients (Table 6), but this time for $P_{\mathrm{diff}}$ vs. pod temperature (regression line in Figure 10b). As expected, no temperature dependency is observed after the correction (Figure 10c). Note that for both temperature dependency corrections, we use the LNA and pod temperatures measured during the lab calibration (Section 3.1.1) as the baseline. They are also listed in Table 6.
With all relationship coefficients, time lags, and lab calibration temperatures established, the power-related fields (DBMVC, DBMHX, and DBZ) are corrected for both temperature dependencies. It is interesting to note that the two temperature corrections are similar in magnitude but opposite in sign (Table 6).

Theory of Observed Sea Surface Backscatter
Using the ocean surface backscatter as an external reference for radar calibration has become a standard procedure for air-and spaceborne radars at W band. The method has been used and refined, e.g., for the Cloud Radar System (CRS) on board the NASA ER-2 research aircraft [17], the Radar Airborne System Tool for Atmosphere (RASTA) on board the French Falcon 20 aircraft [18], for CloudSat [19], or the MIRA radar on board the German High Altitude and Long Range Research Aircraft (HALO) [20]. The technique compares the normalized ocean surface cross section σ0 measured in clear air to an ocean surface backscattering model to investigate the measurement bias.
To calculate $\sigma_0$, we start with three well-known relationships (e.g., [17]). The received power $P_r$ in W for a weather radar is given as:
$$P_r = \frac{P_t\, G_a^2\, \lambda^2\, \sigma_{0\mathrm{Lin}}\, \beta\, \phi\, \cos(\Theta)}{512 \ln(2)\, \pi^2\, l_r\, l_{tx}\, l_{\mathrm{atmLin}}^2\, h^2} \tag{4}$$
where $P_t$ is the peak transmit power in W, $G_a$ is the antenna gain, $\lambda$ is the radar wavelength in m, $\sigma_{0\mathrm{Lin}}$ is the ocean surface cross section in linear units, $\beta$ and $\phi$ are the horizontal and vertical beam widths in rad, $\Theta$ is the radar beam incidence angle in rad, $l_r$ is the loss between the antenna and the receiver port, $l_{tx}$ is the loss between the transmitter and the antenna port, $l_{\mathrm{atmLin}}$ is the zenith one-way path-integrated atmospheric attenuation in linear units, and $h$ is the altitude of the aircraft in m.
The radar constant is defined as:
$$R_c = \frac{1024 \ln(2)\, \lambda^2\, l_r\, l_{tx}\, 10^{24}}{P_t\, G_a^2\, c\, \tau\, \pi^3\, \beta\, \phi\, |K|^2} \tag{5}$$
where $c$ is the speed of light in m s−1, $\tau$ is the pulse width in s, and $K$ is the radar dielectric factor for water. Finally, radar reflectivity in mm^6 m−3 is given by
$$Z = \frac{P_r\, R_c\, h^2}{10^{6} \cos^2(\Theta)} \tag{6}$$
Combining Equations (4)-(6) yields a relatively simple expression for $\sigma_0$ which, after translating to logarithmic units, is as follows:
$$\sigma_0 = \mathrm{DBZ} + 10 \log_{10}\!\left(\frac{\pi^5\, c\, \tau\, |K|^2}{2\, \lambda^4\, 10^{18}}\right) + \left(2\, l_{\mathrm{atm}} - 10 \log_{10}(\cos(\Theta))\right) \tag{7}$$
where $\sigma_0$ is expressed in dB. The first term on the right-hand side of Equation (7) is the measured reflectivity in dBZ. The second term is constant, as it contains only the radar system parameters and the speed of light, where we use $c$ = 3 × 10^8 m s−1, $\tau$ = 2.56 × 10^−7 s, $|K|^2$ = 0.711, and $\lambda$ = 3.2 mm. The third term on the right-hand side is the atmospheric attenuation $l_{\mathrm{atm}}$ in dB multiplied by two (for the transmit and return paths) and adjusted for the incidence angle. Atmospheric attenuation depends on atmospheric pressure, temperature, and relative humidity, and we utilize the ERA5 reanalysis data to calculate $l_{\mathrm{atm}}$ using the wave propagation model by the International Telecommunication Union (ITU) [21]. For comparison purposes, we also implemented the wave propagation model by [22], which produced results that were within ~0.2 dB of the ITU results. This comparison lends confidence to the $l_{\mathrm{atm}}$ estimate.
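As a worked check of Equation (7), the constant second term can be evaluated directly from the parameter values quoted above; it comes out at about −41 dB. The sketch below is illustrative only: the function and constant names are hypothetical, and the attenuation `l_atm_db` is treated as a given input rather than computed from ERA5 data.

```python
import math

C_LIGHT = 3.0e8        # speed of light, m/s (value quoted in the text)
PULSE_WIDTH = 2.56e-7  # tau, s
K_SQUARED = 0.711      # |K|^2, dimensionless
WAVELENGTH = 3.2e-3    # lambda, m


def sigma0_constant_db():
    """Second term on the right-hand side of Equation (7), in dB."""
    ratio = (math.pi ** 5 * C_LIGHT * PULSE_WIDTH * K_SQUARED
             / (2.0 * WAVELENGTH ** 4 * 1e18))
    return 10.0 * math.log10(ratio)


def sigma0_db(dbz, l_atm_db, incidence_deg):
    """Equation (7): measured sigma0 in dB from the surface reflectivity in dBZ.

    l_atm_db is the one-way zenith attenuation in dB (an input here).
    """
    cos_term = 10.0 * math.log10(math.cos(math.radians(incidence_deg)))
    return dbz + sigma0_constant_db() + (2.0 * l_atm_db - cos_term)
```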

Sea Surface Backscatter Modelling
Once the observed $\sigma_0$ has been calculated, it can be compared to that predicted by an ocean surface backscattering model. When the HCR operates at nadir pointing, quasi-specular scattering theory is applicable, which has been shown to work well for low incidence angles. It gives $\sigma_0$ as Equation (8) [17,23], where $v$ is the horizontal surface wind speed in m s−1, SST is the sea surface temperature in °C, $\Gamma_e$ is the ocean surface effective Fresnel reflection coefficient, and $s(v)^2$ is the surface mean square slope, which is discussed below. The ocean surface effective Fresnel reflection coefficient is [17]:
$$\Gamma_e(\lambda, \mathrm{SST}) = \frac{C_e\,\left(n(\lambda, \mathrm{SST}) - 1\right)}{n(\lambda, \mathrm{SST}) + 1} \tag{9}$$
where $C_e$ is the Fresnel reflection coefficient correction factor, which is given as 0.88 by [17] for 94 GHz radars. The complex refractive index for sea water $n$ depends on the wavelength and the sea surface temperature. In theory, it also depends on the salinity of the sea water, but this dependency is very weak, so a constant salinity of 35‰ can be used without loss of accuracy. The dependency on the SST is also relatively weak and therefore the SST is often assumed to be constant, e.g., by [17,20]. However, because the HCR has been deployed in areas with vastly different SSTs, from the Caribbean to the Southern Ocean, including the SST dependency in the calculations is desirable. We use the fit for the microwave dielectric constant of sea water by [24], which is based on microwave satellite observations. Note that [24] gives the frequency validity range of their fit as only "up to at least 90 GHz", slightly below the HCR's 94 GHz. Several empirical relationships exist for the effective mean square surface slope $s(v)^2$.
Cox and Munk [25] developed a linear relationship with wind speed (Equation (10)), which was later refined by Wu [26,27] and Freilich and Vanhoff [28] into a logarithmic relationship (Equation (11)), where a0 and a1 are constants with different values derived by different studies in different wind speed regimes, which are listed in Table 7. We use s(v)^2 by Cox and Munk ([25], which we will call the CM model), Wu ([26,27], the Wu model), and Freilich and Vanhoff ([28], the FV model), and the complex refractive index for sea water by [24] to calculate σ0 with Equation (8). We again use the ERA5 reanalysis data for the U and V surface wind components and for the SST.
Before we compare the model σ0 with that calculated from measurements using Equation (7), we investigate how the model σ0 varies with the surface wind speed and SST. We first vary wind speeds between 1 and 20 m s−1 while keeping the sea surface temperature constant at 20 °C in the CM model (Figure 11a) and then keep wind speed constant at 5 m s−1 while varying the sea surface temperature between 0 and 30 °C (Figure 11b). The sea surface return values of σ0 decrease with increasing angles off nadir as the beam is increasingly scattered in directions other than back to the radar receiver. Variations in sea surface temperature shift the curves up and down by a small, but not insignificant amount (up to ~1.5 dB in the 30 °C temperature range, Figure 11b). Varying the surface wind speed, however, changes the slope of the curves significantly (Figure 11a), where lower wind speeds result in steeper curves and the slope flattens as wind speed increases. These results make intuitive sense when we keep in mind that wind speed is a proxy for wave conditions on the ocean surface. The maximum σ0 is expected when the beam is perpendicular to the wave surface. The farther the angle deviates from perpendicular, the more the power is reflected in directions other than back towards the receiver. At low wind speeds, representing little or no wave activity, the beam is perpendicular to the ocean surface at nadir pointing, and can therefore be almost completely reflected back to the receiver (specular reflection), but the return power decreases significantly with less perpendicular incidence angles. At higher wind speeds, representing significant wave activity, the slope of the waves determines in which direction the power is reflected. In these circumstances, nadir pointing no longer implies a 90° angle between the beam and the ocean surface and significant portions of the power are reflected out of the receive path.
However, at angles pointing off nadir, more of the signal power can be reflected back to the receiver if the beam happens to hit the waves at just the right angle, leading to increased return power, which therefore leads to flatter backscatter curves.
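The wind speed behaviour described above can be reproduced with a short sketch. It assumes the quasi-specular form commonly quoted in the literature, σ0 = |Γe|² / (s² cos⁴Θ) · exp(−tan²Θ / s²), together with a linear slope relation of the Cox and Munk type. The slope coefficients and the value of `gamma_e_sq` are illustrative placeholders, not the Table 7 constants or a value derived from [24], and the function names are hypothetical.

```python
import math


def cm_mean_square_slope(wind_speed):
    """Linear slope vs. wind speed relation (placeholder coefficients)."""
    return 0.003 + 5.08e-3 * wind_speed


def sigma0_model_db(incidence_deg, wind_speed, gamma_e_sq=0.4):
    """Quasi-specular sigma0 in dB for a given incidence angle and wind speed.

    gamma_e_sq stands in for |Gamma_e|^2 and is NOT an HCR value; it only
    shifts the curve up or down and does not change its slope.
    """
    theta = math.radians(incidence_deg)
    s2 = cm_mean_square_slope(wind_speed)
    linear = (gamma_e_sq / (s2 * math.cos(theta) ** 4)
              * math.exp(-math.tan(theta) ** 2 / s2))
    return 10.0 * math.log10(linear)


def drop_db(wind_speed, angle_deg=15.0):
    """Decrease of sigma0 between nadir and angle_deg, in dB."""
    return sigma0_model_db(0.0, wind_speed) - sigma0_model_db(angle_deg, wind_speed)
```

Under these assumptions, `drop_db(2.0)` is much larger than `drop_db(15.0)`, which is the steep low-wind curve and the flat high-wind curve discussed in the text.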

Comparison of Measured and Modelled Sea Surface Backscatter
Comparing modelled and measured σ0 when pointing directly nadir is not ideal, because uncertainties in the reanalysis wind speeds have the biggest effect at very low incidence angles (Figure 11a). Wind speed variations seem to have the least effect between 5° and 15° incidence angles (Figure 11a) and it is therefore desirable to measure σ0 at these angles. During all three field campaigns, sea surface calibration (SScal) events were performed during most flights by scanning the radar ±20° off nadir. This scanning pattern was carried out for at least several minutes at a time.
As W-band radars can be heavily attenuated in clouds, care needs to be taken to only use data without cloud contamination. It is up to the radar operator on board the aircraft to determine suitably clear conditions over the ocean. The operator may use downlinked satellite data, the on-board forward-looking camera, or simply check out of the window to determine cloud conditions. Luckily, clear air conditions are also usually the least interesting from a science perspective so that SScal events during these times have little impact on the scientific objectives of the mission. Nevertheless, cloud contamination often occurs so that the first step in the processing of the SScal data is to filter out cloud contaminated data and other unsuitable data. To identify rays that only traverse clear air, we first remove all zenith-pointing rays and times when the aircraft was flying at altitudes less than 2.5 km since the ocean return at low altitudes can be so strong that it saturates the receiver. For the remaining rays we calculate the sum of the reflectivity values in linear space from the aircraft to the first gate identified as ocean surface (Section 2.3). If the reflectivity sum is larger than a certain threshold (in our case, 0.8 dBZ), we assume that it contains cloud data and exclude it from the SScal analysis. The non-cloud-contaminated results are plotted for each SScal event, along with the three models. Some typical examples are shown in Figure 12. The red and blue lines show the measured σ 0 as a function of incidence angle while the green lines represent the different models. The black line is a fit through the measurements. Comparing the measurements with the model data gives an estimate of how well the radar reflectivity is calibrated. The ERA5 surface wind speed and SST are shown in the upper right corner.
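The clear-air test for a single ray can be sketched as follows. This is a minimal illustration, not the HCR processing code: the function name and the input layout are hypothetical, and gates without a valid echo are assumed to be already masked (here as `None` or NaN) so that receiver noise does not enter the sum.

```python
import math

CLOUD_THRESHOLD_DBZ = 0.8  # threshold quoted in the text


def is_clear_ray(dbz_gates, surface_gate, threshold_dbz=CLOUD_THRESHOLD_DBZ):
    """Return True if the ray is considered free of cloud contamination.

    dbz_gates: reflectivity per range gate in dBZ, ordered from the aircraft
               outwards; None or NaN marks gates without a valid echo.
    surface_gate: index of the first gate identified as ocean surface.
    """
    total_linear = 0.0
    for dbz in dbz_gates[:surface_gate]:
        if dbz is not None and not math.isnan(dbz):
            total_linear += 10.0 ** (dbz / 10.0)  # dBZ -> linear units
    if total_linear == 0.0:
        return True  # no echo at all between aircraft and surface
    return 10.0 * math.log10(total_linear) <= threshold_dbz
```

Rays pointing at zenith and rays recorded below 2.5 km altitude would be discarded before this test is applied.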
Even after the removal of cloud-contaminated cases, which is done automatically in our SScal analysis procedure, not all SScal events can be used for calibration. There are several reasons why SScal events may not be suitable. After the removal of cloud-contaminated data, sometimes not enough data points remain (Figure 13a). In some cases, the slope of the measured σ0 does not agree well with the modelled slope (Figure 13b). Given that the slope is highly sensitive to varying wind speeds (Section 3.2.2), we propose that the disagreement between the slopes of the measured and modelled σ0 does not necessarily mean that the radar is not well calibrated, but rather that the reanalysis wind speed is not representative of the actual wave conditions. This discrepancy is especially likely near coastlines because the assumption that wind speed is a good proxy for wave conditions may not be valid.
SScal events were also not considered when the wind speed is very low and variable within a single event (Figure 13c). Other SScal events were removed because data measured on one side of the aircraft were distinctly different from data measured on the other side of the aircraft (Figure 13d). We hypothesize that these distinct measurements were taken under conditions when the aircraft was flying perpendicular to the wave direction, so that the radar scanned the approaching waves on one side and the departing waves on the other side, resulting in different wave slopes with different scattering properties.
After the removal of the non-suitable SScal events, we were left with 27 good events for CSET, 27 for SOCRATES, and 45 for OTREC. Going through the individual plots of each SScal event (not shown), it is evident that the difference between the models and the observations varies between individual events, which is to be expected. Some events show excellent agreement (e.g., Figure 12a) while others show a significant bias of sometimes >2 dB (e.g., Figure 12b). When the slopes of the measured and modelled σ0 do agree but the measured curve is shifted up or down as a whole, a bias in the radar calibration is likely. Of course, this up or down shift could also be caused by erroneous sea surface temperatures, but that is rather unlikely because the variations are very small (Figure 11b). It is interesting to note that there was not a single model that always had the best agreement with the observations. Rather, different models performed better for different events, different wind speeds, or different incidence angles. In general, the slopes of the CM and Wu models were similar to each other and agreed somewhat better with the measurements than the FV model.
To investigate whether there is an overall bias, we first calculate the difference between the measurements and the models for each data point between the incidence angles of 5° and 15° and then calculate the mean and standard deviation of these differences. To summarize the bias at different incidence angles, we collect the data into 0.5° bins and calculate the mean (Figure 14a-c), mean of the differences (i.e., the bias, Figure 14d-f), and standard deviations within each bin. Comparing the results from CSET, SOCRATES, and OTREC (Figure 14), it is evident that the bias curves of the CM and Wu models have mostly a negative slope (except for high incidence angles in OTREC) whereas the FV model has a steeper positive slope (Figure 14d-f). The steeper slope of the FV model indicates that it is less representative of the HCR measurements than the other two models and therefore we put more emphasis on the CM and Wu models.
As a consequence of the different direction of the slopes in the models, the CM and Wu models agree better with the measurements at low incidence angles when the overall bias is negative (as in CSET, Figure 14d) and at high incidence angles when the overall bias is positive (SOCRATES and OTREC, Figure 14e,f). The opposite is true for the FV model. The ideal model for the HCR is likely somewhere in between the FV model and the CM/Wu models.
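The per-bin bias statistics can be sketched as below. This is an illustrative outline only: the function name and return layout are hypothetical, and signed scan angles on either side of nadir are assumed to be folded onto a single non-negative incidence angle.

```python
import math
import statistics
from collections import defaultdict


def binned_bias(angles_deg, measured_db, modelled_db, width=0.5, lo=5.0, hi=15.0):
    """Mean and standard deviation of (measured - modelled) per angle bin.

    Returns {bin_lower_edge: (mean_bias, std_bias, count)} for incidence
    angles in [lo, hi); points outside that range are ignored.
    """
    bins = defaultdict(list)
    for ang, meas, mod in zip(angles_deg, measured_db, modelled_db):
        a = abs(ang)  # both sides of nadir share a bin (assumption)
        if lo <= a < hi:
            edge = lo + width * math.floor((a - lo) / width)
            bins[edge].append(meas - mod)
    return {edge: (statistics.mean(d), statistics.pstdev(d), len(d))
            for edge, d in sorted(bins.items())}
```

A positive mean in a bin indicates that the measured σ0 exceeds the modelled value at those incidence angles.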
During CSET, we observed a small mean bias of about −0.3 dB with all three models (Figure 14a). Standard deviations were also low, at less than 1 dB. The good agreement between the measurements and the models, and the low standard deviation, can likely be attributed to the quite calm conditions during CSET. Wind speeds were low to moderate (not shown), leading to low wave activity in the Pacific. In SOCRATES, the bias was 1.2 dB with the CM and Wu models and 0.7 dB with the FV model (Figure 14e), with standard deviations of just over 1 dB. Wind speeds were generally very high during SOCRATES, which is reflected in the flat curve of the measured radar cross section (Figure 14b). The angle between the aircraft track and the waves seems to play a significant role in SOCRATES, as there were several cases where the data measured on one side of the aircraft were distinctly different from data measured on the other side of the aircraft, as shown in Figure 13d. In OTREC, the overall bias was the largest at 1.4 dB for the CM model, 1.2 dB for the Wu model, and 1.7 dB for the FV model (Figure 14f). However, the uncertainty in the OTREC results was also the largest, with standard deviations of more than 2 dB (Figure 14c,f). Two main factors likely play a role in the large uncertainty of the OTREC data: (a) wind speeds were generally low, which is unfavourable as the sensitivity to wind speed deviations is the largest at low wind speeds (Figure 11a); and (b) many SScal events were carried out close to the coast where the assumption that wind speed is a good proxy for wave conditions is questionable. Overall, the observed biases of around 1–2 dB are very encouraging, and we consider the HCR to be well calibrated. However, the biases increased from one field campaign to the next, which warrants close attention and is still under investigation.

Spectrum Width Correction
The Doppler spectrum width is a measure of the variability of the observed velocities within the measurement volume of the radar beam. Since the set of observed particles (scatterers) move relative to each other, depending on the level of turbulence, the observed velocities form a distribution, approximately Gaussian in shape. Spectrum width can be thought of as the standard deviation of this velocity distribution.
The motion of the platform (i.e., aircraft) broadens the observed spectrum for the following reason. The HCR beam width is 0.73°. During vertically pointing operations, this means a spread of about 0.36° ahead of the vertical and about 0.36° behind the vertical. The aircraft is typically moving with a ground speed of 150–250 m s−1. Since a radar measures velocity in the radial sense, the particles ahead of the beam center will appear to move towards the aircraft and the particles behind the center will appear to move away from the aircraft. The extra velocity spread at the edge of the beam is approximately rad(0.36°) × aircraft speed, i.e., 1.6 m s−1 at 250 m s−1 ground speed. This effect significantly increases spectrum width. We estimate a correction to spectrum width to account for this effect (Equations (12) and (13)), where vel_plane is the velocity of the aircraft relative to the ground, el is the elevation angle, beamWidth_rad is the radar beam width in radians, and WIDTH_RAW is the measured spectrum width. Generally, the elevation angle will be +90° or −90°, so the sin(el) term mostly reduces to 1.
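The arithmetic behind the 1.6 m s−1 figure can be checked with a few lines of Python. The first function follows the text directly. The second is only a generic illustration of how a broadening term could be removed, assuming that independent, approximately Gaussian contributions add in variance; it is not a transcription of Equations (12) and (13), and both function names are hypothetical.

```python
import math


def edge_velocity_spread(ground_speed, beam_width_deg=0.73):
    """Apparent radial velocity at the beam edge for a vertically pointing beam.

    ground_speed is in m/s; the half beam width is converted to radians.
    """
    half_width_rad = math.radians(beam_width_deg / 2.0)
    return half_width_rad * ground_speed


def remove_broadening(width_raw, broadening):
    """Hypothetical quadrature removal of a broadening term (both in m/s).

    Clipped at zero when the broadening exceeds the measured width.
    """
    return math.sqrt(max(width_raw ** 2 - broadening ** 2, 0.0))
```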

Radial Velocity Correction
A Doppler radar such as the HCR measures velocity in a radial sense, i.e., towards or away from the instrument. In vertical pointing modes, it is important to keep the beam pointing as close to truly vertical as possible so that the aircraft motion is orthogonal to the pointing angle. If the beam is not truly vertical, it is important to correct the measurements for platform motion and pointing angle deviations from the vertical. Details on the development of a correction methodology suitable for the HCR are described in [6]. Therefore, here we will only give a brief description of the current implementation and updates to the methodology.
Radial velocity correction is a two-step process. First, velocity is corrected for vertical and horizontal platform motion, and deviations of the elevation angle from vertical pointing. For this step, we use an earth-centric coordinate system where the x-axis points east, the y-axis points north, and the z-axis points up. We further need to keep in mind that the radar azimuth angle ($az$) is positive clockwise from north, and the elevation angle ($el$) is positive up from horizontal. Given the measured eastward ($vel^{\mathrm{east}}_{\mathrm{plane}}$), northward ($vel^{\mathrm{north}}_{\mathrm{plane}}$), and vertical velocity ($vel^{\mathrm{vert}}_{\mathrm{plane}}$) of the aircraft, we can calculate the corrections in x, y, and z direction as:
$$x_{\mathrm{corr}} = \sin(az)\cos(el)\; vel^{\mathrm{east}}_{\mathrm{plane}}, \qquad y_{\mathrm{corr}} = \cos(az)\cos(el)\; vel^{\mathrm{north}}_{\mathrm{plane}}, \qquad z_{\mathrm{corr}} = \sin(el)\; vel^{\mathrm{vert}}_{\mathrm{plane}}.$$
The motion- and angle-corrected radial velocity (VEL) is then obtained by combining these three corrections with VEL_RAW, the measured radial velocity.
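The three platform motion terms can be computed as in the following sketch, which also shows why small pointing errors matter: with the beam only 2° off nadir, an eastward ground speed of 200 m s−1 projects about 7 m s−1 onto the beam. The function name and argument order are hypothetical, and the sketch returns the individual terms without combining them with VEL_RAW.

```python
import math


def platform_motion_terms(az_deg, el_deg, vel_east, vel_north, vel_vert):
    """Projection of the aircraft velocity components onto the radar beam.

    az_deg: azimuth, positive clockwise from north (degrees).
    el_deg: elevation, positive up from horizontal (degrees).
    Velocities are in m/s in an east/north/up coordinate system.
    Returns (x_corr, y_corr, z_corr) in m/s.
    """
    az, el = math.radians(az_deg), math.radians(el_deg)
    x_corr = math.sin(az) * math.cos(el) * vel_east
    y_corr = math.cos(az) * math.cos(el) * vel_north
    z_corr = math.sin(el) * vel_vert
    return x_corr, y_corr, z_corr
```

For example, `platform_motion_terms(90.0, -88.0, 200.0, 0.0, 0.0)` returns an x term of roughly 6.98 m s−1, while at exactly −90° elevation the horizontal terms vanish to within floating-point precision.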
In the second step, we attempt to correct any remaining biases by assuming that the ocean/land surface is stationary, i.e., that it has a radial velocity of zero. This step can therefore only be applied to nadir-pointing data. In principle, we can simply add or subtract the radial velocity of the gates identified as surface (Section 2.3) in each ray to or from each range gate, forcing the surface to have zero radial velocity. However, it is important to filter the observed surface velocity before applying the correction, so that measurement noise or non-stationary surface features (such as waves) do not introduce new errors into the data. We use a third-order Savitzky-Golay filter [29] with a 15 s length for CSET and OTREC, and a 20 s length for SOCRATES, to smooth the surface radial velocity before applying the correction. Special care needs to be taken in cases where the surface echo is extinct (Section 2.3). In these cases, we first remove observations at the edges of the surface echo gap, which are often unreliable as the signal weakens. We then fill the gap with radial velocity data from before the gap, averaged over a certain time period, apply the Savitzky-Golay filter to the filled-in data, and apply the correction to the observed velocity VEL to obtain the final corrected velocity field VEL_CORR.
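The surface reference step can be sketched as follows. This is a simplified, pure-Python illustration and not the processing code used for the HCR data: the function names, the 10 Hz ray rate used to convert the filter length from seconds to samples, the number of rays averaged to fill a gap, and the edge handling (repeating the end values) are all assumptions. The smoothing uses the closed-form convolution weights of a quadratic/cubic Savitzky-Golay filter for uniformly sampled data, and the removal of unreliable observations at the gap edges is omitted for brevity.

```python
import math


def savgol_cubic_weights(half_width: int) -> list:
    """Closed-form smoothing weights of a quadratic/cubic Savitzky-Golay filter."""
    m = half_width
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [(3 * (3 * m * m + 3 * m - 1) - 15 * j * j) / denom
            for j in range(-m, m + 1)]


def fill_gaps(values: list, n_avg: int) -> list:
    """Fill NaN gaps with the mean of up to n_avg valid samples before each gap."""
    filled = list(values)
    fill_value = math.nan
    in_gap = False
    for i, v in enumerate(values):
        if not math.isnan(v):
            in_gap = False
            continue
        if not in_gap:  # first ray of a new gap: average the valid rays before it
            before = [x for x in values[max(0, i - n_avg):i] if not math.isnan(x)]
            if not before:
                raise ValueError("no valid surface velocity before the gap")
            fill_value = sum(before) / len(before)
            in_gap = True
        filled[i] = fill_value
    return filled


def smooth_surface_velocity(surface_vel: list, filter_length_s: float,
                            ray_rate_hz: float = 10.0, n_avg: int = 50) -> list:
    """Gap-fill and smooth the per-ray surface radial velocity."""
    # ASSUMPTION: ray_rate_hz converts the filter length in seconds to samples.
    half_width = max(2, int(round(filter_length_s * ray_rate_hz)) // 2)
    weights = savgol_cubic_weights(half_width)
    filled = fill_gaps(surface_vel, n_avg)
    n = len(filled)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            j = min(max(i + k - half_width, 0), n - 1)  # repeat end values at edges
            acc += w * filled[j]
        smoothed.append(acc)
    return smoothed


def surface_reference_correction(vel: list, surface_vel: list,
                                 filter_length_s: float,
                                 ray_rate_hz: float = 10.0) -> list:
    """Subtract the smoothed surface velocity from every gate of each ray.

    vel is a list of rays, each a list of gate velocities (VEL).
    Returns the corrected field (VEL_CORR) with the same shape.
    """
    smoothed = smooth_surface_velocity(surface_vel, filter_length_s, ray_rate_hz)
    return [[gate - s for gate in ray] for ray, s in zip(vel, smoothed)]
```

A call such as `surface_reference_correction(vel, surface_vel, 15.0)` would correspond to the 15 s filter length quoted for CSET and OTREC under the assumed ray rate. In practice, a library routine such as SciPy's `savgol_filter` would be a more natural choice than the hand-written weights used here to keep the sketch dependency-free.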
An example of the radial velocity correction process for data collected in a descent during SOCRATES RF01 (Figure 15) shows how the vertical and horizontal aircraft motion manifests as vertical columns of high or low velocities in the uncorrected radial velocities VEL_RAW (Figure 15a). Non-vertical pointing, caused by deviations in aircraft pitch during the descent, leads to strong biases. Both the nadir- and zenith-pointing data are much improved after the first step of the correction (Figure 15b). The radial velocity is now consistent between the nadir- and zenith-pointing data, with vertical velocities of about −1 m s$^{-1}$ above the bright band and −2 to −3 m s$^{-1}$ below the bright band. However, the radial velocity of the ground, in this case the topography, which appears as a line-like structure in the nadir-pointing data, still shows a negative bias (green colors) in Figure 15b. This bias is corrected with the second step, i.e., the surface reference method (Figure 15c), which removes the bias and brings the surface echo close to zero (gray colors), with measurement noise evenly distributed on each side of zero (green and yellow colors). The second step changes the vertical velocity in the nadir-pointing data by ~0.3 m s$^{-1}$ (Figure 15c).
Several issues still need to be considered after both corrections. As already mentioned, the second step of the correction cannot be applied to zenith-pointing data, which therefore may contain undetected biases. Whether and how these biases can be quantified and corrected is still a topic of investigation. Another problem that cannot be corrected is that the radar, while it rotates freely around the longitudinal axis, has only very limited rotation around the lateral axis (~4° up and down). This means that when the aircraft has significant pitch deviations (larger than the ones shown in Figure 15), e.g., during steep climbs, the tilt angle correction of the radar is less than theoretically required, leading to erroneous angles, and the first step of the velocity correction fails. In nadir-pointing mode, this can partly be compensated with the second correction step, but in zenith-pointing mode, the velocities are unreliable in these situations. It is also important to keep in mind that during SScal events, the angles are so far off nadir that the velocities cannot be corrected because of velocity folding; in other words, the measured velocity is no longer within the unambiguous (Nyquist) velocity interval of the radar.
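To illustrate the velocity folding mentioned above, the sketch below wraps a true radial velocity into the unambiguous interval from −v_Nyq to +v_Nyq. The Nyquist velocity uses the standard relation v_Nyq = wavelength × PRF / 4. The function names are hypothetical, and the pulse repetition frequency of 10 kHz is a value chosen purely for illustration; it is not taken from this document.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def nyquist_velocity(frequency_hz: float, prf_hz: float) -> float:
    """Unambiguous (Nyquist) velocity in m/s for a given radar frequency and PRF."""
    wavelength = C_LIGHT / frequency_hz
    return wavelength * prf_hz / 4.0


def fold_velocity(true_vel: float, v_nyq: float) -> float:
    """Wrap a true radial velocity into the interval [-v_nyq, +v_nyq)."""
    return (true_vel + v_nyq) % (2.0 * v_nyq) - v_nyq


if __name__ == "__main__":
    # ASSUMPTION: 10 kHz PRF, for illustration only.
    v_nyq = nyquist_velocity(94e9, 1e4)
    for v in (3.0, 10.0, -10.0):
        print(f"true {v:+.1f} m/s -> measured {fold_velocity(v, v_nyq):+.2f} m/s")
```

Once the platform-motion contribution to the radial velocity exceeds the Nyquist velocity, the measured value no longer identifies the true velocity uniquely, which is why the correction steps cannot recover it.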

Conclusions
The NCAR HCR has been deployed in three major field campaigns ranging in location from the tropics to the Southern Ocean. To provide the best possible data to the scientific community, we have developed extensive quality assurance and quality control procedures. The QC steps summarized below have been applied to all three data sets.
A standard engineering-type calibration is carried out on the HCR receiver in the laboratory both before and after each field campaign in order to characterize the receiver performance. Furthermore, NCAR contracted with an outside vendor to quantify losses due to the reflector and radome assembly, which revealed that the combined one-way loss of the reflector and the radome amounts to approximately 2 dB. Post field campaign, data collected during noise source calibration events was used to analyse system gain changes over the extreme temperature range that the radar is exposed to during flight. Both the LNA temperature data and the data from the other temperature sensors within the pod were used to correct the receiver gain.
To check the reflectivity calibration, so-called sea surface calibration events were conducted during most flights, during which the radar was scanning cross-track 20° off nadir for several minutes in clear conditions. The ocean surface cross-section measurements collected during these events were compared to theoretical values calculated from several different ocean surface backscattering models. These comparisons show that the HCR is calibrated to within ~1-2 dB of the theory, which underscores the high quality of the data.
The spectrum width was corrected for the spectral broadening that is caused by the motion of the aircraft. Radial velocity was corrected in a two-step process: (a) velocity data was corrected for platform motion and pointing deviations relative to nadir or zenith via simple trigonometry; (b) velocity measurements collected during nadir-pointing periods were further corrected by adjusting the data so that the filtered velocity of the sea or land surface is zero.
To aid the scientific community in their research using HCR data, we interpolated ERA5 pressure level and surface variables onto the HCR time-range grid. The reanalysis data was used in the sea surface calibration modelling and provides an environmental reference for the observed radar fields. Terrain elevation values at each point in the aircraft track were also added to the data set. We developed an echo identification algorithm that classifies each data point into categories, such as cloud, surface echo, or noise source calibration, among others. This classification is provided to the users in a FLAG field, which allows them to mask out undesired data.