Characterizing Buoy Wind Speed Error in High Winds and Varying Sea State with ASCAT and ERA5

Buoys provide key observations of wind speed over the ocean and are routinely used as a source of validation data for satellite wind products. However, the movement of buoys in high seas and the airflow over waves might cause inaccurate readings, raising concern when buoys are used as a source of wind speed comparison data. The relative accuracy of buoy winds is quantified through a triple collocation (TC) exercise comparing buoy winds to winds from ASCAT and ERA5. Differences between calibrated buoy winds and ASCAT are analyzed through separating the residuals by anemometer height and testing under high wind-wave and swell conditions. First, we converted buoy winds measured near 3, 4, and 5 m to stress-equivalent winds at 10 m (U10S). Buoy U10S from anemometers near 3 m compared notably lower than buoy U10S from anemometers near 4 and 5 m, illustrating the importance of buoy choice in comparisons with remote sensing data. Using TC calibration of buoy U10S to ASCAT in pure wind-wave conditions, we found that there was a small, but statistically significant difference between height adjusted buoy winds from buoys with 4 and 5 m anemometers compared to the same ASCAT wind speed ranges in high seas. However, this result does not follow conventional arguments for wave sheltering of buoy winds, whereby the lower anemometer height winds are distorted more than the higher anemometer height winds in high winds and high seas. We concluded that wave sheltering is not significantly affecting the winds from buoys between 4 and 5 m with high confidence for winds under 18 ms−1. Further differences between buoy U10S and ASCAT winds are observed in high swell conditions, motivating the need to consider the possible effects of sea state on ASCAT winds.


Introduction
High-quality wind speed observations in extreme conditions over the ocean are important to many aspects of meteorology and oceanography. Weather forecasting for the purposes of ship routing depends on reliable observations of winds and waves in conjunction with numerical weather prediction (NWP) to advise ships of adverse conditions along their routes. The availability of reliable winds from remote sensing instruments has advanced the capability of NWP models in this effort, especially winds from scatterometry [1]. Moreover, the forcing of ocean currents with NWP generally relies on calculations of wind stress using wind speed as an input variable and the accuracy of the calculated wind stress values is a function of wind speed accuracy. Bulk formula calculations of wind stress scale with the square or cube of the input wind speeds, depending on the wind speed [2,3]. Therefore, these wind stress calculations are especially sensitive to error at high wind speeds. Applications using bulk formulations of wind stress, such as ocean surface fluxes of momentum, heat, and carbon, rely on accurate winds relative to the ocean surface and are a critical component of characterizing changes in the ocean with a warming climate.
Standard sources of in situ wind data include ships, buoys, and anemometers mounted on platforms such as oil rigs. The availability of observations away from coastal regions is complicated by accessibility for routine maintenance and the harshness of weather conditions in open ocean regions such as the Southern Ocean. For these reasons, remote sensing observations play a prominent role in characterizing winds over the open ocean. Meteorological buoys are a common source of data used for training retrieval functions for remotely sensed ocean vector winds [4][5][6]. The accuracy of buoy winds is complicated by errors which may scale with the wind speed and sea state. These error sources can be either defined by the physical errors inherent to the buoy instruments and platform or errors induced by the environment in which the buoy is located. Physical errors due to the buoy platform include the effects of flow distortion by either the buoy instruments or the platform itself [7]. Environmental errors in buoy wind speed include the near-surface flow distortion over large surface waves [8,9] and the errors due to the pitch and roll of the buoy in high seas.
The movement of buoys in high winds and associated high waves has been suggested to cause problems for the accurate measurement of buoy wind speeds. Buoys may exhibit a low wind speed bias in high winds due to the effects of wave sheltering of buoy anemometers in the troughs of large waves [3,5]. Further non-linear effects may exist in the averaging of winds with the vertical movement of buoys in a logarithmic profile of winds within the near-surface boundary layer [10]. The problem with vertical averaging of the wind is twofold: waves are shaped such that the buoy spends more time in the trough than on the crest, and a logarithmic profile causes winds in the trough, even in the absence of sheltering, to be reduced more than they are increased by an equal and opposite upward motion. Flow over surface waves also distorts the notion of a logarithmic wind profile, and the wave boundary layer (WBL) is a concern for estimating winds at 10 m from measurements below this height [11].
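The asymmetry in a logarithmic profile follows from the concavity of the logarithm, and a short numerical sketch makes it concrete. The example assumes a neutral log profile; the values of u*, z0, nominal height, and vertical excursion are illustrative only and are not taken from the study, and the function name is hypothetical.

```python
import math

def neutral_log_wind(z, u_star=0.5, z0=1e-3, k=0.4):
    """Neutral log-profile wind speed (m/s) at height z (m). Illustrative values."""
    return (u_star / k) * math.log(z / z0)

z_nominal = 4.0  # nominal anemometer height (m), assumed for illustration
heave = 2.0      # equal downward and upward excursion (m), assumed

u_nominal = neutral_log_wind(z_nominal)
u_low = neutral_log_wind(z_nominal - heave)
u_high = neutral_log_wind(z_nominal + heave)

loss_when_low = u_nominal - u_low     # reduction at the lower height
gain_when_high = u_high - u_nominal   # increase at the higher height
u_averaged = 0.5 * (u_low + u_high)   # equal-time average of the two heights

print(f"nominal: {u_nominal:.2f} m/s, averaged: {u_averaged:.2f} m/s")
print(f"loss: {loss_when_low:.2f} m/s, gain: {gain_when_high:.2f} m/s")
```

Even with equal time spent at both heights, the averaged wind falls below the wind at the nominal height; spending more time in the trough would increase the deficit.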
Traditional techniques used to calculate the wind speed bias of buoys in high winds have not directly accounted for sea state effects, but rather use comparisons of the differences between buoy winds and alternative wind sources at high wind speeds; some of which are uncalibrated [3,5]. To quantify a bias in buoy winds due to sea state, a reference dataset of unbiased observations which are not significantly affected by sea state is needed for comparison. One of the first studies to identify this low bias used buoys with anemometer heights between 3 and 5 m compared to winds from the European Centre for Medium Range Weather Forecasts (ECMWF) operational model to develop linear corrections of buoy winds at high wind speeds based on anemometer height [3]. An alternative study used surface air pressure data with buoys as input into a numerical model to infer a low bias for buoy winds at high wind speeds, but also did not account for sea state effects [5]. There are studies that have used sea state as a factor in comparisons of scatterometer and buoy winds [4], but the results from these studies are limited to the predominant meteorological and sea state conditions of the local area and the physical characteristics of the individual buoys used.
Modern comparisons using buoy data are typically performed with NWP, reanalysis products, or remote sensing sources. Scatterometer winds are often compared with buoy winds adjusted to 10 m equivalent neutral winds (U10N). Scatterometers measure electromagnetic backscatter from roughness elements on the ocean surface and are not sensitive to actual winds in the surface layer, but rather to the roughness elements, which respond to stress on the ocean surface [12]. Roughness elements at the ocean surface are modified by stability-dependent stress rather than the stability-dependent wind profile in the boundary layer. Therefore, comparisons of scatterometer-derived winds with in situ winds cannot be soundly achieved without first accounting for the effects of stability. Calculations of equivalent neutral wind account for stability by using a stress and roughness length consistent with the observed atmospheric stratification and then setting the atmospheric stratification term in the modified log-wind profile equal to zero [13][14][15].
Recent work has shown that the measurements from scatterometers have a more direct relationship with surface stress than wind speed because knowledge of the air density improves the match between scatterometer measurements and in situ wind measurements [12,16]. Microwave backscatter from surface roughness elements depends on momentum transferred to the surface water waves, which is affected by both the air density (ρ) and boundary-layer stratification. Equivalent neutral winds account for boundary-layer stratification, but not air density. Stress-equivalent winds account for both effects and are calculated by scaling U10N with the square root of the ratio of ρ to the global mean air-mass density above the ocean (ρ0 = 1.225 kg m−3):
$$U_{10S} = \sqrt{\frac{\rho}{\rho_0}}\; U_{10N} \qquad (1)$$
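The density correction can be sketched in a few lines. The example assumes the square-root density scaling commonly used to define stress-equivalent wind; the function name and the sample density are hypothetical and chosen only for illustration.

```python
import math

RHO_REF = 1.225  # kg m^-3, global mean air-mass density quoted in the text

def stress_equivalent_wind(u10n, rho_air):
    """Convert a 10 m equivalent neutral wind (m/s) to a stress-equivalent wind.

    Assumes U10S = sqrt(rho_air / RHO_REF) * U10N.
    """
    return math.sqrt(rho_air / RHO_REF) * u10n

# Cold, dense air (illustrative density) gives a U10S slightly above U10N.
print(stress_equivalent_wind(15.0, 1.30))
```

Because the scaling enters as a square root, a density departure of a few percent changes the wind by roughly half that percentage.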
After comparing scatterometer winds to buoy U10S, variability between ASCAT and buoys can be assessed for possible errors caused by flow distortion and sea state effects, but additional systematic errors due to differences in measurement scales must also be accounted for because we cannot assume that the two sets of observations are perfectly intercalibrated. Implicit assumptions about the errors of a comparison dataset can cause errors in a final calibrated product. Some of the errors that cannot be resolved by dual comparisons include temporal and spatial representation errors, geophysical representation errors, and the error distributions inherent to the individual sources [17]. With dual collocation analyses there is no clear way to differentiate these types of errors introduced by either system. A method introduced by [18] shows that when the errors from three separate datasets are uncorrelated, the errors may be estimated from the covariances of the three datasets. Triple collocation (TC) allows for the relative linear calibration of the three systems using one type of wind data as the reference system and rescaling the distributions with the estimated errors from mutual calibration [17,18]. While this approach remains sensitive to errors in the comparison data, it does allow for differentiation of the errors from each of the other datasets relative to the comparison data.
This study uses TC with buoy winds from the Global Telecommunications System (GTS), scatterometer winds from the Advanced Scatterometer (ASCAT), and model short range forecast winds used as first-guess (FG) in the ECMWF fifth reanalysis dataset (ERA5) to quantify the effects of sea state on buoy winds in extreme conditions by using ASCAT winds for reference and calculating the linear calibration coefficients of buoy winds under varying high wind and wave conditions. To assess the observational error of buoys at high wind speeds, this study flips the traditional method of using buoy data as "ground truth". Scatterometer retrievals of wind have been extremely stable over time [19]; hence, they are reasonable to use as comparison data over the life of the missions and scatterometer observations are clearly not sensitive to the physical characteristics of buoys. Scatterometer winds are used as an independent record to test the changes of the observed buoy wind speeds but are not used as a wind standard for gauging the accuracy of the measured buoy winds. This approach does not require that buoy observations or scatterometer winds are correct. It is designed to determine if there is a sea state dependence in the height adjusted buoy winds.
The data are described in Section 2, including the adjustment of buoy and ERA5 FG winds to U10S. The TC error model is described in Section 3. High wind speed comparisons in Section 4 show that buoy U10S performs well when using the full range of winds and seas from all buoys but differs when isolating the statistics to buoys grouped by anemometer height ranges. Sea state comparisons in Section 5 show that knowledge of ocean swell may improve scatterometer wind accuracy and that there appears to be a small dependence of wind speed error on sea state for buoys with anemometer heights ranging from 4 to 5 m. The results section is followed by a discussion in Section 6 and the main conclusions in Section 7.

Data
A triple collocation dataset was produced to carry out the work described in this paper using winds from moored buoys acquired through the GTS, ERA5 FG winds, and stable ASCAT winds as the reference system. The ASCAT instrument is a scatterometer carried on the Meteorological Operational (Metop) satellites operated by the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT). For this project, the EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSI SAF) Level 2 ASCAT Coastal Wind product [6,20] on a 12.5-km grid was used with data ranging from 18 August 2010 from Metop-A and 29 October 2012 from Metop-B through 31 October 2019. For the ASCAT instrument onboard both satellites, the geophysical model function (GMF) used to generate equivalent neutral wind vectors based on normalized radar cross-section (NRCS) readings was CMOD7 [21].
The recent C-Band High and Extreme Force Speeds (CHEFS) report from KNMI [11] shows that ASCAT CMOD7 winds agree well with buoys at high winds between 15 and 25 ms−1, but lack good GMF calibration because there is no consolidated in situ wind speed reference for winds above 15 ms−1. The saturation of the GMF at extreme winds (>25 ms−1) is partially compensated by the high calibration stability of the ASCAT instrument. Other studies have examined the quality of ASCAT winds in hurricane force conditions. A recent study comparing ASCAT winds to moored buoys and microwave radiometer winds showed that ASCAT wind retrievals are accurate at high winds up to 25 ms−1, but start to show signal saturation above 42 ms−1 [22], although that study used a different ASCAT GMF. For the purposes of this study, we assessed the buoy wind quality using ASCAT winds below 25 ms−1, where ASCAT winds are shown to be stable, but degradation of buoy wind speed quality may occur. Note that buoy wind measurements above 25 ms−1 are rare and hence not suitable for statistical analysis.
The in situ wind observations were obtained from buoy data reported through the GTS which are archived and quality controlled by the European Centre for Medium Range Weather Forecasts (ECMWF) [23]. The GTS buoys included in this study are provided by the centers listed in Table 1. The buoy locations are presented in Figure 1. Only buoys moored at least 15 km off shorelines which report both wind speed and direction are used in the TC analyses. The buoy winds are reported hourly by averaging the wind speed over 10 min. Note that many of the GTS buoy winds are binned every 1 ms−1 in speed and 10° in direction due to the GTS buffer message format.
Overall, there are 25 buoys in the anemometer height range shown in Figure 1 centered at 3 m (squares in Figure 1), 155 buoys in the range centered at 4 m (triangles), and 115 buoys in the range centered at 5 m (circles). Buoys with changes in anemometer height and/or hull configurations were treated separately in the analysis. After 2016, the NDBC replaced many of their buoys that had anemometer heights near 5 m with buoys that have a smaller hull and an anemometer at a height closer to 4 m. These buoys are indicated with a star marker in Figure 1. We took advantage of this change to test the wind differences of these buoys with different anemometer heights moored at the same locations in the TC comparison results in Section 5. In addition, most of the TAO buoys with the Atlas configuration and hull were replaced with the TAO Refresh configuration during the study period [24] and these buoys were treated separately in the TC analysis.
ERA5 FG winds are used as the third dataset for TC. The ERA5 dataset has a grid spacing of 31 km with winds from 1-hourly forecasts initialized at 06 and 18 UTC [25]. Hourly output of ERA5 FG variables was matched in space and time with GTS buoys by using buoys within the overlying ERA5 grid cell to the closest hourly time. Since GTS buoys report a wind speed average from the last 10 min of each hour, the time difference between the ERA5 FG and buoy winds was usually within a few minutes. The scatterometer observations from ASCAT-A and ASCAT-B were collocated with buoy data in time and space within 25 km and 30 min (less than half the time of one orbit). Only the closest wind vector cell observation within the time and spatial range was chosen for each buoy collocation. With this criterion we have a total of 444,102 triplets. There are instances of matches outside of the ASCAT product grid spacing of 12.5 km but less than 25 km (less than 4.5% of the data). Although in-swath grid spacing for the ASCAT products improved from 25 to 12.5 km, the effective resolution shown by spectral analysis was larger than the grid spacing value [1,26].
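The ASCAT-to-buoy matching rule (closest wind vector cell within 25 km and 30 min) can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the record layout (dictionaries with `lat`, `lon`, and `t_min` keys), the function names, and the great-circle distance on a spherical Earth are all assumptions.

```python
import math

MAX_DIST_KM = 25.0   # spatial window from the text
MAX_DT_MIN = 30.0    # temporal window from the text
EARTH_RADIUS_KM = 6371.0  # spherical Earth, assumed

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def closest_wvc(buoy, wvcs):
    """Return the closest wind vector cell inside both windows, or None."""
    best, best_dist = None, None
    for wvc in wvcs:
        if abs(wvc["t_min"] - buoy["t_min"]) > MAX_DT_MIN:
            continue  # outside the 30 min window
        dist = haversine_km(buoy["lat"], buoy["lon"], wvc["lat"], wvc["lon"])
        if dist > MAX_DIST_KM:
            continue  # outside the 25 km window
        if best is None or dist < best_dist:
            best, best_dist = wvc, dist
    return best
```

Selecting a single closest cell per buoy report keeps each triplet independent of neighbouring cells in the same swath.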

Wind Adjustments
The first step in calculating buoy U10S is the conversion of raw winds measured at the buoy anemometer height to buoy U10N. The definition of equivalent neutral winds used for this study follows the conventions of [14]. The Coupled Ocean-Atmosphere Response Experiment (COARE) 3.5 bulk model [27] was used for the conversion of raw winds to U10N and the calculation of the friction velocity from both buoy and ERA5 FG winds. The modified log-wind profile is shown in Equation (2):
$$U(z) - U_s = \frac{u_*}{k}\left[\ln\left(\frac{z}{z_0}\right) - \phi\left(\frac{z}{L}\right)\right] \qquad (2)$$
where U(z) is the wind at the anemometer height, Us is the velocity frame of reference (the surface current), u* is the friction velocity, k is von Karman's constant, z is the height above the ocean surface, z0 is the roughness length, ϕ is a function of atmospheric stability [28,29], and L is the Monin-Obukhov scale length [30,31].
First, u* and z0 were calculated using known values of U(z) from the buoy, and temperature and pressure from ERA5, which are factored into ϕ. The u* and z0 values were used to calculate U10N using Equation (2) with ϕ = 0. U10S was then calculated using Equation (1). In our analysis we assumed a value of Us = 0 ms−1 in the model.
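The two-step height adjustment can be illustrated with a deliberately simplified sketch. It is not the COARE 3.5 algorithm: it assumes neutral stratification throughout (ϕ = 0), Us = 0, a roughness length made of a smooth-flow term plus a Charnock term, and a wind-speed-dependent Charnock coefficient whose slope (0.0017 m−1 s) and intercept (−0.005) are assumed here. All names and constants are hypothetical choices for illustration.

```python
import math

K = 0.4        # von Karman constant
G = 9.81       # gravitational acceleration (m s^-2)
NU = 1.5e-5    # kinematic viscosity of air (m^2 s^-1), assumed
GAMMA = 0.11   # smooth-flow roughness Reynolds number
M, B = 0.0017, -0.005  # assumed Charnock slope (m^-1 s) and intercept

def neutral_u10(u_z, z, max_iter=100, tol=1e-8):
    """Adjust a wind u_z (> 0, m/s) at height z (m) to 10 m, neutral case.

    Returns (u10n, u_star, z0) from a fixed-point iteration.
    """
    u_star = 0.035 * u_z  # first guess for the friction velocity
    u10n = u_z            # first guess for the 10 m neutral wind
    z0 = 1e-4
    for _ in range(max_iter):
        alpha = max(M * u10n + B, 0.0)  # clamp: no negative Charnock term
        z0 = GAMMA * NU / u_star + alpha * u_star ** 2 / G
        u_star_new = K * u_z / math.log(z / z0)
        u10n_new = (u_star_new / K) * math.log(10.0 / z0)
        converged = abs(u10n_new - u10n) < tol and abs(u_star_new - u_star) < tol
        u_star, u10n = u_star_new, u10n_new
        if converged:
            break
    return u10n, u_star, z0

for height in (3.0, 4.0, 5.0):
    print(height, round(neutral_u10(10.0, height)[0], 2))
```

For the same measured speed, a lower anemometer yields a larger upward adjustment, which is why residual differences between the 3, 4, and 5 m groups after adjustment are informative.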
The COARE algorithm parameterizes z0 into two separate terms, with the first based upon the roughness of the ocean under aerodynamically smooth conditions while the second term accounts for the wind stress in the form of surface gravity waves (Equation (3)) [27]:
$$z_0 = z_0^{smooth} + z_0^{rough} = \gamma\,\frac{\nu}{u_*} + \alpha\,\frac{u_*^2}{g} \qquad (3)$$
Here ν is the kinematic viscosity, g is the gravitational acceleration, and γ is the roughness Reynolds number for smooth flow (γ = 0.11 from laboratory experiments [27]). The influences of sea state on z0 are parameterized through the Charnock coefficient (α). Based on data from four separate air-sea field experiments with momentum flux measurements, a wind speed dependent form of α was calculated where α = mU10N + b with m = 0.0017 m−1 s and b = −0.005.
Another form includes the parameterization of the influence of surface waves on the ocean roughness with:
$$z_0^{rough} = H_s\, D \left(\frac{u_*}{C_p}\right)^2 \qquad (4)$$
where D = 0.09 and Hs is the significant wave height. Here, z0 rough is also a function of the inverse wave age (wa−1). wa is defined as Cp/u*, where Cp = gTp/2π (Equation (19)) is the phase speed of the dominant wave, and Tp is the spectral peak period. Since this study involves comparisons of buoy winds against ASCAT under varying sea state conditions, we first tested the sensitivity of the height adjustment of raw buoy winds to U10S using both the wind speed dependent form of α and the sea state form as a function of Hs and wa. Figure 2 shows scatterplots of ASCAT winds versus buoy U10S derived from the COARE 3.5 model using both the "wind speed only" form of z0 (Figure 2, left) and the z0 parameterization based upon sea state conditions (Figure 2, right).
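The sea-state quantities entering this form are simple to compute. The sketch below assumes deep-water waves for the phase speed and a rough-flow roughness of the form Hs·D·(u*/Cp)², which is an assumption about the functional form. The function names and sample values are hypothetical.

```python
import math

G = 9.81   # gravitational acceleration (m s^-2)
D = 0.09   # coefficient quoted in the text

def phase_speed(tp):
    """Deep-water phase speed (m/s) of the dominant wave from peak period tp (s)."""
    return G * tp / (2.0 * math.pi)

def wave_age(u_star, tp):
    """Wave age, defined as Cp / u*."""
    return phase_speed(tp) / u_star

def z0_rough_sea_state(u_star, hs, tp):
    """Assumed sea-state roughness: Hs * D * (u* / Cp)^2, in metres."""
    return hs * D * (u_star / phase_speed(tp)) ** 2

# Illustrative case: u* = 0.5 m/s, Hs = 3 m, Tp = 10 s.
print(wave_age(0.5, 10.0), z0_rough_sea_state(0.5, 3.0, 10.0))
```

Under this form, younger seas (shorter peak period at the same friction velocity) and higher waves both increase the roughness length.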
From Figure 2 it is apparent that the choice of COARE 3.5 z0 parameterization has little effect on the height adjustment of buoy winds with our distribution of GTS buoy winds measured between 3 and 5 m. The correlation coefficient (r), bias (buoy U10S − ASCAT), and standard deviation are closely matched for buoy U10S compared to ASCAT for both parameterizations of z0. Although the changes in z0 are important to consider for fluxes, the height adjustment of wind speed is rather insensitive to roughness length. The results in Section 5 were also tested with the wave age formulation, but they were not visually different from using the wind speed formulation. For simplicity, we chose the wind speed-dependent formulation in the COARE model for further analysis.
The distribution of buoy U10S for each anemometer height used in the analysis is shown in Figure 3. Wind observations from buoys with 5-m anemometers were readily available for all wind ranges. The median observations of buoy U10S from 4- and 5-m anemometers aligned well, but there was a higher occurrence of high U10S with 5-m anemometers compared to 4- and 3-m anemometers. More buoys with 5-m anemometers were in the mid-latitude regions than buoys with 4- or 3-m anemometers, and there was a more frequent occurrence of high winds in the mid-latitudes. The median value of buoy U10S from 3-m anemometers was lower than for 4- and 5-m anemometers. Most of the 3-m observations were from the RAMA array in the Indian Ocean, and there is a more frequent occurrence of light winds in this region. The narrow PDF of 4-m buoy winds with the median winds near 6-7 ms−1 reflects the fact that most of these winds are from the TAO, PIRATA, and RAMA arrays in the tropical region where steady trade winds dominate the climatology. There is a much lower occurrence of high winds (greater than 10 ms−1) for buoys with anemometers near 4 m. The non-uniform nature of global winds combined with the steepness of the 4-m buoy PDF between 7 and 10 ms−1 is indicative of conditional sampling of winds at point locations where the wind climatology varies with latitude.
If the wind record of buoys around 4-m anemometer height were combined and compared to the wind record of 3- and 5-m winds, we may see an artificial bias in the 4-m buoy winds between 7 and 10 ms−1 due to the different distributions of wind speeds. Since the buoy wind climatologies vary with location, the differences are expected, and the technique to assess differences associated with height adjustment must be robust to these different distributions. This will be accomplished by using only collocated observations with ASCAT in space and time. The buoy wind calibrations in the following sections are not performed on groups of buoys based on anemometer height, but rather the calibrations occur at the individual buoy level.
Remote Sens. 2021, 13, 4558

Buoy Validation Parameters
In this section, we determine the parameters to validate the buoy winds using the TC model first introduced by Stoffelen [18]. This approach considers errors in both the buoy winds and the reference ASCAT winds. Through this approach, we determined improved bias and standard deviations for the triple collocated data.
This procedure was applied to ASCAT-A and ASCAT-B data generated with the CMOD7 GMF [21]. ASCAT winds have been shown to be stable over time and are therefore justified as a reference to test the buoy error dependency on sea state [1,19,32]. Instead of applying the TC on the entire buoy dataset, we first applied the procedure at the individual buoy level. The buoy dataset included buoys with many different physical characteristics, such as different hull designs and instrument heights, which are reported through the GTS by different organizations with their own calibration standards and quality control methods [23]. Performing TC at the individual buoy level more effectively isolates the error sources unique to the individual buoys. After the procedure was complete, buoys were grouped by anemometer height for an analysis of the residuals between the calibrated buoy U10S and ASCAT winds and tested against sea state variables.

Triple Collocation
For a full discussion on the triple collocation error model, the reader is referred to [17,18,33,34]. We consider three measurement systems (x_i, i = 0, 1, 2) measuring the "truth" value (t) for U10S. If we assume that linear calibration suffices, then the measurements satisfy the set of equations in Equation (6), where a_i and b_i are the linear scalings and bias calibration corrections relative to t, and ε_i is the random error of system i.
From this system of equations, we can derive the error variances based on the assumption that the random measurement errors are not biased (⟨ε_i⟩ = 0, where ⟨ ⟩ denotes the average). Calibration and wind retrieval errors often result in systematic wind speed errors and can be detected in wind speed comparisons alone. On the other hand, 3D turbulence and the different spatial and temporal aggregation in in situ, satellite, and NWP model winds provide, inherently, a vector nature to wind velocity differences [35]. When comparing different error sources, wind components, rather than magnitude and direction, have a more symmetric error distribution and are easier to describe with the linear TC model. Therefore, we performed the TC calibration on the u and v components of the three wind datasets instead of the wind speed magnitude. There are additional assumptions that must be satisfied to perform TC using the error model of Equation (6). The measurement errors must have constant variance over the whole range of measurement values and, except for spatial representation errors, must be uncorrelated with each other. The mixed second order moments relating the calibration factors are shown in Equation (7).
If systems 0 and 1 have a better spatial resolution than system 2, they both may resolve actual wind variance not resolved by system 2. This representation error variance (r²) is part of the wind variances for systems 0 and 1, given as the covariance of the observation errors with TC (r² = ⟨ε_0 ε_1⟩). In our example, since both ASCAT and buoys had better spatial resolution than ERA5, they both may have resolved true wind variance that is not resolved by ERA5, and r² is accounted for in the TC model. The r² calculation methods are discussed in the following paragraphs.
The calibration scalings (slopes associated with errors in proportionality) are given in Equations (8)-(10) based upon different scenarios for choosing a reference and the assumption that system 2 has a coarser spatial resolution than both systems 0 and 1. The bias corrections are given by Equation (11), where M_i is the first-order moment of system i.
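As a rough illustration of how calibration coefficients and error variances follow from the covariances of a collocated triplet, the sketch below assumes a simplified error model x_i = a_i·t + b_i + ε_i with system 0 as the reference (a_0 = 1, b_0 = 0), errors that are uncorrelated except for a representation error variance r² shared by systems 0 and 1, and system 2 as the coarse system. The expressions are derived for this assumed model and are not copied from Equations (8)-(11); function and variable names are hypothetical. In the study the calibration is applied separately to the u and v components, so a function like this would be called once per component.

```python
import numpy as np

def triple_collocation(x0, x1, x2, r2=0.0):
    """Estimate calibration and error variances with system 0 as reference.

    Assumed model: x_i = a_i * t + b_i + e_i, a_0 = 1, b_0 = 0, <e_0 e_1> = r2.
    """
    c = np.cov(np.vstack([x0, x1, x2]))       # 3 x 3 sample covariance matrix
    m = [float(np.mean(x)) for x in (x0, x1, x2)]
    a1 = c[1, 2] / c[0, 2]                    # scaling of system 1
    a2 = c[1, 2] / (c[0, 1] - r2)             # scaling of system 2
    var_t = c[0, 2] / a2                      # variance of the common signal
    b1 = m[1] - a1 * m[0]                     # bias of system 1
    b2 = m[2] - a2 * m[0]                     # bias of system 2
    err_var = (c[0, 0] - var_t,
               c[1, 1] - a1 ** 2 * var_t,
               c[2, 2] - a2 ** 2 * var_t)
    return {"a": (1.0, a1, a2), "b": (0.0, b1, b2), "err_var": err_var}

def calibrate(x, a, b):
    """Map a system onto the reference scale under the assumed model."""
    return (np.asarray(x, dtype=float) - b) / a

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = rng.normal(8.0, 3.0, 50_000)              # synthetic "truth"
    shared = rng.normal(0.0, np.sqrt(0.5), t.size)  # common small-scale signal
    x0 = t + shared + rng.normal(0.0, 0.5, t.size)
    x1 = 1.05 * t - 0.3 + shared + rng.normal(0.0, 0.8, t.size)
    x2 = 0.9 * t + 0.4 + rng.normal(0.0, 1.0, t.size)
    print(triple_collocation(x0, x1, x2, r2=0.5))
```

The choice of r² enters only through the scaling of the coarse system and the common-signal variance, which is why it has to be characterized separately.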
The calibrated datasets are then created by applying these scalings and bias corrections to each system. As seen in Equations (8)-(11), r² has a direct influence on the calibration scalings (a_i) and the bias correction factors (b_i); thus, care must be taken to separate and properly characterize these systematic and random errors. Methods to calculate r² typically involve using the difference of the wind power spectra of collocated scatterometer and coarser resolution model winds [36]. Another method is to calculate the cumulative variance of scatterometer and model winds as a function of scale, where r² is found by taking the difference in cumulative variance between the scatterometer and model at the scale of the coarser model resolution [37]. By spatially analyzing the difference between scatterometer and model winds, the spatial content of the common variance t in Equation (6) can be further analyzed [38]. Both methods require series of wind data representative of global conditions across the ocean. Since our TC dataset includes observations which may be highly variable and localized, we sought an alternative method to compute r².
Following the examples of [34,39], we performed TC using the coarsest measurement (ERA5 FG U10S) as the standard and repeated TC with different r² values until optimal intercalibration was reached. This is akin to using system 2 as the reference in the TC equations. The r² values for u and v that give a wind speed magnitude bias close to zero (x_scat + x_buoy − 2·x_ERA5 ≈ 0) for the calibrated triplets were considered the best-estimated spatial representation error variance. To achieve this, we performed TC using the entire dataset and tested different combinations of r²_u and r²_v until the aforementioned bias was approximately 0. The ratio of r²_v to r²_u for stable winds found using classical wind spectra based TC as defined by [36] is r²_v/r²_u = 1.5. We took advantage of this ratio to simplify our search for the optimal r² values. For the overall TC dataset, we found that r²_u = 0.4 and r²_v = 0.6 m² s−2. These values are lower than many previous calculations of representation errors [21,34,40,41], but are reasonable given the small effect of r²_u and r²_v on the final results. We used ERA5 as the reference dataset for the calculation of r²_u and r²_v only. For the main analysis, TC was performed for individual buoys using ASCAT as the reference dataset with r²_u = 0.4 and r²_v = 0.6 m² s−2.

High Wind Speed Comparisons
For the purposes of this study, we defined low winds as less than 5 ms −1 , moderate winds as between 5 and 15 ms −1 , and high winds as between 15 and 25 ms −1 . Winds above this range (>25 ms −1 ) were considered extreme. No statistically significant sample of winds >25 ms −1 has been measured by buoys. In this section we show comparisons of winds less than 25 ms −1 , with a focus on the high wind speed category. As a first test, we compared the height-adjusted buoy winds to ASCAT prior to calibration to compare wind speed magnitude in the high wind ranges. Figure 4 shows the comparisons of winds grouped by anemometer height and by wind speed ranges based on the average of ASCAT and buoy U 10S . The raw winds from the different anemometer heights (Figure 4, top) followed an expected pattern, with higher winds from 5-m anemometers than from 4- and 3-m anemometers in the same ASCAT wind speed ranges. The height adjustment to U 10S brought the winds closer together (Figure 4, bottom), but unexpectedly, there was still an organized separation, with 3-m winds much lower than the 4- and 5-m buoy residuals for winds above 7.5 ms −1 . For buoys with anemometers near 4 and 5 m, buoy U 10S was closely matched for the ranges shown in Figure 4, but both groups of buoy winds were above ASCAT for winds greater than 15 ms −1 . Table 2 shows the same wind characteristics as the bottom panel of Figure 4, but with the winds grouped into 5 ms −1 ranges; it also provides the averages for each range using the entire buoy dataset. The high standard deviations for buoys with 2.5 to 3.5 m anemometers above 15 ms −1 reflect the lack of observations in this range, and there is high uncertainty in these values. Nevertheless, the separation of buoy U 10S measured near 3 m from buoy U 10S measured near 4 and 5 m shows that the choice of buoys is an important factor in remote sensing comparisons.
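The wind speed categories defined in this section can be written as a small helper. This is a minimal sketch; the function name is hypothetical, and the treatment of values falling exactly on a boundary (5, 15, or 25 ms −1 ) is an assumption, since the text does not specify it.

```python
def wind_category(speed_ms):
    """Return the wind speed category for a speed in m/s.

    Boundary handling is an assumption: 5 -> moderate, 15 -> high,
    25 -> high, and anything above 25 -> extreme.
    """
    if speed_ms < 0:
        raise ValueError("wind speed must be non-negative")
    if speed_ms < 5.0:
        return "low"
    if speed_ms < 15.0:
        return "moderate"
    if speed_ms <= 25.0:
        return "high"
    return "extreme"
```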
Most of the high wind speed observations above 15 ms −1 are from buoys with anemometers near 5 m, as shown in the PDFs of Figure 3. Even with the higher availability of 5-m winds, the standard deviation of the wind residuals in the 20-25 ms −1 range was greater than the standard deviation of buoy U 10S measured near 4-m anemometer heights, although the biases were the same, with buoy winds higher than ASCAT by 0.73 ms −1 . Given that there was a much larger sample of winds measured near 5 m than near 4 m above 20 ms −1 , we expected that the high variability in the residuals was driven more by the large variability of winds in the mid-latitude regions, corresponding to the location of most of the 5-m anemometer buoys shown in Figure 1. The CHEFS report by the Royal Netherlands Meteorological Institute [11] showed a bias of 0.3 ms −1 for winds between 15 and 25 ms −1 (15 ms −1 < (ASCAT + buoy U 10N )/2 < 25 ms −1 , N ≈ 9700) and a standard deviation of 1.27 ms −1 using the same buoy and ASCAT dataset, but not the exact same observations. If we make the same comparison using our data for the same range (15 ms −1 < (ASCAT + buoy U 10N )/2 < 25 ms −1 , N ≈ 6674), we find a similar bias of 0.34 ms −1 with a standard deviation of 1.20 ms −1 . However, if buoy U 10S is used, we find a slightly higher bias of 0.41 ms −1 with a similar standard deviation of 1.19 ms −1 . It is apparent that the choice of buoys with different anemometer heights can significantly affect the overall statistics, given the drastically different biases shown in Figure 4 and Table 2. From the perspective of determining long-term trends or return periods of extreme winds, this result indicates that changing buoy heights could have an impact on these statistics.
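The range-limited statistics quoted above select collocations by the average of the ASCAT and buoy speeds. A minimal Python sketch of that selection follows, assuming two equal-length lists of collocated speeds. The function name and argument names are hypothetical, and the residual is taken as buoy minus ASCAT, in line with the sign convention used for the biases in this section.

```python
from statistics import mean, stdev


def residual_stats(buoy, ascat, lo=15.0, hi=25.0):
    """Bias, standard deviation, and count of (buoy - ASCAT) residuals.

    Only collocations whose mean speed (ASCAT + buoy) / 2 lies strictly
    between lo and hi (m/s) are used.
    """
    residuals = [
        b - a for b, a in zip(buoy, ascat) if lo < (a + b) / 2.0 < hi
    ]
    if len(residuals) < 2:
        raise ValueError("need at least two collocations in the range")
    return mean(residuals), stdev(residuals), len(residuals)
```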
Several geophysical effects may lead to differences between buoy U 10S and ASCAT winds. Large wind variability in the extratropics and the effects of currents in the tropics have been shown to affect buoy wind comparisons [16]. To isolate the error sources more effectively to the individual buoys and their locations, we performed TC at the individual buoy level. Figure 5 shows the calibration scalings (a buoy ) for the buoy wind components calibrated using ASCAT as the reference winds. The calibration scalings were used in conjunction with the bias correction terms (b buoy , Figure 6) to linearly correct buoy U 10S towards ASCAT. The biases were simply the average differences between the datasets (Equation (11)). The combination of the scaling (a buoy ) and the offset (b buoy ) can be used to determine ranges over which an instrument overestimates or underestimates relative to one of the other datasets (Equation (12)). These terms are analogous to the slope and y-intercept of a regression line. The key conceptual difference is that the slope accounts for the similarity in sub-gridscale variability in two of the datasets relative to the third (Equations (8)-(10)).
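To illustrate how a scaling and an offset together define ranges of over- and underestimation, the sketch below assumes a linear relation x buoy ≈ a · x ref + b. This linear form is an assumption made for illustration and is not a reproduction of Equation (12). Both function names are hypothetical.

```python
def calibrate(x, a, b):
    """Invert the assumed relation x = a * x_ref + b to recover x_ref."""
    return (x - b) / a


def crossover(a, b):
    """Reference value at which a * x + b equals x.

    Returns None when a == 1, where the offset b applies at all speeds.
    """
    if a == 1.0:
        return None
    return b / (1.0 - a)
```

For example, with a = 0.9 and b = 1.0, the uncalibrated values would sit above the reference below about 10 ms −1 and below the reference above it.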
From a visual inspection of the buoy calibration scalings in Figure 5, we observed some areas with patterns in the calibration components. In the tropics, there is a large variation in a buoy and b buoy values with the TAO array. Given that all the TAO buoy winds are measured from a height of 4 m, this variation in wind components seems to not be systematically related to the anemometer height, but rather geophysical effects in this region. In the western tropical Pacific, the v-components for a buoy were nearly all positive and there was a large variation in the v-components for b buoy . Before calibration, most of the v-component wind values with the TRITON buoys were below ASCAT values for the entire wind range. Another area of interest is off the Pacific Northwest coast of the United States, where a buoy values for both u and v components are consistently less than 1. This is an area that experiences strong upwelling of cold water during the northern hemisphere summer.
It is important to note that sampling and the appropriateness of the TC error model determine the accuracy of the result [41], and individual buoy calibrations are subject to conditions local to the respective buoy area. The ERA5 variability and variability deficit depend on local conditions, particularly near the equator where both the root-mean-square deviation and bias change sharply with latitude [42]. The inter-tropical convergence zone moves over a year and hence the speeds, variability conditions, and errors at most tropical buoy locations change systematically over the year. This may violate the assumption of constant bias and random error as a function of speed. Most of the buoys in the 3.5-4.5 m anemometer height range are located in the trades and have a rather narrow dynamic range, which makes the error estimates rather uncertain because the distinction between a buoy and b buoy values is poor. This is important to consider with the given distribution of a buoy and b buoy values shown in Figures 5 and 6. With the given caveats in the TC methodology, we did not observe any strong trends between the anemometer height ranges and the calibration components from Figures 5 and 6 alone, but rather differences due to possible geophysical effects for specific areas. Table 3 shows the average and standard deviation of the wind component residual biases for calibrated buoy and ERA5 FG U 10S grouped by anemometer height bins. Overall, the calibrated ERA5 FG and buoy wind components compare favorably with ASCAT for all anemometer height bins, with low bias for each bin. The patterns of standard deviations from ERA5 FG winds and buoy U 10S are similar, where the standard deviations of components from buoys in the 3.5-4.5 m bin are smaller on average than those of the other two wind height bins. This again may reflect that the overwhelming number of buoys with anemometers near 4 m are in the tropical trade wind regions.
Buoys in the 2.5-3.5 m bin appear mainly in tropical moist convective regions, hence the low winds and high variability.

In the following section we provide evidence that the differences in the calibrated residuals are intricately linked to the local sea state; therefore, we do not draw a conclusion about these biases in this section, other than that they are small when grouped as a whole. Since TC is performed at the individual buoy level, the calibration of each buoy to ASCAT is dependent on the dominant conditions of the local region. The following sections help to contextualize the observed differences in the comparisons with ASCAT for buoy winds calibrated under different conditions. Since wind speed magnitude comparisons are easier to interpret than comparisons of individual components, we converted the calibrated u and v buoy U 10S components into wind speed magnitude for comparison with ASCAT in the following section.

Residual Analysis with Sea State

Sea State Comparisons
There are many ways to describe sea state [43-45], and the easiest to visualize is that wind-waves tend to be steeper, with a greater ratio of wave height to wavelength, while swell is more gently sloped. Swell waves can have quite large amplitudes, but they have longer wavelengths, resulting in smaller slopes. The sea state parameters described herein are all derived from ERA5. The ERA5/ECMWF wave model used for the separation of wind-waves from swell is described in [46]. One of the output parameters at each ERA5 grid point is the two-dimensional wave spectrum, F(f, θ). The wave spectrum describes the distribution of wave energy as a function of frequency (f, which is related to the wavelength via the dispersion relation) and wave propagation direction (θ). The spectral components subject to wind forcing are delineated by Equation (13), where c is the wave phase speed (as a function of wave frequency) and ϕ is the wind direction. To a good approximation, waves are still subject to wind forcing when β i > 1 [46].
The moment of order n for F( f , θ), is defined by Equation (14) and the definition of significant wave height is shown by Equation (15). The zero-moment of Equation (14) (m 0 ) is the mean variance of the sea surface elevation [47].
H S is the significant wave height when the integral in Equation (14) is performed over all components of the spectrum. It is a statistical quantity of the wave field that roughly corresponds to the average height of the highest one-third of waves (trough to crest). By extension, the wind-wave significant wave height (H wind ) is computed by only integrating Equation (14) over the components of F( f , θ) that satisfy β i > 1. For the purposes of our study, wind seas were classified with β i > 1. The mean wave direction (θ) was derived from the full 2-dimensional wave spectrum [46]. Swell and mixed seas were defined by β i ≤ 1 and we delineated mixed seas and swell dominant conditions using the squared fraction of ERA5 H swell and ERA5 H wind in Equation (16).
If H swell was at least twice the height of H wind (w f greater than 4.0), then we defined the sea state as swell dominated. In between w f values of 0.25 and 4.0 we defined a sea state of mixed wind-waves and swell. H s alone may not provide enough information on how either the buoy or ASCAT winds are affected by sea state. For this reason, we investigated the relationships between common sea state parameters and the calibrated wind residuals (buoy U 10S -ASCAT). The integral wave steepness is defined in Equation (17), where L is the mean wavelength and T m is the mean wave period. Peak wave period (T p ) differs from T m in that T p represents the period of the waves with the highest energy, whereas T m represents the weighted mean of all wave periods in the spectrum. The calculation method of T p and T m from the full 2-dimensional wave spectrum in the ERA5 wave model is given in [46]. Wave orbital velocity (U orb ) is defined in Equation (18) using T p , and the wave phase speed (C p ) is shown in Equation (19). It has been theoretically demonstrated that straining of shorter waves by the orbital motion of the longer waves contributes to the steepening of the gravity waves [48], and consideration of orbital velocity leads to drag coefficient formulations that are in better agreement with experimental results [49,50]. Here, we did not define U orb where the mean wave direction is in-line with or against the wind direction, but instead used the simple relationship shown in Equation (18) for all conditions. The correlations between the wave and thermal parameters calculated with ERA5 and the calibrated wind residuals are shown in Figure 7.
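The sea state classification described above can be sketched in Python as follows. The thresholds on w f follow the text. Several elements are assumptions: the use of a single bulk value of β i per collocation, the handling of H wind = 0, and the label returned when β i ≤ 1 and w f ≤ 0.25, a case the text does not describe. The steepness and orbital velocity helpers use common deep-water forms (L = g T m 2 /2π, S = H s /L, U orb = πH s /T p ), which may differ from Equations (17) and (18). All function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m s^-2)


def swell_fraction(h_swell, h_wind):
    """w_f: squared ratio of swell to wind-wave significant wave height."""
    return (h_swell / h_wind) ** 2


def classify_sea_state(beta, h_swell, h_wind):
    """Classify a collocation as 'wind sea', 'mixed', or 'swell'.

    beta is treated as a single bulk wind-forcing indicator (assumption).
    """
    if beta > 1.0:
        return "wind sea"
    if h_wind <= 0.0:
        return "swell"  # assumption: no wind-wave energy at all
    w_f = swell_fraction(h_swell, h_wind)
    if w_f > 4.0:
        return "swell"
    if w_f > 0.25:
        return "mixed"
    return "unclassified"  # assumption: case not covered in the text


def wave_steepness(h_s, t_m):
    """Assumed deep-water form: H_s divided by mean wavelength g*T_m^2/(2*pi)."""
    wavelength = G * t_m ** 2 / (2.0 * math.pi)
    return h_s / wavelength


def orbital_velocity(h_s, t_p):
    """Assumed deep-water surface orbital velocity: pi * H_s / T_p."""
    return math.pi * h_s / t_p
```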
In Figure 7, the strongest correlations with the calibrated residuals occur with ERA5 air temperature and sea surface temperature (SST). Compared to Ku-band scatterometers, SST effects are shown to be much smaller at C-band frequencies (ASCAT) and are wind speed-dependent [51]. Since this study is concerned with the possible effects of wave sheltering of buoy winds, it is important to isolate the error sources which may confuse the interpretation of the buoy wind speed differences under high winds and waves. Here, we continue with the analysis of the residuals with sea state, but also examine the effects of SST on the residuals.
All of the sea state variables shown in Figure 7 are nearly independent of the wind speed residuals, with extremely low correlations. U orb shows a correlation near 0. However, we noted that the correlation with U orb may change when the conditions are limited to wind and wave directions that are nearly aligned. H wind and H swell have correlations close to zero, showing that the wind speed differences between ASCAT and buoys are not best described by a linear function of wave height alone. In fact, Figure 7 shows that the mean wave period is better related to the distribution of calibrated wind residuals than wave height. S and w a have weak and opposite correlations. The calculations of S and w a both include wave period and are traditional indicators for defining the sea state [44,49,52,53]. As S increases, the sea state is more defined by wind seas than swell [45]. Here, we observed a positive correlation of the calibrated wind residuals with S, where either buoy U 10S increases relative to ASCAT or ASCAT winds decrease relative to the buoys as the sea state moves from swell-dominated to wind-wave conditions. The negative correlation with w a supports this idea, given that large w a values are associated with swell-dominant conditions. We observed a smaller correlation with w a than [54] and noted that their correlations with w a were dependent on the incidence angle of the wind readings. In addition, we noted that using either buoy or ASCAT winds in the formulation of u * in the w a calculation led to a larger correlation with the calibrated wind speed residuals, and we observed a similar correlation to [54] when using buoy winds in our w a formulation (not shown). We only used ERA5 model first-guess winds in our w a calculation to avoid a spurious cross-correlation.
These cases show that sea state was related to the distribution of wind speed residuals for sea state variables derived from H s and T p , but the relationship was weak when looking at individual correlations alone. Figure 8 shows the results of comparing the calibrated wind speed residuals (buoy U 10S -ASCAT) to H s with sea state defined by wind seas (β i > 1), mixed seas (β i ≤ 1, 0.25 < w f ≤ 4.0) and swell (β i ≤ 1, w f > 4.0). The first noticeable trend is that the calibrated buoy U 10S values for swell-dominated seas were below ASCAT, while calibrated buoy U 10S values for wind-waves were above ASCAT. Unlike the swell-dominated residuals, the wind-wave residuals did not strongly decrease with increasing H s . Swell conditions showed the strongest variability in the given H s range. To within the uncertainty in these biases, the calibrated wind speed biases for mixed seas were usually between the residuals from the wind-wave and swell-dominated seas. For the most part, these differences greatly exceeded one standard deviation of the random uncertainty, i.e., they tended to be statistically significant.
As seen with the relatively low correlation in Figure 7, H s alone was not a great indicator of the differences between calibrated buoy U 10S and ASCAT. Since w a is a traditional indicator of sea development [27], we investigated how the residuals change when combining both w a and H s in Figure 9. This figure shows that changes in both w a and H s were strongly related to the differences between buoy winds and ASCAT, and that these differences varied in a well-defined pattern. Young seas with a small wave age below 30 were generally associated with calibrated buoy U 10S above ASCAT wind speeds, while old seas with high w a and high H s were associated with ASCAT winds higher than calibrated buoy U 10S .
To have confidence in the results from Figures 8 and 9, we also had to isolate possible effects on the ASCAT winds due to SST since SST had the highest direct correlation with the wind speed residuals (Figure 7) and varied with latitude and wind speed conditions. In Figure 10, we tested the calibrated residual (calibrated buoy U 10S -ASCAT) in relatively high seas with H s between 4 and 6 m and binned by both SST and w a ranges. We chose this H s range as it is where we observe the largest changes in the residuals from low to high w a in Figure 9. A shift of the entire PDF along the x-axis of Figure 8 would correspond to a change in the mean residuals, indicative of a dependency of the residuals on the variable in question. For SST, all categories of residual had the same bias, until SST exceeded 24 °C, at which point there was a small observable shift in the negative direction. There was a slightly more noticeable shift in the mean residuals with changing values of C p , where the residuals moved in the negative direction as C p increases. The same was true with increasing wind speed, where the residuals moved in the negative direction as ERA5 FG U 10S increased. The dependency of the wind speed residuals in this H s range was more sensitive to C p and ERA5 FG U 10S rather than SST, giving confidence that the large wind speed residuals in Figures 8 and 9 are not due to shifting SST.
Remote Sens. 2021, 13, x FOR PEER REVIEW 18 of 29

Figure 9. Average calibrated wind speed residuals (calibrated buoy U 10S -ASCAT) organized by w a and H s . u * in w a is calculated with ERA5 model first guess winds. Squares with an asterisk (*) indicate statistically significant differences between the ASCAT and calibrated buoy U 10S values for the given w a and H s range defined using Welch's t-test with a p-value limit of 0.05.
Since the u * value in w a was also derived from ERA5 FG winds, and u * has a direct relationship with wind speed [27], the PDFs in Figure 10 show a contrasting effect: increasing C p (and increasing w a ) is associated with a negative shift in the residuals (calibrated buoy U 10S decreasing with respect to ASCAT), but increasing ERA5 FG U 10S is also associated with a negative shift while corresponding to decreasing w a . Therefore, the strong negative residuals with high H s and high w a were associated with light winds or long wave periods. For the same H s range between 4 and 6 m, if we averaged C p and u * into w a bins, we found that with increasing w a there was a more drastic decrease in u * compared to an increase in C p (not shown). Where we observed the significant negative residuals in Figure 9 with H s between 4 and 6 m and w a between 70 and 170, the average ERA5 FG U 10S value was a moderate 6.9 ms −1 , while the average T p value was 14.1 s (with C p = 22 ms −1 ) and the average u * value was 0.23 ms −1 . Although the residuals were more sensitive to changes in u * compared to C p , we observed significant differences between calibrated buoy U 10S and ASCAT even at moderate wind speeds under these conditions. This example shows that the differences with sea state are not isolated to the extreme low and high wind speed ranges.
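The values quoted above can be cross-checked with a short calculation. The sketch below assumes the deep-water relation C p = g T p /2π and a wave age defined as w a = C p /u * . Both are assumptions made here for illustration rather than restatements of Equation (19) or of the paper's definition of w a , and the function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m s^-2)


def phase_speed(t_p):
    """Assumed deep-water phase speed of the peak wave: g * T_p / (2 * pi)."""
    return G * t_p / (2.0 * math.pi)


def wave_age(c_p, u_star):
    """Assumed wave age: phase speed divided by friction velocity."""
    return c_p / u_star


c_p = phase_speed(14.1)     # about 22 m/s for T_p = 14.1 s
w_a = wave_age(c_p, 0.23)   # falls inside the 70-170 bin
```

Under these assumptions, T p = 14.1 s gives C p close to the quoted 22 ms −1 , and the resulting wave age lies within the 70 to 170 range discussed above.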
It is worth mentioning at this point that the TC calibration may affect the interpretation of these results. The residuals were statistically biased due to conditional sampling of high ERA5 winds in the given H s range between 4 and 6 m. Since SST has a strong relationship with changes in latitude, the calibrated buoy winds were influenced by changes in a buoy and b buoy values in this direction. C p and w a had a strong relationship with changes in longitude and, hence, were affected by the changes in a buoy shown near the coast in Figure 5. Although calibration of the buoy winds reduced the overall bias with respect to ASCAT, it is important to consider how this may affect the results when limiting the sample to a range smaller than that over which the calibrations took place. Overall, TC calibration at the individual buoy level reduced the differences between ASCAT and the buoy winds (shown in Figures 8 and 9) compared to TC calibration using all buoys (not shown). This gives confidence that TC at the individual buoy level is appropriate for the sea state comparisons shown here.
In the cases shown in this section, the TC calibration was not independent of the dominant sea state for the individual buoy locations. For example, if the 4-m buoys were in an area dominated by swell conditions for most of the timeseries, then the calibrated residuals for swell would be less than other buoys that were not in swell-dominated conditions. Organizing the residuals in this way does, however, give statistical evidence of the possible problems associated with either ASCAT or the buoys with the dominant sea states. To further isolate the differing effects of swell and wind-waves, we performed TC calibration at the individual buoy level explicitly using cases with wind-wave-dominated seas while still holding ASCAT as the reference dataset.

Wind-Wave-Dominated Seas
It has been established that the wind-wave portion of the wave spectrum that dominates the magnitude of the wind stress is directly affected by large gravity waves [55-58].
Our results up to this point indicate that wind stress might be affected by the presence of swell in addition to large wind-driven gravity waves and further investigation is required to separate the possible sea state effects on buoys versus the scatterometer winds. Here we again performed TC with ASCAT as a reference at the individual buoy level but isolated the observations to the predefined wind-wave conditions (β i > 1) using buoys which have at least 250 triplets under these conditions. This limits the collocated data to 13% of the original triplets corresponding to 82 of the original 295 buoys. Of these, only 2 of the buoys have anemometers with a height near 3 m and 13 have an anemometer height between 3.8 and 4.1 m. There was a much larger sample of 66 buoys with anemometer heights near 5 m. For this reason, we chose to base our comparisons in this section on winds from anemometers near 4 and 5 m. Performing TC in this manner allowed us to isolate possible wave flow distortion effects in high winds and associated wind-driven waves while limiting the swell effects illustrated in the previous section.
We tested the buoy winds measured from anemometers between 3.6 and 5.1 m to quantify the differences between calibrated buoy U 10S and ASCAT for different ranges of wind speed and H s (Figure 11). By taking the mean wind speed residuals from buoys with anemometers near 5 m and subtracting the mean residuals from buoys with anemometers near 4 m, we observed how increasing wave height and wind speeds contribute to buoy wind speed error relative to ASCAT. From Figure 11, there is not a strong trend in the differences of the residuals increasing with increasing wind speed and wave height until we reach H s above 4 m. We observe significant differences between the mean buoy residuals with H s between 4 and 5 m and ASCAT wind speeds between 12 and 18 ms −1 . The significant differences between the residuals are negative in this range, indicating that calibrated buoy U 10S from anemometers near 4 m is generally greater than calibrated buoy U 10S from buoys with anemometers near 5 m under these specific conditions.
Figure 11. Heat map of the difference between the binned average of calibrated wind speed residuals (calibrated buoy U10SC-ASCAT) for ASCAT wind and ERA5 H s ranges for buoy residuals at 5 m minus the binned residuals at 4 m anemometer height in wind-wave-dominated seas (β i > 1). Bold values with an asterisk (*) indicate statistically significant differences between the residuals at separate anemometer heights defined using Welch's t-test with a p-value limit of 0.05.
This finding is the opposite of what we would expect from previous arguments of wave sheltering, where winds measured from a lower height should compare lower to a reference in high winds and waves compared to winds from a higher anemometer [3]. If the wind profile is assumed to be close to logarithmic near the surface, we would expect to observe larger differences between 4 m anemometer height residuals as high waves may effectively lower the winds from lower anemometer heights more than higher anemometers compared to the wind reference. It is worth pointing out that the uncertainty in the differences of the binned residuals with H s above 4 m noticeably increases, but within the uncertainty bounds we do not observe a large separation between the residuals measured from the different anemometer heights. Our results in Figure 11 support the argument that wave sheltering is not a significant contributing factor to buoy wind speed error for winds measured at 4 and 5 m in the given ranges below 21 ms −1 .
For a comparison of the calibration coefficients in wind-waves, we plot the a buoy and b buoy values for the u and v components of the buoys with reference to ASCAT (Figures 12 and 13). We can only speculate on the cause of the buoy and ASCAT differences in the u-component of the winds with latitude; however, since the sample is mixed with buoys that have anemometers near 4 and 5 m, we suggest that this change is not likely to be due to the physical characteristics of the individual buoys. Although our calibration was performed at the individual buoy level, a similar pattern was found in a study using global TC where the different errors in u and v of ERA5 are the main cause [42]. We did not observe a similar shift in the v-component values with changing latitude, but do note that many of the v-component values off the eastern coast of the United States are below 1 and u-component values above 1, suggesting that many of the buoys underestimate the u-component and overestimate the v-component off the eastern United States and in the Gulf of Mexico. Compared to the a buoy values, there was less of an observable pattern in the b buoy values shown in Figure 13.
Since there are systematic changes in the a v buoy values with latitude, a comparison of the wind calibration coefficients averaged together is skewed, given that the distribution of buoy anemometer heights is also not consistent with latitude. In addition, the buoys with anemometer heights near 4 m are located further offshore than many of the buoys with anemometers near 5 m. This could affect the TC statistics given higher wind variability in the near coastal regions [34]. However, there are instances where the NDBC buoys changed from anemometers near 5 m to anemometers near 4 m in the same location. Of the 14 buoys in the 3.5 to 4.5 m anemometer height range that met the minimum number of triplets under the wind-wave criteria, 5 of the buoys have separate calibrations before and after a change from approximately 5 to 4 m. Since these buoys were in the same general area before and after the switch, we can directly compare the calibration coefficients to investigate any trends before and after the changes. Table 4 shows the calibration coefficients of these stations.
From this subset of 5 stations, there is not a strong trend in the calibration coefficients from the 5 m anemometer height buoys to the 4 m anemometer height buoys. The a values do not strongly change from the higher to the lower anemometer height. The averages of the differences for a u buoy and a v buoy from 5 to 4 m height are −0.012 and −0.0328, respectively, with standard deviations of 0.042 and 0.067. The standard deviations are both larger than the magnitudes of the averaged differences between the scalings for the two anemometer heights. The averaged differences in the bias terms are a bit larger, with the differences being −0.086 for …

Discussion
Wind observations from buoys with many physical differences were used in this study. The buoys are generally either aluminum or foam discus buoys with anemometer heights between 3 and 5 m. The error characteristics were divided by anemometer height, but there may also be errors due to differences in hull type and flow distortion around the anemometer. Other differences in the observed winds may be due to different calibration practices from organizations reporting the buoy observations to the GTS. From the results shown here, the differences in buoy U 10S and ASCAT winds may be influenced by sea state, especially with high H s and large or small w a values relative to the value for local equilibrium. Here, we summarize our findings of how sea state may play a role in the differences between buoy U 10S and ASCAT, contrast this with other possible error sources, and explain possible ways in which the noted results may be statistical rather than physical.

Wind-Wave Flow Distortion
Our analysis of calibrated wind speed residuals in wind-wave-dominant conditions for buoys with anemometers near 4 and 5 m shows a minor separation between the average wind residuals in high wind-waves, but the differences between the calibrated residuals do not follow the expected pattern where the wind from 4 m buoys is distorted more than the wind from 5 m buoys in high winds and seas. We expected to observe higher flow distortion with winds measured closer to the ocean surface given previous arguments of wave sheltering [3]. Our results indicate that the difference between calibrated buoy U 10S and ASCAT winds is larger for buoys with anemometers near 5 m than for buoys with anemometers near 4 m for H s between 4 and 5 m and ASCAT winds between 12 and 18 ms −1 (Figure 11). The uncertainty of the observations increases drastically with increasing wind speed for pure wind-waves, especially with winds over 18 ms −1 . This analysis assumes that both ASCAT and ERA5 are a good reference to test the buoy winds against. The physics for deriving the scatterometer winds from ocean surface roughness elements does not depend on anemometer height, but the TC calibration of buoy winds to ASCAT may depend on predominant local conditions and TC assumptions.
With these considerations, we concluded that wave sheltering is not significantly affecting the winds from buoys with 4 and 5 m anemometers, with high confidence for winds under 18 ms −1 . Compared to [3], the averaged differences between buoys with 4 and 5 m anemometers using ASCAT as the reference were much lower than the differences between the CASID buoy with a 3-m anemometer and NDBC buoys 46004 and 46005 with 5-m anemometers using ECMWF model winds as the reference dataset during the Ocean Storms Experiment. The results shown here compare favorably with [8] and illustrate that buoy wind speed errors possibly due to wave sheltering are smaller than other sources of error at high wind speeds such as platform flow distortion [7]. It is unfortunate that the sample of buoys with anemometer heights near 3 m in wind-wave conditions was too small for the TC analysis. It is expected that wind-wave flow distortion would play a larger role in the accuracy of winds measured from lower anemometer heights given the nature of the logarithmic wind profile in the boundary layer. Although our results indicate that there is little difference between buoy U 10S from buoys with anemometer heights near 4 and 5 m in high wind-wave conditions, we cannot rule out that wave sheltering is a factor in the accuracy of winds measured from buoys with anemometers below 4 m.

Swell Wave Effects
Compared to pure wind-waves, the observed differences between ASCAT and buoy U 10S are larger in scale for pure swell conditions in high seas (Figure 8), although the differences may not be due to H s alone. NRCS SST dependencies have been evaluated as rather weak for C-band scatterometry [51], but direct correlations tend to be more noticeable than those to sea state (Figure 7). This is probably because SST correlates with climate zones and their varying geophysical conditions and wind variability. SST gradients have been shown to have a profound effect on wind variability and hence buoy and scatterometer collocation errors [42,59]. Refinement of our analysis revealed that SST changes were not a large factor in the wind speed residuals when isolating the conditions where we observe the largest differences between calibrated buoy U 10S and ASCAT in Figure 9. We illustrated that the largest residuals were associated with shifts in w a when isolating H s between 4 and 6 m. We also observed that the shifts in w a were due more to shifts in u * than C p . The combination of high H s and moderate-to-low u * was connected to ASCAT winds consistently above buoy U 10S . Evaluating covariance of these variables may provide a good indication of causation, but ideally all other conditions should remain the same, which rarely happens. It is also important to consider how the wind PDF and variability with the given distribution of buoy winds may affect the comparisons. For example, wind variability broadens the buoy wind speed PDF and hence may cause pseudo biases and errors in proportionality when comparing to scatterometer winds [18]. Further tests may be needed to estimate the dominating factors at play.
The steady decrease in wind speed residuals with increasing swell and the lack of separation between 4 and 5 m anemometer residuals suggest that this problem is not related to buoy wind speed error. Scatterometer readings are sensitive to roughness elements caused by stress on the ocean, and there are multiple field experiments [55,56] and models [49] indicating that stress decreases with swell compared to wind-waves. There are relatively few studies that have directly documented ASCAT wind error with swell waves. A neural model developed by [52] for the C-band European Remote Sensing Satellite (ERS2) scatterometer related the NRCS to both surface wind and sea state information and found a marginal impact by swell waves on the NRCS and related this error to significant wave slope. The correlation they found was that NRCS dependencies on sea state reduce with increasing incidence angle, but given the larger noise in ERS winds, no causation seems implied. A study by [60] found that a variety of swell parameters significantly influenced the radar cross section of the ERS-1/2 scatterometers in a manner that led to overestimation of the wind retrievals. A more recent study by [54] using buoy wind comparisons to ASCAT winds from the CMOD5.N GMF also confirms that low frequency swell causes overestimation of ASCAT winds measured at low incidence angles in reference to NDBC buoys. Wind retrievals based on CMOD5 have documented vector-dependent biases which are dependent on incidence angle and change across the swath. Therefore, a change in vector distribution in the across-track wind vector cells (WVCs) may cause biases and drifts in the derived ASCAT winds. This problem was significantly improved with development of the CMOD7 GMF [21], although our results show there could still be room for improvement.
Ocean swell modifies the surface roughness spectrum by tilting the surfaces over which capillary waves ride, thereby influencing the angle at which the radar signal emitted by the satellite encounters the surface roughness elements, which affects the returned signal strength [61]. The extra tilting may contribute to a larger signal variation of scatterometer measurements and modification of the wind speed dependence of the NRCS [52,54,61]. The signal increase due to the non-linear incidence-angle dependence would result in higher U 10S for higher swell for fixed ocean stress conditions. One may expect to see higher slope modulation effects in high wind-waves, but our observations surprisingly show that the effect is most pronounced in high swell with large wave ages associated with long wavelengths and hence low surface slopes (Figure 9). The inclusion of incidence angle (across-swath WVC number) as shown by [52,54,62] with estimates of w a and H s may help to further refine the conditions under which the tilting effects are most pronounced. This is especially important with the planned launch of the EUMETSAT Polar System-Second Generation Satellite B in 2022 with a C-band scatterometer having a lower minimum incidence angle compared to ASCAT [63].
Another possible explanation is that non-linear wave-wave interaction is contributing to the amplitude of Bragg scattering waves [64]. This mechanism is controversial [58] and likely to be overestimated in [64]. However, it is plausible that swell acts to damp the magnitude of short gravity waves and the gravity-capillary waves that cause the Bragg scattering. This explanation has the advantage of explaining the systematic errors seen for the small wave ages observed with rising seas but does not explain why we observe ASCAT winds above the buoys in areas of high swell.
It is important to note that these are just a couple of possible explanations for the comparison results shown here in areas of high swell. The subject of this paper was a test of how buoy winds are affected in high winds and seas and a more thorough investigation of ASCAT winds should be undertaken. It is possible that the impact of sea state is more pronounced on the NRCS values from the three individual beams of the ASCAT before they are merged and used with the scatterometer GMF [54]. An analysis with the ASCAT maximum likelihood estimator (MLE) winds would be beneficial for a fundamental understanding of how sea state may impact scatterometer winds and is suggested for future work.

Other Sources of Comparison Error
In addition to errors in the comparison of ASCAT and buoys due to sea state, other error sources that influence the comparisons in high wind conditions must be considered. By using buoy U 10S , we accounted for the possible effects of air mass density and better matched buoy and ERA5 winds with ASCAT. Even with these considerations, other factors may lead to some of the observed differences we find in this study.

Platform Airflow Distortion
Another possible source of error is flow distortion of the air by the buoy platforms and instrument packages. Even though a full investigation of the flow characteristics is outside the scope of this study, recent studies have used novel techniques to characterize the effects of flow distortion on buoy wind speeds. For buoys with dual anemometers from the Woods Hole Oceanographic Institution, [7] calculated the directional pattern of inter-anemometer disagreement and compared it with ASCAT to characterize flow distortion by the buoy platform and instruments. They found that the disagreement between the anemometers can be up to 5% due to flow distortion. Occasionally, these differences can contribute to errors of ±1 ms −1 at 20 ms −1 . These errors are of a similar magnitude to the observed wind speed differences from our analysis. Many buoys do not report a heading as in [7], but for a full investigation of buoy wind speed error, flow distortion by the instrument platform must be considered.
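As a rough check of scale (an illustrative calculation, not a result taken from [7]), a relative disagreement of 5% at a wind speed of 20 ms −1 corresponds to an absolute difference of about 1 ms −1 , which matches the magnitude quoted above:

```latex
\Delta U \approx 0.05 \times 20~\mathrm{m\,s^{-1}} = 1~\mathrm{m\,s^{-1}}
```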

Triple Collocation
An investigation of multiple collocation methods is presented by [41], who summarize their benefits and limitations. TC regresses three datasets simultaneously with a set of assumptions, captured in an error model, common to the three systems compared. TC is amended by assumptions on spatial representativeness. Where the error properties of buoys, ASCAT and ERA5 are all different, the behavior of each of them with respect to the error model assumptions may affect the TC results of all three systems, since the equations are coupled through the mixed second-order moments (Equation (8)), which are a function of the representation error. In this respect, we particularly note the regionally varying and large biases in ERA5 [42], which may affect the a buoy and b buoy values we compute for the buoys. Finally, the accuracy of regression depends not only on the error characteristics and the sample size, but also on the PDF and dynamic range of the weather regime being sampled. The differences between calibrated buoy U 10S and ASCAT U 10S may therefore be affected by regionally varying biases in ERA5 model first guess winds. Our TC analysis was performed at the individual buoy level. Although we observe large regional differences in the calibration scalings (Figure 5) and bias corrections (Figure 6), the difference between calibrated buoy winds and ASCAT in the high H s ranges of Figures 8 and 9 is reduced when performing TC at the individual buoy level compared to a calibration using all buoys (not shown). Even so, the TC assumption of constant bias and random error as a function of speed may be violated for individual buoy calibrations in areas where the wind variability and errors change systematically from season to season, which complicates the correct interpretation of the results for these regions.
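To make the coupling through the mixed second-order moments concrete, the sketch below estimates a scaling and bias for one buoy from the cross-covariances of three collocated series, with ASCAT held as the reference. It is a simplified illustration under stated assumptions, not the procedure of this study: representation error is ignored (so it does not reproduce Equation (8)), the three error terms are assumed independent, and the convention that the calibrated buoy wind is a_buoy × buoy + b_buoy is assumed here. All function and variable names are hypothetical, and the data are synthetic.

```python
import random
from statistics import mean


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (len(u) - 1)


def tc_calibration(ref, buoy, model):
    """Simplified TC with `ref` as the calibration reference.

    Returns (a_buoy, b_buoy, var_ref_err, var_buoy_err), where the calibrated
    buoy wind is a_buoy * buoy + b_buoy (assumed convention) and the error
    variances are in the units of the raw series.
    """
    c_xy, c_xz, c_yz = cov(ref, buoy), cov(ref, model), cov(buoy, model)
    # The third system cancels out of the ratio, leaving the buoy scaling.
    a_buoy = c_xz / c_yz
    b_buoy = mean(ref) - a_buoy * mean(buoy)
    # Random error variances from the mixed second-order moments.
    var_ref_err = cov(ref, ref) - c_xy * c_xz / c_yz
    var_buoy_err = cov(buoy, buoy) - c_xy * c_yz / c_xz
    return a_buoy, b_buoy, var_ref_err, var_buoy_err


# Synthetic demonstration: a buoy that reads low, so a_buoy > 1 is expected.
random.seed(0)
truth = [random.uniform(0.0, 20.0) for _ in range(20000)]
ref = [t + random.gauss(0.0, 0.5) for t in truth]                 # "ASCAT"
buoy = [(t - 0.5) / 1.1 + random.gauss(0.0, 0.5) for t in truth]  # "buoy"
model = [0.9 * t + random.gauss(0.0, 1.0) for t in truth]         # "ERA5"
a_hat, b_hat, var_ref, var_buoy = tc_calibration(ref, buoy, model)
```

In this synthetic case the estimates should land close to the values used to generate the data (scaling 1.1, bias 0.5). Because every estimate depends on covariances involving the third series, a regionally varying bias or error in the model winds would propagate into the buoy coefficients, which is the sensitivity discussed above.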

Conclusions
To quantify the effects of sea state on comparisons of ASCAT and buoy winds, we converted raw buoy winds and ERA5 FG winds to U 10S and performed TC calibration using ASCAT as a reference dataset. It is shown that the binned wind speed statistics (Table 1) vary by the chosen buoys within different anemometer height ranges. It is therefore important to consider which buoys are being used when computing wind speed comparison statistics on a global scale. We first performed TC at the individual buoy level under all sea states and found a pattern where calibrated residuals (calibrated buoy U 10S -ASCAT) are negative for swell waves in high seas (H s > 4 m) and slightly positive under most wind-waves. This led to further statistical isolation of sea state and conditional TC calibrations under dominant wind-wave conditions. By isolating the conditions to wind seas, we found that wave height has a marginal effect on the height-adjusted wind speed differences of buoys with 4- and 5-m anemometer heights using ASCAT as the reference. Most of the differences were small with the given wind and wave distribution. From these results we concluded that errors from flow distortion of the waves on buoy winds are not a dominant factor in buoy wind speed error for buoys in this anemometer height range, with high confidence in our results for winds below 18 ms −1 . This does not account for moored buoy wind measurements below 4 m, which still need further investigation. We also noted that the statistical sea state effects found may be related to the particular TC methodology, which is conditional on the weather regimes of the local buoy area, since local random errors and biases are quite variable for the three systems used for TC (i.e., buoys, ASCAT and in particular ERA5).
It is suggested that future studies use collocation with ASCAT and high-quality buoy winds, where the sources of uncertainty from flow distortion have already been accounted for or shown to be small. Many of the 3-m discus buoys with 5-m anemometers from the NDBC have been replaced in recent years with smaller 2.4-m discus buoys with 4-m anemometers in the same locations. Consequently, the ability to test the wind speed differences for different buoys with the same environmental factors is now a reality and future studies can compare winds from buoys located at the same location with a relatively long record of winds.