Validation of Drifting Buoy Data for Ocean Wave Observation

: Drifting buoys collect wave data in the open ocean far from land and in areas with strong currents. However, the validation of the drifting buoy wave data is limited. Here, we compared the drifting buoy wave data, ERA5 wave data, and moored GPS buoy wave data. Data from 2009 to 2018 near the coast of Japan were used. The agreement of the drifting buoy-observed wave parameters with the moored GPS buoy-observed wave parameters is better than that of ERA5 wave parameters, which is statistically signiﬁcant. In particular, the accuracy of the ERA5 wave heights tends to be lower where the ocean currents are fast. On the other hand, the agreement between the drifting buoy-observed wave heights and the moored GPS buoy-observed wave heights was good even in the areas with strong currents. It is conﬁrmed that the drifting buoy wave data can be used as reference data for wave modeling study.


Introduction
Ocean waves are the boundary between the atmosphere and the ocean, where the air-sea interactions occur. The study of surface waves in the open ocean is important for climate modeling and weather forecasting. The modeling of ocean waves is also important for shipping routes and offshore industry. Validation of ocean wave hindcast data using in-situ observation data is necessary for wave modeling studies. However, it is difficult to moor a wave measurement buoy to observe waves in the deep sea area, which is very far from land. It is difficult to moor a buoy in the area where the current is fast. Therefore, free-drifting buoys are useful for ocean wave study in the open ocean. In particular, free-drifting buoys are useful for studying wave-current interactions because free-drifting buoys can measure ocean waves in fast-flowing regions, such as the western boundary current.
Moored buoys also drift. The difference between a moored buoy and a free-drifting buoy is that the moored instrument has an additional constraint that makes the instrument semi-Lagrangian. As a result, when the same measuring instrument is used, the data from the mooring buoy is more likely to be damaged by the mooring effect than those from free-drifting buoy. In this report, free-drifting buoys are referred to as drifting buoys.
The locations of drifting buoys change because of currents; therefore, compared to moored buoys, the drifting buoys are unsuitable for investigating the accuracy of regionality, seasonality, or long-term trends in predicted data. Consequently, validation studies of wave prediction using data from drifting buoys are rare, e.g., References [1][2][3][4]. Nevertheless, using drifting buoys has advantages (aside from operational costs) because their positions are unrestricted compared with those of moored buoys. If the drifting buoy can measure other parameters, the dependence of the wave data accuracy on the measured parameters can be investigated. In addition, the Lagrangian surface currents can easily be estimated from the drifting buoy data. Wave observations by drifting buoys have been conducted by References [5][6][7][8][9]; however, the number of wave data was few, and the accuracy of evaluation of the wave data is insufficient.
In addition to wave hindcast data from ocean wave models, reanalysis data are also used for the wave research in the open sea. The reanalysis wave data is produced by assimilating the altimeter data to the wave hindcast data. ERA5 reanalysis data is often used for research. The JMA (Japan Meteorological Agency) produces wave assimilation dataset with a spatial resolution of 0.05 • . However, it incorporates moored buoy data and drifting buoy data and is not an independent dataset.
The accuracy of the ERA5 wave data was verified by Reference [10], which shows the good accuracy. If it is verified that the drifting buoy wave data is more accurate than the ERA5 wave data, it can be shown that the drifting buoy wave data is useful for verification of the wave prediction data.
The objective of this study is to validate the accuracy of the drifting buoy wave data with moored GPS buoy data. We verify whether the drifting buoy data can be used as reference data by comparing the drifting buoy wave data and the ERA5 wave data with the moored buoy wave data.
Section 2 describes the data and methods. The system used to observe the drifting buoy wave data, ERA wave data, and GPS wave data is described in this section. The results are presented in Section 3. Section 4 discusses the wave data comparison. Section 5 provides the conclusion.

Drifting Buoy Data and Analysis
Data from the JMA drifting buoys by December 2018 were used for comparison herein. The JMA buoy data can be downloaded from http://www.data.jma.go.jp/gmd/kaiyou/ db/vessel_obs/data-report/html/buoy/buoy_e.php (accessed on 30 June 2021). The shape and weight distribution of the drifting buoy were designed to follow the sea surface at any time. The wave properties were measured by an accelerometer. The accelerometer was placed at the center of gravity of the buoy. The accelerometer was supported by a gimbal. Therefore, even if the buoy is tilted by waves, the axis to be measured is always in the vertical direction. The sampling interval of the surface displacements was 0.5 s, and the measurement time was 512 s. The significant wave height and period measured by the drifting buoy are the wave height and period determined by the zero-up crossing method. They are obtained from 1024 sea surface displacements. The drifting buoy-measured significant wave height and period are denoted by H b and T b , respectively. The significant wave height and period obtained by the zero-up crossing method are denoted as H 1/3 and T 1/3 . The buoy position was obtained from the GPS receiver [11,12].
The buoys usually observed significant wave height and period at 3 h intervals, but sometimes observed them at 1 h intervals. The wave height resolution measured by the drifting buoy is 0.1 m, and the period resolution is 1 s. The buoy was almost spherical. The diameter, disc diameter, and height of the buoy were 0.46 m, 0.64 m, and 0.54 m, respectively. The buoy weight was approximately 30 kg. The specification of the drifting buoy is described in http://www.data.jma.go.jp/kaiyou/db/buoy/buoy-info.html (accessed on 30 June 2021). The buoy data were transmitted by a satellite communication system. The satellite altitude was approximately 800 km. Therefore, the distance from the satellite earth station to the buoy position was limited to a few thousand kilometers [12].

ERA5 Wave Data
The ERA5 wave data can be obtained from https://cds.climate.copernicus.eu/ (accessed on 30 June 2021). The ERA5 wave height and period are defined as H e = H 0 ≡ 4M 1/2 0 and T e = T 0 ≡ M −1 M −1 0 , respectively, where f is a wave frequency, and F( f ) is a wave frequency spectrum [13].
The system used to produce the ERA5 data has been described in Reference [14]. The wave model was based on the WAM cycle 4, e.g., References [15,16]. The reanalysis data were produced using a coupled model system, which is called the Integrated Forecasting System (IFS). Observations from buoys and satellite scatterometers were assimilated in the ERA5 atmosphere data through a four-dimensional variational data assimilation scheme [17]. The wind input source function was from Reference [18]. The nonlinear interaction source function was from Reference [19]. The dissipation source function was from Reference [20]. The altimeter wave height data were assimilated in the ERA5 wave data through the optimal interpolation method [16,21].
The spatial resolution of the downloaded ERA5 wave data (H e and T e ) was 0.5 • . The 3 h ERA5 wave data were spatially interpolated on the drifting buoy position.

GPS Wave Buoy Data
The wave data from the GPS wave buoys [6,22] were used to evaluate the performance of the drifting buoy wave data. Table 1 and Figure 1 summarize the locations of the moored GPS wave buoys.
The distance between the moored GPS wave buoys and the coast was from 10 km to 20 km. Most of the water depths at the locations of the moored GPS wave buoys were greater than 100 m. The surface elevations were measured by the GPS receivers, not by the accelerometers. The sampling interval of the surface elevations was 1 s. The moored GPS-estimated wave height (H g ) and period (T g ) were estimated by the zero-up crossing method (H g is H 1/3 , and T g is T 1/3 ) from 1024 surface elevations [23]. The wave height (H g ) and period (T g ) were estimated at 20 min intervals. The GPS wave data can be downloaded from https://www.mlit.go.jp/kowan/nowphas/index_eng.html (accessed on 30 June 2021). The resolutions of H g and T g were 0.01 m and 0.1 s, respectively. The wave height and period of the closest GPS buoy were compared when a drifting buoy was close to one of the moored GPS buoys.

Method of Comparison
The root mean square difference (RMSD), correlation coefficient (r c ), and scatter index (SI) were used as the skill metrics for the comparisons, where X is the reference value, Y is the evaluated value, and . . . denotes an average. The correlation coefficient and SI were computed as and The statistical significance of the estimated skill metrics was tested. The bootstrap method, e.g., References [24,25], was used for verification, which is described in Appendix B.

Comparison with the GPS Buoys
Buoy data and ERA5 data at 3-h intervals were compared. The drifting buoy data observed for 512 s (GPS buoy data observed for 1024 s) were used for comparison (Sections 2.1 and 2.3). Figure 2a shows the scatter plot between the GPS-observed wave heights (H g ) and the drifting buoy-measured wave heights (H b ). The maximum distance threshold R dmax between a GPS buoy and a drifting buoy was 5 km. The mean distance of the buoys was R a = 3.6 km. Table 2 summarizes the skill metrics of the comparison, which shows wave height agreement was good. Table 2. Comparisons of the wave parameters of the GPS buoys (H g , T g ), drifting buoys (H b , T b ) and ERA5 (H e , T e ). The bias and the ratio for wave parameters X − Y are X − Y and X / Y , respectively. Unit: m or s for the bias and R d .  Figure 2b shows the scatter plot of the wave periods. The maximum distance threshold of the buoys was the same as that in Figure 2a. The correlation of wave periods was smaller than that of wave heights, while the SI of wave periods was smaller than that of wave heights.
We compared GPS-observed, drifting buoy-observed, and ERA5 wave parameters ((H g , H b , H e ) and (T g , T b , T e )). Figure 2c shows the scatter plot of the GPS-observed wave heights (H g ), drifting buoymeasured wave heights (H b ), and ERA5 wave heights (H e ) at the drifting buoy positions. The (H g , H b ) points were indicated in black, while the (H g , H e ) points were indicated in red in Figure 2c. The maximum distance threshold between a GPS buoy and a drifting buoy was R dmax =15 km, and the mean distance was R a = 9.1 km. The 3 h wave heights (H g , H e , and H b ) were compared when all of them were estimated. The number of data was N c = 102.
The spatial variability of the wave parameters was large near the coast, and the distance R dmax was preferably smaller for the wave parameter comparison. In contrast, the number of data was smaller when the distance R dmax was smaller. The value of R dmax =15 km (R a = 9.1 km) was found to optimize the balance between maximum distance and number of data.
The mean drifting buoy-measured wave height was closer to mean GPS buoymeasured wave height than mean ERA5 wave height. The correlation r c (H g , H b ) was higher than that r c (H g , H e ), the RMSD R d (H g , H b ) was smaller than R d (H g , H e ), and the scatter index SI(H g , H b ) was smaller than SI(H g , H e ). The agreement between the drifting buoy-measured wave height (H b ) and the GPS buoy-measured wave height (H g ) was better than that between the ERA5 wave heights (H e ) and H g . Figure 2d shows the scatter plot of the GPS-observed wave periods (T g ), drifting buoymeasured wave periods (T b ), and ERA5 wave periods (T e ) at the drifting buoy positions for R dmax =15 km.
The correlation r c (T g , T b ) was higher than r c (T g , T e ). The RMSD of the wave period R d (T g , T b ) was slightly smaller than R d (T g , T e ), although that for the buoy-measured wave periods (T g , T b ) was T 1/3 , and that for the ERA5 wave period T e was T 0 . The scatter index SI(T g , T b ) was smaller than SI(T g , T e ). The agreement of the drifting buoy-measured wave data with the GPS buoy-measured wave data was better than that of the ERA5 wave data.

Intercomparison of Wave Heights for Various Cases
The agreement between H g and H e was not good in Figure 2c. This reason was explored. GPS buoy-measured wave heights and ERA5 wave heights were compared in the case of winds from the sea and the other case. The case of winds from the sea means u w > 0 for the buoy C, u w < 0 for buoys D, E, F, G, and H, and v w > 0 for buoys J, K, P, and Q (Figure 1), where (u w , v w ) are ERA5 wind vectors at the JMA drifting buoy positions.
The skill metrics are summarized in Table 3. The fetch length was short in the case of Table 3b. It is expected that the impact of the difference between true fetch length and ERA5 fetch length on wave height difference is larger in the case of Table 3b than in the  case of Table 3a. However, the agreement between H g and H e in the case of Table 3a was better than that in the case of Table 3b. Table 3. Comparisons of GPS wave heights and ERA5 wave heights for various cases in Figure 2c. (a) Case of winds from the sea, which means u w > 0 for the buoy C, u w < 0 buoys for D, E, F, G, and H, and v w > 0 for buoys J, K, P, and Q (Figure 1  The wave heights by the GPS buoys near the Kuroshio (buoys J, K, P, and Q) and other buoys were compared with ERA5 wave heights. The agreement between H g and H e near the Kuroshio was poor (Table 3d). Most of the GPS buoy-measured wave data was from the buoy P and Q in Table 3d (Table 1), where the current speed is large by the Kuroshio (Figure 1). The effects of current are not taken into account in producing ERA wave data [26]. The main reason of the disagreement between H g and H e in Figure 2c is due to the strong current of the Kuroshio. On the other hand, the agreement between H g and H b was good in all of the cases in Table 3. The agreement was better as shorter R a .

Issues of ERA5 Wave Data
The following reasons can be considered as reasons for the low accuracy of ERA5 wave data. First, the ERA5 wave data was used in coastal areas. The native spatial resolution of ERA5 wave data is 0.36 • [14], which is too large to resolve the wave data in the coastal area. Next, the ERA5 wave data does not take into account the effect of ocean currents [26]. In addition, real-time altimeter data is not as well calibrated compared to data products from References [27,28]. The comparison with high resolution wave data product will be necessary.

Data Processing Difference
The raw data of the drifting and GPS buoys are unavailable. The comparison between H 0 and H 1/3 by the numerical simulation for various wave spectra was summarized in Reference [29]. The ratio H 1/3 / H 0 ranged from 0.94 to 0.99, which was consistent with that shown in Figure A1 in Appendix A (blue lines). However, the ratio of the wave heights was not consistent with that in Figure 2c because of the spatial variability of the wave heights near the coast. The ratio H b / H g was smaller than 1 ( Table 2). In contrast, the simulation result showed that the H 1/3 by the drifting buoy (512 s observation at 0.5 s interval) was higher than that by the GPS buoy (1024 s observation at 1 s interval) because of the spatial variability of the wave heights.
The comparison between T 0 and T 1/3 by the numerical simulation was conducted by Reference [30], who showed that the ratio of T 1/3 / T 0 was close to 1, especially for the narrower spectrum. The value T 1/3 / T 0 was also close to 1 ( Figure A1, green lines).
The order of the magnitudes of the wave periods is T b > T e > T g (Table 2), while T 1/3 (GPS buoy) > T 1/3 (drifting buoy) > T 0 ( Figure A1). The effect of different processing on the difference of the wave periods was small compared with the other effects, such as spatial variability and Doppler shift.

Comparison with the Moored GPS Buoy Data
The agreement between the GPS and drifting buoy-observed wave heights was good. Although the drifting buoy data were contaminated by low-frequency noise [8], the effect on wave height was small.
A difference can also be found in the wave sensors between the drifting and moored GPS buoys. The wave sensor of the drifting buoy was an accelerometer. The wave parameters from the accelerometer and from the GPS sensor agreed very well [31]. The good agreement between the two types of buoy was particularly notable considering that they employ different wave sensing instrumentation. The difference in the wave parameters of the moored GPS buoy and the drifting buoy caused by the different sensors was small compared with that caused by the horizontal variability of the wave parameters.
The difference between GPS wave height and drifting buoy wave height has various factors other than the difference in buoy position. Among them, the contribution of the wave height resolution and the sampling variability in the zero-up crossing method to the RMSD is evaluated. The variance of the drifting buoy-measured wave height by the (0.1 m) resolution was 0.1 2 /12 m 2 , because the variance of a uniform random number with a range of 0.1 is 0.1 2 /12. The RMSD between H g and H b associated with the resolution was The RMSD, correlation, and SI of the wave heights showed that the accuracy of the drifting buoy-observed wave height was better than that of the ERA5 wave heights. The statistical significance was explored using the bootstrap method explained in Appendix B. This method accounts for the resolution of the wave parameters (e.g., 0.1 m for H b ) and the sampling variability of the zero-up crossing method ( Figure A1, black solid and dashed lines). The probabilities of R d (H g , H b ) < R d (H g , H e ), r c (H g , H b ) > r c (H g , H e ), and SI(H g , H b ) < SI(H g , H e ) were almost 100 % for R dmax =15 km (Figure 2c). The drifting buoy-observed wave height was more accurate than the ERA5 wave heights at more than 99% confidence levels.
A bias was observed between T g and T b in Figure 2b, which was mainly caused by the spatial variability of the wave periods and the low-frequency noise of the drifting buoys [8]. We checked the dependency of the drifting current speeds on the difference of the wave periods in Figure 2b. However, no clear relationships were observed between them.
The contribution of the wave period resolution and the sampling variability in the zeoup crossing method to the RMSD from the linear regression line in Figure 2b is evaluated. The RMSD from the regression line was R d (a p T g + b p , T b ) = 0.78 s, where a p is a slope, and b p is an intercept of the regression line (Figure 2b). The RMSD between T g and T b associated with the resolution was (1/12 + 0.1 2 /12) 1/2 0.3 s. The mean wave period was approximately 8 s in Figure 2b. The values of S d (T 1/3 ) T 1/3 −1 were approximately 0.03 and 0.02 for the drifting and GPS buoys, respectively ( Figure A1, red lines). The RMSD between T g and T b associated with the sampling variability was (0.24 2 + 0.16 2 ) 1/2 0.3 s. They sum up to approximately 0.3 √ 2 0.4 s, which is greater than half of the RMSD from the linear regression line (R d (a p T g + b p , T b )). The correlation coefficient between the moored and drifting buoy periods was approximately 0.82 in Reference [8], while the correlation in Figure 2b was 0.80. The wave period for the comparison in Reference [8] was T 1 = M 0 M −1 1 . The wave periods T g and T b were T 1/3 , which was close to T 0 = M −1 M −1 0 . The effect of the low-frequency noise on T 0 was larger than that on T 1 . The accuracy of T 1/3 was lower than that of T 1 .
The RMSD, correlation, and SI of the wave periods showed that the accuracy of the drifting buoy-observed wave periods was better than that of the ERA5 wave periods. However, its statistical significance is lower than in the case of wave height. The agreement of wave periods was not robust compared with the wave height case.

Conclusions
The agreement of the drifting wave buoy-observed wave heights with the GPS observations was better than that of the ERA5 wave heights. This is also true for the period. However, the result that the accuracy of the drifting wave buoy-observed wave periods was better than that of the ERA5 wave periods was not as robust as the result for the wave heights. The sampling variability by the zero-up crossing method (i.e., Figure A1, black and red lines), resolution of the wave parameters (e.g., 0.1 m for H b ), and sampling variability of the wave parameters were considered in the comparison. The difference between the GPS buoy-measured and drifting buoy-measured wave data was associated with the spatial variability. The accuracy of the ERA5 wave height is lower than that of the drifting buoy wave, especially where the ocean current is fast, such as the Kuroshio.
The drifting buoys would be useful for studying wave-current interactions. The wave prediction model that incorporated the wave-current interactions in the open ocean can be validated by the drifting buoys.
Funding: This study was financially supported by a Grant-in-Aid for Scientific Research (C-2) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan (20K04708).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: Not applicable.
Acknowledgments: The GPS buoy wave data were observed by Maritime Bureau, Ministry of Land, Infrastructure, Transport, and Tourism of Japan, and analyzed by Port and Airport Research Institute. The GFD DENNOU Library (available online at http://www.gfd-dennou.org/arch/dcl/ (accessed on 30 June 2021) ) was used for drawing the figures. The comments from the four anonymous reviewers were helpful in revising the manuscript.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A. Relationships of the Wave Parameters
The JMA drifting buoy-observed wave parameters were estimated through the zero-up crossing method, while the ERA5 wave parameters were estimated from a wave spectrum. The relationship between the wave parameters from the zero-up crossing method and from the wave spectrum was explored.
The sampling variabilities of the significant wave height and period through the zeroup crossing method were also evaluated. The absolute Fourier amplitudes (square root of spectra) were estimated, where the wave spectrum was the JONSWAP spectrum. The parameter α in the JONSWAP wave spectrum, which controlled the spectrum amplitude, was α = 8.1 × 10 −3 . The time series of the surface elevations (η) was generated by the inverse Fourier transform (IFFT) with the amplitudes and randomized phases. The time step of the surface elevation generation was 0.125 s. The number of surface elevations generated by the IFFT was 2 14 (2048 s). The 1024 surface elevations were generated at 0.5 s intervals from the generated surface elevations. The significant wave height (H 1/3 ) and period (T 1/3 ) from the 1024 surface elevations were evaluated using the zero-up crossing method. This evaluation of the significant wave heights and periods was conducted 10 3 times for a given peak wave frequency of the JONSWAP-type wave spectrum. The means and the standard deviations of the significant wave heights and periods were estimated from 10 3 ensembles of the significant wave heights and periods.
The sampling variabilities of the GPS wave data were also evaluated. The 1024 surface elevations were generated at 1 s intervals from the generated surface elevations at 0.125 s intervals. The significant wave heights and periods were evaluated from the 1024 surface elevations at 1 s intervals. This evaluation was also conducted 10 3 times. The mean and the standard deviation of H 1/3 and T 1/3 were evaluated from the 10 3 ensembles. Figure A1 shows the normalized standard deviations of the significant wave heights and periods by the mean values (S d (X) X −1 , where S d (X) = (X − X ) 2 1/2 , and X is H 1/3 or T 1/3 ) as a function of H 0 or T 0 . For example, the normalized standard deviation for the drifting buoy wave height S d (H 1/3 ) H 1/3 −1 was approximately 0.065 for H 0 = 2 m (black solid line in Figure A1). The value of S d (H 1/3 ) H 1/3 −1 increased with the higher wave heights. The sampling variability of the significant wave heights by the drifting buoys was larger than that by the GPS buoys. The value of S d (T 1/3 ) T 1/3 −1 was less than 0.03 for T 1/3 ≤ 8 s (red line, Figure A1). The value of S d (T 1/3 ) T 1/3 −1 also increased with the longer wave periods. Figure A1 shows the ratio of the wave heights as H 1/3 /H 0 −1 as a function of H 0 , where H 0 = 4(M 0 ) 1/2 . For example, the value of H 1/3 /H 0 −1 was approximately −0.05 for H 0 = 2 m (blue solid line, Figure A1), showing H 1/3 = 0.95H 0 or H 0 = (1/0.95)H 1/3 . The wave height by the zero-up crossing method (H 1/3 ) was slightly lower than that of H 0 . Figure A1 shows the value T 1/3 /T 0 −1 as a function of T 0 , where T 0 =M −1 M −1 0 . The value of T 1/3 /T 0 −1 was 0.03 for T 0 7.5 s (green solid line), indicating that T 1/3 = 1.03T 0 or T 0 = (1/1.03) T 1/3 . The value of T 1/3 was slightly longer than T 0 . The ratio T 1/3 /T 0 decreased with the longer wave periods.

Appendix B. Bootstrap Method
The effective sample sizes N e (k) were evaluated for each buoy using the method of Reference [32], where k is the buoy number (k = 1, . . . , N B ), and the N B is the number of drifting buoys for comparison. An effective sample size cannot be greater than the sample size. A pair of wave data (H e , H b ) is resampled. The normal random numbers Π(H b + , S d (H b + )) are generated. The parameter is a uniform random number, −0.05 m < < 0.05 m, which considers that the resolution of the buoy-observed wave height is 0.1 m. For the observed value of H b , the actual drifting buoy wave height is H b + .
A number Π( Y , S d (Y)) is a normal random number of which the mean is Y , and the standard deviation is S d (Y) ≡ (Y − Y ) 2 1/2 . The standard deviation S d (H b + ) is estimated from S d (H 1/3 ) H 1/3 −1 in Figure A1 (black line), which is described in Appendix A. The number of N e (k) pairs of (H e , Π(H b + , S d (H b + ))) was generated for each k, and the skill metrics was evaluated from the wave parameters of the ∑ N B k=1 N e (k) pairs. Skill metrics, such as r c (H g , H b ), R d (H g , H b ), and SI(H g , H b ), were compared with r c (H g , H e ), R d (H g , H e ), and SI(H g , H e ). These comparisons were made 10 3 times. The probabilities of the accuracy deterioration of the ERA5 wave data were evaluated by comparing 10 3 pairs of skill metrics.