Application of Three-Dimensional Interpolation in Estimating Diapycnal Di ﬀ usivity in the South China Sea

: Diapycnal di ﬀ usivity is an important parameter to characterize oceanic turbulent mixing and vertical transport. However, due to the challenging accessibility of ﬁeld observations, the observation of diapycnal di ﬀ usivity in the South China Sea (SCS) is rare. In this study, a three-dimensional ﬁeld of diapycnal di ﬀ usivity in the SCS with high spatial resolution is performed by interpolating the rare ﬁeld observations, which aims to provide a reference for the value of diapycnal di ﬀ usivity in ocean models. Given the anisotropy of diapycnal di ﬀ usivity and its rapid change in the magnitude in the vertical direction, several typical interpolation methods are compared in this study. Results of two cross-validation methods demonstrate that the three-dimensional (3D) thin-plate spline interpolation method yields the most reasonable and accurate results among a total of ﬁve typical methods used in this study.


Introduction
Turbulent mixing is a microscale process that makes the properties of water bodies more uniform, which is extremely important for regulating climate and marine productivity [1,2]. Moreover, turbulent mixing could be an important mechanism for controlling the vertical transport of nutrients, dissolved gases and particulate in water [3]. The South China Sea (SCS) is the largest marginal sea of the Northwest Pacific and the mixing in the SCS plays a key role in maintaining the strength of abyssal water transport and the Pacific circulation [3,4].
Turbulent mixing in the SCS has been a hot topic for scholars for a long time. Oceanic turbulent mixing is commonly characterized by turbulent kinetic energy dissipation rate and diapycnal diffusivity, and there is a strong correlation between these two indicators [5]. Tian et al. [3] found that the diapycnal diffusivity in the deep ocean basin of the SCS was two orders larger than that in the western Pacific, based on a fine scale parameterization and hydrographic observations obtained from the northern SCS and the western Pacific. Using the Thorpe scale method, Alford et al. [6] and Buijsman et al. [7] revealed that the diapycnal diffusivity of the Luzon Strait increases in the deep water. Lozovatsky et al. [8] studied the spatial structure and temporal variability of turbulent kinetic energy dissipation rate in limited areas of the northern SCS based on the microstructure measurements. Measurements and numerical simulations [9,10] indicated that energetic internal tides and internal waves generated near the Luzon Strait propagate into the SCS and enhance turbulent mixing. Yang et al. [4] presented the first three-dimensional (3D) patchy distribution of turbulent mixing in the SCS based on the Gregg-Henyey-Polzin (GHP) parameterization and hydrological observations from 2005 to 2012.

Data
Yang et al. [4] estimated the diapycnal diffusivity in the SCS based on in situ observations from 2005 to 2012 and the GHP parameterization. Detailed information of the measuring data can be referred to Yang et al. [4].
The 3D structure of diapycnal diffusivity in the SCS given by Yang et al. [4] is shown in Figure 1. The horizontal distribution of diapycnal diffusivity ranges from 110 • E to 121 • E and 11 • N to 21 • N with a resolution of 0.5 • × 0.5 • , and the vertical distribution ranges from 200 m to 5000 m with a resolution of 200 m. Based on these data (a total of 2353 data), a 3D spatial interpolation analysis is performed in this paper to reconstruct the 3D spatial distribution of diapycnal diffusivity in the SCS. resolution of 200 m. Based on these data (a total of 2353 data), a 3D spatial interpolation analysis is performed in this paper to reconstruct the 3D spatial distribution of diapycnal diffusivity in the SCS.

Methodology
Several 3D spatial interpolation methods are used to estimate the diapycnal diffusivity at unknown points and the 2D TPS interpolation method is used as a comparison. The following contents are the methods that are commonly used for interpolating scattered point data.

The 3D TPS Interpolation
Radial basis functions have the advantage of being unconstrained by dimensions, and they are statistically unbiased estimates of minimum variance [18]. In this study, the radial basis function method is applied due to its good accuracy in the interpolation of scattered points [18,19]. The TPS function is a commonly used radial basis function, which guarantees the minimum bending energy of the deformation field [13]. The TPS interpolation has good numerical stability and convergence, and it is more suitable for global deformation [20]. Therefore, the TPS function is chosen as the basis function in this study. Assuming that the diffusivity is an isotropic distribution, there are n data points with measured diapycnal diffusivity ) , , (

Methodology
Several 3D spatial interpolation methods are used to estimate the diapycnal diffusivity at unknown points and the 2D TPS interpolation method is used as a comparison. The following contents are the methods that are commonly used for interpolating scattered point data.

The 3D TPS Interpolation
Radial basis functions have the advantage of being unconstrained by dimensions, and they are statistically unbiased estimates of minimum variance [18]. In this study, the radial basis function method is applied due to its good accuracy in the interpolation of scattered points [18,19]. The TPS function is a commonly used radial basis function, which guarantees the minimum bending energy of the deformation field [13]. The TPS interpolation has good numerical stability and convergence, and it is more suitable for global deformation [20]. Therefore, the TPS function is chosen as the basis function in this study. Assuming that the diffusivity is an isotropic distribution, there are n data points with measured diapycnal diffusivity K(x i , y i , z i ), I = 1, 2, . . . , n. Then, the estimated diapycnal diffusivity at other points K(x i , y i , z i ) is calculated by 3D TPS interpolation as follow: where r ij represents Euclidean distance between (x i , y i , z i ) and (x j , y j , z j ), the parameters c i , λ 1 , λ 2 , λ 3 , λ 4 can be solved by where · · · · · · 1 x n y n z n However, the actual diapycnal diffusivity in the SCS is anisotropic, and its variability in the vertical direction is greater than that in the horizontal direction. In this study, the anisotropic of data, which has a crucial influence on the interpolation results, is taken into account in interpolation process.
As shown in Table 1, the mean absolute change rate (e.g., in vertical direction, calculate the absolute rate of change for the diapycnal diffusivity values of any two adjacent points in vertical direction, and then average all absolute change rate) of the diapycnal diffusivity values in the vertical direction is two orders of magnitude higher than that in the horizontal direction. In addition, turbulent mixing in the SCS is mainly triggered by internal waves breaking, and the horizontal feature scale of the typical movement is also two orders of magnitude larger than the vertical feature scale [2,9]. In other words, in the interpolation, diapycnal diffusivity in the horizontal direction should have a greater weight than that in the vertical direction [21]. Therefore, the weight in the vertical direction is set to 1/100 of that in the horizontal direction to achieve the anisotropy characteristics of diapycnal diffusivity, and Equation (1) is rewritten as:

Directions
The Mean Absolute Rate of Change (m 2 s −1 km −1 ) Zonal direction 6.5 × 10 −5 Meridional direction 6.7 × 10 −5 Vertical direction 9.1 × 10 −3 In the subsequent 3D interpolation methods used in this study, the weight in the vertical direction is always set to 1/100 of that in the horizontal direction.

The 3D IDW Interpolation
The 3D IDW interpolation method is based on the hypothesis that each measurement point has a local influence, which is inversely proportional to the distance between the point to be interpolated and the measurement point. All distances in this study are calculated in Cartesian coordinate system. The diapycnal diffusivity K(x j , y j , z j ) estimated by the 3D IDW interpolation method is performed as follows: where The distances in this article are all calculated under the Cartesian coordinate system, and the spherical coordinate system is not used.

The 3D CPF
In this paper, the diapycnal diffusivity estimated by the 3D CPF method is performed as follows: where K max , S max , and L max represent the highest order of polynomials in zonal, meridional and vertical direction, respectively, which are limited to 10 in this study. β k,s,l are coefficients that can be solved by the least squares method.

The 2D TPS Interpolation
The 3D diapycnal diffusivity data can be decomposed into multiple 2D data at different water depths and then 2D interpolation schemes can be adopted. To make a comparison with the 3D TPS method, the 2D TPS method is used to interpolate the diapycnal diffusivity at each depth. The 2D TPS method is introduced as follows: assuming that there are n data points with measured diapycnal diffusivity K(x i , y i ) at a certain depth, diapycnal diffusivity at other points K(x j , y j ) can be estimated in Section 3.1, and the only difference is that the z dimension is deleted.

Cross-Validation
In this study, the 10-fold random cross-validation [17] is employed to assess the interpolation results. The measured diapycnal diffusivity dataset is randomly divided into ten subsets, nine of which are selected as set A and one as set B. The data in set A are used to interpolate, while the accuracy of estimation is evaluated with the data in set B. The random cross-validation is repeated 50 times. By calculating the averaged error of interpolated results in the 50 experiments, the effectiveness and accuracy of the aforementioned interpolation methods can be assessed.
In order to verify the effectiveness of interpolation in where an entire section is removed, as a supplement to the 10-fold random cross validation scheme, another cross-validation scheme (plan cross-validation) is also implemented in this study. In this scheme, the measured diapycnal diffusivity on a whole certain horizontal plan is retained to evaluate the accuracy of each interpolation method, meanwhile the remaining data is used to interpolate. In order to ensure that the estimate is interpolated and there are enough measure points in the evaluated, water depth planes ranging from 400 m to 3800 m are selected for evaluation and the plane cross-validation is repeated 18 times.

Statistical Analysis
Two statistical parameters, namely the mean absolute error (MAE) and the mean relative error (MRE), are used to quantify the interpolation performance. The MAE and MRE are calculated as follows: where K i and ∼ K i are measured and estimated diapycnal diffusivity, respectively, and m is the number of validation data in set B.

Result Analysis and Discussion
Given the rapid change of diapycnal diffusivity in the magnitude in the vertical direction (the measured diapycnal diffusivity varying between 10 −5 m 2 /s and 10 −1 m 2 /s), the interpolation of log10 (K) may produce better results than the interpolation of K. In addition, the prediction error could be used to determine whether the data should be transformed [22]. The 3D TPS method is taken as an example to compare the interpolation results of K and log10 (K). Equations (1) and (3) are rewritten as (11) and (12) when the interpolation is carried out on log10 (K).
The 10-fold random cross-validation results of the 3D TPS method are shown in Figure 2 which clearly shows that the interpolation errors for the logarithm of K are much smaller than those for K, suggesting the reliability of interpolating the logarithm of K. Noting that the interpolated log10 (K) values are converted back to K values before calculating the interpolation errors. Therefore, the logarithmic pretreatment of diapycnal diffusivity is taken in the following experiments. used to determine whether the data should be transformed [22]. The 3D TPS method is taken as an example to compare the interpolation results of K and log10 (K). Equations (1) and (3) are rewritten as (11) and (12) when the interpolation is carried out on log10 (K).
. 0 10 log The 10-fold random cross-validation results of the 3D TPS method are shown in Figure 2 which clearly shows that the interpolation errors for the logarithm of K are much smaller than those for K, suggesting the reliability of interpolating the logarithm of K. Noting that the interpolated log10 (K) values are converted back to K values before calculating the interpolation errors. Therefore, the logarithmic pretreatment of diapycnal diffusivity is taken in the following experiments. N is the total number of random cross-validation experiments) for each combination are calculated during the 10-fold random cross-validation process [16]. In order to facilitate the comparison with the 3D TPS method, MAE and MRE with the minimum are regarded as fitting errors and shown in Figure 3. The degrees corresponding to the minimum values of MAE and MRE in the 10-fold random cross-validation process are given in Table 2   As shown in Figure 3, the errors of the 3D CPF method ( are unreliable and unstable during the random cross-validation process, which suggests that the 3D CPF method may not be a good approach in this study. In order to expand the measured diapycnal diffusivity into a spatial complete field, other 3D interpolation methods were applied. Among them, the 3D piecewise linear (3D PL) interpolation method was implemented by MATLAB griddata function (linear interpolation as an interpolation method) and used as a comparison. The 10-fold random cross-validation results of these 3D interpolation methods are shown in Figure 4. It is clearly shown that the 3D TPS method performs best in the 10-fold random cross-validation with   As shown in Figure 3, the errors of the 3D CPF method (MAE = 0.0030 m 2 /s, MRE = 85.06%) are unreliable and unstable during the random cross-validation process, which suggests that the 3D CPF method may not be a good approach in this study. In order to expand the measured diapycnal diffusivity into a spatial complete field, other 3D interpolation methods were applied. Among them, the 3D piecewise linear (3D PL) interpolation method was implemented by MATLAB griddata function (linear interpolation as an interpolation method) and used as a comparison. The 10-fold random cross-validation results of these 3D interpolation methods are shown in Figure 4. It is clearly shown that the 3D TPS method performs best in the 10-fold random cross-validation with MAE = 0.0013 m 2 /s and MRE = 32.65%, followed by the 3D PL method with MAE = 0.0016 m 2 /s and MRE = 45.67%.
The statistical results of the overestimation and underestimation of the 3D interpolation methods are shown in Table 3. According to the statistical results in Table 3, the interpolated results are more likely to be significantly overestimated (the average proportion of relative error greater than 50%) for this diapycnal diffusivity field. Among the 3D TPS, PL, and IDW methods, the 3D TPS method yields the smallest proportions for both overestimation and underestimation, although the low error proportion of the 3D TPS method is only a slight improvement on that of the 3D PL method.

3D Interpolation Methods
The Average Proportion of Relative Error Greater than 50% The  The MAEs and MREs of the 2D TPS method at different depths are illustrated in Figure 5. The MAE and MRE of all points to be estimated are 0.0043 m 2 /s and 172.51%, respectively, which are much larger than the errors of the 3D TPS method. In addition, it is obvious that the 2D interpolation method cannot accurately describe the true distribution of geographic phenomena in the 3D space, and the interpolation effect is easily affected by the number of measured points on the plane. Therefore, the 2D TPS method is not suitable for 3D reconstruction of the diapycnal diffusivity field.  Although the background diffusivity is 10 −5 m 2 /s or even 10 −4 m 2 /s, considering that the average value of the measured diapycnal diffusivity is 0.0052 m 2 /s and the MRE of the 3D TPS method is 32.65%, the MAE of the 3D TPS method is reasonable. The first random cross-validation experiment is taken as an example: Figure 6 shows that the diapycnal diffusivity obtained by the 3D TPS method has a good agreement with the measured diapycnal diffusivity in set B and the relative errors of most points to be inspected fluctuate in [−50%, 50%].   Figure 7. All cross-validation processes suggest that the 3D TPS method produces better results (the MAE and MRE of all points to be estimated are 0.0011 m 2 /s and 27.70%, respectively) than the 3D PL and IDW methods. The MAE and MRE of all points with 3D PL method are 0.0015 m 2 /s and 39.08% respectively. Considering that the 3D PL method requires the least amount of calculation among the 3D interpolation methods, the error level of the 3D PL method is encouraging. The statistical results of the overestimation and underestimation of the 3D interpolation methods are shown in Table 3. According to the statistical results in Table 3, the interpolated results are more likely to be significantly overestimated (the average proportion of relative error greater than 50%) for this diapycnal diffusivity field. Among the 3D TPS, PL, and IDW methods, the 3D TPS method yields the smallest proportions for both overestimation and underestimation, although the low error proportion of the 3D TPS method is only a slight improvement on that of the 3D PL method. Results of plane cross-validation experiments are shown in Figure 7. All cross-validation processes suggest that the 3D TPS method produces better results (the MAE and MRE of all points to be estimated are 0.0011 m 2 /s and 27.70%, respectively) than the 3D PL and IDW methods. The MAE and MRE of all points with 3D PL method are 0.0015 m 2 /s and 39.08% respectively. Considering that the 3D PL method requires the least amount of calculation among the 3D interpolation methods, the error level of the 3D PL method is encouraging.

Results of plane cross-validation experiments are shown in
It is interesting to find that the errors of the 3D IDW are much larger than those of the 3D PL and TPS methods. It can be seen from Figure 8 that the diapycnal diffusivity is obviously overestimated in the weak mixing area, while it is underestimated in the high mixing area. This phenomenon can be seen more clearly from random cross-validation (Figure 9). The main reason is that turbulent mixing generally increases with depth in the SCS, which leads to the fact that the measured points in the deep sea have a greater impact on the interpolation points than points closer to the interpolation points (inconsistent with the assumption of the 3D IDW method). Moreover, the interpolated values with the 3D IDW method within the data set are bounded by min(K i ) < K j < max(K i ), where K i and K j are measured and estimated diapycnal diffusivity, respectively. In other words, the 3D IDW method is essentially composed of smoothing procedures [14], which cause the diapycnal diffusivity to be easily underestimated in the high mixing area and overestimated in the weak mixing area. It is interesting to find that the errors of the 3D IDW are much larger than those of the 3D PL and TPS methods. It can be seen from Figure 8 that the diapycnal diffusivity is obviously overestimated in the weak mixing area, while it is underestimated in the high mixing area. This phenomenon can be seen more clearly from random cross-validation (Figure 9). The main reason is that turbulent mixing generally increases with depth in the SCS, which leads to the fact that the measured points in the deep sea have a greater impact on the interpolation points than points closer to the interpolation points (inconsistent with the assumption of the 3D IDW method). Moreover, the interpolated values with the 3D IDW method within the data set are bounded by , where i K and j K are measured and estimated diapycnal diffusivity, respectively. In other words, the 3D IDW method is essentially composed of smoothing procedures [14], which cause the diapycnal diffusivity to be easily underestimated in the high mixing area and overestimated in the weak mixing area.   be seen more clearly from random cross-validation (Figure 9). The main reason is that turbulent mixing generally increases with depth in the SCS, which leads to the fact that the measured points in the deep sea have a greater impact on the interpolation points than points closer to the interpolation points (inconsistent with the assumption of the 3D IDW method). Moreover, the interpolated values with the 3D IDW method within the data set are bounded by , where i K and j K are measured and estimated diapycnal diffusivity, respectively. In other words, the 3D IDW method is essentially composed of smoothing procedures [14], which cause the diapycnal diffusivity to be easily underestimated in the high mixing area and overestimated in the weak mixing area.    The 3D TPS and PL methods are selected to reconstruct spatial fields of diapycnal diffusivity in the plane cross-validation. Figure 10 shows the depth-averaged diffusivity reconstructed by interpolation in the upper (400-600 m), intermediate (600-1600 m), and deep layers (>1600 m). Since the 3D TPS method ensures that the first derivative of the space field is continuous [23], the maps obtained by the 3D TPS method are smoother than those obtained by the 3D PL method, and the gradient changes are smaller. In addition, both cross-validation methods show that the errors of the 3D TPS method are smaller than those of the 3D PL method. Moreover, since the derivative of the diapycnal diffusivity appears in the equations of motions, the application effect of the 3D TPS method may be better than that of the 3D PL method. In a word, results calculated by the 3D TPS are a slight improvement over those calculated by the 3D PL method.
Combining all the above analysis, the best interpolation method of diapycnal diffusivity in the SCS is the 3D TPS method, and both cross-validation methods provide evidence for the rationality of this method. The interpolated results of the 3D TPS method with horizontal resolution of 0.25° × 0.25° and vertical resolution of 100 m are show in Figure 11. The interpolated field of diapycnal diffusivity can be obtained with high spatial resolution, but it does not include the information about the diapycnal diffusivity changing with time. Therefore, the cross-validation results can only prove that the interpolation accuracy is satisfactory. The real assessment will be whether the interpolation results of the 3D TPS method improve the model's representation of reality. The 3D TPS and PL methods are selected to reconstruct spatial fields of diapycnal diffusivity in the plane cross-validation. Figure 10 shows the depth-averaged diffusivity reconstructed by interpolation in the upper (400-600 m), intermediate (600-1600 m), and deep layers (>1600 m). Since the 3D TPS method ensures that the first derivative of the space field is continuous [23], the maps obtained by the 3D TPS method are smoother than those obtained by the 3D PL method, and the gradient changes are smaller. In addition, both cross-validation methods show that the errors of the 3D TPS method are smaller than those of the 3D PL method. Moreover, since the derivative of the diapycnal diffusivity appears in the equations of motions, the application effect of the 3D TPS method may be better than that of the 3D PL method. In a word, results calculated by the 3D TPS are a slight improvement over those calculated by the 3D PL method.
Combining all the above analysis, the best interpolation method of diapycnal diffusivity in the SCS is the 3D TPS method, and both cross-validation methods provide evidence for the rationality of this method. The interpolated results of the 3D TPS method with horizontal resolution of 0.25 • × 0.25 • and vertical resolution of 100 m are show in Figure 11. The interpolated field of diapycnal diffusivity can be obtained with high spatial resolution, but it does not include the information about the diapycnal diffusivity changing with time. Therefore, the cross-validation results can only prove that the interpolation accuracy is satisfactory. The real assessment will be whether the interpolation results of the 3D TPS method improve the model's representation of reality.

Conclusions
In order to expand the 3D scatter data of diapycnal diffusivity into a spatial complete field, the 3D TPS method which accounts for anisotropy (vertical scales smaller than the horizontal ones) in the data is carried out to estimate the diapycnal diffusivity in the SCS. Given the rapid change of diapycnal diffusivity in the magnitude, better results are obtained by logarithmic pretreatment of diapycnal diffusivity. The selection of spatial interpolation method for estimating the diapycnal diffusivity is also discussed in this paper. Results of 10-fold random cross-validation experiments illustrate that the 3D TPS method has the smallest errors ( ). Results of plane crossvalidation experiments also indicate that the 3D TPS method can reconstruct the diapycnal diffusivity field relatively accurately. Moreover, the interpolated results may provide a reference for calculating vertical fluxes across the basin, and we will study what the error induced in total flux is by having transects in the subsequent work.
Both cross-validation methods show that the diapycnal diffusivity obtained by the 3D interpolation has a good agreement with the measured diapycnal diffusivity. The diapycnal diffusivity field reconstructed by the 3D TPS method is differentiable, with lower errors than those with the 3D PL method. However, note that for the diapycnal diffusivity field, the 3D TPS method is a slight improvement over the 3D PL method. The 3D CPF method is considered to be inappropriate in estimating the diapycnal diffusivity due to its excessive and unstable errors. Meanwhile, the diapycnal diffusivity map reconstructed by the 3D IDW method is extremely inconsistent with the measured data, and this shortcoming is related to the nature of the IDW function itself.
The 3D TPS method applied in this study, which accounts for anisotropy in the data, may provide potential applications for improving the mixing configuration in the numerical model used for simulating a realistic ocean state in the SCS. Moreover, the 3D interpolation methods used in this study might be also applicable to data analysis in any basin (adjust the ratio of horizontal weight and vertical weight appropriately), but their interpolation accuracy needs to be evaluated. A forthcoming validation is planned using the interpolation results of the 3D TPS method to test an ocean model of the SCS against standard model configurations for the diapycnal diffusivity.

Conclusions
In order to expand the 3D scatter data of diapycnal diffusivity into a spatial complete field, the 3D TPS method which accounts for anisotropy (vertical scales smaller than the horizontal ones) in the data is carried out to estimate the diapycnal diffusivity in the SCS. Given the rapid change of diapycnal diffusivity in the magnitude, better results are obtained by logarithmic pretreatment of diapycnal diffusivity. The selection of spatial interpolation method for estimating the diapycnal diffusivity is also discussed in this paper. Results of 10-fold random cross-validation experiments illustrate that the 3D TPS method has the smallest errors (MAE = 0.0013 m 2 /s, MRE = 32.65%), followed by the 3D PL method (MAE = 0.0016 m 2 /s, MRE = 45.67%). Results of plane cross-validation experiments also indicate that the 3D TPS method can reconstruct the diapycnal diffusivity field relatively accurately. Moreover, the interpolated results may provide a reference for calculating vertical fluxes across the basin, and we will study what the error induced in total flux is by having transects in the subsequent work.
Both cross-validation methods show that the diapycnal diffusivity obtained by the 3D interpolation has a good agreement with the measured diapycnal diffusivity. The diapycnal diffusivity field reconstructed by the 3D TPS method is differentiable, with lower errors than those with the 3D PL method. However, note that for the diapycnal diffusivity field, the 3D TPS method is a slight improvement over the 3D PL method. The 3D CPF method is considered to be inappropriate in estimating the diapycnal diffusivity due to its excessive and unstable errors. Meanwhile, the diapycnal diffusivity map reconstructed by the 3D IDW method is extremely inconsistent with the measured data, and this shortcoming is related to the nature of the IDW function itself.
The 3D TPS method applied in this study, which accounts for anisotropy in the data, may provide potential applications for improving the mixing configuration in the numerical model used for simulating a realistic ocean state in the SCS. Moreover, the 3D interpolation methods used in this study might be also applicable to data analysis in any basin (adjust the ratio of horizontal weight and vertical weight appropriately), but their interpolation accuracy needs to be evaluated. A forthcoming validation is planned using the interpolation results of the 3D TPS method to test an ocean model of the SCS against standard model configurations for the diapycnal diffusivity.