Reconstruction of Daily Sea Surface Temperature Based on Radial Basis Function Networks

A radial basis function network (RBFN) method is proposed to reconstruct daily sea surface temperatures (SSTs) from limited SST samples. To evaluate the SSTs produced by this method, non-biased SST samples in the Pacific Ocean (10°N–30°N, 115°E–135°E) are selected for the period when tropical storm Hagibis arrived in June 2014. These SST samples are obtained from the Reynolds optimum interpolation (OI) v2 daily 0.25° SST (OISST) products according to the distribution of AVHRR L2p SST and in-situ SST data. Furthermore, an improved nearest neighbor cluster (INNC) algorithm is designed to search for the optimal hidden knots for RBFNs from both the SST samples and the background fields. The reconstructed SSTs from the RBFN method are then compared with the results from the OI method. The statistical results show that the RBFN method reconstructs SST better than the OI method in this study: the average RMSE is 0.48 °C for the RBFN method, considerably smaller than the value of 0.69 °C for the OI method. Additionally, RBFN methods with different basis functions and clustering algorithms are tested, and we find that the INNC algorithm with a multi-quadric function is well suited to reconstructing SSTs with the RBFN method when the SST samples are sparsely distributed.


Introduction
High-quality sea surface temperature (SST) data play an important role in many applications, including climate predictions [1], ocean data assimilation [2], and global change research [3]. However, raw SST data can vary significantly between different types of measurements [4], such as in-situ measurements (e.g., ships or buoys) and satellite sensors (e.g., infrared thermal radiometers or microwave radiometers on satellite platforms). In-situ SST measurements are accurate but sparse, while satellite-retrieved SST measurements are less accurate but provide dense global coverage. Therefore, one way to address these problems is to combine different sources of SSTs to reconstruct new SST fields [4,5].
Even though many different sources of SSTs are combined, global ocean coverage remains difficult to achieve because of the limitations of satellite orbits and the field of view (FOV) of sensors [6]. Additionally, high-quality SST samples usually have irregular distributions because some data are discarded, such as in-situ SST data from questionable drifters and satellite-retrieved SST data contaminated by clouds [6,7]. For SST data from infrared radiometers in particular, the SSTs in some regions are largely missing because the sensors cannot retrieve information through clouds. Thus, it is critical to reconstruct SST fields from incomplete SST samples.
Several alternative methods, such as empirical orthogonal functions (EOF) [8,9], data interpolating empirical orthogonal functions (DINEOF) [10,11], and optimum interpolation (OI) [7], have previously been used for SST reconstruction. Currently, the most popular method for reconstructing SST is the OI algorithm, and many high-quality SST analysis products based on the OI method have been designed with different temporal and spatial resolutions for various applications [5,6,12–17]. However, when the distribution of SST samples is very sparse and the local correlation scales in the OI method are not large enough, the SSTs are often replaced by the background fields, and the SST accuracy in these regions is relatively poor [6].
An SST field can generally be described as a nonlinear pattern in space and time, which can be expressed as a set of complex functions. A radial basis function network (RBFN) is an artificial neural network that uses radial basis functions (RBFs) as activation functions in the field of mathematical modeling. An RBFN is a good function approximator when it has enough hidden neurons and samples [18–20]. RBFNs have been employed in many applications [21–23]. An RBFN-based response model for SST has recently been developed by Ryu et al. [24]. However, this model does not optimize the quantity and distribution of hidden knots, to which the accuracy of the model is sensitive and which should be adjusted for different distributions of SST samples.
Thus, in this study we introduce an RBFN method for reconstructing daily SST fields when the SST samples are limited and sparsely distributed, and design an improved nearest neighbor cluster (INNC) algorithm for optimizing the quantity and distribution of hidden knots for RBFNs. The remainder of this paper is organized as follows. Section 2 provides an outline of the SST samples that are used in this study. The RBFN method and the INNC algorithm are then described in Section 3. Section 4 evaluates the performance of the RBFN method using different basis functions and clustering methods for RBFNs, and compares its SST accuracy with that of the OI method. Some characteristics of this method are discussed in Section 5. Finally, the conclusions are summarized in Section 6.

Data Description
Since the satellite-retrieved SST data from infrared radiometers are often contaminated by clouds, the SST samples that can be obtained are very limited, especially when a typhoon or tropical storm brings heavy clouds. This study considers data from the Pacific Ocean (10°N–30°N, 115°E–135°E) (see Figure 1a). The SST samples in the study are selected from both the in-situ SST and the AVHRR SST L2p data, but the coverage of SST samples is still very low during the passage of tropical storm Hagibis from 12 June to 18 June 2014 (see Figure 1b). The AVHRR SST L2p data are provided by the Group for High Resolution Sea Surface Temperature (GHRSST) (ftp://data.nodc.noaa.gov/pub/data.nodc/ghrsst/) and have been widely used in SST studies [4]. The in-situ SSTs are taken from the in-situ SST quality monitor (iQuam) system, using only the highest quality flag value of 5, which is designed for high-accuracy applications (http://www.star.nesdis.noaa.gov/sod/sst/iquam/index.html) [25]. These SST samples, combining the AVHRR L2p data and in-situ SST data, have been averaged onto a 0.25° grid. To evaluate the SST errors arising only from RBFNs, the OISST product was used as a reference, and non-biased SST samples were selected from the OISST products according to the distributions of the SST samples. The OISST products are level-4 data in GHRSST with a grid resolution of 0.25°, in which in-situ data have been combined with AVHRR data through the analysis system of the National Oceanic and Atmospheric Administration (NOAA) National Climatic Data Center (NCDC) [6]. Thus, the distributions of SST samples are consistent with the AVHRR L2p data and in-situ SST data, while the values of the SST samples were assigned from the OISST products. In the same way, the global SST samples from 13 June 2014 are used to test the performance of the RBFN method on a global SST model. In addition, the reconstructed SSTs from this RBFN method are validated against the reference SSTs in the study.
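The averaging of scattered samples onto a 0.25° grid mentioned above can be illustrated with a minimal sketch. The function name `grid_average`, the cell-indexing convention, and the default origin (the south-west corner of the study region) are assumptions made for illustration; the source does not describe how the averaging was implemented.

```python
import math
from collections import defaultdict

def grid_average(samples, lon0=115.0, lat0=10.0, res=0.25):
    """Average (lon, lat, sst) samples within each res-degree cell.

    Returns a dict mapping (row, col) cell indices to the mean SST of the
    samples falling in that cell. Cells without samples are simply absent.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for lon, lat, sst in samples:
        key = (math.floor((lat - lat0) / res), math.floor((lon - lon0) / res))
        sums[key][0] += sst
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

Two samples that fall in the same cell are replaced by their mean, so the gridded field has at most one value per 0.25° cell.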

RBFN Method
The daily SST can be considered as two parts: the background field and the SST increment [6,26]. The previous SST field is used as the background field, to which the change in climatology is added. The reconstructed SST $T_t$ can be defined as

$$T_t = T^b_t + Y_t, \qquad T^b_t = T_{t-1} + (T^c_t - T^c_{t-1}), \tag{1}$$

where $Y_t$ is the SST increment, $T^b_t$ is the background field, and $T_{t-1}$ is the reconstructed SST for the previous time. For convenience, we use the OISST product as the a priori SST, and to minimize the effects of this choice, the procedure was run for 10 days in advance of the first date. The daily SST climatology, $T^c_t$, was obtained from 30 years (1982–2011) of OISST data [27], and $(T^c_t - T^c_{t-1})$ is the change in SST climatology between time $t-1$ and time $t$. The SST increment $Y_t$ is calculated from the RBFN proposed by Ryu et al. [24]. It should be noted that this RBFN is used to estimate SST increments, not SSTs as in the study of Ryu et al. The RBFN can be given by

$$Y_t = f(X_t) + \varepsilon_t, \qquad f(X_t) = \beta_{t,0} + \sum_{m=1}^{5} \beta_{t,m}\, X^d_{t,m} + \sum_{k=1}^{K_t} \beta_{t,k+5}\, \phi(\lVert X_t - u_{t,k} \rVert), \tag{2}$$

where $f(X_t)$ is an RBFN function, $X_t = (X_{t1}, X_{t2})$ is a two-dimensional input, and $X_{t1}$, $X_{t2}$ are the longitude and latitude at time $t$, respectively. $X^d_t$ denotes the set of all possible powers up to degree $d$. Here, $d = 2$, and the 2-degree polynomial terms can be expressed as $(1, X_{t1}, X_{t2}, X_{t1}^2, X_{t1}X_{t2}, X_{t2}^2)$. $\lVert X_t - u_{t,k} \rVert$ denotes the distance between $X_t$ and the hidden knots $u_{t,k}$ for $k = 1, \ldots, K_t$, and $\phi(z) = z$. The coefficients $\beta_{t,0}$; $\beta_{t,m}$, $m = 1, \ldots, 5$; and $\beta_{t,k+5}$ denote the regression coefficients for the intercept, polynomials, and basis functions, respectively. $\varepsilon_t$ is an independent Gaussian random error with mean 0 and variance $\sigma^2_{y_t}$.
Using this RBFN method, the number of hidden knots $K_t$, their positions, and the error variance are unknown, and all are strongly related to the precision of the reconstructed SST. The error variance is estimated by the squared deviation of the difference between $f(X_t)$ and the SST increments $Y_t$. Since the number and positions of hidden knots can be affected by the distribution of SST samples, the INNC algorithm was designed to determine the hidden knots in the SST fields. Figure 2 shows the schematic diagram of this RBFN method for reconstructing SST.
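As an illustration of the model described above (a second-degree polynomial in longitude and latitude plus one distance term per hidden knot, with φ(z) = z), the following minimal sketch fits the regression coefficients by ordinary least squares with NumPy. The function names are hypothetical, the use of ordinary least squares is an assumption (the source does not state the estimator), and the error variance is taken here as the mean squared residual, which is one reading of "squared deviation".

```python
import numpy as np

def design_matrix(X, knots):
    """Columns: 1, x1, x2, x1^2, x1*x2, x2^2, then ||X - u_k|| for each knot."""
    X = np.asarray(X, dtype=float)
    knots = np.asarray(knots, dtype=float)
    x1, x2 = X[:, 0], X[:, 1]
    poly = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])
    # Pairwise Euclidean distances between positions and knots (phi(z) = z).
    dist = np.linalg.norm(X[:, None, :] - knots[None, :, :], axis=2)
    return np.hstack([poly, dist])

def fit_rbfn(X, y, knots):
    """Least-squares estimate of the coefficients and the mean squared residual."""
    A = design_matrix(X, knots)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, float(np.mean(resid**2))

def predict_rbfn(X, knots, beta):
    """Evaluate the fitted SST increment at positions X."""
    return design_matrix(X, knots) @ beta

def reconstruct(prev_sst, clim_t, clim_prev, increment):
    """Background field (previous SST plus climatology change) plus increment."""
    return prev_sst + (clim_t - clim_prev) + increment
```

In practice the positions would probably be standardized before fitting, since raw longitudes above 100 make the squared polynomial terms large relative to the distance terms; that preprocessing choice is also an assumption.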

INNC Algorithm
The nearest neighbor cluster (NNC) algorithm searches for the optimal hidden knots for RBFNs only among the SST samples [28,29], whereas the INNC algorithm can choose the hidden knots from both the SST samples and the background field, and takes the values of the SST samples into account. The INNC algorithm is described as follows.
(1) Standardize the original SST data so that each variable of $(x, y, z)$ in the SST matrix has a mean of 0 and a standard deviation of 1, where $z$ is the value of the SST at the position $(x, y)$ in the SST matrix.
(2) Define a minimal distance $D$ and set the first SST sample $(x_1, y_1, z_1)$ as the first center $c_1$.
(3) For the second SST sample $(x_2, y_2, z_2)$, calculate the Euclidean distance $s$ to the center $c_1$. If $s > D$, then the position $(x_2, y_2, z_2)$ is the next center $c_2$; otherwise the algorithm moves on to the next SST sample $(x_3, y_3, z_3)$.
(4) For the $i$-th SST sample $(x_i, y_i, z_i)$, calculate the Euclidean distances $\{s_k\}$ to each center $\{c_k\}$, $k = 1, \ldots, K$, where $K$ is the number of centers. If the minimal distance $s_m > D$, then the position $(x_i, y_i, z_i)$ is the next center $c_{K+1}$; otherwise the algorithm moves on to the next SST sample, until the last one has been processed.
(5) Fill the positions without SST samples with the values from the background field $T^b_t$, then repeat the distance test of steps (3) and (4) for each such position to select centers from the background field. This continues until all of the positions in the SST matrix have been processed, and the hidden knots $u_t$ are obtained from the positions of the centers $\{c_k\}$ in the SST matrix.
In the INNC algorithm, the parameters $K_t$ and $u_t$ are determined once the minimal distance $D$ is defined. The value of $D$ ranges from 0.2 to 1.5 in this study, and the optimal $D$ is the one with the minimal error variance $\sigma^2_{y_t}$, which yields the optimal $K_t$ and $u_t$ for the RBFN. Thus, the optimal $K_t$ and $u_t$ may vary with different quantities and distributions of SST samples. In addition, the SST values at the positions of the optimal hidden knots taken from the background fields are added to the SST samples, so that there is at least one SST sample in the SST field within the minimal distance $D$.
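The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the standardization is assumed to be computed over samples and background-filled positions together, and the population standard deviation is assumed. The `error_variance` argument of `best_distance` is a caller-supplied callback (for example, one that fits the RBFN with the selected knots), not something defined in the source.

```python
import math
from statistics import mean, pstdev

def standardize(points):
    """Scale each of the (x, y, z) columns to zero mean and unit standard deviation."""
    cols = list(zip(*points))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(p, mus, sds)) for p in points]

def innc(samples, background, D):
    """Select centers first from SST samples, then from background positions.

    samples, background: lists of (x, y, z) tuples; background holds the
    positions without samples, with z taken from the background field.
    Returns (indices into samples, indices into background).
    """
    pts = standardize(list(samples) + list(background))
    centers = []
    for i, p in enumerate(pts):  # samples come first, so they get priority
        # A point becomes a center only if it is farther than D from every center.
        if all(math.dist(p, pts[c]) > D for c in centers):
            centers.append(i)
    n = len(samples)
    return [c for c in centers if c < n], [c - n for c in centers if c >= n]

def best_distance(samples, background, candidates, error_variance):
    """Pick the D whose knots give the smallest error variance."""
    return min(candidates, key=lambda D: error_variance(*innc(samples, background, D)))
```

Because every point is visited exactly once, a single pass fixes both the number and the positions of the knots for a given D; only the outer search over candidate values of D requires refitting.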

Evaluating the Performance of the RBFN Method
As described above, a simplified multi-quadric function $\phi(z) = \sqrt{z^2 + s_1^2}$ (with $s_1 = 0$ here) is chosen as the basis function, and the INNC algorithm is designed to search for the optimal hidden knots for RBFNs. To evaluate the performance of this RBFN method, a Gaussian function $\exp(-s_2 z^2)$ is selected for comparison, with $s_2 = K_t/d_{\max}$ [30], where $d_{\max}$ is the maximum distance between the hidden knots; $d_{\max} = D$ in this study. The K-means algorithm and the Kohonen-map algorithm are also tested for selecting hidden knots for RBFNs. Because the number of hidden knots $K_t$ must be specified in advance for these two clustering algorithms, $K_t$ ranges from 20 to 120 for the K-means algorithm, and Kohonen maps with a regular array of $m \times m$ hidden knots are tested with $m$ ranging from 5 to 15. The optimal values of $K_t$ and $m$ are then selected separately according to the minimal error variance $\sigma^2_{y_t}$. More details about the K-means algorithm and the Kohonen-map algorithm are given in [31–33].
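The difference between the two basis functions can be seen in a short sketch. The Gaussian is written here as exp(−s2·z²), which is an assumed reading of the formula in the text, and the function names are hypothetical.

```python
import math

def multiquadric(z, s1=0.0):
    """sqrt(z^2 + s1^2); reduces to phi(z) = z for s1 = 0 and z >= 0."""
    return math.sqrt(z * z + s1 * s1)

def gaussian(z, s2):
    """exp(-s2 * z^2); decays quickly with distance z from the hidden knot."""
    return math.exp(-s2 * z * z)
```

The Gaussian response is 1 at the knot and falls toward 0 within a short distance, so each knot influences only its neighborhood, whereas the multi-quadric response grows with distance and every knot contributes across the whole field.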
Additionally, the OI method is used for comparison with this RBFN method. Since the values of the SST samples are assigned from the reference SST in this study, we adopt the noise-to-signal standard deviation ratio of AVHRR data for the SST samples (a value of 0.5) and the average correlation scales for the OI method (zonal and meridional correlation scales of 151 km and 155 km, respectively). Details of the OI method are described by Reynolds et al. [6].
To validate the accuracy of the reconstructed SSTs, they are compared with the reference SSTs, and four commonly used error metrics are calculated: root mean square error (RMSE), mean absolute error (MAE), Pearson correlation coefficient (R), and signal-to-noise ratio (SNR). The SNR is the ratio of the standard deviation of the SST results to the standard deviation of the errors [34].
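The four metrics can be computed as in the following minimal sketch, which uses only the Python standard library. The function name is hypothetical, and the use of population standard deviations for the SNR is an assumption, since the source does not say which estimator was used.

```python
import math
from statistics import mean, pstdev

def error_metrics(reconstructed, reference):
    """Return RMSE, MAE, Pearson R, and SNR for two equal-length sequences."""
    err = [a - b for a, b in zip(reconstructed, reference)]
    rmse = math.sqrt(mean(e * e for e in err))
    mae = mean(abs(e) for e in err)
    ma, mb = mean(reconstructed), mean(reference)
    cov = sum((a - ma) * (b - mb) for a, b in zip(reconstructed, reference))
    r = cov / math.sqrt(sum((a - ma) ** 2 for a in reconstructed)
                        * sum((b - mb) ** 2 for b in reference))
    # SNR: spread of the reconstructed SSTs relative to the spread of the errors.
    snr = pstdev(reconstructed) / pstdev(err)
    return {"RMSE": rmse, "MAE": mae, "R": r, "SNR": snr}
```

Lower RMSE and MAE and higher R and SNR indicate a better reconstruction. The SNR is undefined when all errors are identical, because the denominator is then zero.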

Results
In this section, RBFN methods with different basis functions and clustering methods are tested, and the performance of the RBFN method for reconstructing daily SSTs is evaluated by comparison with the OI method.

Results from Different Basis Functions
Using the INNC algorithm, the optimal hidden knots can be chosen from the SST samples and background fields. However, because the SST samples are very sparsely distributed in this study, the reconstructed SSTs from the RBFN method may be influenced by the type of basis function.
Figures 3 and 4 display the SST increments estimated using Equation (2) with the Gaussian function and the multi-quadric function on 13 June and 18 June, respectively. The SST increments obtained from the Gaussian function clearly contain many anomalous circular regions, as shown in Figures 3a and 4a, where the SST increments differ significantly from the values in the surrounding regions. In contrast, the SST increment fields obtained with the multi-quadric function in Figures 3b and 4b are quite close to their neighboring regions. Moreover, because of the characteristics of the Gaussian function, the effective areas of the hidden knots are limited, so the SST increments are strongly affected by the hidden knots selected from the background fields when there are few SST samples nearby, which may cause large errors in the SST increments.

The statistical results in Tables 1 and 2 demonstrate that the multi-quadric function is significantly better than the Gaussian function as the basis function for the RBFN method. On 18 June in particular, the RMSE from the multi-quadric function is 0.47 °C, considerably smaller than the value of 0.73 °C from the Gaussian function.

Results from Different Clustering Algorithms
An RBFN with too many or too few hidden knots is not conducive to simulating SST fields. Additionally, the distribution of hidden knots has a strong influence on the construction of an RBFN. Therefore, the quality of the reconstructed SSTs from the RBFN method is closely related to the quantity and distribution of hidden knots, and it is important to select suitable hidden knots for RBFNs.
Three clustering algorithms, namely the K-means algorithm, the Kohonen-map algorithm, and the INNC algorithm, are compared for selecting the optimal hidden knots for RBFNs on 13 June and 18 June. Since the RBFN is quite sensitive to the hidden knots, the values at the hidden knots are very important to the reconstructed SSTs. If a hidden knot is obtained from the background field, its value in the SST increment field of Equation (2) is close to 0, and the SST accuracy decreases in this situation. Thus, it is preferable to select the hidden knots from the SST samples rather than from the background field.
Figures 5 and 6 show the distributions of hidden knots from these three algorithms. More hidden knots are selected from the SST samples (green points) by the INNC algorithm than by the other two algorithms. This is because the clustering centers of the K-means algorithm and the Kohonen-map algorithm are chosen by training on the input samples, while the INNC algorithm selects the hidden knots directly from the SST samples and ensures that as many hidden knots as possible are obtained from SST samples within the minimal distance D. As the statistical results in Tables 1 and 2 show, the INNC algorithm yields the highest accuracy of reconstructed SSTs in this study and is more suitable than the K-means algorithm and the Kohonen-map algorithm for selecting the optimal hidden knots for the RBFN method.

Comparison with the OI Method
Because of the limited correlation scales in the OI method, the SST increments approach 0 where there are no SST samples nearby, while the SST increments from the RBFN method are obtained from the tendency of the whole SST increment field. This is the key difference between the OI method and the RBFN method for reconstructing SST.
Figure 7 displays the time series of each error metric from the OI and RBFN methods during the period from 12 June to 18 June 2014. The RBFN method performs much better than the OI method on every error metric. The average values of these error metrics in Table 3 indicate that the RBFN method increases R and SNR from 0.96 and 3.82 to 0.98 and 4.94, respectively, and decreases RMSE and MAE from 0.69 °C and 0.46 °C to 0.48 °C and 0.35 °C, respectively. Although the RBFN method requires more computation time than the OI method, owing to the selection of optimal hidden knots with the INNC algorithm, it reconstructs SST more accurately than the OI method.
Remote Sens. 2017, 9, 1204
In Figures 8 and 9, the reconstructed SSTs from 13 June and 18 June are displayed as examples. The results show that the reconstructed SSTs from the OI method (shown in Figures 8b and 9b) and the RBFN method (shown in Figures 8d and 9d) are quite similar to their reference SSTs (see Figures 8c and 9c). However, the SSTs from the OI method show a markedly high SST anomaly in the region outlined by the square, a feature that is not evident in the SSTs from the RBFN method or in the reference data. This indicates that the RBFN method performs relatively better than the OI method in reconstructing SSTs when the original SST samples are quite sparse, as shown in Figures 8a and 9a.
However, the global distribution of original SST samples is much more complex than that in local regions. As shown in Figure 10, their distribution over the global ocean is very irregular on 13 June 2014, and the coverage of these SST data is incomplete, especially in the equatorial regions, where the data are relatively limited. Therefore, to test the performance of the RBFN method, we run the RBFN using the global SST samples and compare the results with the OI method. It should be noted that both the OI method and the RBFN method in this study use the previous OISST product as the background field, and that the global SST field from the RBFN method is assembled from 20° × 20° boxes of reconstructed SSTs, each produced with the same scheme as the local experiment in Section 3.
Figure 11 displays the global distributions of SST biases from the OI method and the RBFN method on 13 June 2014, relative to the corresponding reference SST. The results show that the variations of SST biases from the two methods are quite similar, but the SST biases from the RBFN method are relatively greater than those from the OI method where the SST samples in Figure 10 are not sparsely distributed. This indicates that the RBFN method might not be as good as the OI method when the coverage of SST samples is relatively high. Additionally, compared with the global statistical results from the OI method in Table 4, the RMSE and MAE values from the RBFN method are slightly larger, and its SNR value is relatively smaller. Thus, the performance of the RBFN method in this study is not very stable across different distributions of SST samples, and the method still needs to be improved before it is applied globally.

The INNC Algorithm for RBFNs
The hidden knots for RBFNs are commonly selected by clustering algorithms [28,29,35] or random sampling [24]. Since hidden knots with different quantities and distributions must be tested in RBFNs, huge computational resources are required when random sampling is used to select the optimal hidden knots from SST fields. Commonly used clustering algorithms for RBFNs, such as the K-means algorithm and the Kohonen-map algorithm, only determine the distribution of hidden knots once the number of hidden knots is given [32,35]. The K-means algorithm must therefore be executed hundreds of times, which takes a lot of time even though the algorithm converges quickly. For the Kohonen-map algorithm, an input vector is related not only to the nearest hidden knot but also to its neighboring hidden knots, so the computation is also heavy, and the training step for the clustering centers consumes a lot of time before the iteration ends. With the INNC algorithm and its parameter D, in contrast, both the quantity and the distribution of hidden knots can be determined directly without any iterative computation; the values of D in this study were chosen from 0.2 to 1.5. Thus, the INNC algorithm is more efficient in selecting the optimal hidden knots for RBFNs.
Additionally, the INNC algorithm selects the hidden knots from SST samples and background fields within the minimal distance D, and the values of the hidden knots from the background fields are then added to the SST samples, which ensures that the distributions of the hidden knots and the SST samples are close to uniform in the SST field. Meanwhile, the multi-quadric basis function for RBFNs has a large domain of influence in the SST field, so the values from both the SST samples and the background fields contribute to the reconstructed SSTs when there are few SST samples nearby.

SST Samples
The coverage of the SST samples that we selected in the Pacific Ocean is very low because the tropical storm Hagibis came with heavy clouds, which degraded the quality of the satellite-retrieved SSTs, so the SSTs that we could use are very limited. Theoretically, the errors of reconstructed SSTs come from both the background fields and the SST increments. The SST increments are estimated by using the current SST samples, and the quality of the background fields is strongly influenced by the previous SST samples; thus, the distributions of SST samples at both the previous time and the current time play a significant role in reconstructing SST. This explains why the minimal RMSE in Figure 7a did not occur on 15 June, which has the highest coverage of SST samples, as shown in Figure 1b. Because 14 June, with the lowest SST coverage in the study, has a low accuracy of reconstructed SST and therefore provides a poor-quality background field for 15 June, the accuracy of the reconstructed SST on 15 June is not the best even though that day has the highest coverage of SST samples.
The quality of SST samples depends not only on their coverage but also on their accuracy. Since the SST sample values that were used were assigned from the reference SSTs, which can be considered as non-biased SST data, the reconstructed SSTs are of relatively high quality. However, satellite-retrieved SST is often contaminated by the atmosphere, and SST data from different sources may contain errors from their respective surveying systems, meaning that it is not easy to obtain non-biased SST data, and the accuracy will decrease if large errors exist in the SST samples.
SST samples with high coverage are conducive to acquiring high-quality daily SSTs. However, acquiring larger quantities of SST samples often requires combining different sources of SSTs, so more SST errors may come from these different kinds of sensors, which has a negative influence on the overall quality of the SST samples. To obtain high-accuracy SST data from the RBFN method, more SST data should be used in reconstructing the SST. Additionally, strict quality control is required for SSTs from various measurements, and bias corrections for satellite-retrieved SSTs are necessary to eliminate the errors in SST samples.

The Performance of the RBFN Method
Since the OI method estimates SSTs using only the information within a limited distance, the estimated SSTs in some areas rely on the background fields, which creates large errors in the SST fields. RBFNs with enough hidden knots use all the SST samples to approximate the SST field, so the SSTs from the RBFN method contain more information on the whole SST field, not only the information about SST samples nearby. Thus, the accuracy of the SST obtained using the RBFN method is better than that obtained using the OI method when the distribution of SST samples is quite sparse.
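The global character of an RBFN estimate can be sketched with a deliberately simplified case in which every sample is used as a hidden knot, so that the output weights follow from a square linear system. The multi-quadric form, the shape parameter `c`, the toy knot positions and the increment values are all assumptions for illustration; the network used in this study selects its hidden knots with the INNC algorithm instead.

```python
import math

def multiquadric(r, c=1.0):
    """Multi-quadric basis function (assumed form sqrt(r^2 + c^2))."""
    return math.sqrt(r * r + c * c)

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= f * M[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def fit_rbfn(knots, values):
    """Output weights such that the network reproduces the values at the knots."""
    A = [[multiquadric(math.dist(p, q)) for q in knots] for p in knots]
    return solve(A, values)

def predict(knots, weights, point):
    """Weighted sum over ALL hidden knots, not only the nearby ones."""
    return sum(w * multiquadric(math.dist(point, q)) for w, q in zip(weights, knots))

# Toy SST increments (degrees C) at five hypothetical knot positions.
knots = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
values = [0.3, -0.1, 0.2, 0.0, 0.5]
weights = fit_rbfn(knots, values)
print(predict(knots, weights, (1.5, 0.5)))
```

Because `predict` sums over every knot, an estimate at a position with no samples nearby is still constrained by the whole set of samples, in contrast to an estimate built only from observations within a limited distance.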
As shown in Figure 7a, the RMSE on 15 June from the RBFN method is significantly smaller than that from the OI method when the quality of the background field is quite poor. This shows that, compared with the SSTs from the OI method, the quality of the SSTs from the RBFN method is more stable when the SST errors are associated with the background field.
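For reference, the error statistics used in such comparisons can be computed as in the following minimal sketch. It uses the standard definitions of RMSE, MAE and the Pearson correlation coefficient; the sample values are invented for illustration, and the SNR is omitted because its definition is not given in this section.

```python
import math

def rmse(rec, ref):
    """Root mean square error between reconstructed and reference SSTs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(rec, ref)) / len(ref))

def mae(rec, ref):
    """Mean absolute error between reconstructed and reference SSTs."""
    return sum(abs(a - b) for a, b in zip(rec, ref)) / len(ref)

def pearson_r(rec, ref):
    """Pearson correlation coefficient between the two SST series."""
    mean_rec = sum(rec) / len(rec)
    mean_ref = sum(ref) / len(ref)
    cov = sum((a - mean_rec) * (b - mean_ref) for a, b in zip(rec, ref))
    var_rec = sum((a - mean_rec) ** 2 for a in rec)
    var_ref = sum((b - mean_ref) ** 2 for b in ref)
    return cov / math.sqrt(var_rec * var_ref)

# Invented SST values in degrees C, for illustration only.
reconstructed = [28.0, 29.0, 30.0, 31.0]
reference = [28.0, 29.0, 30.0, 33.0]
print(rmse(reconstructed, reference),
      mae(reconstructed, reference),
      pearson_r(reconstructed, reference))
```

In this toy example a single large error dominates the RMSE more than the MAE, which is why the two statistics are usually reported together.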
On the other hand, the OI method is effective in estimating SST using neighboring samples, which is conducive to obtaining a higher accuracy of SSTs when the coverage of the SST samples is very high. In contrast, the quantity and distribution of hidden knots in RBFNs are determined by the minimal error variance, which is a global variable of the SST field, so the distribution of hidden knots may not be suitable in some local regions, and SST errors of RBFNs may occur even in areas with a large number of SST samples nearby. This may be the reason why the RBFN method is not as effective as the OI method in the global region. Both the OI method and the RBFN method are designed to minimize the errors of the SST field, and if the SST samples are sparsely distributed, the distribution of expected errors is relatively uniform in the SST field. In this case, a global parameter of the RBFN is suitable for the whole SST field. However, when the coverage of SST samples is high, the expected errors of the RBFN in the regions with SST samples are much smaller than those in the regions without SST samples nearby. The expected errors are then distributed unevenly in the SST field, and a global parameter may not allow the RBFN to simulate high-quality SST, which may make the errors much larger than those of the OI method, as shown in Figure 11.
In addition, although the INNC algorithm with D is very efficient in searching for hidden knots in SST fields, the optimal D of the INNC algorithm has to be selected from the values between 0.2 and 1.2, which makes the RBFN method more computationally expensive than the OI method. Overall, the OI and RBFN methods have their own characteristics in terms of reconstructing SST, but the RBFN method is more effective than the OI method when the SST samples are sparsely distributed.

Conclusions
Reconstructing SST fields from a limited number of SST samples is important for data applications. In this study, an RBFN method is proposed and an INNC algorithm is designed to search for the optimal hidden knots for RBFNs. Compared with other clustering algorithms, such as the K-means algorithm and the Kohonen-map algorithm, the INNC algorithm can obtain more hidden knots from the SST samples, which contributes to a high quality of reconstructed SST. Using the multi-quadric function as the basis function is more effective than using the Gaussian function in avoiding SST increment anomalies in the regions of hidden knots. Thus, the INNC algorithm with the multi-quadric function is quite suitable for the RBFN method to reconstruct SSTs in the study.
To evaluate the accuracy of this RBFN method, SST fields are reconstructed from SST samples with low coverage by using the RBFN and OI methods, respectively. The results show that, compared with the SSTs from the OI method, the quality of the SSTs from the RBFN method is more stable when the SST errors are associated with the background field. Moreover, the RBFN method is less affected by missing values in the SST samples, and the SSTs from the RBFN method have a higher accuracy than those from the OI method when SST samples are sparsely distributed. Because of this characteristic, the RBFN method is quite suitable for SST reconstruction that combines only in-situ data and satellite-retrieved SSTs from infrared radiometers.
Regarding the efficiency of RBFNs, hidden knots with different quantities and distributions are determined simultaneously using various D values in the INNC algorithm, and the optimal D for the RBFN is selected with the minimal error variance. The step length of D is set to 0.02 in this study. We believe that the accuracy may be improved if a smaller step length of D is used.
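The selection of the optimal D can be sketched as a simple sweep over candidate values. The sketch assumes the range from 0.2 to 1.5 quoted for D in the description of the INNC algorithm together with the step length of 0.02; `toy_error_variance` is a hypothetical stand-in for fitting the RBFN with a given D and returning its error variance, which is not reproduced here.

```python
def d_candidates(start=0.2, stop=1.5, step=0.02):
    """Candidate D values from start to stop (inclusive) with the given step."""
    n = int(round((stop - start) / step)) + 1
    # Rounding removes floating-point noise such as 0.6400000000000001.
    return [round(start + i * step, 10) for i in range(n)]

def select_optimal_d(candidates, error_variance):
    """Return the candidate D with the smallest error variance."""
    return min(candidates, key=error_variance)

def toy_error_variance(d):
    """Hypothetical placeholder: a real implementation would fit the RBFN
    with the hidden knots obtained for this D and return its error variance."""
    return (d - 0.64) ** 2 + 0.2

candidates = d_candidates()
best_d = select_optimal_d(candidates, toy_error_variance)
print(len(candidates), best_d)
```

Under these assumptions the sweep requires one RBFN fit per candidate; halving the step length to 0.01 would roughly double the number of fits, which is the trade-off between accuracy and computation implied above.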
However, the accuracy of reconstructed SSTs is strongly influenced by the quality of the SST samples from both the previous and the current time, and biases in the SST samples will increase the errors in the SSTs; thus, strict quality control and bias corrections of the SST samples are required in practice. Although the RBFN method performs better than the OI method in reconstructing the SST field from limited SST samples, the advantage is not obvious when the coverage of SST samples is high (the details are described in the Supplemental Material). At the present stage, this RBFN method is therefore appropriate for some local regions rather than for the global region, and we will improve it in our future work.

Figure 1. (a) The study area (10°N-30°N, 115°E-135°E) in the red rectangle and the track of the tropical storm Hagibis labeled by day. The white and green areas in (a) show land and ocean, respectively; (b) The coverage of sea surface temperature (SST) samples in the study region during the period from 12 June to 18 June 2014.

Figure 2. Schematic diagram of the RBFN method for reconstructing SST.

(3) For the second SST sample $(x_2, y_2, z_2)$, the Euclidean distance $s$ to the center $c_1$ is calculated. If $s > D$, then the position $(x_2, y_2, z_2)$ is the next center $c_2$; otherwise the algorithm searches for the next SST sample.

(4) For the $i$-th SST sample $(x_i, y_i, z_i)$, the Euclidean distances $s_k$ to the centers $c_k$ are calculated, $k = 1, \ldots, K$, where $K$ is the number of centers. If the minimal distance $s_m > D$, then the position $(x_i, y_i, z_i)$ is the next center $c_{K+1}$; otherwise the algorithm searches for the next SST sample, until the last one is found.

(5) The values from the background field $T_t^b$ are used to fill the positions without SST samples, before repeating step (3) for each position to select the centers from the background field. This continues until all of the positions in the SST matrix are processed, and the hidden knots $u_t$ are obtained by using the positions of the centers $\{c_k\}$ in the SST matrix.

In terms of the INNC algorithm, the parameters $K_t$ and $u_t$ are determined when the minimal distance $D$ is defined. The value of $D$ ranges from 0.2 to 1.5 in this study, and the optimal $D$ is selected with the minimal error variance.
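The center-selection steps can be sketched as a single pass over the positions, first over the SST samples and then over the positions filled from the background field. This is a minimal illustration under stated assumptions: positions are treated as 2-D coordinates, the distance is the plain Euclidean distance in the same units as D, and the names `select_centers` and `innc_centers` are hypothetical rather than taken from the study.

```python
import math

def select_centers(positions, centers, D):
    """Append every position whose minimal distance to the current centers exceeds D."""
    for p in positions:
        if not centers or min(math.dist(p, c) for c in centers) > D:
            centers.append(p)
    return centers

def innc_centers(sample_positions, background_positions, D):
    """Select centers from the SST samples first, then from the background field."""
    centers = select_centers(sample_positions, [], D)
    n_from_samples = len(centers)
    centers = select_centers(background_positions, centers, D)
    return centers, n_from_samples

# Toy positions (arbitrary units): three SST samples, three background positions.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0)]
background = [(0.5, 0.0), (2.0, 0.0), (1.05, 0.0)]
centers, n_from_samples = innc_centers(samples, background, D=0.4)
print(centers, n_from_samples)
```

In this sketch both the number of centers and their distribution follow from a single non-iterative pass once D is fixed: a larger D yields fewer, more widely spaced centers, and positions from the background field are only added where no sample-based center lies within D.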

Figure 3. Reconstructed SST increments on 13 June 2014 obtained by using (a) the Gaussian function and (b) the multi-quadric function. White areas in the images are land.

Figure 4. Reconstructed SST increments on 18 June 2014 obtained by using (a) the Gaussian function and (b) the multi-quadric function. White areas in the images are land.

Figure 5. The distributions of hidden knots for the radial basis function network (RBFN) on 13 June 2014 from (a) the K-means algorithm; (b) the Kohonen-map algorithm and (c) the improved nearest neighbor cluster (INNC) algorithm, respectively. The red points and green points are the locations of hidden knots for RBFNs selected from the background field and the SST samples, respectively.

Figure 6. The distributions of hidden knots for the RBFN on 18 June 2014 from (a) the K-means algorithm; (b) the Kohonen-map algorithm and (c) the INNC algorithm, respectively. The red points and green points are the locations of hidden knots for RBFNs selected from the background field and the SST samples, respectively.

Figure 7. (a) Root mean square errors (RMSEs); (b) mean absolute errors (MAEs); (c) Pearson correlation coefficients (Rs) and (d) signal-to-noise ratios (SNRs) of the reconstructed SSTs from the optimum interpolation (OI) and RBFN methods in the study region during the period from 12 June to 18 June 2014.

Figure 8. The distributions of (a) original SST samples; (b) OI SST; (c) reference SST; and (d) RBFN SST on 13 June 2014. The region outlined by the square is used for comparison in detail. White areas in the images indicate no data or land.

Figure 9. The distributions of (a) original SST samples; (b) OI SST; (c) reference SST; and (d) RBFN SST on 18 June 2014. The region outlined by the square is used for comparison in detail. White areas in the images indicate no data or land.

Figure 10. The global distribution of original SST samples on 13 June 2014. White areas in the image indicate no data or land, and the coastlines are highlighted with black contours.

Figure 11. The global distributions of (a) OI SST minus reference SST and (b) RBFN SST minus reference SST on 13 June 2014. White areas in the images indicate no data or land.

Table 1. Statistical results from different basis functions and clustering methods on 13 June 2014. The subscripts G and M stand for Gaussian function and multi-quadric function, respectively. Columns: RMSE (°C), MAE (°C), R, SNR.

Table 2. Statistical results from different basis functions and clustering methods on 18 June 2014. The subscripts G and M stand for Gaussian function and multi-quadric function, respectively.

Table 3. Statistical results from the OI and RBFN methods.

Table 4. Global statistical results from the OI and RBFN methods on 13 June 2014.
