Neural Network Approach to Retrieving Ocean Subsurface Temperatures from Surface Parameters Observed by Satellites

Abstract: The extraction of physical information about the subsurface ocean from surface information obtained from satellite measurements is both important and challenging. We introduce a backpropagation neural network (BPNN) method to determine the subsurface temperature of the North Pacific Ocean by selecting the optimum input combination of sea surface parameters obtained from satellite measurements. In addition to sea surface height (SSH), sea surface temperature (SST), sea surface salinity (SSS) and sea surface wind (SSW), we also included the sea surface velocity (SSV) as a new component in our study. This allowed us to partially resolve the non-linear subsurface dynamics associated with advection, which improved the estimated results, especially in regions with strong currents. The accuracy of the estimated results was verified with reprocessed observational datasets. Our results show that the BPNN model can accurately estimate the subsurface (upper 1000 m) temperature of the North Pacific Ocean. The corresponding mean square errors were 0.868 and 0.802 using four (SSH, SST, SSS and SSW) and five (SSH, SST, SSS, SSW and SSV) input parameters and the average coefficients of determination were 0.952 and 0.960, respectively. The input of the SSV in addition to the SSH, SST, SSS and SSW therefore has a positive impact on the BPNN model and helps to improve the accuracy of the estimation. This study provides important technical support for retrieving thermal information about the ocean interior from surface satellite remote sensing observations, which will help to expand the scope of satellite measurements of the ocean.


Introduction
The datasets provided by satellite remote sensing have promoted ocean research. However, these datasets are mostly confined to the sea surface because it is difficult for electromagnetic radiation to reach the interior of the ocean [1]. Although large-scale in situ observation projects, such as the Argo buoy, have improved deep ocean data mining [2], the existing observational datasets of the subsurface still do not satisfy the needs of research into the internal processes of the oceans [3,4]. We need to obtain more information about the interior of the ocean because most of the important oceanographic phenomena exist below the surface and these phenomena are useful in studying both the characteristics of the ocean and global climate change [5][6][7][8].
In recent decades, most of the additional heat in the Earth's climate system has entered the oceans [9] and therefore heating of the ocean interior has become an important factor in slowing down global warming [10][11][12][13]. Several studies have shown that the current stagnation in the increase in the sea surface temperature (SST) is reflected in the warming of the upper ocean [14][15][16]. Changes in the ocean system have important impacts on the atmospheric circulation and global climate change [17]. The air-sea interactions caused by the thermal difference between the land and the sea can affect regional and global large-scale circulation systems, leading to disastrous weather and climate events, such as storm surges [18][19][20][21] and super typhoons [22][23][24]. The change in the thermal condition of the oceans has a very close relationship with monsoon activity [25], affecting both the monsoon circulation and precipitation in the monsoon region, and influencing interannual and interdecadal changes in the regional climate [26][27][28]. Model results have also shown that warming of the upper ocean may be associated with some climate events, such as La Niña and the El Niño-Southern Oscillation (ENSO) [7,29]. However, there have been relatively few studies considering the subsurface as a result of the lack of subsurface data. The subsurface temperature is less affected by the air-sea interface than the SST and its interannual variation is more significant. It therefore has a greater role in climate trends, which means that it is important to explore the internal thermal structure of the ocean.
Although the ocean interior cannot be observed directly by satellites, it is possible to estimate subsurface information from sea surface parameters [6]. Statistical methods are commonly used to obtain information about the subsurface or deep ocean from surface satellite remote sensing data [30][31][32]. Fox [33] built one of the earliest systems used to retrieve 3D information about the ocean through multivariate linear regression: the Modular Ocean Data Assimilation System (MODAS). MODAS has been used to retrieve subsurface temperature/salinity fields through similar techniques using datasets from satellite remote sensing [34]. Global 3D temperature, salinity and velocity fields have been reconstructed through a combination of multiple linear regression and the optimum interpolation method using Argo temperature-salinity profiles and satellite remote sensing datasets [35,36]. Empirical orthogonal function (EOF) analysis is another important statistical method based on a mode-extracting approach to project surface information to subsurface fields [3,37,38]. Based on EOF analysis, the multivariate EOF reconstruction (mEOF-R) method was developed to reconstruct 3D fields of the ocean [39][40][41]. This method performed well in recent studies of the Southern Ocean [42,43].
Dynamic approaches such as the surface quasi-geostrophic theory (SQG) method have been developed to reconstruct 3D ocean fields [44][45][46][47]. An effective SQG (eSQG) approach, which takes the potential vorticity into account, was used to retrieve information about the subsurface ocean using surface parameters such as the sea surface height (SSH) and sea surface density (SSD) [48]. The eSQG approach has been widely applied [49][50][51][52][53][54][55][56][57]. However, the SQG methods only perform well in the upper ocean (above 500 m) and are based on the assumption that there is a good correlation between the SSH and SSD. To address this shortcoming, the interior plus surface quasi-geostrophic (isQG) method was proposed [58,59] and has been validated by both numerical and observational methods [60,61]. Yan [62] proposed an SQG-mEOF-R approach combining the advantages of the mEOF and SQG methods to retrieve information about the ocean interior from surface parameters.
With the development of artificial intelligence, technical routes have been proposed that are different from traditional methods. Ali [1] proposed using neural networks to estimate the subsurface thermal structure of the ocean with the SST, SSH, sea surface wind (SSW), net radiation and net heat flux as the input parameters. Based on a self-organizing map (SOM) neural network, Wu [63] used the SSH, SST and sea surface salinity (SSS) to estimate subsurface temperature anomalies and obtained good retrieval results in the depth range 30-700 m. Bao [64] proposed using a generalized regression neural network with the fruit fly optimization algorithm to retrieve the salinity profile of the Pacific Ocean and achieved better results at the halocline than those obtained using a linear methodology. Su [65] used a support vector machine to retrieve the subsurface temperature of the Indian Ocean and a random forest algorithm [66] to retrieve the subsurface temperature of the global ocean using the SSH, SST, SSS and SSW, which proved that the SSS and SSW are helpful in improving the accuracy of the estimation and provided a technical method of measuring the subsurface temperature of the ocean. Su [67] then improved the accuracy of the estimation by using a geographically weighted regression model that took the distribution of the ocean fields into consideration. K-means clustering and feed-forward artificial neural networks were combined to estimate the subsurface temperature and achieve good results down to 1900 m [68]. 
Valuable research on retrieving the internal characteristics of the ocean from sea surface parameters has been carried out, for example, using neural network approaches to retrieve the temperature-salinity profile [69], combining an empirical mode projection with satellite altimetry to estimate the 3D structure of the Southern Ocean [70] and taking advantage of both convolutional neural networks and long short-term memory networks to reconstruct the 3D ocean temperature [71].
Sea surface datasets from satellite remote sensing are becoming more important in retrieving subsurface parameters. However, the earlier methods generally rely on statistical or dynamic techniques and do not use artificial intelligence algorithms. The machine learning approaches that have been applied still do not make full and efficient use of sea surface parameters. There is still a need to improve the accuracy of these methods and the retrieval results based on sea surface parameters. Back-propagation neural networks (BPNNs) have been proved to be an extraordinarily useful method in data classification and regression [72][73][74]. We propose a BPNN model to retrieve ocean subsurface temperatures using the SST, SSH, SSS, SSW and sea surface velocity (SSV). The use of the SSV as a nonlinear advection term has not been reported previously. The accuracy of our results was evaluated by comparing the estimated temperature field with the reprocessed observational temperature field.
The remainder of this paper is organized as follows. Section 2 introduces the datasets for the study area and the BPNN model. Section 3 evaluates the performance of the BPNN model in estimating the subsurface temperature fields and compares the results estimated by the BPNN model with four and five input parameters. Section 4 presents a short discussion. Our conclusions are presented in Section 5.

Study Area, Datasets and Methods
Research on the North Pacific Ocean (NPO) has increased in recent years, with an increasing emphasis on air-sea interactions at mid-latitudes. The NPO is very sensitive to global climate change and has a significant impact on the weather and ecological environment of the neighboring regions of East Asia and North America [75][76][77]. Water temperature is the most important indicator of the thermal state of the ocean and fluctuations will lead to changes in the ocean-atmospheric circulation, which, in turn, affects the Earth's climate [78][79][80]. It is therefore worth studying the NPO region, especially its thermal structure. The study area of the NPO considered here is located at (0.125-69.875°N, 110.125°E-99.875°W).
We used datasets for the SSH, 3D temperature (3DT), SSS, SSV and SSW. The SSH, 3DT, SSS and SSV data were obtained from the Copernicus Marine Environment Monitoring Service (CMEMS; https://marine.copernicus.eu/). Among these, the 3DT is a level 4 gridded reprocessed dataset obtained by combining Argo and satellite remote sensing data [3,34]. Errors on temperature are about 0.8 °C at 100 m and 0.2 °C at 1000 m. The SSH and SSV data were provided by AVISO altimetry (AVISO, http://www.aviso.altimetry.fr/) and are estimated by optimal interpolation, merging the measurements from different altimeter missions (Jason-3, Sentinel-3A, HY-2A, Saral/AltiKa, Cryosat-2, Jason-2, Jason-1, T/P, ENVISAT, GFO, ERS1/2); the data can be downloaded from CMEMS. The errors of SSH have been estimated to be lower than 0.5 mm/year at global scale, and lower than 3 mm/year at regional scale. Errors on SSV have been estimated to range between 10 and 16 cm/s depending on the ocean surface variability. The SSS data were obtained from level 4 re-analyzed datasets through multi-dimensional interpolation of the Soil Moisture and Ocean Salinity (SMOS) product SSS with surface temperature and in situ salinity data [81]. RMS differences of SSS are about 0.1 psu. The 3DT dataset contains 17 depth levels and we used the depth range 0-1000 m. The first depth level (0 m) of the 3DT dataset is the SST. The other 16 depth levels (30, 50, 75, 100, 125, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 and 1000 m) are used as labels for the training, testing and evaluation of the accuracy of the BPNN model and are referred to here as the observed temperatures. The SSW was provided by the Cross-Calibrated Multi-Platform (CCMP; https://rda.ucar.edu/datasets/ds745.1/) wind velocity product, which combines multisource surface wind data using a variational analysis method to produce high-resolution gridded analyses.
To avoid differences between individual years and days, all the datasets used in this study were monthly mean values covering the same time period from January 2005 to December 2015 at a spatial resolution of 0.25° × 0.25° and covering the same region of the NPO.
Back-propagation is one of the most important and commonly used supervised learning algorithms for artificial neural networks [82]. According to the Kolmogorov-Arnold-Sprecher theorem, a three-layer (input-hidden-output) neural network can be used to approximate any continuous function [83,84], although both the operational efficiency and the accuracy of the results should be taken into consideration when deciding the number of neurons in the hidden layer [85]. After testing, we decided to use a neural network with a single hidden layer of 15 units, in which the input layer receives the surface features and the output layer gives the predicted temperature field at each depth. The back-propagation algorithm was used for model training.
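The architecture described above can be illustrated with a minimal NumPy sketch of a single-hidden-layer network with 15 hidden units trained by back-propagation. This is not the authors' implementation: the tanh activation, the weight initialization, the learning rate, the use of plain full-batch gradient descent and all function names (`init_params`, `forward`, `backprop_step`) are assumptions made for illustration only.

```python
import numpy as np

def init_params(n_inputs, n_hidden=15, n_outputs=1, seed=0):
    """Create weights for an input-hidden-output network (15 hidden units by default)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_inputs), (n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }

def forward(params, X):
    """Return hidden activations and the linear output for inputs X of shape (n, n_inputs)."""
    H = np.tanh(X @ params["W1"] + params["b1"])  # assumed activation
    Y = H @ params["W2"] + params["b2"]
    return H, Y

def backprop_step(params, X, T, lr=0.01):
    """One gradient-descent step on the MSE; returns the MSE measured before the update."""
    H, Y = forward(params, X)
    err = Y - T
    mse = float(np.mean(err ** 2))
    # Gradients of the MSE, propagated backwards from the output layer.
    dY = 2.0 * err / err.size
    dW2 = H.T @ dY
    db2 = dY.sum(axis=0)
    dZ = (dY @ params["W2"].T) * (1.0 - H ** 2)  # derivative of tanh
    dW1 = X.T @ dZ
    db1 = dZ.sum(axis=0)
    params["W1"] -= lr * dW1
    params["b1"] -= lr * db1
    params["W2"] -= lr * dW2
    params["b2"] -= lr * db2
    return mse
```

In this sketch `X` would hold one row per ocean pixel and one column per surface feature, and `T` the observed temperature at a single depth level.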
We used the SSH, SST, SSS, SSW and SSV in the NPO as the input or independent parameters for the BPNN model and the output or dependent parameters were the temperature at 16 depth levels. As vectors, the SSW and SSV are divided into two components: (SSW_u, SSW_v) and (SSV_u, SSV_v), where the subscripts u and v represent the zonal (positive for eastward and negative for westward) and meridional (positive for northward and negative for southward) components, respectively. Figure 1 shows the sea surface parameters (SSH, SST, SSS, SSW and SSV) and the subsurface temperature field (e.g., 500 m depth in March 2015) for the study area. These parameters have their own characteristic spatial distribution. The SSH presents a significant block pattern distribution over the NPO, with an especially high-value area in the southwestern NPO and a low-value area in the northern NPO (Figure 1a). The SST shows a clear zonal distribution in which the temperature decreases significantly as the latitude increases (Figure 1b). In general, the SSS has the highest value around the middle of the NPO and along the equator (Figure 1c). The components of the SSW and SSV show that their spatial distributions are very different. The SSW shows a wide range of regional characteristics, whereas the SSV presents an alternating distribution over the NPO (Figure 1d-g).
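To make the input layout concrete, the sketch below stacks gridded surface fields into a feature matrix with one row per ocean pixel. Because the SSW and SSV each contribute two components, the four-parameter combination gives five columns and the five-parameter combination gives seven. The field names, the dictionary layout and the use of NaN to mark land or missing pixels are assumptions for illustration and are not taken from the datasets themselves.

```python
import numpy as np

# Hypothetical field names; each vector quantity is split into u and v components.
FOUR_PARAM = ["ssh", "sst", "sss", "ssw_u", "ssw_v"]
FIVE_PARAM = FOUR_PARAM + ["ssv_u", "ssv_v"]

def build_features(fields, names):
    """Stack 2D (lat, lon) fields into an (n_valid_pixels, n_features) matrix.

    Pixels where any selected field is NaN (assumed to mark land or gaps)
    are dropped; the boolean mask of kept pixels is returned as well.
    """
    stacked = np.stack([np.asarray(fields[name], dtype=float).ravel()
                        for name in names], axis=1)
    valid = ~np.isnan(stacked).any(axis=1)
    return stacked[valid], valid
```

The returned mask can be used to select the matching pixels of the target temperature field at a given depth.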
Considering that each depth level has its own unique physical state, the links between its temperature field and the surface parameters should be unique. Therefore, we need to train the subsurface temperature estimation model of each depth level separately. In other words, for each depth level, we will have a corresponding temperature estimation model, which means there will be 16 different models in total because we have 16 different depth levels. Specifically, Figure 2 shows the flow chart for our subsurface temperature estimation algorithm. First, we randomly divided the datasets for the 120 months from January 2005 to December 2014 into two groups: one group used 60% of the data as the training set and the other group used 40% of the data as the validation set. This proportional distribution is reasonable because it is based on a comprehensive consideration of our large dataset and previous research experience [1,63,65,66,67,68,86,87]. The training set was used to train the BPNN model and the validation set was used to validate the model during the training period to prevent overfitting and to optimize the weight values to enhance the prediction ability of the model. We then took the observed temperature at each depth level as the label (or target) and used the back-propagation algorithm for training. The neural networks stopped training and stored the trained weights of the model when the number of iterations reached 1000 or when the mean square error (MSE) between the predicted and observed results was <0.01 °C. The trained BPNN model was then used to predict the output of the test set (the dataset for the 12 months of 2015) at the corresponding depth layers.
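The data split and stopping rule described above can be sketched as follows. The code assumes that the random division is made over whole months (the text does not say whether months or individual samples are shuffled) and that the training step is supplied as a callable returning the current MSE; the names `split_months`, `train_until_converged` and `step_fn` are hypothetical.

```python
import numpy as np

# One model is trained per depth level (16 models in total).
DEPTHS_M = [30, 50, 75, 100, 125, 150, 200, 250,
            300, 400, 500, 600, 700, 800, 900, 1000]

def split_months(n_months=120, train_fraction=0.6, seed=0):
    """Randomly assign month indices to training (60%) and validation (40%) sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_months)
    n_train = int(round(train_fraction * n_months))
    return np.sort(order[:n_train]), np.sort(order[n_train:])

def train_until_converged(step_fn, max_iters=1000, mse_tol=0.01):
    """Call step_fn() until max_iters is reached or the returned MSE drops below mse_tol.

    Returns the number of iterations performed and the last MSE.
    """
    iterations, mse = 0, float("inf")
    for iterations in range(1, max_iters + 1):
        mse = step_fn()
        if mse < mse_tol:
            break
    return iterations, mse
```

A full run would loop over `DEPTHS_M`, build a fresh network for each depth, and pass its training step to `train_until_converged`; the 12 months of 2015 stay outside both index sets as the test period.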
The MSE is the average squared difference between the output temperature and the target temperature and is mainly used to evaluate the performance of the BPNN model for different combinations of input parameters at the same depth level. The coefficient of determination (R²) is the proportion of the variance in the dependent variable that is predictable from the independent variables. R² can be used to represent the correlation between the estimated and observed temperature field and to evaluate the performance of the model at different depth levels, in addition to evaluating the performance of the model under different combinations of parameter inputs. A lower MSE and a higher R² represent a better performance of the BPNN model.
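The two metrics can be computed as in the short sketch below. It assumes the common definition R² = 1 − SS_res/SS_tot; the text does not state which formula was used, and some studies report the squared correlation coefficient instead. The function names are illustrative.

```python
import numpy as np

def mse(observed, estimated):
    """Mean of the squared differences between estimated and observed temperatures."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.mean((estimated - observed) ** 2))

def r_squared(observed, estimated):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    ss_res = np.sum((observed - estimated) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

For example, with observed values `[1, 2, 3]` and estimates `[1, 2, 4]`, the MSE is 1/3 and R² is 0.5.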

Results
The BPNN model was set up to retrieve the temperature at 16 different depth levels (30, 50, 75, 100, 125, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 and 1000 m) using the sea surface parameters. Figures 3 and 4 show the temperature fields estimated by the BPNN model with all five input parameters in January and July 2015 (at five different depths of 100, 300, 500, 700 and 1000 m) and the observed temperature field at the same depth for comparison. The temperature field estimated by the BPNN model and the observed temperature field both show a consistent spatial distribution in January 2015 (Figure 3). The temperature field at 100 m depth in the study area shows a zonal distribution and the two high value areas near 15°N and the equator are predicted well by the model (Figure 3a). By contrast, the results for July 2015 (Figure 4) show that the estimated results at 100 m depth are not as good as the estimates in January 2015, which may be related to the active dynamics of the summer NPO decreasing the accuracy of the model estimation in the upper ocean. However, the model accurately predicts the low-temperature zone along 7°N (Figure 4a). The performance of the model at 300-1000 m depth is similar in both January and July, but the spatial distribution of the temperature field is very different from that at 100 m depth and the spatial heterogeneity becomes inconspicuous with increasing depth (Figures 3 and 4), which is related to the relatively stable thermal environment of the deeper ocean. Figures 3 and 4 show the performance of the BPNN model in retrieving the internal temperature field of the NPO. Although the estimations in different months and at different depth levels show regional differences, the model is intuitively satisfactory overall.
Previous studies have shown that the SSH, SST, SSS and SSW can help to retrieve the subsurface thermal structure [1,65,66,67].
We therefore set up two groups of experiments with different input parameter combinations to determine how the SSV affects the performance of the model output. The BPNN input parameters in the first group were the SSH, SST, SSS and SSW. The BPNN input parameters in the second group were the SSH, SST, SSS, SSW and SSV. The results (Table 1) show that the performance of the model not only varies at different depths, but also varies between different combinations of input parameters. For all 16 depths, the mean MSE of the five- and four-parameter BPNN models are 0.802 and 0.868, respectively, with mean R² values of 0.960 and 0.952, respectively. On average, the five-parameter BPNN model performs better than the four-parameter model, as shown by its lower mean MSE and higher mean R². Figure 5 shows that the five-parameter BPNN model has a significantly lower MSE and higher R² than the four-parameter BPNN model at every depth, which further indicates that the SSV can improve the accuracy of the estimation of the BPNN model in the NPO. These results also suggest that the SSV may play an important part in connecting the surface and internal ocean through ocean processes such as non-linear advection dynamics.
We used density scatter plots (Figure 6) to show the relationship between the temperature estimated by the five-parameter BPNN model and the observed temperature at 200 m (MSE = 0.895, R² = 0.968), 400 m (MSE = 0.397, R² = 0.958), 600 m (MSE = 0.146, R² = 0.949) and 800 m (MSE = 0.049, R² = 0.943). The accuracy of the estimation by the model can be tested directly by scatter regression and the temperature estimated by the BPNN model was verified by the observed temperature, which showed a low MSE and high R² (Figure 6). Figure 6 shows that most of the points are distributed evenly and densely along the isoline, indicating that the estimated results from the BPNN model are reliable and the model performs well in the study area. To further validate the reliability of the BPNN model at different depths, we randomly picked four pixels from the study area to compare the predicted vertical temperature profiles (VTP) with the observed temperature profiles (Figure 7). The four pixels are referred to here as cases (a)-(d). Figure 7 shows that the four vertical profiles of the temperature estimated by the BPNN are generally consistent with the observations, which reflects the good performance of the BPNN model in the NPO. It should be noted that the estimated temperatures are significantly lower than observed temperatures between 100 and 300 m in Figure 7b. This may be related to the warming effect of the Alaska Current at that location. To further explore the influence of the SSV on the results of the model estimation, we carried out comparative studies on cases (a) and (b). Figure 8 shows the sea surface parameters around case (a) and the VTP estimated by the four- and five-parameter models. Without the sea surface current as an input, the MSE and R² values for case (a) calculated with the four-parameter model were 0.349 and 0.988, respectively. However, there is a strong Kuroshio Current in this region, which transports warmer and saltier water from low latitudes along the western boundary.
The surface current velocity (Figure 8f,g) can be used to help calculate this transport. With this additional non-linear term, the estimated temperature is notably warmer (Figure 8i) than the previous result (Figure 8h), giving MSE and R² values of 0.273 and 0.996, respectively, with the five-parameter model. Figure 9 shows the sea surface parameters around case (b) and the VTP estimated by the four- and five-parameter models. The MSE and R² values of case (b) calculated with the four-parameter model were 0.112 and 0.861, respectively (Figure 9h). Similar to case (a), there is a boundary current, but the current is much weaker than the Kuroshio Current. After taking the sea surface current into account, the estimated temperature is closer to the observed temperature, giving MSE and R² values of 0.079 and 0.882, respectively, using the five-parameter model (Figure 9i). Although the surface current velocity has a notable role, it is smaller than in case (a) because many mesoscale eddies propagate in a westerly direction along the Aleutian Islands [88]. The mesoscale eddies make the current much less stable, which leads to larger fluctuations in the estimated temperature (Figure 9i).

Discussion
Previous studies [1,65,66,67,68] have confirmed that the sea surface parameters SSH, SST, SSS and SSW can help in retrieving the thermal structure of the subsurface. We show here that the accuracy of the temperature estimation can be improved by adding the SSV as a variable on the basis of the existing SSH, SST, SSS and SSW fields. Including more independent variables as input parameters of the model might therefore be a reasonable approach to improving the accuracy of the estimation. Variables such as the solar radiation and net heat flux [1] and the mixed-layer depth [89] might help to improve the accuracy of the estimation [90]. However, the computation requirements increase as the number of input variables increases and we therefore need to evaluate which variables are more effective in the estimation of temperature.
These results show a weaker seasonal variability of the deeper temperature fields than the natural variability (Figures 3 and 4). Possible reasons for these weak seasonal signals are: (1) the original CMEMS training data lack seasonal variation signals; and (2) the BPNN model cannot correctly extract the seasonal signals that exist in the original training data. By analyzing the original target temperature field, we showed that the weak seasonal signal in these results arises because the original target data lack seasonal variation signals (Figure 10). The state of the original target temperature field used for training therefore has an important impact on the results. We used model assimilation datasets for training. In future work, we will also consider using datasets combined with observational data from diverse sources as the training data, because the observed datasets can enhance the seasonal variation signal and might further improve the accuracy of the estimation. The datasets also play an important role in model training and determine the range of application of the trained model. All the datasets used in this study are monthly mean values with the same spatial resolution of 0.25° × 0.25° and therefore the model can reasonably estimate the subsurface temperature field with the same or lower resolution, but might not accurately capture signals with a higher temporal or spatial resolution. The temperatures estimated by the model are mostly confined to the temperature range used as the target, which means that if there is an abnormally high or low temperature in the study area, then this may be underestimated by the BPNN model. This might be solved by increasing the grid density and improving the computing power. Our trained model may have fitted well to the data source because the optimum weights of the model were trained by these datasets.
However, datasets of the same variable obtained by different institutions may vary as a result of the different sources of the original data and different processing techniques. Therefore, the accuracy of the estimation may not be satisfactory when applying the trained BPNN model to other datasets and it would be better to train the model with datasets from the same source as those used in the application. In summary, the systematic errors of our model came mainly from the datasets we used and partly from the settings of the network training parameters.
The Kling-Gupta efficiency (KGE) and Nash-Sutcliffe efficiency (NSE) are also commonly used statistical indexes in the evaluation of hydrological models [91][92][93][94]. We therefore first applied Equation (1) [91] to the four cases shown in Figure 7:
$$\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + \left(\frac{\sigma_{sim}}{\sigma_{obs}} - 1\right)^2 + \left(\frac{\mu_{sim}}{\mu_{obs}} - 1\right)^2} \quad (1)$$
where $r$ is the linear correlation between the observed and simulated temperatures, $\sigma_{sim}$ is the standard deviation of the estimated temperatures, $\sigma_{obs}$ is the standard deviation of the observed temperatures, $\mu_{sim}$ is the estimated mean temperature and $\mu_{obs}$ is the observed mean temperature. The KGE was evaluated for cases (a), (b), (c) and (d) in Figure 7. We then calculated the NSE:
$$\mathrm{NSE} = 1 - \frac{\mathrm{MSE}}{\sigma_{obs}^2}$$
where MSE is the mean square error between the estimated and observed temperatures and $\sigma_{obs}$ is the standard deviation of the observed temperatures. The NSE values of the cases (e), (f), (g) and (h) are 0.9958, 0.9962, 0.8920 and 0.9737, respectively. The closer the KGE and NSE are to 1, the better the agreement between the estimations and observations. The model can therefore perform well under the metric of the KGE and NSE, which further verifies the stability of the model.
There is some concern about the performance of other machine learning methods when they are applied to related studies. The advantage of unsupervised learning methods over supervised learning methods is the completion of training and adaptive clustering without labels. The SOM, a popular unsupervised learning algorithm in the field of computer vision, was applied to estimate the subsurface temperature in the North Atlantic and showed good performance in the depth range 30-700 m with a correlation coefficient of about 0.8 [63]. The results were not as good as those presented in this paper. This may be because, in addition to a lack of input parameters (SSW and SSV), the SOM algorithm itself is limited in estimating the subsurface temperature. SOM users must select the parameters, neighborhood function, grid type and centroid number before training, which introduces more artificial errors. Unlike a supervised learning method, the SOM method lacks a specific objective function.
The BPNN method is therefore more suitable for regression analyses, such as subsurface temperature estimation, despite the advantages of the SOM in clustering analysis. The BPNN model is trained by machine learning algorithms and is essentially a fitting function. It is not restricted by physical mechanisms and may therefore cause unknown problems and instabilities. This might be a common failing of artificial intelligence methods in related studies [63,65,66,67,90]. Future research should therefore consider coupling physical models with the data-driven machine learning model.
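The KGE and NSE discussed in this section can be evaluated for a single vertical profile with the sketch below. It uses the standard definitions of the two indexes (the KGE from the correlation, the ratio of standard deviations and the ratio of means; the NSE as one minus the ratio of the squared-error sum to the variance sum of the observations). The function names are illustrative and the code is not taken from the study.

```python
import numpy as np

def kge(observed, simulated):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    r = np.corrcoef(simulated, observed)[0, 1]   # linear correlation
    alpha = simulated.std() / observed.std()     # ratio of standard deviations
    beta = simulated.mean() / observed.mean()    # ratio of means
    return float(1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2))

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((sim-obs)^2) / sum((obs-mean(obs))^2)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    num = np.sum((simulated - observed) ** 2)
    den = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - num / den)
```

Both functions return 1 for a perfect match. An estimate equal to the observed mean everywhere gives an NSE of 0, whereas the KGE is undefined for a constant estimate because the correlation cannot be computed.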

Conclusions
We introduced a BPNN method to estimate the internal temperature structure of the NPO based on sea surface parameters (SSH, SST, SSS, SSW and SSV). The BPNN method is one of the most popular machine learning methods and showed good performance in retrieving the thermal structure of the ocean subsurface. The estimated temperature field at each depth was verified by the observed temperature field and the accuracy of the estimation by the model was determined using the MSE and R² values. Our results show that the temperature estimation based on the BPNN model is both reliable and accurate. The correlation between the predicted and measured results was improved over those in previous models.
We analyzed the influence of different sea surface parameter combinations on the estimation of temperature through comparative experiments. In addition to the previously used variables (the SSH, SST, SSS and SSW) [1,63,64,65,66,67], we added the SSV as a new component and therefore the non-linear subsurface dynamics due to the advection associated with the SSV were taken into account. This improved the estimated temperature, especially in regions with a strong current. As a consequence, the accuracy of the BPNN model with five input parameters was higher than that of the BPNN model with four input parameters. This suggests that models with more sea surface parameters (e.g., the net heat flux) should be investigated to determine whether they can improve the accuracy of the model.
It should be noted that our results were sensitive to the original training data, which have a weaker seasonal variability than the in situ data. This is a general disadvantage of AI methods: the accuracy of the training data affects the final output. It would be valuable to estimate how the error of the original training data propagates through the BPNN model to the final result, which could provide us with the uncertainty of the result in future. In addition, the temporal or spatial resolutions in this study may have missed some of the mesoscale dynamics, which can lead to extreme temperatures. These problems will be investigated in future work.
We developed a method to determine the thermal structure of the ocean interior based on the sea surface parameters measured by satellites. This study will help the development of deep-sea remote sensing technology based on machine learning and provide technical support for the construction of ocean observation datasets. This model provides a comprehensive understanding of the internal dynamics of the marine environment, which is crucial in studies of the global climate. However, the sea surface parameters used in the machine learning model are still limited and the dynamic mechanisms of the estimated results have not yet been explained. There is therefore still room to improve the accuracy of the estimation of the internal thermal structure of the ocean. Future work will consider the global-scale ocean with more surface datasets to retrieve internal parameters such as the 3D salinity and velocity fields. We will compare the results of our study with other approaches such as statistical methods and other machine learning methods to further verify the reliability and applicability of the proposed method. Advanced deep learning methods will be combined with dynamic methods to give more explanatory results.