An Information Spatial-Temporal Extension Algorithm for Shipborne Predictions Based on Deep Neural Networks with Remote Sensing Observations—Part I: Ocean Temperature

For ships on voyage, using satellite remote sensing observations is an effective way to access ocean temperature. However, satellite remote sensing observations can only provide surface information, and this information is delayed. Although some previous studies have investigated the spatial inversion (spatial extension) or temporal prediction (temporal extension) of satellite remote sensing observations, these studies did not integrate ship survey observations, and the temporal prediction was limited to sea surface temperature (SST). To address these issues, we propose an information spatial-temporal extension (ISTE) algorithm for remote sensing SST. Based on deep neural networks (DNNs), the ISTE algorithm can effectively fuse the satellite remote sensing SST data, ship survey observation data, and historical data to generate a four-dimensional (4D) temperature prediction field. Experimental results show that the ISTE algorithm achieves higher prediction accuracy than linear regression analysis-based prediction. Compared with Argo observation data, the prediction results of ISTE exhibit a high coefficient of determination (0.9936) and low root mean squared errors (around 0.7 °C). Therefore, for shipborne predictions, the ISTE algorithm driven by satellite remote sensing SST can serve as an effective approach to predicting ocean temperature.


Introduction
The ocean environment has an important impact on human maritime operations. Having timely information about the current and future ocean environment is of great significance to ships that perform operations at sea. For ships on voyage, shipborne prediction, which means predicting ocean environment variables on board, is a more convenient way of obtaining marine information than receiving huge volumes of marine data from shore-based institutions.
In terms of ocean temperature, traditional methods of shipborne predictions are mainly based on statistical analysis of historical data, such as linear regression analysis-based prediction (LRAP). However, in general, traditional statistical methods cannot effectively capture the spatial characteristics of ocean data. Moreover, traditional statistical methods are not able to make effective use of observation data. This makes the prediction results generated through traditional statistical methods less accurate. Under the condition of shipborne predictions, using satellite remote sensing observations is an effective way to obtain more accurate ocean temperature. However, satellite remote sensing observations can only provide surface information and cannot directly provide subsurface information. Moreover, in this context, the acquired satellite remote sensing observations are delayed data points, and the real-time remote sensing information is hardly available. In order to address these issues, it is necessary to conduct information spatial-temporal extension (ISTE) of satellite remote sensing observations. As a result, four-dimensional (4D) prediction data can be generated from two-dimensional (2D) observation data and shipborne predictions driven by satellite remote sensing observations can be realized.
In fact, the information spatial extension (ISE) is equivalent to the spatial inversion of remote sensing data while the information temporal extension (ITE) is equivalent to the temporal prediction of remote sensing data. Previous studies have proposed various algorithms about ISE or ITE, which mainly include empirical analysis methods and artificial intelligence (AI) methods.
The algorithmic principle of Modular Ocean Data Assimilation System (MODAS) [1] is a typical empirical analysis method for underwater inversion, which can effectively portray the underwater temperature structure using satellite remote sensing observations. With the rapid development of technology, AI represented by deep learning, has shown its strength in many scientific fields, which include geoscience [2]. Subsequent researchers have attempted to use AI methods to perform underwater inversion. Ali et al. [3] used artificial neural networks (ANNs) to estimate the subsurface temperature structures from surface variables. Lu et al. [4] combined a pre-clustering process and a neural network to estimate the subsurface temperature anomaly using ocean surface variables at the global scale. Han et al. [5] used a convolutional neural network (CNN) to estimate subsurface temperature from satellite remote sensing observations. Sammartino et al. [6] proposed a Multi-Layer-Perceptron (MLP) network to reconstruct the three-dimensional (3D) fields of temperature.
It can be anticipated that the inversion accuracy will be improved using vertical observations in the inversion process, similar to studies carried out by Wang et al. [7], which combined Argo profiles with satellite observations to reconstruct weekly three-dimensional temperature fields of the Pacific Ocean. In the situation of shipborne prediction, besides satellite remote sensing observations, it is convenient to obtain the ship survey observations. The ship survey observations data can provide subsurface vertical information. However, these observations are only distributed along the ship route. Meanwhile, the observation information of other locations is difficult to be obtained. In this context, using the limited ship survey observations to supplement the remote sensing observations with a suitable algorithm will further improve the prediction accuracy of the ocean environment.
Although previous studies have investigated the fusion of satellite remote sensing information and vertical ocean observation information, most of the vertical information comes from Argo grid data products, which means that these algorithms need a sufficient amount of vertical information data. In the situation of shipborne predictions, the vertical information data are limited to the ship route. With such limited data, it is difficult for traditional statistical analysis methods to fuse the ship survey observations and remote sensing observations [8]. Therefore, a new scheme for data fusion that can make full use of limited ship survey observations is needed.
In order to obtain future information on the ocean environment, the ITE is required. Temporal intelligent prediction can be seen as an ITE algorithm for remote sensing observations, which uses AI methods to predict future data. Neetu et al. [9] proposed a nonlinear data-adaptive approach, known as the genetic algorithm, for predicting satellite-observed sea surface temperature (SST) in the Arabian Sea. Su et al. [10] proposed a support vector machine approach that can estimate the subsurface temperature anomaly. Zhang et al. [11] were the first to adopt long short-term memory (LSTM) to predict SST. Xiao et al. [12] proposed a machine learning method combining the LSTM deep recurrent neural network model and the AdaBoost ensemble learning model (LSTM-AdaBoost) to predict the short and mid-term daily SST. Wei et al. [13] used ANNs to predict the SST of the South China Sea. Sun et al. [14] proposed a time-series graph network (TSGN) for SST prediction that can jointly capture graph-based spatial correlation and temporal dynamics. These previous studies have developed various algorithms and achieved great success, greatly promoting the development of ITE algorithms for remote sensing observations. However, for the temporal intelligent prediction of ocean temperature, most previous studies focused only on SST prediction [9][10][11][12][13][14][15][16][17][18][19], although the thermocline has a more important influence on maritime operations such as underwater navigation and hydroacoustic communication. A practical and effective prediction method should be applicable to the full range of water depths.
Deep learning, based on deep neural networks (DNNs), is a branch of machine learning. DNNs are ANN structures with multiple layers, which can learn complex functions by combining nonlinear modules [20,21]. In geoscience, deep learning based on DNNs can better capture the spatial and temporal features of data that are difficult to extract with traditional statistical analysis methods [2,22,23]. To address the issues above, this study proposes a DNN-based ISTE algorithm for remote sensing SST. The ISTE algorithm can effectively fuse the satellite remote sensing SST data, ship survey observation data and historical ocean temperature data to generate 4D ocean temperature prediction data. The main contributions of this study are as follows: (1) compared with previous studies, this study effectively integrates the limited ship survey observations using DNNs, which can improve the underwater inversion accuracy of satellite remote sensing SST; (2) this study uses DNNs to achieve temperature predictions over the full water depth, including the mixed layer, thermocline and deep layer; (3) the ISTE algorithm proposed in this study could be a new effective approach for shipborne predictions.
The remainder of this article is organized as follows. Section 2 describes the ISTE principle and experimental details. Section 3 presents the experimental results and performance evaluation of the ISTE. A discussion is given in Section 4. Finally, Section 5 summarizes this study.

Data and Tools
Consider a ship operating in the northeastern part of the South China Sea on 13 September 2016. The ship has downloaded global satellite remote sensing SST data for 10 September 2016. Three sets of vertical temperature observations measured by the ship are available, taken on 11 September 2016, 12 September 2016 and 13 September 2016, respectively. Additionally, historical data for global ocean temperature are stored on board. Based on the available information, we want to predict the 4D ocean temperatures from 13 September 2016 to 10 October 2016 within the region of 112-124°E and 13.5-23°N.
The remote sensing data of daily average SST used in the experiment come from the L4 satellite remote sensing grid data of the Copernicus Marine Environment Monitoring Service (CMEMS), available online: https://resources.marine.copernicus.eu/product-detail/SST_GLO_SST_L4_REP_OBSERVATIONS_010_024/INFORMATION/ (accessed on 6 June 2021). These data were produced by running the Operational Sea Surface Temperature and Sea Ice Analysis system [24,25]. The remote sensing SST of the experimental area is shown in Figure 1.
We use Argo data to simulate the ship survey observations. The data come from the Chinese Argo Real-time Data Center, available online: http://www.argo.org.cn/ (accessed on 10 October 2019). The type of Argo instrument used in the experiment is HM2000_TS1, which has small errors and high data quality. These Argo floats provide a temperature profile down to 2000 m every 5 days. The Argo data after 13 September 2016 are used to evaluate the prediction results. The Argo distribution is shown in Figure 2.
The historical data for global daily average ocean temperature come from the HYCOM reanalysis data products, which were generated by the HYCOM model and the Navy Coupled Ocean Data Assimilation system, available online: https://www.hycom.org/ (accessed on 9 September 2020). The HYCOM reanalysis has assimilated available satellite altimeter data, satellite remote sensing SST data, and other data from XBT, Argo buoys and mooring buoys. In the reanalysis data, there are dynamical constraints between neighboring grid points. The purpose of ANN training is to mine the hidden patterns in the training sample dataset.
The ANN tool used in this study is PyTorch, which has been widely used in many scientific fields, available online: https://pytorch.org/ (accessed on 10 October 2020). The training samples are chosen from the reanalysis data for the years 2006-2015. The grid size is 150 × 120 horizontally, with a horizontal resolution of 9 km. We use a stretched terrain-following coordinate in the vertical direction, and the grid has 20 layers vertically. In order to evaluate the improvement brought by ISTE, we have also carried out a prediction experiment using LRAP.

Information Spatial-Temporal Extension Algorithm
The ISTE process can be seen as a regression problem, and fully connected (FC) DNNs are very effective in data fitting and pattern capturing. Through data training, FC-DNNs can approximate the nonlinear mapping relationship between the samples and labels. Given these advantages, we choose FC-DNNs as the network architecture for ISTE.
An FC-DNN consists of an input layer, multiple hidden layers, and an output layer. Its architecture is shown in Figure 3.
The ISTE algorithm is composed of the ISE process, the intelligent correcting (IC) process and the ITE process. The general scheme uses the satellite remote sensing SST data to generate an initial estimation field by ISE. Then, the limited observations with vertical temperature information are used to correct the initial estimation field to obtain an initial correction field. Finally, the initial correction field is used for ITE. In this way, 4D ocean temperature prediction data can be obtained. The overall flow of ISTE is shown in Figure 4.
The procedures of ISTE are described below.
(1) Train the Reconstruction DNN using historical ocean temperature data.
(2) Put the satellite remote sensing SST into the trained Reconstruction DNN to generate the initial estimation field.
(3) Train the Correction DNN using the initial estimation field and historical ocean temperature data.
(4) Put the ship survey observations into the trained Correction DNN to generate the initial correction field.
(5) Train the Extrapolation DNN using historical ocean temperature data.
(6) Put the initial correction field into the trained Extrapolation DNN to generate the final prediction field.
As mentioned above, the ISTE process requires the construction of Reconstruction DNN, Correction DNN and Extrapolation DNN. All the three DNNs adopt fully connected forms, and they are described in the following subsections.

Reconstruction DNN
Since there are correlations between the SST and the subsurface temperature, we can use the Reconstruction DNN to fit these correlations. Thus, once the SST is known, the subsurface temperature can be predicted through the Reconstruction DNN.
To speed up training and prevent overfitting, we use temperature anomalies as training samples and training labels. Temperature anomaly is the anomaly relative to the multi-year temperature average. This is the reason why previous studies prefer to use temperature anomalies [9,12,16,18].
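The anomaly computation described above can be illustrated with a minimal sketch. It assumes that the multi-year average is taken grid point by grid point over fields from the same calendar day in different years; the text does not specify the averaging window, and the function names and toy values are hypothetical.

```python
from statistics import mean


def climatology(fields_by_year):
    """Average several yearly fields grid point by grid point.

    Assumption: each inner list is the flattened temperature field for the
    same calendar day in a different year.
    """
    n_points = len(fields_by_year[0])
    return [mean(field[i] for field in fields_by_year) for i in range(n_points)]


def anomaly(field, clim):
    """Temperature anomaly: the field minus the multi-year average."""
    return [t - c for t, c in zip(field, clim)]


# Toy example with 3 grid points and 3 historical years (values are made up).
history = [
    [28.0, 27.0, 26.0],
    [29.0, 27.5, 26.5],
    [28.5, 28.0, 27.0],
]
clim = climatology(history)
sst_anomaly = anomaly([29.5, 27.0, 27.5], clim)
```

In this form, the same `anomaly` function would be applied to both the SST samples and the subsurface labels before training, and the climatology would be added back to the network output to recover temperatures.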
For the Reconstruction DNN, we set the number of hidden layers to 16 and the number of neurons for each layer to 48. The learning rate is 0.001 and the epoch is 1000. The activation function is ReLU, the optimizer is Adam, and the loss function is MSELoss. The input layer is SST anomaly with 18,000 neurons, and the output layer is the subsurface temperature anomaly at a certain depth with 18,000 neurons.
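To give a sense of the size of this network, the sketch below counts the trainable parameters implied by the stated layer widths. It assumes that every fully connected layer carries a bias term, which the text does not state explicitly; the helper name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given widths."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# 150 x 120 horizontal grid points, flattened into one input vector.
GRID_POINTS = 150 * 120

# Input layer, 16 hidden layers of 48 neurons, output layer.
reconstruction_layers = [GRID_POINTS] + [48] * 16 + [GRID_POINTS]
n_params = count_parameters(reconstruction_layers)
```

Under these assumptions, almost all of the parameters sit in the first and last layers, which connect the 18,000 grid points to the 48-neuron hidden layers.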
The SST anomaly data in this study come from historical reanalysis data and the purpose is to mine the physical patterns in the reanalysis data. As the ocean has different properties at different depths, we use a layer-training scheme. For example, the training sample is the SST anomaly, and the training label is the temperature anomaly in the 10th depth layer. After the Reconstruction DNN is trained, the 10th depth layer temperature anomaly of 10 September 2016 can be generated by inputting the SST anomaly of 10 September 2016 to the Reconstruction DNN.

Correction DNN
The initial estimation field can be obtained through the Reconstruction DNN. Since the initial estimation field is only produced by learning the historical pattern, it inevitably has some errors. However, we can reduce the errors by fusing the vertical temperature information into the initial estimation field. The vertical temperature information can be provided by ship survey observations.
In fact, it is not possible to obtain observations for all grid points. This requires us to correct the initial estimation field using the limited observations. Performing the IC process on the initial field using limited observations has proved to be feasible and effective [26]. The IC algorithm performs a horizontal extension of the limited observation information, so that the limited observations can be used to correct the entire initial field.
The Correction DNN has 2 hidden layers, and the number of neurons in each layer is 48. The learning rate is set to 0.01 and the number of epochs is set to 500. The activation function is ReLU, the optimizer is Adam, and the loss function is MSELoss. The input layer is the observation point increment with 3 neurons, which is the difference between the temperature at the observation locations in the reanalysis data and the temperature at the observation locations in the initial estimation field. The output layer is the entire field increment with 18,000 neurons, which is the difference between the temperature of the reanalysis data and the temperature of the initial estimation field. More details about the IC algorithm can be found in the study carried out by Mao et al. [26].
Through training, we can obtain a Correction DNN that fits the nonlinear mapping function η between the entire field increment and the observation point increment:

$$E = \eta(X_o)$$

where $E$ is the entire field increment and $X_o$ is the observation point increment. Then, the initial correction field $A_1$ can be calculated by

$$A_1 = M_1 + E$$

where $M_1$ is the initial estimation field.
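The correction step can be sketched as follows. The sketch assumes that the increment is added to the initial estimation field, in line with the definition of the increment as the reference temperature minus the initial estimate. The trained Correction DNN is replaced by a hypothetical stand-in, `toy_eta`, which simply spreads the mean observation increment over the whole field; it is not the network used in this study.

```python
def observation_increment(reference_at_obs, estimate_at_obs):
    """X_o: reference temperature minus initial estimate at the observation points."""
    return [r - e for r, e in zip(reference_at_obs, estimate_at_obs)]


def apply_correction(initial_estimate, eta, x_obs):
    """A_1 = M_1 + eta(X_o), evaluated grid point by grid point."""
    increment = eta(x_obs)
    if len(increment) != len(initial_estimate):
        raise ValueError("increment and field must have the same size")
    return [m + e for m, e in zip(initial_estimate, increment)]


def make_toy_eta(n_points):
    """Hypothetical stand-in for the trained Correction DNN."""
    def toy_eta(x_obs):
        avg = sum(x_obs) / len(x_obs)
        return [avg] * n_points
    return toy_eta


# Toy example: 3 observation points and a 4-point field (values are made up).
x_obs = observation_increment([21.0, 18.5, 15.0], [20.0, 18.0, 15.0])
m1 = [20.0, 19.0, 18.0, 15.0]
a1 = apply_correction(m1, make_toy_eta(len(m1)), x_obs)
```

During training the reference values at the observation points come from the reanalysis data, whereas at prediction time they would come from the ship survey observations.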

Extrapolation DNN
With reference to the operation of the numerical model, the temperatures at the moments to be predicted are related to the initial field. We can use the Extrapolation DNN to fit the relationship between the initial field and the temperature field at different moments. In this way, once we have obtained the initial field, we can use the trained Extrapolation DNN to make temporal prediction.
Let $A_p$ be the target temperature field to be predicted and ∆t the time gap. We assume that $A_p$ is related to $A_1$, that is:

$$A_p = P_{\Delta t}(A_1)$$

where $P_{\Delta t}$ represents the mapping function over the time gap ∆t. $P_{\Delta t}$ cannot be calculated directly, but it can be fitted by a trained Extrapolation DNN. For example, suppose the temperature of 10 September 2016 is the initial field and we want to predict the temperature of 13 September 2016. We can take the temperature of 10 September in past years as the samples and the temperature of 13 September in past years as the labels. The mapping function $P_{\Delta t}$ between 10 September and 13 September can then be fitted through data training.
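The pairing of samples and labels in this example can be sketched as below. The sketch assumes one sample per year on exactly the start date, using the training years 2006-2015; whether the study widens this to neighboring days is not stated. The function name `training_pairs` is hypothetical.

```python
from datetime import date, timedelta


def training_pairs(start_month, start_day, lead_days, years):
    """Return (sample_date, label_date) pairs, one per historical year."""
    pairs = []
    for year in years:
        sample_day = date(year, start_month, start_day)
        label_day = sample_day + timedelta(days=lead_days)
        pairs.append((sample_day, label_day))
    return pairs


# Samples on 10 September, labels 3 days later, for the years 2006-2015.
pairs = training_pairs(9, 10, 3, range(2006, 2016))

# Lead time needed to reach the end of the prediction window (10 October 2016).
max_lead = (date(2016, 10, 10) - date(2016, 9, 10)).days
```

A separate mapping would be fitted for each lead time, so covering 13 September to 10 October from the 10 September field corresponds to lead times of 3 to 30 days.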
The Extrapolation DNN has 2 hidden layers, and the number of neurons in each layer is 64. The learning rate is set to 0.002 and the number of epochs is set to 500. The activation function is ReLU, the optimizer is Adam, and the loss function is MSELoss. The input layer is the temperature of the initial correction field, with 18,000 neurons. The output layer is the temperature field at the time to be predicted, also with 18,000 neurons. It should be noted that, due to the large fluctuations in the thermocline, overfitting tends to occur with the same parameter settings. Therefore, dropout is added to the training process for the thermocline; here, we set the dropout value to 0.1. Once the Extrapolation DNN has been trained using historical data, the temperature at any time can be predicted.
Figure 5 shows the results of the inversion and correction. With the help of DNNs, which are a powerful tool, it is possible to achieve an obvious correcting effect with a small amount of vertical information data. Then, we used a series of Argo observations at different times to evaluate the prediction results statistically. The evaluation criteria used in the following analysis are reported in Table 1.

Table 1. The criteria for evaluation. m is the prediction result, a is the Argo data, a_mean is the mean value of the Argo data, and n is the number of observation points.

Evaluation Criteria          Formulas
Absolute Error (AE)          $\mathrm{AE} = |m - a|$
Mean Absolute Error (MAE)    $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |m_i - a_i|$

Mixed Layer
Figure 6 shows the prediction errors for the mixed layer. We use the 0 m, 10 m, 20 m, and 30 m depths to present the prediction errors for the mixed layer. The results show that the ISTE method is clearly superior to the LRAP method. Compared with LRAP, the largest improvement in prediction accuracy is at 10 m depth, with a 52% reduction in error, and the smallest improvement is at 30 m depth, with a 16% reduction in error.
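The evaluation criteria of Table 1 and the relative error reductions quoted for ISTE against LRAP can be computed with a short sketch such as the following. The root mean squared error and coefficient of determination use their standard definitions, which are assumed here rather than taken from the table, and all numerical values are made up for illustration.

```python
from math import sqrt


def absolute_errors(pred, obs):
    """AE = |m - a| for every observation point."""
    return [abs(m - a) for m, a in zip(pred, obs)]


def mae(pred, obs):
    """Mean absolute error over n observation points."""
    errors = absolute_errors(pred, obs)
    return sum(errors) / len(errors)


def rmse(pred, obs):
    """Root mean squared error (standard definition, assumed)."""
    return sqrt(sum((m - a) ** 2 for m, a in zip(pred, obs)) / len(obs))


def r_squared(pred, obs):
    """Coefficient of determination relative to the observation mean a_mean."""
    a_mean = sum(obs) / len(obs)
    ss_res = sum((a - m) ** 2 for m, a in zip(pred, obs))
    ss_tot = sum((a - a_mean) ** 2 for a in obs)
    return 1.0 - ss_res / ss_tot


def error_reduction(mae_baseline, mae_new):
    """Fractional reduction of the MAE relative to a baseline method."""
    return (mae_baseline - mae_new) / mae_baseline


# Toy profile: predictions m and Argo observations a (values are made up).
m = [28.0, 20.0, 10.0, 4.0]
a = [29.0, 19.0, 10.0, 5.0]
profile_mae = mae(m, a)
profile_rmse = rmse(m, a)
profile_r2 = r_squared(m, a)

# Hypothetical baseline MAE of 1.0 and new MAE of 0.48, i.e. a 52% reduction.
reduction = error_reduction(1.0, 0.48)
```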
For ISTE, the mean prediction error at 10 m depth is the smallest, with an MAE of 0.4554 °C, whereas the mean prediction error at 30 m depth is the largest, with an MAE of 1.0784 °C. This is probably because the fluctuations are smallest at 10 m, making it easier for the DNN to capture the pattern hidden in the historical data. The 30 m depth is closer to the thermocline, which is more volatile and less regular, so the DNNs cannot capture the data patterns as well.
It is also understandable that the error for the sea surface is not the smallest: the sea surface is more influenced by the atmosphere and has more disturbances than the 10 m depth.
We also notice that the prediction error is relatively stable in the first 15 days near the sea surface, with an error of around 0.5 °C, whereas there is a significant increase in prediction error after 15 days. As the ISTE algorithm itself does not have the problem of error accumulation over time, this abrupt increase may be due to changes in the background field outside the predicted region. Compared with numerical models, ISTE lacks the information on external changes given by the boundary field. This may lead to a sharp increase in prediction error when there is a dramatic change in the ocean-atmospheric environment outside the region.

Thermocline
Figure 7 shows the prediction error for the thermocline. The mean prediction error of ISTE is generally below 1 °C, but there are several extreme errors that are greater than 2 °C. This demonstrates that the predictions for the thermocline are less stable than those for the mixed layer.
The prediction error was greatest at 100 m depth, with an MAE of 0.9843 °C, and smallest at 200 m depth, with an MAE of 0.5902 °C. However, the improvement of ISTE relative to LRAP is the smallest at 200 m depth, where the mean prediction error is reduced by only 20%.
Comparing the predictions of ISTE and LRAP, it can be found that the improvement brought by ISTE is not as visually obvious as in the mixed layer and the reduction in mean prediction error is relatively small. This is mainly due to two reasons. The first is that the underwater inversion accuracy for SST from satellite remote sensing inevitably decreases when the depth reaches below 50 m, which leads to increasing errors in the subsequent prediction. The second is the large fluctuations of the thermocline itself: neither regression nor DNN methods can fully fit the variation pattern. Even so, the DNN method outperforms the regression method in terms of mining temperature variation patterns and reducing prediction errors.
Figure 8 shows the prediction errors at the deep layer. The mean prediction error of ISTE remains largely within 0.4 °C. The results show that ISTE still plays a relatively stable role at the deep layer.
At the depths of 300, 400, 500, and 1000 m, the prediction errors are smaller than those of LRAP. The results demonstrate that ISTE outperforms LRAP in general. We can also see that, as depth increases, ISTE brings very limited improvement at the deep layer. This is due to a number of reasons. The temperature fluctuation at the deep layer is smaller, and therefore the prediction errors are smaller in magnitude. In addition, the reanalysis data used for training do not guarantee high accuracy at the deep layer. All these reasons lead to a smaller improvement from ISTE at the deep layer than in the mixed layer and thermocline.
Although the improvement brought by ISTE is reduced at the deeper layers, this does not largely affect the value of ISTE applications. It is mainly the mixed layer and thermocline that have a significant impact on human activities at sea, such as sonar detection and offshore fishing. The improved prediction performance of ISTE for the mixed layer and thermocline will help to provide a better understanding of the underwater information.

Vertical Profiles
In order to detect the prediction effect in vertical structure, we compare the prediction results with the Argo observation data. In fact, the Argo observations we use are from four Argo buoys, which produce observations every five days. Therefore, we divided the observations into 5 groups to test the prediction effect, as shown in Figures 9-13. Here, we only present the prediction results for water depths less than 1000 m. The Argo profiles are plotted from all Argo data, the LRAP and ISTE profiles are plotted based on interpolation of the grid data. this does not largely affect the value of ISTE applications. It is mainly the mixed layer and thermocline that have a significant impact on human activities at sea, such as sonar detection, offshore fishing, and so on. The improved prediction performance of ISTE for the mixed layer and thermocline will help humans to have a better understanding of the underwater information.

The time of group 1 is 15-18 September. At this early stage, ISTE generally matches the vertical structure of the Argo temperature profiles. Comparing the prediction errors at different depths, the deep layer errors are much smaller than those in the mixed layer and thermocline. Figure 9a,c,d shows that the ISTE temperature profiles are closer to the Argo temperature profiles than the LRAP profiles are.
The time of group 2 is 20-23 September. At this stage, the SST prediction by ISTE is significantly better than that of LRAP, as in Figure 9. This indicates that the observations have been used effectively. Figure 10b shows that both ISTE and LRAP have large errors at the thermocline, with a maximum error of more than 2 °C. This may be due to temperature changes in this region that differ significantly from its climatological state.
The time of group 3 is 25-28 September. Figure 11b still shows obvious prediction errors at the thermocline, whereas the prediction errors of SST are smaller. This also indicates that the thermocline temperature, which is less stable, is much more difficult to predict than SST.
The time of group 4 is 30 September-3 October. We notice a significant increase in the SST prediction error of ISTE. This is because the sea surface is sensitive to the atmospheric environment, and this influence builds up as time progresses. The accumulated influence of external factors weakens the correlation between the initial field and the temperature field to be predicted, which in turn increases the prediction error.
The time of group 5 is 5-8 October. The prediction skill of ISTE generally remains stable, and the overall prediction errors of ISTE are smaller than those of LRAP. Figures 9-13 show that ISTE predicts the structure of the vertical temperature profile in the experimental area well. This demonstrates that ISTE can predict the temperature over the full water depth.

Statistical Analysis
The above results were obtained by training on a 10-year data sample. In addition, sensitivity tests were conducted using 5- and 8-year data samples. All the statistical analysis results are shown in Table 2. The results show that the ISTE method has a smaller prediction error than LRAP. For the 10-year sample, the MAE of ISTE is 0.4949 °C. Compared with LRAP, the MAE of ISTE is reduced by 0.14 °C and the MAPE by 0.6%. ISTE exhibits a lower RMSE of 0.7199 °C, a reduction of 0.21 °C relative to LRAP, and a higher R² of 0.9936, an improvement of 0.0042 relative to LRAP.
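For reference, the four statistics can be computed as in the following sketch. It uses the standard textbook definitions of MAE, MAPE, RMSE, and R², which are assumed here to match those used for Table 2; the function name `regression_metrics` and the sample values are illustrative only and are not data from this study.

```python
import numpy as np

def regression_metrics(obs, pred):
    """Return MAE, MAPE (%), RMSE, and R^2 of predictions against observations.

    Assumption: standard definitions. MAPE is undefined when an observation
    equals zero, which is not handled here.
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err / obs))
    rmse = np.sqrt(np.mean(err ** 2))
    ss_res = np.sum(err ** 2)                 # residual sum of squares
    ss_tot = np.sum((obs - obs.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}

# Made-up temperatures (°C), for illustration only
obs = [10.0, 20.0, 30.0, 40.0]
pred = [11.0, 19.0, 32.0, 38.0]
metrics = regression_metrics(obs, pred)
print(metrics)
```

Because R² is normalized by the variance of the observations, a wide temperature range between the surface and deep layers can yield a high R² even when the RMSE is several tenths of a degree, which is why the two statistics are reported together.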
The results also show that the sample length did not have a significant effect on the prediction results; that is, the predictions are not sensitive to the sample length. However, given the characteristics of DNNs, as many sample data as possible should be used to ensure stable training.
The above experimental results demonstrate the effectiveness of the ISTE algorithm, which achieves significantly better ocean temperature prediction accuracy than the traditional LRAP algorithm.

Discussion
Ocean datasets differ from datasets in other fields, such as image processing and finance, in that they have their own inherent physical patterns. For ocean temperature, the hidden patterns in the data are the theoretical vertical modes [27,28] or empirical modes [29] of oceanography. Variations in the thermocline and in currents are all correlated with changes in the ocean vertical structure. It is the correlation between different depth layers that justifies the use of sea surface information to predict subsurface information [30,31]. AI methods therefore use DNNs to learn these hidden patterns from historical sample data and to provide rapid predictions.
In the ISE process, traditional empirical analysis methods cannot make full use of all SST grid points simultaneously to produce the target inversion field. In addition, they can only capture linear relationships between the SST and the subsurface temperature [8]. AI methods do not share these limitations.
The driving factor of the AI method is the entire SST field, which is nonlinearly correlated with the subsurface temperature field. DNNs can better reflect the spatial correlations among the grid points of the temperature field and capture more of its structural characteristics. Thus, DNN-based ISE can produce a better reconstruction of the temperature field.
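As a rough illustration of how a fully connected network lets every SST grid point contribute to every subsurface grid point through a nonlinear mapping, consider the sketch below. The grid sizes, the single hidden layer, the tanh activation, and the random (untrained) weights are all illustrative assumptions; this is not the network architecture used in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 4 x 5 surface grid, 3 depth levels, 16 hidden units
n_lat, n_lon, n_depth, n_hidden = 4, 5, 3, 16
n_in = n_lat * n_lon
n_out = n_depth * n_lat * n_lon

# Untrained weights: each hidden unit sees the whole SST field, and each
# output point depends on every hidden unit
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

def sst_to_subsurface(sst):
    """Map a 2D SST field to a 3D temperature field (depth, lat, lon)."""
    x = sst.reshape(-1)          # flatten the entire surface field
    h = np.tanh(x @ W1 + b1)     # nonlinear hidden representation
    y = h @ W2 + b2              # linear read-out to all subsurface points
    return y.reshape(n_depth, n_lat, n_lon)

sst = 28.0 + rng.normal(size=(n_lat, n_lon))  # synthetic SST field (°C)
field = sst_to_subsurface(sst)
print(field.shape)
```

In a trained network of this kind, the weights would be fitted to historical pairs of surface and subsurface fields; here they are left random because only the structure of the mapping is being illustrated.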
In the IC process, the fusion of ship survey observations outperforms traditional linear interpolation. The IC algorithm can largely compensate for missing information using patterns mined from historical data. Therefore, the IC process can make full use of the limited ship survey observations to correct the initial estimation field and thus produce an initial correction field.
For ITE, the DNN-based algorithm can approximate the dynamical equations to some extent. It fits the patterns of historical data to achieve an approximate prediction. The algorithm establishes a one-to-one mapping between the initial field and each moment to be predicted. Its advantage is that errors do not accumulate over time, which helps keep the prediction error stable.
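The idea of mapping the initial field directly to each prediction moment, rather than stepping a model forward and feeding its output back in, can be sketched as follows. To keep the example short and verifiable, an ordinary least-squares linear map stands in for the DNN, and the data are synthetic; the names `models` and `predict_direct` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_leads = 200, 5, 3

# Synthetic "initial fields" and, for each lead time, a synthetic target
# field generated by a different (noise-free) linear map
X0 = rng.normal(size=(n_samples, n_features))
true_maps = rng.normal(size=(n_leads, n_features, n_features))
Y = np.stack([X0 @ true_maps[k] for k in range(n_leads)])

# Direct strategy: fit one independent model per lead time, each taking the
# initial field as input (least squares is a stand-in for a DNN here)
models = [np.linalg.lstsq(X0, Y[k], rcond=None)[0] for k in range(n_leads)]

def predict_direct(x0, lead):
    """Predict the field at the given lead index directly from x0."""
    return x0 @ models[lead]

# The prediction for the last lead time never passes through the
# predictions for earlier lead times
x0 = X0[:1]
pred_last = predict_direct(x0, n_leads - 1)
print(pred_last.shape)
```

Because each lead time has its own mapping from the initial field, an error made at one lead time is not fed into the next, which is the sense in which errors do not accumulate. The trade-off is that skill at longer lead times still depends on how strongly the target field remains correlated with the initial field.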
The experiments show that the sea thermocline temperature (STT) is more difficult to predict than SST. The reason is that the ocean differs from a general dataset: the real ocean is a 3D fluid under gravity that forms a layered structure because of density differences, as shown in Figures 9-13. External forcing on the ocean causes sharp fluctuations in the thermocline. By comparison, SST changes are relatively stable, so their patterns are easier for DNNs to capture, which leads to better predictions. Because the thermocline has stronger volatility and uncertainty, the temporal patterns of STT are more difficult to capture than those of SST.
Taken together, the ISTE algorithm makes full use of satellite remote sensing observations and limited ship survey observation data to achieve the spatial and temporal extension of the available information. In effect, this process achieves 4D temperature prediction. The ISTE algorithm can therefore serve as an effective shipborne prediction method, and its prediction performance is significantly better than that of the traditional LRAP method.
In this article, only an experimental study of the ISTE algorithm for temperature prediction has been carried out. In practice, more types of observational data can be obtained, including sea surface height (SSH) and sea surface salinity (SSS). We will therefore consider the following in the next step:
(1) Develop ISE algorithms that can fuse all satellite remote sensing data, including SST, SSH and SSS.
(2) Develop IC algorithms with temperature and salinity constraints.
(3) Develop multi-factor coupled ITE prediction algorithms.
(4) Explore more advanced DNN structures that combine multiple neural networks.
We expect that this work will further improve the prediction accuracy for the ocean environment. More advanced technology will enable the ISTE algorithm to be used in a wider range of applications, including weather prediction and climate prediction.

Conclusions
Satellite remote sensing can provide data with high temporal and spatial resolution, but these data are limited to the surface and cannot provide underwater information. For shipborne prediction, it is convenient to obtain ship survey observations to access accurate vertical information. The ISTE algorithm can be used to fully integrate the satellite remote sensing data and ship survey observation data to achieve a 4D ocean environment prediction.
This paper demonstrates the effectiveness of the ISTE algorithm through prediction experiments on ocean temperature. The experimental results show that, relative to the observations, the MAE of ISTE is around 0.5 °C, the MAPE is around 4%, the RMSE is around 0.7 °C, and the R² is 0.9936. Compared with the traditional LRAP method, the prediction accuracy of ISTE is significantly improved.
The ISTE algorithm is a convenient, practical, and effective method for shipborne prediction. We anticipate that the DNN-based ISTE algorithm will help humans to obtain timely information about the ocean environment and further benefit relevant marine operations, including but not limited to sonar detection and offshore fishing.