An Ensemble-Based Machine Learning Model for Estimation of Subsurface Thermal Structure in the South China Sea

: Reconstructing the vertical structures of the ocean from sea surface information is of great importance for ocean and climate studies. In this study, an ensemble machine learning (Ens-ML) model is proposed to retrieve ocean subsurface thermal structure (OSTS) by using satellite-derived sea surface data and Argo data in the South China Sea (SCS). The input data include sea surface height (SSH), sea surface temperature (SST), sea surface salinity (SSS), sea surface wind (SSW), and geographic information (including longitude and latitude). We select three stable machine learning models, namely, extreme gradient boosting (XGBoost), RandomForest and light gradient boosting machine (LightGBM) as our benchmark models, and then use an artificial neural network (ANN) technique to combine outputs from the three individual models. The proposed Ens-ML model using sea surface data only by SSH, SST, SSS, and SSW performs less satisfactorily than that considering the contribution of geographical information, indicating that the geographical information is essential to estimate the OSTS accurately. The estimated OSTS from the Ens-ML model are compared with Argo data. The results show that the proposed Ens-ML model can accurately estimate the OSTS (upper 1000 m) in the SCS, which is relatively more accurate and precise than the individual models. The performance of the Ens-ML model also varies with season, and better estimation is obtained in winter, which is probably due to stronger mixing and weaker stratification. This study shows the great potential and advantage of the multi-model ensemble of machine learning algorithm for the ocean’s interior information retrieving, showing great potential in expanding the scope of ocean observations.


Introduction
Ocean temperature, one of the most important variables of seawater, plays a significant role in marine disasters, marine ecosystems, ocean dynamics, and climate changes [1,2], for example, it is closely related to the formation of many high impact thermos-dynamical ocean-atmospheric processes, such as the marine heatwaves [3,4], thermocline layer formation [5], El Niño evolution [6,7], or deep-water formation [8,9]. To better understand the role of ocean temperature in marine disaster and ecosystems, ocean circulation and climate changes, it is essential to identify the ocean subsurface thermal structure (OSTS) accurately.
As the marginal sea of the western Pacific, the South China Sea (SCS) lies under the influence of the Western Pacific Warm Pool [10,11]. It has an extended continental shelf in the south and extends deeper than 5000 m in the central region ( Figure 1a). Extended continental shelves exist along the north boundary and southwest boundary, while the deeper water is oriented in a southwest-northeast direction across the central part of the SCS [12]. Due to its unique geographical location and complicated dynamical features, the temperature structures in the SCS have remarkable characteristics associated with El Niño-Southern Oscillation (ENSO) [13], Asian monsoon and Pacific western boundary current system [14,15], which have a far-reaching influence on the regional and global climate [16,17]. In the climatological mean, sea surface temperature (SST) in the SCS shows a chiefly southwest-northeast oriented pattern, which is gradually increasing southward (Figure 1b). Previous studies confirmed that ocean temperature plays a vital role in the ecosystems, dynamics and thermodynamics of the SCS. For example, ocean temperature variability can significantly influence the occurrence of marine heatwaves in the SCS [18]. In addition, ocean temperature can also play a role in the processes of air-sea interactions in the SCS, which can exert noticeable impacts on the East Asian climate [19]. Despite the importance of the ocean temperature structures in the SCS, it remains a challenging problem for oceanographers and climatologists to accurately estimate them. Current observations of OSTS in the SCS are sparse and insufficient, which significantly constrain the studies of ocean processes. Therefore, the most commonly used methods to estimate the OSTS are numerical simulation, numerical-based data assimilation and dynamical modeling [20,21]; however, these dynamical models, due to simulating full aspects of the ocean thermal and dynamical processes that are governed by a suit of equations, are very time and computationally consuming. How to accurately estimate the OSTS with an acceptable computational resource is an active area of research.
Remote Sens. 2022, 14, x FOR PEER REVIEW 2 of 18 As the marginal sea of the western Pacific, the South China Sea (SCS) lies under the influence of the Western Pacific Warm Pool [10,11]. It has an extended continental shelf in the south and extends deeper than 5000 m in the central region ( Figure 1a). Extended continental shelves exist along the north boundary and southwest boundary, while the deeper water is oriented in a southwest-northeast direction across the central part of the SCS [12]. Due to its unique geographical location and complicated dynamical features, the temperature structures in the SCS have remarkable characteristics associated with El Niño-Southern Oscillation (ENSO) [13], Asian monsoon and Pacific western boundary current system [14,15], which have a far-reaching influence on the regional and global climate [16,17]. In the climatological mean, sea surface temperature (SST) in the SCS shows a chiefly southwest-northeast oriented pattern, which is gradually increasing southward (Figure 1b). Previous studies confirmed that ocean temperature plays a vital role in the ecosystems, dynamics and thermodynamics of the SCS. For example, ocean temperature variability can significantly influence the occurrence of marine heatwaves in the SCS [18]. In addition, ocean temperature can also play a role in the processes of air-sea interactions in the SCS, which can exert noticeable impacts on the East Asian climate [19]. Despite the importance of the ocean temperature structures in the SCS, it remains a challenging problem for oceanographers and climatologists to accurately estimate them. Current observations of OSTS in the SCS are sparse and insufficient, which significantly constrain the studies of ocean processes. Therefore, the most commonly used methods to estimate the OSTS are numerical simulation, numerical-based data assimilation and dynamical modeling [20,21]; however, these dynamical models, due to simulating full aspects of the ocean thermal and dynamical processes that are governed by a suit of equations, are very time and computationally consuming. How to accurately estimate the OSTS with an acceptable computational resource is an active area of research. In recent decades, the latest technological advances in remote sensing have significantly improved our understanding of ocean circulation and dynamics by providing wellsampled surface observations, such as sea surface height (SSH), SST, sea surface salinity (SSS), and sea surface wind (SSW). Several early studies suggested that many ocean subsurface phenomena have surface manifestations, therefore, satellite-derived sea surface data can be used to reconstruct the ocean interior [22][23][24][25][26]. In this regard, extensive efforts have been made to reconstruct ocean interior structures via surface information from satellite data [27][28][29][30][31]. Methods for retrieving ocean interior structures from sea surface data are often based on linear regression of single or multiple variables [32,33], statistical and dynamic methods [34], or machine learning methods [24,35,36]. Khedouri et al. [37] used purely statistical relationships to estimate OSTS from sea surface data, such as sea surface In recent decades, the latest technological advances in remote sensing have significantly improved our understanding of ocean circulation and dynamics by providing well-sampled surface observations, such as sea surface height (SSH), SST, sea surface salinity (SSS), and sea surface wind (SSW). Several early studies suggested that many ocean subsurface phenomena have surface manifestations, therefore, satellite-derived sea surface data can be used to reconstruct the ocean interior [22][23][24][25][26]. In this regard, extensive efforts have been made to reconstruct ocean interior structures via surface information from satellite data [27][28][29][30][31]. Methods for retrieving ocean interior structures from sea surface data are often based on linear regression of single or multiple variables [32,33], statistical and dynamic methods [34], or machine learning methods [24,35,36]. Khedouri et al. [37] used purely statistical relationships to estimate OSTS from sea surface data, such as sea surface dynamic height and SST. Based on the empirical orthogonal function (EOF), DeWitt [38] presented a statistical method to develop the relationship between dynamic height and am-Remote Sens. 2022, 14, 3207 3 of 17 plitudes of the first two vertical modes. They pointed out that sea surface dynamic height is a key parameter to estimate ocean temperature structure. Later, Carnes et al. [23] retrieved the ocean vertical temperature and salinity structures based on a regression relationship between sea surface height and ocean vertical temperature structures. Chu et al. [28] developed a parametric model to retrieve OSTS from SST observations. Based on a "gravest empirical mode" (GEM), Watts et al. [39] found that the temperature and salinity structures between 150 and 3000 dbars in the south of Australia can be accurately reconstructed through GEM. Based on satellite altimetry data, Meijers et al. [34] used the GEM method to estimate ocean subsurface temperature and salinity structures in the Southern Ocean. Guinehut et al. [33] achieved good performance in the estimation of temperature and salinity structures based on satellite-derived data and in-situ measurements through linear regression methods. Su et al. [40] developed a geographically weighted regression model to retrieve the ocean subsurface temperature anomaly in the Indian Ocean, which outperforms the ordinary linear regression model. Yu et al. [41] proposed a ridge regression model to derive ocean thermal structure in the Bay of Bengal using satellite altimetry data and Argo in-situ data.
Recently, machine learning technology has been widely used in many fields to solve specific problems throughout academia, including fields of atmospheric and oceanic sciences [42]. Several early studies have used machine learning models to retrieve ocean interior structure from multisource surface data. For example, Ali et al. [24] adopted an artificial neural network (ANN) method to reconstruct the vertical temperature structure from sea surface data such as SST, SSS, wind stress, net radiation, and net heat flux. They suggested that the ANN approach has a good performance in estimating OSTS from sea surface data. Later, Wu et al. [35] used a self-organizing map (SOM) neural network to retrieve ocean subsurface temperature structures from sea surface data. Combined with EOF analysis, Chen et al. [43] used a SOM approach to retrieve the subsurface thermal structure via sea surface data in the Northwestern Pacific Ocean. Su et al. [44] have adopted a support vector machine (SVM) approach to retrieve the ocean vertical temperature structure anomaly in the Indian Ocean via sea surface data. This method has also been extended to estimate subsurface temperature anomalies on a global scale [45]. More recently, they tried other machine learning approaches such as RandomForest, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) to estimate the OSTS from sea surface data, and all of them have proven to be useful methods [26,46,47]. Han et al. [48] adopted a convolutional neural network (CNN) to retrieve OSTS in the Pacific using sea surface data, which yields a good performance. Lu et al. [49] adopted a method that combines a preclustering process and neural network approach to estimate subsurface thermal structures. They reported that the pre-clustered neural network method performs better than the same method without clustering. By using a stacked Long Short-Term Memory neural network (LSTM) method, a new model is proposed by Nardelli et al. [50] to estimate ocean hydrographic profiles in the North Atlantic Ocean. Recently, Meng et al. [51] adopted a CNN network to estimate subsurface and deep ocean temperature fields from satellite-derived sea surface data. They also proposed a scheme to improve the horizontal resolution of estimated temperature fields.
As suggested above, machine learning is a useful technology to estimate the subsurface thermal structure from sea surface data. The practical utility of estimates of OSTS derived from sea surface data will depend on the particular problem at hand. The performance of machine learning models mainly relies on algorithms and combinations of input parameters [31]. The optimal parameter combination used in machine learning models is found to depend on the kind of problem you wish to solve, the number of variables, the kind of model that would suit it best and so on. Most existing studies related to estimation of OSTS from sea surface data have focused on large-scale areas, such as global, Pacific Ocean, and Indian Ocean, but no related studies have been carried out in the SCS. Another thing to note here is that most of the previous studies were carried out using only individual machine learning models to estimate the OSTS, such as ANN, SOM, SVM, CNN, XGBoost, RandomForest, and LightGBM, which may not make the perfect estimation for a given dataset due to their limitations. In recent years, the multi-model ensemble approach has been used to improve the overall accuracy of the prediction process, which usually yields better performance than individual models [52][53][54]. At present, the utilizations of the ensemble machine learning approach in the estimation of OSTS is still relatively rare. To resolve limitations in a single method, in this study, we used an ensemble machine learning (Ens-ML) model to estimate OSTS in the SCS. Based on previous studies, we chose three stable machine learning models (XGBoost, RandomForest, and LightGBM) as our basic machine learning model, and then we used an ANN to combine the outputs from each model. The remainder of the paper is organized as follows: Data and methods are described in Section 2, results are present in Section 3, and summary and discussion follow in Section 4.

Data
The sea surface data used in this study are all from satellite observations, including SST, SSS, SSH and SSW (decomposed into eastward wind speed (USSW) and northward wind speed (VSSW) components). In addition, the geographic information including longitude (LON) and latitude (LAT) are also used in this study.
The monthly SST data are from the National Oceanic and Atmospheric Administration (NOAA) Optimum Interpolation Sea Surface Temperature (OISST) version 2 products, which combine AVHRR, AMSR and in-situ observations [55]. The SSS data are obtained from Soil Moisture and Ocean Salinity (SMOS) Level-3 SSS product [56]. The SSH data measured by altimeters are obtained from Archiving, Validation and Interpretation of Satellite Oceanographic (AVISO) data center of CNES (Center National d'Etudes Spatiales) [57]. The SSW data are from the Cross-Calibrated Multi-Platform (CCMP) wind velocity product, which apply a variational analysis method to combine data from different sources to synthesize high-resolution surface wind data [58]. The Argo dataset used in this study is the monthly gridded Argo dataset produced by Roemmich and Gilson [59]. It has a horizontal resolution of 1 • × 1 • and is interpolated to 58 depth levels (upper 2000 m) from January 2004 to present. In this study, the Argo data is used as labeled training data, as well as validating the accuracy and reliability of the results from models. Table 1 summaries all the data used in this study. The Study area is located between 5~23 • N latitude and 105~122 • E longitude. It should be noted that all the above data have been processed into monthly averages and interpolated onto a grid with a resolution of 0.5 • longitude and latitude grid with the same temporal and spatial coverage of the SCS. In addition, only the 2010-2019 time series for all variables have been considered.

Methods
Recently, the multi-model ensemble method has been widely studied, which has been demonstrated to yield more favorable results than individual machine learning models in many applications. In this study, we explore the use of a multi-model ensemble of machine learning approaches to estimate OSTS in the SCS. Among various ensemble learning methods, ANN is one of the most popular techniques to combine multiple models due to its effectiveness and simplicity. Generally, ensemble approaches are algorithm-independent, and impose no special restrictions on selection of the benchmark models. Based on the results of previous studies, we selected three stable machine learning models, XGBoost, RandomForest, and LightGBM as our basic model, all of which have been shown to have ability to estimate the OSTS [26,46,47,60].
In general, successful utilization of a machine learning application requires data under the categories: training data, validation data, and testing data. The training data are used to train the model. The validation data are used to validate the performance of the model during training and control overfitting almost simultaneously. In this study, all the surface input data from January 2010 to December 2018 were randomly split into two separate categories: 70% of the dataset for training and the remaining 30% for validation. As input, the sea surface data from January 2019 to December 2019 were used to predict the OSTS in the SCS. Out of the total 120 months of data, we used 75 months of data for training, 33 months of data for validation and 12 months of data for testing.
The flowchart of the proposed Ens-ML model for OSTS estimation in the SCS is shown in Figure 2. The Ens-ML model procedure involves two main steps: (1) training and validating the model, (2) estimating with model. The model was trained and validated using satellite-derived surface data and Argo labeled data before it was applied to reconstruct the OSTS. Based on previous studies and analyses, we selected SST, SSS, SSH, USSW, VSSW, and the geographical information, i.e., LON, and LAT, as independent input variables for machine learning models to estimate the OSTS in the SCS [24,26,47,51]. We first trained the three individual machine learning models (XGBoost, RandomForest, and LightGBM) to estimate the OSTS from sea surface data. After that, the outputs of the three machine learning models were combined using the ANN.

Methods
Recently, the multi-model ensemble method has been widely studied, which has been demonstrated to yield more favorable results than individual machine learning models in many applications. In this study, we explore the use of a multi-model ensemble of machine learning approaches to estimate OSTS in the SCS. Among various ensemble learning methods, ANN is one of the most popular techniques to combine multiple models due to its effectiveness and simplicity. Generally, ensemble approaches are algorithmindependent, and impose no special restrictions on selection of the benchmark models. Based on the results of previous studies, we selected three stable machine learning models, XGBoost, RandomForest, and LightGBM as our basic model, all of which have been shown to have ability to estimate the OSTS [26,46,47,60].
In general, successful utilization of a machine learning application requires data under the categories: training data, validation data, and testing data. The training data are used to train the model. The validation data are used to validate the performance of the model during training and control overfitting almost simultaneously. In this study, all the surface input data from January 2010 to December 2018 were randomly split into two separate categories: 70% of the dataset for training and the remaining 30% for validation. As input, the sea surface data from January 2019 to December 2019 were used to predict the OSTS in the SCS. Out of the total 120 months of data, we used 75 months of data for training, 33 months of data for validation and 12 months of data for testing.
The flowchart of the proposed Ens-ML model for OSTS estimation in the SCS is shown in Figure 2. The Ens-ML model procedure involves two main steps: (1) training and validating the model, (2) estimating with model. The model was trained and validated using satellite-derived surface data and Argo labeled data before it was applied to reconstruct the OSTS. Based on previous studies and analyses, we selected SST, SSS, SSH, USSW, VSSW, and the geographical information, i.e., LON, and LAT, as independent input variables for machine learning models to estimate the OSTS in the SCS [24,26,47,51]. We first trained the three individual machine learning models (XGBoost, RandomForest, and LightGBM) to estimate the OSTS from sea surface data. After that, the outputs of the three machine learning models were combined using the ANN. Model parameter optimization, an important part of machine learning, has a significant influence on the performance of the models. Some of the most important parameters have to be carefully optimized to achieve good predictive performance. In this study, we applied the grid search method to obtain an optimal combination of model parameters for the machine learning models. The optimal parameter combinations are shown in Table 2. Model parameter optimization, an important part of machine learning, has a significant influence on the performance of the models. Some of the most important parameters have to be carefully optimized to achieve good predictive performance. In this study, we applied the grid search method to obtain an optimal combination of model parameters for the machine learning models. The optimal parameter combinations are shown in Table 2. For model evaluation, the performance of the Ens-ML model was evaluated quantitatively using statistical measures, i.e., the root mean square error (RMSE hereafter) and coefficient of determination (R 2 hereafter). RMSE describes the specific values of the errors between the estimated values and observed values, while R 2 is used to assess how strong the linear relationship is between two variables. In the following section, we present a performance study on the Ens-ML model in the estimation of the OSTS in the SCS.

Results
With the optimal combination of parameters, we used three individual machine learning models (XGBoost, RandomForest, and LightGBM) and the Ens-ML model to separately estimate the OSTS in the SCS. First, we evaluated the performance and stability of the Ens-ML model by comparing with three individual models in terms of RMSE and R 2 , which were calculated at each depth levels based on the testing data set for 2019 over the SCS. The vertical distributions of RMSE and R 2 for the Ens-ML model and three individual models are shown in Figure 3.  For model evaluation, the performance of the Ens-ML model was evaluated quantitatively using statistical measures, i.e., the root mean square error (RMSE hereafter) and coefficient of determination (R 2 hereafter). RMSE describes the specific values of the errors between the estimated values and observed values, while R 2 is used to assess how strong the linear relationship is between two variables. In the following section, we present a performance study on the Ens-ML model in the estimation of the OSTS in the SCS.

Results
With the optimal combination of parameters, we used three individual machine learning models (XGBoost, RandomForest, and LightGBM) and the Ens-ML model to separately estimate the OSTS in the SCS. First, we evaluated the performance and stability of the Ens-ML model by comparing with three individual models in terms of RMSE and R 2 , which were calculated at each depth levels based on the testing data set for 2019 over the SCS. The vertical distributions of RMSE and R 2 for the Ens-ML model and three individual models are shown in Figure 3. The overall trends for RMSE and R 2 of the Ens-ML model and three individual machine learning models are as follows: the RMSE increases from the surface to maximum values near 70 m depth, and then decreases with depth; whereas an opposite trend is seen in R 2 , which decreases from the surface to minimum values near 70 m, and then increases from 70 to 300 m, finally decreases from 300 to 1000 m. Our results show that, at about 70 The overall trends for RMSE and R 2 of the Ens-ML model and three individual machine learning models are as follows: the RMSE increases from the surface to maximum values near 70 m depth, and then decreases with depth; whereas an opposite trend is seen in R 2 , which decreases from the surface to minimum values near 70 m, and then increases from 70 to 300 m, finally decreases from 300 to 1000 m. Our results show that, at about 70 m, the RMSE is relative higher than other depths, whereas R 2 is relative lower than other depths. The depth of 70 m is approximately the depth of the thermocline layer, where the temperature decreases rapidly with depth, which leads to the difficulty of estimation of the OSTS [61]. In contrast, the RMSE values of the Ens-ML model are obviously smaller than those of the three individual models at different depths, while the R 2 values of the Ens-ML model are larger than those of the three individual models (Figure 3). Obviously, the Ens-ML model yields the highest overall accuracy indicated by the lowest RMSE (0.31 • C, averaged over all depths) and highest R 2 (0.89) ( Table 3). The second and third are the LightGBM model and the XGBoost model, respectively. The RandomForest model yields the lowest estimation accuracy. It should be pointed out that the gaps of R 2 below 450 m between the Ens-ML model and three individual models are larger than the upper layer. This indicates that the Ens-ML model, when compared with the three individual models, tends to perform better in the lower layers than in the upper layers. Another important issue in the estimation is the selection of input variables for models. Early study has suggested that geographic information can help to improve estimation accuracy of the OSTS [47]. To examine the influence of geographic information on the estimation of the OSTS in the SCS, we set up two groups of experiments (cases 1 and 2) with different input parameter combinations to examine whether geographic information has an influence on the performance of the Ens-ML model (Table 4). In the first group (case 1), we select SST, SSS, SSH, USSW, and VSSW as inputs. In addition to these parameters, we also include LON and LAT as input parameters in the second group (case 2). It can be clearly seen that the seven-parameter Ens-ML model (case 2) yields a significantly lower RMSE value than that of the five-parameter model (case 1) at all depths, and higher R 2 values ( Figure 4). The vertically averaged RMSE and R 2 of the Ens-ML model with the sevenparameter model are 0.31 • C and 0.89, respectively. While for the five-parameter model, the vertically averaged RMSE and R 2 are 0.69 • C and 0.79, respectively. It should be noted that a significant difference in RMSE between the 7-parameter model and 5-parameter model was observed in the depth range of 250 to 600 m. For the 5-parameter model, the RMSE tends to increase from 250 to about 370 m, and then decrease with increasing depth. While for the 7-parameter model, the RMSE tends to decrease with increasing depth. This is an interesting question and requires further study. The results suggest that adding LON and LAT improves the estimation accuracy of the Ens-ML model in the SCS.    Next, we evaluate the performance of the seven-parameter Ens-ML model from different aspects in detail. First, some results of the Ens-ML-estimated OSTS at 50, 70, 100, 300, 500, and 1000 m depths in 2019 are shown in Figure 5. The Argo-derived OSTS at the same depths are used for evaluating the Ens-ML estimated OSTS and validate the estimated results by their differences, which are obtained from the Argo data minus Ens-ML model estimated data. We find good agreement between the Ens-ML model estimated OSTS and Argo OSTS maps at different depths. Most of the significant temperature features in the interior SCS can be effectively predicted from the sea surface data via the ensemble machine learning approach. For example, both datasets show that the temperature decreases with increasing depth. At 50 m depth, the temperature range is 22.5~27.1 °C, with a front trending from the southwest to the northeast separating the basin into two roughly equal water masses-cool water to the northwest, warm water to the southeast. The minimum temperature is found to the east of Vietnam about 111~113°E, 13~16°N, which coincides with the East Vietnam eddy [14]. The temperature difference between the Argo data and Ens-ML model estimation at 50 m depth only varies from 0.38 °C in the south to −0.21 °C in the north ( Figure 5). At 70 m, the temperature varies from 25.0 °C in the northeast to 21.5 °C in the west of the SCS, which is also oriented northeast-southwest, with a positive temperature gradient toward southeastward. The spatial distribution of temperature difference between the Argo observation and model estimation at 70 m depth is similar to that of 50 m depth with a relatively large range from −0.43 °C to 0.40 °C. At 100 m depth, the temperature varies from 21.5 °C in the east to 18.5 °C in the west of the SCS with a positive temperature gradient toward southeast. The maximum difference is observed at about 112~116°E, 11~14°N, with a range of 0.21-0.32 °C. The results obtained show that the temperature tends to be stable with increasing depth. At 300 m depth, the temperature varies from 13.6 °C to 15.2 °C, while for the 500 and 1000 m depths, the temperature only varies from 7.6 °C to 8.6 °C and from 3.9 °C to 5.2 °C, respectively. Below 500 m depth, the temperature differences between the Argo-derived data and Ens-ML model estimation become relatively small with a range of −0.05-0.06 °C. Comparisons Next, we evaluate the performance of the seven-parameter Ens-ML model from different aspects in detail. First, some results of the Ens-ML-estimated OSTS at 50, 70, 100, 300, 500, and 1000 m depths in 2019 are shown in Figure 5. The Argo-derived OSTS at the same depths are used for evaluating the Ens-ML estimated OSTS and validate the estimated results by their differences, which are obtained from the Argo data minus Ens-ML model estimated data. We find good agreement between the Ens-ML model estimated OSTS and Argo OSTS maps at different depths. Most of the significant temperature features in the interior SCS can be effectively predicted from the sea surface data via the ensemble machine learning approach. For example, both datasets show that the temperature decreases with increasing depth. At 50 m depth, the temperature range is 22.5~27.1 • C, with a front trending from the southwest to the northeast separating the basin into two roughly equal water masses-cool water to the northwest, warm water to the southeast. The minimum temperature is found to the east of Vietnam about 111~113 • E, 13~16 • N, which coincides with the East Vietnam eddy [14]. The temperature difference between the Argo data and Ens-ML model estimation at 50 m depth only varies from 0.38 • C in the south to −0.21 • C in the north ( Figure 5). At 70 m, the temperature varies from 25.0 • C in the northeast to 21.5 • C in the west of the SCS, which is also oriented northeast-southwest, with a positive temperature gradient toward southeastward. The spatial distribution of temperature difference between the Argo observation and model estimation at 70 m depth is similar to that of 50 m depth with a relatively large range from −0.43 • C to 0.40 • C. At 100 m depth, the temperature varies from 21.5 • C in the east to 18.5 • C in the west of the SCS with a positive temperature gradient toward southeast. The maximum difference is observed at about 112~116 • E, 11~14 • N, with a range of 0.21-0.32 • C. The results obtained show that the temperature tends to be stable with increasing depth. At 300 m depth, the temperature varies from 13.6 • C to 15.2 • C, while for the 500 and 1000 m depths, the temperature only varies from 7.6 • C to 8.6 • C and from 3.9 • C to 5.2 • C, respectively. Below 500 m depth, the temperature differences between the Argo-derived data and Ens-ML model estimation become relatively small with a range of −0.05-0.06 • C. Comparisons show that the maximum temperature differences between the Argo observation and model estimation occur at 50~70 m depth. Our results demonstrate that the proposed Ens-ML model presents a good performance in the estimation of OSTS in the SCS. A more detailed assessment of the Ens-ML model performance is given next.
show that the maximum temperature differences between the Argo observation and model estimation occur at 50~70 m depth. Our results demonstrate that the proposed Ens-ML model presents a good performance in the estimation of OSTS in the SCS. A more detailed assessment of the Ens-ML model performance is given next. Here, we quantitatively evaluate the accuracy of the OSTS estimation by the Ens-ML model using RMSE and R 2 as the performance measures ( Table 5). The low RMSE and high R 2 mean more reasonable estimation accuracy. As shown in Table 5, the Ens-ML model shows good performances at all depths, although the performance measures vary with depth. Although the estimation accuracy shows some differences at different depths, Here, we quantitatively evaluate the accuracy of the OSTS estimation by the Ens-ML model using RMSE and R 2 as the performance measures ( Table 5). The low RMSE and high R 2 mean more reasonable estimation accuracy. As shown in Table 5, the Ens-ML model shows good performances at all depths, although the performance measures vary with depth. Although the estimation accuracy shows some differences at different depths, the Ens-ML model is generally satisfactory. This also indicates that the Ens-ML model achieves a good performance for OSTS estimation in the SCS.  18.5 • N) were selected to further evaluate the vertical performance of the Ens-ML model ( Figure 6). Zonal section A is an important section to investigate the influence of Kuroshio intrusion on the SCS [11]. Meridional section B, located between the tropics and subtropics, is an important section for studying meridional ocean transport in the SCS. Figure 7 shows the comparison of OSTS from Argo data and the Ens-ML model estimation along these two sections. It can be clearly seen that the spatial distribution of the Ens-ML model estimated OSTS exhibits good agreement with the Argo observations. Most of the main observed features of the vertical temperature the Ens-ML model is generally satisfactory. This also indicates that the Ens-ML model achieves a good performance for OSTS estimation in the SCS. In addition, two sections (zonal section A along 18°N from 112°E to 118°E, and meridional section B along 113°E from 5.5°N to 18.5°N) were selected to further evaluate the vertical performance of the Ens-ML model ( Figure 6). Zonal section A is an important section to investigate the influence of Kuroshio intrusion on the SCS [11]. Meridional section B, located between the tropics and subtropics, is an important section for studying meridional ocean transport in the SCS. Figure 7 shows the comparison of OSTS from Argo data and the Ens-ML model estimation along these two sections. It can be clearly seen that the spatial distribution of the Ens-ML model estimated OSTS exhibits good agreement with the Argo observations. Most of the main observed features of the vertical temperature structure are well reproduced by the Ens-ML model. In the upper 200 m, the temperature change with depth is large, varying from 27 °C at the surface to 12 °C at about 200 m. Blow 200 m, the temperature only changes slightly varying from 10 °C at 300 m to 4 °C at 1000 m with the vertical gradient decreasing with depth.   For the zonal section (section A), in the upper 50 m, the estimated temperature value is slightly greater than the Argo value, with differences (Argo minus Ens-ML) up to −0.30 °C. The maximum difference (exceeding 0.35 °C) occurs between about 100 to 150 m, where the Ens-ML model-estimated temperature is relatively smaller than the Argo values. For the meridional section (section B), relatively large temperature differences (exceeding 0.40 °C) are present in the depth range of 50 to 200 m between 6°N to 15°N, with Argo values higher than the estimated temperature values. Vertically averaged temperature errors for the zonal and meridional sections over 1000 m are about 0.05 °C and 0.06 °C, respectively. As the statistical analysis suggested, more than 98% of the section's points are within ±0.10 °C, and more than 99% of the section points are within ±0.20 °C for the estimation.
A comparison between the Ens-ML model estimated temperature profiles and Argo profiles in 2019 are also presented in this study. Given the characteristics of bathymetry and Argo distributions, we selected four 2° × 2° boxes designated A, B, C, and D in Figure  6 to further evaluate the performance of the Ens-ML model in the SCS. Box A (116~118°E and 19~21°N) is along the continental slope south of China. Box B (117~119°E and 16~18°N) coincides with the West Luzon eddy. Box C (114~116°E and 11~13°N) is in the central southern SCS. Box D (110.5~112.5°E and 15~17°N) is identical with the East Vietnam eddy. Figure 8 shows the temperature profiles in the four boxes from the Ens-ML model estimation and the Argo observations. We find a good agreement between the Ens-   Figure 8 shows the temperature profiles in the four boxes from the Ens-ML model estimation and the Argo observations. We find a good agreement between the Ens-ML model-estimated profiles and Argo observed profiles. The RMSE between the averaged profiles from the Ens-ML model estimation and Argo observation are 0.17 • C, 0.40 • C, 0.61 • C, and 0.53 • C in boxes A, B, C, and D, respectively ( Figure 8). Obviously, the Ens-ML model accurately reproduces the temperature profiles seen in the Argo observational data.
Scatter plots of temperature from the Argo observations and the Ens-ML model estimation show that most of the points are more or less evenly distributed around the line of equal value, giving a low RMSE and a high R 2 (Figure 9). These results also suggest that the estimated results of the Ens-ML model are reliable and performed well in the OSTS estimation in the upper 1000 m. Ens-ML model accurately reproduces the temperature profiles seen in the Argo observational data.  As discussed above, the Ens-ML model has been shown to have good performance in the yearly mean OSTS estimation over the SCS. A question then arises, how about its performance in different seasons. Next, we quantitatively assess the performance of the As discussed above, the Ens-ML model has been shown to have good performance in the yearly mean OSTS estimation over the SCS. A question then arises, how about its performance in different seasons. Next, we quantitatively assess the performance of the Ens-ML model for different seasons. February, May, August, and November of 2019 are selected to represent the winter, spring, summer, and fall seasons of the year. To improve the comparability of the model accuracy at different depths, we normalized the RMSE values (NRMSE is the RMSE divided by the standard deviation of the Argo temperature at that depth). Vertical distributions of the NRMSE and R 2 at different depths for different seasons are shown in Figure 10. On the whole, the Ens-ML performance is clear with relatively higher NRMSE and smaller R 2 in the upper 100 m. The NRMSEs in different seasons show first a downtrend and then an uptrend with a turning point occurring at about 200 or 300 m depths. The trend features of R 2 are also unstable and fluctuating, which show an uptrend from the surface to 300 m, and then a downtrend to the 1000 m depth. These suggest that the accuracy of the Ens-ML model generally decreases with increase of depth from 300 m. The reason for this variability may be that, in the deeper layer (below 300 m), the ocean phenomena have weaker surface manifestation, which are harder to interpret from sea surface information. In addition, relatively high NRMSE values in the upper 100 m might be related to the complex dynamic processes of the upper ocean. It is clearly seen that the prediction accuracy of the Ens-ML model varies with seasons. The vertical average NRMSE (R 2 ) values are 0.26 °C (0.92), 0.30 °C (0.88), 0.37 °C (0.86), and 0.26 °C (0.91) for February, May, August, and November, respectively. The highest accuracy appears in February, and the lowest accuracy appears in August. The OSTS in winter and autumn tends to be stable, and the magnitude of OSTS changes is small. The accuracy of the Ens-ML model is relatively high for all four seasons, indicating that the Ens-ML model has a good performance in the estimation of OSTS for different seasons.

Discussion and Conclusions
Ocean temperature has a large influence on marine environment and climate, with direct and indirect impacts on both natural and human activities, from the ecosystem system to the fishing industry. Hence, accurately estimating ocean subsurface thermal structures is of great importance for marine environmental monitoring, marine disaster prevention and climate changes. At present, machine learning models have been widely used to estimate ocean thermal structure. However, until now, most studies of OSTS estimation are based on single models, not taking advantage of the potential improvement in estimation resulting from the use of an ensemble of models. To our knowledge, there have been no studies reported to use machine learning to estimate the OSTS in the SCS. Therefore, It is clearly seen that the prediction accuracy of the Ens-ML model varies with seasons. The vertical average NRMSE (R 2 ) values are 0.26 • C (0.92), 0.30 • C (0.88), 0.37 • C (0.86), and 0.26 • C (0.91) for February, May, August, and November, respectively. The highest accuracy appears in February, and the lowest accuracy appears in August. The OSTS in winter and autumn tends to be stable, and the magnitude of OSTS changes is small. The accuracy of the Ens-ML model is relatively high for all four seasons, indicating that the Ens-ML model has a good performance in the estimation of OSTS for different seasons.

Discussion and Conclusions
Ocean temperature has a large influence on marine environment and climate, with direct and indirect impacts on both natural and human activities, from the ecosystem system to the fishing industry. Hence, accurately estimating ocean subsurface thermal structures is of great importance for marine environmental monitoring, marine disaster prevention and climate changes. At present, machine learning models have been widely used to estimate ocean thermal structure. However, until now, most studies of OSTS estimation are based on single models, not taking advantage of the potential improvement in estimation resulting from the use of an ensemble of models. To our knowledge, there have been no studies reported to use machine learning to estimate the OSTS in the SCS. Therefore, in this study, we introduce an ensemble machine learning model based on an ANN, which combines outputs from three individual models (XGBoost model, RandomForest model, and LightGBM model) to estimate the OSTS in the SCS.
The Ens-ML model was used to estimate the OSTS in the SCS with sea surface data as input and Argo observations for labeling. The performance of the Ens-ML model was examined in comparison with three individual models. Our results show that the Ens-ML model has the highest prediction accuracy; its average RMSE is 0.31 • C, and its average R 2 is 0.89. The RMSE (R 2 ) values of the Ens-ML model are much smaller (larger) than those of the three individual models, indicating that the Ens-ML model performs better than the individual models. In addition to SSH, SST, SST, and SSW, geographic information (longitude and latitude) are two necessary parameters for accurately estimating the OSTS in the SCS, significantly improving the estimation accuracy of the Ens-ML model. Spatial distribution of the OSTS estimated by the Ens-ML model agrees well with the Argo observation, most of the observed OSTS features can be effectively detected from sea surface data via the Ens-ML model. The accuracy of the Ens-ML model for OSTS estimation generally decreases with increasing depth from 300 m, which is likely due to stable ocean stratification in the deeper ocean. The estimation accuracy of the Ens-ML model not only varies significantly with depth, but also varies with seasons. The vertical average NRMSE (R 2 ) for February, May, August, and November are 0.26 • C (0.92), 0.30 • C (0.88), 0.37 • C (0.86), and 0.26 • C (0.91), respectively. Obviously, the average NRMSE (R 2 ) in February is less (greater) than in May, August and November, suggesting that the estimation accuracy of the Ens-ML model in February is the best. Overall, the performance of the Ens-ML model is good for all four seasons, indicating that the Ens-ML model has a good seasonal applicability for the OSTS estimation in the SCS. This study demonstrates that a satellitebased ensemble machine learning approach to estimate OSTS can be accurate and reliable across different regions and times.
This study proposed and evaluated a multi-model ensemble approach to extend the satellite-derived oceanic data from the sea surface to subsurface. Our results show that this ensemble approach is robust and effective at estimating the OSTS in the SCS. However, it should be noted, the Ens-ML model, as a statistical method, enables us to estimate OSTS within the ranges of input parameters; this may lead to an underestimation of some extreme events. The reason for this is that the labeled OSTS of the Ens-ML model is in the range of input data for labeling. For any input data, the Ens-ML model can only estimate the OSTS that lies in the range of the input data used to train it. For those extreme events with OSTS exceeding the range of labeling data, the Ens-ML model may underestimate the temperature.
To extend this multi-model ensemble approach to the global scale, it is necessary to validate the model with different regions and data, and with better machine learning models. Besides different machine learning techniques, another important factor affecting estimation accuracy is the selection of input variables. It should be noted that the input variables used in this study might not be optimal combination for model. For example, previous studies have shown that wind stress, instead of wind speed, has a more significant impact on ocean temperature structure [62][63][64]. It would be interesting in future work to examine the different effects of wind stress and wind speed on OSTS estimation using a machine learning model. Once the OSTS has been accurately estimated, it can be used for practical applications, such as marine disaster prediction and acoustic propagation estimations [65]. Besides, application of the proposed multi-model ensemble approach to the estimation of other critical oceanic parameters, such as ocean subsurface salinity structure, oceanic density fields, and other oceanic variables, can be investigated as well. These are left for future study.