Reconstructed 3-D Ocean Temperature Derived from Remotely Sensed Sea Surface Measurements for Mixed Layer Depth Analysis

: The mixed layer depth (MLD) is generally estimated using in situ or model data. However, MLD analyses have limitations due to the sparse resolution of the observed data. Therefore, this study reconstructs three-dimensional (3D) ocean thermal structures using only satellite sea surface measurements for a higher spatial and longer temporal resolution than that of Argo and diagnoses the decadal variation of global MLD variability. To simulate the ocean thermal structures, the relationship between the ocean subsurface temperature and the sea surface fields was computed based on gridded Argo data. Based on this relationship, high spatial resolution and extended periods of satellite-derived altimeter, sea surface temperature (SST), and wind stress data were used to estimate the 3D ocean thermal structures with 0.25° spatial resolution and 26 standard depth levels (5–2000 m) for 24 years (1993–2016). Then, the MLD was calculated using a temperature threshold method (∆T = 0.2 °C)and correlated reasonably well (>0.9) with other MLD datasets. The extended 24-year data enabled us to analyze the decadal variability of the MLD. The global linear trend of the 24-year MLD is −0.110 m yr −1 ; however, from 1998 to 2012, the linear trend is −0.003 m yr −1 which is an order of magnitude smaller than that of other periods and corresponds to a global warming hiatus period. Via comparisons between the trends of the SST anomalies and the MLD anomalies, we tracked how the MLD trend changes in response to the global warming hiatus.


Introduction
The ocean mixed layer is located in the upper layer and has quasi-homogeneous temperature and density due to vertical mixing [1][2][3]. The thickness of the mixed layer (or mixed layer depth)is a proxy for the ocean heat content associated with the upper layer thermodynamics. More specifically, the mixed layer depth (MLD) contributes to the heat transport from the surface to the ocean interior and regulates the ocean-atmosphere heat flux variability [4][5][6]. Since the ocean is a large heat reservoir on Earth, it is important to note the heat transfer from the ocean surface to the deep layer [7]. Our reconstructed 3D subsurface temperature data and MLD calculated from this will provide another good source of warming processes. The MLD is determined by the vigorous turbulent mixing induced by the convective mechanism of the buoyancy flux and wind-induced mechanical stirring [1]. In addition, the MLD plays a key role in biogeochemical cycles, contributing to the export of large amounts of carbon by isolating phytoplankton cells and other particles inside the mixed layer [8][9][10][11]. The ocean stores 50 times more carbon dioxide than the atmosphere and it is well known that it absorbs about 30% of the anthropogenic carbon dioxide released into the atmosphere [12]. The atmosphere and ocean continue to release and absorb carbon through an interface between the two. Therefore, ongoing research is needed on the variation of carbon in the ocean upper layer. In the recent paper, the relationship between MLD and fCO2 was investigated by using a machine learning method, and the results showed that fCO2, which is fugacity of CO2, increases more in fall than in spring because of a higher sea surface temperature (SST) and deeper MLD [13]. Accordingly, accurate MLD measurements are even more important to understand the global carbon cycle. Therefore, it is essential to understand the MLD variability not only because it is directly linked to air-sea interactions but also because it has a larger heat capacity than the atmosphere [14].
For the determination of the MLD, the conventional methods are based on a specific criterion such as a temperature and density threshold from in situ profiling data such as Argo floats. However, it is very limiting to understand the spatial and temporal variability of the MLD on a global scale only using Argo-derived profiling data because the floats do not stay in a particular area. For this reason, on a regional scale, data on grids with insufficient floats can cause inappropriate information and thus, grid-averaged data from numerous Argo floats have been used widely in global scale studies [15,16]. The gridded data is constructed to compensate for the lack of spatial and temporal resolution of the Argo float data for general applications in MLD analyses [17]. Even though it has widespread use, this gridded dataset has an insufficient spatial resolution (>1°) to diagnose smalland mesoscale changes in the MLD. Moreover, the uncertainty due to the sparse horizontal density of the Argo floats remains to be resolved. In addition, within the period of the gridded Argo data (2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015), there are considerable restrictions on studying the decadal variation in the interior ocean.
To resolve such resolution problems, many studies have attempted to estimate the ocean interior temperature from satellite-derived data using various approaches such as parametric models [18], neural networks [19,20], and statistical approaches [21][22][23]. In addition, the Improved Synthetic Ocean Profile (ISOP), which is a statistical method (extending the surface observation to the subsurface and providing an initial subsurface condition for numerical studies) that has been widely used in operational ocean models such as Global Ocean Forecasting System (GOFS) [24]. The ISOP is a method developed by the US Navy to construct a vertical profile from the remotely observed sea surface height (SSH) and sea surface temperature (SST) based on statistical relationships. Recently, new methods based on Artificial Intelligence (AI)/machine learning methods such as support vector machine (SVM) [25], XGBoost [26], random forest [27], and clustering-neural network [28] are used for retrieving subsurface temperature. Furthermore, Su et al. [29] used geospatial modeling (GWR) to estimate subsurface temperature, which has especially good performance with low RMSE. In particular, Guinehut et al. [23] used statistical relationships between sea subsurface temperature and sea surface fields based on in situ observations, reanalysis data, and altimetry data. Their study has been based primarily on the Analyse Reconstruction et Indicateurs de la Variabilité Océanique (ARIVO) climatology, computed by the optimal interpolation using the temperature and salinity profiles for short-time periods from 2002 to 2007. In their study, the sea level anomaly (SLA), sea surface temperature anomaly (SSTA), and ARIVO monthly temperature fields were selected as independent variables for multiple linear regression (MLR) analysis, which is the primary approach for calculating the ocean subsurface temperature and salinity. The anomaly data, used as independent variables, were calculated by removing the ARIVO monthly climatology from the satellite data.
Even though this study applies a similar approach, i.e., MLR, which Guinehut et al. [23] used to calculate the ocean interior thermal structures, our study differs in the following ways. First, the periods of climatology used to calculate the anomaly of the independent variables in the regression model are different. Guinehut et al. [23] used the ARIVO climatology, which is a 6-year climatology from 2002 to 2007. However, in this study, we calculated the anomalies of the independent variables in the MLR model using a climatology based on 11 years (2005-2015) of data, which is a slightly more extended period. By using a relatively more extended period of climatology data to estimate the anomalies, we could reduce the uncertainties and, in turn, increase the accuracy of the MLR model. Second, our data is different in its input data. Guinehut et al. [23] used in situ T, S profiles and satellite altimeter SLA and SST data as variables. We added a variable for wind stress and designed the equation to consider not only the ocean itself but also the physical change in MLD caused by wind. Most previous methods for estimating the subsurface oceanic properties depends on machine learning and neural network [27,28]. However, this study focuses on the fact that the interiors of ocean temperature structures can be easily estimated based on the high correlation between the sea surface and the subsurface data.
The variability in the MLD has been extensively studied on various spatial scales. Kara et al. [30] examined the climatological MLD with respect to the spatial and monthly variability of the global ocean. They demonstrated that a strong seasonality in the MLD was observed at high latitudes and in the subtropical Pacific Ocean, as well as in a very shoal mixed layer in the Antarctic at all times and in the maximum mixed layer in the North Atlantic Ocean in winter. Carton et al. [5] documented interannual and longer changes in the mixed layer properties on a global scale during the 45 years from 1960 to 2004 using the World Ocean Atlas 2005 global ocean dataset. They also revealed a winter-spring MLD variability related to climate indices in each ocean. As such, the pattern of MLD variability in the central subtropics of the Pacific Ocean agrees with the intensification of the surface winds is possibly responsible for the sea surface temperature (SST) change related to the Pacific Decadal Oscillation. Subsequently, the mixed layer was deeper (50-100 m) in the North Atlantic subpolar gyre over the 45 years. In addition, many other studies on the MLD variability have also been reported in the Southern Ocean [6,31,32], the North Pacific [1,33,34], the Atlantic [35], the equatorial Pacific [36], and the Indian Ocean [37]. However, most studies have been based on model or reanalysis data, and analyses based on observation data have coarse resolutions that could cause discrepancies in regions where mesoscale variabilities predominate.
In this study, we first tried to construct observation-based 3D ocean temperature data for an extended period (1993-2016) with a relatively fine spatial resolution (about 25km). The construction was based on the MLR with satellite and Argo data to analyze the MLD variability on a global scale. After the construction, we diagnosed the accuracy of the estimated 3D ocean temperature data and then performed an analysis on the trend and decadal variability of the global MLD. This paper is arranged as follows. The gridded Argo data and satellite-derived data used for reconstructing the 3D thermal structure are described in Section 2. Section 3 presents the methodology used to estimate the ocean thermal structure via MLR and, therefore, to determine the MLD. In addition, the ensemble empirical mode decomposition (EEMD) used to analyze the decadal variability and the secular trend of the MLD is introduced. Section 4 includes a sensitivity test of different input data to the MLR model and the evaluation of the reconstructed 3D subsurface temperature. A description of the global MLD with the decadal variation analysis is then followed by the conclusions in Section 5.

Materials
The monthly gridded Argo data provided by the International Pacific Research Center (IPRC) from January 2005 to 2016 was primarily used in this study (http://apdrc.soest.hawaii.edu/projects/Argo/data/gridded/On_standard_levels/index-1.html). The global array of approximately 3800 Argo floats are distributed roughly every 3° (300 km) and is a significant component of the ocean observing system [38]. The delayed mode float data are interpolated to a regular (1°) spatial resolution after quality control. In particular, the temperature, salinity, and absolute dynamic height (ADH) data were used in this analysis. The ADH is computed using the temperature and salinity data from Argo and the SLA from the altimeter data [39,40]. We assume that the ADH anomaly at the sea surface is equal to the SLA, according to IPRC-Argo (http://apdrc.soest.hawaii.edu/projects/Argo/data/Documentation/gridded-var.pdf); therefore, the satellite-observed SLA could be used to replace it. The gridded Argo data (temperature and ADH) have global coverage with 27 standard depth levels [41] from the surface down to 2000 m. To obtain 11 years (2005-2015) of anomaly data (i.e., SSTA, the subsurface temperature anomaly (STA), and SLA), the monthly climatology of the gridded Argo data for the same span was subtracted. The monthly STA and SLA data for 2016 calculated with the same climatology were then used to validate the result of the reconstructed thermal structure. It is worth to note that the depth of surface Argo measurements is 2.5m, which is considered as sea surface in this study so it is not like the remotesensed skin-layer temperature.
The satellite-derived surface data were used in the MLR model to predict the subsurface temperature at better spatial resolutions and more extended temporal coverage than the gridded Argo data. The data for the SST and SLA data were obtained from the Optimal Interpolation Sea Surface Temperature (OISST) and the altimetry distributed by the National Oceanic and Atmospheric Administration (NOAA) and the Copernicus marine environment monitoring service (CMEMS), respectively. The monthly OISST is a combined dataset of Advanced Very High-Resolution Radiometer (AVHRR), ships, and buoys and has a 1/4° uniform distribution from 1981 to the present [42]. The monthly SLA was merged with altimetry satellite datasets and calculated from SSH minus the 20-year mean SSH. This altimetry-merged dataset has a 1/4° horizontal spatial resolution and is available from 1993. Furthermore, the monthly mean wind stress data were also used for the MLR model. The data were obtained from the ECMWF (European Centre for Medium-Range Weather Forecasts) ERA-Interim data (https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/erainterim). In order to use it for the MLR model, the wind data were made as wind stress anomaly (WSA), as SLA and SSTA were made. Although there are some uncertainties of using three inputs for the MLR method, the estimated subsurface temperature is well predicted as shown in Table 1.
In addition, we used TAO/TRITON buoy data to verify the subsurface temperature data during the predicted non-Argo period. The TAO/TRITON system is an array of moored buoys in the tropical Pacific Ocean provided by the National Oceanic and Atmospheric Administration (NOAA) Pacific Marine Environment Laboratory [43]. The subsurface temperature data from two stations (St1: Longitude 165° E and Latitude 8° N; St2: Longitude 155° W and Latitude 8°) were randomly selected among the stations observed from January 1993 to December 2016.
The MLD reanalysis datasets were used to evaluate the quantitative accuracy of the variability in the calculated MLD in this study. The monthly MLD datasets calculated with temperature criterion of 0.2 ℃ are distributed from the European Centre for Medium-Range Weather Forecasts (ECMWF) Ocean Reanalysis System 4 (ORAS4) [44] and the Simple Ocean Data Assimilation ocean/sea-ice reanalysis (SODA) [45].

Multiple Linear Regression
In this study, we estimated the subsurface temperature using MLR based solely on the sea surface data. Regression analysis is one of the most common statistical methods used to explain the relationship between independent variables and dependent variables [46], and based on this relationship, a dependent variable can be predicted. Of the various regression analysis methods, the MLR analysis is most accurate when used to predict dependent variables using a number of independent variables [47]. The MLR model is used to fit the relationship between the quantitative dependent variable Y and the independent variables X , X , ⋯ , X . The basic concept of MLR with independent variables is as follows: where the β0 is an intercept and β1, β2, …, βi are the regression coefficients. Here, to estimate the STA, we applied the basic formula of MLR (Equation (1)) on each grid (x, y), time (t), and depth (z). The location is not given in the equation for simplicity.
The STA and each coefficient are functions of time and depth, while SSTA, SLA, and WSA are functions of time only. The regression coefficients were used to create a monthly climatology. These coefficients were calculated via the least square method (LSM), which finds the minimum of the sum of the squares of the residuals between the regression line and the observations such that: where E and n represent the sum of the squared residuals and the number of the month, respectively. The LSM finds its optimum when E is at a minimum: Equation (4) can also be represented by the following matrix equation.
Using the MLR model, the satellite-based subsurface temperature was estimated via three steps. First, the regression coefficients (β , β , β , β ) were calculated based on the SSTA, SLA, WSA, and STA of Argo. Second, prior to applying the satellite-based surface data at a high spatial resolution to the model, the estimated regression coefficients were interpolated using a bi-linear interpolation [48] to have the same resolution of the satellite data as a 0.25° grid. Third, the STA was calculated by replacing the satellite surface data (Equation (2)) (OISST, altimetry, and WSA) from 1993 to 2016. The Argo temperature climatology from 2005 to 2015 was spatially interpolated into a 0.25° grid and then combined with the estimated STA to produce the subsurface temperature.
In the MLR model, the selection of the independent variable is critical. In this study, SSTA, SLA, and WSA were selected to estimate the temperature below the surface layer. This approach is based on the fact that the SST controls the thermal structure of the ocean in the first 100 m [23]. In addition, the spatial distributions of STA at a depth of 100 m and the sea surface height anomaly are similar. Figure 1 shows an example of these distributions in December 2015. The negative temperature anomaly values are located in the western Pacific equatorial region, and the positive temperature anomaly values are located in the eastern Pacific equatorial region and the Indian Ocean. The correlation coefficient between STA and SLA in Figure 2 demonstrates their strong relationship over most of the global ocean. In particular, the Pacific equatorial region, which has a correlation coefficient of more than 0.9, is likely an El Niño-Southern Oscillation (ENSO)-like signal. The life cycles of both types of ENSO (El Niño and La Niña) are dominantly contributed by thermocline feedback representing the vertical advection process of STAs [49][50][51][52]. Therefore, El Niño and La Niña defined with SSTA, are also closely related to the SLA variation, which indicates the variability of the upper-ocean heat content. Consequently, this extreme SLA change in the tropics could be inferred to occur with changes in the heat budget of the mixed layer. Indeed, in the region where this mesoscale phenomenon is predominant, lower correlation coefficients are seen, such as in the western boundary regions and the Southern Ocean. The thermal expansion derived from the STA could have a significant influence on the SLA because more than two-thirds of the sea level change worldwide would be controlled by thermal expansion [53]. Therefore, there were many attempts, including that of Guinehut et al. [23], to estimate the ocean thermal structure based on significant correlations between the SST and the corresponding SLA (e.g., [18,19,21,[54][55][56]).

Mixed Layer Depth Estimation
Previous studies have adopted several approaches to estimate the MLD. The MLD calculations using temperature or density thresholds are the simplest and widely used in various studies on the MLD on a regional or global scale [57]. The threshold most commonly used is 0.2 °C for the temperature and 0.03 kg m −3 for the density. In this study, the MLD was calculated based on the threshold value of the temperature difference from the surface value to a depth of 10 m, with ∆T = 0.2 °C as the criterion [58].

Ensemble Empirical Mode Decomposition
EEMD was used in this study to investigate the decadal variability and the secular trend of the MLD on a global scale. EEMD is a method of sifting time series data into definable timescale signals that were developed by Wu and Huang [59]. It was developed from Empirical Mode Decomposition (EMD), introduced by Huang [60], which solves the mode mixing problem by adding a white noise series to the original data. The decomposing process is performed by identifying all the local maximum and minimum envelopes in the data and then fitting the adjacent extreme envelopes to a cubic spline. The first intrinsic mode function (IMF) is the mean of the maximum and minimum envelopes. The second IMF is found by repeating the above process with the first IMF signal subtracted from the original data. The IMFs are extracted until the leftover oscillation does not have a time-frequency signal, and the remaining nonstationary oscillation becomes the residual. Each IMF has a unique time signal, which runs from low to high frequency in turn. The number of IMFs (n) is determined by the logarithmic number of the time series data (N), i.e., n=log2N [59]. The EEMD process was applied to the MLD anomaly (MLDA) data for 288 months (1993-2016) resulting in eight IMFs and a monotonic residual. The 6-8th IMFs, which have more than a 10-year period, and the secular trend (residual) were combined to analyze the decadal variability from the time-series of the MLDA.

Sensitivity Test of the Model
Prior to the calculation of the STA by applying the previously discussed MLR model (Equation (2)) to the satellite-derived data, a sensitivity test was conducted for the independent variables (not shown). To set the MLR model to quantify the relationship between the subsurface and surface data, both the dependent variable (STA) and independent variables (SLA, SSTA and sea surface salinity anomaly (SSSA)) were derived from gridded Argo data from 2005 to 2015. The sensitivity test was performed by comparing the monthly STA estimated from the Argo-based surface datasets to the original Argo data in 2016. The root means square error (RMSE) between the STA results at entire depths (Levitus level) was calculated. The three input groups used to estimate STA are 1) SLA and WSA, 2) SLA, SSTA and WSA, and 3) SLA, SSTA, WSA, and SSSA. The thermal structures calculated with SLA, SSTA, and WSA and with SLA, SSTA, WSA, and SSSA show comparable mean RMSEs to the Argo data at 0.223 °C and 0.230 °C, respectively, which are half the mean RMSE between the Argo data and the STA estimated using only SLA and WSA (0.445 °C). Nevertheless, despite the differences between the three different datasets, the overall patterns are similar. In general, the RMSEs show an increasing pattern from the surface to near a depth of 100 m; below that, they tend to decrease with depth. At a depth of 100 m, where the RMSE is the largest, the mean RMSEs of each group data are 0.9712 °C, 0.7386 °C, and 0.4115 °C, respectively. The spatial difference (standard deviation) at a depth of 100 m is also significant (0.921 °C, 0.415 °C, and 0.428 °C, respectively).
First, to examine the SSTA influence on the STA estimation, we compared STA of group 1 (SLA and WSA as input data) and group 2 (SLA, SSTA and WSA as input data) by depth. The difference between the mean RMSEs of the two groups is nearly double; however, these differences are much more significant in the near-surface. At depths shallower than 500 m, the STA estimation is improved by more than 50% on average when SSTA is included in the independent variables. This is because the SSTA variation contributes significantly to the shallow subsurface temperature structure, especially in the seasonal development of STA [20]. In the spatial distribution of RMSE at each depth, the high RMSEs in the equatorial and western boundary regions are broadly distributed in group 1.
Second, by comparing groups 2 and 3, a sensitivity test of salinity on the STA estimation was performed. The RMSEs of these two groups and the Argo data were negligible. As expected, the spatial distribution of the mean RMSE is also highly comparable and shows the most considerable error and standard deviation at a depth of close to 100 m. In particular, both datasets show large RMSEs in equatorial regions at that depth. The relatively high RMSEs in these regions are likely influenced by the complex ocean interior structure and current and the east-west MLD tilt, in addition to variations in the freshwater flux (e.g., precipitation, evaporation, and river runoff). More details are given in Section 4.2.
Overall, STA showed the most consistent value compared to the Argo data when the MLR is calculated using SLA, SSTA, and WSA. Using these three inputs, the STA estimation for the surface layer improved significantly compared to that only including SLA over the entire world and at depth. Therefore, even though there is a slight difference between the STAs of the MLR model with and without the salinity data, their horizontal or vertical patterns are very similar. Therefore, it is likely not necessary to consider salinity in large-scale variations.

Reconstructed 3D Subsurface Temperature
Ultimately, the MLR model was executed using SSTA, WSA, and SLA as independent variables for the reconstruction of the 3D temperature structure. To produce a complete 3D temperature structure, the Argo subsurface temperature climatology was added to the monthly STA predicted for 2005-2015 using the MLR model. Then, the modeled 3D temperature structure was compared to that observed by Argo (Figure 3). The spatial patterns of the modeled subsurface temperature from the MLR at depths of 100 m, 200 m, and 500 m (Figures 3a-c) are mostly consistent with those of the Argo observations (Figures 3d-f). However, the spatial distributions of the RMSEs between the two datasets at individual depths show differences (Figures 3g-i). The spatial distribution of the RMSEs at a depth of 100 m, which represents the most significant RMSE, has high RMSEs in the tropics, as stated in Section 4.1. Further, the regions of the western boundary currents of the oceans, such as the Kuroshio (near 145° E, 42° N), the Gulf Stream (near 48° W, 45° N), and the Brazil-Malvinas Confluence Region (near 55° W, 46° S), including the region with the Antarctic Circumpolar Current (ACC) (near 17-77° E, 37°-47° S), also show some differences.
The relatively high RMSE in the tropic region has a significant error even when the MLR is executed using the SSTA, WSA, and SLA of the Argo data as the input and target data, respectively (Figure 3g). In addition, even though the data on the wind stress, precipitation, and evaporation were included as independent variables in the MLR model, the results did not significantly change (not shown). The distribution of RMSE between STA from the MLR model and the Argo data are also similar to those of ST at whole depths. The main reason for this phenomenon is thought to be the pronounced east-west gradient of the MLD. The trade wind-induced upwelling or downwelling in the tropical region drives a shallow MLD in the east and a deep MLD in the west [61]. In addition, in the upper layer of the MLD, the ocean current flows to the west due to the wind and the horizontal pressure gradient; however, in the lower layer of the MLD, the undercurrent flows to the east. Due to the complex current systems of these equatorial regions, the STA varies greatly. In the western boundary current regions, there are differences between the two datasets for different reasons than those at the equator. The western boundary current regions have intense mesoscale variabilities, such as eddies, according to the predominant signal in the ocean circulation [62]. The gridded Argo data are generated by 1° data via an optimal interpolation based on the Argo float spacing of 3° (300 km). Therefore, the characteristics of mesoscale phenomena (100-300 km) such as eddies are not included (Figures 4b,e,h) [56]. However, because the mesoscale structures are available in the satellite-based SLA fields [23], our results, based on these data, differ significantly from the Argo data in such eddy-rich regions. In these regions, warm and cold eddies might contribute to a high RMSE to a depth of 500 m, as shown in Figures 3g-i, because they can cause deep isotherm changes (refer to [63,64]). For example, Figure 4 shows STAs at a depth of 100 m and the satellite-derived SLA in the Kuroshio Extension, Gulf Stream, and Brazil-Malvinas Confluence region in December 2005. The modeled STA simulates eddy signals (Figures 4a,d,g) that are also manifested in the corresponding SLA data (Figures 4c,f,i), whereas the Argo STA appears only in smoothed features (Figures 4b, e, h). Subsequently, this implies that the modeled STA includes advantages for high spatial resolution satellite data. Here, the MLR model was developed over the period from 2005 to 2015 when Argo data are available. Then, the model was applied to obtain ST from 1993 to 2016, far beyond the model period. Therefore, the accuracy of the inferred subsurface temperature during the period when Argo data is unavailable needs to be verified. Figure 5 shows the time series of subsurface temperature at 100m depth from TAO/TRITON, MLR, and Argo data. As mentioned earlier (Figure 3), the subsurface temperature estimated from the MLR model shows the lowest correspondence to that from Argo data at 100m depth in the equatorial Pacific Ocean. Therefore, we selected this depth for the comparison. The ST calculated by the MLR model is likely consistent with that of TAO/TRITON at both stations. In particular, the ST from both the MLR model and TAO/TRITON data shows a rapid temperature drop in 1998 and 2015. Meanwhile, the Argo data do not show these temperatures drop properly, while the MLR model reflects the decline well. This inconsistency seems to be why the gridded Argo data was constructed by interpolating the space between observational points. After 2015, the MLRbased ST also follows the trend of the rapid increase in the observed ST. Overall, the ST at 100 m depth from the MLR model at St1 has a correlation of about 0.86 with that of the TAO/TRITON, which is comparable to the correlation between TAO/TRITON and Argo (0.87). Furthermore, at St2, the correlation between the ST of the TAO/TRITON and the MLR is rather higher than that with the Argo.
In Figure 2, the STA does not correlate with the SLA in high latitudes, although it does not mean that the model is failed in those regions. It just shows a relationship between each independent variable. As shown in Equation (1), the regression coefficient could be adjusted according to the role of each independent variable to the model development. That is, each grid has a different coefficient, which is adjusted according to the truth data. Therefore, even though the insignificant relationship between the two variables is unclear, such low correlation is not a big issue for model development in this research, as evidenced by the low RMSE in Figures 3g, h, i. In addition, Figure 6 shows a temporal comparison between STs from Argo and the MLR model in the regions with a low correlation between the STA and the SLA. It means that even in those regions with low correlations, the ST estimated by the MLR model follows the temporal variation of the Argo based ST (Figure 6), which is a mainly seasonal component. In order to examine the estimated subsurface temperature at different depths, Table 1 was made. The correlation and RMSE are relatively higher and lower in the upper layer (<50m depth) and a relatively low correlation and high RMSE in the middle layer (>75m depth). These results are similar to others (e.g., SVM method by Su et al. [25] (their Figure 4); random forest method by Su et al. [29] (their Figure 3); XGBoost (Su et al. [26] (their Figure 6)).  The MLD was calculated using the temperature threshold method (∆T = 0.2 °C) from the estimated 3D ocean temperature structure, and the seasonal mean MLD is shown in Figure 7. During boreal winter (December-February), the North Pacific has an MLD of more than 100 m (Figure 7a). In particular, a maximum MLD of up to 300 m appears at high latitudes of over 40° N, similar to the 175-250 m suggested by Kara et al. [1]. At the same time, the North Atlantic has the deepest MLD (a maximum MLD >500 m). Holte et al. [65] noted that the winter MLD in the Gulf Stream region is more than 500 m, and de Boyer Monteǵut et al. [2] stated that the maximum MLD in the winter in the North Atlantic is approximately 740 m in the Greenland, Iceland, and Norwegian seas and 500 m in the Labrador Sea. Our results also showed comparable MLDs in these regions. In the spring, the MLD in these areas gradually grows shallower, and the MLD in the Southern Ocean becomes deeper (Figure 7b). The MLD in the Southern Ocean records its maximum depth during the boreal summer (austral winter) and reaches several hundreds of meters, on average 300 m [2], 500 m [32], and 800 m [65], especially in the ACC region. After the boreal summer, these MLDs become shallower again up to approximately 100 m [32] (Figure 7d). The estimated MLDs in this study show good agreement with the results of previous studies in other regions with strong seasonality. Huang et al. [66] noted that the MLDs in the central Pacific equatorial region range from approximately 40 m to 70 m and that the eastern and western parts of the MLDs are 30 m and 30-50 m, respectively. In addition, the MLD in the North Atlantic subpolar gyre is approximately 50-100 m [5] and Schiller et al. [37] found that the MLD in the Indian Ocean is approximately 20-30 m at 20° N but reaches over 200 m at 40° S. Overall, the isothermal MLDs estimated in this study are consistent with the MLDs reported by previous studies, which suggests that the results of this study are valid. To quantitatively evaluate the accuracy of the variability in the calculated MLD, a comparison between the MLDs from the reconstructed 3D thermal structure and the MLDs calculated from the in situ (Argo) and reanalysis datasets (ORAS4 and SODA) was performed for 11 years (2005-2015) ( Table 2). On a global scale, the isothermal MLD (criteria 0.2 °C) calculated in this study shows high correlations of more than 0.85 with other datasets (Argo: 0.9141, ORAS4: 0.859, and SODA: 0.894). The RMSEs are highest at 5.597 m with ORAS4, at 3.9431 m with SODA, and 3.3227 m with Argo, which is the primary dataset used for this study. The correlation coefficients for each ocean are the highest in the Pacific Ocean (0.974-0.991), followed by the Atlantic Ocean (0.975-0.979), and the Indian Ocean (0.947-0.965). In particular, there is considerable uncertainty concerning the MLD estimation in the north of the North Atlantic. Overall, the MLD calculated based on the reconstructed 3D ocean temperature structure in this study appears to be similar to that suggested by previous studies. We investigate the long-term variability in the MLD from 1993-2016 on a global scale in Section 4.3. Table 2. Correlation coefficients between the mixed layer depth anomalies estimated in this study and those from other datasets (the Argo, Ocean Reanalysis System 4 (ORAS4), and Simple Ocean Data Assimilation ocean/sea-ice reanalysis (SODA) datasets) for 11 years (2005-2015).  We also examined the decadal variation of the MLDA at 10°-latitude intervals ( Figure 9). Even when there is a difference in peak time by latitude, the MLDA variation weakens around the hiatus period at each latitude. The overall time-series pattern is similar; for example, the MLDA increases until 1998, shows a double peak during the hiatus period, and then decreases dramatically. During the hiatus period, these double-peak patterns are clear in the regions lower than the mid-latitudes in the Northern Hemisphere and higher than the mid-latitudes in the Southern Hemisphere. However, the pattern over the total period shows the opposite phase in the high latitudes of the Northern Hemisphere. Furthermore, due to the deep MLD in the North Atlantic, the MLDA in these regions showed the greatest variability (from −20 m to 500 m) (Figure 9b). In particular, in the pre-hiatus period, the MLDA showed a shallowing trend of approximately 40 m (50-60° N), which is a much larger scale than that of other regions.

Regions
The MLD variation is affected by a combination of multiple components, such as thermodynamic and air-sea interactions. Above all, the SST is strongly connected to the MLD because the SST reflects the thermal structure of the upper ocean layer [23]. For example, as the SST increases, the stratification strengthens and the MLD becomes shallower. Therefore, the MLDA and SSTA trends were analyzed during each period on a global scale, as shown in Figure 10. The spatial pattern of the SSTA trend during the pre-hiatus period is negative in the eastern equatorial Pacific and is positive along the mid-latitudes in the western Pacific with a warming of the North Atlantic and a cooling of the Indian Ocean (Figure 10a) [70][71][72]. This is due to changes in the ENSO phase. The prehiatus period ranged from 1993 to 1998 when ENSO shifted from a positive phase to a negative phase. Accordingly, the MLD deepens (shoals) in the region where the SSTA has a cooling (warming) trend, showing a high correlation between the two components worldwide, especially in the Pacific equatorial region (Figure 10b). The SSTA trend over the hiatus period is spatially comparable to that of the pre-hiatus trend; however, the warming and cooling rates are notably reduced due to the global warming slowdown (Figures 9a,c). Therefore, affected by the SSTA variability, the spatial MLDA trend also shows insignificant values compared to other periods, as shown in the time series ( Figure  8).
However, as noted in previous studies, MLDs tend to be deeper in the Western Pacific equatorial region [72] and in the South Indian Ocean [73] and become shallower in the central and eastern equatorial regions [70] and in the North Pacific [71]. After the hiatus period, the SSTA trend shows an opposite spatial pattern to that of the SSTA trend for the pre-hiatus period with warming in the eastern equatorial Pacific tongue and the North Atlantic and a cooling trend in the North Pacific (Figure 10e). During this period, the warming of the eastern Pacific caused by a strong ENSO positive phase, which has persisted since the mid-2010s, and the changes in the MLDA trend are also observed accordingly. A shallower MLD is exhibited in the western Pacific and the North Pacific, and the MLD strongly deepens in the AMOC region. The overall spatial distributions of SST and MLDA variability are consistent, including the red circled areas ( Figure 10). Strangely, the large MLDA fluctuation trends in the Southern Ocean (below 50 °S) can be seen in every period. This seems to be due to the fast Antarctic circulation flow system. While the lower MLDA trend reflects well the warming in the western part of South America, the cooling trend for the southern part of South America is somewhat insignificant, and the specific mechanism on this requires further research. Even though the posthiatus period is as short as five years, it is notable regarding the pattern shift and requires continuous monitoring in the future.

Discussion and Conclusion
In this analysis, we mentioned a technique for estimation of ST using SST, WSA, and SLA data. However, it is worth to note that satellite altimeters measure not only ocean heat-related phenomena, but also many dynamical processes such as ocean Rossby waves [74], ocean current [75], and winds [76]. While planetary waves play a fundamental role in redistributing and dispersing the large-scale time-varying energy in the ocean interior, the ocean current plays an important role in the energy transfer processes associated with wind. Thus, the sea surface height measurements in MLR include all combinations of these direct and indirect ocean processes. Concerned with climate changes, many studies have been conducted to understand ocean interior thermal structures using ocean surface measurements. Understanding ocean subsurface thermal fields are of great importance to comprehending the ocean inertia mechanism and its air-sea interactions. In particular, the mixed layer depth is a proxy for the ocean heat content associated with the upper layer thermodynamics and, therefore, regulates the heat flux variability directly linked to the atmosphere [4][5][6]. Accordingly, the Argo project was launched to obtain internal ocean temperature and salinity data around the world distributed on an approximately 3° grid with high accuracy. However, the spatial distribution is still too sparse to observe mesoscale phenomena and model and reanalysis data require multiple procedures such as assimilation and interpolation. Therefore, to diagnose both mesoscale patterns and decadal changes in the MLD, we first reconstructed the ocean thermal structure at a high spatial resolution for the 24 years from 1993 to 2016 using MLR. Using this method, the statistical relationship between the subsurface temperature and the surface field was computed based on the gridded Argo data. High resolution and longer temporal coverage satellite-derived SLA, WSA, and SSTA data were applied to the regression on the basis of their strong correlations with STA [23]. In fact, during the sensitivity test, we also examined the influence of salinity on the MLR model; however, there was no significant difference in the results. Further, appropriate satellite-based salinity data could not be obtained due to the low spatial resolution (1° grid) and short temporal coverage (from 2009 to the present) of the data. Therefore, the 3D ocean thermal structure was generated from the SLA, WSA, and SSTA with 0.25° resolution from 1993-2016 with 27 standard depth levels [41] down to a depth of 2000 m. The results properly detected mesoscale structures such as eddies. Inevitably, this simple statistical method shows relatively high RMSE values relative to the gridded Argo data in the upper layer of the tropics due to the complex physical variations in the ocean in the upper equatorial regions. Nevertheless, this difference exists at a depth of 100-200 m, which is deeper than the mean MLD in the tropics; therefore, there would be no significant problem in calculating the MLD from the estimated subsurface temperature. Lastly, the MLD data from this research can be downloaded on our website ftp://crs.pusan.ac.kr/mlr_mld.