Ocean Front Reconstruction Method Based on K-Means Algorithm Iterative Hierarchical Clustering Sound Speed Profile

As one of the most common mesoscale phenomena in the ocean, the ocean front is defined as a narrow transition zone between two water masses with clearly different properties. In this study, we propose an ocean front reconstruction method based on iterative hierarchical clustering of sound speed profiles (SSPs) with the K-means algorithm. The method identifies the frontal zone from the perspective of the SSP. Because the transmission loss (TL) depends strongly on the SSP structure, acoustic ray tracing is a very sensitive tool for detecting the location of ocean fronts, so this paper verifies the feasibility of the method through the TL calculation. Compared with existing methods, the key step of this method is that the stratification is refined iteratively according to the accuracy of the clustering results. The results of the iterative hierarchical clustering of the SSPs are then used to reconstruct the ocean front. Using this method, we reconstructed the ocean front in the Gulf Stream-related sea area and obtained the three-dimensional structure of the Gulf Stream front (GSF), which is divided into seven layers in the depth range of 0–1000 m. Iterative hierarchical clustering of SSPs with the K-means algorithm provides a new method for identifying the frontal zone and reconstructing the geometric model of the ocean front in different depth ranges.


Introduction
As one of the most important phenomena in ocean dynamics, the ocean front is defined as a narrow transition zone between two or more water masses with different properties [1]. The frontal zone is a convergence zone with very strong vertical movement. The exchange of momentum, heat, and water vapor in the frontal zone is very active [2], and the position and strength of the frontal zone vary with time and season [3], so ocean fronts have become a hot topic in physical oceanography in recent years. To detect an ocean front, the dynamic method or the gradient method applied to hydrological elements such as temperature can be used [4,5]; in the latter, a critical value must be specified first. If the calculated horizontal temperature gradient is greater than the critical value, an ocean front is considered to be present [6].
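The gradient criterion described above can be illustrated with a minimal sketch. It assumes a uniformly spaced grid and a synthetic temperature field; the function name `detect_front_by_gradient` and the critical value of 0.1 °C/km are hypothetical and are not taken from the cited works.

```python
import numpy as np

def detect_front_by_gradient(temp, dx_km, dy_km, critical):
    """Flag grid cells whose horizontal temperature gradient exceeds `critical`.

    temp     : 2-D array (north-south, east-west) of temperature in deg C
    dx_km    : east-west grid spacing in km (assumed uniform)
    dy_km    : north-south grid spacing in km (assumed uniform)
    critical : critical gradient in deg C per km (hypothetical value)
    """
    # np.gradient returns one derivative array per axis, axis 0 first.
    dT_dy, dT_dx = np.gradient(temp, dy_km, dx_km)
    magnitude = np.hypot(dT_dx, dT_dy)
    return magnitude > critical

# Synthetic field: warm water in the south, cold water in the north,
# separated by a sharp transition centred at y = 50 km.
y_km = np.linspace(0.0, 100.0, 51)
x_km = np.linspace(0.0, 100.0, 51)
temp = 10.0 + 4.0 * np.tanh((50.0 - y_km[:, None]) / 5.0) + 0.0 * x_km[None, :]

mask = detect_front_by_gradient(temp, dx_km=2.0, dy_km=2.0, critical=0.1)
```

In this synthetic case the mask is true only in a narrow band around y = 50 km. The result depends directly on the chosen critical value, which is the limitation that motivates a clustering-based alternative.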
The impact of the ocean front on acoustic transmission loss (TL) cannot be ignored [7]. Research has shown that the maximum difference in the TL between environments with and without a front is about 20 dB [8], which strongly affects the detection performance of sonar equipment. Therefore, it is critical to accurately construct the three-dimensional geometric structure of the frontal zone, which helps in studying its influence on acoustic propagation. As an important factor affecting the characteristics of underwater acoustic propagation, the sound speed profile (SSP) reflects the vertical distribution structure of the seawater.

J. Mar. Sci. Eng. 2021, 9, x FOR PEER REVIEW
The results and discussion section is divided into five parts: the selection of the number of layers of clustering, the classification and recognition of sound speed profiles based on K-means, the verification of the accuracy of clustering results based on calculating the TL, iterative layered processing and verification, and the construction of the three-dimensional geometric structure of the GSF. Section 4 of this paper is a summary and conclusion, in which we summarize the whole research process and give the relevant conclusions.

Physical Oceanography and Data Introduction
The Gulf Stream is the most powerful and influential warm current in the world [17,18]. Because the intensity and position of the Gulf Stream differ between seasons, the range and position of the GSF also show obvious seasonal differences. In this paper, our study area was selected within the range shown in Figure 1a. Figure 1b enlarges Figure 1a to show the surface sound speed distribution of the study area (38°-40° W, 50.6°-51.6° N). As can be seen in Figure 1b, due to the existence of the GSF, the study area is divided into three water masses with different properties: a high sound speed (high temperature and salinity) water mass, located south of the frontal zone; a low sound speed (low temperature and salinity) water mass, located north of the frontal zone; and a transitional water mass, located in the frontal zone.
HYCOM (the Hybrid Coordinate Ocean Model) [19,20] was used to output daily average temperature, salinity, and depth (pressure) data, which are gridded reanalysis data in NetCDF (Network Common Data Form) format. Considering that the ocean front is clearly stronger in winter, we used the daily average data on 1 January to detect the frontal zone more clearly. The data have a horizontal resolution of 0.08° in longitude and 0.04° in latitude, with 40 unevenly spaced vertical levels. The speed of sound in seawater is a function of temperature, salinity, and depth (pressure), and the relationship between them can be expressed by a simplified empirical formula of sound speed:

$$C = 1449.22 + \Delta C_t + \Delta C_S + \Delta C_P + \Delta C_{StP}, \tag{1}$$
where $C$ is the sound speed, and $\Delta C_t$, $\Delta C_S$, $\Delta C_P$, and $\Delta C_{StP}$ are the sound speed perturbations caused by temperature, salinity, pressure, and their combined influence, respectively. Their expressions are given in [21]; the pressure term is
$$\Delta C_P = 1.60518 \times 10^{-1} P + 1.0279 \times 10^{-5} P^2 + 3.451 \times 10^{-9} P^3 - 3.503 \times 10^{-12} P^4, \tag{4}$$
where $t$ is the temperature, $P$ is the pressure in standard atmospheres, and $S$ is the salinity. This empirical equation was proposed by Wilson in 1960 [21]. The formula is widely applicable in the global ocean, and its error is generally within about 1 m/s. Using the sound speed empirical formula, $n$ sound speed profiles in the study area are obtained, and all sound speed profiles form the sound speed matrix $T$:
$$T = \begin{bmatrix} T_{11} & T_{12} & \cdots & T_{1m} \\ T_{21} & T_{22} & \cdots & T_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ T_{n1} & T_{n2} & \cdots & T_{nm} \end{bmatrix}$$
The sound speed matrix $T$ is an $n \times m$ matrix, where $n$ is the number of sound speed profiles (each row of the matrix represents an SSP), and $m$ is the number of depth levels of the sound speed profiles over the whole sea depth.
Figure 1. (a) The surface sound speed of parts of the North Atlantic and the location of the study area (38°-40° W, 50.6°-51.6° N); (b) the sea surface sound speed distribution of the study area on 1 January 2019. Due to the existence of the GSF, the study area is divided into three water masses with different properties: a high sound speed water mass south of the frontal zone, a low sound speed water mass north of the frontal zone, and a transitional water mass in the frontal zone.
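As an illustration of how gridded temperature and salinity fields become the $n \times m$ sound speed matrix $T$, the sketch below builds $T$ from synthetic data. Two assumptions should be noted: because the full set of Wilson coefficients is not reproduced in this section, the sketch swaps in Medwin's (1975) simplified sound speed formula, which is a different and less accurate equation; and the random fields merely stand in for the HYCOM output. The function names are hypothetical.

```python
import numpy as np

def sound_speed_medwin(t, s, z):
    """Medwin (1975) simplified sound speed in m/s (NOT the Wilson formula).

    t: temperature (deg C), s: salinity, z: depth (m).
    """
    return (1449.2 + 4.6 * t - 0.055 * t**2 + 0.00029 * t**3
            + (1.34 - 0.010 * t) * (s - 35.0) + 0.016 * z)

def build_ssp_matrix(temp, salt, depth):
    """Return the n x m matrix whose rows are sound speed profiles.

    temp, salt : arrays of shape (m, n_lat, n_lon)
    depth      : array of shape (m,)
    """
    m = depth.size
    c = sound_speed_medwin(temp, salt, depth[:, None, None])
    # Flatten the horizontal grid so each row is one profile over m levels.
    return c.reshape(m, -1).T

# Synthetic stand-in for the gridded data: 40 levels on a 26 x 26 grid.
rng = np.random.default_rng(0)
depth = np.linspace(0.0, 1000.0, 40)
temp = 12.0 - 0.008 * depth[:, None, None] + rng.normal(0.0, 0.5, (40, 26, 26))
salt = np.full((40, 26, 26), 35.0)

T = build_ssp_matrix(temp, salt, depth)   # shape (676, 40)
```

Replacing `sound_speed_medwin` with an implementation of Equation (1) would leave the rest of the sketch unchanged.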

K-Means Algorithm and Its Application
Fuzzy cluster analysis refers to constructing a fuzzy matrix according to the properties of the research object itself and determining the cluster relationship according to a certain membership, aiming to perform clustering objectively and accurately [22]. In this paper, as a kind of fuzzy clustering method, the K-means algorithm was used to cluster the n sound speed profiles in the study area. The K-means algorithm is an iterative clustering analysis method. It divides the data into K groups in advance, randomly selects K objects as the initial clustering centers, and then calculates the distance between each object and each initial clustering center to assign each object to the nearest clustering center. In this paper, the distance between an object and a clustering center indicates their degree of similarity: the smaller the distance, the higher the similarity, and the larger the distance, the lower the similarity. The Euclidean distance was used, and its expression is given below. A clustering center and the objects assigned to it represent a cluster [23]. Each time a sample was allocated, the clustering center was recalculated according to the existing objects in the cluster. This process was repeated until a certain termination condition (such as minimizing the square error) was met.
The study area in this manuscript is 38°-40° W, 50.6°-51.6° N, and the data have a horizontal resolution of 0.04° in latitude and 0.08° in longitude, so there are 26 × 26 = 676 sound speed profiles in the study area. The K-means algorithm divides the 676 sound speed profiles into K groups, and the clustering center of each group is found by minimizing the square error. The classification number K is defined manually: considering that the GSF divides the study area into three water masses, namely, the water masses with different properties on both sides of the frontal zone and the transitional water mass between them, K is taken as 3. The square error can be expressed as:
$$E = \sum_{q=1}^{3} \sum_{x \in C_q} \lVert x - \mu_q \rVert^2, \tag{7}$$
where $x$ is each SSP, $C_q$ is the cluster group ($q = 1, 2, 3$), and $\mu_q$ is the mean vector of $C_q$, which is the clustering center, also known as the center of mass. Its expression is:
$$\mu_q = \frac{1}{|C_q|} \sum_{x \in C_q} x.$$
In Equation (7), $\lVert x - \mu_q \rVert$ is the Euclidean distance between $x$ and $\mu_q$, and its expression is:
$$\lVert x - \mu_q \rVert = \sqrt{\sum_{j} \left(x_j - \mu_{qj}\right)^2},$$
where the sum runs over the depth sampling points $j$ of the profile.
The steps of clustering the sound speed profiles using the K-means algorithm in this paper are as follows:
(1) Determining the number of layers and the depth range of each layer. After calculating the sound speed matrix $T$, according to the actual depth of influence of the GSF (1000 m in this paper; see Section 3.1), we divide the depth evenly into five layers as our initial stratification, so the depth range of each layer is 200 m. The 676 sound speed profiles of each layer are clustered via the K-means algorithm.
(2) Initializing the sound speed matrix $T$ of each layer so that its elements lie between 0 and 1. This step can be completed through the translation-standard deviation transformation followed by the translation-range transformation. The expression of the translation-standard deviation transformation is as follows:
$$T'_{ij} = \frac{T_{ij} - \bar{T}_j}{s_j},$$
where $\bar{T}_j$ and $s_j$ are the mean and standard deviation of the $j$-th column of $T$. The expression of the translation-range transformation is as follows:
$$T''_{ij} = \frac{T'_{ij} - \min_{i} T'_{ij}}{\max_{i} T'_{ij} - \min_{i} T'_{ij}}.$$
After transformation, all $T''_{ij} \in [0, 1]$. The initialized matrix is the membership matrix $T''$.
(3) In the matrix $T''$, three sound speed profiles are randomly selected as the initial three clustering centers.
(4) The distance from each SSP to each clustering center is calculated, and the first clustering is carried out according to the principle of minimizing the square error to obtain the first clustering centers:
$$\mu_q = \frac{1}{|C_q|} \sum_{T''_i \in C_q} T''_i,$$
where $C_q$ is the cluster group ($q = 1, 2, 3$) and $T''_i$ is the $i$-th row vector of the membership matrix $T''$ ($i = 1, 2, \ldots, 676$), that is, the SSP after initialization.
(5) After the first clustering of all sound speed profiles, the clustering centers change. Repeat step (4) to calculate each clustering center again until the square error no longer decreases appreciably: that is, for a given $\delta$, when $|E(v + 1) - E(v)| < \delta$ (in this paper, $\delta = 10^{-5}$ is the default value), stop the iteration (the number of iterations is $v$) and complete the clustering of the sound speed profiles in this layer.
(6) Repeat steps (2) to (5) and use the K-means algorithm to cluster the sound speed profiles of the other layers.
Through the above steps, we completed the clustering of the sound speed profiles under the initial stratification (five layers) by using the K-means algorithm. The result of the clustering is that the 676 sound speed profiles of each layer are divided into three groups. Each group represents a kind of water mass, and we can detect the frontal zone based on the clustering results.
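The clustering steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the column-wise form of the two normalizing transformations is assumed, the centers are updated once per pass over all profiles (the batch variant of K-means) rather than after each individual allocation, and the optional `init_idx` argument is a hypothetical addition that makes the example reproducible. The data are synthetic.

```python
import numpy as np

def normalize(T):
    """Translation-standard deviation, then translation-range transform,
    applied column by column so that every entry lies in [0, 1]."""
    std = T.std(axis=0)
    std[std == 0] = 1.0                    # guard against constant columns
    Z = (T - T.mean(axis=0)) / std
    span = Z.max(axis=0) - Z.min(axis=0)
    span[span == 0] = 1.0
    return (Z - Z.min(axis=0)) / span

def kmeans(X, k=3, delta=1e-5, max_iter=300, init_idx=None, seed=0):
    """Cluster the rows of X; stop when the square error changes by < delta."""
    if init_idx is None:
        rng = np.random.default_rng(seed)
        init_idx = rng.choice(len(X), size=k, replace=False)
    centers = X[np.asarray(init_idx)].copy()
    prev_error = np.inf
    for _ in range(max_iter):
        # Squared Euclidean distance of every profile to every center: (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        error = d2[np.arange(len(X)), labels].sum()
        if abs(prev_error - error) < delta:
            break
        prev_error = error
        for q in range(k):
            members = X[labels == q]
            if len(members):               # keep the old center if a cluster empties
                centers[q] = members.mean(axis=0)
    return labels, centers, error

# Synthetic layer: 150 profiles of 10 depth points in three offset groups.
rng = np.random.default_rng(1)
base = np.linspace(1480.0, 1500.0, 10)
T_layer = np.vstack([base + off + rng.normal(0.0, 0.1, (50, 10))
                     for off in (0.0, 5.0, 10.0)])

X = normalize(T_layer)
labels, centers, error = kmeans(X, k=3, init_idx=[0, 50, 100])
```

With purely random initial centers, K-means can converge to a local minimum of the square error, so repeating the clustering with several random starts and keeping the result with the smallest error is a common precaution.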

BELLHOP Ray Model and Its Application
We used the BELLHOP ray model to calculate the TL, which can verify the accuracy of the clustering results in different layers. The BELLHOP ray model is a toolbox for underwater sound field calculation [24]. Based on ray theory, it calculates the sound field in a range-dependent environment by the Gaussian beam tracing method [25]. By adjusting the parameters of the environment file, the model can calculate the sound field and obtain the corresponding ray information, multipath information, and so on. Compared with the normal mode model or the parabolic equation model, the ray model can clearly describe the change of sound energy during propagation [26], which makes it more suitable for the ocean front environment in this paper. Because the TL (including the size, intensity, and location of the convergence area) varies greatly when sound passes through the frontal zone [27,28], it is more reliable to judge the starting position of the frontal zone by comparing the TL with and without the ocean front. The BELLHOP ray model was used to calculate the TL to verify the accuracy of the clustering results of the SSP of each layer in judging the starting position of the frontal zone. The specific steps are as follows:
(1) Sound speed sections at different longitudes in the study area are selected. The actual sound speed section is taken as the environment with the ocean front (there are 26 sound speed profiles in each longitude section):
$$T_{exist} = \begin{bmatrix} T_{1,1} & \cdots & T_{1,j} & \cdots \\ T_{2,1} & \cdots & T_{2,j} & \cdots \\ \vdots & & \vdots & \\ T_{26,1} & \cdots & T_{26,j} & \cdots \end{bmatrix}$$
Here $j$ is the index of the selected depth sampling points of the SSP in each layer. For example, for the SSP of 0-200 m, there are 10 sampling points in the depth range of 0-200 m, so $j = 1, 2, \ldots, 10$. The range-independent environment configured using the first SSP is regarded as the environment without an ocean front (the dimension of $T_{without}$ is the same as that of $T_{exist}$):
$$T_{without} = \begin{bmatrix} T_{1,1} & \cdots & T_{1,j} & \cdots \\ T_{1,1} & \cdots & T_{1,j} & \cdots \\ \vdots & & \vdots & \\ T_{1,1} & \cdots & T_{1,j} & \cdots \end{bmatrix}$$
The above two environments are input into BELLHOP. On this basis, the TL with the frontal zone ($TL_{exist}$) and without it ($TL_{without}$) are calculated and compared by setting sound sources with different depths and propagation directions. The sound source is located at the middle depth of each layer, the receiving depth is the same as the source depth, and the TL we obtain corresponds to the receiving depth. The setting is the same for all layers.
(2) The five-point moving average method is used to process the TL. For the determined critical value $TL_{diff}$, if
$$|TL_{exist} - TL_{without}| > TL_{diff},$$
then the corresponding position $R_{exist}$ is regarded as the starting point of the frontal zone. In this paper, the critical value $TL_{diff}$ is 0.2 dB: under the same conditions, when the difference of the TL with and without the ocean front exceeds 0.2 dB, a frontal zone is considered to appear at this position, and the corresponding position $R_{exist}$ is regarded as the actual starting position of the frontal zone. Under the same conditions, the starting position of the frontal zone in the section obtained by the K-means algorithm is $R_{K-means}$. For a given error standard $R_{error}$, if
$$|R_{K-means} - R_{exist}| \le R_{error},$$
the clustering results obtained by using K-means are considered accurate in this layer (in this paper, the error standard $R_{error}$ is twice the distance between two adjacent sound speed profiles, which is 11.2 km).
(3) Iterative optimization of the stratification. If the error of the original hierarchical clustering result is greater than the error standard, that is,
$$|R_{K-means} - R_{exist}| > R_{error},$$
then the layer is divided evenly into two layers according to its depth range, and each new layer is re-clustered following the steps of the hierarchical clustering of sound speed profiles via K-means. The same method is then used to verify the accuracy of the clustering results, until all the hierarchical clustering results meet the error standard for the judgment of the starting point of the frontal zone. This iterative process realizes the iterative hierarchical clustering of the SSP, and all of its results lie within the allowable range of error.
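The TL-based criterion in steps (2) and (3) can be sketched as follows. The sketch does not call BELLHOP; it assumes the two TL curves have already been computed on a common range grid, and it uses synthetic curves in their place. The function names, the handling of the curve edges in the moving average, and the use of an absolute difference are assumptions; the thresholds 0.2 dB and 11.2 km are the values quoted in the text.

```python
import numpy as np

def moving_average_5(x):
    """Five-point moving average; near the edges, average the available points."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(5)
    total = np.convolve(x, kernel, mode="same")
    count = np.convolve(np.ones_like(x), kernel, mode="same")
    return total / count

def front_start(ranges_km, tl_exist, tl_without, tl_diff=0.2):
    """First range at which the smoothed TL curves differ by more than tl_diff (dB)."""
    diff = np.abs(moving_average_5(tl_exist) - moving_average_5(tl_without))
    idx = np.flatnonzero(diff > tl_diff)
    return float(ranges_km[idx[0]]) if idx.size else None

def clustering_is_accurate(r_kmeans, r_exist, r_error=11.2):
    """Error standard: the two starting positions agree within r_error (km)."""
    return abs(r_kmeans - r_exist) <= r_error

# Synthetic TL curves (dB): identical up to 30 km, then offset by 1.5 dB.
ranges_km = np.arange(0.0, 101.0, 1.0)
tl_without = 60.0 + 20.0 * np.log10(ranges_km + 1.0)
tl_exist = tl_without + np.where(ranges_km >= 30.0, 1.5, 0.0)

r_exist = front_start(ranges_km, tl_exist, tl_without)
```

Note that the smoothing moves the detected starting point slightly ahead of the true offset position, because the five-point window begins to include the offset samples two range steps early.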

Introduction of Technical Route
The whole research process is summarized in the technical route below (Figure 2). First, we used the empirical sound speed formula to convert the temperature, salinity, and depth (pressure) data in the study area into the sound speed matrix. Then, the K-means algorithm was used to cluster the sound speed profiles in the study area to detect the frontal zone. To verify the validity of the clustering results, we used the BELLHOP ray model to calculate and compare the TL with and without the front, and used the resulting starting position of the frontal zone as the criterion. If the error of the clustering results was less than the given error range, we considered the clustering results obtained by the K-means algorithm at this layer to be accurate. If the error was larger than the given error range (twice the distance between two adjacent sound speed profiles, 11.2 km), we divided the layer evenly into two layers, re-clustered the sound speed profiles of each new layer, and verified the results again through the TL. The verification process was complete once all the iterative hierarchical clustering results met the error standard for the judgment of the starting point of the frontal zone. Finally, we used the results of the iterative hierarchical clustering of sound speed profiles to reconstruct the structure of the ocean front.
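The control flow of this technical route can be sketched as a simple loop. The clustering and TL verification are abstracted into a caller-supplied function `layer_error`, which is hypothetical, as are the function name `refine_layers` and the `min_thickness` guard that stops the splitting (the paper does not state such a limit). The toy error function below is constructed only so that the loop has something to split; its values are not results from the paper.

```python
def refine_layers(initial_layers, layer_error, r_error=11.2, min_thickness=25.0):
    """Halve layers until each one meets the error standard.

    layer_error(top, bottom) must return |R_K-means - R_exist| in km for the
    layer [top, bottom] after clustering and TL verification (hypothetical
    callback standing in for the K-means and BELLHOP steps).
    """
    pending = list(initial_layers)
    accepted = []
    while pending:
        top, bottom = pending.pop(0)
        # Assumed safeguard: do not create layers thinner than min_thickness.
        too_thin = (bottom - top) / 2.0 < min_thickness
        if layer_error(top, bottom) <= r_error or too_thin:
            accepted.append((top, bottom))
        else:
            mid = (top + bottom) / 2.0
            pending[:0] = [(top, mid), (mid, bottom)]
    return sorted(accepted)

def toy_error(top, bottom):
    """Synthetic stand-in: thick layers above 400 m fail, everything else passes."""
    return 30.0 if (bottom <= 400 and bottom - top > 100) else 5.0

initial = [(0, 200), (200, 400), (400, 600), (600, 800), (800, 1000)]
final_layers = refine_layers(initial, toy_error)
```

With this toy error function, the two uppermost 200 m layers are each split once and the three deeper layers are accepted unchanged, giving seven layers in total.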

Results and Discussion
In this section, we cluster sound speed profiles with the K-means algorithm to obtain the position and range of the frontal zone of the GSF. At the same time, we verify the accuracy of the clustering results based on the calculation of the TL. Where the K-means results show larger errors, we carry out iterative hierarchical processing and verification. Finally, we use the results of the iterative hierarchical clustering of sound speed profiles with the K-means algorithm to reconstruct the structure of the GSF.

Selection of the Number of Layers of Clustering
We selected a typical sound speed section at 39° W (the latitude range is 50.6°-51.6° N) and the sound speed profiles at three points on it (A, B, and C). Observing the sound speed section at 39° W and the sound speed profiles of these three points, we can see that the sound speed distribution falls into several depth ranges. The depth of influence of the GSF is about 1000 m (the horizontal distribution of sound speed hardly changes at a depth of 1000 m, so our study depth range is 0-1000 m). The first depth range is 0-200 m, where the sound speed varies greatly between positions at the same depth. At a given position, the vertical layering in this depth range is denser, and the sound speed gradient reverses within it. The second depth range is about 200-400 m, where the sound speed decreases relatively smoothly with depth (although at point C the sound speed gradient changes sign), and the vertical stratification gradually becomes sparse. The third range is about 400-600 m, which contains the SOFAR axis (the sound speed minimum) at point A. The range of about 600-800 m contains the SOFAR axes at points B and C. The last range is 800-1000 m, where the sound speed increases with depth. The influence of the ocean front below 1000 m is very small, and we do not consider it.
Based on the above analysis, we divided the depth range of the study area into five layers of 200 m each, namely 0-200 m, 200-400 m, 400-600 m, 600-800 m, and 800-1000 m. Starting from this initial stratification, we used the K-means algorithm to cluster the sound speed profiles of each layer separately.

Classification and Recognition of Sound Speed Profiles Based on K-Means
We used the K-means algorithm to classify and identify the sound speed profiles of the above five layers and obtained the following clustering results (Figure 4). Each layer has three types of sound speed profiles: the blue points represent the water mass north of the frontal zone, the red points represent the water mass south of the frontal zone, and the green points represent the frontal transition water mass. In the first layer, 0-200 m (Figure 4a), the frontal zone runs northwest-southeast and is wider in the west, which is very similar to the frontal zone seen in Figure 1b. In the second layer, 200-400 m (Figure 4b), the frontal zone is wider in the west than in the first layer, and its width at the 39°30′ W section reaches 67.2 km. The third and fourth layers (400-600 m and 600-800 m) differ from the second layer: the frontal zone in the west narrows suddenly and moves south, while the eastern frontal zone is wider than in the previous layer. In the last layer, 800-1000 m, the frontal zone is narrower in the east than at 600-800 m. The narrowest part of the whole frontal zone in this layer appears in the middle of the study area and is only 11.2 km wide.
According to the above results, the position, width, and range of the GSF differ between layers, which further confirms the irregular variation of the structure of the frontal zone with depth. Taking the 39° W characteristic section as an example, the widths of the frontal zone in the five layers are 22.4 km, 33.6 km, 11.2 km, 11.2 km, and 11.2 km, respectively.
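One way to turn cluster labels into a frontal width along a section is to count the profiles assigned to the transitional water mass and multiply by the profile spacing. This is only a sketch of that idea: the paper does not state how the widths were measured, so the counting rule, the function name, and the label coding (0 = one side, 1 = transition, 2 = other side) are assumptions, and the labels below are synthetic. The spacing of 5.6 km is the distance between adjacent sound speed profiles quoted in the text.

```python
import numpy as np

def frontal_width_km(section_labels, transition_label, spacing_km=5.6):
    """Width of the transitional water mass along one section (assumed rule:
    number of transition-class profiles times the profile spacing)."""
    section_labels = np.asarray(section_labels)
    n_transition = np.count_nonzero(section_labels == transition_label)
    return float(n_transition * spacing_km)

# Synthetic section of 26 profiles with four transition-class profiles.
section = [0] * 10 + [1] * 4 + [2] * 12
width = frontal_width_km(section, transition_label=1)   # 4 * 5.6 km
```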

Verify the Accuracy of Clustering Results Based on the Calculation of the TL
We used the BELLHOP ray model to calculate the TL, which is used to verify the accuracy of the clustering results in different layers. We selected sound speed sections at different longitudes in the study area. On this basis, by setting sound sources at different depths, we calculated and compared the TL in the presence and absence of the frontal zone. Next, we took the position where the TL difference becomes large as the starting point of the frontal zone. In this paper, this standard is set to 0.2 dB: under the same conditions, when the difference of the TL with and without the ocean front exceeds 0.2 dB, a frontal zone is considered to appear at this position, which is regarded as the actual starting position of the frontal zone. By comparing the starting points, we can verify the accuracy of the K-means results of each layer. The relevant parameters set in the process of calculating the TL are as follows: the sound source frequency is 50 Hz, the receiving depth is the same as the source depth, the grazing angle ranges from −20° to 20°, the seabed sound speed is 1600 m/s, the seabed density is 1.8 g/cm³, and the sound absorption coefficient of the seabed is 0.8 dB/λ.
For the sound speed section at 39.5° W, the comparisons of the TL with and without the frontal zone at 0-200 m (Figure 5a) and 200-400 m (Figure 5b) are as follows. We calculated the TL of the 39.5° W section in the 0-200 m layer (the propagation direction is from 50.6° N to 51.6° N). The position where the difference exceeds 0.2 dB is used as the criterion for judging the starting point of the frontal zone, and this value is 17.9 km, while the K-means result under the same condition is 50.4 km (Figure 4a). The difference of 32.5 km exceeds the error standard of 11.2 km (twice the distance between adjacent sound speed profiles). It is therefore necessary to divide the 0-200 m layer into 0-100 m and 100-200 m, use the K-means algorithm to cluster the SSP again, and then verify the accuracy of the new results. In Figure 5b, using the same method, the value is 77.2 km, while the K-means result under the same condition is 39.2 km (Figure 4b). The difference of 38.0 km also exceeds the error standard, so it is likewise necessary to divide the 200-400 m layer into 200-300 m and 300-400 m, use the K-means algorithm to cluster the SSP again, and then verify the accuracy of the new results.
As for the verification of the accuracy of clustering results of the other three layers, we find that all errors are less than the error range given by us (11.2 km) (the specific error comparison is in Section 3.4).

Iterative Layered Processing and Verification
According to the conclusion of Section 3.3, we need to split the 0-200 m and 200-400 m layers evenly into four layers (0-100 m, 100-200 m, 200-300 m, and 300-400 m) and then use the K-means algorithm to cluster the sound speed profiles of each of these four layers. This is the iterative hierarchical processing. The clustering results obtained for the four layers are shown in Figure 6. The result for 100-200 m (Figure 6b) differs from that for 0-100 m (Figure 6a): especially in the west of the study sea area, the frontal zone is wider than at 0-100 m. The width of the frontal zone at the 39°30′ W section reaches 84.5 km at 100-200 m, while at 0-100 m it is 22.4 km. The results of 200-300 m (Figure 6c) and 200-400 m (Figure 4b) are similar, but the result of 300-400 m (Figure 6d) is different. The width of the frontal zone at the 39°30′ W section is 67.2 km at 300-400 m, which is narrower than at 200-300 m.
Taking the experiment on 0-100 m as an example, we selected three sound speed sections (39.5° W, 39° W, and 38.5° W) to calculate the TL (each sound speed section has two directions, so there are six comparisons in Figure 8). The TL calculation shows that the difference between the starting position of the ocean front obtained from the iterative hierarchical clustering result (Figure 6a) and the starting position of the frontal zone obtained from the TL calculation is less than twice the spacing between adjacent sound speed profiles (11.2 km), which verifies the accuracy of the clustering results under the new stratification.
Tables 1-3 compare the K-means clustering results of the 39°30′ W, 39°00′ W, and 38°30′ W sections under different conditions with the starting position (SP) of the frontal zone obtained from the TL calculation. The error of every result is less than twice the spacing between adjacent sound speed profiles (11.2 km). We also calculated the mean error over these three sections, which is 4.3 km. This value is less than the spacing between two adjacent sound speed profiles (5.6 km), which shows that it is feasible to use the K-means algorithm to perform iterative hierarchical clustering of sound speed profiles to detect the frontal zone, and that the detection results are credible.
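The acceptance test used in this verification can be written out as a short Python sketch. It is only an illustration and not the code used in the study: the function name `verify_start_positions` and the example positions are hypothetical, and only the 5.6 km profile spacing and the resulting 11.2 km tolerance are taken from the text.

```python
def verify_start_positions(sp_kmeans_km, sp_tl_km, spacing_km=5.6):
    """Compare frontal-zone starting positions (SP) from clustering and from TL.

    sp_kmeans_km, sp_tl_km: along-section positions in km, one per comparison.
    spacing_km: distance between adjacent sound speed profiles (5.6 km in the text).
    """
    if len(sp_kmeans_km) != len(sp_tl_km) or not sp_kmeans_km:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(a - b) for a, b in zip(sp_kmeans_km, sp_tl_km)]
    tolerance_km = 2 * spacing_km  # twice the profile spacing, 11.2 km by default
    mean_error = sum(errors) / len(errors)
    return {
        "errors_km": errors,
        "all_within_tolerance": all(e < tolerance_km for e in errors),
        "mean_error_km": mean_error,
        "mean_below_spacing": mean_error < spacing_km,
    }


# Made-up positions for illustration only; they are NOT values from Tables 1-3.
report = verify_start_positions([100.0, 56.0, 73.0], [95.0, 62.0, 73.0])
print(report)
```

A layer would be accepted under this sketch when `all_within_tolerance` is true; the mean error is reported separately because the text compares it with the single profile spacing rather than with the doubled tolerance.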

The Construction of the Three-Dimensional Geometric Structure of the GSF
After verifying the accuracy of the clustering results with the TL calculation, we used the results of the iterative hierarchical K-means clustering of sound speed profiles to reconstruct the three-dimensional geometric structure of the GSF. In the reconstruction, the frontal zone identified in each layer by the iterative hierarchical K-means clustering of the SSPs is taken as the structure of the GSF in that layer. The resulting structure of the GSF is shown in Figure 9. The final structure is divided into seven layers in the depth direction: 0-100 m, 100-200 m, 200-300 m, 300-400 m, 400-600 m, 600-800 m, and 800-1000 m.
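One possible way to assemble the per-layer frontal zones into a single volume is sketched below with NumPy. This is an assumption about the bookkeeping, not the procedure used to produce Figure 9: the function `stack_layers`, the boolean mask representation, the 3 x 4 demo grid, and the 100 m vertical step `dz_m` are all hypothetical; only the seven layer bounds come from the text.

```python
import numpy as np

# Seven depth layers (m) of the final stratification, as listed in the text.
LAYER_BOUNDS_M = [(0, 100), (100, 200), (200, 300), (300, 400),
                  (400, 600), (600, 800), (800, 1000)]


def stack_layers(layer_masks, layer_bounds=LAYER_BOUNDS_M, dz_m=100):
    """Stack per-layer 2-D frontal-zone masks into a (depth, lat, lon) volume.

    Each mask is repeated once per dz_m of layer thickness, so a 200 m layer
    occupies two slabs when dz_m is 100.
    """
    if len(layer_masks) != len(layer_bounds):
        raise ValueError("one mask is required per layer")
    slabs = []
    for mask, (top, bottom) in zip(layer_masks, layer_bounds):
        thickness = bottom - top
        if thickness <= 0 or thickness % dz_m != 0:
            raise ValueError("layer thickness must be a positive multiple of dz_m")
        slabs.extend([np.asarray(mask, dtype=bool)] * (thickness // dz_m))
    return np.stack(slabs, axis=0)


# Hypothetical empty masks on a 3 x 4 horizontal grid, for illustration only.
demo_masks = [np.zeros((3, 4), dtype=bool) for _ in LAYER_BOUNDS_M]
volume = stack_layers(demo_masks)
print(volume.shape)
```

Repeating each mask over its layer thickness keeps the frontal zone piecewise constant in depth, which matches the layered description in the text but adds no vertical detail within a layer.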

Notably, previous studies all used the SSP over the full ocean depth for clustering [14], whereas we used the characteristics of the GSF in different layers to divide the study sea area into five layers as our initial stratification. Moreover, our study includes a key step of iterative optimization of the stratification based on the accuracy of the K-means clustering results. In the end, the study area is divided into seven layers in the depth direction, and the results of the iterative hierarchical clustering fall within the allowable error range. The results of the iterative hierarchical clustering of sound speed profiles can therefore be used to reconstruct the GSF.
In this paper, we proposed a new method to detect the frontal zone from the perspective of ocean acoustics. We analyzed the sound speed profiles with the K-means algorithm, verified the clustering results against the calculated TL, and used this verification to complete the iterative layering and reconstruct the GSF. As far as the method itself is concerned, it is reliable: the clustering algorithm we used is mature in theory, and it is reasonable to judge the starting position of the frontal zone by calculating the TL and thereby verify the accuracy of the results obtained by the K-means algorithm. The whole research process forms a closed loop. The key step of iterative optimization of the stratification based on the accuracy of the K-means clustering results is the main innovation of this paper. Iterative hierarchical clustering of SSPs by the K-means algorithm provides a new method for judging the frontal zone and reconstructing the geometric model of the ocean front in different depth ranges.
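The per-layer clustering step can be illustrated with a small, self-contained K-means (Lloyd's iteration) in NumPy. This is a sketch under stated assumptions rather than the implementation used in the study: the functions `kmeans` and `cluster_layer`, the number of clusters `k`, the random initialization, and the synthetic profiles are hypothetical, and the mapping from cluster labels to the frontal zone is not attempted here.

```python
import numpy as np


def _assign(X, centers):
    """Return the index of the nearest center (Euclidean) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; X has shape (n_profiles, n_depths_in_layer)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct profiles chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _assign(X, centers)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return _assign(X, centers), centers


def cluster_layer(ssp, depths_m, top_m, bottom_m, k):
    """Cluster sound speed profiles using only the samples inside one layer."""
    depths_m = np.asarray(depths_m)
    in_layer = (depths_m >= top_m) & (depths_m <= bottom_m)
    return kmeans(np.asarray(ssp, dtype=float)[:, in_layer], k)


# Synthetic profiles (m/s) for illustration only: two groups offset by 30 m/s.
depths = np.arange(0, 1001, 50)
surface_speed = np.concatenate([1500 + 0.1 * np.arange(5),
                                1530 + 0.1 * np.arange(5)])
ssp = surface_speed[:, None] + 0.016 * depths[None, :]
labels, centers = cluster_layer(ssp, depths, 0, 100, k=2)
print(labels)
```

Restricting the feature vector to one depth layer is what distinguishes this from clustering full-depth profiles: two profiles that are similar below 400 m can still fall into different clusters in the 0-100 m layer.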

Conclusions
In this paper, we use the K-means algorithm to perform iterative hierarchical clustering of the sound speed profiles of the Gulf Stream-related study area and, for the first time, construct the geometric structure of the GSF based on the clustering results. We use the characteristics of the GSF in different layers to divide the study sea area into five layers as the initial stratification. Meanwhile, considering that acoustic ray tracing is a very sensitive tool for detecting the location of ocean fronts because of the strong dependence of the TL on the SSP structure, we verify the accuracy of the clustering results in different depth ranges by using acoustic ray tracing to calculate the TL. By calculating the TL and obtaining the SP of the frontal zone, we find that the SPs obtained by K-means exceed the error range for the 0-200 m and 200-400 m layers, while they are within the error range for the other three layers between 400 m and 1000 m. Therefore, we divide the two layers with larger errors (0-200 m and 200-400 m) evenly and repeat the clustering and verification process. This iterative optimization process is one of the main innovations of this paper. As a result of the iterative hierarchical clustering, the depth range of the GSF is divided into seven layers (0-100 m, 100-200 m, 200-300 m, 300-400 m, 400-600 m, 600-800 m, and 800-1000 m). The clustering result of each layer is within the error range of the SP of the frontal zone obtained from the TL calculation. Finally, we successfully construct the three-dimensional structure of the GSF for the first time. Iterative hierarchical clustering of SSPs by the K-means algorithm provides a new method for judging the frontal zone and reconstructing the geometric model of the ocean front in different depth ranges.
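The iterative stratification summarized above can be expressed as a short loop. The following Python sketch is illustrative only: `refine_layers` and the callback `exceeds_tolerance` are hypothetical names, and the callback here is a stub that simply encodes the outcome reported in the text (the 0-200 m and 200-400 m layers fail the TL check, all other layers pass), whereas in the study this decision comes from clustering each layer and comparing the SP with the TL calculation.

```python
def refine_layers(layers, exceeds_tolerance, max_rounds=5):
    """Split every layer that fails verification into two equal halves.

    layers: list of (top_m, bottom_m) tuples with integer depths.
    exceeds_tolerance: callable (top_m, bottom_m) -> bool, True if the SP error
        of that layer is outside the allowed range (hypothetical callback).
    """
    accepted, pending = [], list(layers)
    for _ in range(max_rounds):
        if not pending:
            break
        next_pending = []
        for top, bottom in pending:
            if exceeds_tolerance(top, bottom):
                mid = (top + bottom) // 2
                # The two halves are verified again in the next round.
                next_pending += [(top, mid), (mid, bottom)]
            else:
                accepted.append((top, bottom))
        pending = next_pending
    # Layers still pending after max_rounds are returned without acceptance.
    return sorted(accepted + pending)


# Initial five-layer stratification and the two failing layers from the text.
initial = [(0, 200), (200, 400), (400, 600), (600, 800), (800, 1000)]
failed = {(0, 200), (200, 400)}
layers = refine_layers(initial, lambda top, bottom: (top, bottom) in failed)
print(layers)
```

With this stub, the loop stops after the second round and returns the seven layers listed in the text; the `max_rounds` cap is an added safeguard so that a layer that never passes does not get split indefinitely.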
The ocean front is one of the most important phenomena in ocean dynamics. It is traditionally detected from the perspective of oceanography, for example with the temperature gradient method. However, for ocean acoustic detection, a method based on the temperature gradient is not straightforward, because the basis and core of acoustic detection is the SSP. For this reason, identifying the SSP through the K-means algorithm can provide new insight, from the perspective of ocean acoustics, into the judgment of ocean fronts in different layers. Compared with previous research, we can further control the error of the clustering results within an acceptable range through the key step of iterative optimization of the stratification, and the GSF structure obtained by this approach is also more detailed in the depth direction.
It should be pointed out that although we have demonstrated the applicability of this method to the GSF in this paper, we have not shown whether it applies to other mesoscale phenomena in the ocean, which is one of the key points of future research. In addition, we will explore the feasibility of this method for spatial structure modeling in other fields.