Peer-Review Record

DSE-NN: Discretized Spatial Encoding Neural Network for Ocean Temperature and Salinity Interpolation in the North Atlantic

J. Mar. Sci. Eng. 2024, 12(6), 1013; https://doi.org/10.3390/jmse12061013

by Shirong Liu, Wentao Jia^* and Weimin Zhang

Reviewer 1: Anonymous

Reviewer 2: Joonho Lee

Reviewer 3: Anonymous

Reviewer 4: Andre Belem

Submission received: 27 April 2024 / Revised: 1 June 2024 / Accepted: 14 June 2024 / Published: 18 June 2024

(This article belongs to the Special Issue Recent Scientific Developments in Ocean Observation)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,

This research presents a discretized spatial encoding neural network (DSE-NN), an encoder-decoder model that is based on deep supervision, network visualization, and hyperparameter optimization. The authors believe that the work can serve as benchmark for subsequent processing and comprehension of spatiotemporal relationships within marine data. Though the topic seems interesting, I have the following comments:

General comments: The abstract needs a general reorganization. The introduction is too long, and readers may get lost in it. The paper is poorly written; in particular, the results and discussion sections need extensive improvement based on the suggestions below.

Specific comments:

Line 17: The word “manifest” does not sound right; could you replace it with “shows”?

Line 18-21: Can you try to make this section more meaningful?

Line 65: Can you please define RF and RBF, since this is the first time you use them in this section?

Line 86-87: Since you have defined DSE-NN earlier, could you remove the words in brackets?

Line 94: What is EN4.2.2?

Line 104: Can you insert the access link and access date for the data?

Line 156: This does not sound correct; “traditionally connected neural…..” may be right.

Line 173: Equation 1 is not referenced anywhere in the article.

Line 184: Equation 2 is not referenced anywhere in the article.

Line 186: Equation 3 is not referenced anywhere in the article.

Line 199: Equation 4 is not referenced anywhere in the article.

Line 274: Equation 5 is not referenced anywhere in the article.

Note: Please can you reference other equations in the manuscript.

Line 353: 3.1 RMSE curve is not well explained. Can you cite the figures you are explaining?

e.g …… in terms of salinity between 1995 and 2020, it can be seen that the DNN curve has higher salinity loss than the DSE-NN curve, but they both have a stable decreasing trend of salinity loss over the years (Fig 4a) and so on….

Line 388: 3.2 The North Atlantic data interpolation demonstration. Please apply the same suggestion I gave above to this section.

Line 437: 3.3 Comparison of DSE-NN and DNN results. Please apply the same suggestion I gave above to this section.

Line 453: 4. Discussion. This is more like results; can you include references in this section? You need to link this research with past works.

Thanks

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Moderate editing of English language required

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper is about "Deep Neural Networks with Discretized Spatial Encoding for Ocean Temperature and Salinity Interpolation."

The paper demonstrates the differences between commonly used interpolation methods and those based on Deep Neural Networks (DNN). It effectively describes the improvements and analytical methods. However, the scope of comparison and analysis is limited, focusing only on two methods, which is a significant drawback. Therefore, substantial revisions are necessary, and a major revision and resubmission are recommended.

Figure 2 is not clearly visible.

In Figure 4, why were only DSE_NN and DNN compared? Comparing other existing methods as well would help demonstrate the necessity of using DNN in this paper.

EN4 is a global result, but why was only the North Atlantic analyzed? Analyzing the Pacific, Indian Ocean, and other regions is also considered necessary.

Why do Figures 5 and 6 show results only for the Atlantic?

Comparative results by depth should be included.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper introduced a method to interpolate temperature and salinity using spatial discretization associated with a neural network.

The paper deserves to be published, but some clarifications are needed to help the reader understand what the authors did.

Figure 2. It is not clear what N and the 2xN matrix represent. In the same figure, it seems that the 2xN matrix is not used in the subsequent neural network, as only one perceptron is shown in the input layer.

Line 296 $n$

Line 326 Here, N is called the number of labels, but in Fig. 2 it seems related to the grid dimension. Please use different symbols for different meanings to avoid confusing the reader.

Line 525. From the authors' description of their method, it does not seem to be a classical encoder-decoder architecture. From Figure 2 it seems that they transform the 2D data into a reduced space, but then the information for only one point is given to a neural network. I think that the authors should explain better what they did.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

This is the review of manuscript jmse-3009519 by Liu and colleagues on Deep Neural Networks for enhancing the accuracy of ocean temperature and salinity interpolation. Most of my comments below follow a "chronological" order, appearing as they do in the manuscript. I suggest the authors create a table to facilitate the interpretation of this review. The first part consists of general comments and suggestions, followed by specific corrections by figure or line(s) below.

I fully understand the authors' viewpoint in starting the manuscript by discussing the importance of temperature and salinity interpolation techniques in the oceans, citing some articles that already address the use of ML for these tasks. However, it is also crucial to mention that a significant number of fields of oceanographic variables in dozens of public datasets are created using Optimal Interpolation (OI), and this fact cannot be overlooked in the first sentence. I suggest citing Zhang, C., Wang, D., Liu, Z., Lu, S., Sun, C., Wei, Y., & Zhang, M. (2022). Global gridded argo dataset based on gradient-dependent optimal interpolation. Journal of Marine Science and Engineering, 10(5), 650 and/or Jha, R. K., & Udaya Bhaskar, T. V. S. (2021). Optimal parameters for generation of gridded product of Argo temperature and salinity using DIVA. Journal of Earth System Science, 130(3), 170. I would also include a reading of Johnson, G. C., & Fassbender, A. J. (2023). After two decades, Argo at PMEL, looks to the future. Oceanography, 36(2/3), 54-59.

Following this assumption, between lines 64-85, the authors write quite objectively about the evolution of neural networks in interpolation tasks. However, I felt it necessary to explicitly state here (for the non-specialized reader) that deep neural networks (DNNs) are structurally different from the convolutional neural networks (CNNs) more commonly associated with the term "deep learning". Similarly, in lines 75-77, the difference between "prediction" and "interpolation" must be made very clear, so as not to confuse the reader.

Here I would like to make some general suggestions (and many questions) for the authors:

Although the choice of the North Atlantic Ocean as a case study is understandable, there is a lack of explanation at the end of the introduction of "why", which immediately creates doubt in the reader about the effectiveness of the method in other areas. Similarly, the use of EN4 quality controlled ocean data is critical for building confidence in the method, but the authors do not explain "why" they used EN4 and not other datasets, such as the Global Ocean Data Analysis Project (GLODAP). Although there is integration between different datasets in EN4 (for example, WOD is integrated), it is produced exclusively by a single institution, unlike collaborative programs between multiple international organizations. Another very important point is the "density" of data for training and testing, over time. A simple histogram with count by time for both datasets (training and testing) by year is already sufficient to correlate with the discussion in item 4.
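To illustrate the per-year count suggested above, here is a minimal sketch. The year lists are invented for illustration only, and `counts_by_year` and `text_histogram` are hypothetical helper names, not anything taken from the manuscript or from EN4 tooling.

```python
from collections import Counter


def counts_by_year(years):
    """Return {year: number of observations}, ordered by year."""
    return dict(sorted(Counter(years).items()))


def text_histogram(counts, width=40):
    """Render one bar of '#' per year, scaled so the largest count spans `width`."""
    peak = max(counts.values())
    return [
        f"{year} {'#' * max(1, round(width * n / peak))} {n}"
        for year, n in counts.items()
    ]


# Hypothetical observation years; real values would come from the profile metadata.
train_years = [1995, 1995, 1996, 1996, 1996, 2000, 2000]
test_years = [1996, 2000, 2000, 2020]

train_counts = counts_by_year(train_years)
test_counts = counts_by_year(test_years)

for label, counts in (("train", train_counts), ("test", test_counts)):
    print(label)
    print("\n".join(text_histogram(counts)))
```

Placing the two histograms side by side would show at a glance whether the test years are well covered by the training data.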

Regarding the methodology, specifically in the application of the DNN, it is clear enough. However, the sequence between lines 156-190 is confusing as there is no convincing explanation provided about the need to transform the coordinates into a matrix of points. The authors should anticipate a question like: "What is the purpose of the spatial discretization method, which involves converting geospatial data into a series of discretized boundary values?" Similarly, in section 2.2.2 (time discretization), the authors should more explicitly indicate "why does the time continuum effectively capture the dynamic characteristics in the context of oceanographic data analysis?" Note that it was enough to "grid" (or block) the data in time and space and use indices, considering a spatial and temporal resolution. However, the choice of resolution is critical for this process, and the authors do not explore (explain) these choices.
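The simpler alternative mentioned above (blocking the data in space and time and using indices) can be sketched as follows. The domain limits, the 1° resolution, the monthly time step, and the 1995 start year are assumptions chosen for illustration; `grid_index` and `block_indices` are hypothetical names and do not describe the authors' discretization.

```python
import math


def grid_index(value, start, step, n_cells):
    """Index of the cell containing `value` on a regular axis, clamped to the valid range."""
    idx = math.floor((value - start) / step)
    return min(max(idx, 0), n_cells - 1)


# Assumed domain and resolution, for illustration only.
LAT_MIN, LAT_MAX, LAT_STEP = 0.0, 70.0, 1.0
LON_MIN, LON_MAX, LON_STEP = -80.0, 0.0, 1.0
N_LAT = int((LAT_MAX - LAT_MIN) / LAT_STEP)
N_LON = int((LON_MAX - LON_MIN) / LON_STEP)


def block_indices(lat, lon, year, month, first_year=1995):
    """Map one observation to (latitude cell, longitude cell, month counter)."""
    i = grid_index(lat, LAT_MIN, LAT_STEP, N_LAT)
    j = grid_index(lon, LON_MIN, LON_STEP, N_LON)
    t = (year - first_year) * 12 + (month - 1)
    return i, j, t


print(block_indices(26.5, -45.2, 2004, 4))
```

Changing `LAT_STEP`, `LON_STEP`, or the time step changes how many observations fall into each block, which is why the choice of resolution deserves an explicit justification.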

Something noteworthy in the manuscript is that the proposed method addresses the main issue "against" DNNs in general (which is overfitting), by incorporating a weight decay term in the loss function, and this is only discussed in section 2.3.2 when it should already be pointed out in the introduction. Another notable point is that the "early stopping strategy" is not explicitly explained in terms of its execution. I think it would be interesting for the reader to better understand (via code?) how this was achieved. Similarly, in lines 236-239, the authors should provide context or explanation regarding the metrics. Note that here MAE is mentioned (and why not MAPE?) but in the first results section, the authors use RMSE. Was this an error? Furthermore, in the continuation, in section 2.3.1, there are missing explanations: regarding the choice of FCL (which is a basic mode of a CNN) and LeakyReLU (and why not simply a ReLU or even a PReLU?). Also, as an additional suggestion for sections 2.3.2 and 2.3.3, they could be merged and simplified. Similarly, there is a lack of explanation of why the use of Adam (Adaptive Moment Estimation) and why not an Adamax?
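As an indication of what a code-level description of early stopping might look like, here is a generic patience-based sketch. It is not the authors' implementation: `train_with_early_stopping`, `patience`, `min_delta`, and the toy validation curve are all assumptions made for illustration.

```python
def train_with_early_stopping(train_step, validate, max_epochs=100,
                              patience=5, min_delta=0.0):
    """Stop once validation loss has not improved by more than `min_delta`
    for `patience` consecutive epochs.

    Returns (best_epoch, best_loss, epochs_run).
    """
    best_loss = float("inf")
    best_epoch = -1
    stale = 0
    epochs_run = 0
    for epoch in range(max_epochs):
        train_step(epoch)            # one pass over the training data
        val_loss = validate(epoch)   # loss on held-out data
        epochs_run = epoch + 1
        if val_loss < best_loss - min_delta:
            best_loss, best_epoch, stale = val_loss, epoch, 0
            # A real run would also checkpoint the model weights here.
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_loss, epochs_run


# Toy validation curve (made up): improves, then starts to overfit.
val_curve = [1.0, 0.8, 0.6, 0.5, 0.55, 0.56, 0.6, 0.7, 0.9]
result = train_with_early_stopping(
    train_step=lambda epoch: None,
    validate=lambda epoch: val_curve[epoch],
    max_epochs=len(val_curve),
    patience=3,
)
print(result)
```

Reporting the patience, the monitored quantity, and whether the best checkpoint is restored would be enough for a reader to reproduce the stopping behaviour.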

Note that these methodological questions are important because the article should not be seen as a "recipe" provided by the authors. The world of DNN-CNNs is filled with options ranging from layer structuring to activation modes, and the "whys" are very important in understanding the mechanism behind the process.
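For readers unfamiliar with the activation options named above, the three differ only in how they treat negative inputs. The sketch below uses the standard textbook definitions; the slope values are illustrative assumptions, not values reported by the authors.

```python
def relu(x):
    """ReLU: negative inputs are set to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: negative inputs are scaled by a small fixed slope."""
    return x if x > 0 else negative_slope * x


def prelu(x, alpha):
    """PReLU: same form as LeakyReLU, but `alpha` is a parameter learned during training."""
    return x if x > 0 else alpha * x


for x in (-2.0, 0.5):
    print(x, relu(x), leaky_relu(x), prelu(x, alpha=0.25))
```

Because ReLU outputs zero, and has zero gradient, for all negative inputs, a unit can stop updating; the non-zero negative slope of LeakyReLU and PReLU is the usual argument for choosing them, and a sentence to that effect would answer the "why".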

The figures need to be improved, especially figures 4, 5, and 6. Note that between lines 413 and 432, the authors discuss the seasonal variability of temperature and salinity, but it is not possible to assess/understand the accuracy of the results because there are no figures that effectively represent the text. In particular, figure 6 is very poor. Figure 6 should explicitly represent the TS pair jointly (how about a TS diagram?) under different conditions/seasons. It is also possible to compose this figure with data simulated by the DNN and those from the EN4 dataset. The captions of these figures are completely inconsistent with the text. I recommend reading the document https://www.mdpi.com/journal/jmse/instructions#figures. An additional concern is that the authors should explain the magnitude of what they call "temperature/salinity loss" in relation to the accuracy of the measurement methods (see section 6.22 of the EN4 Product Guide https://www.metoffice.gov.uk/hadobs/en4/EN.4.2.2_Product_User_Guide_v1.0.pdf).

Finally, I believe that item 4, Discussion, should more explicitly include the impacts of both the optimizer and the loss function choices in relation to benchmarking. See, the article is great, but it's worth deepening the discussion so that it doesn't look like a “Cooking Recipe”. An example of the importance of this has to do with the seasonal frequency distribution of data impacting the argumentation between lines 484 and 488.
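To make the optimizer and penalty choices concrete, the following is a minimal sketch of one Adam update with an optional L2 penalty folded into the gradient. It follows the standard published form of Adam (a momentum-style first moment and an RMSProp-style second moment, both bias-corrected); the hyperparameter values and the quadratic test function are assumptions for illustration, and `adam_step` is a hypothetical name, not the authors' code.

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0):
    """One Adam update on lists of floats; returns (new_w, new_m, new_v).

    `weight_decay` adds the gradient of an L2 penalty, weight_decay * w,
    to the data gradient (classic L2 regularization, not decoupled AdamW).
    `t` is the 1-based step counter used for bias correction.
    """
    new_w, new_m, new_v = [], [], []
    for wi, gi, mi, vi in zip(w, grad, m, v):
        gi = gi + weight_decay * wi
        mi = beta1 * mi + (1 - beta1) * gi        # first moment (momentum-like)
        vi = beta2 * vi + (1 - beta2) * gi * gi   # second moment (RMSProp-like)
        m_hat = mi / (1 - beta1 ** t)             # bias correction
        v_hat = vi / (1 - beta2 ** t)
        new_w.append(wi - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_w, new_m, new_v


# Toy problem (made up): minimise f(w) = (w - 3)^2 starting from w = 0.
w, m, v = [0.0], [0.0], [0.0]
for t in range(1, 2001):
    grad = [2.0 * (w[0] - 3.0)]
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
print(w[0])
```

With `weight_decay > 0` the update also pulls each weight toward zero, which is the mechanism by which the penalty term limits overfitting; discussing how sensitive the reported results are to `lr` and `weight_decay` would strengthen the benchmarking argument.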

With these points explicitly stated here, I suggest a major review to increase the quality of the manuscript for publication in the journal.

Specific suggestions:

Fig. 1: I suggest improving this figure. It doesn't add much to the text. It is recommended to fix the map limits between 80°S-80°N for Mercator projections. Perhaps adding color to the symbols to indicate any temporal variability could be considered. However, it doesn't add much in the current position in the text. If there is a need for a map figure, I suggest plotting the distribution of points used in the study (from EN4).

Fig. 2: Although this is just an explanatory figure of the processing, it is important to note that it does not explicitly illustrate the fully connected layers (FCL) and Leaky Rectified Linear Unit (LeakyReLU) layers to handle preprocessed spatiotemporal discretized data, as the authors explain. Therefore, the exact number of hidden layers is not mentioned. Still, the authors could create a slightly more realistic flowchart regarding the methodology used. With this, figures 2 and 3 should be merged.

Lines 65-67: It is necessary to explicitly explain “BP neural network models and RBF neural network” (which means Backpropagation and Radial Basis Function, respectively), not only using acronyms.

Line 69: SMOS satellite: SMOS stands for Soil Moisture and Ocean Salinity, right? Again, only the acronym is given.

Line 94: The authors MUST explain the EN4 quality controlled ocean data (EN4.2.2) here, as it appears first, and correct the misspelling "(p)rofile".

Lines 150-151: It is necessary to better explain the sentence "Specialized processing of longitude and latitude data is not only necessary but also vital to optimizing analysis results and improving model prediction accuracy”. What does this mean?

Lines 219-223: It would also be interesting to mention the "cons", for example, complexity (computationally intensive and require large amounts of data for training) and overfitting (tendency to overfit the training data). This last point is especially important in the context of climate change (for example: Kuhlbrodt, T., Swaminathan, R., Ceppi, P., & Wilder, T. (2024). A glimpse into the future: The 2023 ocean temperature and sea-ice extremes in the context of longer-term climate change. Bulletin of the American Meteorological Society).

Lines 266-267: "Weight decay, also known as L2 regularization, is a commonly used regularization technique applied to parameterized machine learning models." References are needed here, one or two. The authors MUST explain "why" (for example, by explaining that this adds a penalty term to the loss function during training).

Lines 281-283: "In oceanographic data analysis, particularly when the input dimension is high, such as the natural extension of polynomial regression to multivariate data, the complexity of the model increases rapidly with the degree." I would really like to see one or two citations here! I suggest reading Reddy, T., RM, S. P., Parimala, M., Chowdhary, C. L., Hakak, S., & Khan, W. Z. (2020). A deep neural networks-based model for uninterrupted marine environment monitoring. Computer Communications, 157, 64-75.

Small corrections in lines 290 and 296 for the "$" symbol in $w$, $b$, and $n$. I would still very much like to see how this is done in the code, but the authors did not provide any link for it.

Line 337: Adam - Adaptive Moment Estimation! And the explanation given (... combines the benefits of Momentum and RMSProp algorithms ...) is not enough.

Line 359: "higher salinity loss" (?) I believe the use of the word "loss" here is incorrect as it refers to RMSE.

Fig 4: There is no description in the caption about the figures (a, b, c, e, f). Perhaps a 2,3 layout would be more efficient than a 3,2.

Fig 5: Very difficult to visualize what the authors suggest. I suggest using different color palettes for salinity and temperature. The caption also needs to be corrected by adding units to the temperature. Here, I believe it would be best to represent by transects, for example, at 45°N, 35°N, and more specifically 26.5°N, which corresponds to RAPID (https://rapid.ac.uk/rw/news/).

Lines 438-439: KDE - Kernel Density Estimation. The authors MUST explain these terms in the text. It's not enough to be in the caption of figure 7.

Lines 443-445: ...curve for salinity loss and temperature loss. Explain what loss means in this context.
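Regarding the KDE comment on lines 438-439 above, a short in-text definition could be accompanied by something like the following sketch of a Gaussian kernel density estimate. The error values and the bandwidth are invented for illustration, and `gaussian_kde` here is a hand-written hypothetical helper, not a library call.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the probability density of `samples`
    as an average of Gaussian bumps of width `bandwidth` centred on each sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical interpolation errors (model minus observation).
errors = [-0.30, -0.10, -0.05, 0.00, 0.02, 0.08, 0.15, 0.40]
density = gaussian_kde(errors, bandwidth=0.1)

# A density should integrate to about 1; check with a simple Riemann sum on [-2, 2].
step = 0.01
area = sum(density(-2.0 + k * step) * step for k in range(401))
print(round(area, 3))
```

The bandwidth controls how smooth the curve is, so stating the value (or the selection rule) used for the KDE figure would help readers interpret it.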

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,

I think since you do not have any section for a discussion, you may need to include some references in your conclusion to link this study with past studies.

Good luck

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

I have received the revised paper and appreciate the thorough responses to most of the revision requests. However, despite the fact that this study focuses on interpolation, I find it difficult to accept the reasoning that the study should be limited to the Atlantic Ocean solely because it has more data or because the Atlantic Ocean is important for climate change. Does the Pacific Ocean not play an important role in climate change as well?

If the authors must focus exclusively on the Atlantic Ocean, then there needs to be more substantial justification for this choice, and the title of the paper should be revised accordingly.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

This is the 2nd review of manuscript jmse-3009519-v2 by Liu and colleagues on Deep Neural Networks for enhancing the accuracy of ocean temperature and salinity interpolation.

First of all, I would like to thank the authors for their dedication and responses to the suggestions given in the first review. The topic is very interesting and fascinating, and it was a pleasure to read the versions of this manuscript.

The modifications made in sections 2 (especially in the text of 2.1 and 2.2) and 3, have really made the text much clearer. I believe that with the promise of depositing the code on GitHub later (Don’t forget this, please), the work will be very valuable. I recommend creating a Zenodo (https://zenodo.org/) linked to the GitHub repository, to have a DOI attached to the published article.

I also believe that the conclusion (item 5) is indeed better, as the authors emphasize the benefits of the method, mentioning the "efficacy of DSE-NN" without using the term "superiority" explicitly. Likewise, focusing on the "mathematical operation of increasing dimensions via spatial discretization" to improve the efficiency and accuracy of the interpolation without altering the computational load is much better than simply detailing the architecture. I also liked the change where the authors suggest that the DSE-NN approach is a new idea for data interpolation or downscaling analysis, in addition to helping in the evolution of advanced ocean models.

I still see problems with the text citing the “weighted mean absolute error (MAE)” in line 225, solely, without showing its relationship with the model in any of the figures, and having section 3.1 RMSE curves (!). Since the metrics are different, I believe that the authors should also point to the RMSE in line 225 as a method, or eventually eliminate the mention of the MAE in the manuscript. Likewise, I think that an explanation of the choices (which optimizer/penalizer to use) would be a worthwhile investment in the manuscript, but I respect the authors' decision not to delve into this.
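The difference between the two metrics can be shown with a small sketch. The temperature values are made up, and the optional `weights` argument is only an assumption about what a "weighted" MAE might mean, since the weighting scheme is not described here.

```python
import math


def mae(pred, obs, weights=None):
    """(Weighted) mean absolute error; uniform weights if none are given."""
    if weights is None:
        weights = [1.0] * len(obs)
    total = sum(w * abs(p - o) for p, o, w in zip(pred, obs, weights))
    return total / sum(weights)


def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Hypothetical temperatures in degrees Celsius; the last prediction is an outlier.
obs = [10.0, 12.0, 14.0, 16.0]
pred = [10.5, 11.5, 14.5, 18.0]

mae_value = mae(pred, obs)
rmse_value = rmse(pred, obs)
print(mae_value, round(rmse_value, 4))
```

RMSE is never smaller than the unweighted MAE and grows faster when a few errors are large, so a model trained on one and evaluated on the other should say so explicitly.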

Basically, I checked each point of the authors' response with version 2, and I am sure that this will be a successful manuscript!

One last point: in line 70, it should be “Recurrent” (not “Recently”) Neural Networks (RNN). And I understand that they decided to remove the "prediction" to make it broader.

With that, I congratulate the authors and look forward to seeing the article (and the code!) published.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

The paper is now ready to publish.
