Article
Peer-Review Record

Evaluation of Structured and Unstructured Models for Application in Operational Ocean Forecasting in Nearshore Waters

J. Mar. Sci. Eng. 2020, 8(7), 484; https://doi.org/10.3390/jmse8070484
by Shannon Nudds 1,*, Youyu Lu 1, Simon Higginson 1, Susan P. Haigh 2, Jean-Philippe Paquin 3, Mitchell O’Flaherty-Sproul 2, Stephanne Taylor 1, Hauke Blanken 4, Guillaume Marcotte 5, Gregory C. Smith 3, Natacha B. Bernier 3, Phillip MacAulay 6, Yongsheng Wu 1, Li Zhai 1, Xianmin Hu 1, Jérôme Chanut 7, Michael Dunphy 4, Frédéric Dupont 8, David Greenberg 1, Fraser J. M. Davidson 9 and Fred Page 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 30 April 2020 / Revised: 3 June 2020 / Accepted: 26 June 2020 / Published: 30 June 2020

Round 1

Reviewer 1 Report

This paper presents a collaborative work driven by Canada's OPP on the evaluation of NEMO and FVCOM. The study area is Saint John Harbour, and the emphasis is on high-resolution modelling of local flows. Overall, the paper is interesting and its results have impact. My comments are as follows.

I. Tuning more towards multiscale
This paper deals with high-resolution simulation of local flows, which are driven by large-scale background estuary flows. It is therefore a work on multiscale flows and simulations, although the term "multiscale" is not used in the writing. It would be interesting, and also pertinent to the theme of this Special Issue, to discuss this more and in greater depth if the authors can. At the authors' discretion, such discussion could address, for instance,

a) How to refine meshes to achieve sufficient resolution for resolving small-scale flows, e.g., at the river mouths in Sec. 2, and how small-scale local flows and large-scale background flows affect each other in Sec. 3.

b) As stated at the end of Sec. 4, there are lower limits on mesh resolution in the models; a detailed discussion of this point in the main body would be interesting.

c) Moreover, the abstract could be tuned slightly towards multiscale aspects.

II. Content of Section 2
This section presents the framework for the evaluation and is good to have. Currently, it is a general discussion, and I think it would be better to be more specific. For instance, most of the equations and related content (e.g., their definitions and explanations) in Sec. 3 could be moved to Sec. 2. This would not only strengthen Sec. 2 but also make the paper easier to read.

III. Minor comments and questions

a) In Fig. 8, in comparison with the observations, NEMO performs much better than the other two models. Why is this, given that they have the same mesh resolution?

b) The writing can be improved. For instance, in the discussion of OPP in the first paragraph of the Introduction, it would be good to add some references on OPP, e.g., at the first sentence of the paragraph. Also, the paper describes at length what was done, but what is extracted or concluded from it is not as clear.

c) Sec. 4 reads more like a concluding-remarks section than a discussion section. Perhaps rename it accordingly?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This subject matter is relevant to the JMSE audience.

The concept of comparing practical implementations of two community code bases and documenting the many technical details involved is of practical interest to other organisations involved in ocean forecasting.

Overall, the technical detail and plots are very well done, but I am not convinced by the title and contextual framing. If these text-only issues are addressed, it would be appropriate for publication.

 

Major suggestions:

A) Despite the title, I don't actually see any 'evaluation framework' explicitly described.

From the content it can be inferred that the authors are asserting that the multi-dimensional and complicated task of evaluating a nearshore forecast system is best left to the developers themselves, using expert judgement and metrics defined after the fact. This may indeed be reasonable, but it should be stated clearly. In contrast, why doesn't the framework involve actual customers or pre-defined metrics?

Even if the set of evaluation perspectives described is to be applied to the other planned domains, how should the outcomes be acted on? It isn't clear to me whether the authors are suggesting that evaluation is best done as a single-score A/B test or whether the various skill measures should somehow be distilled and interpreted by a panel of experts.

It seems to me that the work acknowledges that practical evaluation in an operational setting is inherently non-trivial and a 'moving feast'; but the text does not say so directly, and it does not lay out a generic framework for evaluating forecast models as promised.

 

B) Where is the forecast dimension?

I struggled to clarify what role actual forecasts played in these evaluations. Were the runs a series of non-overlapping short forecasts (more akin to an analysis run), and if not, what was the forecast horizon? It may well be in the text somewhere, but I managed to get through the manuscript without being able to answer this fundamental question. The absence of metrics focused on the decay of skill with lead time added to this. More simply, how much worse is the 3-day forecast than the 1-day forecast?
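To make the suggestion concrete, a lead-time-binned error statistic would answer this question directly. The following is a minimal sketch of what I have in mind (illustrative only; the array names and the choice of RMSE are my assumptions, not something taken from the manuscript):

import numpy as np

def rmse_by_lead_time(forecast, observed, lead_hours, bin_hours=24.0):
    # Root-mean-square error of matched forecast/observation pairs,
    # grouped into lead-time bins (day 1, day 2, day 3, ...).
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    lead_hours = np.asarray(lead_hours, dtype=float)

    edges = np.arange(0.0, lead_hours.max() + bin_hours, bin_hours)
    rmse = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (lead_hours >= lo) & (lead_hours < hi)
        err = forecast[in_bin] - observed[in_bin]
        rmse.append(np.sqrt(np.mean(err ** 2)) if err.size else np.nan)
    return edges[:-1], np.array(rmse)

Plotting the returned RMSE against lead time would show at a glance how much worse the day-3 forecast is than the day-1 forecast.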

C) Forecast system design and simulation scope

The configuration of the models was adequately described with regard to domains, grids and boundary conditions. But I think the question of how the basic scope of the simulations was decided, i.e., which physics to include and exclude, is not addressed. For instance, why are surface waves excluded? Why bother including tides if harmonics are treated as a point of truth? Why not run lower-resolution models as an ensemble? Why is there no data assimilation? Is the lack of real-time observations a factor in the forecast system design? I am sure the authors can easily answer such design questions, but the manuscript would be improved if this were addressed.

 

Minor suggestions:

The caption of Fig. 4 should mention the tidal filtering described in the text, to alert the reader to the fact that 'residual' here is not identical to the common understanding.
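For clarity, by the common understanding of 'residual' I mean something like the sub-tidal signal left after a simple low-pass filter. A schematic sketch of that conventional reading follows (a generic 25-hour running mean; this is my illustration, not the filter the authors describe):

import numpy as np

def subtidal_residual(series, dt_hours=1.0, window_hours=25.0):
    # Centred running mean over roughly one tidal day, which suppresses
    # the diurnal and semi-diurnal signal and leaves the residual flow.
    n = max(1, int(round(window_hours / dt_hours)))
    kernel = np.ones(n) / n
    return np.convolve(np.asarray(series, dtype=float), kernel, mode="same")

If the filtering actually used differs from this kind of reading, the caption should say so.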

Section 3.7 and Fig. 9: I could not find the expected interpretation of the time evolution of instantaneous skill. At face value there is some increase of this skill measure with time; what does that mean? It is very common to evaluate forecasts by the decay of skill with lead time. Either way, some comments would help.

Section 3.4 on tidal evaluations: I think the justification of the relevance of the metrics based on harmonic analysis could be improved. Why is the M2 vector difference a better metric than alternatives such as time-series statistics against the observable total water level, or the time and magnitude error of tidal extrema?

On a similar topic, the tidal treatment of the long periods (Sa, SSa) may be worth a comment.
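For reference, the M2 vector difference I refer to above is, as I understand it, the modulus of the difference between the observed and modelled constituents treated as complex amplitudes. A minimal sketch (the numbers in the example are mine and purely illustrative):

import numpy as np

def constituent_vector_difference(amp_obs, phase_obs_deg, amp_mod, phase_mod_deg):
    # |A_obs * exp(i*g_obs) - A_mod * exp(i*g_mod)| for a single constituent,
    # combining amplitude and phase errors into one distance in metres.
    z_obs = amp_obs * np.exp(1j * np.deg2rad(phase_obs_deg))
    z_mod = amp_mod * np.exp(1j * np.deg2rad(phase_mod_deg))
    return float(np.abs(z_obs - z_mod))

# e.g. a 5 cm amplitude error plus a 5 degree phase error on a 3 m M2 tide:
print(constituent_vector_difference(3.00, 120.0, 2.95, 125.0))  # ~0.26 m

Such a single number is compact, but it does not directly tell a reader how large the error in total water level or in the timing of high water would be, which is why I raise the comparison.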

 

 

Fig. 4 is of notably lower resolution than all other figures. Not fatal, but it is a distraction that would be better fixed if possible.

 

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have addressed most of my questions and I have nothing more to say.