A Comprehensive Approach to Account for Weather Uncertainties in Ship Route Optimization

: This work aims at deﬁning in a probabilistic manner objectives and constraints typically considered in route optimization systems. Information about weather-related uncertainties is in-troduced by adopting ensemble forecast results. Classical reliability methods commonly used in structural analysis are adopted, allowing to achieve a simple yet effective evaluation of the probability of failure and the variability associated with the predicted fuel consumption and time of arrival. A quantitative example of application is provided, taking into consideration one of the main North Atlantic routes.


Introduction
The selection of the most appropriate path for a ship to sail towards the destination has been a concern since the ancient times of maritime history. The degrading effect of environmental loads on vessel performance and their potential risk to its operability has been known since the early stages of navigation. Systematic collection of weather information has helped seafarers since the 18th century to reduce voyage time, but it is only in the second half of the 20th century that formal numerical techniques, initially based on the isochrone method [1], have been proposed to automatically identify the most favourable path.
Several approaches have been developed along the decades, with a sharp increase since the beginning of the new century as thoroughly depicted in [2,3]. The main concern has been on keeping the multi-objective nature of the optimization problem [4][5][6]. The objectives to be optimized typically encompass fuel consumption, voyage duration, and an index representing navigation safety. A common aspect of almost all of the proposed methods is to consider the prediction of environmental loads along the route as deterministic, even if the voyage may take several days. Nevertheless, the weather forecast is affected by uncertainties [7], which are nowadays to be mostly attributed to the inaccurate and partial knowledge about the initial conditions as third-generation numerical weather prediction models have achieved a high level of model accuracy. Only uncertainties related to the significant wave height and the wave period are accounted in this work for the sake of clarity, however, effortless extensions are possible, such as considering the effect of wind [8].
Ensemble forecasts [9][10][11] represent an effective technique to both improve the long term (more than 4-6 days) weather prediction and to assess its uncertainties, also providing their integral quantification in terms of standard deviations of the weather parameters. In its standard form, ensemble forecast consists of multiple runs of the numerical prediction models applying small perturbations to the initial conditions in order to evaluate their impact on the forecast. If not properly accounted for, the difference between the predicted and the actual weather conditions may counteract the effort made in the optimization of the route.
Pioneering work in this direction has been presented in [12], making use of ensemble forecasts to select the final route; however, ensemble forecasts were only used to evaluate the robustness of the selected route, ensuring that the constraints were not violated in a pre-defined number of cases. Applications to specific shipping sectors can be found in [13] for long term scheduling of LNG ships and in [14] which accounts for weather uncertainties for the development of speed optimization strategies in planning the operations of offshore supply vessels. In [15], uncertainties related to weather were considered on a two stageoptimization process: firstly, considering a classical multi-objective optimization of fuel and expected time of arrival (ETA), and secondly, selecting the route that minimizes the fuel consumption with a given probability of arriving on time. In [16], the effect of the uncertainty of weather forecasts in ship fuel consumption has been quantified along the ship routes.
A comprehensive and flexible way to incorporate uncertainties in the optimisation process is still missing, and the present work aims at filling this gap by proposing a method, potentially applicable to any voyage optimization algorithm, to evaluate uncertainties in the objectives and constraints typically accounted in ship routing. Section 2 presents the probabilistic approach considered, Section 3 describes the method to obtain the risk associated with a specific route, the required fuel consumption, and the expected voyage duration, each represented by a single value (similarly to the ones used in deterministic approaches) but accounting for weather uncertainties. An example of application on a north Atlantic crossing route is provided in Section 4, while conclusions and final remarks are discussed in Section 5.

Probabilistic Approach
The method assumes the knowledge of the mean values and the variances (or standard deviations) of the uncertain weather parameters along the route, e.g., significant wave height H S , mean period T m (or peak period T P ), and mean relative wave direction χ m , as well as the availability of a tool to estimate the required seakeeping responses, the fuel consumption, and the ship speed in the given weather conditions and for the imposed engine settings (typically RPM). The uncertain variables are considered normally distributed. Other sources of uncertainties not due to weather conditions are, for instance, model uncertainties in the estimation of ship responses [17][18][19], but also related to the attitude of the shipmaster and the engine control strategy adopted [20][21][22].

Navigational Risk
A probabilistic approach to navigational risk requires a robust methodology to estimate, at any given location along the route, the probability that any of the possible occurring risks will exceed a determined acceptable level l.
Consider a generic hazard (e.g., a seakeeping response) quantitatively definable by a value y. The value y may represent, for instance, a significant vertical acceleration at the bridge, the root mean square of roll angle [23], or the probability of green water [24]. The response y can be expressed as a function of a set of deterministic variables X = x 1 , . . . , x i , . . . , x n d (e.g., ship main dimensions) and a set of uncertain variables X = ( x 1 , . . . , x i , . . . , x n u ) (e.g., weather parameters), and a black-box model for the estimation of y is available: By expanding in Taylor series around the mean values of the uncertain variables X m = x m1 , . . . , x mi , . . . , x m nu , the response can be approximated in the first order as: The ship operations are considered safe if the response y does not exceed a limit value l. A limit state function can thus be defined as: The navigation is to be considered unsafe if the limit state function is negative g < 0. Assuming l to be constant for a given response, from the Taylor approximation it is straightforward to obtain an estimation of the expected value and the variance for the limit state function: is the partial derivative of the limit state function with respect to the uncertain variables, and σ x i is the standard deviation of the variables.
If only uncertainties in the significant wave height H S and a reference wave period (e.g., T P ) are considered, while all other variables are deterministic and included inX, the previous equations reduce to: where the correlation coefficient ρ between H S and T P can be estimated, for instance, from the scatter diagram of the North Atlantic, resulting in ρ = 0.575. Assuming Gaussian distribution of the uncertainties in the input variables, as a consequence of the first-order approximation approach adopted, the limit state function can also be described by a Gaussian distribution. Thus, the probability of unsafe operations p(g < 0) can then be estimated as: where Φ(x) is the normal cumulative density function [25]. It is worth noting that, since the method lacks formulation invariance, another definition of the limit state function with respect to Equation (3) may lead to different results when analytical solutions are not available.
In this work, considerations on navigation safety are limited to seakeeping responses, such as vertical acceleration at the bridge, lateral acceleration at a critical cargo location, roll, green water, and slamming, as is typical [26]. However, the concepts can be easily extended to other types of risk unrelated or limitedly related with weather, e.g., risk of collision or grounding, as far as an estimation model is available. Moreover, methods developed in the field of structural reliability also allow us to consider the limit value l as an uncertain variable, though this is not considered in the present work.

Navigation Performance
In order to evaluate the voyage performance, two parameters are relevant to ensure efficient operations of the ship are typically accounted for: duration and fuel consumption. The essential preliminary step to assess the voyage performance and the associated uncertainties is to evaluate the time and fuel required to sail an elementary portion of the route (navigation performance) and their variabilities related to the uncertain input variables.
A tool to obtain the ship speed V S and the fuel consumption per nautical mile FC nmi , given a set of deterministic variables X and a set of uncertain variables X, is here assumed to be available and considered as a black-box (e.g., [27]). Thus, for each track k in which the route is divided, one can always obtain the time d k and the fuel FC k required to sail the track: where l k is the length of the track in nautical miles. As for the case of navigational risk described in Section 2.1 that have in general complex, nonlinear, and not readily available analytical form, duration and consumptions can be linearized in a neighbour of the estimated value of the uncertain input variables, as: Assuming that the uncertain variables are normally distributed, with known variances σ 2 x i and correlation coefficients ρ ij , the linearization allows the adoption of simple equations to compute the estimates µ k and the variances σ 2 k of the parameters, as:

Definition of Optimization Objectives and Constraints
The previous section has discussed how to obtain a probabilistic estimation of the most significant quantities related to navigation performance and safety. A route optimisation software requires, however, integral quantities to define global objectives, enabling us to rank different route alternatives [28].

Safety Constraint
Risk constraints typically are defined in the form: where y and l are the value assumed by the generic hazard and is the maximum acceptable limit as previously defined, and c is a Boolean that indicates whether the condition is acceptable ("true") or not ("false"). This approach assumes a deterministic quantification of the hazard y. However, if uncertainties are affecting the quantification of the hazard, such an approach may be misleading. A hazard may assume the same mean value y m , corresponding to the deterministic estimation but being affected by a different level of uncertainty.
In the graphical example in Figure 1, two probabilistic distributions of a hazard with the same mean value, but different standard deviation, are shown, as well as the limiting value l. Distributions are, in a first approximation, assumed as symmetric with respect to the mean value. For the narrower bell, corresponding to a lower level of uncertainty, the probability of y ≥ l is about 1.6%, while for the other case, the probability exceeds 10%. The latter may be unacceptable, but the deterministic approach is not able to identify such a dangerous situation. An extreme case happens when y m approaches l and, consequently, the probability of y ≥ l approaches 50%.
value . Distributions are, in a first approximation, assumed as symmetric with respect to the mean value. For the narrower bell, corresponding to a lower level of uncertainty, the probability of ≥ is about 1.6%, while for the other case, the probability exceeds 10%. The latter may be unacceptable, but the deterministic approach is not able to identify such a dangerous situation. An extreme case happens when approaches and, consequently, the probability of ≥ approaches 50%. The probabilistic approach offers much more flexibility in the definition of the constraints, which can be expressed in the form: where is the risk-acceptable probability that the hazard will occur. For symmetric probabilistic distributions of the hazard, as in the previous example, = 0.5 the results are analogous to those of the deterministic approach. Typically, the acceptable risk differs depending, for instance, on the ship type. For instance, the acceptable risk of a passenger ship is lower than for a cargo ship, and this is lower than for a navy ship. The parameter can readily be used as a Boolean constraint to discard unsafe alternatives.

Risk Objective Function
The probability of the occurrence of an unsafe condition obtained in Equation (Error! Reference source not found.) only refers to one of the possible hazards and to a specific location along the route, so it can be hereafter referred as , .
By assuming that the hazards are perfectly correlated, the probability of being caught in unsafe situations in a specific location can be estimated as the highest probability among the hazards considered. On the other hand, if the events are considered statistically independent, such probability could be obtained by the expression where is the number of hazards. In the general case, it can be said that the probability of failure of the system falls between the range expressed in Equation (Error! Reference source not found.). With a conservative approach, the upper bound can be taken [25]: The probabilistic approach offers much more flexibility in the definition of the constraints, which can be expressed in the form: where p R is the risk-acceptable probability that the hazard will occur. For symmetric probabilistic distributions of the hazard, as in the previous example, p R = 0.5 the results are analogous to those of the deterministic approach. Typically, the acceptable risk differs depending, for instance, on the ship type. For instance, the acceptable risk p R of a passenger ship is lower than for a cargo ship, and this is lower than for a navy ship. The parameter c can readily be used as a Boolean constraint to discard unsafe alternatives.

Risk Objective Function
The probability of the occurrence of an unsafe condition obtained in Equation (6) only refers to one of the possible r hazards and to a specific location k along the route, so it can be hereafter referred as p k,r .
By assuming that the hazards are perfectly correlated, the probability of being caught in unsafe situations in a specific location can be estimated as the highest probability among the hazards considered. On the other hand, if the events are considered statistically independent, such probability could be obtained by the expression 1 − ∏ where n r is the number of hazards. In the general case, it can be said that the probability of failure of the system falls between the range expressed in Equation (12). With a conservative approach, the upper bound can be taken [25]: It is then clear that p k represents the probability of occurrence of any of the considered hazards in a given location along the route. Once p k is computed for all the control points along the route, these risk probabilities must be gathered in a single risk index as to be effectively used as an objective function to be minimized.
Depending on the requirements, different considerations may be done: 1. If the strategic objective is to prevent the ship to operate in conditions close to the maximum acceptable limit even for short periods, minimizing the highest p k is desired, thus a suitable risk index can be: where n k is the number of control points. When comparing risk 1 of two different routes, one alternative with smooth navigation for the greatest part of the voyage, but a short, particularly demanding event would have a higher risk index than a route where conditions are worse for most of the voyage.
This approach may be considered when the focus is on safety and the ultimate stresses of the system. However, if the constraints are appropriately set (e.g., a higher p R on Equation (11)), these can be sufficient to prevent dangerous conditions.
2. If, instead, comfort [29] is of primary importance, the average conditions along the route appear to be a more relevant indicator. In this case, a weighted average can be applied: where d k refers to the time the ship takes to sail the track identified by the k control point and d is the duration of the voyage, being d = ∑ d k . When comparing risk 2 of the two routes described at the previous point, the opposite would occur and the second route would have a higher risk index. The short passage through a storm is, in this case, hidden by the averaging, which may result in unexpected and undesired situations onboard 3. As to account for both the average conditions along the route and the most demanding ones, a generalized expression can be expressed in the following form: where w 1 and w 2 are weights to be defined depending on the ship type and strategy. As a default setting, w 1 = w 2 = 0.5 can be considered. 4. None of the previous methods appears adequate to reflect the level of severity felt onboard. A more appropriate solution could be obtained by assuming that the voyage results are less comfortable and more demanding if more challenging ship responses are faced for a relevant period, with respect to the duration of the route. Consequently, the navigation safety can be judged by analysing a restricted set of tracks that encompass a certain duration characterized by the higher probabilities of failure (higher values of p k ). The duration of the "relevant period" of time is arbitrarily set as equal to one-third of the voyage duration, though investigations have been carried out to adjust such a proportion by obtaining the feedback of experienced seafarers.
Similarly to case 2, a weighted average is performed, but in this case over the most severe cases, which account for 1/3 of the voyage duration. Beingk the most severe control point for which ∑ dˆk = d/3:

Performance Objective Functions
Differently from the risk objective function that has been consistently defined as a probability of failure R, to express voyage performance, physical quantities must be maintained. Thus, a different approach is required. Using Equation (9), the estimates and variances of the fuel consumption and sailing time can be calculated for all the tracks.
The corresponding integral values for the whole voyage µ and σ 2 can be obtained by summation of the previous, such as: where ρ kj is, in this case, the correlation coefficient between the performance parameter (e.g., fuel consumption) on two different tracks. Since all the other variables have been discussed in Section 2.2, the problem shifts to the assignment of realistic values to the correlation coefficient ρ kj . The problem is not trivial; in fact, the interrelationship between, for instance, the fuel consumption on two different locations along the route depends on a large number of factors, such as the intensity of the atmospheric conditions affecting the area, the topography, and the manner in which the Shipmaster conducts the navigation. Intuitively, it can be reasonably postulated that the conditions of navigation are generally similar at a distance of a few miles, and therefore the performance is also similar. After several miles, the presence of land, passage through channels, and changes in weather conditions and engine settings can result in variations in the performance, and possible similarities have no systematic nature. Consequently, ρ kj is qualitatively higher for neighbour tracks and progressively reduces if the locations considered are far from one to the other.
Some preliminary analyses on routes in the North Atlantic suggested that the correlation can be neglected if the distance d kj between the considered locations is larger than 150 nmi. Thus, a simple form for the correlation coefficient linearly dependent on the distance is proposed as: By applying similar evaluations at different routes, it has been found that the correlation vanishes at a higher distance in milder weather conditions, due to the fact that weather-unrelated considerations prevail in those cases. The correlation coefficient did not appear significantly influenced by the ship type. The influence of other factors, such as weather severity and variability, must be studied in more detail to build a more generic and robust model for the correlation.
The proposed method allows an estimation of µ d , µ FC , σ 2 d and σ 2 FC , where, within the Gaussian assumption of the distribution of the uncertain variables, the means correspond to the values obtained by a deterministic approach.
In practical terms, however, two values (mean and standard deviation) to define a single objective function are not convenient. It is then possible to make use of the additional information provided by the standard deviations to compute a predefined percentile, namely the values of the voyage duration and consumption, which have a given probability to be exceeded. The normal distribution assumption allows to readily obtain an estimation of these values as: where Φ −1 indicates the inverse Gaussian distribution, and p d and p FC the probability of the parameters to be below the values d p and FC p . For instance, for p FC = 0.75, the selected route will have 75% probability to consume less than FC p . The percentiles d p and FC p can be used to rank the different alternatives and perform optimizations, similarly to how it would be done by adopting a deterministic approach, though carrying a higher degree of information.

Example of Application
The process applied to implement the probabilistic approach discussed in the previous sections for the evaluation of route optimization constraints and objective functions is summarised in the flowchart in Figure 2.
of the parameters to be below the values and . For instance, for = 0.75, the selected route will have 75% probability to consume less than . The percentiles and can be used to rank the different alternatives and perform optimizations, similarly to how it would be done by adopting a deterministic approach, though carrying a higher degree of information.

Example of Application
The process applied to implement the probabilistic approach discussed in the previous sections for the evaluation of route optimization constraints and objective functions is summarised in the flowchart in Figure 2. The method has been applied to a transatlantic route starting from the British Channel with its destination on the east coast of the United States, as shown in Figure 3, being one of the most travelled commercial routes in this area [30]. The method has been applied to a transatlantic route starting from the British Channel with its destination on the east coast of the United States, as shown in Figure 3, being one of the most travelled commercial routes in this area [30]. The route simulation is made considering a containership with main characteristic listed in Table 1.  The route simulation is made considering a containership with main characteristics listed in Table 1. The departure time is at midnight on 25 April 2020 and the engine is set to operate at 90% of the maximum speed, corresponding to 129.6 RPM and a cruise speed of 24.35 kn in still water. During the navigation, a constant RPM engine strategy is adopted [20] until the engine overload curve is reached; when the resistance increases, the RPMs are progressively reduced to prevent overloading and make the engine operate within acceptable working conditions according to the load diagram provided by the manufacturer.
The mean weather conditions expected along the route and the relative uncertainties are obtained from the NOAA GWES ensemble forecast [9], assuming the ship will travel at the speed corresponding to the mean weather conditions [31]. Figure 4 describes the significant wave height and the mean period forecasted during the navigation, including all the ensemble members, and highlighting the mean values and 5th and 95th percentiles; the encountered conditions are also depicted with dots. The route simulation is made considering a containership with main characteristics listed in Table 1. The departure time is at midnight on 25 April 2020 and the engine is set to operate at 90% of the maximum speed, corresponding to 129.6 RPM and a cruise speed of 24.35 kn in still water. During the navigation, a constant RPM engine strategy is adopted [20] until the engine overload curve is reached; when the resistance increases, the RPMs are progressively reduced to prevent overloading and make the engine operate within acceptable working conditions according to the load diagram provided by the manufacturer.
The mean weather conditions expected along the route and the relative uncertainties are obtained from the NOAA GWES ensemble forecast [9], assuming the ship will travel at the speed corresponding to the mean weather conditions [31]. Figure 4 describes the significant wave height and the mean period forecasted during the navigation, including all the ensemble members, and highlighting the mean values and 5th and 95th percentiles; the encountered conditions are also depicted with dots.  It can be noticed that the initial part of the route is subjected to milder weather conditions, characterized by relatively low swells, and agreement among the ensembles due It can be noticed that the initial part of the route is subjected to milder weather conditions, characterized by relatively low swells, and agreement among the ensembles due to the short prediction range. After two days, an area of moderate waves is expected with maximum H S reaching 4 m, though maintaining small variability among the members. In the second half of the voyage, the situation becomes more uncertain due to the long prediction range (>4 days) with some ensemble members showing values of H S close to 6 m. Indeed, these pessimistic predictions are those which more closely represent the actual sea-states encountered by the ship. Moreover, the significant wave height is always within the 90% confidence level, while the wave period exceeds the upper limit in some cases.
By applying the equations described in the previous sections based on the first-order second-moment method, the uncertainties in the weather prediction have been propagated in the model to estimate the uncertainties in the ship's operational characteristics.
As shown in Figure 5, the variability of the speed is limited, with an average value always higher than 22 kn. This is due to a sufficient sea margin of the engine and the fact that the limits of the seakeeping responses are not exceeded, thus not requiring voluntary speed reduction [32].
second-moment method, the uncertainties in the weather prediction have been propagated in the model to estimate the uncertainties in the ship's operational characteristics.
As shown in Figure 5, the variability of the speed is limited, with an average value always higher than 22 kn. This is due to a sufficient sea margin of the engine and the fact that the limits of the seakeeping responses are not exceeded, thus not requiring voluntary speed reduction [32]. On the other hand, Figure 6 indicates a large variability in the estimation of the fuel consumption, in this case represented in the form of fuel consumption per nautical mile. This is mostly explained by an increase in the power required to face the increased resistance due to more severe weather conditions, and the consequent decrease in speed [33].  On the other hand, Figure 6 indicates a large variability in the estimation of the fuel consumption, in this case represented in the form of fuel consumption per nautical mile. This is mostly explained by an increase in the power required to face the increased resistance due to more severe weather conditions, and the consequent decrease in speed [33]. gated in the model to estimate the uncertainties in the ship's operational characteristics.
As shown in Figure 5, the variability of the speed is limited, with an average value always higher than 22 kn. This is due to a sufficient sea margin of the engine and the fact that the limits of the seakeeping responses are not exceeded, thus not requiring voluntary speed reduction [32]. On the other hand, Figure 6 indicates a large variability in the estimation of the fuel consumption, in this case represented in the form of fuel consumption per nautical mile. This is mostly explained by an increase in the power required to face the increased resistance due to more severe weather conditions, and the consequent decrease in speed [33].  The estimated value of the voyage fuel consumption, roughly corresponding to the only result that could be obtained with a deterministic approach, is equal to 408.7 t, while the 95th percentile is 418.4 t.
For what concerns safety, several seakeeping responses are taken into account, namely vertical acceleration at the bridge, lateral acceleration at a critical cargo location, roll, green water, and slamming, and the overall probability of failure is taken as the upper bound of Equation (12). The probabilities of failure computed with the four methods described in Section 3.2 are compared in Table 2. The probabilities of failure are small in all cases, since no extreme conditions that could put the ship at risk are encountered in this voyage. However, the strength of the approach can be better estimated when used to rank different routes, as it provides a meaningful way to aggregate navigation hazards and to evaluate the probability of the severity of the entire voyage, accounting for uncertainties in the weather forecast.

Conclusions
A method has been proposed to exploit the additional information provided by ensemble weather forecasts in the identification of the most favourable ship route. Several assumptions are adopted with respect to the distribution of such uncertainties and their propagation in the model. Nevertheless, the method proposed has the advantage of being simple to adopt also on existing routing software, as it maintains the typical structures of objectives and constraints, extending the amount of information provided to the ranking method (and to the decision-maker) for a more aware selection. Prediction of route performance, identified with fuel consumption and duration of the voyage, are accompanied by the variance of such objectives, enabling the evaluation of the related percentiles, e.g., instead of a deterministic estimated time of arrival (ETA), the time within which the ship will arrive with 90% of probability can be considered. Navigation safety, which typically generates difficulties in the aggregation of different hazards, is consistently represented by the probability of system failure, which is the probability of exceeding predetermined thresholds of all the considered hazards. The test case provides a numerical example of the application of the method, highlighting the differences with a classical deterministic approach.
Further investigation will allow better appreciation of the strength of the proposed approach in the ranking of different routes and compare the accuracy of the predictions. Moreover, research is being conducted within the scope of the ROUTING project (http: //routing-project.eu/ (accessed on 1 December 2021)) [34], aimed at obtaining the feedback of experienced seafarers that will support the tuning of specific parameters, such as the duration of severe conditions to compromise comfort and/or safety onboard and the percentiles of the expected ship performance, to achieve trustworthy predictions without assuming a too-conservative approach.