1. Introduction
Vessel traffic risk management aims to enhance maritime safety, which includes the protection of life, health, the marine environment and property [1]. An important aspect of risk management lies in vessel traffic risk assessment, which can guide the authority’s work for improving safety. The risk matrix is widely used for vessel traffic risk assessments [2]. The risk matrix displays the frequency (or probability) of events and their consequence (or severity) using a matrix [3].
The risk matrix provides a mechanism for assigning risk and making risk acceptance decisions using a risk categorization approach. Each cell in the risk matrix corresponds to a specific combination of frequency and consequence, which indicates a level of risk. In the risk matrix, both the frequency and the consequence are measured on a category scale: frequent, reasonably probable, remote and extremely remote for the frequency measurement; minor, significant, severe and catastrophic for the consequence measurement. In addition, each cell in the matrix can be assigned an indication of risk acceptability for decision-making on risk [4,5].
During vessel traffic risk assessment, several methods can be used to estimate the vessel traffic accident frequency and consequence [6,7,8]. The fundamental way to calculate the vessel traffic accident frequency is to divide the number of accidents recorded in a given period by the corresponding exposure for that period [2]. In general, the vessel traffic accident consequence can be measured in different ways, such as economic, environmental and safety consequences [9]. For the purposes of this study, only safety consequences are considered. However, concerns have been raised about the reliability of vessel traffic risk assessment results [5,10,11]. To investigate this reliability, uncertainties in the risk assessment process should be analyzed. In practice, several qualitative frameworks [12,13] and quantitative methods [14] have been proposed to assess uncertainty in the field of maritime risk analysis.
There are typically two types of uncertainty: randomness resulting from inherent variability and cognitive imprecision resulting from expert opinions. The former is often referred to as aleatory uncertainty, whereas the latter is often referred to as epistemic uncertainty [15]. As vessel traffic accidents are random events, the number of accidents can be considered a random number generated from a random event [16]. Thus, the vessel traffic accident frequency calculated from the number of accidents is often taken as a variable with aleatory uncertainty due to inherent variability [17]. With regard to the vessel traffic accident consequence, it can be taken as the result of many factors affecting one another through an ambiguous development process. In the field of maritime risk analysis, the number of fatalities is generally far smaller than the number of vessel traffic accidents. The accident consequence is often considered a variable with epistemic uncertainty, as the available information is usually scarce and expert judgements are often incorporated in the process of analyzing the accident consequence [18].
Different mathematical methods are required to quantify different types of uncertainty. Aleatory uncertainty is often quantified by probability distributions [19]. For the quantification of epistemic uncertainty, a number of methods can be used, such as possibility theory and evidence theory [20]. In this study, possibility theory is selected to quantify epistemic uncertainty due to its mathematical simplicity [21]. It should be noted that the time window used for the inclusion of data has a major effect on the quantification of uncertainty [22]. If the time window selection is not considered, uncertainties may be measured too conservatively or too optimistically. Therefore, it is considered necessary to propose an algorithm for time window selection.
When selecting the time window, the aleatory uncertainty of vessel traffic accident occurrences should be taken into account. In addition, it should be noted that the vessel traffic accident frequency changes over time as maritime rules become operative on the vessel traffic being analyzed [23]. In other words, the time series of the accident frequency may reveal certain trends. Because of the uncertainties and variation trends of the vessel traffic accident frequency, the time window selection comes down to constructing and splitting the uncertain time series. In this study, the aleatory uncertainty of the vessel traffic accident frequency is quantified by the Poisson distribution [18], so that the confidence intervals of the accident frequency can be obtained for each year. When the confidence intervals are arranged in chronological order, the uncertain time series of the accident frequency is built up. Based on the orderliness of time series, the sliding window method [24] is used to split the uncertain time series, which aggregates the relatively concentrated confidence intervals. The latest segmentation is selected as the statistical time window.
The rest of the paper is organized as follows. Section 2 describes the algorithm for the quantification of the two types of uncertainty identified in the risk assessment process. In Section 3, the time window selection algorithm is described. In Section 4, the case study is carried out and comparisons are made to verify the superiority of the proposed algorithms. Conclusions are provided in Section 5.
3. Algorithm for Time Window Selection
The vessel traffic accident frequency is chosen as the index in the process of time window selection because accident consequences are generated through a fuzzy development process after accidents occur and are considered as variables with epistemic uncertainty. Based on Equation (1), the limits of the confidence interval for the number of accidents can be calculated. The confidence interval for the vessel traffic accident frequency can then be expressed in the number of accidents per ship year. When the confidence intervals for the accident frequency of each year are arranged in chronological order, the uncertain time series is formed [27].
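As a minimal sketch of how such an uncertain time series could be assembled, the Python code below computes a two-sided confidence interval for a Poisson count and divides its limits by the exposure in ship years. Equation (1) is not reproduced in this section, so the exact (Garwood) Poisson interval used here is an assumption and may differ from the form used in the paper. The function names are hypothetical, and the yearly counts are invented for illustration; they are not the data in Table 1.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed in log space for stability."""
    if k < 0:
        return 0.0
    if mu <= 0.0:
        return 1.0
    logs = [i * math.log(mu) - mu - math.lgamma(i + 1) for i in range(k + 1)]
    peak = max(logs)
    return min(1.0, math.exp(peak) * sum(math.exp(v - peak) for v in logs))


def bisect_increasing(f, lo, hi, iters=200):
    """Root of an increasing function f, assuming f(lo) <= 0 <= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_count_ci(n, conf=0.95):
    """Exact two-sided interval for the mean of a Poisson count n (assumed form)."""
    tail = (1.0 - conf) / 2.0
    lower = 0.0 if n == 0 else bisect_increasing(
        lambda mu: (1.0 - poisson_cdf(n - 1, mu)) - tail, 0.0, float(n))
    upper = bisect_increasing(
        lambda mu: tail - poisson_cdf(n, mu),
        float(n), n + 10.0 * math.sqrt(n + 1.0) + 10.0)
    return lower, upper


def uncertain_time_series(records, conf=0.95):
    """Map {year: (accidents, ship_years)} to [(year, frequency, lower, upper)]."""
    series = []
    for year in sorted(records):
        accidents, ship_years = records[year]
        lo, hi = poisson_count_ci(accidents, conf)
        series.append((year, accidents / ship_years, lo / ship_years, hi / ship_years))
    return series


# Hypothetical counts for illustration only (not the data in Table 1).
records = {2000: (130, 1000), 2001: (132, 1000), 2002: (105, 1000)}
for year, freq, lo, hi in uncertain_time_series(records):
    print(f"{year}: {freq:.3f} [{lo:.3f}, {hi:.3f}]")
```

Dividing the count limits by the number of ship years treats the exposure as known without error, which is a simplifying assumption of this sketch.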
According to the orderliness of time series, the closer a time interval is to the current time, the more closely the vessel traffic accident frequency measured over that interval approximates the current accident frequency. To reflect the variation trend of the vessel traffic accident frequency as maritime rules become operative on the vessel traffic being analyzed, it is considered necessary to split the uncertain time series of the vessel traffic accident frequency calculated above.
The sliding window approach is applied in the process of splitting the uncertain time series. It regards the first year as the first segmentation and continues to expand it until the vessel traffic accident frequency in a given year falls outside the confidence interval of the former segmentation. The next segmentation then starts with this year. This splitting procedure is repeated until the last year in the uncertain time series. The criterion for splitting is therefore whether the accident frequency in a given year falls outside the confidence interval of the former segmentation. The criterion is reasonable because it aggregates relatively concentrated confidence intervals at a certain level of confidence. When the accident frequency falls outside the confidence interval of the former segmentation, the accident frequency has changed under the influence of outside factors by more than the randomness of the accident data can explain. In other words, different segmentations show different levels of the vessel traffic accident frequency. Changes in the level of the vessel traffic accident frequency reflect the effects of technical or operational developments, new requirements, or specific arrangements on the vessel traffic being analyzed. These developments are often carried out along with maritime rules becoming operative. To represent the splitting process of the uncertain time series more clearly, the flowchart is shown in Figure 1.
After the preliminary segmentation, one more step is needed to obtain the final segmentation result. When a segmentation contains only one year and there is no remarkable change between its adjacent segmentations, the segmentation was caused by a large random fluctuation in that year's data. To avoid excessive segmentation, the segmentation and its adjacent segmentations should be merged. It should be noted that the condition for the segmentation, as can be seen in Figure 1, is used to judge whether there is a remarkable change. Finally, the latest segmentation is considered to be the statistical time window.
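The splitting and merging steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the interval of a segmentation is computed from its pooled accident count and pooled ship years, a normal approximation to the Poisson interval stands in for Equation (1), the "mean value" of a segmentation is taken as the mean of its yearly frequencies, and only interior one-year segmentations are checked for merging. All function names and the yearly counts are hypothetical.

```python
import math
from statistics import NormalDist


def rate_ci(accidents, ship_years, conf=0.95):
    """Normal-approximation interval for a Poisson rate (stand-in for Equation (1))."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    half = z * math.sqrt(accidents)
    return max(0.0, accidents - half) / ship_years, (accidents + half) / ship_years


def pooled_ci(segment, accidents, ship_years, conf):
    """Interval for a segmentation, pooling its accident counts and exposure."""
    return rate_ci(sum(accidents[i] for i in segment),
                   sum(ship_years[i] for i in segment), conf)


def split_series(accidents, ship_years, conf=0.95):
    """Preliminary segmentation: lists of year indices in chronological order."""
    segments = [[0]]
    for t in range(1, len(accidents)):
        lo, hi = pooled_ci(segments[-1], accidents, ship_years, conf)
        if lo <= accidents[t] / ship_years[t] <= hi:
            segments[-1].append(t)   # frequency stays inside: extend the segmentation
        else:
            segments.append([t])     # frequency falls outside: start a new one
    return segments


def merge_single_years(segments, accidents, ship_years, conf=0.95):
    """Merge an interior one-year segmentation with both neighbours when the mean
    frequency of the following segmentation lies inside the interval of the
    preceding one, i.e. when there is no remarkable change."""
    out = [list(s) for s in segments]
    i = 1
    while i < len(out) - 1:
        prev, nxt = out[i - 1], out[i + 1]
        if len(out[i]) == 1:
            lo, hi = pooled_ci(prev, accidents, ship_years, conf)
            mean_next = sum(accidents[j] / ship_years[j] for j in nxt) / len(nxt)
            if lo <= mean_next <= hi:
                out[i - 1:i + 2] = [prev + out[i] + nxt]
                continue             # re-check from the same position
        i += 1
    return out


# Hypothetical counts for illustration only (not the data in Table 1).
years = [2000, 2001, 2002, 2003, 2004]
accidents = [130, 132, 160, 90, 91]
ship_years = [1000] * 5
final = merge_single_years(split_series(accidents, ship_years), accidents, ship_years)
window = [years[i] for i in final[-1]]   # latest segmentation = time window
print(final, window)
```

With these invented counts the one-year segmentation is kept, because the mean frequency after it lies outside the interval of the segmentation before it; the last segmentation is then returned as the time window.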
4. Results
Using the vessel traffic accident data from the United Kingdom’s Marine Accident Investigation Branch [28], the case study is carried out on the algorithms for uncertainty quantification and time window selection that were illustrated in Section 2 and Section 3. The vessel traffic accident data include the numbers of United Kingdom vessels, vessel traffic accidents and fatalities caused by the accidents for each year from 2000 to 2016, which are shown in Table 1.
The vessel traffic accident frequency can be measured in the number of accidents per ship year. A ship year is defined as one ship sailing for one year, which can be understood as the corresponding exposure of the vessel traffic being analyzed. The vessel traffic accident frequency for each year between 2000 and 2016 can then be calculated by dividing the number of vessel traffic accidents by the number of vessels listed in Table 1. According to Equation (1), the confidence intervals of the vessel traffic accident frequency for each year between 2000 and 2016 can be estimated for a given confidence level. Different confidence levels can be set for different research purposes. In this study, the confidence level is set to 0.95 based on the a priori knowledge of the European Maritime Safety Agency [18]. When the confidence intervals are arranged in chronological order, an uncertain time series is formed. The uncertain time series is represented in Figure 2, in which the accident frequencies for the years 2000–2016 are represented by diamond icons connected by a line, and the limits of the confidence intervals are connected by dotted lines.
According to Section 3, the uncertain time series in Figure 2 is split by the sliding window approach. The first year (2000) in the uncertain time series is regarded as the first segmentation. As can be seen in Figure 2, the vessel traffic accident frequency and the corresponding confidence interval in the year 2000 are 0.132 and [0.111, 0.156]. Since the vessel traffic accident frequency in the year 2001 is 0.133, which lies inside the confidence interval [0.111, 0.156], the year 2001 is merged into the first segmentation. The confidence interval is then updated to [0.117, 0.149] based on Equation (1). The vessel traffic accident frequency in the year 2002 is 0.107, which falls outside the confidence interval [0.117, 0.149]. As a result, the second segmentation starts with the year 2002. This splitting procedure is repeated until the year 2016. Four segmentations are obtained: the years 2000–2001, 2002–2004, 2005, and 2006–2016. The year 2005 is a segmentation that contains only one year, so its adjacent segmentations must be compared. The mean value of the vessel traffic accident frequency during the years 2006–2016 is 0.084, which falls outside the confidence interval of the segmentation from the year 2002 to 2004, [0.096, 0.116]. This indicates that there is a remarkable change between the segmentation from the year 2002 to 2004 and the segmentation from the year 2006 to 2016. As discussed in Section 3, a remarkable change between segmentations reflects the effects of technical or operational developments, new requirements, or specific arrangements on the vessel traffic being analyzed. Thus, these segmentations should not be merged, and the four segmentations 2000–2001, 2002–2004, 2005, and 2006–2016 are retained. The most recent segmentation, 2006–2016, is treated as the statistical time window.
Based on the selected time window (2006–2016), the aleatory uncertainty of the vessel traffic accident frequency can be quantified as [0.080, 0.089] for the confidence level 0.95. For accident consequences, triangular distributions are used for the epistemic uncertainty quantification. The triangular distribution has three parameters (i, j, k), where j denotes the most likely value, at which the membership function is 1; i and k respectively denote the lowest and the highest possible values, at which the membership function equals 0 [29]. The trapezoidal distribution, another commonly used possibility distribution, can also be selected to quantify the epistemic uncertainty. In other words, there is epistemic uncertainty in the selection of the possibility distribution itself. To make this epistemic uncertainty as small as possible, the triangular distribution is selected in this study because less information is required to model the triangular distribution than to model the trapezoidal distribution. When more information about the accident consequence is available, other possibility distributions can be selected to quantify the epistemic uncertainty of the accident consequence.
In the risk matrix, the accident consequence refers specifically to the mean number of fatalities per vessel traffic accident. It can be measured in the number of fatalities per accident. In the time window (2006–2016), the lowest value, the mean value and the highest value of fatalities per accident per year are calculated as (0.155, 0.255, 0.394), which are taken as the three parameters of the triangular distribution. According to Equation (5), the epistemic uncertainty of the accident consequence can be quantified as [0.160, 0.387] for the confidence level 0.95. Since the aleatory uncertainty of the vessel traffic accident frequency and the epistemic uncertainty of the accident consequence are quantified for the same confidence level, they can be represented in one risk matrix, which is shown by the rectangle of solid lines in Figure 3.
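The consequence interval can be checked with a short Python sketch. Equation (5) is not reproduced in this section; the code assumes it amounts to the alpha-cut of the triangular possibility distribution at alpha = 1 − 0.95, an assumption that is consistent with the interval reported above for the parameters (0.155, 0.255, 0.394). The function name is hypothetical.

```python
def triangular_alpha_cut(i, j, k, alpha):
    """Alpha-cut [lower, upper] of a triangular possibility distribution (i, j, k).

    The membership function rises linearly from 0 at i to 1 at j and falls
    linearly back to 0 at k, so the cut at level alpha is obtained by linear
    interpolation on each side.
    """
    if not (i <= j <= k):
        raise ValueError("parameters must satisfy i <= j <= k")
    if not (0.0 <= alpha <= 1.0):
        raise ValueError("alpha must lie in [0, 1]")
    return i + alpha * (j - i), k - alpha * (k - j)


# Parameters reported for the 2006-2016 time window; alpha = 1 - confidence
# level is an assumption about the form of Equation (5).
lower, upper = triangular_alpha_cut(0.155, 0.255, 0.394, alpha=1 - 0.95)
print(f"[{lower:.3f}, {upper:.3f}]")
```

At alpha = 0 the cut is the full support [i, k], and at alpha = 1 it collapses to the most likely value j.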
To verify the effectiveness of the proposed algorithms, two comparisons have been made. The first comparison is between the common method of estimating the vessel traffic risk and the algorithm proposed to quantify uncertainties. The common method [2,5] measures the vessel traffic accident frequency and the accident consequence by averaging the historical accident data in Table 1. The accident frequency and consequence are then calculated as 0.095 and 0.279, respectively, which are represented by a cross symbol in Figure 3. Using the algorithm proposed to quantify uncertainties, the confidence intervals of the vessel traffic accident frequency and consequence can be estimated as [0.092, 0.099] and [0.165, 0.389], which form the dashed rectangle seen in Figure 3. It should be noted that all seventeen years (2000–2016) are taken as the time window in the first comparison. As can be seen in Figure 3, the cross symbol lies inside the dashed rectangle. This indicates that the algorithm proposed to quantify uncertainties is effective and can provide more information for vessel traffic risk management.
The second comparison is carried out to verify the effectiveness of the proposed algorithm for time window selection. Figure 3 shows that, in the risk matrix, the vessel traffic risk obtained with the selected time window is lower than the vessel traffic risk calculated from all the historical accident data. The reason for this is that the algorithm for time window selection can reflect the effects of technical or operational developments, new requirements, or specific arrangements on the vessel traffic being analyzed. According to Table 1 and Figure 2, the vessel traffic risk declines overall from the year 2000 until 2016. Thus, the vessel traffic risk may be overestimated without considering the time window selection.
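Both comparisons can be restated numerically with the values reported in this section. The Python sketch below is only an illustration; the `Box` type and the `contains` helper are hypothetical names, and each rectangle is given as a frequency interval and a consequence interval.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Rectangle in the risk matrix: frequency and consequence intervals."""
    freq: Tuple[float, float]
    cons: Tuple[float, float]


def contains(box: Box, freq: float, cons: float) -> bool:
    """True when the point (freq, cons) lies inside the rectangle."""
    return box.freq[0] <= freq <= box.freq[1] and box.cons[0] <= cons <= box.cons[1]


# Values reported in the text.
all_years = Box(freq=(0.092, 0.099), cons=(0.165, 0.389))   # dashed rectangle, 2000-2016
window = Box(freq=(0.080, 0.089), cons=(0.160, 0.387))      # solid rectangle, 2006-2016
point = (0.095, 0.279)                                      # common method, cross symbol

# First comparison: the point estimate lies inside the all-years rectangle.
print(contains(all_years, *point))

# Second comparison: the frequency interval for the selected window lies
# entirely below the interval obtained from all the historical data.
print(window.freq[1] < all_years.freq[0])
```

The two consequence intervals overlap almost completely, so the difference between the rectangles comes mainly from the accident frequency.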