1. Introduction
Throughout maritime history, ship casualties have had catastrophic impacts, affecting human life, societal wellbeing, and environmental integrity. In response to these severe events, the International Maritime Organization (IMO) introduced new regulations and updated existing rules aimed at enhancing maritime transport safety [1,2]. Historical records reveal an extensive and concerning list of maritime accidents, accompanied by substantial human casualties [3]. Data from the IMO indicate that approximately 20% of all maritime accidents involve ship collisions, resulting annually in notable economic damages, fatalities, environmental pollution, and other detrimental consequences [4].
The global increase in demand for raw materials and manufactured products has led to a rise in both the number and size of ships traversing international waters. Consequently, this trend creates denser maritime traffic and significantly escalates collision risk. The severity of consequences following marine accidents has also intensified, potentially reaching catastrophic scales [5]. Research on maritime accidents has evolved significantly, undergoing various fundamental transformations. Gaining insights into these developments allows maritime industry stakeholders to evaluate past actions critically, enhance future maritime safety measures, and substantially reduce risks associated with vessels, their crews, cargo, and marine ecosystems [6].
Maritime accidents present considerable risks to human safety, environmental sustainability, and the resilience of global supply chains [7,8]. The various types of maritime accidents, such as collisions, contacts and groundings, each contribute differently to maritime casualty statistics. Collisions are noted as both the most common and the most severe type of accident, frequently caused by breaches of navigational regulations and human errors. Research by Uğurlu [9] highlights that human-related factors, especially errors in perception and manoeuvring, account for the majority of collision accidents, underlining the importance of improved training programs and stricter adherence to safety protocols. Although groundings and fires occur less often, they can produce significant environmental damage and substantial economic losses [10,11]. The Ever Given grounding accident in the Suez Canal notably demonstrated how such accidents can severely disrupt global trade operations [12]. Furthermore, studies focusing on risks specific to ports, such as those conducted in Tianjin and Hong Kong, highlight how local operational practices and regulatory environments shape the nature and frequency of accidents [13,14].
Safe maritime operations remain essential for port functionality, with ship entry, mooring, and unmooring being particularly challenging processes [15]. Stakeholders, including port authorities, vessel owners, and ship operators, prioritise ensuring that risk levels remain acceptably low. Thus, there is a need for methodologies capable of swiftly and effectively assessing risks related to maritime operations, especially in confined and congested waters.
In the maritime industry, numerous risk assessment methodologies exist to analyse specific risks under defined constraints. However, a comprehensive evaluation of maritime safety necessitates the simultaneous consideration of multiple interconnected factors. To achieve detailed and reliable safety assessments for individual vessels or entire fleets, it is essential to adopt a unified method capable of integrating diverse data streams into a coherent analytical framework.
This paper proposes a model that estimates navigational-accident risk from maritime Key Performance Indicators (KPIs). Expert-elicited KPI ranks are transformed into weights using the Average Rank Transformation into Weights—linear (ARTIW-L) and nonlinear (ARTIW-N)—and aggregated into a Ship Risk Profile (SRP) that varies with observed KPI performance. While the Baltic and International Maritime Council (BIMCO) framework offers comprehensive KPIs for monitoring operational performance, their linkage to accident risk remains under-explored. By quantifying how degraded KPI scores shift the SRP, the model provides a tractable way to assess—and potentially reduce—navigational-accident risk for collisions, contacts, and groundings. The novelty of this study lies in operationalising an expert-weighted KPI-to-risk mapping that yields an interpretable nominal SRP and a ship-specific normalised accident-risk index suitable for proactive navigational decision support.
Following the introduction, Section 2 reviews navigational-accident statistics, common contributing factors, and established maritime risk-assessment approaches. Section 3 describes the methodology used to derive expert-based KPI weights using ARTIW-L/ARTIW-N and to construct the nominal SRP and the normalised accident-risk index. Section 4 presents the results of the expert evaluation and the resulting KPI priorities and weights. Section 5 discusses the implications of the findings, links them to related evidence, and outlines limitations and directions for future research. Section 6 concludes the paper.
3. Materials and Methods: KPI Ranking, ARTIW Weighting, and SRP Construction
In this study, expert assessments are used to derive normalised importance weights for maritime KPIs and to construct a nominal Ship Risk Profile ($\mathrm{SRP}_{\mathrm{nom}}$) for navigational-accident risk indication. The workflow consists of: (i) collecting expert KPI importance assessments and converting them into a priority order, (ii) testing inter-expert agreement using Kendall’s coefficient of concordance [50], (iii) transforming aggregated KPI ranks into weights using the linear ARTIW-L and nonlinear ARTIW-N methods [46,47,48,49], and (iv) aggregating these weights into $\mathrm{SRP}_{\mathrm{nom}}$ and the normalised accident-risk index $I_{\mathrm{AR}}$. The research flowchart for constructing $\mathrm{SRP}_{\mathrm{nom}}$ and calculating the normalised accident-risk index $I_{\mathrm{AR}}$ is presented in Figure 3.
The ARTIW-L and ARTIW-N methods are well suited to the objectives of this study, as they operate directly on expert importance rankings and support explicit assignment of zero importance to indicators, allowing non-relevant KPIs to be excluded when full expert agreement is achieved. The combined use of linear and nonlinear transformations provides a weighting structure that simultaneously preserves proportional importance and emphasises the most and least influential KPIs. In contrast to pairwise-comparison-based approaches, such as AHP, ARTIW avoids pairwise comparisons between indicators, which makes it simpler to apply to a large set of indicators. Moreover, the ARTIW framework supports iterative development, enabling additional KPIs to be incorporated in future research through re-ranking and re-normalisation without altering the existing KPI structure.
Section 3.1 describes the KPI set, expert panel and questionnaire; Section 3.2 and Section 3.3 define the ARTIW-L and ARTIW-N weighting procedures; Section 3.4 presents the agreement (consistency) analysis; and Section 3.5 defines $\mathrm{SRP}_{\mathrm{nom}}$ and the calculation of the normalised accident-risk index $I_{\mathrm{AR}}$.
3.1. Expert Panel, KPIs and Questionnaire
The expert panel is designed as a group of navigational practitioners (experts). Panels of approximately 10–20 experts are commonly considered sufficient in expert-based multicriteria and safety assessments. In this methodology, the panel size is selected so that the total number of experts is not less than half the number of KPIs being evaluated, and the consistency of their judgements is subsequently assessed using concordance measures such as Kendall’s W. Experts are drawn from three professional categories (masters, chief mates and pilots), each with at least five years of professional experience in the maritime sector. Where possible, the panel composition is balanced across these three groups; however, exact equality in category sizes is not critical, provided that all categories are represented and the overall panel remains sufficiently large and diverse for subsequent aggregation of rankings and consistency analysis. Such expert-driven qualitative evaluation is consistent with contemporary best practices in maritime safety assessments and risk management [51,52,53,54].
KPIs can be adopted from external sources such as BIMCO [44], or developed independently to characterise the technical condition of the ship, the organisation of shipboard operations and human-factor management.
Accident investigations provide detailed reports identifying the primary factors contributing to an accident. To examine the root causes of navigational accidents, these contributing factors—risk factors—can be related to maritime KPIs by experts. The questionnaire is designed to assess the relationship between each KPI and these risk factors, based on experts’ experience. Using their professional judgement, domain knowledge and operational experience, experts assess every KPI using a four-level importance scale from 0 to 3 in response to the question: How important is this KPI for assessing the ship’s and crew’s ability to avoid a collision, contact or grounding accident? The scale is defined as follows:
0—not important (no meaningful link with accident risk);
1—slightly important (limited link with accident risk);
2—important (clear but not critical link with accident risk);
3—very important (direct or critical link with accident risk).
These qualitative evaluations are used to select and prioritise the most relevant KPIs for further analysis and to provide the input data for the rank-based weighting procedures with the ARTIW-L and ARTIW-N methods.
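The text does not spell out how the 0 to 3 importance scores are turned into a priority order for each expert. The following is a minimal sketch under one common convention, which is an assumption here: a higher score receives a better (smaller) rank, and KPIs with the same score share the average of the rank positions they occupy. The function name `scores_to_ranks` and the example scores are hypothetical.

```python
def scores_to_ranks(scores):
    """Convert one expert's importance scores (0-3) into ranks 1..m.

    Assumption (not stated in the text): higher score -> smaller rank,
    and tied scores share the mean of the rank positions they occupy.
    """
    m = len(scores)
    # KPI indices ordered so that the highest score comes first.
    order = sorted(range(m), key=lambda j: -scores[j])
    ranks = [0.0] * m
    pos = 0
    while pos < m:
        end = pos
        # Extend the block while the next KPI has the same score.
        while end + 1 < m and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # The block occupies 1-based positions pos+1 .. end+1.
        mid_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mid_rank
        pos = end + 1
    return ranks


# Hypothetical scores given by one expert to five KPIs.
example_scores = [3, 1, 3, 0, 2]
example_ranks = scores_to_ranks(example_scores)
print(example_ranks)
```

Under this convention every expert's ranks sum to m(m + 1)/2 regardless of ties, which keeps the average ranks of different experts comparable.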
3.2. ARTIW-L Weighting of KPIs
The ARTIW-L method converts expert ranks of the KPIs into a vector of normalised subjective weights. Let $m$ be the number of KPIs and $n$ the number of experts. Denote by $R_{ij}$ the rank assigned by the $i$-th expert ($i = 1, \dots, n$) to the $j$-th KPI ($j = 1, \dots, m$), where rank 1 corresponds to the most important KPI and rank $m$ to the least important one. For each KPI, the average rank is calculated as:
$$\bar{R}_j = \frac{1}{n} \sum_{i=1}^{n} R_{ij}, \qquad (1)$$
where $\bar{R}_j$ is the mean rank of the $j$-th KPI and $R_{ij}$ is the rank given to the $j$-th criterion by the $i$-th expert.
The ARTIW-L method then linearly transforms these average ranks into normalised weights [47,48]. The weight of the $j$-th KPI is calculated from the formula:
$$w_j^{L} = \frac{m + 1 - \bar{R}_j}{\sum_{k=1}^{m} \left( m + 1 - \bar{R}_k \right)}, \qquad (2)$$
where $w_j^{L}$ is the normalised subjective weight of KPI $j$ obtained by ARTIW-L.
By construction, $0 \le w_j^{L} \le 1$ and $\sum_{j=1}^{m} w_j^{L} = 1$. Smaller average ranks (higher expert priority) yield larger weights, and $w_j^{L}$ is in a strictly linear inverse functional relationship with $\bar{R}_j$. This makes ARTIW-L a simple, transparent rank-based weighting scheme that preserves the ordinal priorities of the KPIs while producing a metric weight vector suitable for further quantitative analysis.
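The averaging and linear transformation steps can be sketched as follows. The sketch assumes that the linear weight is the reversed average rank, m + 1 minus the average rank, normalised over all KPIs; with m = 29 this form reproduces the ARTIW-L column of Table 1 (for example, an average rank of 6.288 gives about 0.0545). The function names and the toy rank matrix are hypothetical.

```python
def average_ranks(rank_matrix):
    """Mean rank of each KPI over all experts.

    rank_matrix[i][j] is the rank given by expert i to KPI j.
    """
    n = len(rank_matrix)        # number of experts
    m = len(rank_matrix[0])     # number of KPIs
    return [sum(row[j] for row in rank_matrix) / n for j in range(m)]


def artiw_l(avg_ranks):
    """Linear rank-to-weight transformation, normalised to sum to 1."""
    m = len(avg_ranks)
    # Reverse the rank scale so that a smaller average rank scores higher.
    reversed_scores = [m + 1 - r for r in avg_ranks]
    total = sum(reversed_scores)
    return [s / total for s in reversed_scores]


# Hypothetical rankings: 3 experts (rows), 4 KPIs (columns).
toy_ranks = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
toy_avg = average_ranks(toy_ranks)
toy_weights_l = artiw_l(toy_avg)
print([round(w, 4) for w in toy_weights_l])
```

In the toy case the weights fall in equal steps as the average rank rises, which is the linear behaviour described above.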
3.3. ARTIW-N Weighting of KPIs
The ARTIW-N method provides an alternative way of transforming average ranks into normalised subjective weights. In this case, the weights $w_j^{N}$ are related to the criteria average ranks $\bar{R}_j$ by a nonlinear inverse functional relationship [49]. As in Section 3.2, $m$ is the number of KPIs and $n$ is the number of experts, and the average rank $\bar{R}_j$ given to each criterion is calculated according to Equation (1).
Using the average ranks $\bar{R}_j$ of all $m$ criteria, the ARTIW-N method first calculates an intermediate value $q_j$ for each KPI as the ratio between the minimum average rank and the average rank of KPI $j$:
$$q_j = \frac{\min_{k} \bar{R}_k}{\bar{R}_j}. \qquad (3)$$
These intermediate values are then normalised to obtain the ARTIW-N weights, calculated from the formula:
$$w_j^{N} = \frac{q_j}{\sum_{k=1}^{m} q_k}. \qquad (4)$$
By construction, $0 < w_j^{N} \le 1$ and $\sum_{j=1}^{m} w_j^{N} = 1$. The weights $w_j^{N}$ are in a nonlinear inverse relationship with the average ranks $\bar{R}_j$: KPIs with very low average ranks receive relatively higher weights than under a purely linear transformation, while mid-ranked KPIs are slightly down-weighted. This nonlinear transformation “amplifies” the significance of the most important and the least important criteria by reducing the relative significance of criteria with medium importance, while preserving the same priority order of criteria as the ARTIW-L method [47].
Together, ARTIW-L and ARTIW-N provide two complementary rank-based weighting schemes: ARTIW-L maintains a strictly linear inverse relationship between $\bar{R}_j$ and the weights, whereas ARTIW-N emphasises the extremes of expert rankings, which can be useful in risk assessment when the most and least important KPIs are of particular interest.
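A short sketch of the nonlinear transformation is given below. It follows the verbal definition above (ratio of the minimum average rank to each KPI's average rank, then normalisation), and this form reproduces the ARTIW-N column of Table 1 (an average rank of 6.288 gives about 0.0703). The function name and the toy average ranks are hypothetical.

```python
def artiw_n(avg_ranks):
    """Nonlinear rank-to-weight transformation, normalised to sum to 1."""
    r_min = min(avg_ranks)
    # Intermediate values: 1 for the top-ranked KPI, smaller for the rest.
    q = [r_min / r for r in avg_ranks]
    total = sum(q)
    return [value / total for value in q]


# Hypothetical average ranks of four KPIs.
toy_avg = [4 / 3, 2.0, 8 / 3, 4.0]
toy_weights_n = artiw_n(toy_avg)
print([round(w, 4) for w in toy_weights_n])
```

For these toy ranks the linear scheme of Section 3.2 would give 0.3667, 0.3, 0.2333 and 0.1. The nonlinear weights are higher for the first and last KPI and lower for the two in between, while the priority order is unchanged.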
3.4. Consistency of Expert Rankings
The averages of expert ranks and the corresponding ARTIW-based weights can be used as reliable results only if the experts’ opinions are sufficiently consistent and non-contradictory. The degree of consistency of the expert group is expressed by Kendall’s coefficient of concordance $W$, which takes values in the interval $[0, 1]$: values of $W$ close to 1 indicate strong agreement, whereas values close to 0 indicate that expert rankings are essentially random. When tied ranks occur in the experts’ rankings, the coefficient $W$ is adjusted by applying the standard tie-correction factor as recommended by Kendall and Gibbons [50]. The coefficient of concordance $W$ is calculated from the formula:
$$W = \frac{12 S}{n^{2} \left( m^{3} - m \right) - n \sum_{i=1}^{n} T_i}, \qquad S = \sum_{j=1}^{m} \left( \sum_{i=1}^{n} R_{ij} - \frac{n (m + 1)}{2} \right)^{2}, \qquad (5)$$
where $m$ is the number of KPIs, $n$ is the number of experts, and $R_{ij}$ is the rank given to the $j$-th criterion by the $i$-th expert. The correction factor $T_i$ accounts for tied ranks within the ratings given by the $i$-th expert and is calculated as:
$$T_i = \sum_{l=1}^{H_i} \left( t_{il}^{3} - t_{il} \right), \qquad (6)$$
where $t_{il}$ is the size of the $l$-th group of identical ranks (ties) assigned by the $i$-th expert, and $H_i$ is the number of such groups in that expert’s ranking. When there are no ties, all $T_i = 0$ and the denominator in Equation (5) reduces to the standard form $n^{2} \left( m^{3} - m \right)$.
The calculated value $W$ is compared with its minimum value $W_{\min}$, which depends on the chosen significance level $\alpha$ (typically $\alpha = 0.05$ or, more stringently, $\alpha = 0.01$) and the number of degrees of freedom $v = m - 1$ [46,47,48,49]:
$$W_{\min} = \frac{\chi^{2}_{v,\alpha}}{n (m - 1)}, \qquad (7)$$
where $\chi^{2}_{v,\alpha}$ is the critical value of the Pearson’s chi-square statistic with $v$ degrees of freedom. If $W \ge W_{\min}$, the experts’ judgements are considered to be in agreement.
The same consistency condition can be expressed in terms of the chi-square statistic. The test statistic is calculated as:
$$\chi^{2} = n (m - 1) W, \qquad (8)$$
where, under the null hypothesis of no agreement between experts, $\chi^{2}$ approximately follows a chi-square distribution with $v = m - 1$ degrees of freedom. For the chosen significance level $\alpha$, the critical value $\chi^{2}_{v,\alpha}$ is taken from the chi-square distribution table. The requirement $W \ge W_{\min}$ is then equivalent to $\chi^{2} \ge \chi^{2}_{v,\alpha}$. If the value of $\chi^{2}$ calculated according to Equation (8) is greater than or equal to $\chi^{2}_{v,\alpha}$, the experts’ judgements are considered to be in agreement.
For additional interpretation, the consistency coefficient $k$ is used:
$$k = \frac{W}{W_{\min}} = \frac{\chi^{2}}{\chi^{2}_{v,\alpha}}. \qquad (9)$$
This coefficient shows how many times the calculated concordance coefficient $W$ is greater than its minimum value $W_{\min}$, and equivalently how many times $\chi^{2}$ exceeds its critical value $\chi^{2}_{v,\alpha}$. When the opinions expressed by the experts are in sufficient agreement, $k \ge 1$; otherwise, when $k < 1$, the opinions differ significantly and the average ranks and derived weights are not considered reliable.
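The agreement check described in this section can be sketched in a few lines. The sketch uses the standard tie-corrected form of Kendall's coefficient of concordance, which is assumed here to be the form intended by Equation (5). The critical chi-square value is supplied by the caller from a table; 7.815 is the tabulated value for 3 degrees of freedom at a significance level of 0.05. The function names and the toy rank matrix are hypothetical.

```python
from collections import Counter


def kendall_w(rank_matrix):
    """Tie-corrected Kendall's W; rank_matrix[i][j] = rank by expert i of KPI j."""
    n = len(rank_matrix)        # number of experts
    m = len(rank_matrix[0])     # number of KPIs
    expected_sum = n * (m + 1) / 2
    column_sums = [sum(row[j] for row in rank_matrix) for j in range(m)]
    # Sum of squared deviations of the rank sums from their expected value.
    s = sum((c - expected_sum) ** 2 for c in column_sums)
    # Tie correction: every group of t identical ranks adds t^3 - t.
    tie_total = 0
    for row in rank_matrix:
        for size in Counter(row).values():
            tie_total += size ** 3 - size
    return 12 * s / (n ** 2 * (m ** 3 - m) - n * tie_total)


def concordance_check(w, n, m, chi2_critical):
    """Return the chi-square statistic, the minimum W and the ratio k."""
    chi2_value = n * (m - 1) * w
    w_min = chi2_critical / (n * (m - 1))
    return chi2_value, w_min, w / w_min


# Hypothetical rankings: 3 experts, 4 KPIs, one tie in the second row.
toy_ranks = [
    [1, 2, 3, 4],
    [1.5, 1.5, 3, 4],
    [1, 3, 2, 4],
]
toy_w = kendall_w(toy_ranks)
chi2_value, w_min, k = concordance_check(toy_w, n=3, m=4, chi2_critical=7.815)
print(round(toy_w, 4), round(chi2_value, 4), round(k, 4))
```

In this toy case W is high, yet k stays just below 1, because three experts are too few for the chi-square criterion to be met. The chi-square approximation itself is coarse for such small panels, so the example only illustrates the arithmetic.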
3.5. Construction of the Nominal Ship Risk Profile
To construct the nominal ship risk profile, which summarises the expert-based importance of all KPIs in the form of a normalised weight vector, the methodology combines the linear and nonlinear transformations by calculating the final weight of KPI $j$ as the arithmetic mean of the ARTIW-L and ARTIW-N weights:
$$w_j = \frac{w_j^{L} + w_j^{N}}{2}, \qquad (10)$$
where $w_j^{L}$ and $w_j^{N}$ are the normalised weights obtained from the formulas in Section 3.2 and Section 3.3. The resulting weights remain normalised, $\sum_{j=1}^{m} w_j = 1$, and preserve the same priority order of KPIs as in the original expert rankings, while integrating the sensitivity characteristics of both ARTIW variants [46,47,48,49]. In this way, the influence of the most and least important KPIs is “amplified” relative to criteria of medium importance through the ARTIW-N component, while the ARTIW-L component maintains a strictly linear response to changes in average ranks.
Since neither of these transformations has a clear theoretical advantage in the present context, and both preserve the original priority order of the KPIs, the final average weight of each KPI is additionally characterised by a simple measure of deviation that reflects the difference between the two ARTIW variants. This method-based deviation is defined as half of the absolute difference between the ARTIW-L and ARTIW-N weights:
$$\Delta w_j = \frac{\left| w_j^{L} - w_j^{N} \right|}{2}. \qquad (11)$$
For each KPI, the interval $[w_j - \Delta w_j,\; w_j + \Delta w_j]$ defines a plausible range of weight values across the considered weighting methods. Thus, $w_j$ represents the average estimate of KPI importance, while $\Delta w_j$ quantifies the associated variability. This variability enables context-dependent selection of more extreme weights when required (e.g., emphasising the most and least important KPIs).
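As a brief numerical illustration, the mean and half-difference described above can be computed as follows. The two weight pairs are the ARTIW-L and ARTIW-N values reported for KPI-20 and KPI-03 in Table 1; the function name `combine_weights` is hypothetical.

```python
def combine_weights(w_lin, w_non):
    """Final weight (mean of both methods) and method-based deviation."""
    final = [(a + b) / 2 for a, b in zip(w_lin, w_non)]
    deviation = [abs(a - b) / 2 for a, b in zip(w_lin, w_non)]
    return final, deviation


# ARTIW-L and ARTIW-N weights of KPI-20 and KPI-03 as listed in Table 1.
w_lin = [0.0545, 0.0068]
w_non = [0.0703, 0.0163]
final, deviation = combine_weights(w_lin, w_non)
for w, d in zip(final, deviation):
    # The interval spans exactly from the smaller to the larger method weight.
    print(round(w, 4), round(d, 4), round(w - d, 4), round(w + d, 4))
```

Because the interval endpoints coincide with the two method weights, the deviation carries no statistical meaning beyond the spread between the two transformations.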
The nominal ship risk profile is then defined as the vector of these final KPI weights:
$$\mathrm{SRP}_{\mathrm{nom}} = \left( w_1, w_2, \dots, w_m \right). \qquad (12)$$
Each component $w_j$ represents the relative contribution of the corresponding KPI to the accident-risk index under the assumption that all KPIs are in a very unfavourable (worst) state.
For a given ship or time period, let $x_j$ denote the normalised “risk level” of KPI $j$, where $x_j = 1$ corresponds to a very bad (critical) KPI state and $x_j = 0$ corresponds to a fully satisfactory state. Where a KPI is originally measured on a different scale, its observed value is first normalised to $[0, 1]$, with higher $x_j$ indicating worse performance. A normalised accident-risk index is then defined as:
$$I_{\mathrm{AR}} = \sum_{j=1}^{m} w_j x_j, \qquad (13)$$
where $m$ is the total number of KPIs and $j$ indexes the KPIs.
A deviation $\Delta I_{\mathrm{AR}}$ of the normalised accident-risk index is obtained by propagating the deviations $\Delta w_j$ (Equation (14)).
When all KPIs take their worst-performance state (that is, $x_j = 1$ for all $j$), the index reaches its maximum value $I_{\mathrm{AR}} = 1$, and the weights $w_j$ directly represent each KPI’s share of the total accident-risk index. For observed KPI profiles, the index provides a normalised accident-risk level (a probability proxy) conditional on the current KPI performance profile.
4. Results
This section presents the empirical results of the expert evaluation and the derived nominal Ship Risk Profile. Section 4.1 describes the composition of the expert panel and the final KPI set retained for analysis. Section 4.2 examines the consistency of expert judgements using Kendall’s coefficient of concordance. Section 4.3 reports the relative importance of individual KPIs obtained by applying the ARTIW-L and ARTIW-N weighting procedures. Section 4.4 defines and interprets the nominal Ship Risk Profile $\mathrm{SRP}_{\mathrm{nom}}$ and the normalised accident-risk index $I_{\mathrm{AR}}$ constructed on the basis of these weights.
4.1. Expert Panel and KPI Set
The expert evaluation was conducted with 33 navigational practitioners (masters, chief mates, and pilots), each with at least five years of professional experience in maritime operations. The panel was formed to ensure that the judgements reflect accumulated navigational experience and direct responsibility for ship safety. The panel size exceeds the number of criteria retained for analysis (29 KPIs), consistent with the criterion stated in Section 3.1 that the expert group should not be smaller than half the number of evaluated KPIs.
The starting point of the study was a candidate set of 37 KPIs. Of these, 36 indicators were adopted from the BIMCO KPI framework [44], which covers areas such as compliance with environmental regulations; crew health, safety and injury prevention; training, retention and competence of personnel; navigational risks and accident prevention; operational efficiency; security performance; technical reliability and maintenance; and inspection outcomes, including Port State Control results. In addition, one KPI, the Flag State rating indicator, was added based on the Paris MoU Ship Risk Profile and its Black–Grey–White lists, as scientific literature supports the relevance of flag performance for inspection outcomes and casualty risk [24,28,34,38].
All 37 KPIs were set out in a structured questionnaire. For each indicator, the experts were provided with a short description and were asked to assess, using their professional judgement and operational experience, how important the KPI is for assessing the ship’s and crew’s ability to avoid a collision, contact or grounding accident. The responses were given on the four-level importance scale from 0 to 3 defined in Section 3.1 (0: not important; 3: very important), indicating the perceived strength of the link between each KPI and navigational accident risk.
Indicators that were consistently judged to have no meaningful connection with navigational accident probability were removed before further analysis. For eight KPIs, all experts assigned an importance level of 0, and these indicators were therefore excluded. The subsequent concordance analysis and ARTIW-based weighting procedures are based on the remaining 29 KPIs, which constitute the final KPI set for constructing the nominal Ship Risk Profile: KPI-01 Budget performance indicator, KPI-02 Ship dry-docking planning accuracy indicator, KPI-03 CO2 efficiency indicator, KPI-04 Operational deficiencies indicator, KPI-05 Navigational deficiencies indicator, KPI-06 Security deficiencies indicator, KPI-07 Cargo incidents indicator, KPI-08 Navigational accidents indicator, KPI-09 Crew experience indicator, KPI-10 Crew training days indicator, KPI-11 Number of cadets indicator, KPI-12 Crew turnover indicator, KPI-13 Frequency of crew disciplinary offences indicator, KPI-14 Crew work–rest hour violations indicator, KPI-15 Human resource management deficiencies indicator, KPI-16 Occupational health and safety violations indicator, KPI-17 Frequency of crew accidents and illnesses indicator, KPI-18 Passenger injury indicator, KPI-19 Overdue planned maintenance tasks indicator, KPI-20 Critical equipment and systems failures indicator, KPI-21 Classification society conditions indicator, KPI-22 Ship operational time indicator, KPI-23 Fires and explosions indicator, KPI-24 Port State Control performance indicator, KPI-25 Port State Control deficiencies indicator, KPI-26 Port State Control detentions indicator, KPI-27 Flag State rating indicator, KPI-28 Environmental regulations violations indicator, KPI-29 Pollutant spills indicator.
The excluded KPIs are NOx efficiency, SOx efficiency, Ballast water management violations, Releases of substances, Vetting deficiencies, Lost Time Injury Frequency, Lost Time Sickness Frequency, and Total Recordable Case Frequency. According to the expert panel, their exclusion reflects both limited relevance to collision/contact/grounding avoidance and conceptual overlap with retained indicators. NOx efficiency and SOx efficiency were considered not relevant to navigational-accident risk, as they primarily describe emission performance. Ballast water management violations are captured within the Environmental regulations violations indicator (KPI-28), while Releases of substances are encompassed by the Pollutant spills indicator (KPI-29). Vetting deficiencies overlap with Port State Control performance and Port State Control deficiencies. Lost Time Injury Frequency, Lost Time Sickness Frequency, and Total Recordable Case Frequency were excluded because they represent closely related occupational safety measures, where Total Recordable Case Frequency including first-aid cases already aggregates injury, sickness, and first-aid events. The expert panel therefore considered it difficult to differentiate their relative importance for navigational-accident avoidance and recommended relying on the aggregated Frequency of crew accidents and illnesses indicator (KPI-17).
4.2. Consistency of Expert Opinions
Before transforming expert rankings into KPI weights, the internal consistency of expert opinions was assessed according to the procedure described in Section 3.4. Kendall’s coefficient of concordance $W$ was calculated using the tie-corrected formula given in Equation (5), for $m = 29$ KPIs and $n = 33$ experts.
The minimum threshold value of the concordance coefficient, $W_{\min}$, at which the expert opinions may still be considered consistent at the adopted significance level $\alpha$, was obtained from Equation (7), using $v = m - 1 = 28$ degrees of freedom and the corresponding critical value of Pearson’s chi-square statistic $\chi^{2}_{v,\alpha}$.
The empirical value of Pearson’s chi-square statistic $\chi^{2}$ was then calculated from Equation (8), which relates $\chi^{2}$ to $W$. For the present data, the resulting value is much larger than the critical value $\chi^{2}_{v,\alpha}$. Therefore, at the significance level $\alpha$, the null hypothesis of mutually independent, random rankings is rejected, and the consistency of the experts’ rankings is considered statistically sufficient.
The consistency coefficient $k$, defined in Equation (9) as the ratio between the actual and minimum concordance, is accordingly greater than 1. Because $W > W_{\min}$ and $k > 1$, the rankings provided by the 33 experts can be regarded as mutually consistent and non-contradictory. Consequently, the average ranks and the ARTIW-based transformations derived from them can be used as a reliable aggregated representation of expert judgement in the subsequent KPI weighting.
4.3. KPI Ranking, Priorities and Weights
After the consistency of the expert judgements had been established, their rankings were converted into KPI weights using the ARTIW-L and ARTIW-N methods. For each KPI $j$, the mean rank $\bar{R}_j$ was first calculated from the expert assessments in accordance with Equation (1). These mean ranks were then transformed into normalised weights $w_j^{L}$ and $w_j^{N}$ by applying the linear and nonlinear transformations specified in Equations (2) and (4), respectively. Both transformations use the same input $\bar{R}_j$ and preserve the original ordering of KPI priorities, but they differ in how they allocate relative importance to the highest- and lowest-ranked indicators.
The distribution of the average ranks $\bar{R}_j$ and the resulting KPI priorities is presented in Figure 4. Figure 5 shows the bar diagrams of the calculated normalised weights $w_j^{L}$ and $w_j^{N}$, illustrating the differences between the linear and nonlinear transformations for each KPI.
The ranking and weighting results are summarised in Table 1 in Section 4.4. For each of the 29 KPIs, the table presents the average rank $\bar{R}_j$, the normalised ARTIW-L weight $w_j^{L}$, the ARTIW-N weight $w_j^{N}$, and the resulting priority of each KPI.
4.4. Nominal Ship Risk Profile and Normalised Accident-Risk Index
To define the ship-specific normalised accident-risk index $I_{\mathrm{AR}}$ according to Equation (13), the nominal Ship Risk Profile $\mathrm{SRP}_{\mathrm{nom}}$ is used as the core set of KPI weights and is combined with the actual KPI values $x_j$ for the selected ship. The vector $\mathrm{SRP}_{\mathrm{nom}}$ is constructed from the average weights $w_j$, calculated using Equation (10), while the deviations $\Delta w_j$ are determined according to Equation (11). Each component $w_j$ ($j = 1, \dots, m$), as defined in Equation (12), where $m$ is the total number of KPIs, represents the relative share of the total accident-risk index attributed to KPI $j$ under the hypothetical assumption that all KPIs are in a highly unfavourable state. For graphical presentation, the components of $\mathrm{SRP}_{\mathrm{nom}}$ are ordered based on KPI priority and expressed as percentages in Figure 6, along with the deviations $\Delta w_j$, which illustrate the variation in weights. For all 29 KPIs, the values of $w_j$ and $\Delta w_j$ are summarised in Table 1.
The normalised accident-risk index $I_{\mathrm{AR}}$ can be calculated together with its deviation $\Delta I_{\mathrm{AR}}$ using Equation (14), which defines the sensitivity of the index and indicates the range from $I_{\mathrm{AR}} - \Delta I_{\mathrm{AR}}$ to $I_{\mathrm{AR}} + \Delta I_{\mathrm{AR}}$. When all KPIs are in a very poor state (i.e., $x_j = 1$ for all $j$), the index reaches its maximum value $I_{\mathrm{AR}} = 1$. In this case, the weights $w_j$ directly reflect the share of the total accident-risk index attributed to each KPI and are equal to the components of $\mathrm{SRP}_{\mathrm{nom}}$, as presented in Figure 6.
Due to data protection and confidentiality constraints, the calculation of $I_{\mathrm{AR}}$ is demonstrated using anonymised KPI values for a representative vessel, here referred to as Ship A. If the ship’s KPI vector is X = (0.3, 0.2, 0.5, 0.1, 0.1, 0.1, 0.3, 0.2, 0.4, 0.1, 0.2, 0.7, 0.2, 0.4, 0.5, 0.1, 0.8, 0.4, 0.6, 0.5, 0.3, 0.9, 0.7, 0.4, 0.5, 0.6, 0.2, 0.6, 0.4), where the components correspond to KPI-01 through KPI-29 in sequential order, then the normalised accident-risk index is $I_{\mathrm{AR}} = 0.3750$ (37.50%), with a deviation corresponding to a range from 0.3057 (30.57%) to 0.4442 (44.42%).
Table 1.
Summary of KPI average ranks, weights and nominal ship risk profile values with method-based deviations.
| KPI No. | $\bar{R}_j$ | $w_j^{L}$ | $w_j^{N}$ | $w_j$ | $w_j$, % | $\Delta w_j$, % | Priority |
|---|---|---|---|---|---|---|---|
| KPI-01 | 20.439 | 0.0220 | 0.0216 | 0.0218 | 2.18% | 0.02% | 23rd |
| KPI-02 | 21.727 | 0.0190 | 0.0203 | 0.0197 | 1.97% | 0.07% | 25th |
| KPI-03 | 27.045 | 0.0068 | 0.0163 | 0.0116 | 1.16% | 0.48% | 29th |
| KPI-04 | 11.333 | 0.0429 | 0.0390 | 0.0409 | 4.09% | 0.20% | 9th |
| KPI-05 | 7.045 | 0.0528 | 0.0627 | 0.0577 | 5.77% | 0.50% | 3rd |
| KPI-06 | 15.273 | 0.0339 | 0.0289 | 0.0314 | 3.14% | 0.25% | 18th |
| KPI-07 | 21.591 | 0.0193 | 0.0205 | 0.0199 | 1.99% | 0.06% | 24th |
| KPI-08 | 11.515 | 0.0425 | 0.0384 | 0.0404 | 4.04% | 0.21% | 10th |
| KPI-09 | 11.758 | 0.0419 | 0.0376 | 0.0398 | 3.98% | 0.22% | 11th |
| KPI-10 | 9.470 | 0.0472 | 0.0466 | 0.0469 | 4.69% | 0.03% | 6th |
| KPI-11 | 23.758 | 0.0144 | 0.0186 | 0.0165 | 1.65% | 0.21% | 27th |
| KPI-12 | 17.030 | 0.0298 | 0.0259 | 0.0279 | 2.79% | 0.19% | 21st |
| KPI-13 | 13.364 | 0.0382 | 0.0331 | 0.0357 | 3.57% | 0.26% | 13th |
| KPI-14 | 6.955 | 0.0530 | 0.0635 | 0.0582 | 5.82% | 0.53% | 2nd |
| KPI-15 | 8.894 | 0.0485 | 0.0497 | 0.0491 | 4.91% | 0.06% | 5th |
| KPI-16 | 11.212 | 0.0432 | 0.0394 | 0.0413 | 4.13% | 0.19% | 8th |
| KPI-17 | 13.273 | 0.0385 | 0.0333 | 0.0359 | 3.59% | 0.26% | 12th |
| KPI-18 | 24.818 | 0.0119 | 0.0178 | 0.0149 | 1.49% | 0.29% | 28th |
| KPI-19 | 15.121 | 0.0342 | 0.0292 | 0.0317 | 3.17% | 0.25% | 17th |
| KPI-20 | 6.288 | 0.0545 | 0.0703 | 0.0624 | 6.24% | 0.79% | 1st |
| KPI-21 | 7.788 | 0.0511 | 0.0567 | 0.0539 | 5.39% | 0.28% | 4th |
| KPI-22 | 23.667 | 0.0146 | 0.0187 | 0.0166 | 1.66% | 0.21% | 26th |
| KPI-23 | 16.682 | 0.0306 | 0.0265 | 0.0285 | 2.85% | 0.21% | 20th |
| KPI-24 | 15.394 | 0.0336 | 0.0287 | 0.0311 | 3.11% | 0.24% | 19th |
| KPI-25 | 14.833 | 0.0349 | 0.0298 | 0.0323 | 3.23% | 0.25% | 16th |
| KPI-26 | 10.470 | 0.0449 | 0.0422 | 0.0435 | 4.35% | 0.14% | 7th |
| KPI-27 | 20.152 | 0.0226 | 0.0219 | 0.0223 | 2.23% | 0.04% | 22nd |
| KPI-28 | 14.182 | 0.0364 | 0.0311 | 0.0338 | 3.38% | 0.26% | 15th |
| KPI-29 | 13.924 | 0.0370 | 0.0317 | 0.0343 | 3.43% | 0.26% | 14th |
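The Ship A figure reported in Section 4.4 can be re-derived from the Table 1 values with a short script. The sketch assumes that the index is the weighted sum of the KPI risk levels, which reproduces the reported 37.50%. It also sums the method-based deviations without weighting them by the KPI states: with the rounded Table 1 weights this sum is about 0.0693, which matches the half-width of the reported range, although reading Equation (14) in this way is an assumption. All variable names are hypothetical.

```python
# Final weights for KPI-01 .. KPI-29, copied from Table 1 (4 decimals).
weights = [
    0.0218, 0.0197, 0.0116, 0.0409, 0.0577, 0.0314, 0.0199, 0.0404,
    0.0398, 0.0469, 0.0165, 0.0279, 0.0357, 0.0582, 0.0491, 0.0413,
    0.0359, 0.0149, 0.0317, 0.0624, 0.0539, 0.0166, 0.0285, 0.0311,
    0.0323, 0.0435, 0.0223, 0.0338, 0.0343,
]
# ARTIW-L and ARTIW-N weights from Table 1, in the same KPI order.
w_lin = [
    0.0220, 0.0190, 0.0068, 0.0429, 0.0528, 0.0339, 0.0193, 0.0425,
    0.0419, 0.0472, 0.0144, 0.0298, 0.0382, 0.0530, 0.0485, 0.0432,
    0.0385, 0.0119, 0.0342, 0.0545, 0.0511, 0.0146, 0.0306, 0.0336,
    0.0349, 0.0449, 0.0226, 0.0364, 0.0370,
]
w_non = [
    0.0216, 0.0203, 0.0163, 0.0390, 0.0627, 0.0289, 0.0205, 0.0384,
    0.0376, 0.0466, 0.0186, 0.0259, 0.0331, 0.0635, 0.0497, 0.0394,
    0.0333, 0.0178, 0.0292, 0.0703, 0.0567, 0.0187, 0.0265, 0.0287,
    0.0298, 0.0422, 0.0219, 0.0311, 0.0317,
]
# Anonymised KPI risk levels of Ship A (0 = satisfactory, 1 = critical).
ship_a = [
    0.3, 0.2, 0.5, 0.1, 0.1, 0.1, 0.3, 0.2, 0.4, 0.1, 0.2, 0.7, 0.2, 0.4,
    0.5, 0.1, 0.8, 0.4, 0.6, 0.5, 0.3, 0.9, 0.7, 0.4, 0.5, 0.6, 0.2, 0.6,
    0.4,
]

# Weighted sum of KPI risk levels (normalised accident-risk index).
risk_index = sum(w * x for w, x in zip(weights, ship_a))
# Unweighted sum of the half-differences between the two ARTIW variants.
half_width = sum(abs(a - b) / 2 for a, b in zip(w_lin, w_non))

print(f"index = {risk_index:.4f}")
print(f"range = {risk_index - half_width:.4f} to {risk_index + half_width:.4f}")
```

Because the inputs are rounded to four decimals, the upper bound computed this way can differ from the reported 44.42% in the last digit.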
5. Discussion
Ship accidents, particularly collisions, groundings, and contacts, continue to occur despite substantial advances in maritime safety research, with severe consequences for human life, marine ecosystems, and port operations. The literature review identified more than 80 risk factors reported for these accident types, indicating fragmentation across existing studies in both risk-factor definitions and methodological approaches. In parallel, EMSA has published an extensive analysis of EMCIP navigation-accident data, identifying 1,637 contributing factors [17], which highlights the significant yet underutilised potential of EMSA data for predictive maritime safety modelling. In this context, KPIs can help interpret patterns in these contributing factors and link observed accident mechanisms to measurable operational and organisational conditions, thereby supporting more proactive navigational-accident risk management.
This study presents a practical route for linking operational KPIs to navigational-accident risk by converting expert priorities into a weighted $\mathrm{SRP}_{\mathrm{nom}}$ and combining it with observed, ship-specific KPI states to produce a normalised accident-risk index $I_{\mathrm{AR}}$ for collision/contact/grounding avoidance (noting that $I_{\mathrm{AR}}$ is a normalised index rather than a calibrated absolute probability). The expert panel (masters, chief mates, and pilots) evaluated an initial set of 37 KPIs; eight indicators were subsequently excluded because all experts assigned an importance level of 0, indicating no meaningful link to navigational-accident risk. This filtering step is informative, as it suggests that not all performance indicators commonly used in shipping operations are perceived as relevant for accident avoidance, and it helps keep the SRP focused on mechanisms that practitioners associate with navigational failures.
The resulting nominal SRP (Figure 6) represents a compact “importance map” across the 29 retained KPIs, where each weight corresponds to that KPI’s share of the maximum index under the (hypothetical) worst-performance state for all indicators. The weights are not dominated by a single KPI (the largest individual share is 6.24%), supporting the view that navigational-accident risk is multifactorial rather than attributable to one isolated deficiency. Methodologically, ARTIW-L and ARTIW-N provide a transparent means to transform ordinal expert rankings into a cardinal weight vector while explicitly characterising structural uncertainty associated with the choice of transformation: the linear form preserves proportional differences in average ranks, whereas the nonlinear form places relatively greater emphasis on the most and least important KPIs [46,47,48,49]. The weight analysis is also supported by adequate agreement among the 33 experts: the test statistic associated with Kendall’s coefficient of concordance exceeds the critical value, and the consistency coefficient is likewise adequate; therefore, the aggregated ranking can be treated as mutually consistent for constructing a single SRP weight vector.
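As a minimal illustration of the agreement check, the sketch below computes Kendall’s coefficient of concordance W and its chi-square approximation for a small rank matrix. The function name `kendalls_w`, the toy ranks, and the no-ties assumption are illustrative and are not taken from the study; the resulting statistic would be compared with a chi-square critical value at n − 1 degrees of freedom.

```python
from typing import List, Tuple


def kendalls_w(ranks: List[List[float]]) -> Tuple[float, float]:
    """Return (W, chi_square) for an m x n rank matrix.

    Rows are experts (m), columns are ranked items (n).
    Assumes complete rankings without ties (no tie correction).
    """
    m = len(ranks)
    n = len(ranks[0])
    # Sum of ranks received by each item across all experts.
    totals = [sum(row[j] for row in ranks) for j in range(n)]
    mean_total = m * (n + 1) / 2
    # Sum of squared deviations of the rank totals from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    # Chi-square approximation with n - 1 degrees of freedom.
    chi_square = m * (n - 1) * w
    return w, chi_square


# Hypothetical toy data: 3 experts ranking 4 items (not study data).
toy_ranks = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
w, chi2 = kendalls_w(toy_ranks)
print(f"W = {w:.4f}, chi-square = {chi2:.4f}")
```

W ranges from 0 (no agreement) to 1 (identical rankings), so the same function can be applied to a 33 × 29 rank matrix of the kind used here.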
The constructed nominal SRP is conceptually aligned with the SRP applied in the Paris MoU regime, where ships are classified as low-risk, standard-risk, or high-risk based on PSC inspection results [28,36,37,38]. The present framework extends this concept in two respects. First, it is built on a broader set of 29 KPIs spanning operational, technical, human-factor, and regulatory dimensions. Second, it enables integration of the risk profile with time-varying KPI states through the normalised accident-risk index, allowing a quantitative assessment of how changes in KPI performance affect that index (i.e., the modelled navigational-accident risk level). In this way, the framework bridges traditional inspection-based profiling and more dynamic, performance-driven risk monitoring.
The KPI priorities derived from the expert panel mainly highlight determinants of human performance, bridge-team effectiveness, and the technical and organizational conditions that influence decision-making under workload. In the model, the ten highest-weight indicators are: critical equipment and systems failures (KPI-20; 6.24%); crew work–rest hour violations (KPI-14; 5.82%); navigational deficiencies (KPI-05; 5.77%); classification society conditions (KPI-21; 5.39%); human resource management deficiencies (KPI-15; 4.91%); crew training days (KPI-10; 4.69%); Port State Control detentions (KPI-26; 4.35%); occupational health and safety violations (KPI-16; 4.13%); operational deficiencies (KPI-04; 4.09%); and navigational accidents (KPI-08; 4.04%). Collectively, these indicators account for 49.45% of the total weight, indicating that—by model construction—almost half of the maximum accident index is driven by a relatively small set of predominantly human/organizational and technical-condition KPIs.
A similar concentration also appears in EMSA’s safety analysis using EMCIP navigation-accident data: the ten most frequent safety-issue families (bridge resource management coordination; use of electronic navigation equipment; work methods and supervision; bridge resource availability; external communications; coordination with third parties; resources for plans and procedures; safety culture and climate; safety awareness; and external environmental impact) jointly represent about 46.43% of all contributing factors reported in the analysis [17]. When the top ten SRP KPIs are compared with the top ten EMCIP safety-issue families at the domain level, clear thematic alignment is observed: navigational deficiencies (KPI-05) and operational deficiencies (KPI-04) correspond closely to families related to bridge coordination and work methods/supervision; critical equipment and systems failures (KPI-20) align with equipment-related issues; and crew work–rest hour violations (KPI-14), crew training days (KPI-10), and human resource management deficiencies (KPI-15) reflect fatigue, competence, and resource-availability mechanisms frequently emphasized in accident investigations. The similarity in these aggregate shares should be interpreted as a descriptive concentration pattern rather than a statistical validation; however, it supports the plausibility that a limited set of human/organizational and technical-condition themes dominates both investigation-derived factor frequencies and KPI-based risk attribution. This pattern is consistent with the wider literature, which attributes a large share of navigational accidents to human error and related mechanisms while emphasising organisational, procedural, and technological context effects [1,17,25,36,39,40,41,42]. Overall, this domain-level alignment supports the paper’s central premise that existing KPI frameworks can capture a substantial portion of the information relevant for proactive navigational-accident risk indication, even before introducing new KPIs.
The practical value of the nominal SRP is illustrated in Figure 6, because it enables a broad-perspective interpretation of how different KPI groups can drive the accident-risk index under deteriorating performance. Under the model definition, the SRP weights describe the maximum share of the index attributable to each KPI. As KPI values improve from poor to satisfactory states, the index declines according to the SRP weights, providing a transparent mechanism to estimate how targeted performance improvements may reduce the initial maximum risk. The demonstration with anonymised values for Ship A (index value 37.50%, range 30.57–44.42%) provides an illustrative example of how the SRP-based calculation can be used as an interpretable risk profile rather than an unstructured collection of raw KPI values [38,44,55]; for conservative operational decision support, the upper-bound estimate may be used as a precautionary risk level.
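The mechanism described above can be sketched as a weighted sum. This is a minimal sketch under stated assumptions, not the study’s exact formula: each KPI state is assumed to be scaled to [0, 1] with 1 denoting the worst performance, the index is assumed to be the sum of weight × state, and the lower and upper bounds are assumed to follow from subtracting or adding a per-KPI weight deviation. The function `risk_index`, the three KPI labels, and all numeric values are hypothetical.

```python
from typing import Dict, Tuple


def risk_index(
    weights: Dict[str, float],
    deviations: Dict[str, float],
    states: Dict[str, float],
) -> Tuple[float, float, float]:
    """Return (lower, central, upper) index values as fractions of the maximum.

    Assumption: states lie in [0, 1], where 1 is the worst-performance state,
    so an all-worst ship reaches the sum of the weights.
    """
    lower = central = upper = 0.0
    for kpi, w in weights.items():
        s = states[kpi]
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"state for {kpi} must lie in [0, 1]")
        d = deviations[kpi]
        lower += (w - d) * s    # weight reduced by its deviation
        central += w * s        # nominal weight
        upper += (w + d) * s    # weight increased by its deviation
    return lower, central, upper


# Hypothetical three-KPI profile (not study data); weights sum to 1.
weights = {"KPI-a": 0.5, "KPI-b": 0.3, "KPI-c": 0.2}
deviations = {"KPI-a": 0.05, "KPI-b": 0.03, "KPI-c": 0.02}
states = {"KPI-a": 1.0, "KPI-b": 0.5, "KPI-c": 0.0}

lower, central, upper = risk_index(weights, deviations, states)
print(f"index = {central:.3f} (range {lower:.3f} to {upper:.3f})")
```

Under these assumptions, improving a KPI state from 1 towards 0 lowers the index by at most that KPI’s weight, which mirrors the interpretation of the weights as maximum shares.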
From an operational perspective, the SRP approach can support decision-making by multiple stakeholders in constrained and congested port environments. Port authorities, pilots, Port State Control inspectors, and ship operators often need to judge whether a vessel can safely transit port entrances or narrow channels and whether berthing and unberthing can be conducted safely under unfavourable conditions; these phases are repeatedly identified as high-consequence and coordination-intensive [15,17,56]. This perspective is consistent with port-operational research highlighting entry, mooring, and unmooring as particularly high-risk phases requiring proactive mitigation [15,26]. Because the framework relies on KPIs that are already collected (or can be collected) routinely, including BIMCO indicators and PSC/flag-related measures, it offers a feasible pathway for implementation without requiring immediate access to detailed accident-investigation variables [37,38,44]. The same structured SRP output may also be valuable for insurers and cargo stakeholders by enabling more transparent risk differentiation across vessels and by incentivising improvements in the highest-weight operational and human-factor KPIs [14,53,54].
In practice, the normalised accident-risk index can support pre-arrival screening and prioritisation by converting routinely monitored KPI states into a single, traceable risk indication. Ports and pilots could use elevated index values as a trigger for enhanced control measures, such as assigning a more experienced pilot team, requiring tug assistance or escort, deferring transit or berthing until conditions are within locally defined operational limits (e.g., visibility and wind/current limits), or requesting additional bridge-team resources. Ship operators can use the same output internally to prioritise corrective actions (maintenance reliability, work–rest compliance, competence/training, procedural deficiencies) by targeting the KPI terms that contribute most to the current index. Where used by control authorities, the index can inform risk-oriented prioritisation (screening/inspection focus) while remaining interpretable through the underlying KPI contributions.
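A minimal sketch of this screening logic is given below. It ranks per-KPI contributions so that corrective actions can target the largest terms, and it maps an index value to a Low/Standard/High category. The three weights are the reported shares for KPI-20, KPI-14, and KPI-05; the KPI states, the function names, and the category thresholds (0.25 and 0.50) are hypothetical placeholders that a port or operator would need to define locally.

```python
from typing import Dict, List, Tuple


def ranked_contributions(
    weights: Dict[str, float], states: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (KPI, weight * state) pairs, largest contribution first."""
    terms = [(kpi, w * states[kpi]) for kpi, w in weights.items()]
    return sorted(terms, key=lambda item: item[1], reverse=True)


def classify(index: float, low_max: float = 0.25, high_min: float = 0.50) -> str:
    """Map an index in [0, 1] to a category using hypothetical thresholds."""
    if index < low_max:
        return "Low"
    if index >= high_min:
        return "High"
    return "Standard"


# Weights as reported for three of the 29 KPIs (shares of the maximum index).
weights = {"KPI-20": 0.0624, "KPI-14": 0.0582, "KPI-05": 0.0577}
# Hypothetical states in [0, 1], where 1 is the worst-performance state.
states = {"KPI-20": 0.2, "KPI-14": 1.0, "KPI-05": 0.5}

for kpi, contribution in ranked_contributions(weights, states):
    print(f"{kpi}: {contribution:.4f}")

# Category for an index of 37.50% under the hypothetical thresholds.
print(classify(0.375))
```

With these illustrative states, work–rest compliance (KPI-14) is the largest term, so it would be the first candidate for corrective action even though KPI-20 carries the larger weight.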
Several limitations indicate priorities for future work. First, weights are expert-elicited and should be interpreted as prioritisation rather than an empirically calibrated causal model; validation against KPI time series linked to independent accident outcomes is required to assess empirical validity and recalibrate weights where appropriate. Second, the framework focuses on ship- and crew-related KPIs and does not explicitly represent external operational context (e.g., hydrometeorology, traffic complexity, tug availability, port infrastructure constraints, and regional practices); future iterations should incorporate context correction factors and/or a parallel port risk profile to improve realism and transferability. More generally, dynamic and context-aware risk assessment could incorporate traffic density, environmental conditions, AIS-derived encounter dynamics, and port-specific constraints as modifiers of the SRP-based index, particularly because navigational-accident risk is highly context dependent in confined waters. Third, the current scope is limited to collisions, contacts, and groundings; extension to other accident classes would require KPI additions and revised indicator–accident mappings. Fourth, the index is a normalised accident-risk index rather than a calibrated absolute probability; empirical calibration is therefore needed before interpreting it as a probability in an absolute sense. For operational use, SRP outputs should also be communicated in simplified forms (e.g., Low/Standard/High categories or dashboards) while preserving traceability to underlying KPIs and weights.
The nominal weights reflect structured expert judgement and therefore inherit limitations typical of elicitation studies: (i) potential panel-composition effects (experience domain, operational background, and regional practice), (ii) cognitive and framing biases when ranking a relatively large KPI set, and (iii) the possibility that perceived importance does not match observed accident outcomes under specific operating contexts. Future work should therefore validate and, if necessary, recalibrate the weighting structure using ship-level KPI time series linked to independent accident outcomes. A practical validation route is a retrospective design in which KPI vectors are paired with subsequent collision/contact/grounding events, enabling outcome-association assessment and calibration of the normalised accident-risk index. A complementary prospective design is to compute the index continuously for a monitored fleet and evaluate whether elevated index values precede adverse outcomes over a defined follow-up window, with periodic re-estimation of weights to improve transferability across ship types, trades, and regions.
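One way to make the retrospective design concrete is a rank-based discrimination check: if the index carries outcome-relevant information, ships with a subsequent event should tend to have higher index values than ships without one. The sketch below computes the area under the ROC curve (equivalent to the normalised Mann–Whitney statistic) by pairwise comparison. The choice of this metric, the function name `auc`, and the five index/outcome pairs are assumptions for illustration only; they are not results or methods from the study.

```python
from typing import List


def auc(index_values: List[float], outcomes: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    outcomes: 1 if a collision/contact/grounding followed, else 0.
    Returns the share of (event, non-event) pairs in which the event ship
    has the higher index; ties count as half.
    """
    events = [v for v, y in zip(index_values, outcomes) if y == 1]
    non_events = [v for v, y in zip(index_values, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical index values and follow-up outcomes for five ships.
index_values = [0.20, 0.35, 0.40, 0.55, 0.60]
outcomes = [0, 0, 1, 0, 1]
print(f"AUC = {auc(index_values, outcomes):.3f}")
```

A value near 0.5 would indicate no discrimination, whereas values approaching 1 would indicate that elevated index values tend to precede adverse outcomes; calibration of the index as a probability would still require a separate step.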
The proposed workflow is KPI-set agnostic and can be applied in other regions by adapting the KPI set and re-estimating weights to reflect local conditions, operational traditions, and port-specific characteristics. Because maritime KPIs are broadly understood across stakeholder groups, the framework can be piloted in different countries or ports either by applying the current weight structure as an initial benchmark or by repeating the expert-weighting stage to recalibrate prioritisation to the local operational context, while preserving the SRP framework, index structure, and interpretability. Outside the BIMCO/Paris MoU setting, operational, technical, and human-factor indicators are expected to remain directly transferable, while regulatory-performance components (e.g., PSC- and flag-related measures) can be substituted with locally applicable inspection regimes and flag-performance classifications. For different ship types, the nominal SRP can be re-estimated using expert panels stratified by ship type and/or empirically recalibrated using outcome-linked data to yield ship-type-specific profiles without changing the underlying framework.
These findings support the view that the nominal SRP can function as a scientifically grounded “translation layer” between accident-investigation knowledge and proactive performance monitoring. The concentration of weight in human/organizational and technical-condition KPIs aligns with major safety-investigation findings and highlights actionable levers for reducing navigational-accident risk through targeted improvements in bridge-team performance conditions, competence development, maintenance reliability, and compliance oversight. The results indicate what can be extracted from existing KPIs and help identify where additional indicators may be required, thereby providing direction for future KPI development and competence-building efforts in data-driven operating environments [29,56,57,58,59,60].
6. Conclusions
This study proposed a practical framework to link maritime KPIs with navigational-accident risk and to support proactive decision-making in ports and other confined waters. The work is based on the proposition that routinely monitored operational KPIs can be systematically mapped, via expert-derived weights, into an interpretable risk indicator for collision/contact/grounding avoidance. The novelty of the study lies in operationalising this linkage through an ARTIW-based weighting of a comprehensive KPI set to form a nominal Ship Risk Profile (SRP) and a transparent, ship-specific normalised accident-risk index suitable for decision support.
Expert judgements were transformed into KPI weights using the ARTIW-L and ARTIW-N rank-to-weight methods and aggregated into the nominal SRP; observed KPI states were then used to compute the normalised accident-risk index. From an initial set of 37 candidate KPIs, 29 were retained for the construction of the nominal SRP, while eight were excluded because all experts assessed them as not relevant for navigational-accident risk indication. Expert agreement, assessed with Kendall’s coefficient of concordance and the consistency coefficient, was sufficient for aggregation. The resulting nominal SRP indicates that navigational-accident risk attribution is multifactorial; however, the ten highest-weight KPIs represent 49.45% of the total weight. The leading contributors were critical equipment and systems failures (KPI-20; 6.24%), crew work–rest hour violations (KPI-14; 5.82%), navigational deficiencies (KPI-05; 5.77%), classification society conditions (KPI-21; 5.39%), and human resource management deficiencies (KPI-15; 4.91%). An illustrative calculation using anonymised KPI values for Ship A produced an index value of 37.50%, with a deviation range from 30.57% to 44.42%, demonstrating how the framework can translate routine KPI monitoring into an interpretable risk indication and highlight where targeted performance improvements may yield the largest reduction in the index.
The proposed index is a normalised accident-risk indicator (a probability proxy) and is not an empirically calibrated absolute probability. Future work should therefore focus on empirical validation and calibration using ship-level KPI time series linked to independent accident outcomes. In addition, incorporating external operational context (e.g., hydrometeorology, traffic complexity, tug availability, and port infrastructure constraints) through correction factors and/or a parallel port risk profile would improve realism and transferability. Extending the approach to additional accident categories beyond collisions, contacts, and groundings will require revised KPI sets and indicator–accident mappings. Overall, the framework provides an interpretable and implementable pathway for translating expert-weighted KPIs into proactive navigational risk indication to support maritime stakeholders’ safety decisions.