Article

Identification of Risk Influential Factors for Fishing Vessel Accidents Using Claims Data from Fishery Mutual Insurance Association

1
Faculty of Maritime and Transportation, Ningbo University, Ningbo 315832, China
2
Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing 211189, China
3
National Traffic Management Engineering & Technology Research Centre, Ningbo University Sub-Centre, Ningbo 315832, China
4
Ningbo Pilot Station, Ningbo 315040, China
5
Barcelona School of Nautical Studies, Universitat Politècnica de Catalunya—BarcelonaTech, 08003 Barcelona, Spain
*
Authors to whom correspondence should be addressed.
Sustainability 2023, 15(18), 13427; https://doi.org/10.3390/su151813427
Submission received: 1 August 2023 / Revised: 30 August 2023 / Accepted: 31 August 2023 / Published: 7 September 2023

Abstract

This research aims to identify and analyze the significant risk factors contributing to accidents involving fishing vessels, a crucial step towards enhancing safety and promoting sustainable practices in the fishing industry. Using a data-driven Bayesian network (BN) model that incorporates feature selection through the random forest (RF) method, we explore these key factors and their interconnected relationships. A review of past academic studies and accident investigation reports from the Fishery Mutual Insurance Association (FMIA) revealed 17 such factors. We then used the random forest model to rank these factors by importance, selecting 11 critical ones to build the Bayesian network model. The data-driven Bayesian network (BN) model is further utilized to delve deeper into the central factors influencing fishing vessel accidents. Upon validation, the study results show that incorporating the random forest feature selection method enhances the simplicity, reliability, and precision of the BN model. This finding is supported by a thorough performance evaluation and scenario analysis.

1. Introduction

Marine fishing is an inherently perilous occupation, with fishermen recording some of the highest mortality rates among different occupational groups [1,2,3]. The most significant risk factor they face is fishing vessel accidents, which can lead to injuries, fatalities, significant damage to vessels, and even complete vessel loss. These incidents threaten crew safety and negatively impact the overall financial stability of the fishing industry. As such, it’s crucial to give risk management measures for fishing vessels top priority and fortify these strategies to reduce casualties and limit property damage. Understanding the factors that contribute to fishing vessel accidents is crucial for developing proactive measures to prevent such incidents. Studies in this area help identify the root causes and risk factors associated with these accidents, which in turn facilitate the development and implementation of specially targeted safety policies, regulatory measures, and training initiatives. Identifying the most influential factors allows the targeted allocation of limited resources towards precise interventions that yield maximum accident reduction and return on investment.
Accurate data on fishing vessel accidents are key for analysis. Prior studies often rely on information from regulatory bodies, but these datasets are frequently incomplete. Fishing vessel accidents occur more often than accidents involving commercial vessels, yet many are never publicly disclosed, which skews the available data. To address this, our research utilizes data from the Fishery Mutual Insurance Association (FMIA) of China. This national non-profit organization represents both groups and individuals in fishing-related services, offering mutual insurance to the sector. The Ningbo Fishery Mutual Insurance Association, established in 1996 as China’s first local fishery mutual insurance association, insures over 98% of fishing vessels in the region. Every fishing vessel accident, regardless of its size, is thoroughly investigated by the association, generating comprehensive data for analysis. Utilizing FMIA data enhances the breadth and authenticity of the data, leading to a more accurate understanding of accident scenarios and outcomes. This wealth of information supports risk assessment, economic impact analysis, and the development of targeted safety interventions, promoting a data-driven approach to accident analysis.
The application of Bayesian models for identifying risks associated with fishing vessel accidents is not new, given their established reliability as documented in references [4,5,6,7]. However, the pronounced effects of some factors over others can obscure the modeling. In response, this paper presents an enhanced methodology with a novel Bayesian network model that integrates random forest feature selection, so that only the most influential factors are retained for modeling.
A key innovation is our new database of 3448 FMIA fishing vessel accident reports from 2018 to 2022. This unparalleled and extensive collection, which covers small and large vessel accidents, stands out as the most comprehensive resource for studying fishing vessel incidents.
This paper, therefore, proposes a new Bayesian model using this unique database. The paper is structured as follows: Section 2 reviews past studies and Bayesian network techniques. Section 3 outlines the accident database and risk factor identification. Section 4 details our research methodology, including the random forest algorithm and Bayesian network learning. Section 5 presents sensitivity analysis and model validation. Section 6 concludes with discussions and study findings.

2. Literature Review

Fishing vessels are generally perceived as the least safe type of vessel, prompting extensive research on their accidents. A comprehensive literature review using databases, including Web of Science, identified vessel-related, environment-related, and accident-related factors as primary influences.
Vessel-related factors significantly impact fishing vessel accidents. Studies by Jin [8], Jin et al. [2,9], Cakir et al. [10], Uğurlu et al. [11], Wang et al. [12], and others have highlighted the correlation between factors such as vessel age, size, and type, and the incidence and severity of accidents. Recent studies like Obeng et al. [13] and Li et al. [7] used Bayesian network models to demonstrate the importance of vessel factors. Lazakis et al. [14] found trawlers had more occupational accidents than other types of fishing vessels.
Environmental factors also play a crucial role in fishing vessel accidents. Several studies have cited adverse weather conditions as a key cause of fishing vessel accidents, often exceeding human error [15,16]. Studies by Jin et al. [8,9], Davis et al. [17], Heij and Knapp [18], Weng et al. [19], and others have identified correlations between adverse weather conditions, seasons, and the occurrence and severity of accidents. By employing advanced machine learning methods, Rezaee et al. [20], Liu et al. [21], and Wang et al. [22] further clarified the relationship between weather conditions and accident severity. Özaydın et al. [23] used BN and association rule mining (ARM) methods to demonstrate the impact of adverse weather and sea conditions on fishing vessel accidents.
Accident-related factors affect consequences. Jin et al. [2,8,9], Wang et al. [22], and Wróbel et al. [24] linked accidents to geographic area and distance from the coast. Weng et al. [19] showed accidents further from the harbor had more fatalities. Jin [8] found that accident type impacted severity. Liu et al. [21] found that collision accidents tended to result in serious accident consequences. Cao et al. [5] showed that the type of accident was the highest factor affecting the severity level of accidents, and capsize/submerge, mechanical damage, and collision were the factors most likely to result in a “very serious accident”. Human error contributes to 70–85% of incidents, per multiple studies [25,26,27]. Various methodologies have been employed to examine these human errors. For instance, Wang et al. [22], Wróbel et al. [24], and Kose et al. [28] employed an accident tree analysis, HFACS methodology, and logistic regression model, respectively, to highlight the predominance of human error in maritime mishaps. Furthermore, Obeng et al. [29] concluded that inadequate training and a lack of experience were key contributors to these incidents. Celik and Cebi [30] used HFACS to identify the hierarchal structure and internal relationships of human factors in ship accidents.
While BNs are widely used in the analysis of maritime accidents, including those involving fishing vessels, current research focuses on assessing the severity of accidents. These models demonstrate good classification but can be improved in the area of feature selection. Building models with a multitude of parameters not only amplifies computational complexity but also raises the risk of overfitting. Thus, it is crucial to prioritize the identification of critical parameters when devising a risk prediction model for fishing vessels.
Another limitation of previous research is the constraint of data availability. Therefore, this study establishes a comprehensive database of fishing vessel accidents. Random forest will first be used to identify key parameters. These selected variables will then be used to construct the BN model. Subsequently, a BN model will be employed to predict fishing vessel risks, with its performance compared against a BN model that does not use feature selection. The reliability and feasibility of the RF-BN model will be assessed, analyzing the primary factors influencing fishing vessel accidents and offering technical support for safe fishing.

3. Data

3.1. Database

Most prior literature is based on data from various global databases that predominantly document significant fishing vessel accidents. However, numerous minor incidents are either not documented or lack thorough investigation, given the higher frequency of fishing vessel accidents compared to commercial vessels. To analyze accident causation in a more precise and comprehensive manner, this study primarily draws data from the FMIA, a chief provider of insurance coverage for fishing vessels, which covers more than 98% of fishing vessels in the study area. These vessels are incorporated into an organizing management system called fishery organizing companies, which assist in safety management by monitoring vessel entries/exits, inspecting safety equipment, organizing safety training for crew members, using information monitoring platforms to check the operation of fishing vessels in real time, and providing emergency assistance, etc. The companies also verify vessel certificates and inspection dates.
This study developed a comprehensive and robust five-year (2018–2022) fishing vessel accident database for the Ningbo region, China, sourcing diverse data on accidents including date, location, vessel information (operational mode, age, material, dimensions, tonnage, and power), accident type, personnel and vessel certifications, casualties, losses, and brief descriptions of accident causation. Some accident reports provide environmental conditions, but the details vary. Vessel data was sourced by referencing the fishery system using the Maritime Mobile Service Identity (MMSI) and Beidou ID. Environmental data gaps were addressed via retrospective marine meteorological assessments. This rigorous data collection approach culminated in 3448 accident samples, methodically organized in an Excel framework, archiving pertinent details derived from accident reports. Figure 1 illustrates the spatial distribution of these accidents with locations indicated by red dots.

3.2. Risk Influential Factors

Risk influential factors (RIFs) are variables that influence the safety of fishing vessels. Through a rigorous examination of the relevant literature and an in-depth analysis of accident report archives, 17 RIFs pertinent to maritime accidents were identified. These encompass vessel, environmental, and other factors as shown in Table 1. Since this study focuses solely on fishing vessels, the vessel type is considered irrelevant and thus excluded from the RIFs. Fishing vessels utilize distinct operational methods, including single trawl, double trawl, purse seine, and fishing transport, among others. These methods have significant implications for their associated risks. For example, anecdotal evidence from fishermen and regulatory bodies suggests that double trawlers have lower operational risks than single trawlers. Therefore, this study introduces the operation mode of fishing vessels as a novel influential factor. The finalized list consists of 17 RIFs, as shown in Table 2.
The primary aim of this study is to discern the RIFs impacting the outcomes of fishing vessel accidents. These outcomes are stratified into four severity categories based on considerations such as human casualties, property damage, and equipment impairment. For categorization purposes, we use the designated deemed losses (DL, in RMB). The classifications are as follows: general (DL < 10,000), severe (10,000 ≤ DL < 100,000), major (100,000 ≤ DL < 1,000,000), and critical (DL > 1,000,000).
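As a minimal sketch of the severity rule above, the following Python function maps a deemed loss in RMB to one of the four categories. The function name `classify_severity` is hypothetical, and assigning a loss of exactly 1,000,000 RMB to "critical" is an assumption, because the stated thresholds leave that boundary value unassigned.

```python
def classify_severity(deemed_loss_rmb: float) -> str:
    """Map a deemed loss (DL, in RMB) to an accident severity category."""
    if deemed_loss_rmb < 0:
        raise ValueError("deemed loss must be non-negative")
    if deemed_loss_rmb < 10_000:
        return "general"
    if deemed_loss_rmb < 100_000:
        return "severe"
    if deemed_loss_rmb < 1_000_000:
        return "major"
    # Assumption: a loss of exactly 1,000,000 RMB is treated as critical.
    return "critical"
```

Applying such a function to the loss column of the accident table would yield the four-level target variable used in the later modeling steps.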

4. Research Methodology

This study first utilizes the random forest algorithm to evaluate and rank the importance of risk influential factors (RIFs). The paramount RIFs were then chosen to construct a Bayesian model, employing the tree-augmented naive Bayes (TAN) classifier specifically. After model construction, sensitivity analysis, validation, and performance assessment were conducted, as depicted in Figure 2.

4.1. Random Forest

The literature review and accident data analysis revealed 17 potential risk factors contributing to fishing vessel accidents. Due to computational complexity and overfitting risks, it is crucial to screen these factors before modeling. The aim is to identify those with the most significant impact. The selected factors will then become input variables for the Bayesian model.
Breiman [39] introduced random forest in 2001, as an ensemble statistical learning technique using classification and regression tree (CART) models. It handles multi-collinearity and high-dimensionality by accurately capturing the impact of multiple explanatory variables. As a result, it is widely regarded as one of the best algorithms [40]. It splits the dataset into multiple subsets, building comparatively weak decision tree models per subset. These are amalgamated into a potent composite model via a voting mechanism, significantly reducing over-fitting issues with models such as ID3, C4.5, and CART [41]. Any decision tree can be used as a sub-model. This study uses the CART tree. Nodes are split by minimizing the Gini index, which quantifies the purity of classification. The Gini index evaluates the efficacy of random forest feature selection. For node K with sample set D of e categorized samples D1, D2, …, De, the Gini index of node K can be computed according to Equation (1):
$$G_k = 1 - \sum_{i=1}^{e} P_i^{2} \tag{1}$$
where P1, P2, …, Pe are the probabilities corresponding to each of the e classes of samples at node K.
Equation (1) shows that the Gini index denotes the likelihood of randomly selecting two samples of different classes from the dataset. As such, when selecting attributes to partition node K, the attribute that minimizes the Gini index gives the optimal partition. If attribute F partitions node K into l child nodes {K1, K2, …, Kl}, the post-partition Gini index can be computed according to Equation (2):
$$G_{k,F} = \sum_{i=1}^{l} \frac{|K_i|}{|K|}\, G_{k_i} \tag{2}$$
where $G_{k_i}$ is the Gini index of the ith child node, $|K_i|$ is the number of samples partitioned to the ith child node from node K, $|K|$ is the total number of samples in node K, and l is the number of child nodes.
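The two Gini formulas can be illustrated with a short Python sketch. The function names and the toy labels are hypothetical and are not drawn from the accident database; the sketch only shows how a candidate split is scored.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(children):
    """Post-partition Gini index: child Gini values weighted by child size."""
    total = sum(len(child) for child in children)
    return sum(len(child) / total * gini(child) for child in children)


# Toy node with 10 samples, split into two child nodes by some attribute.
parent = ["general"] * 6 + ["severe"] * 4
children = [["general"] * 5 + ["severe"], ["general"] + ["severe"] * 3]
print(gini(parent), split_gini(children))
```

A split is useful when the weighted Gini index of the children is lower than the Gini index of the parent, which is the case for this toy partition.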
Every sample in the test set is assessed using each decision tree, leading to corresponding class predictions C1(X), C2(X), …, CT(X), with X being a random variable that signifies the sampled instance. As trees function independently, their T output results are aggregated through a voting process. The class with the most votes from the weak T decision trees is the final class prediction.
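The voting step described above can be sketched in a few lines of Python. The function name is hypothetical, and the tie-breaking behavior (the first class encountered among tied classes wins) is an implementation detail of this sketch, not something specified in the study.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees for one sample."""
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    # most_common(1) returns the (class, votes) pair with the highest count;
    # ties resolve to the class that appeared first in the input.
    return Counter(tree_predictions).most_common(1)[0][0]
```

For example, the predictions `["severe", "general", "severe"]` from three trees would be aggregated to `"severe"`.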
During training, a random sampling method is used to create the training dataset for each tree. Each weak classifier uses only a fraction of the samples, and the samples it does not use form the out-of-bag (OOB) data. The performance of the generated random forest can be assessed using this OOB data. Let Q be the total number of OOB data points; these Q points are fed into the pre-established random forest classifier, which categorizes each of them. Let C be the number of incorrect classifications; the sample OOB error is given by Equation (3), and the total OOB error for the random forest is computed according to Equation (4):
$$O_{\mathrm{err}_i} = \frac{C}{Q} \tag{3}$$
$$O_{\mathrm{err}} = \frac{\sum_{i=1}^{N} O_{\mathrm{err}_i}}{N} \tag{4}$$
where N is the number of samples. The OOB data can be used not only to calculate the model’s error but also to assess the importance of features [42,43]. Equation (5) calculates the importance $V_t$ of feature t:
$$V_t = \frac{1}{N} \sum_{i=1}^{N} \left( O_{\mathrm{err}_{t,i}} - O_{\mathrm{err}_i} \right) \tag{5}$$
where $O_{\mathrm{err}_{t,i}}$ is the OOB error obtained by using feature t to classify the samples and $O_{\mathrm{err}_{t,i}} - O_{\mathrm{err}_i}$ represents the change in OOB error as a result of variations in the feature variable t. Larger values indicate a greater OOB accuracy decrease and therefore higher importance.
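The OOB error and the importance score can be sketched as plain arithmetic in Python. This is an illustration under stated assumptions: each index i is taken to be one OOB evaluation, the "variation" of feature t is taken to mean perturbing (for example, permuting) its values before re-scoring, and all function names and numbers are hypothetical.

```python
def oob_error(misclassified, total):
    """OOB error of one evaluation: misclassified points C over OOB points Q."""
    return misclassified / total


def mean_oob_error(errors):
    """Total OOB error: the average of the individual OOB errors."""
    return sum(errors) / len(errors)


def feature_importance(perturbed_errors, baseline_errors):
    """Importance of a feature: mean increase in OOB error after perturbing it."""
    diffs = [p - b for p, b in zip(perturbed_errors, baseline_errors)]
    return sum(diffs) / len(diffs)


# Hypothetical numbers: two OOB evaluations before and after perturbing a feature.
baseline = [oob_error(20, 100), oob_error(25, 100)]
perturbed = [oob_error(30, 100), oob_error(35, 100)]
print(mean_oob_error(baseline), feature_importance(perturbed, baseline))
```

A feature whose perturbation leaves the OOB error unchanged receives an importance near zero, while a feature the forest relies on produces a clear increase.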

4.2. Tree-Augmented Naive Bayes (TAN)

Introduced by Friedman et al. [44], the tree-augmented naive Bayes (TAN) classifier is a Bayesian network classifier that incorporates a tree-like structure. TAN expands naive Bayes by integrating directed arcs between strongly dependent attributes while restricting attribute connectivity. This leads to a graphical model with a tree structure that depicts attribute dependencies. Compared to standard naive Bayes, TAN better leverages attribute dependencies, avoiding exponential computation of complex dependencies and improving classification performance. In the TAN structure, the class variable is at the root and has no parent nodes. Each attribute node has the class variable as a parent and can have at most one other attribute variable as a parent, so an attribute node has up to two parent nodes.
When using this model for classification, the posterior probability of each class label is computed for an unclassified instance using the Bayesian formula. The class label with the maximum probability is chosen as the assigned class, as per Equation (6):
$$c^{*} = \arg\max_{c_j \in C} \frac{p(x_{i1}, x_{i2}, \ldots, x_{in} \mid c_j)\, p(c_j)}{p(x_{i1}, x_{i2}, \ldots, x_{in})} = \arg\max_{c_j \in C} p(x_{i1}, x_{i2}, \ldots, x_{in} \mid c_j)\, p(c_j) = \arg\max_{c_j \in C} p(c_j) \prod_{t=1}^{n} p\left(x_{it} \mid \Pi_{x_{it}}\right) \tag{6}$$
where $c^{*}$ is the assigned class label and the set $\Pi_{x_{it}}$ of parent nodes of attribute $x_{it}$ is obtained based on the constructed TAN structure.
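The classification rule in Equation (6) can be illustrated with a toy TAN in Python. Everything in this sketch is hypothetical: the two attributes, their states, the arc from wind to accident type, and all probability values are invented for illustration and are not taken from the trained model.

```python
# Prior over the class node (hypothetical values).
prior = {"general": 0.7, "severe": 0.3}

# p(wind | class): wind has only the class node as its parent.
p_wind = {
    ("low", "general"): 0.9, ("high", "general"): 0.1,
    ("low", "severe"): 0.6, ("high", "severe"): 0.4,
}

# p(accident_type | class, wind): two parents, the class node and wind.
p_type = {
    ("collision", "general", "low"): 0.5, ("fire", "general", "low"): 0.5,
    ("collision", "general", "high"): 0.7, ("fire", "general", "high"): 0.3,
    ("collision", "severe", "low"): 0.4, ("fire", "severe", "low"): 0.6,
    ("collision", "severe", "high"): 0.8, ("fire", "severe", "high"): 0.2,
}


def tan_scores(wind, accident_type):
    """Unnormalized posterior p(c) * prod_t p(x_t | parents) for each class."""
    return {
        c: p_c * p_wind[(wind, c)] * p_type[(accident_type, c, wind)]
        for c, p_c in prior.items()
    }


def tan_classify(wind, accident_type):
    """Pick the class with the largest unnormalized posterior (the arg max)."""
    scores = tan_scores(wind, accident_type)
    return max(scores, key=scores.get)


print(tan_classify("high", "collision"), tan_classify("low", "fire"))
```

The denominator of Bayes' formula is the same for every class, which is why the sketch compares unnormalized scores directly.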
The learning of TAN involves an optimization problem, and its mathematical formulation is well documented in [38,45]. Once the qualitative structure of the TAN network is established, the next step is parameter learning to determine the conditional probability tables (CPT) for each node. Standard methods for learning parameters from data samples include maximum likelihood estimation, Bayesian estimation for complete datasets, and the expectation-maximization algorithm for incomplete datasets [46].
Given the extensive database constructed in this study and the higher accuracy of Bayesian estimation over maximum likelihood estimation [47], Bayesian estimation is chosen for parameter learning. The implementation of Bayesian networks (BN) in maritime risk analysis generally follows a series of recognized steps, including data collection, variable identification, structure learning, model validation, and sensitivity analysis. This study mirrors such an approach, segmented into four parts: database construction, model design, model validation, and model output. The method selected for this research is the tree-augmented naive Bayes (TAN) classifier, and a BN visualization software, GeNIe modeler, which is a graphical user interface allowing for interactive model building and learning [21,23], was used to develop BN structure.
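To make the difference between the two estimators concrete, the following Python sketch estimates one row of a conditional probability table from observed states. It assumes a symmetric Dirichlet prior with pseudo-count `alpha = 1`; the study does not state which prior was used, and the function names are hypothetical.

```python
from collections import Counter


def mle_cpt_row(observations, states):
    """Maximum likelihood estimate: relative frequency of each state."""
    counts = Counter(observations)
    n = len(observations)
    return {s: counts[s] / n for s in states}


def bayesian_cpt_row(observations, states, alpha=1.0):
    """Posterior-mean estimate under a symmetric Dirichlet(alpha) prior.

    Assumption: alpha = 1 (one pseudo-count per state) is illustrative only.
    """
    counts = Counter(observations)
    n = len(observations)
    k = len(states)
    return {s: (counts[s] + alpha) / (n + alpha * k) for s in states}


observed = ["collision", "collision", "fire"]
states = ["collision", "fire", "grounding"]
print(mle_cpt_row(observed, states))
print(bayesian_cpt_row(observed, states))
```

The Bayesian estimate never assigns zero probability to a state that happens to be absent from a sparse parent configuration, which is one practical reason to prefer it for CPT learning.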

4.3. Model Validation

To examine the comprehensive impact of multiple influencing factors on fishing accidents and confirm the accuracy of the BN model, the sensitivity analysis inference process should satisfy at least the following two axioms [48]:
Axiom 1: Any increase or decrease in the probability values of each parent node should result in a corresponding relative increase or decrease in its child node.
Axiom 2: The cumulative impact resulting from a combination of probability shifts in evidence should not be less pronounced than the impact arising from a subset of that evidence.
It is important to emphasize that these axioms act as benchmarks to gauge the accuracy and reliability of the BN model when assessing the multifaceted influences on accidents.

4.4. Sensitivity Analysis

Sensitivity analysis is a commonly used method for uncertainty analysis. When analyzing maritime accident risks using BN, the goal of sensitivity analysis is to identify risk factors that exert a significant impact on the target variable. Recognizing these factors helps implement appropriate measures to mitigate risks associated with fundamental factors. To ensure a thorough evaluation, both mutual information and sensitivity analysis methods are employed.
(1) Mutual Information
Mutual information is used to identify the importance and priority of risk factors in influencing the target node. Information entropy is a statistical metric denoting the level of uncertainty in a random variable. A higher entropy value signifies greater uncertainty in the variable. The calculation formula for information entropy is as follows in Equation (7).
$$H(Y) = -\sum_{y \in Y} P(y) \log P(y) \tag{7}$$
Mutual information quantifies the shared information between two variables, acting as an indicator of their interdependence. It measures the reduction in information entropy of a query node based on the probability distribution of evidence nodes. The mutual information for two discrete random variables, X and Y, can be defined using Equation (8):
$$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} P(x,y) \log\left( \frac{P(x,y)}{p(x)\, p(y)} \right) \tag{8}$$
where P(x,y) is the joint probability distribution function of variables X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively. The mutual information is used to identify the most influential factors that have the highest degree of dependence on the query node.
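Equations (7) and (8) can be computed directly from paired observations, as in the Python sketch below. The base-2 logarithm is an assumption (the study does not state the base), and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Information entropy of a discrete variable, in bits (log base 2)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information of two discrete variables from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Identical variables share all their information; independent ones share none.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

When one variable fully determines the other, the mutual information equals the entropy of the determined variable, which is the upper bound for that pair.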
(2) Sensitivity Analysis
Sensitivity analysis is a technique that helps validate the probability parameters of Bayesian networks, focusing on how small changes in the numerical parameters of the model (i.e., prior probabilities and conditional probabilities) influence the output parameters (e.g., posterior probabilities). Parameters that are particularly sensitive have a more pronounced influence on the inference results. The precision of the numerical parameters is crucial for calculating the target posterior probabilities. A large derivative of a parameter p can lead to a significant change in the posterior probability of the target. Conversely, a small derivative suggests that notable changes in the parameter may only minimally affect the posterior.
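The role of the derivative can be illustrated on the smallest possible network, a single arc C → E. The Python sketch below is a hypothetical illustration, not the procedure used by the modeling software: it approximates the derivative of the posterior P(C | E) with respect to the parameter p = P(E | C) by a central finite difference.

```python
def posterior(prior_c, p_e_given_c, p_e_given_not_c):
    """P(C | E) for a two-node network C -> E, via Bayes' rule."""
    numerator = prior_c * p_e_given_c
    return numerator / (numerator + (1.0 - prior_c) * p_e_given_not_c)


def sensitivity(prior_c, p, q, step=1e-6):
    """Central finite-difference derivative of P(C | E) with respect to p."""
    upper = posterior(prior_c, p + step, q)
    lower = posterior(prior_c, p - step, q)
    return (upper - lower) / (2 * step)


# Hypothetical parameters: P(C) = 0.3, P(E | C) = 0.8, P(E | not C) = 0.2.
print(sensitivity(0.3, 0.8, 0.2))
```

For this simple network the derivative also has the closed form a·b / (a·p + b)², with a = P(C) and b = (1 − P(C))·P(E | not C), which the numerical value should match closely.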

4.5. Model Prediction Performance Evaluation

This study evaluates the prediction accuracy and reliability of the BN model using a confusion matrix and various performance evaluation metrics. We randomly partitioned the new database into training and testing datasets. The former facilitated model construction, while the latter enabled model evaluation.
Overall accuracy is a simple and effective metric for evaluating the prediction accuracy of the constructed model, defined as the percentage of correctly predicted samples out of the total samples. However, it is not suitable for measuring results with imbalanced samples. To address this issue, precision, recall, F-value, specificity, and false positive rate (FPR) were employed to assess the reliability and robustness of the model.
Precision represents the probability of a positive prediction being a true positive, that is, the share of true positives among all predicted positive samples. Recall, also known as sensitivity, refers to the probability of a positive sample being predicted as positive among all actual positive samples. Precision assesses how accurate the model’s positive predictions are, while recall evaluates how completely the model captures the positive samples. However, they are mutually constrained. The F-value, calculated as the harmonic mean of precision and recall, therefore offers a balanced assessment of both. Specificity represents the proportion of correctly predicted negative samples to all actual negative samples. A higher specificity value is desirable. Conversely, a lower false positive rate (FPR) value is preferable. The detailed confusion matrix is available in Table 3, and the mathematical definitions of each metric are provided in Equations (9)–(13).
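$$\mathrm{precision} = \frac{TP}{TP + FP} \tag{9}$$
$$\mathrm{recall} = \frac{TP}{TP + FN} \tag{10}$$
$$F = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{11}$$
$$\mathrm{specificity} = \frac{TN}{TN + FP} \tag{12}$$
$$FPR = \frac{FP}{FP + TN} \tag{13}$$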
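Equations (9)–(13) translate directly into code. The Python sketch below uses a hypothetical function name and hypothetical counts, and it assumes every denominator is non-zero; for a multi-class target such as accident severity, the counts would be taken per class in a one-versus-rest fashion.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F-value, specificity, and FPR from counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_value = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_value": f_value,
        "specificity": specificity,
        "fpr": fpr,
    }


# Hypothetical confusion-matrix counts for one severity class.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

Specificity and FPR always sum to one, so reporting both is a matter of presentation rather than additional information.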

4.6. Model Consistency Verification

Cohen’s kappa statistic measures agreement between categorical variables. For example, kappa can assess the consistency of different raters in classifying subjects into one of several groups. Kappa can also be used to assess the agreement between different methods of categorical assessment.
In this study, the Cohen’s kappa statistic was used to verify the model consistency of the predictive performance of each consequence of fishing vessel accidents. Kappa is calculated from the observed and expected frequencies on the diagonal of a square contingency table. In this context, the square contingency table is the confusion matrix, as shown in Table 3. The kappa statistic is defined in Equation (14):
$$k = \frac{p_0 - p_e}{1 - p_e} \tag{14}$$
where k is the kappa statistic, $p_0$ is the relative agreement between the true and predicted values, defined in Equation (15), and $p_e$ is the hypothetical probability of chance agreement, defined in Equation (16).
$$p_0 = \frac{TP + TN}{TP + FP + FN + TN} \tag{15}$$
$$p_e = \frac{(TP + FP)(TP + FN) + (FN + TN)(FP + TN)}{(TP + FP + FN + TN)^2} \tag{16}$$
The calculation result of the kappa statistic is k ∈ [−1,1]. A value closer to 1 indicates stronger model consistency. Studies [49,50] suggest that the model is considered almost perfect when k ∈ [0.81,1].
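The kappa calculation in Equations (14)–(16) can be sketched in Python as follows. The function name and the counts are hypothetical, and the sketch assumes the chance agreement is below 1 so that the denominator is non-zero.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa from the four cells of a binary confusion matrix."""
    n = tp + fp + fn + tn
    p0 = (tp + tn) / n  # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2  # chance agreement
    return (p0 - pe) / (1 - pe)


# Hypothetical counts: 70% observed agreement against 50% expected by chance.
print(cohens_kappa(tp=40, fp=10, fn=20, tn=30))
```

With these counts the observed agreement is 0.7 and the chance agreement is 0.5, giving a kappa of 0.4, well below the 0.81 threshold for almost perfect agreement.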

5. Results

5.1. RF-Based RIF Selection

This study utilizes 3448 accident records collected from 2018 to 2022 in Ningbo. The dependent variable in this model is the consequence of the fishing vessel accident. The training set comprises 80% of the data, equating to 2758 samples, while the remaining 690 samples form the test set. The significance of variables can be evaluated through metrics like a reduction in impurity or a decrease in accuracy. In this study, the significance of variables was evaluated using the mean decrease in the Gini coefficient, which is a measure of how each variable contributes to the homogeneity of the nodes and leaves in the resulting random forest. The higher the value of the mean decrease Gini score, the higher the importance of the variable in the model, as shown in Table 4.
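The 80/20 partition can be reproduced as index arithmetic, as in the Python sketch below. The shuffling procedure and the seed are assumptions for illustration; the study does not describe how the random split was implemented.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle record indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # the seed is a hypothetical choice
    n_train = int(n_samples * train_fraction)  # fractional part is truncated
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(3448)
print(len(train_idx), len(test_idx))
```

Truncating 0.8 × 3448 = 2758.4 gives the 2758 training samples and 690 test samples reported above.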
The operation mode of the fishing vessel ranked fourth among all explanatory variables and has a significant impact on the consequences. Its importance exceeds that of most other variables, indicating its distinct role in determining accident severity. Variables with importance scores greater than 0.04 are ranked in this order: season, accident type, human factors, operation mode, wind, age, gross tonnage, length, accident locations, time of day, and power. Conversely, sea condition, hull type, visibility, crew, width, and equipment are less critical as influencing factors.

5.2. Bayesian Model

5.2.1. Bayesian Model Structure Learning

The random forest model identified 11 highly impactful factors: operation mode of the fishing vessel, season, age, accident type, accident locations, wind, time of day, human factors, power, gross tonnage, and length of the vessel. Their high importance scores led to their selection for the Bayesian network (BN) model.
To create a purely data-driven model, the tree-augmented naive Bayes (TAN) [5,22] approach was employed for training without integrating prior knowledge. The final training outcomes are depicted in Figure 3. The trained BN structure consists of 12 nodes interconnected by 21 links. Nodes highlighted in light blue represent vessel-related factors; those in orange correspond to accident-related factors; and the ones in bright blue signify environment-related factors. Links were established by examining the correlations among all influencing factors as per accident records.
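As a quick consistency check on the reported structure, and assuming the attribute-to-attribute arcs form a single spanning tree over the n = 11 selected factors (the usual TAN construction), the arc count works out as follows:

$$\underbrace{n}_{\text{class} \to \text{attribute}} + \underbrace{(n - 1)}_{\text{attribute} \to \text{attribute}} = 11 + 10 = 21$$

This matches the 21 links reported for the trained structure, and the 11 factors plus the consequence node give the 12 nodes.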
The interaction strength between nodes within the model as derived from the training results is illustrated in Figure 4. Strong interactions signify substantial causal connections, while weaker interactions are termed weak causal relationships. The thickness of the connecting lines in Figure 4 symbolizes the intensity of these causal relationships. Bold links indicate substantial relationships. Strength is calculated via the Jensen-Shannon divergence. The substantial causal relationships include connections between length and power, length and gross tonnage, gross tonnage and operation mode, gross tonnage and age, gross tonnage and accident type, consequence of fishing vessel accident and accident type, accident type and wind, human factors and accident type, consequence of fishing vessel accident and accident locations, and accident type and season.
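The Jensen-Shannon divergence between two discrete distributions can be computed as in the Python sketch below. The base-2 logarithm and the function names are assumptions, and how the modeling software aggregates such divergences over parent states into a single link strength is not described in this study, so the sketch covers only the divergence itself.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Identical distributions give 0; distributions with disjoint support give 1.
print(js_divergence([0.5, 0.5], [0.5, 0.5]), js_divergence([1.0, 0.0], [0.0, 1.0]))
```

Under this reading, a child node whose conditional distribution changes sharply across the states of a parent would show a high divergence, which corresponds to a thicker link.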
Table 5 shows strong correlations among vessel factors. Logically, larger vessels having greater length and width need more power. For environmental factors, wind has robust relationships with the accident type and consequence, with windy conditions increasing collisions/contact risks and severity. Accident factors also connect strongly, with about 90% of collision/contact accidents caused by human errors like improper operation and negligence.

5.2.2. Bayesian Model Probabilistic Learning

The probabilities were computed via the Bayesian method, and each node’s conditional probability tables (CPTs) were developed using the GeNIe software. The probability distributions for the nodes were determined based on historical accident data. The final Bayesian model is shown in Figure 5.
The node posterior probability distributions in Figure 5 provide initial observations. For fishing vessels, single trawl operations are most frequently linked with accidents, accounting for 52% of accidents, followed by double trawl at 19% and gillnet at 13%. Vessels over 24 m in length, with a power above 136 kW and a tonnage between 100 and 200 tons, are most prone to accidents. The age of the vessel also plays a major role, with nearly 74% of accidents involving vessels over ten years old. As age increases, so does accident likelihood. For example, vessels aged between 10 and 20 years account for 36% of accidents, while those aged over 20 years account for 38%.
Regarding environmental factors, accident likelihoods in spring, autumn, and winter are 34%, 30%, and 22%, respectively. The chance of accidents happening in the summer is 14%, which is likely due to the fishing ban during this period. The daytime accident probability is 62% versus 38% at night. Interestingly, the impact of wind speed differs notably from previous research: 89% of accidents occurred when the wind was below scale 7, possibly because weather forecasts now mitigate wind risks.
The most frequent types of fishing vessel accidents are collisions (47%), contact damage (26%), and mechanical failures (12%). Wind damage and fires are 2% and 4%. Accidents were most likely to occur in the operational sea area (65%), versus 35% near-shore. Human errors such as improper operation and negligence comprise 51% of accidents.

5.3. Sensitivity Analysis

5.3.1. Mutual Information

A sensitivity analysis of variables was performed employing the mutual information (MI) technique. MI quantifies the mutual dependence between two elements, with information entropy representing the interaction’s significance. High entropy signifies strong correlation, while low entropy implies weak correlation. The analysis was executed on the target nodes, with results presented in Table 6 and Figure 6.
Mutual information was computed between the “Consequence” target node and its influencing factors. The blue line represents the mutual information values, while the orange line indicates differences between adjacent values. A higher mutual information value signifies greater factor influence on the “Consequence”. Table 6 outlines the mutual information values, entropy percentages, and belief differences.
Power exerts significant influence with a mutual information value of 0.08309. Subsequent influential factors ranked by decreasing mutual information are length, tonnage, operation mode, accident type, and human factors, with values of 0.07446, 0.06431, 0.04053, 0.03541, and 0.02873, respectively. This affirms the significant impact of “Operation mode” on the “Consequence”, validating its inclusion as a RIF.
The 6th to 9th mutual information values show minor variation, with changes of 0.00141, 0.00127, and 0.00304 between adjacent values. Based on their mutual information and variation rate, the six RIFs are recognized as the most significantly varying factors.
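The mutual information between a factor and the target node can be sketched as follows. This is an illustrative estimate from paired discrete samples, not the GeNIe computation itself; the sample values are hypothetical and the logarithm base (bits) is an assumption.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical samples: power class and accident consequence for eight accidents.
power = ["high", "high", "high", "low", "low", "high", "low", "high"]
consequence = ["severe", "severe", "severe", "general", "general", "severe", "severe", "general"]
print(f"I(power; consequence) = {mutual_information(power, consequence):.4f} bits")
```

A value of 0 means the factor carries no information about the consequence, while larger values indicate stronger dependence, which is the basis for the ranking in Table 6.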

5.3.2. Sensitivity Analysis

Within the scope of Bayesian modeling for risk analysis of fishing vessel accidents, recognizing risk determinants substantially influencing “Consequences” is critical for applying specific mitigation measures.
Utilizing GeNIe software, a sensitivity analysis was carried out, designating all 11 variables as target nodes. Results show high sensitivity for fishing vessel power, length, tonnage, and consequences, which is consistent with the mutual information analysis. After the sensitivity test of the Bayesian model, the top 10 scenarios under each of the three states of power were selected for impact analysis. The distributions are shown in Figure 7, Figure 8 and Figure 9. Each bar shows the range of change in the target state as the parameter varies within its range (±10%). The color of the bar shows the direction of the change in the target state: red indicates a negative change and green a positive change.
Figure 7 shows that vessels with power less than 44 kW account for 8.59% of accidents. Severe accidents usually occur in this range, with a peak sensitivity value of 0.126. In Figure 8, vessels with power between 44 and 136 kW comprise 2.8% of accidents, where the sensitivity value of having a severe accident is 9%. Figure 9 shows that vessels with power more than 136 kW accounted for 88.6% of accidents, and the sensitivity of having a severe accident soared to 91.16%.
The sensitivity analysis shows that fishing vessels generally have severe and general accidents. If only the power of fishing vessels is considered, the probability of major and critical accidents is relatively low.
In order to understand the causal effects of RIFs, their variation was further explored under the different consequences of fishing vessel accidents. Since the BN comprises 11 RIFs with extensive scenarios, simulating all potential state combinations is challenging. Based on the mutual information, the first six variables, power, length, gross tonnage, operation mode, accident type, and human factors, were chosen for additional sensitivity analysis to identify their nuanced influence on the “Consequence”. The probabilities for each variable’s state were progressively increased to 100%, yielding the joint probabilities depicted in Table 7. It shows the shifts in corresponding consequences when individual node states become 100%, with the respective probability changes. The upward arrow indicates that the probability of the target node increases, and the downward arrow indicates that the probability of the target node decreases. For example, when operation mode is 100% single trawling, the consequences shift from (general: 23%, severe: 68%, major: 7%, critical: 2%) to (general: 19%, severe: 72%, major: 7%, critical: 2%), with changes of −4%, 4%, 0%, and 0%, respectively. Table 7 highlights that “other” accident types have the highest “Major” consequence probability, possibly including shipwrecks. Similarly, fire incidents show the greatest “Major” likelihood. Moreover, gillnet vessels under 44 kW power have significantly higher “Severe” consequence chances.
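The single-trawling example can be worked through in a few lines. The sketch below uses only the percentages quoted in the text; the function name and the direction labels are illustrative choices, not part of the original analysis.

```python
# Consequence distribution (%) before and after fixing operation mode
# to 100% single trawling, as quoted in the text.
baseline = {"general": 23, "severe": 68, "major": 7, "critical": 2}
single_trawl = {"general": 19, "severe": 72, "major": 7, "critical": 2}

def belief_shift(before, after):
    """Return {state: (change in percentage points, direction)} for each consequence state."""
    shifts = {}
    for state, prior in before.items():
        delta = after[state] - prior
        direction = "up" if delta > 0 else "down" if delta < 0 else "none"
        shifts[state] = (delta, direction)
    return shifts

shifts = belief_shift(baseline, single_trawl)
for state, (delta, direction) in shifts.items():
    print(f"{state}: {delta:+d} points ({direction})")
```

The changes sum to zero because both distributions add up to 100%: probability mass moves from "general" to "severe" while "major" and "critical" stay unchanged.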

5.4. Model Validation

5.4.1. Model Accuracy

Additional sensitivity analyses were conducted to investigate the cumulative effects of multiple variables and confirm the accuracy of the model. The first six RIFs were taken as variable sets. Minor 5% increases were made to their prior probabilities towards extreme states impacting “Consequence”. This was implemented sequentially from the first node, resulting in cumulative changes in values for power, length, gross tonnage, operation mode, accident type, and human factors. The same procedure was applied to other types of accidents, with the computed results shown in Table 8. The initial probabilities for each consequence are shown in parentheses, with subsequent columns as cumulative change values after updates. By comparing probability alterations, the cumulative effects of these significant influencing factors on the “Consequence” can be calculated.
Table 8 shows that increases or decreases in the prior probabilities of the variable nodes trigger corresponding changes in the posterior probabilities of the target node. As updates to the influencing factors occur, the probability values of the target node progressively increase, confirming the model adheres to Axioms 1 and 2. This verifies the accuracy of the proposed model.
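The monotonicity check described above can be illustrated on a two-node fragment. This is a sketch under stated assumptions: the prior, the conditional probabilities, and the proportional way of redistributing the 5% are all hypothetical, and the axioms are assumed to require that a shift of prior mass towards a risk-increasing state raises the target probability.

```python
def marginal(prior, cpt):
    """P(target) = sum over parent states s of P(target | s) * P(s)."""
    return sum(prior[s] * cpt[s] for s in prior)

def shift_prior(prior, toward, amount=0.05):
    """Move `amount` of probability mass to `toward`, taken proportionally from the other states."""
    others = {s: p for s, p in prior.items() if s != toward}
    rest = sum(others.values())
    if rest == 0:
        return dict(prior)  # nothing left to move
    amount = min(amount, rest)
    shifted = {s: p - amount * p / rest for s, p in others.items()}
    shifted[toward] = prior[toward] + amount
    return shifted

# Hypothetical prior over power states and P(severe | power).
prior = {"low": 0.2, "medium": 0.3, "high": 0.5}
p_severe_given_power = {"low": 0.5, "medium": 0.6, "high": 0.8}

before = marginal(prior, p_severe_given_power)
after = marginal(shift_prior(prior, "high"), p_severe_given_power)
print(f"P(severe): {before:.3f} -> {after:.3f}")
```

In the full model the same update would be applied to each of the six RIFs in turn, with the target probability expected to change in the same direction at every step.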

5.4.2. Model Prediction Performance Test

A randomly chosen subset of 690 accidents (20% of the total) was used as the test dataset to assess the prediction capability of the model. This assessment is reflected in the confusion matrix shown in Table 9. Based on the matrix, the overall accuracy of the model is 84.6% (584 out of 690). The detailed accuracy values in Table 9 show the most accurate predictions were for “Major” accidents at 87.5%. Accuracy was 80.6% for “General”, 85.9% for “Severe”, and 75.0% for “Critical” accidents.
According to Section 4.5, five performance metrics were calculated for each accident type and are presented in Table 10. For “Severe” accidents, the BN model achieves 96.1% accuracy. For “General”, “Major”, and “Critical” accidents, the model’s recall rate is over 80%. Higher specificity and a lower false-positive rate (FPR) are preferable. Table 10 shows that the specificity is over 90% for all types, while the FPR is under 9%. These comparative results further confirm the excellent performance and reliability of the constructed model.
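The per-class metrics named above can be derived from a multi-class confusion matrix in a one-vs-rest manner, as sketched below. The matrix is hypothetical and does not reproduce Table 9, and only the four metrics mentioned in this paragraph are computed.

```python
def per_class_metrics(matrix, labels):
    """One-vs-rest metrics; matrix[i][j] counts cases of actual class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # actual k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp   # predicted k, actually otherwise
        tn = total - tp - fn - fp
        metrics[label] = {
            "accuracy": (tp + tn) / total,
            "recall": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "fpr": fp / (fp + tn),
        }
    return metrics

# Hypothetical confusion matrix (rows: actual, columns: predicted).
labels = ["General", "Severe", "Major", "Critical"]
matrix = [
    [40, 8, 2, 0],
    [6, 120, 4, 0],
    [1, 3, 14, 2],
    [0, 0, 1, 3],
]

results = per_class_metrics(matrix, labels)
overall_accuracy = sum(matrix[i][i] for i in range(len(labels))) / sum(map(sum, matrix))
for label, m in results.items():
    print(label, {name: round(value, 3) for name, value in m.items()})
```

Specificity and FPR always sum to 1 for a given class, which is why a specificity above 90% implies an FPR below 10%.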
To compare the predictive accuracy, a traditional Bayesian model using 17 factors without feature selection was built. With 20% randomly selected test data, its overall accuracy was 62.1% (428/690). The accuracy of the Bayesian model constructed through feature selection (84.6%) was significantly higher, further confirming its accuracy.
Furthermore, we calculated the kappa coefficient for the Bayesian model built without feature selection, which was 0.7633. This is significantly lower than the Bayesian model proposed in this research. The model’s validation provides further evidence of its enhanced performance.

5.4.3. Model Consistency Test

As per Equation (15) and the confusion matrix in Table 9, pe is determined to be 0.132. The overall accuracy, p0, equals 0.846. Using Equation (14), we calculate the kappa coefficient to be 0.8333. A kappa coefficient (k) within the range [0.81, 1] is commonly interpreted as almost perfect agreement. This result further underlines the strong consistency displayed by the developed model.
The kappa coefficient for the Bayesian model without RIF selection was significantly lower at 0.7633. This validates the enhanced performance of the proposed model incorporating RIF selection.
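For reference, the kappa computation can be sketched from a confusion matrix as follows, assuming Equations (14) and (15) take the standard Cohen form, in which the expected agreement is built from the row and column totals. The matrices used here are hypothetical.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: actual, columns: predicted)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of cases on the diagonal.
    p_o = sum(matrix[i][i] for i in range(n)) / total
    # Expected agreement: sum over classes of (row total * column total) / total^2.
    p_e = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / total ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical two-class example: p_o = 0.7 and p_e = 0.5.
example = [
    [20, 5],
    [10, 15],
]
print(f"kappa = {cohen_kappa(example):.2f}")
```

Kappa equals 1 for perfect agreement and 0 when the observed agreement is no better than chance.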

5.4.4. Case Verification

To further demonstrate the effectiveness of the model, a fishing vessel accident from 2023 was selected for evaluation. On 4 March 2023, the fishing vessel “Zhe Xiang Yu *****” grounded in the East China Sea. The accident report outlines 11 relevant parameters, listed in Table 11. The Bayesian network (BN) model we developed was used to simulate this incident (Figure 10). The simulation concluded with an 87.0% probability of the consequence of the accident falling into the “General” category. This result aligns with the evaluation performed by the Ningbo Mutual Insurance Association, which assessed a loss of 45,256 yuan. This agreement further supports the effectiveness of the BN model in this study.

6. Discussion and Conclusions

6.1. Discussion

Building an extensive database is paramount for fishing vessel accident analysis. In our research, we harnessed claims data from a mutual insurance association for the first time, creating a dedicated database for fishing vessel incidents with broader coverage than previous research. We then applied the RF-BN model framework to simulate how different variables affect the consequences. The framework uses the random forest (RF) method to identify crucial factors based on feature importance, reducing them from 17 to 11 while maintaining the model’s precision. This reduction eases computational complexity and lowers the risk of overfitting. Further validation was conducted through real-world scenarios, demonstrating the model’s solid generalizability.
Additionally, the TAN Bayesian model identified associations between the selected essential factors and the severity of accidents and the mutual relations among independent variables. We ultimately pinpointed three categories of influential factors contributing to the consequences. These insights offer valuable knowledge into the factors shaping the consequences of accidents and are discussed as follows.
Given the context of this study, operation mode was the most vital variable per RF feature importance. Sensitivity analysis further highlighted its significant influence on severity. Single trawling had the highest probability of 54% versus other fishing operations. The introduction of the operation mode as a new risk factor was reinforced by both RF and Bayesian model analysis. Length, power, and tonnage are interconnected factors that considerably affect the consequences. The sensitivity analysis of the TAN model pointed to these as the three most sensitive variables, aligning with previous studies [2,9,11,12]. Our research also unveiled a strong correlation between vessel age and the consequence of an accident, congruent with prior studies [10,11]. Older vessels, especially those over 20 years old, have higher accident risks. Therefore, the safety of older fishing vessels should receive more focus. Our research also identified a higher accident likelihood for gillnet fishing vessels. Policies could encourage safer vessel construction, like double trawlers.
The three identified environmental factors include wind, time of day, and season. Prior studies [8,9] have substantiated the significance of wind as an influencing factor on the consequences, which aligns with our findings. Similarly, the season is a pivotal factor affecting accident severity, consistent with previous research [16]. The probability of accidents happening in the summer is the lowest, reflecting reality as the study area has a fishing ban from 1st May to 17th September, resulting in fewer active fishing vessels and, hence, a lower accident probability. Additionally, the time of day significantly impacts accidents and emerges as one of the most influential factors, consistent with previous research. Numerous researchers, including Jin et al. [8], have indicated that fatal fishing vessel accidents are more likely to occur at night. The increased probability of severe accidents during nighttime might be attributed to the challenge sailors face in judging distances and estimating visibility at night, leading to heightened confusion as visibility naturally diminishes.
Accident type, location, and human factors were the accident-related factors identified. The significance of these three factors has been verified in previous studies [25,26,27]. The importance of accident locations is also consistent with prior findings [8,9]. Past studies [9,16] identified accident type as a key factor in determining fishing vessel accident severity, which is in line with this study’s findings.
Building upon the critical RF factors, this study constructs a Bayesian model and conducts a sensitivity analysis. Six key factors influencing the severity of fishing vessel accidents are identified: power, length, gross tonnage, operation mode, accident type, and human factors.

6.2. Conclusions

This study demonstrates the utilization of insurance data and Bayesian network modeling to analyze fishing vessel accidents. An accident database was developed from rare FMIA data, risk factors were identified, and previous research was integrated to discern influencing factors.
The research presents a data-driven Bayesian model for fishing vessel accidents using influential factors selected via a random forest algorithm. Despite the use of a finite set of indicators, the model exhibits excellent performance and enhanced predictive abilities, as validated through testing and assessments. The findings provide invaluable accident prevention insights.
(1) The 11 predominant factors for fishing vessel accidents were identified, including operation mode, season, age, accident type, wind, gross tonnage, time of day, human factors, power, accident locations, and length. Post-model sensitivity analysis further distilled the core factors to six: power, length, gross tonnage, operation mode, accident type, and human factors.
(2) The data-driven Bayesian model incorporates an enhanced approach, achieving an impressive 89% prediction accuracy based on real-world case studies. This makes it a reliable tool for accident prevention.
(3) The model provides comprehensive information for regulatory authorities and other stakeholders, delivering crucial insights for monitoring the conditions of fishing vessels and creating pertinent policies to ensure maritime safety during fishing operations.
Whilst the RF-BN model developed in this study explains the relevant factors affecting the consequences of fishing vessel accidents, there are still some limitations to this study. For example, the sample used in this paper is the data of fishing vessel accidents in a city along the coast of China, and because of the limitations of the region from which the sample originated, the conclusions drawn in this paper may not be applicable to the analysis of fishing vessel accidents in other regions. Meanwhile, constructing a BN model under the assumption that the samples and variables are independent of each other requires determining the relationship between nodes during the structural learning process, which often requires further discussion in practical applications. In this study, the relationships between nodes were constructed on a data-driven basis. However, the results may be biased if there are irrelevant connections between nodes. Therefore, in further studies, researchers can use the BN model in combination with other methods to improve the reliability of the results. In addition, data related to human factors was collected and assessed subjectively; a more rigorous approach to human factor analysis might help improve the quantitative analysis of fishing vessel accidents.

Author Contributions

Conceptualization, F.W., G.L. and P.Z.; Methodology, W.D., Y.Y. and M.G.; Software, H.F.; Validation, M.G.; Formal analysis, W.D.; Investigation, F.W., Y.Y. and G.L.; Resources, P.Z.; Data curation, H.F. and Y.Y.; Writing—original draft, F.W.; Visualization, H.F.; Supervision, G.L. and P.Z.; Project administration, P.Z.; Funding acquisition, P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Natural Science Foundation of China (52272334), the National Key Research and Development Program of China (2017YFE0194700), and the EC H2020 Project (690713).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, PZ, upon reasonable request.

Acknowledgments

We would like to thank the National “111” Center on Safety and Intelligent Operation of Sea Bridges (D21013), the Zhejiang 2011 Collaborative Innovation Center for Port Economy, and the Donghai Academy of Ningbo University for their financial support in publishing this paper. The authors would like to thank the K.C. Wong Magna Fund at Ningbo University for sponsorship.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Jaremin, B.; Kotulak, E. Mortality in the Polish small-scale fishing industry. Occup. Med. 2004, 54, 258–260.
  2. Jin, D.; Thunberg, E. An analysis of fishing vessel accidents in fishing areas off the northeastern United States. Saf. Sci. 2005, 43, 523–540.
  3. FAO. World Review of Fisheries and Aquaculture, the State of World Fisheries and Aquaculture; Food and Agriculture Organization of the United Nations: Rome, Italy, 2014.
  4. Zhang, G.; Thai, V.V. Expert elicitation and Bayesian Network modeling for shipping accidents: A literature review. Saf. Sci. 2016, 87, 53–62.
  5. Cao, Y.; Wang, X.; Wang, Y.; Fan, S.; Wang, H.; Yang, Z.; Liu, Z.; Wang, J.; Shi, R. Analysis of factors affecting the severity of marine accidents using a data-driven Bayesian network. Ocean Eng. 2023, 269, 113563.
  6. Wang, L.; Yang, Z. Bayesian network modelling and analysis of accident severity in waterborne transportation: A case study in China. Reliab. Eng. Syst. Saf. 2018, 180, 277–289.
  7. Li, H.; Ren, X.; Yang, Z. Data-driven Bayesian network for risk analysis of global maritime accidents. Reliab. Eng. Syst. Saf. 2018, 230, 8938.
  8. National Research Council. Fishing Vessel Safety: Blueprint for a National Program; The National Academies Press: Washington, DC, USA, 1991.
  9. Amir, W.M.; Gobi, K.V.; Kasypi, M.; NurFadhlina, H.; Azlida, A. Comprehensive analysis of the factors that affecting inefficient management of vessels using LRM. Int. J. Eng. Appl. Sci. 2014, 5, 1–15.
  10. Talley, W.K. The safety of sea transport: Determinants of crew injuries. Appl. Econ. 1999, 31, 1365–1372.
  11. Celik, M.; Cebi, S. Analytical HFACS for investigating human errors in shipping accidents. Accid. Anal. Prev. 2009, 41, 66–75.
  12. Kose, E.K.; Dincer, A.C.; Durukanoglu, H.F. Risk Assessment of Fishing Vessels. Tr. J. Eng. Environ. Sci. 1998, 22, 417–428.
  13. Wróbel, K.; Montewka, J.; Kujala, P. Towards the assessment of potential impact of unmanned vessels on maritime transportation safety. Reliab. Eng. Syst. Saf. 2017, 165, 155–169.
  14. Wang, H.; Liu, Z.; Wang, X.; Graham, T.; Wang, J. An analysis of factors affecting the severity of marine accidents. Reliab. Eng. Syst. Saf. 2021, 210, 107513.
  15. Obeng, F.; Domeh, V.; Khan, F.; Bose, N.; Sanli, E. Capsizing accident scenario model for small fishing trawler. Saf. Sci. 2022, 145, 105500.
  16. Lazakis, I.; Kurt, R.E.; Turan, O. Contribution of human factors to fishing vessel accidents and near misses in the UK. J. Shipp. Ocean. Eng. 2014, 4, 245–261.
  17. Jin, D. The determinants of fishing vessel accident severity. Accid. Anal. Prev. 2014, 66, 1–7.
  18. Jin, D.; Kite-Powell, H.L.; Thunberg, E.; Solow, A.R.; Talley, W.K. A model of fishing vessel accident probability. J. Saf. Res. 2002, 33, 497–510.
  19. Çakır, E.; Fışkın, R.; Sevgili, C. Investigation of tugboat accidents severity: An application of association rule mining algorithms. Reliab. Eng. Syst. Saf. 2021, 209, 107470.
  20. Uğurlu, F.; Yıldız, S.; Boran, M.; Uğurlu, Ö.; Wang, J. Analysis of fishing vessel accidents with Bayesian network and Chi-square methods. Ocean Eng. 2020, 198, 106956.
  21. Wang, J.; Pillay, A.; Kwon, Y.; Wall, A.; Loughran, C. An analysis of fishing vessel accidents. Accid. Anal. Prev. 2005, 37, 1019–1024.
  22. Obeng, F.; Domeh, V.; Khan, F.; Bose, N.; Sanli, E. Analyzing operational risk for small fishing vessels considering crew effectiveness. Ocean Eng. 2005, 249, 110512.
  23. Roberts, E.S. Occupational mortality in British commercial fishing, 1976–1995. Occup. Environ. Med. 2004, 61, 16–23.
  24. Laursen, L.H.; Hansen, H.L.; Jensen, O.C. Fatal occupational accidents in Danish fishing vessels 1989–2005. Int. J. Inj. Control Saf. Promot. 2008, 15, 109–117.
  25. Davis, B.; Colbourne, B.; Molyneux, D. Analysis of fishing vessel capsizing causes and links to operator stability training. Saf. Sci. 2019, 118, 355–363.
  26. Heij, C.; Knapp, S. Effects of wind strength and wave height on ship incident risk: Regional trends and seasonality. Transp. Res. Part D Transp. Environ. 2015, 37, 29–39.
  27. Weng, J.; Yang, D. Investigation of shipping accident injury severity and mortality. Accid. Anal. Prev. 2015, 76, 92–101.
  28. Rezaee, S.; Pelot, R.; Ghasemi, A. The effect of extreme weather conditions on commercial fishing activities and vessel incidents in Atlantic Canada. Ocean Coast. Manag. 2016, 130, 115–127.
  29. Liu, K.; Yu, Q.; Yuan, Z.; Yang, Z.; Shu, Y. A systematic analysis for maritime accidents causation in Chinese coastal waters using machine learning approaches. Ocean Coast. Manag. 2021, 213, 105859.
  30. Özaydın, E.; Fışkın, R.; Uğurlu, Ö.; Wang, J. A hybrid model for marine accident analysis based on Bayesian Network (BN) and Association Rule Mining (ARM). Ocean Eng. 2022, 247, 110705.
  31. Antão, P.; Soares, C.G. Analysis of the influence of human errors on the occurrence of coastal ship accidents in different wave conditions using Bayesian Belief Networks. Accid. Anal. Prev. 2019, 133, 105262.
  32. Khan, B.R.; Yin, J.; Mustafa, F.H.; Liu, H. Risk assessment and decision support for sustainable traffic safety in Hong Kong waters. IEEE Access 2020, 8, 72893–72909.
  33. Zhang, J.; Teixeira, P.; Soares, C.G.; Yan, X.; Liu, K. Maritime Transportation Risk Assessment of Tianjin Port with Bayesian Belief Networks. Risk Anal. 2016, 36, 1171–1187.
  34. Khan, B.; Khan, F.; Veitch, B. A Dynamic Bayesian Network model for ship-ice collision risk in the Arctic waters. Saf. Sci. 2020, 130, 104858.
  35. Fan, S.; Yang, Z.; Blanco-Davis, E.; Zhang, J.; Yan, X. Analysis of maritime transport accidents using Bayesian networks. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2020, 234, 439–454.
  36. Uğurlu, Ö.; Yıldız, S.; Loughney, S.; Wang, J.; Kuntchulia, S.; Sharabidze, I. Analysing of Collision, Grounding and Sinking Accident Occurring in the Black Sea Utilizing HFACS and Bayesian Networks. Risk Anal. 2020, 40, 2610–2638.
  37. Yu, Q.; Teixeira, P.; Liu, K.; Rong, H.; Soares, C.G. An integrated dynamic ship risk model based on Bayesian Networks and Evidential Reasoning. Reliab. Eng. Syst. Saf. 2021, 216, 107993.
  38. Fan, S.; Blanco-Davis, E.; Yang, Z.; Zhang, J.; Yan, X. Incorporation of human factors into maritime accident analysis using a data-driven Bayesian network. Reliab. Eng. Syst. Saf. 2020, 203, 107070.
  39. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  40. Ahmed, M.M.; Abdel-Aty, M.A. The viability of using automatic vehicle identification data for real-time crash prediction. IEEE Trans. Intell. Transp. Syst. 2012, 13, 459–465.
  41. Blau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227.
  42. Chow, C.; Liu, C. Approximating discrete probability distributions with dependence trees. IEEE Trans. Inf. Theory 1968, 14, 462–467.
  43. Zou, X.; Yue, W.L. A Bayesian Network Approach to Causation Analysis of Road Accidents Using Netica. J. Adv. Transp. 2017, 2017, 2525481.
  44. Friedman, N.; Geiger, D.; Goldszmidt, M. Bayesian Network Classifiers. Mach. Learn. 1997, 29, 131–163.
  45. Yang, Z.; Yang, Z.; Smith, J.; Robert, B.A.P. Risk analysis of bicycle accidents: A Bayesian approach. Reliab. Eng. Syst. Saf. 2021, 209, 107460.
  46. Wang, S.; Yan, R.; Qu, X. Development of a non-parametric classifier: Effective identification, algorithm, and applications in port state control for maritime transportation. Transp. Res. Part B Methodol. 2021, 128, 129–157.
  47. Ji, Z.; Xia, Q.; Meng, G. A Review of Parameter Learning Methods in Bayesian Network. Adv. Intell. Comput. Theor. Appl. 2015, 11, 3–12.
  48. Swaminathan, H.; Gifford, J.A. Bayesian Estimation in the Rasch Model. J. Educ. Stat. 1982, 7, 175–191.
  49. Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159–174.
  50. Fleiss, J.L. Measuring nominal scale agreement among many raters. Psychol. Bull. 1971, 76, 378–382.
Figure 1. The distribution of accident locations.
Figure 2. The proposed methodology.
Figure 3. Bayesian network structure diagram.
Figure 4. Strength map of interaction pairs (links) in BN.
Figure 5. BN parameter learning results.
Figure 6. Mutual information values and variance.
Figure 7. Sensitivity analysis of fishing vessels with power less than 44 kW.
Figure 8. Sensitivity analysis of fishing vessels with power between 44 kW and 136 kW.
Figure 9. Sensitivity analysis of fishing vessels with power more than 136 kW.
Figure 10. Real case verification.
Table 1. RIFs in the retrieved literature.
RIFs | Number of References | References
Time of day | 14 | Cao et al. [5], Wang et al. [6], Li et al. [7], Jin [8], Lazakis et al. [14], Weng et al. [19], Liu et al. [21], Wang et al. [22], Antão et al. [31], Khan et al. [32], Zhang et al. [33], Khan et al. [34], Fan et al. [35], Uğurlu et al. [36]
Age | 9 | Jin [8], Uğurlu et al. [11], Heij et al. [18], Liu et al. [21], Wang et al. [22], Khan et al. [32], Fan et al. [35], Yu et al. [37], Fan et al. [38]
Wind | 13 | Jin et al. [2], Cao et al. [5], Wang et al. [6], Li et al. [7], Jin [8], Jin et al. [9], Wang et al. [12], Heij et al. [18], Rezaee et al. [20], Wang et al. [22], Obeng et al. [29], Khan et al. [32], Yu et al. [37]
Accident locations | 12 | Jin et al. [2], Cao et al. [5], Wang et al. [6], Jin [8], Jin et al. [9], Wang et al. [12], Lazakis et al. [14], Weng et al. [19], Wang et al. [22], Antão et al. [31], Khan et al. [32], Zhang et al. [33]
Sea condition | 5 | Cao et al. [5], Li et al. [7], Fan et al. [35], Yu et al. [37], Fan et al. [38]
Gross tonnage | 10 | Cao et al. [5], Wang et al. [6], Li et al. [7], Jin [8], Heij et al. [18], Liu et al. [21], Wang et al. [22], Fan et al. [35], Yu et al. [37], Fan et al. [38]
Length | 7 | Jin et al. [2], Wang et al. [6], Li et al. [7], Wang et al. [12], Liu et al. [21], Fan et al. [35], Fan et al. [38]
Power | 3 | Li et al. [7], Liu et al. [21], Wang et al. [22]
Width | 2 | Li et al. [7], Liu et al. [21]
Accident type | 9 | Cao et al. [5], Wang et al. [6], Jin et al. [9], Laursen et al. [16], Weng et al. [19], Liu et al. [21], Wang et al. [22], Antão et al. [31], Khan et al. [32]
Season | 8 | Jin et al. [2], Wang et al. [6], Jin [8], Wang et al. [12], Heij et al. [18], Liu et al. [21], Khan et al. [32], Zhang et al. [33]
Human factors | 9 | Cao et al. [5], Wang et al. [6], Li et al. [7], Uğurlu et al. [11], Obeng et al. [13], Obeng et al. [29], Antão et al. [31], Uğurlu et al. [36]
Visibility | 9 | Cao et al. [5], Wang et al. [6], Li et al. [7], Wang et al. [22], Liu et al. [29], Khan et al. [32], Zhang et al. [33], Khan et al. [34], Yu et al. [37]
Hull type | 6 | Wang et al. [6], Li et al. [7], Jin [8], Liu et al. [21], Fan et al. [35], Fan et al. [38]
Crew | 7 | Cao et al. [5], Wang et al. [6], Uğurlu et al. [11], Lazakis et al. [14], Weng et al. [19], Wang et al. [22], Khan et al. [32]
Equipment | 4 | Li et al. [7], Lazakis et al. [14], Fan et al. [35], Fan et al. [38]
Vessel type | 12 | Cao et al. [5], Wang et al. [6], Li et al. [7], Lazakis et al. [14], Weng et al. [19], Liu et al. [21], Wang et al. [22], Antão et al. [31], Khan et al. [32], Zhang et al. [33], Fan et al. [35], Yu et al. [37]
Table 2. Definition and status of risk influencing factors (RIFs).
Number | RIFs | Description | Value
Fishing vessel-related factors
1 | Operation mode | Single_trawl; Double_trawl; Gill_net; Fishing_transport; Others | 1, 2, 3, 4, 5
2 | Length (m) | Less_than_12; Between_12_to_24; More_than_24 | 1, 2, 3
3 | Width (m) | Less_than_2; Between_2_to_6; More_than_6 | 1, 2, 3
4 | Age (years) | Less_than_10; Between_10_to_20; More_than_20 | 1, 2, 3
5 | Power (kW) | Less_than_44; Between_44_to_136; More_than_136 | 1, 2, 3
6 | Hull type | Steel; wood | 1, 2, 3
7 | Gross tonnage (GT) | Less_than_100; Between_100_to_200; More_than_200 | 1, 2, 3
8 | Crew | Yes/No: Manned as per regulations with an adequate/inadequate number of crew members holding valid certifications and qualifications | 1, 2
9 | Equipment | Yes/No: Outfitted as per regulations with sufficient/insufficient fire-fighting, positioning, and emergency provisions | 1, 2
Environment-related factors
10 | Wind (Beaufort Scale) | Less_than_4; Between_4_to_6; More_than_6 | 1, 2, 3
11 | Visibility (m) | Good: ≥1000; Poor: <1000 | 1, 2
12 | Time of day | Daytime: 07:00–18:59; Night: 19:00–06:59 | 1, 2
13 | Season | Spring; Summer; Autumn; Winter | 1, 2, 3, 4
14 | Sea condition (m) | Good: wave height < 2.5; Poor: wave height ≥ 2.5 | 1, 2
Accident-related factors
15 | Human factors | Yes/No: This accident was caused by human violations or errors, such as fatigue, inadequate lookout, errors in judgment, improper operation, lack of experience or training, management and supervision deficiencies, etc. | 1, 2
16 | Accident locations | Near-shore: <3 nautical miles; Offshore | 1, 2
17 | Accident type | Collision; Contact damage; Wind damage; Fire; Mechanical failure; Others | 1, 2, 3, 4, 5, 6
Table 3. The confusion matrix.

| | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | True Positive (TP) | False Positive (FP) |
| Predicted Negative | False Negative (FN) | True Negative (TN) |

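
The four cells of Table 3 are the inputs to the indicators reported in Table 10 (precision, recall, F-measure, specificity, and FPR). The following is a minimal sketch using the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard indicators derived from one 2x2 confusion matrix.

    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)      # share of predicted positives that are correct
    recall = tp / (tp + fn)         # share of actual positives that are detected
    specificity = tn / (tn + fp)    # share of actual negatives that are rejected
    fpr = fp / (fp + tn)            # false positive rate, equal to 1 - specificity
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "specificity": specificity,
        "fpr": fpr,
    }


# Hypothetical counts, for illustration only.
m = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class target such as the four consequence levels, the same function can be applied once per class after collapsing the matrix into "this class" versus "all other classes".
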
Table 4. Importance score of factors influencing the “Consequences”.

| Variables | Importance Score |
|---|---|
| Season | 0.1312 |
| Accident type | 0.1301 |
| Human factors | 0.1285 |
| Operation mode | 0.1095 |
| Wind | 0.0841 |
| Age | 0.0788 |
| Gross tonnage | 0.0701 |
| Length | 0.0495 |
| Accident locations | 0.0482 |
| Time of day | 0.0426 |
| Power | 0.0401 |
| Equipment | 0.0371 |
| Visibility | 0.0326 |
| Width | 0.0311 |
| Crew | 0.0271 |
| Sea condition | 0.0241 |
| Hull type | 0.0190 |

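
The abstract states that 11 of the 17 factors were retained after ranking by random forest importance. The sketch below assumes the selection is a plain top-k cut on the scores in Table 4; the study may have applied a different criterion, such as a score threshold. The names `importance` and `TOP_K` are illustrative.

```python
# Importance scores copied from Table 4.
importance = {
    "Season": 0.1312,
    "Accident type": 0.1301,
    "Human factors": 0.1285,
    "Operation mode": 0.1095,
    "Wind": 0.0841,
    "Age": 0.0788,
    "Gross tonnage": 0.0701,
    "Length": 0.0495,
    "Accident locations": 0.0482,
    "Time of day": 0.0426,
    "Power": 0.0401,
    "Equipment": 0.0371,
    "Visibility": 0.0326,
    "Width": 0.0311,
    "Crew": 0.0271,
    "Sea condition": 0.0241,
    "Hull type": 0.0190,
}

TOP_K = 11  # number of factors kept for the BN (assumed to be a top-k cut)

# Rank factors from most to least important, then split at TOP_K.
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
selected = [name for name, _ in ranked[:TOP_K]]
dropped = [name for name, _ in ranked[TOP_K:]]

print("selected:", selected)
print("dropped:", dropped)
```

Under this assumption the selected set coincides with the 11 nodes listed in Table 6, and the dropped factors are Equipment, Visibility, Width, Crew, Sea condition, and Hull type.
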
Table 5. Parent node and child node influence strength.

| Parent Node | Child Node | Average | Maximum | Weighted |
|---|---|---|---|---|
| Length | Power | 0.654292 | 0.979383 | 0.654292 |
| Length | Gross tonnage | 0.521862 | 0.834658 | 0.521862 |
| Gross tonnage | Operation mode | 0.385448 | 0.691647 | 0.385448 |
| Gross tonnage | Age | 0.362153 | 0.501135 | 0.362153 |
| Gross tonnage | Accident type | 0.273512 | 0.502921 | 0.273512 |
| Consequences_of_fishing_vessel_accident | Accident type | 0.252898 | 0.450760 | 0.252898 |
| Accident type | Wind | 0.237504 | 0.681575 | 0.237504 |
| Accident type | Human factors | 0.222467 | 0.556150 | 0.222467 |
| Consequences_of_fishing_vessel_accident | Accident locations | 0.218301 | 0.659821 | 0.218301 |
| Accident type | Season | 0.215234 | 0.522913 | 0.215234 |
| Consequences_of_fishing_vessel_accident | Season | 0.207052 | 0.540938 | 0.207052 |
| Consequences_of_fishing_vessel_accident | Gross tonnage | 0.198304 | 0.752166 | 0.198304 |
| Operation mode | Accident locations | 0.188381 | 0.457970 | 0.188381 |
| Consequences_of_fishing_vessel_accident | Power | 0.187248 | 0.693727 | 0.187248 |
| Operation mode | Time of day | 0.149066 | 0.416667 | 0.149066 |
| Consequences_of_fishing_vessel_accident | Wind | 0.142269 | 0.351061 | 0.142269 |
| Consequences_of_fishing_vessel_accident | Time of day | 0.133644 | 0.511364 | 0.133644 |
| Consequences_of_fishing_vessel_accident | Age | 0.130558 | 0.298494 | 0.130558 |
| Consequences_of_fishing_vessel_accident | Operation mode | 0.127574 | 0.381621 | 0.127574 |
| Consequences_of_fishing_vessel_accident | Length | 0.112940 | 0.221474 | 0.112940 |
| Consequences_of_fishing_vessel_accident | Human factors | 0.110528 | 0.283077 | 0.110528 |

Table 6. Mutual information between the node “Consequences” and the parent node.

| Node | Mutual Information | Entropy Reduction (%) | Variance of Beliefs |
|---|---|---|---|
| Power | 0.08309 | 7.840 | 0.0223723 |
| Length | 0.07446 | 6.360 | 0.0193670 |
| Gross tonnage | 0.06431 | 5.870 | 0.0161052 |
| Operation mode | 0.04053 | 3.500 | 0.0090452 |
| Accident type | 0.03541 | 3.180 | 0.0033807 |
| Human factors | 0.02873 | 1.460 | 0.0004962 |
| Age | 0.00767 | 0.757 | 0.0012194 |
| Wind | 0.00626 | 0.156 | 0.0012789 |
| Accident locations | 0.00499 | 0.140 | 0.0050074 |
| Season | 0.00195 | 0.133 | 0.0001943 |
| Time of day | 0.00052 | 0.015 | 0.0000040 |

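
Mutual information quantifies how much knowing the state of one node reduces the uncertainty about another. The sketch below computes it from a joint probability table using the standard definition. The joint distribution is a hypothetical toy example and not the study's data. Two further assumptions are the base-2 logarithm and the reading of "entropy reduction" as mutual information divided by the entropy of the target node.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution given as a list."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def mutual_information(joint):
    """I(X;Y) in bits, where joint[i][j] = P(X = i, Y = j)."""
    px = [sum(row) for row in joint]          # marginal of X (rows)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (columns)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi


# Hypothetical joint distribution: a two-state factor (rows)
# against a two-state outcome (columns).
joint = [
    [0.30, 0.10],
    [0.15, 0.45],
]

mi = mutual_information(joint)
h_target = entropy([sum(col) for col in zip(*joint)])
reduction_pct = 100 * mi / h_target  # assumed definition of "entropy reduction"

print(f"mutual information: {mi:.4f} bits")
print(f"entropy reduction:  {reduction_pct:.2f} %")
```

A factor that is independent of the target yields a mutual information of zero, which is why nodes such as Time of day sit at the bottom of Table 6.
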
Table 7. RIFs affecting the consequences of fishing vessel accidents.

| | General | Severe | Major | Critical |
|---|---|---|---|---|
| Variables | 23% | 68% | 7% | 2% |
| Power | | | | |
| <44 | 75% (52↑) | 21% (−47↓) | 2% (−5↓) | 3% (1↑) |
| 44 to 136 | 58% (35↑) | 32% (−36↓) | 8% (1↑) | 2% (0-) |
| >136 | 20% (−3↓) | 71% (3↑) | 7% (0-) | 2% (0-) |
| Length | | | | |
| <12 | 63% (40↑) | 31% (−37↓) | 4% (−3↓) | 2% (0-) |
| 12 to 24 | 33% (10↑) | 58% (−10↓) | 6% (−1↓) | 3% (1↑) |
| >24 | 18% (−5↓) | 72% (4↑) | 8% (1↑) | 2% (0-) |
| Gross tonnage | | | | |
| <100 | 56% (33↑) | 36% (−32↓) | 6% (−1↓) | 2% (0-) |
| 100 to 200 | 20% (−3↓) | 70% (2↑) | 7% (0-) | 3% (1↑) |
| >200 | 19% (−4↓) | 72% (4↑) | 8% (1↑) | 1% (−1↓) |
| Operation mode | | | | |
| Single trawl | 19% (−4↓) | 73% (4↑) | 7% (0-) | 2% (0-) |
| Double trawl | 20% (−3↓) | 71% (3↑) | 8% (1↑) | 1% (−1↓) |
| Gill net | 47% (24↑) | 46% (−22↓) | 4% (−3↓) | 3% (1↑) |
| Fishing transport | 14% (−9↓) | 77% (9↑) | 7% (0-) | 2% (0-) |
| Others | 31% (8↑) | 61% (−7↓) | 8% (0-) | 1% (−1↓) |
| Accident type | | | | |
| Collision | 24% (1↑) | 70% (2↑) | 5% (−2↓) | 1% (−1↓) |
| Contact damage | 23% (0-) | 71% (3↑) | 5% (−2↓) | 1% (−1↓) |
| Wind damage | 60% (37↑) | 30% (−38↓) | 5% (−2↓) | 5% (3↑) |
| Fire | 18% (−5↓) | 60% (−8↓) | 18% (11↑) | 4% (2↑) |
| Mechanical failure | 9% (−14↓) | 75% (7↑) | 15% (8↑) | 1% (−1↓) |
| Others | 37% (14↑) | 45% (−23↓) | 8% (1↑) | 10% (8↑) |
| Human factors | | | | |
| Yes | 23% (0-) | 71% (3↑) | 5% (−2↓) | 1% (−1↓) |
| No | 24% (1↑) | 64% (−4↓) | 9% (2↑) | 3% (1↑) |

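
Each cell of Table 7 appears to pair a posterior probability, obtained with one factor fixed to a given state, with its change in percentage points relative to the prior in the "Variables" row. The sketch below reproduces that arithmetic for the Power <44 row. This reading of the parenthesised values is an assumption based on the numbers themselves, and the names `PRIOR` and `shift` are illustrative.

```python
# Prior probabilities (%) of the four consequence levels, from the "Variables" row.
PRIOR = {"General": 23, "Severe": 68, "Major": 7, "Critical": 2}


def shift(posterior: dict) -> dict:
    """Change in percentage points between a posterior row and the prior."""
    return {level: posterior[level] - PRIOR[level] for level in PRIOR}


# Posterior row for Power < 44 kW, copied from Table 7.
power_lt_44 = {"General": 75, "Severe": 21, "Major": 2, "Critical": 3}

print(shift(power_lt_44))
# {'General': 52, 'Severe': -47, 'Major': -5, 'Critical': 1}
```
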
Table 8. Effects of small changes in influencing factors on target nodes.

| Power | 5% | 5% | 5% | 5% | 5% | 5% |
|---|---|---|---|---|---|---|
| Length | | 5% | 5% | 5% | 5% | 5% |
| Gross tonnage | | | 5% | 5% | 5% | 5% |
| Operation mode | | | | 5% | 5% | 5% |
| Accident type | | | | | 5% | 5% |
| Human factors | | | | | | 5% |
| General (23%) | 26.1 | 28.4 | 30.6 | 32.2 | 33.4 | 33.7 |
| Severe (68%) | 69.8 | 72.1 | 72.2 | 72.5 | 72.8 | 73.2 |
| Major (7%) | 7.29 | 7.37 | 7.53 | 7.75 | 8.8 | 8.97 |
| Critical (2%) | 2.04 | 2.11 | 2.16 | 2.22 | 3.05 | 3.10 |

Table 9. Confusion matrix for prediction results.

| Predicted \ Actual | General | Severe | Major | Critical | Total | Accuracy (%) |
|---|---|---|---|---|---|---|
| General | 129 | 15 | 14 | 2 | 160 | 80.6 |
| Severe | 32 | 414 | 25 | 11 | 482 | 85.9 |
| Major | 2 | 1 | 35 | 2 | 40 | 87.5 |
| Critical | 0 | 1 | 1 | 6 | 8 | 75.0 |
| Total | 163 | 431 | 75 | 21 | 690 | 84.6 |

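
The per-class accuracy column of Table 9 can be recomputed by collapsing the 4×4 matrix into a one-vs-rest view for each consequence level. The sketch below does this with the counts from the table. It uses neutral names (`row_rest`, `col_rest`) for the off-diagonal counts, because which of them counts as false negatives and which as false positives depends on whether rows are read as the reference class or as the predicted class. The function name `one_vs_rest` is illustrative.

```python
LABELS = ["General", "Severe", "Major", "Critical"]

# Counts from Table 9, one list per table row (Total and Accuracy omitted).
MATRIX = [
    [129, 15, 14, 2],
    [32, 414, 25, 11],
    [2, 1, 35, 2],
    [0, 1, 1, 6],
]


def one_vs_rest(matrix, k):
    """Collapse a multi-class matrix into four counts for class k.

    Returns (diagonal, row_rest, col_rest, remainder).
    """
    total = sum(sum(row) for row in matrix)
    diagonal = matrix[k][k]
    row_rest = sum(matrix[k]) - diagonal                 # same row, other columns
    col_rest = sum(row[k] for row in matrix) - diagonal  # same column, other rows
    remainder = total - diagonal - row_rest - col_rest
    return diagonal, row_rest, col_rest, remainder


row_accuracy = {}
for k, label in enumerate(LABELS):
    diagonal, row_rest, col_rest, remainder = one_vs_rest(MATRIX, k)
    row_accuracy[label] = 100 * diagonal / (diagonal + row_rest)
    print(f"{label}: {row_accuracy[label]:.1f}%")

n_total = sum(sum(row) for row in MATRIX)
overall = 100 * sum(MATRIX[k][k] for k in range(len(LABELS))) / n_total
print(f"overall: {overall:.1f}%")
```

The row-wise ratios reproduce the Accuracy (%) column, and the overall figure matches the 84.6% in the Total row. Other indicators derived from these counts need not coincide with Table 10 to the last digit.
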
Table 10. Performance results for different consequences.

| | General | Severe | Major | Critical |
|---|---|---|---|---|
| Precision | 0.791 | 0.961 | 0.468 | 0.240 |
| Recall | 0.806 | 0.859 | 0.875 | 0.750 |
| F-measure | 0.799 | 0.907 | 0.608 | 0.363 |
| Specificity | 0.935 | 0.918 | 0.938 | 0.972 |
| FPR | 0.064 | 0.081 | 0.061 | 0.027 |

Table 11. Details of the fishing vessel accident that occurred in 2023.

| RIFs | State | RIFs | State |
|---|---|---|---|
| Operation mode | Gill net | Accident type | Contact damage |
| Age | 15 | Gross tonnage | 285 |
| Wind | 6.5 | Season | Summer |
| Power | 220 | Accident locations | Offshore |
| Human factors | Yes | Time of day | Night |
| Length | 28.5 | | |

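
Before the case in Table 11 can be entered as evidence in the BN, its continuous values have to be mapped to the discrete states defined in Table 2. The sketch below shows one way to do that mapping. The helper `three_bins` is hypothetical, and assigning both thresholds to the middle state is an assumption, since Table 2 does not say which state a boundary value belongs to. None of the values in this case lies on a boundary.

```python
def three_bins(value, low, high):
    """Return state code 1, 2 or 3 for a three-level RIF from Table 2."""
    if value < low:
        return 1        # "Less_than_<low>"
    if value <= high:   # assumption: both thresholds fall in the middle state
        return 2        # "Between_<low>_to_<high>"
    return 3            # "More_than_<high>"


# (low, high) thresholds copied from Table 2.
THRESHOLDS = {
    "Length": (12, 24),           # metres
    "Age": (10, 20),              # years
    "Power": (44, 136),           # kW
    "Gross tonnage": (100, 200),  # GT
    "Wind": (4, 6),               # Beaufort scale
}

# Raw values of the 2023 accident, copied from Table 11.
case = {"Length": 28.5, "Age": 15, "Power": 220, "Gross tonnage": 285, "Wind": 6.5}

states = {name: three_bins(value, *THRESHOLDS[name]) for name, value in case.items()}
print(states)
# {'Length': 3, 'Age': 2, 'Power': 3, 'Gross tonnage': 3, 'Wind': 3}
```

The categorical entries in Table 11 (operation mode, accident type, season, location, human factors, time of day) already correspond to states in Table 2 and need no binning.
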

Share and Cite

MDPI and ACS Style

Wang, F.; Du, W.; Feng, H.; Ye, Y.; Grifoll, M.; Liu, G.; Zheng, P. Identification of Risk Influential Factors for Fishing Vessel Accidents Using Claims Data from Fishery Mutual Insurance Association. Sustainability 2023, 15, 13427. https://doi.org/10.3390/su151813427

