Article

Preparing VTS for the MASS Era: A Machine Learning-Based VTSO Recruitment Model

1 Korea Coast Guard, Busan 49112, Republic of Korea
2 Korea Institute of Maritime and Fisheries Technology, Busan 49111, Republic of Korea
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(11), 2127; https://doi.org/10.3390/jmse13112127
Submission received: 17 October 2025 / Revised: 7 November 2025 / Accepted: 8 November 2025 / Published: 10 November 2025
(This article belongs to the Special Issue Sustainable and Efficient Maritime Operations)

Abstract

As the maritime industry transitions toward Maritime Autonomous Surface Ships (MASS), Vessel Traffic Service Operators (VTSOs) face new challenges in managing mixed traffic of conventional and autonomous vessels. Effective VTSO selection is becoming increasingly critical for maritime safety, yet current recruitment processes rely on subjective methods that limit objective evaluation of candidate suitability. This study presents the first machine learning-based classification model for VTSO recruitment. Eight features were defined, including sea service experience, navigation career, education, certifications, and language proficiency. Due to limited access to actual recruitment data, expert-validated simulated datasets were constructed through labeling by 40 maritime professionals and density estimation-based augmentation. Four algorithms were compared, with XGBoost achieving 94.6% F1-score. Feature importance analysis revealed TOEIC score as the most critical predictor, followed by seafaring career, with 3–4 years of experience identified as optimal. These findings indicate that English proficiency for communication with shore remote control centers and practical maritime experience for assessing autonomous vessel behaviors constitute core VTSO competencies in the MASS era. The proposed model demonstrates potential to improve subjective recruitment methods by discovering quantifiable competency patterns, offering a pathway toward data-driven, standardized, and transparent decision-making for enhanced maritime safety.

1. Introduction

The maritime industry is undergoing a transformative shift with the emergence of Maritime Autonomous Surface Ships (MASS). As autonomous and remotely controlled vessels move toward commercial deployment in the coming years, they will inevitably share waterways with conventional manned ships, creating mixed traffic environments that pose unprecedented challenges for maritime traffic management [1,2]. The introduction of MASS fundamentally alters the established human-to-human interaction between vessel operators and shore-based traffic controllers, as the absence of seafarers on board eliminates traditional communication and coordination practices that have been the foundation of maritime safety for decades [1].
In this evolving maritime landscape, the role and responsibility of VTSOs become increasingly critical, as they must not only manage mixed traffic scenarios but also serve as the primary human decision-makers in overseeing vessels with varying degrees of autonomy, thereby necessitating enhanced competencies and more rigorous recruitment criteria. In recent decades, the growth of the global merchant fleet has led to increased congestion and complexity in maritime traffic. This is particularly pronounced in coastal regions, straits, and adjacent waterways. Paradoxically, maritime accidents have decreased by half over the past decade, which can be attributed to the implementation of Vessel Traffic Services (VTS) systems in conflict-prone areas [3].
VTS serves as a cornerstone of modern maritime traffic safety management, playing a crucial role in ensuring safe navigation of vessels in ports and coastal waters [4]. The significance of effective maritime traffic management extends beyond safety to encompass broader sustainability dimensions. Research demonstrates that professional navigation oversight, such as maritime pilotage, directly contributes to emission reduction, spill avoidance, and marine biodiversity protection while optimizing operational efficiency [5].
Furthermore, given that maritime logistics operations within seaports account for approximately 60% of port CO2 emissions [6], the role of VTS in managing vessel movements efficiently becomes critical not only for safety but also for environmental sustainability objectives. As noted in the IALA VTS Manual [7], over 500 VTS centers are currently in operation worldwide, and their significance continues to increase in parallel with the persistent growth of global maritime traffic. VTSOs perform complex and specialized tasks including collision avoidance between vessels, provision of navigational safety information, and emergency response. Their expertise and decision-making capabilities directly impact maritime safety.
Research findings indicating that 60 to 90% of maritime accidents are attributed to human factors [8] further emphasize the importance of maritime professionals such as VTSOs. Moreno et al. [3] highlighted that VTSOs serve as the critical link connecting safety and efficiency within maritime systems. In particular, because VTSOs must simultaneously control multiple vessels in real time while performing complex situational assessments and decision-making, their professional competencies become decisive factors determining the safety and efficiency of maritime traffic.
However, current VTSO recruitment processes primarily rely on traditional methods centered on subject-specific written examinations and interviews. These methods present limitations in objectively and systematically evaluating candidates’ actual job suitability. Such subjective evaluation methods heavily depend on the experience and intuition of hiring personnel, making it difficult to establish consistent selection criteria and carrying the risk of missing excellent talent or selecting unsuitable personnel. These limitations of traditional evaluation methods are further exacerbated in the MASS era. In MASS operational environments, VTSOs must manage complex traffic scenarios involving mixed fleets of autonomous and manned vessels.
This requires multifaceted competencies beyond conventional navigation knowledge and experience, including understanding of autonomous systems, remote monitoring capabilities, and international communication skills. Traditional subjective evaluation methods are inadequate for systematically identifying and predicting the complex interactions and nonlinear relationships among these multiple competency factors. Therefore, a data-driven approach capable of objectively analyzing diverse candidate characteristics and discovering hidden patterns and competency combinations is essential. Recently, machine learning-based personnel selection systems have been introduced across various industries, significantly improving recruitment objectivity and prediction accuracy.
In the maritime field, modern approaches are being attempted, such as Pekdas et al. [9] applying machine learning to maritime professional recruitment using the Human-in-the-Loop (HITL) approach and achieving 86% prediction accuracy. The HITL approach is a methodology where domain experts directly participate in data labeling and model validation during the machine learning model training process, systematically incorporating professional knowledge. This is particularly effective in securing high accuracy and reliability that are difficult to achieve through automated labeling alone in complex and specialized domains. The effectiveness of this approach has been demonstrated across various fields: Butler et al. [10] applied expert manual correction to micro-expression recognition datasets, Zhang et al. [11] developed entity extraction methods utilizing human input, Bartolo et al. [12] improved reading comprehension model performance through iterative expert annotation, and Fan et al. [13] enhanced detection accuracy by integrating user intervention into network anomaly detection systems.
These studies commonly demonstrate that direct expert labeling and validation are essential for performance improvement in complex domains. Research on personnel selection has been actively conducted in the maritime field, though it has primarily focused on seafarer recruitment. Pekdas et al. [9] applied machine learning to maritime professional recruitment, Celik et al. [14] proposed a systematic model for captain selection, and Kartal et al. [15] conducted comparative analysis of multinational navigators. However, these studies all targeted seafarers working at sea, and research on the recruitment of VTSOs, who control vessel traffic from shore, remains absent. VTSOs have fundamentally different working environments and required competencies compared to seafarers. While seafarers board vessels directly to perform navigation duties, VTSOs monitor and control multiple vessels simultaneously from shore-based control centers using various electronic equipment such as radar, Automatic Identification System (AIS), and Closed-Circuit Television (CCTV).
Therefore, there are fundamental limitations in directly applying existing seafarer recruitment research results to VTSO recruitment. This study aims to address this research gap by predicting VTSO recruitment suitability through a binary classification model combining machine learning and HITL approaches, utilizing candidate features such as sea service experience, navigation career, education, certifications, and language proficiency.
However, accessing actual VTSO recruitment datasets, including detailed information of both successful and unsuccessful candidates, is practically impossible due to privacy regulations and institutional policies protecting personal information. This constraint necessitated the construction of expert-validated simulated datasets, where 40 current VTS professionals with over 5 years of control experience participated in the HITL labeling process to ensure data reliability and realism.
The key novelties of this research are threefold: (1) it represents the first application of machine learning-based prediction models specifically designed for shore-based VTSO recruitment, distinct from existing studies focused on seafarer selection; (2) it systematically integrates domain expert knowledge through HITL labeling with 40 current VTS professionals to ensure prediction reliability; and (3) it addresses limited data challenges through Kernel Density Estimation (KDE)-based augmentation while preserving the statistical characteristics of realistic recruitment scenarios.
Section 2 reviews previous research on personnel selection in maritime and other fields and identifies research gaps. Section 3 provides detailed explanations of data collection, preprocessing, inter-variable correlation analysis, and augmentation processes. Section 4 presents the methodology of the proposed machine learning model. Section 5 analyzes experimental results, and Section 6 conducts in-depth discussion of the results. Finally, Section 7 presents research conclusions and future research directions.

2. Literature Review

This literature review section analyzes existing studies related to personnel selection in the maritime sector and examines current research trends. Traditionally, multi-criteria decision making (MCDM) methodologies have been predominantly utilized in the maritime field, with statistical modeling and machine learning approaches beginning to be introduced in recent years. To systematically identify research gaps, existing studies are analyzed across two key methodological dimensions: feature design (general vs. maritime-specific competencies) and algorithmic approaches (MCDM vs. machine learning methods). Table 1 demonstrates the methodological diversity of personnel selection research in the maritime sector, summarizing the key characteristics and contributions of each study. Subsequently, machine learning-based personnel selection studies from other industries are reviewed to explore their applicability to the maritime field. Finally, through comprehensive analysis of existing literature, research gaps are identified, and the necessity of this study is presented.

2.1. Machine Learning-Based Personnel Selection Studies in Other Industries

Machine learning-based personnel selection research has been actively conducted across various industries, primarily developing around text-based matching and HITL approaches. In text-based matching research, Wang et al. [19] proposed the PJFCANN model combining co-attention neural networks and graph neural networks to utilize past successful recruitment cases for current hiring decisions. Wang and Zhu [20] compared advanced deep learning architectures including BERT and Bi-LSTM-CRF, emphasizing the importance of selecting appropriate models based on dataset characteristics rather than model complexity.
HITL-based research focuses on the systematic integration of expert knowledge. Pessach et al. [21] achieved 86% prediction accuracy by combining Variable-Order Bayesian Networks (VOBN) with HITL methodology, while Al-Quhfa et al. [22] systematically compared nine machine learning algorithms including K-NN, SVM, Random Forest, and Multi-Layer Perceptron, confirming that Random Forest achieved the highest performance at 92.8%. According to Wu et al.’s [23] comprehensive survey, systematically integrating domain expert knowledge into data processing and model training stages can simultaneously improve both model performance and reliability. These studies commonly demonstrate the superiority of ensemble methods, the importance of integrating domain expertise through HITL approaches, and the effectiveness of utilizing multiple data sources.

2.2. Applicability to Maritime Field and Research Gaps

Most existing studies have targeted general corporate environments or IT sectors, with Pekdas et al. [9] being the first to attempt machine learning-based personnel selection in the maritime field, though limited to seafarer recruitment. The methodological insights from these existing studies are applicable to VTSO recruitment. The strong performance of ensemble methods such as Random Forest, and of neural networks, can be carried over to evaluating complex maritime professional competencies, while HITL methodologies for integrating domain expertise are useful for systematizing maritime experts’ experience.
Moreover, the importance of combining multiple data sources and building interpretable models can contribute to ensuring transparency and reliability in VTSO recruitment. Despite these applicabilities, existing research cannot resolve fundamental limitations in VTSO recruitment. In terms of job characteristics, VTSOs require highly specialized competencies distinct from general jobs studied in existing research, including real-time multi-vessel monitoring, radar/AIS system operation, and complex maritime situation assessment. Regarding evaluation characteristics, unlike the general personality traits or standardized job experiences utilized in existing studies, VTSOs require maritime-specific competencies as core elements, including maritime expertise, sea-going experience, navigation certificates, and maritime safety knowledge.
Most importantly, VTS work is directly linked to maritime safety. Therefore, prediction models require much higher accuracy and reliability than those for general corporate recruitment. To address these research gaps, this study proposes the development of machine learning models that reflect the unique job characteristics and maritime-specific competencies of VTSOs. We aim to systematically integrate maritime experts’ domain knowledge through an HITL approach to construct a VTSO-specific binary classification model based on sea-going experience, navigation career, maritime certificates, and other relevant factors.

3. Data Preprocessing

This study developed a systematic dataset that combines HITL approaches with data augmentation techniques for developing machine learning models that reflect the realistic characteristics of VTSO recruitment. The problem was formulated as a binary classification task for pass/fail decisions, reflecting the high competition rates of actual Korean VTSO recruitment, and a labeling process was implemented that systematically incorporates domain knowledge from maritime experts. As illustrated in Figure 1, the data construction process consists of four sequential stages, each designed to address specific methodological challenges.
First, Feature Definition identifies eight key features that capture both maritime-specific competencies (License, Career, Position) and general qualifications (Age, Education, TOEIC Score). Second, HITL Labeling employs 40 maritime experts from multiple VTS centers to evaluate 100 candidate profiles, with the top 18% labeled as successful to reflect actual Korean VTS statistics (17.9% pass rate). Third, Correlation Analysis applies Pearson correlation coefficients for continuous variables and Cramer’s V for categorical variables to confirm feature independence while identifying structural relationships in maritime career progression. Fourth, KDE-based Augmentation expands the original 100 samples to 2000 balanced samples (1000 successful, 1000 unsuccessful) through Kernel Density Estimation for continuous variables and proportional allocation for categorical variables, maintaining statistical characteristics while addressing class imbalance.
This systematic workflow ensures data quality and realism while enabling effective machine learning model training.

3.1. Dataset Construction

Eight key features were defined by analyzing the job characteristics and recruitment requirements of VTSOs. The feature design was configured to comprehensively reflect the elements evaluated in actual recruitment processes, and its validity was verified through consultation with current VTSOs and maritime academics. Table 2 presents the detailed definitions and characteristics of each feature.
Realistic career progression models were applied to reflect systematic career development patterns in the maritime field. The data incorporated standard career advancement pathways in the Korean maritime industry: high school graduates begin sea service at age 19 and progress sequentially through 4th → 3rd → 2nd → 1st class positions, while university graduates start at age 23 and advance through 3rd → 2nd → 1st class positions. These patterns faithfully reflect the actual licensing acquisition processes and promotion systems in the maritime industry, ensuring data realism.

3.2. Human-in-the-Loop (HITL) Labeling

To construct a realistic dataset, VTSO recruitment statistics from the Korea Coast Guard’s competitive recruitment for general civil servants were analyzed for 2022–2025. Over three years, a total of 481 candidates applied, with 86 being selected, recording an average competition ratio of 5.59:1 and a pass rate of 17.9% [24]. Reflecting this realistic recruitment ratio, 100 virtual candidate profiles were constructed through consultation with current VTSO and maritime field professors, with 18 candidates (18%) designated as successful and 82 candidates (82%) as unsuccessful, implementing a realistic class distribution. For VTSO recruitment, determining candidate suitability requires advanced professional knowledge in maritime safety, vessel operations, and traffic control.
Therefore, an HITL labeling process utilizing the collective intelligence of domain experts was applied to achieve high accuracy and reliability that would be difficult to attain through automated labeling alone. The expert group comprised 40 current professionals with over 5 years of control experience from Busan Port VTS, Busan New Port VTS, Jeju Port VTS, and Mokpo Port VTS, selected through direct field visits. The selected experts possessed the professional knowledge necessary for candidate suitability assessment, having accurate understanding of the complexity and required competencies of actual VTS operations.
During the labeling process, each expert reviewed the profiles of 100 candidates and performed binary evaluation as either “suitable for recruitment” or “unsuitable for recruitment.” The evaluation results from 40 experts were aggregated to calculate the Approval Count for each candidate, and reflecting the actual recruitment rate (17.9%), the top 18 candidates by approval count were labeled as successful (1), with the remaining 82 candidates labeled as unsuccessful (0). To assess inter-rater reliability, we computed Fleiss’ Kappa among the 40 domain experts, yielding κ = 0.258 (p < 0.001), indicating fair agreement according to Landis and Koch [25]. While the fair-level Kappa reflects moderate variability in expert judgments, it demonstrates the inherent complexity and multifaceted nature of VTSO recruitment decisions. The statistically significant result (p < 0.001) confirms that expert evaluations were systematic rather than random, and our collective intelligence approach—using aggregated approval counts rather than requiring unanimous consensus—effectively captures this diversity of professional perspectives to generate reliable training labels. This collective intelligence-based labeling approach minimized individual expert subjective bias and enabled the generation of objective and reliable labels.
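The aggregation and labeling steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the index-based tie-breaking rule, and the randomly generated votes are all hypothetical, and the Fleiss' Kappa routine is the standard formula specialized to two categories.

```python
import random

def approval_counts(votes):
    """votes[e][c] is 1 if expert e judged candidate c suitable, else 0."""
    n_candidates = len(votes[0])
    return [sum(expert[c] for expert in votes) for c in range(n_candidates)]

def label_top_k(counts, k):
    """Label the k candidates with the highest approval counts as 1, the rest 0.

    Ties are broken by candidate index (an assumption; the source does not
    state a tie-breaking rule).
    """
    ranked = sorted(range(len(counts)), key=lambda c: (-counts[c], c))
    top = set(ranked[:k])
    return [1 if c in top else 0 for c in range(len(counts))]

def fleiss_kappa(counts, n_raters):
    """Fleiss' kappa for binary ratings, given per-candidate approval counts."""
    n_subjects = len(counts)
    # Per-candidate agreement over the two categories (approve / reject).
    p_i = [
        (a * (a - 1) + (n_raters - a) * (n_raters - a - 1))
        / (n_raters * (n_raters - 1))
        for a in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall share of approvals.
    p_yes = sum(counts) / (n_subjects * n_raters)
    p_e = p_yes ** 2 + (1 - p_yes) ** 2
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical votes: 40 experts, 100 candidates, generated at random.
rng = random.Random(0)
votes = [[1 if rng.random() < 0.3 else 0 for _ in range(100)] for _ in range(40)]

counts = approval_counts(votes)
labels = label_top_k(counts, k=18)
kappa = fleiss_kappa(counts, n_raters=40)

# Recruitment statistics quoted in the text: 86 selected out of 481 applicants.
pass_rate = 86 / 481          # about 0.179
competition_ratio = 481 / 86  # about 5.59
```

Because the votes here are random, the resulting kappa carries no meaning for the study; only the mechanics of counting, ranking, and agreement measurement are illustrated.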

3.3. Feature Correlation Analysis

Prior to modeling, quantitative correlation analysis between features was conducted to verify the presence of strong associations among variables. According to feature types, Pearson correlation coefficients were applied to continuous variables and Cramer’s V was applied to categorical variables for quantitative assessment of associations. The Pearson correlation coefficient measures the strength of linear relationships between two continuous variables $X$ and $Y$, defined as follows:
$$ r = \frac{\sum_{i=1}^{n_{\mathrm{cont}}} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n_{\mathrm{cont}}} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n_{\mathrm{cont}}} (Y_i - \bar{Y})^2}} $$
where $X_i$ and $Y_i$ represent the variable values of the $i$-th observation, $\bar{X}$ and $\bar{Y}$ are the means of each variable, and $n_{\mathrm{cont}}$ is the number of observations for continuous variable pairs. For example, the analysis between Age and TOEIC Score yielded $r = 0.028$, indicating a very weak correlation that was not statistically significant, confirming that these two features were mutually independent. Meanwhile, associations between categorical variables were measured using Cramer’s V, which is calculated based on the chi-square statistic as follows:
$$ V = \sqrt{\frac{\chi^2}{n_{\mathrm{cat}} \cdot \min(k - 1,\, r - 1)}} $$
where $\chi^2$ is the chi-square statistic obtained through independence testing between two categorical variables, $n_{\mathrm{cat}}$ is the total sample size in the contingency table, and $k$ and $r$ represent the number of categories for each variable, respectively. Figure 2 presents the results of this feature independence analysis. The analysis revealed strong associations between Seafaring Career and Last Position ($V = 0.755$) and between Education Level and Educational Institution ($V = 0.709$).
This reflects the industry’s personnel structure where career advancement naturally leads to position promotion and education level is naturally associated with institutional type. Additionally, License showed moderate associations with other maritime professional features ($V = 0.356$ to $0.623$), while Completion of VTS Courses demonstrated weak associations with all features ($V \le 0.176$), confirming its independence. Consequently, although strong associations exist between some features, these can be interpreted as structural relationships inherent to the maritime industry. Since each feature provides unique information for VTSO suitability prediction, inclusion of all features in the model was deemed appropriate.
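For readers who wish to reproduce this kind of association analysis, the two measures can be computed directly from their definitions. The following is a minimal sketch using only the Python standard library; the function names and the toy inputs are illustrative assumptions and do not correspond to the study's data.

```python
import math
from collections import Counter

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def cramers_v(a, b):
    """Cramer's V between two categorical sequences (no bias correction)."""
    n = len(a)
    joint = Counter(zip(a, b))  # observed cell counts
    rows, cols = Counter(a), Counter(b)
    chi2 = 0.0
    for r_val, r_n in rows.items():
        for c_val, c_n in cols.items():
            expected = r_n * c_n / n
            observed = joint.get((r_val, c_val), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(rows), len(cols)) - 1)))

# Toy inputs (hypothetical): a perfectly associated and an unrelated pair.
career = ["short", "short", "long", "long"]
position = ["3rd", "3rd", "1st", "1st"]
course = ["yes", "no", "yes", "no"]

v_strong = cramers_v(career, position)  # categories coincide exactly
v_none = cramers_v(career, course)      # categories are balanced across rows
```

In practice a contingency-table routine from a statistics library would normally be used; the explicit loops here are only meant to mirror the formulas term by term.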

3.4. Data Augmentation

The original dataset of 100 samples that completed correlation review presented two critical limitations for machine learning model training: insufficient data volume and severe class imbalance (18 successful vs. 82 unsuccessful candidates). To address these issues, differentiated data augmentation strategies were applied according to variable types. Among various data augmentation techniques, we selected Kernel Density Estimation (KDE) over alternatives such as Synthetic Minority Over-sampling Technique (SMOTE) and Bayesian sampling methods based on the following rationale.
SMOTE [26], which generates synthetic samples by interpolating between neighboring minority class instances, is primarily designed for purely continuous feature spaces and exhibits limitations when dealing with mixed-type datasets containing both continuous and categorical variables [27]. Since our VTSO dataset comprises six categorical features (Education Level, Educational Institution, License, Seafaring Career, Last Position, VTS Course Completion) and only two continuous features (Age, TOEIC Score), SMOTE’s interpolation-based approach would require complex adaptations for categorical features, potentially introducing unrealistic feature combinations that violate domain constraints inherent to maritime career progression.
Bayesian sampling methods [28], while theoretically elegant, require specification of prior distributions and assumptions about the underlying data generation process. Given our limited sample size ($n = 100$) and the absence of established prior knowledge about VTSO candidate distributions, Bayesian approaches would introduce additional uncertainty through potentially misspecified priors. In contrast, KDE offers a non-parametric approach that directly estimates probability density functions from observed data without imposing distributional assumptions, making it particularly suitable for small datasets with unknown underlying distributions [29]. Moreover, KDE enables independent augmentation of continuous variables while preserving their marginal distributions, allowing us to maintain the proportional allocation strategy for categorical variables based on observed frequencies in the original dataset. KDE is a non-parametric statistical method that estimates probability density functions using kernel functions from given data, defined as follows:
$$ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) $$
where $K$ is the kernel function, $h$ is the bandwidth, and $n$ is the number of data points. The augmentation process was specifically applied to continuous variables while preserving the statistical characteristics of the original data. Separate KDE models were constructed for each class (successful/unsuccessful) to capture different distribution patterns. For bandwidth selection, we applied Silverman’s rule of thumb [30]:
$$ h = 0.9 \times \min\!\left(\sigma,\ \frac{IQR}{1.34}\right) \times n^{-1/5} $$
where $\sigma$ is the standard deviation and $IQR$ is the interquartile range. This yielded bandwidth values of $h_{\mathrm{Age}} = 2.09$ years for Age and $h_{\mathrm{TOEIC}} = 40.20$ points for TOEIC Score in the successful candidate class ($n = 18$), and $h_{\mathrm{Age}} = 1.99$ years and $h_{\mathrm{TOEIC}} = 33.79$ points in the unsuccessful candidate class ($n = 82$). These bandwidth values enable smooth density estimation while avoiding over-smoothing that would obscure important distributional characteristics.
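Silverman’s rule is simple enough to compute by hand, and a short sketch may help make the formula concrete. The code below assumes the sample standard deviation and the default quartile method of Python’s `statistics` module; the source does not state which conventions were used, so the output should not be expected to match the reported bandwidths exactly. The input values are hypothetical.

```python
import statistics

def silverman_bandwidth(values):
    """Silverman's rule of thumb: h = 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)."""
    n = len(values)
    sigma = statistics.stdev(values)  # sample standard deviation (assumption)
    q1, _median, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    return 0.9 * min(sigma, iqr / 1.34) * n ** (-1 / 5)

# Hypothetical data with one outlier: the IQR term is the smaller one here,
# so the outlier does not inflate the bandwidth.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
h = silverman_bandwidth(sample)
```

Taking the minimum of the two spread estimates is what makes the rule robust: a single extreme value raises the standard deviation sharply but leaves the interquartile range almost unchanged.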
New values were sampled from the estimated probability density functions, with Age constrained to integer values within the 18–60 year range and TOEIC Score limited to 5-point increments within the 400–990 point range to maintain realistic score granularity consistent with actual TOEIC assessment protocols. Categorical variables were proportionally allocated according to the distribution patterns of the original data. For example, if a specific career segment comprised 40% of successful candidates, the same proportion was allocated to the augmented successful candidate samples to maintain realistic feature combinations that reflect actual maritime career progression pathways.
This proportional allocation strategy ensures that the augmented dataset preserves the structural relationships between categorical features observed in the original data, such as the natural association between Seafaring Career and Last Position. Table 3 presents a comparison of dataset characteristics before and after augmentation.
Figure 3 visually demonstrates the distribution comparison before and after augmentation.
In (a) Age distribution and (b) TOEIC Score distribution, it can be confirmed that KDE-based augmentation successfully preserved the probability density characteristics of the original data while generating new samples. Solid lines represent original data, and dotted lines represent augmented data, with distribution characteristics of successful (red) and unsuccessful (blue) candidates consistently maintained after augmentation. (c) Categorical variable distribution preservation shows that license grade distributions were augmented while maintaining original proportions within each class. (d) Class balance improvement confirms that the severe imbalance of the original data was improved to complete balance through augmentation.
Through this systematic augmentation process, as shown in Table 3, the original 18 successful samples were expanded to 1000 and the 82 unsuccessful samples were expanded to 1000, constructing a final dataset of 2000 completely balanced samples.
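The sampling and allocation steps described in this section can be sketched as follows. This is an illustrative reconstruction under several assumptions that the source does not confirm: a Gaussian kernel, rejection of out-of-range draws (rather than clipping), and largest-remainder rounding for the categorical quotas. The function names and the example values are hypothetical.

```python
import random
from collections import Counter

def sample_gaussian_kde(data, bandwidth, size, rng, lo, hi, step):
    """Draw `size` values from a Gaussian KDE fitted to `data`.

    Sampling from a Gaussian KDE is equivalent to picking an observed point
    at random and adding N(0, bandwidth^2) noise. Draws are rounded to the
    nearest multiple of `step` and rejected if outside [lo, hi].
    """
    out = []
    while len(out) < size:
        x = rng.choice(data) + rng.gauss(0.0, bandwidth)
        x = step * round(x / step)
        if lo <= x <= hi:
            out.append(x)
    return out

def allocate_proportionally(categories, size, rng):
    """Generate `size` labels whose shares match those observed in `categories`."""
    counts = Counter(categories)
    n = len(categories)
    quotas = {c: size * k / n for c, k in counts.items()}
    alloc = {c: int(q) for c, q in quotas.items()}
    # Hand out any remaining slots to the largest fractional remainders.
    leftover = size - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    out = [c for c, k in alloc.items() for _ in range(k)]
    rng.shuffle(out)
    return out

rng = random.Random(42)

# Hypothetical observations for one class; bandwidths taken from the text.
ages = sample_gaussian_kde([27, 30, 33, 35, 41], 2.09, 1000, rng, 18, 60, 1)
toeic = sample_gaussian_kde([700, 815, 900, 945], 40.20, 1000, rng, 400, 990, 5)
licenses = allocate_proportionally(["2nd"] * 4 + ["3rd"] * 6, 1000, rng)
```

Allocating each categorical feature separately, as in this usage example, preserves only marginal proportions. To keep joint structure such as the link between Seafaring Career and Last Position, one option is to pass tuples of feature values as the categories so that observed combinations are allocated as a unit.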

4. Methodology

This section implements and compares four machine learning algorithms to solve the binary classification problem for predicting VTSO recruitment suitability. The algorithm selection criteria considered: (1) superior performance on structured tabular data, (2) effectiveness for binary classification problems, (3) interpretability and feature importance analysis support, and (4) robustness in limited data environments. To facilitate understanding, we present definitions for common variables and notations used throughout this section:
(1) $n$: Total count of training instances
(2) $y_i$, $\hat{y}_i$: Actual value ($y_i$) and predicted value ($\hat{y}_i$) of the $i$-th sample
(3) $K$: Tree quantity in ensemble architectures
(4) $L$: Objective or loss criterion function

4.1. XGBoost

XGBoost is an efficient implementation of the gradient boosting framework developed by Chen and Guestrin [31], widely used in various machine learning problems due to its high predictive performance and computational efficiency [32]. For VTSO recruitment data, the objective function of XGBoost is defined as follows:
$$ obj = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) $$
In this context, $\Omega(f_k)$ denotes the penalty component that quantifies the structural complexity of tree $k$. The mathematical formulation of this penalty component $\Omega(f_k)$ is expressed as:
$$ \Omega(f_k) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 $$
In this formulation, $\gamma$ controls the penalty applied to leaf node quantity, $T$ denotes the total leaf count, $\lambda$ serves as the L2 regularization coefficient, and $w_j$ indicates the $j$-th leaf’s weight value. The learning mechanism of XGBoost operates by minimizing the defined objective. Throughout training, the objective is approximated via Taylor series expansion, facilitating effective determination of both tree architecture and leaf weight values across successive iterations. Hyperparameter tuning was conducted using grid search, focusing on critical parameters such as the learning rate (step size), maximum tree depth, and minimum child weight [33]. A key benefit of XGBoost lies in its capability to quantify feature contribution [34]. This functionality enabled identification of the most influential variables for predicting VTSO recruitment suitability.
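To make the regularized objective concrete, the sketch below evaluates it for a toy ensemble without relying on the XGBoost library. It assumes binary log loss for $L$, which is a common choice for binary classification but is not stated in the source; the function names, leaf weights, and parameter values are hypothetical.

```python
import math

def log_loss(y, p):
    """Binary log loss for one sample (assumed choice of L)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)  # guard against log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def tree_penalty(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lambda * sum of squared leaf weights."""
    n_leaves = len(leaf_weights)
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)

def objective(y_true, y_prob, trees, gamma, lam):
    """Training loss plus the complexity penalty of every tree in the ensemble."""
    loss = sum(log_loss(y, p) for y, p in zip(y_true, y_prob))
    penalty = sum(tree_penalty(t, gamma, lam) for t in trees)
    return loss + penalty

# One hypothetical tree with two leaves of weight +0.5 and -0.5.
penalty = tree_penalty([0.5, -0.5], gamma=1.0, lam=2.0)  # 1.0*2 + 0.5*2.0*0.5 = 2.5
total = objective([1, 0], [0.5, 0.5], [[0.5, -0.5]], gamma=1.0, lam=2.0)
```

The example shows how the two regularization terms act differently: raising $\gamma$ penalizes every additional leaf regardless of its weight, whereas raising $\lambda$ shrinks large leaf weights toward zero.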

4.2. Random Forest

Random Forest represents an ensemble approach utilizing multiple decision trees, originally introduced by Breiman [35], and has gained widespread adoption owing to its superior predictive performance, resilience to overfitting, and versatility across diverse data types [36]. The training methodology of Random Forest comprises several key steps. Initially, through bootstrap sampling, multiple subsets of the training data are randomly created, and this process is repeated to generate multiple bootstrap samples.
Subsequently, each bootstrap subset is used to construct an individual decision tree, wherein at every branching point, the best partition is determined from a randomly selected feature subset. Ultimately, the complete model is assembled by combining numerous trees into a forest structure. When making predictions, the output is derived by consolidating predictions from all constituent trees. In classification scenarios, a plurality vote scheme is employed:
$$\hat{y} = \mathrm{mode}\{ f_k(x),\; k = 1, \ldots, K \}$$
where $f_k(x)$ is the prediction of the $k$-th tree. Random Forest includes several key hyperparameters that control model performance. The number of trees in the forest determines the ensemble size, while the maximum depth of each tree controls model complexity. Parameters for minimum samples required for node splitting and leaf nodes affect tree structure and generalization.
The number of features considered at each split influences model diversity and performance. Similar to XGBoost, Random Forest offers the notable benefit of quantifying feature contribution [37]. This capability enabled the identification of variables exerting the most substantial influence on VTSO recruitment suitability predictions. Furthermore, Random Forest was considered appropriate for recruitment prediction within intricate maritime contexts given its proficiency in modeling non-linear patterns and multi-dimensional feature interactions.
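The two mechanical steps described above, bootstrap sampling and plurality voting, can be sketched in a few lines of dependency-free Python. This is only an illustration of the general Random Forest procedure; the helper names, the seed, and the example votes are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (one bootstrap subset)."""
    return [rng.choice(rows) for _ in rows]


def plurality_vote(tree_predictions):
    """Return the most frequent class label among the K tree predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)

# Hypothetical votes from K = 5 trees for one candidate (1 = suitable).
votes = [1, 0, 1, 1, 0]
prediction = plurality_vote(votes)
print(len(sample), prediction)
```

Because each tree sees a different bootstrap subset and a random feature subset at every split, the individual votes are only partly correlated, which is what makes the aggregated vote more stable than any single tree.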

4.3. Multi-Layer Perceptron Neural Network

Multi-Layer Perceptron (MLP) is a feedforward artificial neural network proposed by Rumelhart et al. [38], consisting of an input layer, one or more hidden layers, and an output layer. MLP was applied to learn complex nonlinear patterns in VTSO recruitment data. Each neuron in the neural network receives input signals, computes a weighted sum, and generates output through an activation function. This process is expressed as follows:
$$z = \sum_{i=1}^{d} w_i x_i + b, \quad \hat{y}_i = g(z)$$
In this formulation, $d$ represents the input feature dimension, $w_i$ denotes the connection weight parameters, $x_i$ indicates the input variable values, $b$ signifies the bias parameter, and $g$ corresponds to the activation mechanism. To address the vanishing gradient issue and enhance training efficiency, this study utilized the ReLU activation mechanism:
$$g(z) = \max(0, z)$$
Neural network learning is performed through the backpropagation algorithm. For binary classification, the following cross-entropy loss function is used:
$$L = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(\sigma(\hat{y}_i)) + (1 - y_i) \log(1 - \sigma(\hat{y}_i)) \right]$$
where $\sigma$ is the sigmoid function. The Adam algorithm [39], which adapts the learning rate for each parameter, was used to update the weights. The main advantage of MLP in VTSO recruitment prediction is its ability to effectively capture complex interactions among various features such as maritime experience, license grades, and educational background.
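As a minimal sketch of the building blocks above, the following Python code computes a single neuron's output with ReLU and the binary cross-entropy loss with a sigmoid applied to the raw outputs. It is not the network used in the study; the inputs, weights, and function names are hypothetical and chosen so the result is easy to check by hand.

```python
import math


def relu(z):
    """ReLU activation g(z) = max(0, z)."""
    return max(0.0, z)


def sigmoid(z):
    """Logistic function mapping a raw output to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum z = sum(w_i * x_i) + b followed by the activation g(z)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def binary_cross_entropy(labels, raw_outputs):
    """Mean negative log-likelihood with a sigmoid applied to raw outputs."""
    total = 0.0
    for y, raw in zip(labels, raw_outputs):
        p = sigmoid(raw)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# Hypothetical values: z = 0.5 - 2.0 + 0.25 = -1.25, so ReLU outputs 0.0.
hidden = neuron([1.0, 2.0], [0.5, -1.0], 0.25, relu)
# Raw outputs of 0.0 give p = 0.5 for both samples, so the loss is log(2).
loss = binary_cross_entropy([1, 0], [0.0, 0.0])
print(hidden, loss)
```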

4.4. Support Vector Machine

Support Vector Machine (SVM) represents a robust classification methodology introduced by Vapnik [40], which aims to find the optimal separating hyperplane that maximizes the margin between classes. SVM was applied to provide robust classification performance suitable for the binary classification problem of VTSO recruitment.
SVM transforms input space data into a high-dimensional feature space to enable linear separation. The classification function of SVM using RBF kernel is defined as follows:
$$\hat{y} = \mathrm{sign}\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right)$$
where $\alpha_i$ are Lagrange multipliers and $K(x_i, x)$ is the kernel function. The RBF kernel is defined as:
$$K(x_i, x_j) = \exp\left( -\gamma \lVert x_i - x_j \rVert^2 \right)$$
where $\gamma$ is a parameter controlling the width of the kernel, with larger values forming more complex decision boundaries. The optimization problem of SVM is solved through the following quadratic programming:
$$\min_{\alpha} \; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{n} \alpha_i \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$$
where C is a regularization parameter that balances the tolerance for classification errors and the size of the margin. In VTSO recruitment, SVM provides stable classification performance even in limited data environments and can derive interpretable results through clear decision boundaries.
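The RBF kernel and the resulting decision function can be illustrated with a short Python sketch. The support vectors, multipliers, and bias below are hypothetical values picked by hand rather than the solution of the quadratic program, and the labels are written in the $\{-1, +1\}$ convention that the sign-based decision rule assumes.

```python
import math


def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, alphas, labels, bias, gamma):
    """sign(sum_i alpha_i * y_i * K(x_i, x) + b), with labels in {-1, +1}."""
    score = sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, a, y in zip(support_vectors, alphas, labels)
    ) + bias
    return 1 if score >= 0 else -1


# Hypothetical support vectors and multipliers (sum of alpha_i * y_i is 0).
svs = [[0.0, 0.0], [2.0, 2.0]]
alphas = [1.0, 1.0]
ys = [-1, 1]
near_positive = svm_decision([1.8, 1.9], svs, alphas, ys, bias=0.0, gamma=0.5)
near_negative = svm_decision([0.1, 0.2], svs, alphas, ys, bias=0.0, gamma=0.5)
print(near_positive, near_negative)
```

Each query point takes the label of the support vector it lies closest to, because the kernel value decays quickly with squared distance; increasing `gamma` makes that decay sharper.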

4.5. Algorithm Development and Training Configuration

This study implemented and compared four machine learning algorithms for predicting VTSO recruitment suitability: XGBoost, Random Forest, MLP, and SVM with RBF kernel. The dataset configuration for each model was designed as follows:
$$D = D_{train} \cup D_{valid} \cup D_{test}$$
where $D_{train}$, $D_{valid}$, and $D_{test}$ represent the training, validation, and test datasets, respectively. The complete dataset consists of $|D| = 2000$ data points, which were randomly split in an 8:1:1 ratio, resulting in 1600, 200, and 200 data points for training, validation, and testing, respectively. To address variability introduced by random splitting, 5-fold cross-validation was applied. Each dataset’s data points are expressed as follows:
$$d_{train}^{i} = (x_{train}^{i}, y_{train}^{i}), \quad i = 1, \ldots, 1600$$
$$d_{valid}^{i} = (x_{valid}^{i}, y_{valid}^{i}), \quad i = 1, \ldots, 200$$
$$d_{test}^{i} = (x_{test}^{i}, y_{test}^{i}), \quad i = 1, \ldots, 200$$
where $y_{train}^{i}, y_{valid}^{i}, y_{test}^{i} \in \{0, 1\}$ represent binary labels indicating VTSO recruitment suitability. The original input feature vector consists of eight features:
$$x_i = [\text{Age},\ \text{Education Level},\ \text{Educational Institution},\ \text{License},\ \text{Seafaring Career},\ \text{Last Position},\ \text{Completion of VTS Courses},\ \text{TOEIC Score}]$$
Among these, Age and TOEIC Score are continuous variables that were standardized using StandardScaler. The remaining six categorical variables were processed using ordinal encoding for ordered variables (License, Seafaring Career, Last Position, Education Level) and one-hot encoding for unordered variables (Educational Institution, Completion of VTS Courses). The hyperparameters for each model were optimized through grid search and 5-fold cross-validation based on F1-score criterion. To assess model stability and reliability, all performance metrics were evaluated using the same 5-fold cross-validation, with detailed settings presented in Table 4.
All experiments were conducted in the computing environment specified in Table 5.
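The three preprocessing steps (standardization, ordinal encoding, and one-hot encoding) can be sketched without external dependencies as follows. This is an illustrative stand-in for library transformers such as StandardScaler, not the study's actual pipeline; the toy records, the category labels, and the level ordering are assumptions made for the example.

```python
from statistics import mean, pstdev


def standardize(values):
    """Zero-mean, unit-variance scaling using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def ordinal_encode(values, ordered_levels):
    """Map each category to its rank in an explicitly ordered list of levels."""
    rank = {level: i for i, level in enumerate(ordered_levels)}
    return [rank[v] for v in values]


def one_hot_encode(values, levels):
    """Create one binary indicator column per unordered category."""
    return [[1 if v == level else 0 for level in levels] for v in values]


# Hypothetical toy records; labels and level ordering are illustrative only.
toeic = [700, 800, 900]
license_grade = ["4th", "2nd", "1st"]
vts_course = ["O", "X", "O"]

toeic_z = standardize(toeic)
license_ord = ordinal_encode(license_grade, ["4th", "3rd", "2nd", "1st"])
vts_onehot = one_hot_encode(vts_course, ["O", "X"])
print(toeic_z, license_ord, vts_onehot)
```

In practice the mean and standard deviation would be estimated on the training portion only and then reused for the validation and test portions, so that no information from held-out data enters the scaling step.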

4.6. Model Assessment and Evaluation Criteria

This section outlines the validation and testing methodology employed for the models, along with the performance metrics utilized for evaluation. The validation dataset $D_{valid}$ and test dataset $D_{test}$ are mathematically formulated as follows:
$$D_{valid} = \{ d_{i,valid} \mid i \in I_{valid} \}, \quad D_{test} = \{ d_{i,test} \mid i \in I_{test} \}$$
In this notation, $d_{i,valid}$ and $d_{i,test}$ denote the $i$-th observations within the validation and test subsets, respectively. The symbols $I_{valid}$ and $I_{test}$ correspond to the index sets for these validation and test subsets. Individual observation structures are formulated as:
$$d_{i,s} = (x_{i,s}, y_{i,s})$$
Here, $x_{i,s} \in \mathbb{R}^{8}$ is the input feature vector (where 8 is the total number of features), and $y_{i,s} \in \{0, 1\}$ is the binary label indicating VTSO recruitment suitability. The subscript $s$ denotes either “valid” (validation) or “test” (testing). The model’s predictions on the validation and test datasets are computed as follows:
$$\hat{y}_{i,valid} = f_{\theta}(x_{i,valid}), \quad \hat{y}_{i,test} = f_{\theta}(x_{i,test})$$
In this notation, $f_{\theta}$ denotes the optimized prediction model, with $\theta$ corresponding to the learned parameter set encompassing weight and bias values. Model performance was assessed with five metrics: Accuracy, Precision, Recall, F1-score, and ROC AUC. Accuracy serves as the primary indicator, quantifying the ratio of correct classifications to total predictions and thereby evaluating overall model effectiveness.
Precision quantifies the fraction of true positives among all positive classifications. Recall captures the fraction of actual positive cases detected by the model. F1-score is the harmonic mean of Precision and Recall, reflecting the balance between these two measures. ROC AUC evaluates the model’s ability to discriminate between the positive and negative classes across varying decision thresholds. The mathematical formulations for these metrics are presented below:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$\mathrm{F1\text{-}Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
In this context, TP (True Positive) denotes instances where VTSO recruitment suitability is accurately classified, while TN (True Negative) represents instances where unsuitability is accurately identified. FP (False Positive) arises when an unsuitable candidate is erroneously classified as suitable, whereas FN (False Negative) emerges when a suitable candidate is incorrectly classified as unsuitable.
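As a worked illustration of the four count-based metrics, the following Python sketch computes them from confusion-matrix counts. The counts are hypothetical values for a 200-sample evaluation set and do not correspond to any model reported in this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts (not taken from the study's confusion matrices).
acc, prec, rec, f1 = classification_metrics(tp=90, tn=85, fp=15, fn=10)
print(f"{acc:.3f} {prec:.3f} {rec:.3f} {f1:.3f}")
```

In a recruitment setting the two error types carry different costs: a false negative rejects a suitable candidate, while a false positive passes an unsuitable one to the next stage, which is why precision and recall are reported separately alongside accuracy.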
Furthermore, the ROC AUC (Area Under the Receiver Operating Characteristic Curve) metric was employed to evaluate how well the predicted probabilities separate the two classes. This metric offers a single score capturing the model’s discriminative performance across the entire range of decision thresholds, computed as follows:
$$\mathrm{ROC\ AUC} = \frac{1}{n_0 n_1} \sum_{i:\, y_i = 1} \; \sum_{j:\, y_j = 0} I(\hat{p}_i > \hat{p}_j)$$
Here, $n_0$ and $n_1$ represent the number of samples in the negative and positive classes, respectively, $\hat{p}_i$ is the predicted probability that the $i$-th sample belongs to the positive class, and $I$ is the indicator function. The complete workflow for training and evaluating the four machine learning models is detailed in Algorithm 1.
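The pairwise form of ROC AUC given above can be computed directly by comparing every positive sample with every negative sample. The following Python sketch does exactly that on hypothetical labels and predicted probabilities; like the formula as written, it counts ties as zero, whereas some implementations award half a point for tied pairs.

```python
def roc_auc_pairwise(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1 for p in pos for q in neg if p > q)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for five candidates.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.8, 0.2]
auc = roc_auc_pairwise(labels, scores)
print(auc)
```

Here five of the six positive-negative pairs are ordered correctly, so the score is 5/6. The value depends only on the ordering of the scores, not on their absolute magnitudes.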
Algorithm 1. Model Training and Evaluation Process for VTSO Recruitment Prediction
Input: Training data $D_{train}$, validation data $D_{valid}$, test data $D_{test}$
Output: Trained models $\{ f_{XGBoost}^{\theta}, f_{RF}^{\theta}, f_{MLP}^{\theta}, f_{SVM}^{\theta} \}$, performance metrics
Encode categorical features using ordinal and one-hot encoding
Standardize continuous features (Age, TOEIC Score) using StandardScaler
for each model $m \in$ {XGBoost, Random Forest, MLP, SVM} do
   Initialize $f_{m}^{\theta}$ with hyperparameters (see Table 4)
   Initialize $cv_m$
   for fold ← 1 to 5 do
      Split $D_{train}$ into $D_{train}^{(fold)}$ and $D_{valid}^{(fold)}$
      if $m$ requires scaling then
         Scale $D_{train}^{(fold)}$ and $D_{valid}^{(fold)}$
      Train $f_{m}^{\theta}$ on $D_{train}^{(fold)}$
      Predict $\hat{y}_{valid} \leftarrow f_{m}^{\theta}(x_{valid}^{(fold)})$
      Compute F1-score and append to $cv_m$
   Compute average F1-score for model $m$
for each model $m$ do
   if $m$ requires scaling then
      Scale $D_{train}$, $D_{valid}$, and $D_{test}$
   Train $f_{m}^{\theta}$ on $D_{train}$
   Predict $\hat{y}_{valid} \leftarrow f_{m}^{\theta}(x_{valid})$
   Predict $\hat{y}_{test} \leftarrow f_{m}^{\theta}(x_{test})$
   Compute performance metrics on $D_{valid}$ and $D_{test}$ using Equations (15)–(17)
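The cross-validation loop in the first half of Algorithm 1 can be sketched in plain Python as follows. This is a simplified illustration rather than the authors' implementation: `fit` and `score` are hypothetical callables standing in for model training and F1 computation, and the shuffling seed is arbitrary.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and cut them into k nearly equal validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, fit, score):
    """Train on k-1 folds, score on the held-out fold, return the mean score."""
    fold_scores = []
    for held_out in k_fold_indices(n_samples, k):
        held = set(held_out)
        train_idx = [i for i in range(n_samples) if i not in held]
        model = fit(train_idx)                      # hypothetical training call
        fold_scores.append(score(model, held_out))  # hypothetical scoring call
    return sum(fold_scores) / k


folds = k_fold_indices(1600, 5)
print([len(f) for f in folds])
```

With 1600 training points and five folds, each model is trained on 1280 points and scored on the remaining 320 in every iteration, and the five scores are then averaged.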
An early stopping strategy based on the validation data was implemented to mitigate overfitting. Specifically, for the XGBoost and MLP models, training was terminated when the validation loss showed no improvement for 100 consecutive epochs (boosting rounds in the case of XGBoost). The performance metrics defined above were used to evaluate each model thoroughly. During validation, these metrics were tracked to identify the best-performing model, and the same metrics were then applied in the testing phase to determine final model performance.
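The patience-based stopping rule described above can be sketched as follows. The helper name and the loss sequence are hypothetical, and a patience of 3 is used in place of the study's 100 purely to keep the example short.

```python
def early_stopping_round(val_losses, patience):
    """Return the index at which training stops, or None if it never triggers.

    Training stops once `patience` consecutive rounds pass without the
    validation loss improving on the best value seen so far.
    """
    best = float("inf")
    rounds_without_improvement = 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                return i
    return None


# Hypothetical validation losses; the best value (0.58) occurs at index 5.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.58, 0.59, 0.6, 0.61, 0.5]
stop_at = early_stopping_round(losses, patience=3)
print(stop_at)
```

Note that the final loss of 0.5 is never reached because training halts at index 8. A larger patience value trades longer training for a lower risk of stopping before a late improvement.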

5. Results and Analysis

This section presents the experimental results of the four machine learning models for VTSO recruitment prediction. The analysis examines model performance evaluation (Section 5.1), feature importance analysis (Section 5.2), and partial dependence plot examination (Section 5.3) to identify the optimal model and understand the key factors influencing recruitment suitability decisions.

5.1. Model Performance Evaluation

The performance of XGBoost, Random Forest, MLP, and SVM models was evaluated using accuracy, precision, recall, F1-score, and ROC AUC metrics through 5-fold cross-validation. Table 6 presents the performance results with standard deviations, demonstrating model stability and consistency across different data subsets.
XGBoost achieved the highest performance across most metrics with 94.5% accuracy and 0.946 F1-score, while SVM demonstrated the best ROC AUC of 0.988 and highest recall of 0.970. MLP showed the lowest performance with 90.5% accuracy, indicating relatively weaker predictive capability compared to ensemble and kernel-based methods. Figure 4 shows the ROC curves for the four models.
SVM achieved the highest ROC AUC value of 0.988, followed by XGBoost (0.981), Random Forest (0.975), and MLP (0.968). All models demonstrated excellent classification performance, with ROC AUC values of 0.968 or higher. Figure 5 presents the confusion matrices for each model. XGBoost achieved the highest F1-score of 0.946 and produced only 4 FNs, so few actually suitable candidates were missed. SVM recorded the fewest FNs with 3 cases but had a relatively high number of FPs with 11 cases. MLP showed the most classification errors, with 13 FPs and 6 FNs.
Overall, XGBoost demonstrated the best performance in F1-score, while SVM achieved the highest ROC AUC. All models attained accuracy above 90%, confirming their effectiveness for predicting VTSO recruitment suitability.

5.2. Feature Importance Analysis

Feature importance analysis was conducted to identify the relative contribution of each input variable in predicting VTSO recruitment suitability across the four models. Each model’s feature importance values were individually normalized to a 0–1 scale, then averaged across all models to provide a comprehensive ranking. Figure 6 shows the normalized feature importance values for each model.
As shown in Figure 6, TOEIC Score was the most important feature overall, reaching the maximum normalized importance (1.0) in the Random Forest, MLP, and SVM models. Its average importance across all four models was 0.908, establishing it as the most critical predictor. Seafaring Career emerged as the second most important feature with an average importance of 0.782, followed by Last Position (0.373), License (0.227), and Age (0.225).
The normalized importance values reveal that Random Forest, MLP, and SVM models all identified TOEIC Score as their most influential feature, while XGBoost showed a more distributed importance pattern across features. Educational background variables (Education Level, Educational Institution) and VTS Course Completion consistently ranked among the least important factors across all models, suggesting that practical maritime experience and English proficiency are more critical determinants than formal education credentials for VTSO recruitment suitability.
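The normalize-then-average procedure used for the combined ranking can be sketched as follows. The text states only that importances were normalized to a 0–1 scale per model, so dividing by each model's maximum is an assumption made here for illustration; the model names and raw importance values are likewise hypothetical.

```python
def normalize_by_max(importances):
    """Scale one model's importances so its largest value becomes 1.0."""
    top = max(importances.values())
    return {name: value / top for name, value in importances.items()}


def average_importance(per_model):
    """Average the normalized importance of each feature across models."""
    normalized = [normalize_by_max(imp) for imp in per_model.values()]
    features = normalized[0].keys()
    return {f: sum(n[f] for n in normalized) / len(normalized) for f in features}


# Hypothetical raw importances for two models and three features.
raw = {
    "model_a": {"TOEIC Score": 0.40, "Seafaring Career": 0.30, "Age": 0.10},
    "model_b": {"TOEIC Score": 0.20, "Seafaring Career": 0.25, "Age": 0.05},
}
avg = average_importance(raw)
ranking = sorted(avg, key=avg.get, reverse=True)
print(ranking)
```

In this toy example one feature tops the averaged ranking even though it is not the top feature in every model, which shows why per-model and averaged importances should be read together.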

5.3. Partial Dependence Plot Analysis

Partial dependence plot (PDP) analysis was conducted to quantitatively examine the individual effects of each feature on VTSO recruitment suitability prediction. Figure 7 presents the partial dependence plots for the 8 main features and 12 variables subdivided through one-hot encoding.
License demonstrated a consistent linear pattern across all models, with recruitment probability continuously increasing as the grade advances from 4th to 1st class (1st class being the highest maritime license). Seafaring Career exhibited an intriguing non-linear pattern, showing a sharp increase up to the 3–4-year experience range followed by a decline, forming an inverted U-shaped curve. Last Position showed an upward trend until the 2nd Officer position, then maintained a relatively flat pattern.
Age displayed an inverted U-shaped distribution with recruitment probability increasing until approximately 30 years of age, then declining thereafter. Education Level showed an upward trend until the bachelor’s degree level, then plateaued. TOEIC Score demonstrated a strong positive correlation across all models, with recruitment probability continuously rising as scores increased. Regarding educational institutions, both Maritime university and Non-maritime university showed upward patterns, while Maritime training institute and Maritime-related high school exhibited downward trends.
For VTS course completion, completing VTS courses (O) showed upward patterns across all models, while not completing them (X) displayed downward patterns. The specific implications of these partial dependence patterns for VTSO recruitment and their interpretation within the practical context of the maritime industry will be discussed in detail in the following section.
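For readers unfamiliar with how a partial dependence curve is obtained, the following Python sketch shows the basic computation: one feature is forced to each grid value for every row, and the model's predictions are averaged. The scoring function is a hypothetical toy model constructed to peak at a mid-range value, and is unrelated to the fitted models in this study.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical scorer: peaks when feature 0 equals 3, rises with feature 1."""
    return 1.0 - 0.1 * (row[0] - 3) ** 2 + 0.05 * row[1]


rows = [[1, 0], [2, 2], [5, 4]]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[1, 3, 5])
print(curve)
```

The resulting curve rises and then falls across the grid, which is the kind of inverted U-shape described above for Seafaring Career and Age. Because the other features keep their observed values, the curve reflects the average effect of the chosen feature rather than the effect for any single candidate.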

6. Discussion

This section analyzes the experimental results within the context of VTSO recruitment and maritime industry practices. Three key aspects are examined: model performance characteristics and their suitability for recruitment prediction, feature importance patterns and their interpretation through maritime domain expertise, and practical implications for implementing machine learning-based recruitment systems.

6.1. Model Performance Analysis

The experimental results reveal distinct performance characteristics among the four machine learning models. XGBoost achieved the highest F1-score (0.946), demonstrating a strong balance between precision and recall. SVM obtained the highest ROC AUC (0.988), indicating the strongest ability to rank suitable candidates above unsuitable ones across all classification thresholds. This performance divergence reflects differences in model optimization objectives. XGBoost’s gradient boosting framework excels at minimizing classification errors through iterative refinement.
XGBoost produced only 4 false negatives, which is critical for avoiding rejection of suitable candidates. In contrast, SVM’s margin maximization approach separated the classes well across thresholds, yielding the fewest false negatives (3 cases) but more false positives (11 cases). The strong performance of ensemble methods, particularly XGBoost and Random Forest, aligns with previous maritime personnel selection research. Pekdas et al. [9] demonstrated that XGBoost achieved the highest accuracy in seafarer recruitment prediction, followed by Random Forest. This pattern suggests the inherent suitability of ensemble methods for handling complex, multi-dimensional recruitment problems with limited data availability in the maritime domain.
Shin and Yang [41] reported similar advantages in maritime accident prediction using small-scale datasets, further supporting the robustness of ensemble methods across different maritime applications. The choice between models involves strategic trade-offs. XGBoost’s higher F1-score suggests better classification accuracy for binary pass/fail decisions, making it suitable for automated screening. SVM’s superior ROC AUC indicates a more reliable ordering of candidates by predicted score, which is valuable for ranking candidates or setting flexible decision thresholds based on recruitment quotas. Random Forest (F1-score: 0.931, ROC AUC: 0.975) represents a robust alternative, while MLP’s lower performance (F1-score: 0.908) suggests limited suitability despite its theoretical capacity for complex pattern recognition.

6.2. Feature Analysis and Maritime Context Interpretation

Feature importance and partial dependence analysis results demonstrate that international communication capabilities and appropriate levels of practical experience are key factors in VTSO recruitment. TOEIC Score achieving the highest average importance of 0.908 reflects the essential nature of VTS operations. TOEIC is a globally recognized English proficiency assessment, and effective English communication with multinational vessels in international maritime traffic control is a critical competency directly linked to maritime safety.
The consistent upward pattern of TOEIC Score across all models in partial dependence analysis indicates that improving English proficiency continuously increases recruitment probability. The average importance of 0.782 for Seafaring Career and its intriguing inverted U-shaped partial dependence pattern reflect the unique operational characteristics of the maritime industry. The pattern of sharp increase until 3–4 years of experience followed by decline suggests that experience at the Second Mate level is optimal for VTS operations. Second Mates serve as Navigation Officers aboard vessels, responsible for navigation duties, thus acquiring the core competencies required for VTS control. Conversely, from Chief Mate onwards, officers primarily handle cargo operations, reducing direct relevance to VTS work.
Additionally, recruiters may regard candidates with extensive seafaring experience as overqualified, or may doubt their adaptation to shore-based control operations, resulting in declining recruitment probability. The Age pattern, which increases until approximately 30 years and then declines, suggests a balance point between the physical and cognitive abilities required for VTS operations and accumulated experience. This indicates that the concentration and decision-making capabilities needed for complex multi-vessel control operations are regarded as optimal within a specific age range. Education Level also showed an upward pattern until the bachelor’s level followed by a plateau, confirming the importance of systematic higher education. Analysis by educational institution revealed that Maritime university showed the strongest upward pattern, which is attributed to systematic merchant marine officer training programs and to the organizational adaptability developed through collective living, both of which are positively perceived by evaluators.
Non-maritime university also demonstrated an upward pattern because systematic and in-depth learning experiences from four-year university education contribute to developing comprehensive thinking abilities required for VTS operations. Conversely, Maritime training institute and Maritime-related high school showed downward patterns, related to Korea’s maritime officer training system. Maritime training institutes offer accelerated programs compressing four-year maritime university curricula into one year, potentially lacking theoretical depth, which may explain these results.
For VTS course completion, completing courses showed upward patterns across all models while not completing showed downward patterns, reflecting Korea’s VTS education system. Currently, VTS-related courses are offered in maritime-related high schools and universities, and such theoretical foundational knowledge contributes to improving practical adaptability. Overall, feature analysis results confirm that international communication capabilities, maritime expertise with appropriate levels of sea-going experience, systematic higher education, and VTS-related specialized knowledge are key elements in VTSO recruitment.

6.3. Practical Implications and Future Directions

The results of this study provide specific practical implications for improving the efficiency and objectivity of VTSO recruitment processes. The overwhelmingly high importance of TOEIC Score (0.908) suggests the necessity of developing current general English proficiency assessment methods into forms specialized for VTS operations. Development of VTS-specific English examinations or educational programs that can evaluate specialized maritime terminology, emergency response communication, and real-time communication capabilities with multinational vessels should be considered.
The identification of Seafaring Career (0.782) as the second most critical predictor confirms that practical maritime experience remains fundamental to VTSO competency. In the emerging era of MASS, these core competencies—English proficiency and practical seafaring knowledge—will become even more vital, as VTSOs must effectively communicate with both conventional vessels and Shore Remote Control Centers (SRCC), while simultaneously applying their maritime expertise to assess autonomous vessel behaviors and manage mixed traffic environments involving both manned and unmanned vessels.
The proposed machine learning models demonstrate the potential to transform existing subjective and inconsistent recruitment methods into objective and standardized systems. The XGBoost model’s 94.6% F1-score can effectively filter out unsuitable candidates in the primary screening process, reducing the number of interview candidates and significantly saving recruitment costs and time. Additionally, the SVM model’s high ROC AUC (0.988) can rank candidates’ suitability probabilistically, supporting recruiters in making more sophisticated selection decisions. However, this study has several important limitations. The most significant limitation is the practical constraints of data collection. Detailed information of both successful and unsuccessful candidates in actual recruitment processes is virtually impossible to obtain due to personal information protection and institutional policies.
Consequently, this study was compelled to use simulated data, but reliability was secured by constructing a dataset close to reality through rigorous validation by 40 current VTSOs and maritime experts. Additionally, this model was developed specifically for Korea’s VTS recruitment system and educational framework, potentially lacking robustness for direct application in other countries. Particularly, VTS course completion reflects variables specific to Korea’s educational processes.
However, core characteristics such as License, Seafaring Career, Age, and TOEIC Score are globally common maritime professional evaluation criteria, and VTS systems worldwide follow common operational standards specified by IALA, making universal application possible through appropriate localization. Future research requires model validation using actual data of successful and unsuccessful candidates through cooperation with relevant institutions. Furthermore, additional research utilizing VTS-specific English examinations developed to replace current TOEIC Scores could further improve model accuracy.
As autonomous vessels increasingly operate in coastal waters and VTS areas, future studies should explore how the feature importance identified in this study—particularly the critical role of English proficiency and seafaring experience—translates to the new operational context where VTSOs must manage mixed traffic of manned and unmanned vessels. This includes investigating whether additional features related to technical aptitude for digital navigation systems, emergency response capabilities for unmanned vessels, and ability to interpret data from Shore Remote Control Centers should be incorporated into the recruitment prediction model. Development of universal models suitable for multinational VTS systems and methodological improvements to minimize subjectivity in HITL approaches while systematically incorporating more expert opinions are also important tasks. Through such developments, it will be possible to establish standardized VTSO recruitment systems that can contribute to improving maritime safety worldwide as the maritime industry transitions toward greater automation.

7. Conclusions

This study presents the first application of machine learning models combining HITL approaches with KDE-based data augmentation for VTSO recruitment, offering a methodology to transform existing subjective hiring practices into objective and quantitative evaluation systems. Unlike previous maritime personnel selection research that primarily focused on seafarer recruitment, this study is distinguished by developing prediction models that reflect the unique operational characteristics of shore-based VTSOs.
As the maritime industry enters an era of Maritime Autonomous Surface Ships (MASS), where VTSOs will face unprecedented challenges in managing mixed traffic environments of conventional and autonomous vessels, this study provides a foundational framework for developing future VTSO selection systems that can identify candidates with the competencies necessary for these evolving operational demands. The experimental results demonstrated the superiority of ensemble methods, with XGBoost achieving optimal balanced performance. Feature importance analysis revealed that English communication ability was the overwhelmingly dominant predictor, while seafaring career exhibited a non-linear pattern with optimal recruitment probability at appropriate levels of practical experience.
These findings provide scientific evidence for establishing systematic recruitment criteria by quantitatively identifying the complexity and expertise required for international maritime traffic control. In the context of emerging MASS operations, English proficiency for communication with Shore Remote Control Centers and practical seafaring knowledge for assessing autonomous vessel behaviors will become increasingly critical as VTSOs assume expanded responsibilities as primary human decision-makers in mixed traffic scenarios.
The methodological contribution of this study lies in constructing reliable prediction models through HITL labeling utilizing domain experts’ collective intelligence and KDE augmentation in limited data environments. Practically, this approach can improve recruitment process consistency and transparency, enhance the accuracy of talent selection, and contribute to administrative efficiency through pre-screening of interview candidates. However, several limitations must be acknowledged.
First, the model was developed using simulated data constructed through expert consultation rather than actual recruitment records, as real-world hiring data remains inaccessible due to privacy and institutional constraints. While the HITL approach with experienced VTSOs ensured domain validity, empirical validation with actual recruitment outcomes is necessary to confirm predictive accuracy in operational contexts. Second, the model reflects Korea-specific maritime educational systems and licensing frameworks, potentially limiting direct applicability to nations with different training pathways and regulatory structures. Third, despite rigorous expert consensus procedures, the HITL labeling process inherently retains subjective judgment, which may introduce systematic biases. Future research should address these limitations through several key directions. Collaboration with maritime authorities to collect and validate models with real-world recruitment data, cross-cultural validation studies to enhance international applicability, and incorporation of explainable AI techniques and fairness-aware machine learning approaches to improve decision transparency and address potential biases are essential.
As autonomous vessels increasingly operate in VTS areas, integrating MASS-specific competency criteria—including technical aptitude for digital navigation systems and emergency response capabilities for unmanned vessels—will be critical. Finally, developing VTS-specific English proficiency assessments that evaluate maritime terminology and emergency communication protocols would provide more targeted evaluation than general standardized tests. Through these advancements, standardized VTSO recruitment systems can be established worldwide, contributing to improved maritime safety as the industry transitions toward greater automation.

Author Contributions

Conceptualization, G.-h.S. and M.J.; methodology, G.-h.S. and M.J.; software, G.-h.S.; validation, G.-h.S. and M.J.; formal analysis, G.-h.S. and M.J.; investigation, G.-h.S.; resources, M.J.; data curation, G.-h.S.; writing—original draft preparation, G.-h.S.; writing—review and editing, G.-h.S. and M.J.; visualization, G.-h.S.; supervision, M.J.; project administration, G.-h.S. and M.J.; funding acquisition, M.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy concerns.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chong, J.C. Impact of Maritime Autonomous Surface Ships (MASS) on VTS Operations. Master’s Thesis, World Maritime University, Malmö, Sweden, 2018.
  2. Kim, D.-W.; Lee, M.-K.; Park, S.-W.; Park, Y.-S. A Study on the Introduction of Maritime Autonomous Surface Ships and the Role of Vessel Traffic Service. J. Navig. Port Res. 2023, 47, 430–436.
  3. Moreno, F.C.; Rodríguez, R.V.; Lorente, P.S.; López, M.J.U. Relationship between human factors and a safe performance of vessel traffic service operators: A systematic qualitative-based review in maritime safety. Saf. Sci. 2022, 155, 105892.
  4. Shin, G.-H.; Yang, H. Vessel trajectory prediction in harbors: A deep learning approach with maritime-based data preprocessing and berthing side integration. Ocean Eng. 2025, 316, 119908.
  5. Issa-Zadeh, S.B.; Garay-Rondero, C.L. Maritime Pilotage and Sustainable Seaport: A Systematic Review. J. Mar. Sci. Eng. 2025, 13, 945.
  6. Issa-Zadeh, S.B.; Garay-Rondero, C.L. Decarbonizing Seaport Maritime Traffic: Finding Hope. World 2025, 6, 47.
  7. IALA. IALA Guideline G1045 Staffing Levels at VTS Centres; Edition 1.2; International Association of Marine Aids to Navigation and Lighthouse Authorities: Saint Germain en Laye, France, 2022; Available online: https://www.iala.int/product/g1045/ (accessed on 25 December 2024).
  8. Papadimitriou, E.; Podofillini, L.; Brilakis, I.; Adjé, P.A.; Dang, V.N. Transport safety and human factors in the era of automation: What can transport modes learn from each other? Accid. Anal. Prev. 2020, 144, 105656.
  9. Pekdas, I.G.; Uflaz, E.; Tornacı, F.; Arslan, O.; Turan, O. Developing a machine learning-based evaluation system for the recruitment of maritime professionals. Ocean Eng. 2024, 313, 119406.
  10. Butler, C.; Oster, H.; Togelius, J. HITL AI for analysis of free response facial expression label sets. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (IVA ’20), Online, 20–22 October 2020; pp. 1–8.
  11. Zhang, S.; He, L.; Dragut, E.; Vucetic, S. How to invest my time: Lessons from HITL entity extraction. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’19), Anchorage, AK, USA, 4–8 August 2019; pp. 2305–2313.
  12. Bartolo, M.; Roberts, A.; Welbl, J.; Riedel, S.; Stenetorp, P. Beat the AI: Investigating adversarial human annotation for reading comprehension. Trans. Assoc. Comput. Linguist. 2020, 8, 662–678.
  13. Fan, X.; Li, C.; Yuan, X.; Dong, X.; Liang, J. An interactive visual analytics approach for network anomaly detection through smart labeling. J. Vis. 2019, 22, 955–971.
  14. Celik, M.; Er, I.D.; Topcu, Y.I. Computer-based systematic execution model on human resources management in maritime transportation industry: The case of master selection for embarking on board merchant ships. Expert Syst. Appl. 2009, 36, 1048–1060.
  15. Kartal, S.E.; Ugurlu, O.; Kaptan, M.; Arslanoglu, Y.; Wang, J.; Loughney, S. An analysis and comparison of multinational officers of the watch in the global maritime labor market. Marit. Policy Manag. 2019, 46, 757–780.
  16. Koutra, G.; Barbounaki, S.; Kardaras, D.; Stalidis, G. A multicriteria model for personnel selection in maritime industry in Greece. In Proceedings of the 2017 IEEE 19th Conference on Business Informatics (CBI 2017), Thessaloniki, Greece, 24–27 July 2017; Volume 1, pp. 287–294.
  17. Wang, Y.; Yeo, G.T. The selection of a foreign seafarer supply country for Korean flag vessels. Asian J. Shipp. Logist. 2016, 32, 221–227.
  18. Ding, J.F.; Liang, G.S. The choices of employing seafarers for the national shipowners in Taiwan: An empirical study. Marit. Policy Manag. 2005, 32, 123–137.
  19. Wang, Z.; Wei, W.; Xu, C.; Xu, J.; Mao, X.L. Person-job fit estimation from candidate profile and related recruitment history with co-attention neural networks. Neurocomputing 2022, 501, 14–24.
  20. Wang, Y.; Zhu, Z. The application of deep learning model in recruitment decision. Wirel. Commun. Mob. Comput. 2022, 2022, 9645830.
  21. Pessach, D.; Singer, G.; Avrahami, D.; Ben-Gal, H.C.; Shmueli, E.; Ben-Gal, I. Employees recruitment: A prescriptive analytics approach via machine learning and mathematical programming. Decis. Support Syst. 2020, 134, 113290.
  22. Al-Quhfa, H.; Mothana, A.; Aljbri, A.; Song, J. Enhancing talent recruitment in business intelligence systems: A comparative analysis of machine learning models. Analytics 2024, 3, 297–317.
  23. Wu, X.; Xiao, L.; Sun, Y.; Zhang, J.; Ma, T.; He, L. A survey of HITL for machine learning. Future Gener. Comput. Syst. 2022, 135, 364–381.
  24. Korea Coast Guard. Competition Ratios for Career-Based Competitive Recruitment of General Civil Servants. Available online: https://www.kcg.go.kr/kcg/na/ntt/selectNttList.do?mi=2797&bbsId=311 (accessed on 17 September 2025).
  25. Landis, J.R.; Koch, G.G. The measurement of observer agreement for categorical data. Biometrics 1977, 33, 159–174.
  26. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
  27. Fernández, A.; García, S.; Galar, M.; Prati, R.C.; Krawczyk, B.; Herrera, F. Learning from Imbalanced Data Sets; Springer: Cham, Switzerland, 2018.
  28. Gelman, A.; Carlin, J.B.; Stern, H.S.; Rubin, D.B. Bayesian Data Analysis, 3rd ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2013.
  29. Scott, D.W. Multivariate Density Estimation: Theory, Practice, and Visualization, 2nd ed.; Wiley: New York, NY, USA, 2015.
  30. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Chapman and Hall: London, UK, 1986.
  31. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  32. Nielsen, D. Tree Boosting with XGBoost—Why Does XGBoost Win “Every” Machine Learning Competition? Master’s Thesis, Norwegian University of Science and Technology, Trondheim, Norway, 2016.
  33. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
  34. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 4765–4774.
  35. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  36. Liaw, A.; Wiener, M. Classification and regression by random forest. R News 2002, 2, 18–22.
  37. Strobl, C.; Boulesteix, A.L.; Kneib, T.; Augustin, T.; Zeileis, A. Conditional feature importance for random forests. BMC Bioinform. 2008, 9, 307.
  38. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by backpropagating errors. Nature 1986, 323, 533–536.
  39. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
  40. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
  41. Shin, G.H.; Yang, H. Maritime accident prediction in Busan port using machine learning: An integrated approach with maritime accident reports and VTS data. Ocean Eng. 2025, 316, 119968.
Figure 1. Data preprocessing workflow for VTSO recruitment binary classification model.
Figure 2. Feature independence analysis for continuous and categorical features.
Figure 3. Distribution comparison before and after KDE-based data augmentation.
Figure 4. ROC curves comparison for four machine learning models on test dataset.
Figure 5. Confusion matrices for four machine learning models on test dataset.
Figure 6. Feature importance comparison across four machine learning models.
Figure 7. PDP for all features in VTSO recruitment prediction across four machine learning models.
Table 1. Summary of Personnel Selection Studies in Maritime Industry.

| Author(s) | Study Focus | Features | Methodology |
|---|---|---|---|
| Pekdas et al. [9] | Maritime professionals recruitment | Experience, References, English proficiency, GPA, Experience, Professional knowledge, MMPI-I scales | Decision Tree, Random Forest, Gradient Boosted Trees, Naive Bayes, PNN |
| Kartal et al. [15] | Multinational officers of the watch comparison | Cost-related (wage, social security fees, other costs), Cultural properties, Education, Professional properties | Fuzzy Analytic Hierarchy Process (FAHP) |
| Koutra et al. [16] | Maritime executive personnel selection | Trustworthiness, Responsibility, Decision making, Team spirit, Communication skills, Time management, Foreign languages, Computer skills | AHP + Correspondence Analysis (CA) |
| Wang and Yeo [17] | Foreign seafarer supply country selection for Korean flag vessels | Total crew costs, Seafarer qualifications, Government relations, Communication (language competence, cultural adaptation), Supply ability for well-trained seafarers, Government support from supply countries | Delphi, Fuzzy AHP, TOPSIS |
| Celik et al. [14] | Master selection for merchant ships | Occupational information, Professional discipline, Leadership & coaching, Personality characteristics | Analytic Network Process (ANP) |
| Ding and Liang [18] | Seafarer employment choice behavior | Crew cost, Competence & efficiency (knowledge, skills, communication, physical/psychological conditions), Quality standard system (STCW95 compliance) | Binary Logit Model |

Table 2. Feature Definitions and Characteristics for VTSO Dataset.

| Category | Feature | Type | Range/Classification | Description |
|---|---|---|---|---|
| Basic Information | Age | Continuous | 18–60 years | Applicant’s age |
| Basic Information | Education Level | Categorical | High school or below/Bachelor’s/Master’s or above | Final education level (3 levels) |
| Basic Information | Educational Institution | Categorical | Maritime university/Non-maritime university/Maritime-related high school/Maritime training institute | Classification of maritime-related educational institutions |
| Maritime Professional Experience | Officer License Grade | Categorical | 1st/2nd/3rd/4th | Maritime officer license grade (1st is highest) |
| Maritime Professional Experience | Navigation Career | Categorical | 0/0~1 year/1~3 years/3~5 years/5~10 years/10+ years | Actual navigation experience period (5 intervals) |
| Maritime Professional Experience | Last Position | Categorical | Cadet/3rd Officer/2nd Officer/Chief Officer/Master | Final position aboard ship (5 levels) |
| Competency Indicators | VTS Course Completion | Binary | Completed/Not completed | Completion of VTS-related courses at educational institution |
| Competency Indicators | TOEIC Score | Continuous | 550–985 points | Certified English proficiency test score |

Table 3. Comparison of Dataset Characteristics Before and After KDE-Based Augmentation.

| Characteristic | Original Dataset | Augmented Dataset |
|---|---|---|
| Pass Samples | 18 (18%) | 1000 (50%) |
| Fail Samples | 82 (82%) | 1000 (50%) |
| Total Samples | 100 (100%) | 2000 (100%) |

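
Table 3 reports only the class counts before and after augmentation. As a rough illustration of how density-estimation-based oversampling of a single continuous feature could work, the sketch below draws 1000 synthetic TOEIC scores from a Gaussian KDE fitted to 18 seed values. It is not the authors' implementation: the scores in `pass_toeic` are made-up placeholders rather than study data, the function names are hypothetical, the Silverman rule-of-thumb bandwidth is an assumption (the bandwidth actually used is not stated here), and the handling of categorical features and of whether the original samples are retained is left out.

```python
import random
import statistics


def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth: h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(values)
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * n ** (-0.2)


def kde_sample(values, n_new, low, high, rng):
    """Draw n_new points from a Gaussian KDE fitted to `values`.

    Sampling from a Gaussian KDE is equivalent to picking an observed
    point uniformly at random and adding N(0, h^2) noise. Draws that fall
    outside [low, high] are rejected and redrawn.
    """
    h = silverman_bandwidth(values)
    samples = []
    while len(samples) < n_new:
        x = rng.choice(values) + rng.gauss(0.0, h)
        if low <= x <= high:
            samples.append(round(x))
    return samples


# Placeholder scores for the 18 "pass" profiles (illustrative, not study data).
pass_toeic = [720, 745, 760, 780, 795, 805, 815, 830, 840,
              850, 865, 880, 890, 905, 920, 935, 950, 970]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Table 2 gives the TOEIC range as 550-985 points.
augmented_pass = kde_sample(pass_toeic, 1000, low=550, high=985, rng=rng)

print(len(augmented_pass), min(augmented_pass), max(augmented_pass))
```

Rejecting out-of-range draws keeps the synthetic scores inside the feature range from Table 2, at the cost of slightly distorting the density near the boundaries.
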
Table 4. Hyperparameter Settings for Machine Learning Models.

| Model | Parameter | Value | Search Space | Description |
|---|---|---|---|---|
| XGBoost | n_estimators | 100 | [50, 100, 200] | Number of boosting rounds |
| XGBoost | max_depth | 3 | [3, 6, 9] | Maximum tree depth |
| XGBoost | learning_rate | 0.1 | [0.01, 0.1, 0.3] | Step size shrinkage |
| XGBoost | min_child_weight | 1 | [1, 3, 5] | Minimum sum of instance weight in child |
| XGBoost | eval_metric | logloss | N/A | Evaluation metric |
| Random Forest | n_estimators | 50 | [50, 100, 200] | Number of trees in forest |
| Random Forest | max_depth | 15 | [5, 10, 15, None] | Maximum depth of trees |
| Random Forest | min_samples_split | 10 | [2, 5, 10] | Minimum samples to split node |
| Random Forest | min_samples_leaf | 1 | [1, 2, 4] | Minimum samples in leaf node |
| MLP | hidden_layer_sizes | (150, 75) | [(50,), (100,), (100,50), (150,75)] | Neurons in hidden layers |
| MLP | learning_rate_init | 0.01 | [0.001, 0.01] | Initial learning rate |
| MLP | alpha | 0.01 | [0.0001, 0.001, 0.01] | L2 regularization parameter |
| MLP | max_iter | 1000 | 1000 | Maximum iterations |
| MLP | early_stopping | True | N/A | Use early stopping |
| SVM | kernel | rbf | [rbf] | Kernel type |
| SVM | C | 0.1 | [0.1, 1.0, 10.0] | Regularization parameter |
| SVM | gamma | auto | [scale, auto, 0.001, 0.01] | Kernel coefficient |

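
To give a sense of the size of the search spaces in Table 4, the following sketch enumerates every combination per model and confirms that each selected value lies inside its search space. It assumes the search was an exhaustive grid, which this section does not state; the dictionary and function names are hypothetical, and the fixed settings (`eval_metric`, `max_iter`, `early_stopping`) are left out because they were not searched.

```python
from itertools import product

# Search spaces transcribed from Table 4 (searched parameters only).
search_space = {
    "XGBoost": {
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 6, 9],
        "learning_rate": [0.01, 0.1, 0.3],
        "min_child_weight": [1, 3, 5],
    },
    "Random Forest": {
        "n_estimators": [50, 100, 200],
        "max_depth": [5, 10, 15, None],
        "min_samples_split": [2, 5, 10],
        "min_samples_leaf": [1, 2, 4],
    },
    "MLP": {
        "hidden_layer_sizes": [(50,), (100,), (100, 50), (150, 75)],
        "learning_rate_init": [0.001, 0.01],
        "alpha": [0.0001, 0.001, 0.01],
    },
    "SVM": {
        "kernel": ["rbf"],
        "C": [0.1, 1.0, 10.0],
        "gamma": ["scale", "auto", 0.001, 0.01],
    },
}

# Selected values from the "Value" column of Table 4.
selected = {
    "XGBoost": {"n_estimators": 100, "max_depth": 3,
                "learning_rate": 0.1, "min_child_weight": 1},
    "Random Forest": {"n_estimators": 50, "max_depth": 15,
                      "min_samples_split": 10, "min_samples_leaf": 1},
    "MLP": {"hidden_layer_sizes": (150, 75),
            "learning_rate_init": 0.01, "alpha": 0.01},
    "SVM": {"kernel": "rbf", "C": 0.1, "gamma": "auto"},
}


def candidates(grid):
    """Yield every parameter combination in a grid as a dict."""
    keys = list(grid)
    for combo in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))


grid_sizes = {model: sum(1 for _ in candidates(grid))
              for model, grid in search_space.items()}

for model, size in grid_sizes.items():
    in_grid = selected[model] in list(candidates(search_space[model]))
    print(f"{model}: {size} candidates, selected setting in grid: {in_grid}")
```

Under a k-fold cross-validated grid search, the number of model fits would be each of these counts multiplied by k.
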
Table 5. Hardware Specifications and Software Configuration for Model Implementation.

| Component | Specification |
|---|---|
| CPU | AMD Ryzen 5 2600 Six-Core Processor 3.40 GHz |
| GPU | NVIDIA GeForce RTX 3060 Dual OC D6 12 GB |
| Memory | 16 GB |
| Programming Language | Python 3.8.18 |
| OS Platform | Windows 10 x64 |
| Machine Learning Library | XGBoost 2.1.1, Scikit-learn 1.3.2 |
| Supporting Packages | Pandas 2.0.3, Matplotlib 3.7.5, NumPy 1.24.3 |

Table 6. Performance Evaluation of Machine Learning Models on Test Dataset.

| Model | Accuracy | Precision | Recall | F1-Score | ROC AUC |
|---|---|---|---|---|---|
| XGBoost | 0.945 ± 0.008 | 0.932 ± 0.010 | 0.960 ± 0.012 | 0.946 ± 0.009 | 0.981 ± 0.007 |
| Random Forest | 0.930 ± 0.012 | 0.913 ± 0.015 | 0.950 ± 0.014 | 0.931 ± 0.013 | 0.975 ± 0.011 |
| MLP | 0.905 ± 0.015 | 0.878 ± 0.018 | 0.940 ± 0.016 | 0.908 ± 0.005 | 0.968 ± 0.013 |
| SVM | 0.930 ± 0.011 | 0.898 ± 0.013 | 0.970 ± 0.010 | 0.933 ± 0.009 | 0.988 ± 0.006 |

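
The F1-Score column in Table 6 can be cross-checked against the precision and recall columns, since F1 is their harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes it from the reported mean values; the variable and function names are hypothetical. Because the mean of per-run F1 scores need not equal the F1 of the mean precision and recall, only approximate agreement should be expected.

```python
# Mean precision, recall, and F1 as reported in Table 6.
reported = {
    "XGBoost": (0.932, 0.960, 0.946),
    "Random Forest": (0.913, 0.950, 0.931),
    "MLP": (0.878, 0.940, 0.908),
    "SVM": (0.898, 0.970, 0.933),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {model: f1_from(p, r) for model, (p, r, _) in reported.items()}

for model, (p, r, f1) in reported.items():
    diff = abs(recomputed[model] - f1)
    print(f"{model}: reported F1 = {f1:.3f}, "
          f"recomputed = {recomputed[model]:.3f}, |diff| = {diff:.4f}")
```
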
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shin, G.-h.; Jung, M. Preparing VTS for the MASS Era: A Machine Learning-Based VTSO Recruitment Model. J. Mar. Sci. Eng. 2025, 13, 2127. https://doi.org/10.3390/jmse13112127
