Article

Predicting Early Employability of Vietnamese Graduates: Insights from Data-Driven Analysis Through Machine Learning Methods

by Long-Sheng Chen 1,2, Thao-Trang Huynh-Cam 2,*, Van-Canh Nguyen 3, Tzu-Chuen Lu 2 and Dang-Khoa Le-Huynh 4
1 Department of Industrial Engineering and Management, National Taipei University of Technology, Taipei 106344, Taiwan
2 Department of Information Management, Chaoyang University of Technology, Taichung 413310, Taiwan
3 Office of Quality Assurance, Dong Thap University, Cao Lanh City 81118, Vietnam
4 Department of Industrial Engineering and Management, Hsiuping University of Science and Technology, Taichung 412406, Taiwan
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(5), 134; https://doi.org/10.3390/bdcc9050134
Submission received: 9 April 2025 / Revised: 13 May 2025 / Accepted: 14 May 2025 / Published: 19 May 2025

Abstract

Graduate employability remains a crucial challenge for higher education institutions, especially in developing economies. This study investigates the key academic and vocational factors influencing early employment outcomes among recent graduates at a public university in Vietnam’s Mekong Delta region. By leveraging predictive analytics, the research explores how data-driven approaches can enhance career readiness strategies. The analysis employed AI-driven models, particularly classification and regression trees (CARTs), using a dataset of 610 recent graduates from a public university in the Mekong Delta to predict early employability. The input factors included gender, field of study, university entrance scores, and grade point average (GPA) scores for four university years. The output factor was recent graduates’ (un)employment within six months after graduation. Among all input factors, third-year GPA, university entrance scores, and final-year academic performance are the most significant predictors of early employment. Among the tested models, CARTs achieved the highest accuracy (93.6%), offering interpretable decision rules that can inform curriculum design and career support services. This study contributes to the intersection of artificial intelligence and vocational education by providing actionable insights for universities, policymakers, and employers, supporting the alignment of education with labor market demands and improving graduate employability outcomes.

1. Introduction

Graduate employability has become a key performance indicator for higher education institutions (HEIs), particularly as labor markets become more competitive and skill-based employment trends evolve [1,2]. Universities are increasingly assessed not only on academic excellence but also on their ability to equip students with the necessary skills and qualifications for the workforce [3]. In emerging economies such as Vietnam, employability is a pressing concern, particularly for graduates from provincial universities that may lack the prestige and employer recognition of elite institutions [4,5]. The need to enhance employability outcomes has placed significant pressure on universities to adapt their curricula and support systems to better prepare students for early career success [6].
Vietnam has witnessed a significant increase in university enrollment over the past decade, particularly in rural regions such as the Mekong Delta. According to the General Statistics Office of Vietnam [7], the number of students in the region increased from 127,379 in 2018 to 160,653 in 2020, reflecting a growing demand for higher education. However, employment within six months of graduation remains unguaranteed for many graduates, especially those from lesser-known institutions. The General Statistics Office of Vietnam [8] reported an overall unemployment rate of 2.03% in the first quarter of 2024, with rural unemployment at 2.58%, which is significantly higher than the urban rate of 1.20%. Since 2016, universities have been required to report graduates’ employment status within 12 months after graduation. The student recruitment quota decisions from the government for universities have been based on these reported results [9]. These figures highlight the need for universities to take proactive measures in improving recent graduate employability [6] through data-driven interventions.
Despite the growing attention on employability, many previous studies have relied on traditional survey methods to assess graduate outcomes [9,10,11]. While these approaches provide valuable insights into employment trends, they often lack predictive power and fail to identify specific academic and vocational factors that contribute to successful job placement. Traditional survey methods also pose several challenges, including complex analysis, difficult data collection, recording, and scoring, risk of bias, and low reliability in predicting future outcomes [12], which usually lead to greater computational costs. Consequently, there is a need for models tailored to diverse educational settings to address these limitations and challenges, particularly in remote areas such as Vietnam’s Mekong Delta. Recent advances in artificial intelligence (AI) and machine learning (ML) offer novel opportunities for analyzing large-scale graduate employment data, allowing universities to make evidence-based decisions on curriculum improvements and career support programs [13,14,15].
ML models, particularly decision trees (DTs) and support vector machines (SVMs), have demonstrated strong predictive capabilities in employability research. Studies have shown that academic performance, such as grade point average (GPA), university admission scores, and extracurricular activities, significantly influence job placement outcomes [2,16]. Decision tree-based models, such as classification and regression trees (CARTs), have been widely used due to their ability to provide interpretable decision rules, helping policymakers and educators understand key employability determinants [17,18]. These findings align with research indicating that GPA and soft skills are among the most critical factors in employers’ hiring decisions [19].
Moreover, when building ML models, we must solve class imbalance problems. This study employs both oversampling (the synthetic minority oversampling technique, SMOTE) and undersampling (random undersampling). Class imbalance can bias machine learning models toward the majority class, reducing sensitivity to minority class patterns that are often of critical importance in behavioral prediction tasks [20]. SMOTE generates synthetic examples by interpolating between minority class instances, effectively increasing representation without mere duplication [21]. Meanwhile, random undersampling reduces the size of the majority class to balance the dataset, minimizing computational costs and potential overfitting [22]. By integrating both approaches, this study aims to enhance classifier performance and achieve a more equitable decision boundary across classes.
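The two resampling ideas described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, the function names `random_undersample` and `smote_like_oversample` are hypothetical, and the (admission score, year-3 GPA) pairs are invented for illustration. The oversampler follows the interpolation idea of SMOTE (a synthetic point on the segment between a minority instance and one of its nearest minority neighbors) in a simplified form.

```python
import math
import random

def random_undersample(majority, target_size, rng):
    """Keep a random subset of the majority class (sampling without replacement)."""
    return rng.sample(majority, target_size)

def smote_like_oversample(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating toward a near minority neighbor."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbors of the chosen point (excluding itself).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

rng = random.Random(0)
# Hypothetical (admission score, year-3 GPA) pairs, NOT taken from the study data.
employed = [(18.5, 7.9), (19.2, 8.1), (17.8, 7.2), (20.1, 8.4), (18.9, 7.6), (19.7, 8.0)]
unemployed = [(16.9, 6.1), (17.3, 6.4), (16.2, 5.9)]

# Undersampling: shrink the majority class to the minority size.
under = random_undersample(employed, len(unemployed), rng)
# Oversampling: grow the minority class to the majority size.
extra = smote_like_oversample(unemployed, len(employed) - len(unemployed), k=2, rng=rng)
over = unemployed + extra
```

Because each synthetic point lies between two real minority instances, it stays inside the region already occupied by the minority class rather than duplicating an existing record.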
Building upon this foundation, the present work aims to predict the early employability of recent graduates in Vietnam using an AI-driven analysis of academic and vocational factors. This study focuses on a dataset comprising 610 graduates from a public university in the Mekong Delta, analyzing their academic records based on the Vietnamese GPA scale (0–10 points) and employment status within six months of graduation. By applying CART models and other predictive techniques, this research seeks to identify the most influential factors that contribute to early job placement and provide actionable insights for HEIs to enhance recent graduate employability.
The significance of this study lies in its ability to bridge the gap between vocational education research and data-driven employability analysis. Unlike previous studies that primarily focus on theoretical discussions of employability frameworks [23] or questionnaire-based analyses [9,10,11,18,24,25,26], this research leverages real-world student data to extract meaningful patterns and inform educational policy. In data science, the integration of relevant sciences in observed and empirical contexts has been significantly researched [27]. The findings are expected to guide universities in refining their academic programs, improving career counseling services, and aligning graduate competencies with labor market demands, ultimately fostering a more effective transition from education to employment.
In the following sections, this paper discusses the methodology used to construct predictive employability models, presents the key findings from the analysis, and explores the implications of these results for universities, policymakers, and industry stakeholders. By integrating AI techniques with vocational education research, this study contributes to the ongoing efforts to enhance recent graduate employability and ensure that students are better equipped for the evolving job market. Additionally, our proposed models, which use academic performance data already recorded in the school database, can minimize computational costs and serve as early warning systems. With the aid of these lower-cost prediction models, university leaders can take early, data-driven measures to improve employability, both before students in a remote area start their first year and before they graduate.

2. Related Works

2.1. Graduate Employability Predictions

Employability refers to an individual’s ability to secure a graduate-level job, retain employment, and transition to new job opportunities when necessary [28]. Recent or fresh graduates are those who have recently completed their university degrees and are in the early stages of their careers [29]. Given the increasing importance of employability, numerous studies have employed AI-driven and ML models to predict recent graduates’ job prospects [16,17,18]. These models can precisely and rapidly inform HEI policymakers about which graduates are likely to be (un)employed early and which factors drive early (un)employment. With this rapid and accurate information, HEI policymakers can provide supporting measures to help students land a stable job within six months of the university-to-work transition. Table 1 summarizes recent studies on graduate employability. The literature provides evidence that AI-driven analysis and ML methods have become increasingly valuable tools for predicting graduates’ employability in diverse educational contexts, such as India [13,17], the Philippines [16], China [30], Egypt [18], Jordan [31], and Nigeria [32]. Thus, this study seeks to predict the early employability of Vietnamese graduates during the six months of the university-to-work transition. The aim is to provide an intersection of AI-driven data and vocational education, as well as actionable insights for universities, policymakers, and employers, supporting the alignment of education with labor market demands and improving graduate employability outcomes.
Table 1. Summary of recent studies on graduate employability prediction.
| Source | Research Context | Tasks | Methods | Results |
|---|---|---|---|---|
| Mishra et al. [13] | Master of Computer Applications students in India | Classification | DT, RF, NB, MLP, SMO | DT: 70.19% |
| Casuat and Festijo [16] | Graduates of engineering in the Philippines | Classification | DT, RF, SVM | DT: 85%; RF: 84%; SVM: 91.22% |
| He et al. [30] | China | Classification; Regression | DT, RF | 77.18% |
| ElSharkawy et al. [18] | 296 graduates of IT and several IT employers in Egypt | Classification | DT, NB, LR, RF, SVM | DT: 100%; LR and SVM: 98% |
| Alkashami et al. [31] | Three Jordanian universities | Statistics; Classification | Adaptive neuro-fuzzy inference system (ANFIS), DT, SVM, NB, MLP | DT: F1 of 82.4%; SVM: F1 of 77.8% |
| Baffa et al. [32] | Nigeria | Clustering; Classification | RF, LR, DT | DT: 97%; RF: 98% |
| Jayachandran and Joshi [17] | India | Classification | RF, DT, k-nearest neighbor (KNN), SVM, gradient boosting, adaptive boosting, and extreme gradient boosting | SVM: 74.37% |

2.2. Machine Learning Model Selection

Based on the evidence in Table 1, among the various AI-driven models, DT and SVM have emerged as prominent methods for employability prediction. Several studies have demonstrated the efficacy of the DT and SVM algorithms in different educational and labor market contexts. For instance, the authors in [17] examined the employability of engineering students by utilizing multiple ML algorithms, including random forest (RF), DT, k-nearest neighbor (KNN), SVM, gradient boosting, adaptive boosting, and extreme gradient boosting. Their findings indicated that the SVM outperformed the other algorithms, achieving an accuracy rate of 74.37%.
Similarly, the authors in [31] investigated the early employment readiness of graduates from three Jordanian universities using an adaptive neuro-fuzzy inference system (ANFIS) and DT, SVM, naïve Bayes (NB), and multilayer perceptron (MLP) algorithms. Their results demonstrated that DT and SVM performed well, achieving F1 scores of 82.4% and 77.8%, respectively.
Further supporting the effectiveness of DT and SVM, Kumar and Babu [2] predicted the employability of Indian engineering students by analyzing demographic variables, university factors, extracurricular activities, and family backgrounds. Their study found that DT and SVM yielded the best performance, with an accuracy of 98%.
Additionally, Casuat and Festijo [16] examined employability prediction for graduates in the Philippines, incorporating personal skills, communication skills, and grade point average (GPA) as key factors. They employed DT, RF, and SVM, concluding that the SVM was the most effective algorithm with an accuracy of 91.22%, followed by DT at 85%.

2.2.1. Decision Trees (DTs)

DTs are supervised learning algorithms that are widely employed for classification and regression tasks. A DT is a tree-shaped graph: factors that appear in the tree are confirmed as important, while those that do not appear are regarded as unimportant [15,33]. DTs have numerous advantages, such as (1) ease of understanding and interpretation through visualized flow charts or IF-THEN rules [15,33], (2) high prediction accuracy (Table 1), (3) the ability to handle both numeric and categorical data, and (4) the ability to address overfitting problems when they are properly pruned [34]. One of the most popular DT algorithms is CART (classification and regression trees). CARTs typically use the Gini impurity or entropy criterion to measure the degree of impurity or randomness in a set of labels. Hence, this study employs CARTs to predict early employability and identify key factors for early employment. The C5.0 algorithm is also used as a baseline for comparison.
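As a small illustration of the two impurity criteria mentioned above, the following sketch computes Gini impurity and entropy for a set of class labels, and the size-weighted Gini impurity of a candidate split. The function names and the example labels are assumptions made for this sketch; they are not taken from the study.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy (in bits) of the class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def weighted_gini(left, right):
    """Gini impurity of a split, weighting each child node by its size."""
    n = len(left) + len(right)
    return len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)

# Hypothetical node with 6 employed and 4 unemployed graduates.
node = ["employed"] * 6 + ["unemployed"] * 4
# Hypothetical split of that node into two children.
left = ["employed"] * 5 + ["unemployed"] * 1
right = ["employed"] * 1 + ["unemployed"] * 3

parent_gini = gini_impurity(node)        # 1 - (0.6**2 + 0.4**2) = 0.48
split_gini = weighted_gini(left, right)  # lower than the parent, so the split helps
```

A tree-growing procedure of this kind prefers the split whose weighted impurity is lowest, which is why a factor that never yields a useful reduction does not appear in the tree.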

2.2.2. Support Vector Machine (SVM)

An SVM is a supervised learning method used for classification, regression, and outlier detection problems. The SVM is one of the best prediction methods based on statistical learning frameworks. It constructs a hyperplane (the best separating boundary) that segregates a multi-dimensional space into classes so that a new data point can be placed into the correct category [17]. This research employs an SVM to address overfitting issues.

2.3. Key Factors for Graduate Employability

University graduates’ employability relies on several factors (Table 2). These factors include gender [32,35,36], field of study [35,37], university admission scores, extracurricular activities [2,16], and academic performance [16,30,32,35,36]. Since these factors have been used in many educational contexts, they were employed here to predict the early employability of recent Vietnamese graduates and to allow benchmarking against similar educational contexts.
The literature suggests that using an AI-driven analysis of academic and vocational factors to predict the employability of recent graduates from an emerging university in Vietnam is a novel undertaking. Thus, this study seeks to address this gap by applying a DT algorithm with CART and C5.0, as well as an SVM algorithm, to predict early employability during the 6-month university-to-work transition of recent graduates of a Vietnamese university in the Mekong Delta region, using the following factors: gender, field of study, university entrance scores, and GPA scores for four university years. It also identifies key determinants influencing early employment outcomes. The aim is to contribute to early job placement and provide actionable insights for HEIs to enhance graduate employability.
Table 2. Key factors for graduate employability.
| Factors | Description | References |
|---|---|---|
| Major | Fields of study in the degree programs that graduates were enrolled in. | [35,37] |
| Gender | The gender of graduates: female or male. | [32,35,36,37] |
| University entrance scores | The scores for university entrance exams used as admission criteria. | [2,13,16] |
| Academic performance | GPA scores and scores for assignments, projects, internships, etc. Weighted grades of degree courses. | [16,17,30,32,35,36,37] |

3. Methodology

The implementation process of this work is displayed in Figure 1.
Step 1. Data collection, data preprocessing, and samples
The dataset used in the present study was obtained from graduates who recently completed a four-year bachelor’s degree at a Vietnamese public university in the Mekong Delta in June 2023. The dataset was directly obtained from the school database, and graduates’ identities were kept anonymous for ethical reasons. Data preprocessing was performed using Microsoft Excel 2016 to ensure accuracy. Graduates with missing values, those who were unemployed and looking for jobs at the time of this research, those who had part-time jobs, those who were employed more than 6 months after graduation, and those who were engaged in a course of study, training, or research were excluded. After these exclusions, the final dataset used for constructing the ML models included 610 graduates. Table 3 describes the demographic statistics of the final dataset.
Table 4 describes the input and output factors. The input factors comprise the students’ major, gender, admission scores, and GPA during the four academic years of the bachelor’s program. The output factor is recent graduates’ (un)employment within 6 months after graduation. For ease of interpretation and data consistency, we used ordinal encoding for the categorical data: major and gender. It is unnecessary to transform the admission scores and the GPA of the four academic years since they are already numerical data. Figure 2 presents the correlation matrix for these factors.
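Ordinal encoding of the two categorical factors can be sketched as follows. This is only an illustration: the text does not state which integer was assigned to which major or gender, so the alphabetical ordering, the helper name `fit_ordinal_encoder`, and the example category values are all assumptions.

```python
def fit_ordinal_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}

# Hypothetical records; the major names are illustrative only.
majors = ["Tourism", "Accounting", "Mathematics", "Accounting"]
genders = ["Female", "Male", "Female", "Male"]

major_codes = fit_ordinal_encoder(majors)    # {'Accounting': 0, 'Mathematics': 1, 'Tourism': 2}
gender_codes = fit_ordinal_encoder(genders)  # {'Female': 0, 'Male': 1}

# Replace each category by its integer code; numeric factors need no encoding.
encoded = [(major_codes[m], gender_codes[g]) for m, g in zip(majors, genders)]
```

One consequence of this encoding is that a tree can split on thresholds such as "Major ≤ 4.5", which groups majors by their code rather than by any intrinsic order.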
Data were also normalized using Equation (1).
$$X_{nor} = \frac{X - X_{min}}{X_{max} - X_{min}} \quad (1)$$
where $X$ is the original value, $X_{max}$ is the maximum value, $X_{min}$ is the minimum value, and $X_{nor}$ is the normalized value.
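The min-max normalization in Equation (1) can be sketched in a few lines. The helper name `min_max_normalize` and the example GPA values are assumptions for illustration; the handling of a constant column (returning zeros) is also an assumption, since the text does not cover that case.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using (X - X_min) / (X_max - X_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical year-3 GPA values on the 0-10 Vietnamese scale.
gpa_year3 = [5.85, 6.63, 8.32, 7.10]
normalized = min_max_normalize(gpa_year3)
# The minimum maps to 0.0 and the maximum maps to 1.0.
```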
Step 2. ML model construction
We used the output factor “employed” as a class label to classify the dataset into three cases, as shown in Figure 3. Case 1 used the original data as a baseline for comparison. Case 2 randomly undersampled the majority class “employed”, while Case 3 oversampled (SMOTE) the minority class “unemployed” to address problems relating to imbalanced data, which often cause low accuracy. Figure 3 shows the number of graduates in the three classification cases.
The present study constructed ML models using C5.0, CART, and SVM models. The C5.0 model was constructed on See5 software, which is available at https://www.rulequest.com/see5-win.html (accessed on 1 February 2025). The CART and SVM models were constructed on Jupyter Notebook software version 6.5.4 in the Python 3 (ipykernel) language, which is available at https://jupyter.org/ and https://scikit-learn.org/stable/ (accessed on 1 February 2025). The dataset was split into ten folds for training and testing in an 80:20 ratio (Table 5). Every model was constructed using these training–testing folds in accordance with the following steps:
(1) Set the parameters of each model as shown in Table 6;
(2) Process the ML models with training, testing, and cross-validation;
(3) Repeat Steps (1)~(2) ten times;
(4) Calculate the mean value and standard deviation (SD) to compare the prediction performance among the constructed AI models.
Table 5. Data division for ML model construction.
| Dataset | Number of Graduates | Percentage |
|---|---|---|
| Training set | 488 | 80% |
| Testing set | 122 | 20% |
| Total | 610 | 100% |
Table 6. Parameters used for building ML models.
| Model | Parameters | Parameter Values |
|---|---|---|
| SVM | Kernel | rbf |
| SVM | C | 1 |
| CART | Criterion | Gini * |
| CART | Max depth | 3, 4, 5 for pruning the tree |
| CART | Random state | None |
| C5.0 | Default in the See5 software | |
* Note: To measure the impurity and avoid overfitting. The Gini value ranges from 0–1, where 0 is perfectly pure [38].
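The repeat-and-average procedure in the steps above (an 80:20 split, repeated ten times, then mean and SD of the accuracy) can be sketched as follows. This is not the authors' pipeline: it uses a one-threshold decision stump as a stand-in for the actual models, and the 610 records are synthetic. The names `split_80_20`, `fit_stump`, and `stump_accuracy` are hypothetical.

```python
import random
import statistics

def split_80_20(records, rng):
    """Shuffle a copy of the records and cut it into 80% training and 20% testing."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

def fit_stump(train):
    """Pick the GPA threshold with the best training accuracy (a depth-1 stand-in for a tree)."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({gpa for gpa, _ in train}):
        correct = sum((gpa > threshold) == label for gpa, label in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def stump_accuracy(threshold, test):
    """Share of test records whose label matches the rule 'GPA > threshold'."""
    return sum((gpa > threshold) == label for gpa, label in test) / len(test)

rng = random.Random(42)
# Synthetic stand-in data: 610 (year-3 GPA, employed?) pairs, NOT the study's records.
records = []
for _ in range(610):
    gpa = round(rng.uniform(5.0, 9.5), 2)
    employed = (gpa > 6.6) if rng.random() < 0.9 else (gpa <= 6.6)
    records.append((gpa, employed))

scores = []
for _ in range(10):  # repeat the split, training, and testing ten times
    train, test = split_80_20(records, rng)
    scores.append(stump_accuracy(fit_stump(train), test))

mean_acc = statistics.mean(scores)
sd_acc = statistics.stdev(scores)
```

Reporting the SD alongside the mean shows how much the accuracy depends on which graduates happen to fall into the testing set.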
Step 3. Evaluation
The overall accuracy, confusion matrix, receiver operating characteristic (ROC) curve, and area under the ROC curve (AUC) were used to evaluate the performance of the constructed AI models; these measures were adapted from [39] in order to avoid misinterpretation due to class imbalance. The accuracy and confusion matrix values range from 0 to 1, where 1 represents excellent prediction. The value of the ROC–AUC ranges from 0.5 to 1, where 1 represents excellent prediction performance.
(a) Confusion matrix: Table 7 describes the confusion matrix values used in the present study. TP means that actually employed recent graduates are correctly predicted as “employed”; TN means that actually unemployed recent graduates are correctly predicted as “unemployed”. FN and FP are errors: FN means that actually employed recent graduates are incorrectly predicted as “unemployed”, and FP means that actually unemployed recent graduates are incorrectly predicted as “employed”.
(b) Accuracy is measured through Equation (2):
$$Accuracy = \frac{TP + TN}{TP + FP + FN + TN} \quad (2)$$
(c) ROC–AUC: The ROC curve is a plot utilized to visualize the behavior of classification models. The AUC is used to measure whether a model can classify “employed” and “unemployed” classes correctly.
Table 7. Confusion matrix values.
| | Positive: prediction is “employed” | Negative: prediction is “unemployed” |
|---|---|---|
| Recent graduate is actually “employed” | True Positive (TP) | False Negative (FN) |
| Recent graduate is actually “unemployed” | False Positive (FP) | True Negative (TN) |
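The evaluation measures above can be computed directly from predictions, as in the following sketch. The labels and scores are invented for illustration, the function names are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen “employed” graduate receives a higher score than a randomly chosen “unemployed” one), which is equivalent to the area under the ROC curve.

```python
def confusion_counts(actual, predicted, positive="employed"):
    """Return (TP, FN, FP, TN) with 'employed' as the positive class, as in Table 7."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, fn, fp, tn

def accuracy(tp, fn, fp, tn):
    """Equation (2): correct predictions divided by all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)

def roc_auc(actual, scores, positive="employed"):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes for eight graduates (not study data).
actual = ["employed", "employed", "employed", "unemployed",
          "unemployed", "employed", "unemployed", "employed"]
predicted = ["employed", "employed", "unemployed", "unemployed",
             "employed", "employed", "unemployed", "employed"]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.95]  # predicted probability of "employed"

tp, fn, fp, tn = confusion_counts(actual, predicted)  # (4, 1, 1, 2)
acc = accuracy(tp, fn, fp, tn)                        # 6 / 8 = 0.75
auc = roc_auc(actual, scores)                         # 14 of 15 pairs ranked correctly
```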
Step 4. Important feature selection
After comparing the classification performances of the constructed ML models, we selected the best-performing data fold and classification cases to identify the most influential factors contributing to the early (un)employment of recent graduates, determine inter-relationships among these factors, and interpret the extracted knowledge.
Step 5. Discussion and Conclusion
Based on the extracted key factors, we suggested potential solutions to increase the employability rate of recent graduates. Then, we summarized these findings, highlighted contributions and limitations, and outlined directions for future research on recent graduate employability.

4. Results

4.1. Results of the Classification

This section reports the results of the three classification cases. As can be seen from Table 8, the overall accuracy for Case 3 is higher than for Cases 1 and 2. The CART model performed best (93.60%), and the SVM performed worst (62.10%). Hence, CART is used for further analyses.
For better insights into the prediction performance, the confusion matrix and the ROC–AUC were computed. Figure 4 compares the confusion matrix values of the CART model among the three classification cases. It is obvious in Figure 4c that this value for Case 3 is close to 1, confirming that the CART model constructed in Case 3 is excellent. This means that the CART built in Case 3 can predict employed and unemployed recent graduates correctly. Thus, it was used to identify important factors and determine their inter-relationship.

4.2. Results of Importance Features

Figure 5 ranks the factors by importance according to the CART model. The three most important factors are the GPA in year 3, the GPA in year 4, and the admission score. The least important factor is gender [3,13,19,30].
Although the most important factors were determined, the inter-relationship among them has remained unknown. Therefore, we used the CART to extract knowledge rules, which are expected to help university policymakers enhance graduate employment percentages at the early stages. The guidance and programming codes for extracting these rules are available at: https://github.com/justmarkham/scikit-learn-tips/blob/master/notebooks/24_decision_tree_visualization.ipynb (accessed on 1 February 2025) and https://www.youtube.com/watch?v=EMcNjJ6Gj8w (accessed on 1 February 2025).

5. Discussion

5.1. Research Findings

In total, there are 14 rules that potentially lead recent graduates to employment or unemployment within 6 months of graduation. Rules 1–8 are for employment, whilst rules 9–14 are for unemployment.
  • Employment: Rules 1~8
    (1) If GPA_Year 3 = 5.85~8.32, GPA_Year 4 ≤ 8.54, and Major ≤ 20.5, then employability = Yes;
    (2) If Admission score > 18.64, GPA_Year 3 > 8.2, and GPA_Year 4 ≤ 8.8, then employability = Yes;
    (3) If Major = 1~1, GPA_Year 2 = 6.65~6.92, and GPA_Year 3 > 6.63, then employability = Yes;
    (4) If GPA_Year 4 > 8.54, then employability = Yes;
    (5) If GPA_Year 3 > 5.84 and Major ≤ 4.5, then employability = Yes;
    (6) If Major = 16~21, Admission score ≤ 17.17, and GPA_Year 3 > 6.77, then employability = Yes;
    (7) If Major = 6~21 and Admission score ≤ 16.23, then employability = Yes;
    (8) If Gender = Male, Admission score > 17.43, and GPA_Year 1 > 7.7, then employability = Yes.
  • Unemployment: Rules 9~14
    (9) If Major = 18~21, Admission score ≤ 20.87, and GPA_Year 1 ≤ 7.16, then employability = No;
    (10) If Admission score < 17.9 and GPA_Year 3 = 5.81~6.63, then employability = No;
    (11) If Major = 5~21, Gender = Female, GPA_Year 2 = 7.52~8.48, GPA_Year 3 > 8.23, and GPA_Year 4 > 8.85, then employability = No;
    (12) If Major = 5~16, Gender = Female, GPA_Year 2 ≤ 7.52, and GPA_Year 3 > 7.85, then employability = No;
    (13) If Gender = Male and GPA_Year 1 < 7.77, then employability = No;
    (14) If Major = 5~7, GPA_Year 1 ≤ 6.57, and GPA_Year 2 = 6.37~6.92, then employability = No.
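To show how such IF-THEN rules could be applied to a student record, the sketch below transcribes three of the rules listed above (Rules 2, 4, and 10) into a small function. The function name `apply_rules`, the order in which the rules are checked, and the treatment of interval bounds such as 5.81~6.63 as inclusive are assumptions of this sketch; the full rule set and its original tree structure are not reproduced here.

```python
def apply_rules(admission, gpa3, gpa4):
    """Return 'Yes', 'No', or None when none of the three transcribed rules fires."""
    # Rule 4: a high final-year GPA alone predicts employment.
    if gpa4 > 8.54:
        return "Yes"
    # Rule 2: high admission score, high year-3 GPA, and year-4 GPA up to 8.8.
    if admission > 18.64 and gpa3 > 8.2 and gpa4 <= 8.8:
        return "Yes"
    # Rule 10: low admission score with a year-3 GPA between 5.81 and 6.63
    # (bounds assumed inclusive).
    if admission < 17.9 and 5.81 <= gpa3 <= 6.63:
        return "No"
    # Not covered by this subset of rules.
    return None

# Hypothetical students (admission score, year-3 GPA, year-4 GPA).
at_risk = apply_rules(admission=17.0, gpa3=6.0, gpa4=7.0)      # "No"
strong = apply_rules(admission=19.0, gpa3=8.5, gpa4=8.0)       # "Yes"
uncovered = apply_rules(admission=18.0, gpa3=7.0, gpa4=7.0)    # None
```

Used this way, a rule that returns "No" for a third-year student can act as an early warning flag for academic mentoring or career counseling.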
From the above extracted rules, it can be concluded that the GPA in year 3 and year 4, university entrance exam score, and major can contribute to recent graduate (un)employment. The most impactful rules (i.e., the first two rules for (un)employment) emphasize the following points:
  • In the academic curriculum of the research university, the third and fourth years consist of professional subjects and industry internships, which are very much in demand by recruiters and potentially provide job offers after internships. Graduates of almost all majors, except Tourism, with a high GPA in year 3 (ranging from 5.85 to 8.32 points) and in year 4 (≤ 8.54 points) are employable within 6 months after graduation. Similarly, students whose GPA in their third year is lower than 5.81 are not employable even if they have high admission scores;
  • The university entrance exam score is important for recent graduates’ employability. It has a strong relationship with their GPA in the third year. Students with admission scores of > 18.64 points, which is 2 points higher than the minimum score, and a GPA in their third year higher than 8.23 points, will be employed. Students with admission scores lower than 17.9 points and a GPA in year 3 between 5.81 and 6.63 points will be unemployed;
  • Most majors offered by the research university, except Tourism, contribute substantially to the early employability of recent graduates. Graduates of Tourism are unemployed, perhaps owing to the aftereffects of the COVID-19 pandemic.
The results indicate that academic performance, particularly in the later years of university, has a substantial impact on employability. The strong correlation between third-year GPA and employment outcomes suggests that students who establish a solid academic foundation before their final year have a higher likelihood of securing jobs quickly. GPA is frequently used as a general screen for job applicants’ resumes. Applicants with higher GPAs gain better evaluations and more opportunities for job interviews since good scores are often considered a sign of intelligence, motivation, and other job skills [19]. This finding is consistent with prior studies highlighting the predictive power of cumulative academic achievements in hiring decisions [3,13,19,30]. Furthermore, university entrance scores emerged as a significant predictor, indicating that students with higher initial academic capabilities tend to maintain strong performance throughout their studies, ultimately improving their employability prospects.
Another notable finding is the relatively lower importance of gender in employability prediction. While previous research has highlighted gender disparities in labor market access [17], this study found that academic achievements and admission scores outweighed gender differences in determining early employment. This suggests that, at least within this dataset, academic merit plays a dominant role in shaping employment opportunities, although further research is needed to explore potential industry-specific biases.

5.2. Discussions on Research Findings

The findings from this study highlight the need for consistent internal and external actions to enhance recent graduate employability. Internally, universities should modify and/or update majors and their corresponding curricula to match the demands of labor markets [3]. In the current highly competitive labor markets, up-to-date, career-aligned, practical, and systematic academic programs could better equip students for evolving job markets. Curriculum designs should emphasize skill-based learning alongside theoretical knowledge. The curriculum should be promptly developed and adapted to satisfy students’ needs [25]. A can-do curriculum and/or syllabus in accordance with Bloom’s taxonomy, which prioritizes what students can do after learning, should be encouraged over a textbook-based curriculum. Students and industry employers should be involved. Professional subjects and practical skills in specific majors should be reviewed frequently and added or removed to ensure a match between academics and labor market demands, since the major/field of study strongly impacts employability [25,35,37]. A good education–job match would provide recent graduates with greater employability since it reduces the number of early-career employees who have earned a degree but lack skills or competencies [40]. Practical, technical, foreign language, and research skills should be added to the academic curriculum. Overseas internships should be encouraged further. These efforts can equip graduates for expected workplace expansions (i.e., working both within Vietnam and between countries worldwide).
Given that the GPA in later years significantly influences employability, universities can implement more targeted interventions, such as tutoring programs and academic mentoring, to support students at critical stages of their education. Study counseling teams can support students’ learning, since new students may be unsure how many credits they should select and which learning methods would improve their performance.
Career counseling services should focus on preparing students early in their academic journey, helping them develop competencies aligned with employer expectations. Personalized career coaching based on academic performance data can help students proactively address employability challenges. As university entrance scores are shown to have predictive power, universities could refine their admissions criteria to select students with stronger academic potential, ultimately improving overall employment outcomes.
For sustainable development, university policymakers can leverage academic performance indicators to identify high-potential candidates. Developing AI-based screening tools that assess GPA trends and academic progress can enhance recruitment strategies. In addition, while academic performance is a key employability predictor, organizations should also invest in onboarding programs that equip graduates with workplace skills, such as communication and teamwork, to complement their technical knowledge.
Governments can introduce policies that incentivize partnerships between universities and industries, facilitating internships and work-integrated learning programs that improve job readiness. Government policymakers should encourage universities to adopt AI-driven analytics for tracking graduate employment trends. Such data-driven approaches can help refine national education policies and ensure alignment with labor market demands. Data have the potential to become meaningful, practical knowledge when they are observed empirically in specific contexts [27].
The Ministry of Education and Training should establish standardized policies and a national framework for curriculum design, in which professional knowledge and skills are integrated with the 17 Sustainable Development Goals (SDGs) of the United Nations. Additionally, technical skills, such as information technology, computer programming, and foreign language skills, should be included in all majors. Official regulations and guidance should be issued for using English to teach professional courses/majors and to conduct academic research, e.g., requiring at least two professional courses to be taught in English every semester. As many curricula have been textbook-based or theory-based with endless lectures, quizzes, and exams, these efforts will improve employment outcomes and reduce costly waste from ineffective academic programs [41].
The following sections suggest action steps and pilot programs for curriculum/syllabus changes and/or policy adjustments. Here is an example of a course description and learning objective in a can-do syllabus, piloted for the course Mathematic Application in a first-year bachelor’s degree. These course descriptions and learning objectives are based on Bloom’s taxonomy, adapted from [42], and are integrated with SDG Goal #8: Decent work and economic growth.
  • Course description
This course focuses on the applications of mathematics in management and business. Students will use dynamic models and data analysis with an emphasis on model construction and interpretation in specific contexts. Students will practice model construction and interpretation with Excel (Visual Basic for Applications, data analysis, and Solver), the C5.0 algorithm, and Python. In addition, students will be trained to better understand the decision-making process and to make decisions with a scientific attitude.
  • Learning outcomes
At the end of the course, students should be able to:
Learning Outcomes/Objectives, grouped by Bloom’s level
(1) Know
  • know how to program and analyze data using Excel, the C5.0 algorithm, and Python
  • know what the decision-making process is
(2) Understand
  • understand how to analyze multivariate datasets with the aim of extracting the meaningful message contained within available data
  • discuss the effects of (un)employment
(3) Apply
  • demonstrate mathematical and statistical knowledge
  • apply mathematics to aid decision-making
(4) Analyze
  • diagram linear regression for influence factors for university students’ (un)employability
(5) Evaluate
  • read papers in the areas of forecasting, employability analysis, and economics with a reasonable assessment of the basic mathematical techniques employed in these papers
(6) Create
  • design a questionnaire to collect data on the topic “Influencing factors for university graduates’ (un)employability”
  • develop a prediction model using Excel and/or Python based on students’ designed questionnaires
  • interpret the results generated

6. Conclusions

6.1. Concluding Remarks

Graduate employability is a complex issue that has significant implications for individuals, universities, families, and society as a whole. This study uncovered the critical role of academic performance in predicting early employment outcomes among Vietnamese graduates using an AI model and data-driven analysis. Comparing the performance of different models with SMOTE oversampling (Case 3), CART (93.60%) outperformed C5.0 (71.32%) and SVM (62.10%), supporting its suitability for predicting graduate employability in this setting. The identified key factors (GPA in the third and fourth years, along with university admission scores) serve as crucial indicators for early employment. Furthermore, this study confirms that graduates with strong academic records are more likely to secure employment within six months, underscoring the need for universities to enhance academic support systems to improve student performance. The extracted decision rules offer practical guidance for university policymakers, enabling them to refine academic policies and develop strategies aimed at increasing employability rates and ensuring sustainable educational development. Additionally, the pilot course example provided is intended as an initial practical step for curriculum changes, aiming to bridge the gaps between research and practice.
The present work contributed significantly to the practices and theories on recent graduates’ employability. For practice, we contributed a prediction model using data-driven analysis and AI models. By utilizing AI-driven analytics, universities can identify key employability factors and implement targeted interventions to enhance student success. We also contributed a practical pilot example for curriculum changes. The findings provide valuable insights for higher education institutions, policymakers, and employers, highlighting the importance of data-driven strategies in improving graduate employability. Our proposed models used academic performance, recorded in the school database, which can minimize computational costs and serve as early warning systems. In short, our proposed model can bridge the gap between research and practice.
For theories, our study is the first to fill gaps in the research. Table 9 compares this study with several prior related studies using CARTs. The key difference is that many studies considered the employability of graduates in the field of information, engineering, and computer science; of master’s degree graduates; or graduates undergoing on-the-job training, who have higher opportunities for employment. Some attempted to maximize accuracy without providing practical insights on the most influential factors for employment. Some used survey questionnaires and skills rather than academic performance for prediction models. In several studies in similar educational settings, it is also unclear how long graduates took to be employed after graduation. Limited research focused on the influencing factors of the early employability of multidisciplinary graduates within 6 months of graduation in emerging regions such as Vietnam’s Mekong Delta. Our methods are more practical, have more predictive power, and are efficient, time-saving, and low-cost for identifying specific academic and vocational factors that contribute to successful job placements. Based on these highlighted gaps, we suggest that machine learning methods in a data-driven manner should explain and interpret more meaningful knowledge rather than emphasize accuracy. Multidisciplinary graduates, instead of those from one or two disciplines, should be considered for general insights and broader benchmarking.
Table 9. Comparison of employability prediction studies using the CART model.
| Source | Research Context | Research Major/Field of Study | Graduation Time | Data Collection Method | Accuracy Performance | Most Influential Factors |
|---|---|---|---|---|---|---|
| Mishra et al. [13] | India | Master-degree graduates of Computer Applications | Unclear | Questionnaire | 70.19% | Course projects, third-semester GPA, and academic hours |
| He et al. [30] | China | Computer Applications | 2016 | School | 77.18% | Academic achievement and graduation qualification |
| Alkashami et al. [31] | Jordan | Information technology, Computer science, engineering | 2015–2019 | School | 82.4% | X |
| Jantawan and Tsai [43] | Thailand | Computer science and information technology | 12 months since graduation | Local government | 98.3% | X |
| Casuat and Festijo [16] | The Philippines | Engineering (on-the-job students) | 2015–2019 | Career Center of Technological Institute (on-the-job training faculty) | 85% | X |
| This study | Vietnam’s Mekong Delta | 21 majors/fields of study | Within 6 months of graduation | School database | 93.6% | Third-year GPA, university entrance scores, and final-year GPA |

6.2. Limitation and Future Works

Even though this research provides valuable insights, it has several limitations. First, the analysis was conducted on data from a single university in the Mekong Delta, which may limit the generalizability of the findings to other institutions. Future researchers should incorporate datasets from multiple universities to validate these findings across diverse educational contexts.
Second, this study primarily focused on academic and demographic factors, whereas employability is also influenced by extracurricular activities, internships, industry connections, and the development of soft skills. Expanding the feature set to include these variables could provide a more comprehensive understanding of the determinants of employability.
Third, while manual ordinal encoding was applied to categorical features such as gender and major to ensure consistency across classifiers (CART, C5.0, and SVM), we acknowledge that the “major” variable is inherently nominal. Assigning ordinal integers (e.g., 1 to 21) to nominal categories may introduce artificial ordering or relational assumptions, particularly in decision tree models such as CARTs. This could impact both model interpretability and reliability. Future studies may consider alternative encoding techniques, such as target encoding or embeddings, to better reflect the true nature of nominal variables without compromising model compatibility.
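The encoding concern above can be illustrated with a minimal sketch. It contrasts a one-hot encoding (our own addition for comparison, not a technique used in this study) with a smoothed target encoding of the nominal “major” variable. The records, the function names `one_hot` and `target_encode`, and the `smoothing` value are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def one_hot(value, categories):
    """Return a 0/1 indicator vector with a single 1 at the category's position."""
    return [1 if value == c else 0 for c in categories]

def target_encode(majors, labels, smoothing=5.0):
    """Map each major to a smoothed mean of the binary target.

    Rare majors are pulled toward the global rate so that a category with
    only a few graduates does not receive an extreme code.
    """
    global_rate = sum(labels) / len(labels)
    sums, counts = defaultdict(float), defaultdict(int)
    for m, y in zip(majors, labels):
        sums[m] += y
        counts[m] += 1
    return {
        m: (sums[m] + smoothing * global_rate) / (counts[m] + smoothing)
        for m in counts
    }

# Hypothetical toy records (not the study's data): major name and employed flag.
majors = ["Accounting", "Tourism", "Accounting", "Agriculture", "Tourism", "Accounting"]
employed = [1, 0, 1, 1, 1, 0]

categories = sorted(set(majors))
encoded_one_hot = [one_hot(m, categories) for m in majors]
encoding = target_encode(majors, employed)
```

Neither encoding implies that, for example, major 21 is “greater than” major 1. If target encoding is used, the mapping should presumably be fitted on the training folds only, since it uses the class label.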
The superior performance of CART compared to SVM and C5.0 raises important theoretical implications regarding model assumptions and susceptibility to bias. CART, being a non-parametric model, does not require assumptions about data distribution, making it suitable for heterogeneous or nonlinear datasets such as those involving consumer behavior or subjective motivations. However, decision trees such as CARTs are highly flexible and prone to overfitting, especially when dealing with small or imbalanced datasets, which may compromise generalizability. While this study reports a high accuracy in Case 3 with oversampling, the possibility of overfitting is not addressed, which could inflate the model’s apparent performance. Furthermore, without a detailed comparison of preprocessing steps, feature selection impacts, or hyperparameter tuning strategies among CART, SVM, and C5.0, it is difficult to isolate whether CART’s advantage stems from methodological robustness or model-specific alignment with the data. Acknowledging these theoretical considerations is essential to ensure the robustness of findings and to guide future applications of machine learning models in similar behavioral prediction contexts.
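To make the non-parametric nature of CART concrete, the following sketch shows how a single split threshold could be chosen by minimizing weighted Gini impurity, assuming the Gini criterion commonly associated with CART. The GPA values and labels are hypothetical, and `gini` and `best_threshold` are illustrative helper names, not functions from the software used in this study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_threshold(values, labels):
    """Scan midpoints between sorted unique values; return (threshold, weighted Gini)."""
    uniq = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(uniq, uniq[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical third-year GPAs on the 0-10 scale and employment flags.
gpa_year3 = [6.1, 6.8, 7.2, 7.9, 8.3, 8.8]
employed = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(gpa_year3, employed)
```

The search relies only on the ordering of the observed values, so no distributional assumption is needed. The same flexibility is what allows a tree that keeps splitting until every leaf is pure to fit noise, which is why limits on depth or leaf size are commonly used as a guard against overfitting.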
Although the CART model in Case 3 achieved the highest accuracy (93.6%), the use of synthetic minority oversampling (SMOTE) to address class imbalance may lead to overfitting, limiting the model’s generalizability to new data. Future studies could consider adopting additional techniques to mitigate overfitting, such as ensemble learning methods (e.g., bagging and boosting) and cross-validation. These approaches can enhance model robustness and improve predictive performance across diverse datasets.
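One way to limit the optimistic bias discussed above is to oversample only the training portion of each cross-validation split, so that synthetic points never reach the held-out data. The sketch below implements the core interpolation idea of SMOTE [21] in plain Python under simplifying assumptions; the helper names, the toy feature values, and the choice of `k` are hypothetical, and this is not the implementation used in this study.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE's core idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours by squared Euclidean distance, excluding the base itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def oversample_training_fold(X_train, y_train, minority_label=0, seed=0):
    """Balance the training fold only; the held-out fold must stay untouched."""
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(y_train) - len(minority)
    new_points = smote(minority, n_majority - len(minority), seed=seed)
    return X_train + new_points, y_train + [minority_label] * len(new_points)

# Hypothetical training fold: [GPA_Year3, Admission_score], 1 = employed.
X_train = [[8.1, 22.0], [7.9, 20.5], [8.6, 24.0], [7.4, 19.0], [8.8, 25.5],
           [6.2, 15.0], [6.6, 16.5], [6.0, 14.5]]
y_train = [1, 1, 1, 1, 1, 0, 0, 0]
X_bal, y_bal = oversample_training_fold(X_train, y_train)
```

In a k-fold procedure, `oversample_training_fold` would be called once per fold on the training split, and accuracy would then be measured on the original, imbalanced held-out split.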
While the CART model demonstrated high predictive accuracy, the integration of explainable AI (XAI) methods, which aim to make intelligent systems both explainable (self-explanatory) and interpretable [44,45], could enhance the interpretability of the model. Explainability describes the reasons behind an AI-based model’s decisions and predictions. Interpretability expresses what is occurring inside the model and provides a human-understandable account of its decisions. XAI methods allow policymakers and educators to derive more meaningful insights. Future research should explore the application of XAI techniques to improve transparency and trust in AI-driven decision-making processes related to graduate employability. Finally, this study only employed CART, C5.0, and SVM, whereas ensemble methods (e.g., random forest, gradient boosting, and XGBoost) or deep learning models may offer better performance. Future researchers could apply these ensemble methods and/or deep learning models for broader comparisons. By addressing these limitations and expanding the scope of analysis, future studies can further contribute to the development of effective strategies for improving graduate employability, ensuring a smoother transition from education to employment.
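As one hedged illustration of a model-agnostic XAI technique, the sketch below computes permutation importance: the mean drop in accuracy when a single feature column is shuffled. Permutation importance is our own choice of example and is not named in this study; `toy_rule` is a hypothetical stand-in for a fitted classifier, and the data are invented.

```python
import random

def accuracy(predict, X, y):
    """Share of rows for which the model's prediction equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in for a fitted tree: a single rule on third-year GPA.
def toy_rule(row):
    gpa_year3, _gender_code = row
    return 1 if gpa_year3 >= 7.5 else 0

X = [[6.1, 1], [6.8, 2], [7.2, 2], [7.9, 1], [8.3, 2], [8.8, 1]]
y = [0, 0, 0, 1, 1, 1]
importances = permutation_importance(toy_rule, X, y)
```

A feature the model never uses (here the gender code) receives an importance of zero, while shuffling a feature the model relies on reduces accuracy. Because the procedure only needs a prediction function, it could be applied equally to CART, C5.0, SVM, or an ensemble.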

Author Contributions

Conceptualization, T.-T.H.-C. and L.-S.C.; methodology, T.-T.H.-C. and L.-S.C.; software, T.-T.H.-C.; validation, T.-T.H.-C. and L.-S.C.; formal analysis, T.-T.H.-C. and V.-C.N.; investigation, V.-C.N.; resources, L.-S.C.; data curation, V.-C.N.; writing—original draft preparation, T.-T.H.-C., V.-C.N. and D.-K.L.-H.; writing—review and editing, T.-C.L. and L.-S.C.; visualization, V.-C.N. and D.-K.L.-H.; supervision, L.-S.C. and T.-C.L.; project administration, L.-S.C. and T.-C.L.; funding acquisition, L.-S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, grant number NSTC 112-2410-H-027-029.

Data Availability Statement

The data is not publicly available.

Acknowledgments

We are grateful to Dong Thap University, Vietnam, and Chaoyang University of Technology, Taiwan, for providing access to their facilities, which allowed us to conduct the experiments reported in this work. We would like to express our gratitude to the National Science and Technology Council, Taiwan, for providing the financial support that made this research possible.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Oliver, B. Teaching to promote graduate employability. In University Teaching in Focus; Routledge: London, UK, 2021; pp. 201–216.
  2. Kumar, M.S.; Babu, G.P. Comparative study of various supervised machine learning algorithms for an early effective prediction of the employability of students. J. Engineer. Sci. 2019, 10, 240–251.
  3. Bird, S.A. Community of Practice in a Seminar Setting: An Action Research Approach to Graduate Employability and Skills in Higher Education. Ph.D. Thesis, University of the West of England, Bristol, UK, 2023.
  4. Shibuya, S. Urbanization, jobs, and the family in the Mekong delta, Vietnam. J. Compar. Fami. Stu. 2018, 49, 93–108.
  5. Vo, L.V.; Nguyen, L.; Le, U.V. The Status of provincial universities in the Mekong Delta area of Vietnam. Inter. J. Innova. Creat. Chan. 2020, 11, 105–120.
  6. Tran, T.T. Is graduate employability the ‘whole-of-higher-education-issue’? J. Edu. Work 2015, 28, 207–227.
  7. General Statistics Office of Vietnam. Education. 2020. Available online: https://www.gso.gov.vn/en/px-web/?pxid=E1019&theme=Education (accessed on 15 May 2024).
  8. General Statistics Office of Vietnam. Education. 2024. Available online: https://www.gso.gov.vn/en/px-web/?pxid=E0263&theme=Population%20and%20Employment (accessed on 15 May 2024).
  9. Nguyen, V.T.; Peiró, J.M.; Le, Q.C.; González-Romá, V.; Martinez-Tur, V. Vietnamese Graduates’ Labour Market Entry and Employment: A Tracer Study; Uppsala University: Uppsala, Sweden, 2020.
  10. Thang, P.V.M.; Wongsurawat, W. Enhancing the employability of IT graduates in Vietnam. High. Edu. Skill. Work-Based Learn. 2016, 6, 146–161.
  11. Phan, V.T.T.; Nguyen, L.D.T.; Nguyen, K.D. Twenty-First century essential employability skills for English as a foreign language undergraduate in a context of the Mekong Delta. Euro. J. Edu. Res. 2022, 11, 1089–1102.
  12. Curuksu, J.D. Data Driven. An Introduction to Management Consulting in the 21st Century; Management for Professionals; Springer: Cham, Switzerland, 2018.
  13. Mishra, T.; Kumar, D.; Gupta, S. Students’ employability prediction model through data mining. Inter. J. Appl. Eng. Res. 2016, 11, 2275–2282.
  14. Rattan, V.; Mittal, R.; Singh, J. Generative AI with ensemble machine learning framework for computer science graduates employability prediction using educational big data. Inter. J. Com. Digi. Sys. 2024, 17, 1–15.
  15. Huynh-Cam, T.T.; Chen, L.S.; Lu, T.C. Early prediction models and crucial factor extraction for first-year undergraduate student dropouts. J. Appli. Res. High. Edu. 2024, 17, 624–639.
  16. Casuat, C.D.; Festijo, E.D. Predicting Students’ Employability Using Machine Learning Approach. In Proceedings of the 2019 IEEE 6th International Conference on Engineering Technologies and Applied Sciences (ICETAS), Kuala Lumpur, Malaysia, 20–21 December 2019; pp. 1–5.
  17. Jayachandran, S.; Joshi, B. Customized support vector machine for predicting the employability of students pursuing engineering. Inter. J. Infor. Tech. 2024, 16, 3193–3204.
  18. ElSharkawy, G.; Helmy, Y.; Yehia, E. Employability prediction of information technology graduates using machine learning algorithms. Inter. J. Advan. Com. Sci. App. 2022, 13, 359–367.
  19. Pinto, L.H.; Ramalheira, D.C. Perceived employability of business graduates: The effect of academic performance and extracurricular activities. J. Vocation. Behav. 2017, 99, 165–178.
  20. He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Know. Data Eng. 2009, 21, 1263–1284.
  21. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Arti. Intel. Res. 2002, 16, 321–357.
  22. Drummond, C.; Holte, R.C. C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. In Workshop on Learning from Imbalanced Datasets II; ICML: Washington, DC, USA, 2003.
  23. Behle, H. Students’ and graduates’ employability. A framework to classify and measure employability gain. Policy Rev. High. Edu. 2020, 4, 105–130.
  24. Menon, M.E.; Pashourtidou, N.; Polycarpou, A.; Pashardes, P. Students’ expectations about earnings and employment and the experience of recent university graduates: Evidence from Cyprus. Inter. J. Edu. Develop. 2012, 32, 805–813.
  25. Tavares, O. The role of students’ employability perceptions on Portuguese higher education choices. J. Edu. Work 2017, 30, 106–121.
  26. Rowe, A.D.; Zegwaard, K.E. Developing graduate employability skills and attributes: Curriculum enhancement through work-integrated learning. Asia-Pac. J. Coop. Educ. 2017, 18, 87–99.
  27. Murtagh, F.; Devlin, K. The development of data science: Implications for education, employment, research, and the data revolution for sustainable development. Big Data Cog. Comp. 2018, 2, 14.
  28. Bennett, D.; Knight, E.; Li, I. The impact of pre-entry work experience on university students’ perceived employability. J. Further High. Edu. 2023, 47, 1140–1154.
  29. Prastika, N.D.; Amalia, S.; Wardani, R.; Anggraini, F. Revealing the work readiness of fresh graduates in the capital city environment new country. Daengku J. of Human. Soc. Sci. Inno. 2022, 2, 656–663.
  30. He, S.; Li, X.; Chen, J. Application of Data Mining in Predicting College Graduates Employment. In Proceedings of the 2021 4th International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 28–31 May 2021; pp. 65–69.
  31. Alkashami, M.; Taamneh, A.; Khadragy, S.; Shwedeh, F.; Aburayya, A.; Salloum, S. AI different approaches and ANFIS data mining: A novel approach to predicting early employment readiness in middle eastern nations. Inter. J. Data Net. Sci. 2023, 7, 1267–1282.
  32. Baffa, M.H.; Miyim, M.A.; Dauda, A.S. Machine learning for predicting students’ employability. UMYU Sci. 2023, 2, 001–009.
  33. Chang, J.R.; Chen, M.Y.; Chen, L.S.; Chien, W.T. Recognizing important factors of influencing trust in O2O models: An example of Opentable. Soft Comp. 2020, 24, 7907–7923.
  34. Chen, L.S.; Chang, P.C. Identifying Crucial Website Quality Factors of Virtual Communities. In Proceedings of the International Multi Conference of Engineers and Computer Scientists (IMECS 2010), Hong Kong, 17–19 March 2010; pp. 487–492.
  35. Othman, Z.; Shan, S.W.; Yusoff, I.; Kee, C.P. Classification techniques for predicting graduate employability. Inter. J. Adv. Sci. Engineer. Infor. Tech. 2018, 8, 1712–1720.
  36. Gong, H.; Zhou, C.; Chen, Y.; Teng, C. Analysis on the influencing factors of graduate employment quality based on training data mining. In Proceedings of the 2020 International Conference on Information Science and Education (ICISE-IE), Sanya, China, 4–6 December 2020; pp. 209–214.
  37. Haque, R.; Quek, A.; Ting, C.Y.; Goh, H.N.; Hasan, M.R. Classification techniques using machine learning for graduate student employability predictions. Inter. J. Adv. Sci. Engineer. Infor. Tech. 2024, 14, 45.
  38. Kaushik, A. Gini Impurity and Entropy for Decision Tree. Available online: https://medium.com/@arpita.k20/gini-impurity-and-entropy-for-decision-tree-68eb139274d1#:~:text=Gini%20Impurity%20measures%20how%20well,tree%20to%20the%20leaf%20node (accessed on 30 April 2025).
  39. Huynh-Cam, T.T.; Chen, L.S.; Nguyen, V.C.; Nguyen, T.H.; Lu, T.C. Why first-year e-students are dissatisfied: Machine learning methods for enhancing retention. Inter. J. Appli. Sci. Engineer. 2024, 21, 2023532.
  40. Salas-Velasco, M. Mapping the (mis) match of university degrees in the graduate labor market. J. Labour Market Res. 2021, 55, 14.
  41. Jung, J.; Wang, Y.; Sanchez Barrioluengo, M. A scoping review on graduate employability in an era of ‘Technological Unemployment’. Higher Edu. Res. Develop. 2024, 43, 542–562.
  42. University of Arkansas. Using Bloom’s Taxonomy to Write Effective Learning Objectives. 2022. Available online: https://tips.uark.edu/using-blooms-taxonomy/#gsc.tab=0 (accessed on 1 February 2025).
  43. Jantawan, B.; Tsai, C.F. The application of data mining to build classification model for predicting graduate employment. arXiv 2013, arXiv:1312.7123.
  44. Mohseni, S.; Zarei, N.; Ragan, E.D. A multidisciplinary survey and framework for design and evaluation of explainable AI systems. ACM Trans. Inter. Intel. Sys. TIIS 2021, 11, 1–45.
  45. Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Info. Fusion 2023, 99, 101805.
Figure 1. Implementation process.
Figure 2. Correlation matrix.
Figure 3. The three classification cases.
Figure 4. The ROC–AUC values of the CART (fold 6) among the three classification cases.
Figure 5. Importance features from the CART (fold 6).
Table 3. Demographic statistics of the final dataset (n = 610).
| No. | Factor Names | Distribution | Frequency | Percentage (%) |
|---|---|---|---|---|
| 1 | Field of study | Social works | 8 | 1.31 |
| | | Political education | 7 | 1.15 |
| | | Pre-school education | 97 | 15.9 |
| | | Physical education | 8 | 1.31 |
| | | Primary education | 42 | 6.89 |
| | | Accounting | 67 | 10.98 |
| | | Information and Communication | 30 | 4.92 |
| | | English linguistics | 71 | 11.64 |
| | | Chinese linguistics | 75 | 12.3 |
| | | Agriculture | 9 | 1.48 |
| | | Aquaculture | 23 | 3.77 |
| | | Cultural management | 4 | 0.66 |
| | | Business administration | 30 | 4.92 |
| | | Musical education | 14 | 2.3 |
| | | Chemistry education | 12 | 1.97 |
| | | History education | 2 | 0.33 |
| | | Literature education | 21 | 3.44 |
| | | English teacher education | 19 | 3.11 |
| | | Mathematic education | 13 | 2.13 |
| | | Finance and Banking | 26 | 4.26 |
| | | Tourism | 32 | 5.25 |
| 2 | Gender | Female | 444 | 72.79 |
| | | Male | 166 | 27.21 |
| 3 | Employment status | Employed in < 6 months | 526 | 86.23 |
| | | Unemployed < 6 months | 84 | 13.77 |
Table 4. Input and output factor descriptions.
| No. | Factor | Description | Type | Preprocessed/Transferred Values |
|---|---|---|---|---|
| 1 | Major | Fields of study in the degree programs that recent graduates were enrolled in. | Category | 1 = Social works, 2 = Political education, 3 = Pre-school education, 4 = Physical education, 5 = Primary education, 6 = Accounting, 7 = Information and Communication, 8 = English linguistics, 9 = Chinese linguistics, 10 = Agriculture, 11 = Aquaculture, 12 = Cultural management, 13 = Business administration, 14 = Musical education, 15 = Chemistry education, 16 = History education, 17 = Literature education, 18 = English teaching, 19 = Mathematic education, 20 = Finance and Banking, 21 = Tourism |
| 2 | Gender | The gender of recent graduates. | Category | 1 = Male, 2 = Female |
| 3 | Admission_score | The min–max scores for the university entrance exams that the recent graduates took for admission to the university. | Numeric | 14~27.5 |
| 4 | GPA_Year 1 * | Recent graduates’ GPA in the 1st year of university. | Numeric | 4.32~8.73 (min–max) |
| 5 | GPA_Year 2 * | Recent graduates’ GPA in the 2nd year of university. | Numeric | 4.15~8.87 (min–max) |
| 6 | GPA_Year 3 * | Recent graduates’ GPA in the 3rd year of university. | Numeric | 5.60~9.47 (min–max) |
| 7 | GPA_Year 4 * | Recent graduates’ GPA in the 4th year of university. | Numeric | 7.03~9.51 (min–max) |
| 8 | Output (class label): Employed | Graduates were employed or unemployed within 6 months after graduation. | Category | 1 = Employed, 0 = Unemployed |
* Note: the Vietnamese grading system ranges from 0–10 points, where 10 is the highest.
Table 8. Comparisons of the results of overall accuracy among the three classification cases.
| Model | Case 1: Origin, Mean (SD) | Case 2: Undersampling, Mean (SD) | Case 3: Oversampling (SMOTE), Mean (SD) |
|---|---|---|---|
| C5.0 | 85.81 (1.71) | 52.6 (3.17) | 71.32 (2.94) |
| CART | 76.23 (3.26) | 56.28 (6.69) | 93.60 (1.49) |
| SVM | 86.00 (0.00) | 53.89 (9.80) | 62.10 (3.11) |
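As a reading aid for the “Mean (SD)” entries in Table 8, the sketch below shows how per-fold accuracies could be summarized, and how a majority-class baseline follows from the class counts in Table 3. The fold values are hypothetical and are not the study’s results; the use of the sample standard deviation is an assumption, since the article does not state which form was used.

```python
from statistics import mean, stdev

def summarise(fold_accuracies):
    """Format per-fold accuracies (in %) as 'mean (SD)' with two decimals."""
    return f"{mean(fold_accuracies):.2f} ({stdev(fold_accuracies):.2f})"

# Hypothetical per-fold accuracies in %, not the study's actual fold results.
folds = [90.0, 92.0, 91.0, 93.0, 94.0]
summary = summarise(folds)

# Class counts reported in Table 3: 526 employed, 84 unemployed (n = 610).
employed, unemployed = 526, 84
majority_baseline = 100 * max(employed, unemployed) / (employed + unemployed)
```

Here `majority_baseline` is about 86.23%, the accuracy of always predicting “employed” on the original data. The Case 1 values for C5.0 and SVM lie close to this figure, which is one reason accuracy alone can be hard to interpret on imbalanced data.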
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, L.-S.; Huynh-Cam, T.-T.; Nguyen, V.-C.; Lu, T.-C.; Le-Huynh, D.-K. Predicting Early Employability of Vietnamese Graduates: Insights from Data-Driven Analysis Through Machine Learning Methods. Big Data Cogn. Comput. 2025, 9, 134. https://doi.org/10.3390/bdcc9050134
