A Text-Based Project Risk Classification System Using Multi-Model AI: Comparing SVM, Logistic Regression, Random Forests, Naive Bayes, and XGBoost
Abstract
1. Introduction
1.1. Artificial Intelligence in Project Management
1.2. Sector-Specific Applications of AI in Risk Management
1.3. Predictive Analytics and Risk Management
1.4. NLP in Risk Analysis
1.5. Conceptual Frameworks and Governance in AI-Based Risk Management
1.6. Critical Synthesis
1.6.1. Research Questions
- Can artificial intelligence algorithms reliably classify and predict project risks, their triggers, and their possible impacts solely from unstructured textual descriptions in project documents?
- Which machine learning algorithms and associated feature extraction approaches (e.g., TF-IDF, Word Embeddings) are most effective at organizing, processing, and extracting actionable information from unstructured risk data?
- To what extent can model interpretability tools enhance the reliability, transparency, and practicality of AI-based risk prediction systems for major project stakeholders and managers, thereby bridging the gap between advanced analytics and managerial decision-making?
1.6.2. Research Objectives
- Design and test an artificial intelligence-based system that predicts types of project risks, their triggers, and their possible impacts based on a curated collection of project risk records.
- Perform a comparative study of two popular feature extraction methods (TF-IDF and Word Embeddings) together with a set of machine learning models (Logistic Regression, Support Vector Machines, Random Forest, XGBoost, and Naive Bayes) to determine the relative performance of each pairing on the different risk classification tasks.
- Identify and explain the challenges and opportunities that arise from implementing AI in real-world project risk management. Specifically, these include model interpretability, scalability, and the ability to integrate smoothly with existing project management processes.
- Propose a general framework for a unified decision support system. The intent is for this system to support proactive risk mitigation and to enhance quality assurance throughout the project lifecycle, thereby linking recent AI research with the practical needs of project management.
2. Materials and Methods
2.1. Research Design
2.2. Dataset Construction
2.3. Data Preprocessing
- Text Normalization: All text was converted to lowercase; punctuation, numbers, and special characters were removed with the re module in Python.
- Tokenization: Sentences were split into tokens on whitespace.
- Stopword Removal: Common stopwords (e.g., the, and, in) were removed using the nltk stopword lists.
- Lemmatization: Words were reduced to their base form with the WordNetLemmatizer (e.g., delays → delay).
- Concatenation: Clean tokens were joined to create the processed descriptions.

This process reduces noise in the text data and increases the discriminative power of the features [47].
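
The preprocessing steps listed above can be sketched as follows. This is a minimal illustration that uses only the Python standard library, not the study's code: the stopword set is a small illustrative subset, and `naive_lemma` is a hypothetical stand-in for the WordNetLemmatizer named in the text (it only strips a trailing plural "s"). The function names `preprocess` and `naive_lemma` are assumptions introduced here.

```python
import re

# Illustrative subset only; the text reports using the nltk stopword lists.
STOPWORDS = {"the", "and", "in", "of", "to", "a", "an", "is", "was"}


def naive_lemma(token):
    """Hypothetical stand-in for WordNetLemmatizer: strips a trailing plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(text, lemmatize=naive_lemma, stopwords=STOPWORDS):
    """Normalize, tokenize, filter, lemmatize, and re-join one risk description."""
    text = text.lower()                        # text normalization: lowercase
    text = re.sub(r"[^a-z\s]", " ", text)      # drop punctuation, digits, special characters
    text = re.sub(r"\s+", " ", text).strip()   # collapse extra spaces
    tokens = text.split()                      # whitespace tokenization
    tokens = [t for t in tokens if t not in stopwords]  # stopword removal
    tokens = [lemmatize(t) for t in tokens]    # lemmatization (stand-in)
    return " ".join(tokens)                    # concatenation


cleaned = preprocess("Supplier delays and 3 missed deadlines!")
```

In practice the `lemmatize` argument could be replaced by the `lemmatize` method of an nltk `WordNetLemmatizer` instance, and `stopwords` by the nltk English stopword list, once the corresponding nltk corpora have been downloaded.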
2.4. Feature Extraction
2.4.1. TF-IDF
2.4.2. Word Embeddings
2.5. Experimental Setup
2.5.1. Train–Test Split
2.5.2. Classification Models
2.6. Evaluation Metrics
- Precision: The fraction of predictions for a class that are correct.
- Recall: The fraction of true instances of a class that are correctly identified.
- F1-Score: The harmonic mean of precision and recall, which balances false positives against false negatives.
- Confusion Matrices: Per-class visualization of performance using heatmaps from the seaborn module, version 0.13.2.

- Import the Excel data into a pandas dataframe.
- Apply the preprocessing pipeline (normalization, tokenization, lemmatization).
- Extract features with TF-IDF and Word Embeddings in parallel branches, then train classifiers for Risk Category, Trigger, and Impact.
- Evaluate the models with macro-averaged metrics and confusion matrices.
- Use SHAP values to analyze feature importance.
- Compare model performance across feature extraction methods and classifiers.
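
The per-class metrics and their macro averages described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the study's evaluation code (the text reports using sklearn's classification_report); the function name `macro_scores` and the toy labels are assumptions.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) for multi-class labels."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives for class c
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    # Macro averaging: unweighted mean over classes, so rare classes count equally.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels, for illustration only.
y_true = ["Cost", "Cost", "Scope", "Scope"]
y_pred = ["Cost", "Scope", "Scope", "Scope"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

For the toy labels, Cost has precision 1 and recall 0.5, Scope has precision 2/3 and recall 1, so the macro averages are about 0.83, 0.75, and 0.73. Because macro averaging weights every class equally, it is sensitive to poor performance on infrequent classes.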
3. Results
3.1. Overview of Experimental Findings
3.2. Data Characteristics
3.3. Tools and Environment
- Pandas was used to load the data from an Excel file into a dataframe.
- Sklearn served several purposes. First, it was used to split the data into training and test sets, with 80% used for training the models and 20% for testing their ability to generalize. To extract features from the text, the TfidfVectorizer class from the feature_extraction submodule was used. The library was then used to train the following models: Logistic Regression, Random Forest classifier, Multinomial Naive Bayes, and Support Vector Machine. Finally, for a comprehensive evaluation of the trained models, the metrics submodule was employed, mainly classification_report, confusion_matrix, and ConfusionMatrixDisplay.
- Matplotlib was used to display and save figures.
- Re was employed in the data cleaning process to lowercase all text, remove punctuation, numbers, and special characters, and strip extra spaces.
- Seaborn was used to draw the heatmap for the confusion matrix.
- XGBoost was employed to build and train an Extreme Gradient Boosting model.
- SHAP was used to visualize the importance of each word.
- NumPy was used for small operations such as sorting the results in ascending or descending order by importance coefficient.
- Gensim was used to download and load the glove-wiki-gigaword-100 model.
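
As a rough sketch of how the sklearn components named above fit together, the snippet below chains an 80/20 split, TfidfVectorizer, Logistic Regression, and classification_report. It is not the study's code: the toy records are hypothetical stand-ins for the Excel dataset, and `ngram_range=(1, 3)`, `stratify`, `random_state=42`, and `max_iter=1000` are assumptions chosen for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy records (not the study's data).
texts = [
    "budget overrun caused by poor cost control",
    "cost estimate exceeded after vendor price increase",
    "unexpected cost growth from late change orders",
    "cost tracking oversight led to overspending",
    "cost forecast ignored inflation in material prices",
    "schedule slipped after supplier delay",
    "missed milestone pushed the schedule back",
    "schedule compressed by unrealistic deadlines",
    "late approvals delayed the project schedule",
    "schedule conflict between two contractor teams",
]
labels = ["Cost"] * 5 + ["Schedule"] * 5

# 80% training, 20% testing, as reported in the text; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# TF-IDF features feeding a linear classifier; the n-gram range is an assumption.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# output_dict=True exposes the macro-averaged precision, recall, and F1-score.
report = classification_report(y_test, y_pred, output_dict=True, zero_division=0)
macro_avg = report["macro avg"]
```

Swapping `LogisticRegression` for another estimator (for example `RandomForestClassifier` or `MultinomialNB`) leaves the rest of the pipeline unchanged, which is what makes a side-by-side comparison of classifiers straightforward.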
3.4. Experimental Workflow
3.5. Model Evaluation
3.5.1. Risk Category Classification
- For the Quality class, the n-gram with the greatest impact on the model's decision is “quality project”, with a magnitude above 1, followed by “conflict unforeseen” and “oversight led low”, with much lower magnitudes, just under 0.2.
- For Scope, the main n-gram driving the model's decision is “scope project”, with a magnitude greater than 1, followed by “delay”, “oversight led delay”, and “led missed”, with magnitudes below 0.2.
- For the Cost category, the dominant n-gram is “cost project”, with the highest magnitude, above 1, followed by “control caused”, “stakeholder”, “oversight”, and “caused oversight”, with significantly lower magnitudes.
- For the Schedule category, the most important n-grams are “schedule”, with a magnitude close to 1, and “cost missed”, with a magnitude of approximately 0.2.
- For the Communication class, the most important n-gram is “communication project”, with a magnitude just under 1, followed by “misaligned led delay” and “led rework missed”.
3.5.2. Trigger Classification
3.5.3. Impact Classification
3.6. Task-Specific Models
4. Discussion
4.1. Dataset Insights and Organization
4.2. Architecture and Feature Extraction Options Modeled
- TF-IDF (Term Frequency–Inverse Document Frequency): A standard technique that performs well with sparse text representations.
- Word Embeddings (GloVe100): A semantic method that captures relationships between words using a pre-trained embedding model.
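
To make the embedding branch concrete, the sketch below turns a tokenized description into a single fixed-length vector by averaging its word vectors. The source does not state how word vectors were combined into document features, so mean pooling is an assumption here, as are the function name `mean_pool` and the 3-dimensional toy vectors (the GloVe model named in the text has 100 dimensions).

```python
def mean_pool(tokens, vectors, dim):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim  # no known token: fall back to a zero vector
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Hypothetical 3-dimensional vectors, for illustration only.
toy_vectors = {
    "delay": [1.0, 0.0, 2.0],
    "supplier": [3.0, 2.0, 0.0],
}

doc_vector = mean_pool("supplier delay unknownword".split(), toy_vectors, dim=3)
```

Unlike TF-IDF, the resulting features are dense and can be negative, which is one reason a Multinomial Naive Bayes classifier is not normally paired with this representation.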
4.3. Target Variable Analysis of Performance
4.3.1. Risk Category
4.3.2. Trigger
4.3.3. Impact
4.4. Lessons Learned and Implications
4.5. Future Considerations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| ML | Machine Learning |
| SVM | Support Vector Machine |
| XGBoost | Extreme Gradient Boosting |
| NLP | Natural Language Processing |
| KBR | Knowledge-Based Reasoning |
| OA | Optimization Algorithms |
| CV | Computer Vision |
| GBDT | Gradient Boosting Decision Tree |
| PCA | Principal Component Analysis |
| XAI | Explainable Artificial Intelligence |
| LDA | Latent Dirichlet Allocation |
| NER | Named Entity Recognition |
| BERT | Bidirectional Encoder Representations from Transformers |
| SHAP | Shapley Additive Explanations |
| TF-IDF | Term Frequency–Inverse Document Frequency |
| XGB Classifier | Extreme Gradient Boosting Classifier |
| SMOTE | Synthetic Minority Oversampling Technique |
References
1. Elseknidy, M.; Al-Mhdawi, M.; Qazi, A.; Ojiako, U.; Mahammedi, C.; Pour Rahimian, F. Developing a sustainability-driven risk management framework for green building projects: A literature review. J. Clean. Prod. 2025, 519, 145891.
2. Kalogiannidis, S.; Kalfas, D.; Papaevangelou, O.; Giannarakis, G.; Chatzitheodoridis, F. The role of artificial intelligence technology in predictive risk assessment for business continuity: A case study of Greece. Risks 2024, 12, 19.
3. Kumar, Y.; Marchena, J.; Awlla, A.H.; Li, J.J.; Abdalla, H.B. The AI-Powered evolution of big data. Appl. Sci. 2024, 14, 10176.
4. Shobanke, M.; Bhatt, M.; Shittu, E. Advancements and future outlook of Artificial Intelligence in energy and climate change modeling. Adv. Appl. Energy 2025, 17, 100211.
5. Hashimzai, I.A.; Mohammadi, M.Q. The Integration of Artificial Intelligence in Project Management: A Systematic Literature Review of emerging trends and challenges. TIERS Inf. Technol. J. 2024, 5, 153–164.
6. Tian, K.; Zhu, Z.; Mbachu, J.; Ghanbaripour, A.; Moorhead, M. Artificial intelligence in risk management within the realm of construction projects: A bibliometric analysis and systematic literature review. J. Innov. Knowl. 2025, 10, 100711.
7. Dubey, R.; Bryde, D.J.; Dwivedi, Y.K.; Graham, G.; Foropon, C. Impact of artificial intelligence-driven big data analytics culture on agility and resilience in humanitarian supply chain: A practice-based view. Int. J. Prod. Econ. 2022, 250, 108618.
8. Adebayo, Y.; Udoh, P.; Kamudyariwa, X.B.; Osobajo, O.A. Artificial Intelligence in Construction Project Management: A Structured literature review of its evolution in application and future trends. Digital 2025, 5, 26.
9. Kakoma, P.; Nyimbili, P.H.; Tembo, M.; Mwanaumo, E.M. A performance forecasting model for optimizing CDF-Funded construction projects in the Copperbelt Province, Zambia. J. Contemp. Urban Aff. 2025, 9, 290–309.
10. Al-Sinan, M.A.; Bubshait, A.A.; Aljaroudi, Z. Generation of Construction Scheduling through Machine Learning and BIM: A Blueprint. Buildings 2024, 14, 934.
11. Wieland-Jorna, Y.; van Kooten, D.; Verheij, R.A.; de Man, Y.; Francke, A.L.; Oosterveld-Vlug, M.G. Natural language processing systems for extracting information from electronic health records about activities of daily living. A systematic review. JAMIA Open 2024, 7, ooae044.
12. Kineber, A.F.; Elshaboury, N.; Oke, A.E.; Aliu, J.; Abunada, Z.; Alhusban, M. Revolutionizing Construction: A Cutting-Edge Decision-Making model for artificial intelligence implementation in sustainable building projects. Heliyon 2024, 10, e37078.
13. Ozdemir, S.; de Arroyabe, J.C.F.; Sena, V.; Gupta, S. Stakeholder diversity and collaborative innovation: Integrating the resource-based view with stakeholder theory. J. Bus. Res. 2023, 164, 113955.
14. Youssef, M.M.; Esaam, R. Revitalization approaches to maximize heritage urban DNA characteristics in declined cities: Foah City as a case study. J. Contemp. Urban Aff. 2023, 7, 56–72.
15. Chen, R.; Dai, T.; Zhang, Y.; Zhu, Y.; Liu, X.; Zhao, E. GBDT-IL: Incremental Learning of gradient boosting decision trees to detect botnets in Internet of Things. Sensors 2024, 24, 2083.
16. Sarker, I.H. AI-Based modeling: Techniques, applications and research issues towards automation, intelligent and smart systems. SN Comput. Sci. 2022, 3, 158.
17. Zakaria, M.; Lynda, D.; Ramdane, B. Bagging Ensemble Based on Multi-Layer Perceptron Neural Network for Landslide Susceptibility Assessment. In Proceedings of the 2023 International Conference on Earth Observation and Geo-Spatial Information (ICEOGI), Algiers, Algeria, 22–24 May 2023; IEEE: New York, NY, USA; pp. 1–6.
18. Bin Rashid, A.; Kausik, A.K. AI Revolutionizing Industries Worldwide: A comprehensive overview of its diverse applications. Hybrid Adv. 2024, 7, 100277.
19. Prasetyo, M.L.; Peranginangin, R.A.; Martinovic, N.; Ichsan, M.; Wicaksono, H. Artificial Intelligence in Open Innovation Project Management: A Systematic literature review on technologies, applications, and integration requirements. J. Open Innov. Technol. Mark. Complex. 2024, 11, 100445.
20. Dritsas, E.; Trigka, M. Exploring the intersection of machine learning and big Data: A survey. Mach. Learn. Knowl. Extr. 2025, 7, 13.
21. Jagannathan, M.; Roy, D.; Delhi, V.S.K. Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies. CSI Trans. ICT 2022, 10, 97–106.
22. Salimimoghadam, S.; Ghanbaripour, A.N.; Tumpa, R.J.; Rahimi, A.K.; Golmoradi, M.; Rashidian, S.; Skitmore, M. The rise of Artificial intelligence in project Management: A Systematic literature review of current opportunities, enablers, and barriers. Buildings 2025, 15, 1130.
23. Soori, M.; Jough, F.K.G.; Dastres, R.; Arezoo, B. AI-Based Decision Support Systems in Industry 4.0, a review. J. Econ. Technol. 2024, 4, 206–225.
24. Sadeghi, Z.; Alizadehsani, R.; Cifci, M.A.; Kausar, S.; Rehman, R.; Mahanta, P.; Bora, P.K.; Almasri, A.; Alkhawaldeh, R.S.; Hussain, S.; et al. A review of Explainable Artificial Intelligence in healthcare. Comput. Electr. Eng. 2024, 118, 109370.
25. El-Bouri, R.; Taylor, T.; Youssef, A.; Zhu, T.; Clifton, D.A. Machine learning in patient flow: A review. Prog. Biomed. Eng. 2021, 3, 022002.
26. Jada, I.; Mayayise, T.O. The impact of artificial intelligence on organisational cyber security: An outcome of a systematic literature review. Data Inf. Manag. 2023, 8, 100063.
27. Al-Amiery, A. The ethical implications of emerging AI technologies in healthcare. MedMat 2025, 2, 85–100.
28. Gao, N.; Touran, A.; Wang, Q.; Beauchamp, N. Construction risk identification using a multi-sentence context-aware method. Autom. Constr. 2024, 164, 105466.
29. Dikmen, I.; Eken, G.; Erol, H.; Birgonul, M.T. Automated construction contract analysis for risk and responsibility assessment using natural language processing and machine learning. Comput. Ind. 2025, 166, 104251.
30. Boamah, F.A.; Jin, X.; Senaratne, S.; Perera, S. AI-driven risk identification model for infrastructure project: Utilising past project data. Expert Syst. Appl. 2025, 283, 127891.
31. Supriyono; Wibawa, A.P.; Suyono; Kurniawan, F. Advancements in natural Language Processing: Implications, challenges, and future directions. Telemat. Inform. Rep. 2024, 16, 100173.
32. Eker, H. Natural Language Processing risk assessment application developed for marble quarries. Appl. Sci. 2024, 14, 9045.
33. Qiang, X.; Li, G.; Hou, J.; Fan, C. Research on Automatic Classification of mine safety Hazards using Pre-Trained Language Models. Electronics 2025, 14, 1001.
34. Kim, E.W.; Shin, Y.J.; Kim, K.J.; Kwon, S. Development of an automated construction contract review framework using large language model and domain knowledge. Buildings 2025, 15, 923.
35. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Scientific Research Publishing: Irvine, CA, USA, 2019; pp. 4171–4186. Available online: https://www.scirp.org/reference/referencespapers?referenceid=3984485 (accessed on 15 June 2025).
36. Jim, J.R.; Talukder, A.R.; Malakar, P.; Kabir, M.; Nur, K.; Mridha, M. Recent advancements and challenges of NLP-based sentiment analysis: A state-of-the-art review. Nat. Lang. Process. J. 2024, 6, 100059.
37. Feretzakis, G.; Vagena, E.; Kalodanis, K.; Peristera, P.; Kalles, D.; Anastasiou, A. GDPR and large language models: Technical and legal obstacles. Futur. Internet 2025, 17, 151.
38. Saki, S.; Soori, M. Artificial intelligence, machine learning and deep learning in advanced transportation systems, a review. Multimodal Transp. 2025, 5, 100242.
39. Agarwal, A.; Nene, M.J. A five-layer framework for AI governance: Integrating regulation, standards, and certification. Transform. Gov. People Process Policy 2025, 19, 535–555.
40. Xu, Y.; Reniers, G.; Yang, M. A multidisciplinary review into the evolution of risk concepts and their assessment methods. Processes 2024, 12, 2449.
41. Hassija, V.; Chamola, V.; Mahapatra, A.; Singal, A.; Goel, D.; Huang, K.; Scardapane, S.; Spinelli, I.; Mahmud, M.; Hussain, A. Interpreting Black-Box Models: A review on Explainable Artificial intelligence. Cogn. Comput. 2023, 16, 45–74.
42. Pantanowitz, L.; Hanna, M.; Pantanowitz, J.; Lennerz, J.; Henricks, W.H.; Shen, P.; Quinn, B.; Bennet, S.; Rashidi, H.H. Regulatory aspects of AI-ML. Mod. Pathol. 2024, 37, 100609.
43. Adamantiadou, D.S.; Tsironis, L. Leveraging Artificial intelligence in Project Management: A Systematic review of applications, challenges, and future directions. Computers 2025, 14, 66.
44. Nigar, M.; Juli, J.F.; Golder, U.; Alam, M.J.; Hossain, M.K. Artificial intelligence and technological unemployment: Understanding trends, technology’s adverse roles, and current mitigation guidelines. J. Open Innov. Technol. Mark. Complex. 2025, 11, 100607.
45. PMI. Risk Management in Portfolios, Programs, and Projects: A Practice Guide|Project Management Institute. 2021. Available online: https://www.pmi.org/standards/risk-management-in-portfolios (accessed on 20 June 2025).
46. Manning, C.D.; Schütze, H.; Raghavan, P. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008.
47. Martins, P.; Cardoso, F.; Váz, P.; Silva, J.; Abbasi, M. Performance and scalability of data cleaning and preprocessing tools: A benchmark on large Real-World datasets. Data 2025, 10, 68.
48. Jurafsky, D.; Martin, J.H. Speech and Language Processing, 3rd ed.; Pearson; Scientific Research Publishing: Irvine, CA, USA, 2023; Available online: https://www.scirp.org/reference/referencespapers?referenceid=3970727 (accessed on 22 June 2025).
49. Zhang, W.; Yoshida, T.; Tang, X. A comparative study of TF*IDF, LSI and multi-words for text classification. Expert Syst. Appl. 2010, 38, 2758–2765.
50. Jain, S.; Jain, S.K.; Vasal, S. An effective TF-IDF model to improve the text classification performance. In Proceedings of the 2024 IEEE 13th International Conference on Communication Systems and Network Technologies (CSNT), Jabalpur, India, 6–7 April 2024; pp. 1–4.
51. Liang, M.; Niu, T. Research on text classification techniques based on improved TF-IDF algorithm and LSTM inputs. Procedia Comput. Sci. 2022, 208, 460–470.
52. Xiang, L. Application of an improved TF-IDF method in literary text classification. Adv. Multimed. 2022, 2022, 9285324.
53. Thor, W.-M. GLOVE: Global Vectors for Word Representation. 2024. Available online: https://apxml.com/courses/nlp-fundamentals/chapter-4-nlp-word-embeddings/glove-word-representation (accessed on 30 July 2025).
54. Dvořáčková, L. Analyzing word embeddings and their impact on semantic similarity: Through extreme simulated conditions to real dataset characteristics. Neural Comput. Appl. 2025, 37, 13765–13793.
55. Stankevičius, L.; Lukoševičius, M. Extracting Sentence Embeddings from Pretrained Transformer Models. Appl. Sci. 2024, 14, 8887.
56. Chakraborty, S.; Dey, L. Multi-class classification. In Multi-Objective, Multi-Class and Multi-Label Data Classification with Class Imbalance; Springer: Berlin/Heidelberg, Germany, 2024; pp. 51–76.
57. Yazdi, M.; Zarei, E.; Adumene, S.; Beheshti, A. Navigating the Power of Artificial intelligence in Risk Management: A Comparative analysis. Safety 2024, 10, 42.
58. Rane, J.; Kaya, Ö.; Mallick, S.K.; Rane, N.L. Enhancing black-box models: Advances in explainable artificial intelligence for ethical decision-making. In Future Research Opportunities for Artificial Intelligence in Industry 4.0 and 5.0; Deep Science Publishing: San Francisco, CA, USA, 2024.
59. Coovadia, H.; Marx, B.; Botha, I.; Gold, N.O. Building an ethical artificial intelligence corporate governance framework for the integration of emerging technologies into business processes. S. Afr. J. Account. Res. 2025, 39, 286–316.
60. Burlea-Schiopoiu, A. Success Factors for an Information Systems Projects Team: Creating New Context, Innovation and Knowledge Management. In Twin Track Economies: Challenges & Solutions; Soliman, K.S., Ed.; IBIMA: London, UK, 2009; Volume 1–3, pp. 936–941.
61. Mourad, H.; Fahim, S.; Burlea-Schiopoiu, A.; Lahby, M.; Attioui, A. Modeling and Mathematical Analysis of Liquidity Risk Contagion in the Banking System. J. Appl. Math. 2022, 2022, 5382153.
62. Song, L.-K.; Tao, F.; Li, X.-Q.; Yang, L.-C.; Wei, Y.-P.; Beer, M. Physics-embedding multi-response regressor for time-variant system reliability assessment. Reliab. Eng. Syst. Saf. 2025, 263, 111262.

| Theme | Key Contributions | Identified Gaps |
|---|---|---|
| AI in project management | Proactive risk identification, predictive modeling, decision-support | Limited integration, high implementation cost |
| Sector-specific applications | Proven effectiveness in cost, quality, and safety risk reduction | Limited generalizability across sectors |
| NLP for risk detection | Effective for text-based classification and contract analysis | Sector-specific validation, lack of standardized datasets |
| Governance and ethics | Emerging frameworks for AI oversight | Limited operational implementation |

| Column Name | Description | Data Type | Example Entry |
|---|---|---|---|
| Risk Description | A textual description of a specific risk scenario in the project. | String | “Frequent changes in project requirements due to unclear client communication.” |
| Risk Category | High-level classification of the risk into one of five categories: Communication, Cost, Quality, Schedule, or Scope. | Categorical | “Communication.” |
| Trigger | Specific cause of the risk, classified into 28 categories such as “Budget constraints,” “Inadequate resource planning,” or “Vendor reliability issues.” | Categorical | “Budget constraints.” |
| Impact | The consequence of the risk, classified into 15 categories, e.g., “Budget overrun,” “Stakeholder dissatisfaction,” or “Delay.” | Categorical | “Budget overrun.” |

| Algorithm | Purpose and Justification |
|---|---|
| Logistic Regression (LR) | Linear classification baseline; well suited to sparse, high-dimensional data. |
| Random Forest (RF) | Ensemble algorithm that handles class imbalance and non-linear decision boundaries. |
| Multinomial Naive Bayes (MNB) | Probabilistic model that works well with TF-IDF features. |
| Support Vector Machine (SVM) | Classifier suited to high-dimensional data; robust against overfitting with kernel methods. |
| XGBoost (XGB) | Ensemble based on Gradient Boosting; chosen for its high performance in multi-class tasks. |

| Library | Purpose |
|---|---|
| Pandas | Loading, cleaning, and manipulation of tabular data. |
| Sklearn | Feature extraction, model training, and evaluation. |
| Matplotlib and Seaborn | Data visualization and confusion matrices. |
| XGBoost | Training of Gradient Boosting models. |
| SHAP | Model explanation with SHAP plots. |
| NumPy | Array operations and numerical calculations. |
| Gensim | Loading of pre-trained GloVe embeddings. |

| Feature Extraction Method | AI Algorithm for Classification | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|---|
| TF-IDF | Logistic Regression | 0.61 | 0.61 | 0.61 |
| TF-IDF | SVM | 0.61 | 0.61 | 0.61 |
| TF-IDF | Random Forest | 0.61 | 0.61 | 0.60 |
| TF-IDF | XGB Classifier | 0.68 | 0.61 | 0.63 |
| TF-IDF | Multinomial Naive Bayes | 0.61 | 0.56 | 0.57 |
| Word Embeddings GloVe100 | Logistic Regression | 0.56 | 0.56 | 0.56 |
| Word Embeddings GloVe100 | SVM | 0.60 | 0.59 | 0.59 |
| Word Embeddings GloVe100 | Random Forest | 0.47 | 0.48 | 0.47 |
| Word Embeddings GloVe100 | XGB Classifier | 0.60 | 0.60 | 0.60 |

| Transformer Model | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|
| BERT | 0.66 | 0.59 | 0.60 |
| DistilBERT | 0.64 | 0.61 | 0.61 |
| Sentence-BERT | 0.63 | 0.60 | 0.60 |

| Feature Extraction Method | AI Algorithm for Classification | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|---|
| TF-IDF | Logistic Regression | 0.75 | 0.75 | 0.75 |
| TF-IDF | SVM | 0.75 | 0.75 | 0.75 |
| TF-IDF | Random Forest | 0.75 | 0.75 | 0.74 |
| TF-IDF | XGB Classifier | 0.73 | 0.74 | 0.73 |
| TF-IDF | Multinomial Naive Bayes | 0.74 | 0.75 | 0.74 |
| Word Embeddings GloVe100 | Logistic Regression | 0.75 | 0.75 | 0.75 |
| Word Embeddings GloVe100 | SVM | 0.75 | 0.75 | 0.75 |
| Word Embeddings GloVe100 | Random Forest | 0.75 | 0.75 | 0.75 |
| Word Embeddings GloVe100 | XGB Classifier | 0.74 | 0.73 | 0.73 |

| Transformer Model | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|
| BERT | 0.73 | 0.76 | 0.73 |
| DistilBERT | 0.73 | 0.75 | 0.73 |
| Sentence-BERT | 0.72 | 0.75 | 0.72 |

| Class Name | Weight Value |
|---|---|
| Ambiguous client requirements | 0.7619047619047619 |
| Ambiguous objectives | 1.465201465201465 |
| Budget constraints | 1.5037593984962405 |
| Communication breakdowns | 0.5952380952380952 |
| Complex stakeholder landscape | 1.5873015873015872 |
| Cultural misalignment | 1.465201465201465 |
| Delayed regulatory approvals | 0.5952380952380952 |
| Frequent change requests | 0.49261083743842365 |
| Improper risk assessment | 1.5037593984962405 |
| Inadequate resource planning | 0.4801920768307323 |
| Inadequate training | 1.0781671159029649 |
| Incomplete requirements | 1.9047619047619047 |
| Inconsistent documentation | 1.465201465201465 |
| Inexperienced contractor team | 0.5494505494505495 |
| Inexperienced team members | 1.1904761904761905 |
| Lack of testing procedures | 1.6326530612244898 |
| Lack of version control | 1.5873015873015872 |
| Last-minute changes | 1.5444015444015444 |
| Outdated tools | 1.3937282229965158 |
| Poor stakeholder engagement | 1.4285714285714286 |
| Resource unavailability | 1.2698412698412698 |
| Supplier delay | 1.3605442176870748 |
| Technical design flaws | 0.60790273556231 |
| Unaligned KPIs | 1.21580547112462 |
| Unclear specifications | 1.1661807580174928 |
| Unrealistic deadlines | 1.7316017316017316 |
| Unverified assumptions | 1.6326530612244898 |
| Vendor reliability issues | 0.5714285714285714 |
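
The weights in the table above are consistent with the common inverse-frequency ("balanced") heuristic, in which the weight of class c is N / (K · n_c), where N is the number of training records, K the number of classes, and n_c the number of records in class c. This is an inference from the values, not something the source states: for example, the weight listed for "Incomplete requirements" equals 40/21, which the formula yields for N = 1600, K = 28, and n_c = 30, but those counts are assumptions. The sketch below implements the heuristic on hypothetical toy labels; the function name `balanced_class_weights` is introduced here for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Hypothetical toy labels: 6 records of class A, 3 of B, 1 of C.
toy_labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
weights = balanced_class_weights(toy_labels)

# Assumed counts (not reported in the source) that reproduce one tabulated weight.
check = 1600 / (28 * 30)  # approximately 1.9048
```

Rare classes receive weights above 1 and frequent classes weights below 1, so errors on under-represented triggers are penalized more heavily during training.
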

| Model Name | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|
| Logistic Regression with TF-IDF | 0.75 | 0.75 | 0.75 |
| SVM with TF-IDF | 0.75 | 0.75 | 0.75 |

| Feature Extraction Method | AI Algorithm for Classification | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|---|
| TF-IDF | Logistic Regression | 0.59 | 0.59 | 0.59 |
| TF-IDF | SVM | 0.59 | 0.59 | 0.59 |
| TF-IDF | Random Forest | 0.60 | 0.60 | 0.59 |
| TF-IDF | XGB Classifier | 0.60 | 0.59 | 0.59 |
| TF-IDF | Multinomial Naive Bayes | 0.61 | 0.60 | 0.60 |
| Word Embeddings GloVe100 | Logistic Regression | 0.61 | 0.61 | 0.61 |
| Word Embeddings GloVe100 | SVM | 0.63 | 0.60 | 0.60 |
| Word Embeddings GloVe100 | Random Forest | 0.59 | 0.59 | 0.59 |
| Word Embeddings GloVe100 | XGB Classifier | 0.59 | 0.59 | 0.59 |

| Transformer Model | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|
| BERT | 0.56 | 0.60 | 0.56 |
| DistilBERT | 0.56 | 0.60 | 0.57 |
| Sentence-BERT | 0.54 | 0.60 | 0.55 |

| Category | Macro-Average Precision | Macro-Average Recall | Macro-Average F1-Score |
|---|---|---|---|
| Risk Category | 0.77 | 0.62 | 0.63 |
| Trigger | 0.74 | 0.76 | 0.74 |
| Impact | 0.60 | 0.61 | 0.59 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ferhati, K.; Burlea-Schiopoiu, A.; Nascu, A.-G. A Text-Based Project Risk Classification System Using Multi-Model AI: Comparing SVM, Logistic Regression, Random Forests, Naive Bayes, and XGBoost. Systems 2025, 13, 1078. https://doi.org/10.3390/systems13121078