1. Introduction
Project management researchers and practitioners traditionally select study designs—such as regression analysis, time-series forecasting, case studies, or surveys—based on personal expertise and intuition. While this subjective approach leverages domain experience, it can also yield inconsistent methodology choices, reduce the reproducibility of findings, and hinder cross-study comparability. Despite growing calls for evidence-based decision support in research design, no systematic framework currently exists to guide methodology selection in project management research.
Over the past decade, machine learning (ML) has demonstrated remarkable success in operational project management tasks, from forecasting cost and schedule deviations to classifying risk factors and optimising resource allocation [1,2]. These studies illustrate ML’s ability to extract actionable insights from large, dynamic datasets, enabling more accurate risk assessments and decision making in complex project environments. Yet, ML’s potential for “meta-research”—recommending the most suitable research method for a given project scenario—remains untapped.
Concurrently, the project management literature has long recognised the value of rigorous quantitative, qualitative, and mixed research methodologies for advancing theory and practice [3]. Standard techniques include regression variants (linear, logistic, Poisson, panel data), time-series analysis, experimental designs, case studies, grounded theory, and ethnography. Each methodology offers unique strengths—for example, regression models illuminate variable interactions, while case studies capture contextual depth—but choosing among them often rests on subjective judgment.
To address this gap, we propose a data-driven ML framework that objectively recommends research methods tailored to specific project management use cases. Drawing on insights from the systematic literature review, we compiled a curated dataset of 156 instances extracted from over 100 peer-reviewed articles [4]. Each instance is annotated with one of five application domains—cost estimation, performance analysis, risk assessment, prediction, and comparison—and labelled by the research method employed.
The six main objectives of this paper are the following (see Figure 1):
- (1) Data Extraction: Retrieve detailed information on research methods and their applications from academic journals.
- (2) Dataset Construction: Assemble a tabular dataset mapping project management use cases to research methodologies.
- (3) Preprocessing: Clean, encode, and prepare the dataset for ML modelling.
- (4) Model Development: Train and validate three classifiers (random forest, K-nearest neighbours, support vector machine) to predict suitable research methods.
- (5) Evaluation: Assess model generalisability and performance using accuracy, precision, recall, ROC-AUC, and other relevant metrics.
- (6) Model Selection: Identify the most effective classifier for each project management application.
Figure 1. Research method prediction process.
Our framework aims to enhance objectivity, reproducibility, and scalability in project management research design by automating methodology selection. The following sections detail the literature foundations, dataset development, modelling techniques, experimental results, and practical implications of integrating predictive analytics into research planning.
2. Literature Review
2.1. Background
Recent research has demonstrated that machine learning (ML) can significantly enhance decision support in project management by uncovering complex patterns in historical data. For instance, previous research applied ensemble methods to predict risk occurrences in construction projects, achieving accuracy gains over traditional statistical techniques [1]. A systematic review of AI-enabled project management showed that supervised classifiers effectively identify critical success factors and highlighted ML’s role in automating routine tasks and augmenting managerial decision making [4]. At the same time, another study developed a data-driven analytics framework for resource allocation and performance monitoring [5]. Finally, a further study surveyed ML applications in agile project settings, underscoring the breadth of emerging use cases [6].
Despite these advances in operational domains, the application of ML to research methodology selection in project management remains unaddressed. Although the methodological literature underscores the importance of rigorous design choice—quantitative methods (e.g., various regression techniques), qualitative approaches (e.g., case studies, grounded theory), and mixed methods—for ensuring validity and comparability [3], scholars still rely on heuristic or rule-based guidance [7]. No existing framework learns from past research to recommend methods for new project scenarios, leaving a critical gap in evidence-based meta-research support.
2.2. Project Management and Machine Learning
Project management is the discipline of planning, organising, and overseeing resources to achieve specific goals within constraints of time, cost, and scope [8,9]. Key activities include scheduling, resource allocation, risk management, and performance monitoring, all of which generate structured and unstructured data amenable to ML analysis.
Machine learning—a branch of artificial intelligence—enables systems to learn from data and make predictions or decisions without explicit programming [10,11]. Standard techniques include neural networks, support vector machines, decision trees, and natural language processing.
Project management and machine learning have increasingly intersected, transforming how projects are planned, executed, and controlled. Machine learning now plays a crucial, complementary role in project management, especially in handling complexity, uncertainty, and large data volumes. At the same time, demands have shifted toward more data-driven, automated, and explainable solutions.
Over the past decades, project management has transformed from a predominantly manual, expertise-driven discipline to a data-centric, automated, and transparent practice by integrating machine learning techniques (see Table 1). In the pre-ML era, teams depended on expert judgment, static planning tools, and manual risk assessments, often resulting in inefficiencies and elevated failure rates owing to limited analytical capabilities [8].
The advent of machine learning introduced predictive analytics and pattern recognition functionalities, which enabled more accurate forecasts of schedules and costs, automated decision support for risk modelling, and optimised resource allocation [12].
Consequently, project managers gained the ability to anticipate and mitigate risks more effectively, thereby reducing overruns and improving overall resource utilisation. More recently, the focus has shifted toward explainability and transparency, with methods such as SHAP values providing insights into the factors driving model predictions and supporting forward-looking and retrospective analyses. This explainable paradigm not only fosters trust in automated systems but also empowers managers to validate or override algorithmic recommendations when necessary [13].
In project management, ML is increasingly used to:
- Improve Forecasting: Enhance predictions of timelines, costs, and risks, reducing overruns and delays [8,14,15].
- Optimise Resources: Automate routine tasks such as team assignments and workload balancing, increasing efficiency and freeing managers for strategic activities [16].
- Assess Risks: Identify and prioritise potential failures early in the project lifecycle, especially in complex or uncertain environments [8,9].
- Support Decisions: Derive actionable insights from large, multimodal datasets—combining numerical, textual, and sensor data—to inform faster, data-driven choices [4,10].
- Enhance Explainability: Leverage techniques like SHAP values to interpret model outputs, helping stakeholders understand driving factors behind project outcomes [13,17,18].
Collectively, these developments illustrate how machine learning has assumed complementary roles in data-driven decision making, risk and resource management, and the automation of routine tasks, thereby freeing managers to concentrate on strategic activities (see Table 2). The overarching result of this evolutionary trajectory is a project management paradigm that is not only more efficient and adaptive but also more accountable and transparent, meeting the complex data demands of contemporary endeavours.
2.3. Shallow Models and Evaluation Metrics
Among ML algorithms, shallow models such as support vector machines (SVM), random forests (RF), and K-nearest neighbours (KNN) strike a balance between interpretability and performance on the tabular datasets that are standard in project management research. One study demonstrated SVM’s efficacy in high-dimensional spaces [19], while another reported RF’s robustness against overfitting in load-forecasting tasks [20]. KNN, despite its increased memory demands, remains useful where similarity-based reasoning is critical [21]. However, these models depend heavily on training data quality and may struggle with non-linear or sparse patterns [22].
Researchers report accuracy, precision, and recall to quantify classifier performance in project contexts [23]. These metrics offer intuitive measures of a model’s correctness, relevance, and completeness—qualities essential for a tool that recommends appropriate research methodologies [24,25]. In their standard definitions, TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
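Using this notation, the standard confusion-matrix definitions are as follows (in our seven-class setting, the reported scores are macro-averages over classes):

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}. \]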
While ML has revolutionised many facets of project execution and several rule-based tools assist in research-design guidance, no prior work has leveraged ML to recommend research methodologies tailored to specific project management scenarios. Our study fills this gap by training SVM, RF, and KNN classifiers on a curated dataset of 156 instances from over 100 peer-reviewed articles, offering the first adaptive, data-driven framework for methodology selection in project management research.
3. Materials and Methods
3.1. Materials
To build a robust, literature-derived dataset, we systematically searched Scopus and Web of Science (January 2015–December 2023). Search terms included combinations of “project management,” “research method,” “machine learning,” “case study,” “survey,” and “regression.” We applied the following inclusion criteria:
- Peer-reviewed journal articles in English.
- Explicit description of the project management application (e.g., cost estimation, risk assessment) and the research method employed.
- Accessibility of the full text.
After removing duplicates and screening titles/abstracts, 128 articles remained. A full-text review yielded 102 articles that met all criteria.
From each selected article, two independent researchers extracted:
- Application domain (one of five: cost estimation, performance analysis, risk assessment, prediction, comparison).
- Brief textual description of the use case (e.g., “forecasting budget overruns using historical cost data”).
- Research method label (one of seven: regression analysis, time-series analysis, case study, ethnographic research, experimental design, grounded theory, interview).
Discrepancies (≈8% of instances) were reconciled through discussion; Cohen’s κ = 0.87 indicated high inter-annotator reliability. A third researcher adjudicated the remaining conflicts.
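As an illustration, agreement of this kind can be computed directly from the two annotators’ label vectors with scikit-learn; the snippet below is a minimal sketch in which the lists annotator_a and annotator_b are hypothetical stand-ins for the actual annotation records.

from sklearn.metrics import cohen_kappa_score

# Hypothetical method labels assigned independently by the two annotators
annotator_a = ["regression analysis", "case study", "time-series analysis", "interview"]
annotator_b = ["regression analysis", "case study", "time-series analysis", "grounded theory"]

# Cohen's kappa corrects raw agreement for the agreement expected by chance
kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")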
The final tabular dataset comprises 156 examples. The input features are the application domain (one of the five categories above) and the free-text description of the use case.
As shown in Figure 2, we used a tabular dataset (e.g., a CSV file or spreadsheet) because this format is the foundation for training shallow machine learning models in Python 3.10. Its basic structure, in which columns represent features and rows represent data points, aligns precisely with the expectations of models such as decision trees and linear regression [26].
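A minimal sketch of loading such a corpus with pandas is shown below; the file name and the column names (domain, description, method) are illustrative assumptions rather than the exact schema of our repository.

import pandas as pd

# Hypothetical CSV with one row per literature-derived instance
df = pd.read_csv("pm_research_methods.csv")  # assumed columns: domain, description, method

print(df.shape)  # expected: (156, 3)
print(df["method"].value_counts(normalize=True))  # class shares, cf. Table 3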
The target label is the research method (seven classes). As shown in Table 3, regression analysis emerged as the most frequently employed research method, accounting for 34 out of 156 studies (21.8%). Time-series analysis followed closely at 17.9%, while case studies comprised 15.4%. Regression, time-series analysis, and experimental designs (a further 11.5%) represent the dominant quantitative traditions in our corpus, together accounting for 51.2% of all studies (Table 3).
Conversely, qualitative methodologies are also well represented: case studies account for 15.4% of studies, ethnographic research for 12.8%, and grounded theory and interview-based investigations for 10.3% each. Collectively, these methods comprise 48.8% of the total, indicating a near balance between quantitative and qualitative orientations in the field. This methodological diversity underscores the multifaceted nature of project management research; quantitative techniques offer robust statistical and forecasting capabilities, while qualitative and mixed-methods approaches contribute rich contextual insights into organisational and human factors.
The relatively even distribution across seven distinct classes suggests that our framework must be capable of accommodating both numerical time-series and text-driven qualitative inputs. Moreover, the prominence of regression and time-series techniques highlights the importance of integrating advanced statistical modules within the framework’s workflow, whereas the substantial share of case studies and interviews indicates the need for flexible natural language processing of textual descriptions. Together, these findings guide the prioritisation of features in our design science implementation.
3.2. Methods
In our implementation, we first transformed the annotated dataset into a machine-readable form by encoding textual and categorical features. The free-text “Description” of each use case was vectorised using the TF-IDF approach (scikit-learn’s TfidfVectorizer, limited to the top 500 terms), while the five-category “Domain” field was converted into dummy variables via one-hot encoding. Together, these steps produced a feature matrix X suitable for input to standard classifiers. The target vector y, representing the seven research-method classes, was generated with scikit-learn’s LabelEncoder, mapping each method to a unique integer label [27].
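A condensed sketch of this encoding step, reusing the hypothetical dataframe columns introduced above, is:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from scipy.sparse import hstack
import pandas as pd

# TF-IDF over the free-text use-case descriptions, limited to the top 500 terms
tfidf = TfidfVectorizer(max_features=500)
X_text = tfidf.fit_transform(df["description"])

# One-hot (dummy) encoding of the five-category application domain
X_domain = pd.get_dummies(df["domain"])

# Combine the sparse text features with the dummy-coded domain columns
X = hstack([X_text, X_domain.values.astype(float)], format="csr")

# Integer-encode the seven research-method classes
le = LabelEncoder()
y = le.fit_transform(df["method"])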
To address class imbalance, we activated the class_weight = “balanced” parameter in both random forest and support vector machine classifiers, thereby adjusting the loss function to impose a greater penalty on misclassification of underrepresented labels. This weighting scheme promotes equitable performance across all seven methodology classes.
To assess model generalisability, we used stratified five-fold cross-validation (StratifiedKFold(n_splits = 5, shuffle = True, random_state = 42)) to preserve class proportions in each fold. Within this framework, we evaluated three shallow classifiers—support vector machine (sklearn.svm.SVC), random forest (sklearn.ensemble.RandomForestClassifier), and K-nearest neighbours (sklearn.neighbors.KNeighborsClassifier)—selected for their complementary strengths in handling tabular data. Hyperparameters were optimised via grid search (GridSearchCV with cv = 5) over carefully chosen parameter grids: for the SVM, regularisation parameter C ∈ {0.1, 1, 10}, kernel ∈ {linear, rbf}, and γ ∈ {scale, auto}; for the random forest, number of estimators ∈ {100, 200, 500}, maximum tree depth ∈ {None, 10, 20}, and minimum samples to split ∈ {2, 5}; and for KNN, number of neighbours ∈ {3, 5, 7}, weight scheme ∈ {uniform, distance}, and distance metric ∈ {euclidean, manhattan}. Throughout, random seeds were fixed (random_state = 42) to ensure reproducibility [28].
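A sketch of this tuning setup for the random forest is given below; the SVM and KNN grids follow the same pattern, and the macro-F1 scoring criterion shown here is an illustrative assumption rather than a detail stated above.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, GridSearchCV

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

param_grid = {
    "n_estimators": [100, 200, 500],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5],
}

# class_weight="balanced" penalises misclassification of underrepresented methods more heavily
rf = RandomForestClassifier(class_weight="balanced", random_state=42)
search = GridSearchCV(rf, param_grid, cv=cv, scoring="f1_macro", n_jobs=-1)
search.fit(X, y)

print(search.best_params_, search.best_score_)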
To further assess model generalisability beyond internal cross-validation, we partitioned 20% of the dataset as an independent test set, withheld during training and hyperparameter tuning. Evaluation on this held-out subset produced performance metrics—accuracy, F1-score, and ROC-AUC—that closely matched those obtained via stratified five-fold cross-validation. The concordance between internal and external evaluations substantiates the random forest model’s robustness and mitigates optimism bias, providing a more realistic performance estimate on unseen project descriptions. Future work will extend validation to novel, user-defined case studies from diverse domains.
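The independent split can be reproduced with a stratified partition, for example:

from sklearn.model_selection import train_test_split

# 80/20 split; stratify=y preserves the seven-class proportions in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)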
Model performance was quantified using a suite of classification metrics—overall accuracy, macro-averaged precision, recall, F1-score (computed via precision_recall_fscore_support), and one-versus-rest ROC-AUC (using roc_auc_score). Confusion matrices (confusion_matrix) were also examined to identify systematic misclassification patterns across the seven method categories. We deliberately omitted regression-only metrics, such as MSE and MAE, focusing instead on metrics that reflect multiclass discriminative performance. All experiments were conducted in Python 3.10 with scikit-learn 1.3.0, pandas 1.5.3, and NumPy 1.24.2. The complete codebase, including data-processing and modelling scripts, will be publicly available upon publication to facilitate full reproducibility.
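A sketch of the held-out evaluation, refitting the best configuration from the grid search above on the training partition only, is:

from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             roc_auc_score, confusion_matrix)

best_model = search.best_estimator_
best_model.fit(X_train, y_train)  # refit without the held-out 20%
y_pred = best_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)

# One-versus-rest ROC-AUC needs per-class probability estimates
y_proba = best_model.predict_proba(X_test)
auc = roc_auc_score(y_test, y_proba, multi_class="ovr", average="macro")

cm = confusion_matrix(y_test, y_pred)
print(accuracy, precision, recall, f1, auc)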
4. Results
The following section presents the empirical outcomes of our classification experiments, beginning with an analysis of the importance of features in elucidating model interpretability and then comparative performance metrics for random forest, SVM, and KNN classifiers. We report stratified five-fold cross-validation results—accuracy, macro-averaged precision, recall, F1-score, and ROC-AUC—alongside independent test set evaluations and execution time measurements to demonstrate predictive robustness and computational efficiency.
To enhance interpretability, Figure 3 presents a bar chart of the top ten most influential features driving the model’s recommendations. This visualisation demonstrates that terms such as “cost estimation,” “risk assessment,” and “time series”, which correspond to quantitative methodologies like regression and time-series analysis, carry the most significant predictive weight. Concurrently, qualitative indicators, such as “survey,” “interview,” and “case study”, rank prominently, indicating the model’s capacity to capture both numerical and contextual cues. This analysis bolsters confidence in the system’s decision logic by directly linking these feature importances to familiar methodological descriptors.
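A figure of this kind can be reproduced from the fitted random forest’s impurity-based importances; the sketch below assumes the tfidf vectoriser, domain dummies, and best_model defined in the earlier snippets.

import numpy as np
import matplotlib.pyplot as plt

feature_names = np.array(list(tfidf.get_feature_names_out()) + list(X_domain.columns))
importances = best_model.feature_importances_

top = np.argsort(importances)[-10:]  # ten most influential features
plt.barh(feature_names[top], importances[top])
plt.xlabel("Feature importance")
plt.tight_layout()
plt.show()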
Three machine learning models, namely RF, KNN, and SVM, were trained and evaluated on the constructed dataset.
The RF model achieved an accuracy of 93.75%, indicating that approximately 93.75% of the instances were correctly classified (see Table 4). Precision was 92.11%, suggesting that when the RF model predicted a particular class, it was correct around 92.11% of the time. The recall score was 94.74%, indicating that the RF model correctly identified around 94.74% of the instances belonging to a particular class.
The mean squared difference between the actual and predicted values was 73.03, and the mean absolute difference was 6.84. The RF model achieved the highest accuracy on four of the five data portions (with a mean of 90%), indicating the model’s generalisability across different data subsets (see Figure 4). The ROC-AUC value was 0.938, indicating the good discriminative ability of the model (see Figure 5).
The KNN model achieved an accuracy of 84.38%. Precision was 86.81%, meaning that when the KNN model predicted a particular class, it was correct around 86.81% of the time. The recall score was 83.33%, indicating that the KNN model correctly identified around 83.33% of the instances belonging to a specific class (see Table 4).
KNN was the weakest of the three classifiers on the constructed dataset, suggesting that its instance-based approach adapted less well to this feature space (see Figure 6). The AUC value was 0.882, suggesting the model’s good discriminative ability (see Figure 7).
The SVM model also achieved an accuracy of 84.38%. Precision was 97.29%, meaning that when the SVM model predicted a particular class, it was correct around 97.29% of the time. The recall score was 85.29%, indicating that the SVM model correctly identified around 85.29% of the instances belonging to a specific class.
The SVM model had the highest accuracy on the fifth data portion (around 83.5%), indicating the variable behaviour of the model across diverse subsets (see Figure 8). The AUC value was 0.924, indicating the good discriminative ability of the model (see Figure 9).
Table 4 summarises the cross-validated performance of the three classifiers on our multi-class research method prediction task. We report mean accuracy ± standard deviation across five stratified folds, along with macro-averaged precision, recall, F1-score, and ROC-AUC.
The random forest classifier achieved the highest accuracy (93.8 ± 1.9%) and macro-F1 (0.93), indicating robust performance across all seven method categories. Its ROC-AUC of 0.94 further demonstrates strong discriminative ability between classes (Figure 4). In contrast, SVM and KNN attained moderate accuracy (~84.4%) and F1-scores (0.84 and 0.83, respectively), with the SVM’s higher ROC-AUC (0.92) reflecting a superior margin-based separation of classes compared to KNN’s similarity-based approach (ROC-AUC 0.88) (Figure 7 and Figure 9).
Table 5 presents a detailed breakdown of the model’s performance across each of the seven research method classes. The random forest classifier exhibits its strongest results on regression analysis, achieving a precision of 0.95, a recall of 0.98, and an F1-score of 0.96, reflecting high accuracy and excellent coverage for this dominant category. Time-series analysis and experimental design also demonstrate robust performance (F1-scores of 0.90 and 0.89, respectively). In contrast, qualitative methods, such as ethnographic research and grounded theory, yield slightly lower F1-scores (0.82 and 0.81), indicating that minority classes remain more challenging. Overall, the consistency between precision and recall values suggests balanced performance, with no class exhibiting disproportionate bias toward false positives or false negatives.
The detailed confusion matrix (Figure 10) reveals that random forest uniformly classified standard methods (e.g., regression analysis, time-series analysis) with F1-scores above 0.95. At the same time, minority classes, such as ethnographic research and grounded theory, saw slightly lower F1-scores (above 0.80), reflecting fewer training examples. SVM exhibited exceptionally high precision (0.97) for regression analysis—its dominant class—at the expense of recall (0.85), suggesting that its decision boundary favoured conservative predictions for well-represented methods. KNN’s performance dipped on minority classes, likely due to the high-dimensional TF-IDF feature space reducing neighbour relevance.
To contextualise classifier performance, we evaluated two baselines alongside our primary models (Table 6). A majority-class predictor that always selects regression analysis yielded an expected accuracy of 21.8%, underscoring the inherent difficulty of the seven-way classification task. A logistic regression model, trained with balanced class weights, improved accuracy to 72.4%, indicating that linear decision boundaries capture substantial, but not all, feature complexity. In contrast, the random forest ensemble achieved 93.8% accuracy, demonstrating that non-linear, tree-based methods offer markedly superior discriminative power in mapping project descriptions to appropriate research methodologies.
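These baselines correspond, for example, to scikit-learn’s DummyClassifier with the most_frequent strategy and a class-weighted logistic regression; the sketch below reuses the feature matrix and cross-validation splitter from Section 3.

from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

majority = DummyClassifier(strategy="most_frequent")
logreg = LogisticRegression(class_weight="balanced", max_iter=1000)

for name, model in [("majority class", majority), ("balanced logistic regression", logreg)]:
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(name, round(scores.mean(), 3))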
Execution time measurements (Table 7), averaged over ten runs on an Intel Core i7-9750H with 16 GB RAM (ASUS VivoBook K571 with an NVIDIA GeForce GTX GPU, Timișoara, Romania), indicate that random forest required longer training and prediction time (≈13.8 s) than SVM (≈6.0 s) and KNN (≈6.0 s), reflecting its ensemble of 200 trees versus the single-model structures of SVM and KNN.
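Wall-clock timings of this kind can be collected with Python’s perf_counter; the helper below is a minimal sketch that averages the combined training and prediction time over repeated runs for any scikit-learn estimator.

import time

def timed_fit_predict(model, X_train, y_train, X_test, n_runs=10):
    # Average combined training and prediction time over n_runs repetitions
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        model.fit(X_train, y_train)
        model.predict(X_test)
        durations.append(time.perf_counter() - start)
    return sum(durations) / n_runs

print(timed_fit_predict(best_model, X_train, y_train, X_test))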
5. Discussion
The empirical results demonstrate that an ensemble-based random forest classifier can accurately map project descriptions to appropriate research methodologies, achieving a mean accuracy of 93.8% and an F1-score of 0.93. This performance confirms the value of aggregating heterogeneous decision trees to capture quantitative and qualitative features derived from TF-IDF representations and domain encodings. In contrast, margin-based SVM and instance-based KNN models yielded moderate accuracy (~84%), underscoring the superiority of non-linear ensemble methods in this context.
This study set out to achieve the following six interrelated objectives: (1) to extract relevant methodological information from the literature, (2) to construct a representative, annotated dataset, (3) to preprocess and engineer features suitable for machine learning, (4) to develop and tune multiple classifiers, (5) to rigorously evaluate their performance, and (6) to identify the optimal model for project management methodology recommendation. The empirical findings substantiate each of these aims and reveal avenues for further enhancement and practical deployment.
First, our systematic literature extraction protocol successfully retrieved detailed descriptions of research methods and their applications across five project management domains. High inter-annotator agreement (Cohen’s κ = 0.87) confirms the reliability of our annotation process, thereby satisfying Objective 1 and laying a robust foundation for downstream modelling.
Second, we assembled a tabular corpus of 156 instances mapping project scenarios to seven distinct methodological classes. This dataset, although modest in scale, provides sufficient coverage of both quantitative and qualitative approaches—addressing Objective 2—and enables balanced evaluation across diverse use cases.
Third, in pursuing Objective 3, we applied a dual-feature strategy: TF-IDF representations captured the semantic nuances of textual descriptions (e.g., “longitudinal trend analysis” correlating with time-series methods), while one-hot encoding of application domains contextualised project attributes. This hybrid feature set empowered classifiers to leverage complementary signals, thereby supporting robust generalisation.
Fourth, in line with Objective 4, we developed three classifiers—a random forest, a support vector machine, and a K-nearest neighbours model—and performed nested stratified cross-validation to optimise hyperparameters. The random forest model’s ensemble structure justified its computational overhead by delivering superior predictive power, whereas SVM’s margin-based decisions and KNN’s instance-based reasoning highlighted the trade-offs inherent in alternative learning paradigms.
Fifth, fulfilling Objective 5, we conducted comprehensive performance assessments. Stratified five-fold cross-validation and an independent 20% held-out test set both demonstrated the random forest classifier’s exceptional accuracy (≈93.8%) and macro-F1 (0.93), whereas SVM and KNN attained moderate scores (≈84%). Class-wise metrics (Table 5) and the confusion matrix (Figure 10) illuminated systematic patterns; dominant classes, such as regression analysis, achieved F1-scores above 0.95, while minority categories (e.g., ethnographic research) remained more challenging (F1 ≈ 0.82). Benchmark comparisons against a majority-class predictor (21.8% accuracy) and a balanced logistic regression (72.4%) contextualised these gains and underscored the necessity of non-linear, ensemble approaches.
Sixth, concerning Objective 6, the random forest classifier emerges as the most effective algorithm for automated method selection, balancing high discriminative performance (ROC-AUC = 0.94) with interpretability through feature importance measures.
Figure 3’s top ten feature analysis revealed that quantitative cues (“cost estimation,” “risk assessment,” “time series”) and qualitative indicators (“survey,” “interview,” “case study”) jointly inform recommendations, thereby justifying methodological choices in terms accessible to end users and fostering trust in the system’s decision logic.
Beyond validating these objectives, our discussion highlights critical considerations for future work. Model explainability is paramount; exposing feature importance at the instance level will enable researchers to trace specific project traits that prompt particular recommendations, thereby enhancing user confidence and facilitating informed overrides. Moreover, scaling the dataset through semi-automated article mining and richly annotated schemas will expand application coverage, preserve contextual richness, and mitigate minority-class underrepresentation. Finally, the integration of the recommendation engine into project management dashboards—complete with interactive visualisations—offers a pathway to real-time, context-aware methodological guidance, promising to elevate consistency, transparency, and efficiency in research design.
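As an illustration of such instance-level explanations, SHAP’s TreeExplainer can attribute an individual recommendation to specific TF-IDF and domain features; the sketch below assumes the shap package, the fitted best_model, and the feature_names array from the earlier snippets, and that shap_values is returned as one array per class (the return format varies across shap versions).

import numpy as np
import shap

explainer = shap.TreeExplainer(best_model)
shap_values = explainer.shap_values(X_test.toarray())  # assumed: list with one array per class

# Top contributing features for the predicted class of the first held-out instance
pred_class = best_model.predict(X_test[:1])[0]
contrib = shap_values[pred_class][0]
for i in np.argsort(np.abs(contrib))[-5:][::-1]:
    print(feature_names[i], round(contrib[i], 3))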
Collectively, our multi-objective framework demonstrates the feasibility of a data-driven, explainable recommender for project management methodologies. The insights gained here provide a roadmap for extending model robustness, deepening interpretability, and maximising practical impact across academic and enterprise environments.
6. Conclusions
This study has demonstrated that a random forest–based framework can predict suitable research methodologies for varied project management scenarios with high accuracy (93.8%) and robust discriminative capacity (ROC-AUC = 0.94). By training and validating three shallow classifiers on a carefully curated corpus of 156 literature-derived instances spanning cost estimation, performance analysis, risk assessment, prediction, and comparison, we have shown that ensemble learning not only generalises effectively across multiple domains but also retains interpretability through feature importance measures.
Despite these promising results, the research is subject to certain complexities and general applicability limitations. The dataset, although diverse, comprises only 156 examples and may not fully represent emerging or hybrid methodologies, potentially constraining the model’s capacity to generalise to more complex methodological frameworks. Furthermore, while the random forest ensemble achieves satisfactory performance, its computational demands increase with feature dimensionality and dataset size, which may impede scalability in large-scale applications.
To extend this framework’s applicability, the data pipeline and feature engineering routines can be abstracted into configurable modules that accommodate domain-specific literature sources and annotation protocols. Adopting transformer-based models pretrained on expansive academic corpora may facilitate transfer learning and domain adaptation, enabling the method to capture nuanced methodological patterns beyond project management.
Predictive accuracy can be further enhanced by integrating advanced text representations, such as contextual embeddings derived from models like BERT or RoBERTa, which offer a finer-grained semantic understanding of methodological descriptions. Combining ensemble learning with neural meta-learners and implementing active learning strategies, wherein experts iteratively annotate cases with high uncertainty, may address class imbalance and improve robustness against rare method categories.
Future research should focus on expanding the dataset through semi-automated article mining and crowd-sourced annotation to encompass a broader spectrum of methodologies and disciplines. Investigations into scalable model architectures, including distributed training and model compression techniques, will be essential for enterprise-level deployment. Moreover, incorporating explainability mechanisms, such as SHAP or LIME, will provide instance-level transparency, thus fostering user trust and facilitating evidence-based decision support. Longitudinal studies examining the influence of model recommendations on methodological choices over time would offer valuable insights into the practical impact of the system.