Article

DS PRO-S: A Success Assessment Model and Methodology for Data Science Projects

1 Graduate School of Informatics, Middle East Technical University, 06800 Ankara, Türkiye
2 Department of Computer Engineering, Hacettepe University, 06800 Ankara, Türkiye
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(5), 2551; https://doi.org/10.3390/app16052551
Submission received: 14 January 2026 / Revised: 12 February 2026 / Accepted: 19 February 2026 / Published: 6 March 2026

Abstract

There is a persistent paradox in the data science domain: despite the growing recognition of data as a strategic asset, many projects designed to leverage its value still suffer from high failure rates. To address this challenge, this study introduces the Data Science Projects Success Assessment Model (DS PRO-S), developed using a Design Science Research approach to make data science project success explicit, measurable, and comparable. DS PRO-S functions as a meta-model and an instantiation toolkit, complete with an operational methodology that supports success and health assessments using critical success factors (CSFs) and success criteria at both the phase and project levels through four distinct modules. This modular structure enables evaluations at any point in the data science lifecycle and informs timely, data-driven interventions before issues propagate. The measurement and evaluation framework within DS PRO-S aligns with ISO/IEC 15939, incorporating mathematical formulations for aggregating success criteria and CSFs into upper-level scores. To demonstrate its instantiability, completeness, and operational utility, case studies were conducted in a predictive analytics project of a large energy enterprise and a generative AI project of a vendor. The findings indicate that DS PRO-S is applicable in diverse project contexts in the data science domain and offers a robust solution for assessments.

1. Introduction

Data science has evolved from an emerging field to a fundamental driver of organizational transformation across various industries. When implemented successfully, data science projects can deliver numerous advantages that lead to competitive advantage and sustainability for organizations. Technological advancements have amplified the potential impact of these advantages, transforming data into a strategic asset [1]. However, the high failure rates reported (e.g., up to 87% of data science initiatives fail to reach production [2], 95% of GenAI pilots fail to deliver financial impact [3], and 60% of AI use cases are expected to fail to achieve their expected value by 2027 [4]) point to challenges in translating theoretical breakthroughs into practical applications.
Implementing data science projects presents unique challenges. Even though they share similarities with other technical project types, they exhibit distinct differences, shaped by their specific characteristics [5]. Unlike conventional projects with well-defined requirements and predictable outcomes, data science projects are characterized by inherent uncertainty, evolving objectives, and heavy dependence on data quality and availability [6]. These distinctive features create implementation complexities that existing approaches, originally designed for more structured environments, may fail to adequately address.
These unique challenges, together with the high reported failure rates, necessitate effective success management approaches that allow for an understanding of where and when these projects succeed or fail [7], and how to improve success potential. However, the current literature and industry lack a formalized, operationalizable success assessment model that accounts for the distinct characteristics of data science projects and is applicable across diverse project types and contexts.
Accordingly, this article proposes the Data Science Project Success Assessment Model (DS PRO-S), comprising a meta-model and an instantiation toolkit. The model is built upon two fundamental success constructs: critical success factors (CSFs) and success criteria. By introducing evaluations based on these constructs at both project and phase levels, the model contains four separate modules: Phase Health, Phase Success, Project Health, and Project Success. Thus, DS PRO-S offers a modular and asynchronous assessment capability with operational flexibility. Accompanied by an operational methodology, the holistic solution forms a novel framework that makes data science project success explicit, measurable, and comparable.
For building the DS PRO-S, a Design Science Research (DSR) approach was adopted to find answers to the main research question of “What are the essential dimensions of a holistic assessment model designed to define, measure, and evaluate the success of data science projects?” Grounded in established principles from information systems, strategic management, and systems theory, DS PRO-S addresses the socio-technical nature of data science. Organizations can utilize DS PRO-S to determine if their projects are on the right track or achieving the targeted objectives, allowing them to take the required corrective actions.
The remainder of this paper is structured as follows: In Section 2, a summary of the research background is presented, while Section 3 explains the research methodology used for developing DS PRO-S. In Section 4, DS PRO-S is described. Section 5 is dedicated to the two case studies and the cross-case study analysis. The last two sections, Section 6 and Section 7, present the contributions of the study, conclusions, and opportunities for future research.

2. Background

Ika & Pinto [8] (p. 837) express project success as “the end point or the achievement of the goal the organization and related stakeholders seek to reach through a temporary endeavor.” Evaluating projects and understanding where they succeed or fail [7] is essential, yielding several benefits for the organization.
Assessing the success of projects in a structured way moves organizations beyond ad hoc judgments to quantified evidence. Specifying what success means for a project upfront improves shared understanding, communication, and resource allocation [9]. The early detection of potential problems enables diagnosis of underperformance and helps minimize deviations from targeted outcomes. Beyond judgments within a single project, benchmarking across projects provides critical insights for improvement [10], supporting continuous learning for enhanced outcomes in future projects. Considering its importance and impact, recent studies in project management [11,12,13] proposed integrating an evaluation of success (as part of the success management process) into widely used and universally accepted management standards such as PMBOK [14], ISO 21502 [15], and PRINCE2 [16].
In recent decades, researchers have developed a range of evaluation models and frameworks that provide a structured basis to define, measure, and manage project success. These studies strive for a shared definition of success through multi-level criteria [17], improving clarity and comparability. This enables objective measurement and evidence-informed decision-making [18], surfacing gains, losses, and actionable remedies for managers and stakeholders [19]. Because projects differ, some models accommodate diversity by tailoring criteria to project objectives and/or stakeholder perspectives [20,21]. Contemporary approaches broaden success beyond time, cost, scope, and quality to include outcomes and longer-term impact, yielding a more holistic view of value creation [22].
Data science projects diverge from traditional IT or software projects due to their interdisciplinary nature, need for rapid iteration, and uncertainty in inputs and outputs [23,24]. Even though robust conceptual success models exist in the Information Systems (IS) domain (such as the Model for Measuring IS Project Success [25]), they rarely offer implementation-level guidance and do not account for distinct data science characteristics like inherent uncertainty and heavy data dependence. Consequently, applying them to data science initiatives would require significant modifications, rendering them impractical for direct use by practitioners. On the other hand, there are only a few studies in the data science domain that discuss a project’s success. These provide CSFs or assessment factors (e.g., [26,27]) for specific data science project types (e.g., big data) or for specific scopes (e.g., ethical principles), and thus their applicability is limited when considering the broader contexts of data science projects. Accordingly, there is a gap in the literature for a dedicated success assessment model for data science projects.

3. Methodology

The main research question (RQ) of this research is as follows:
RQ: What are the essential dimensions of a holistic assessment model designed to define, measure, and evaluate the success of data science projects?
In the search for answers to this primary question, this study adopts DSR, which is offered as a problem-solving paradigm extending human and organizational capabilities by crafting novel artifacts to solve practical problems [28]. In applying DSR to the main artifact of this study, DS PRO-S, six activities defined in the Design Science Research Methodology (DSRM) [29] were undertaken, as depicted in Figure 1.
In applying DSRM, the development of the proposed model was grounded on three complementary systematic literature syntheses to ensure scientific validity and methodological rigor. The first systematic literature review (SLR) [30] provided an initial scan of success determinants in data science projects, revealing the immaturity of success research in the data science domain. The Multivocal Literature Review (MLR) then formed the main literature base by combining scientific and gray sources across data science, information systems, and software engineering. This synthesis was used to clarify the success-assessment gap and to derive theoretically grounded examples of CSFs and success criteria relevant to data science projects. It showed that the existing success-related evidence in the data science domain remains fragmented across subdomains and often stops at factor lists, providing limited guidance for assessments of success [26,31,32]. In addition, a focused SLR [33] examining data science project implementation practices was performed to capture contextual elements, such as their characteristics, goals, phases, and tasks, to support tailoring the model. This highlighted that data science projects are iterative and experimentation-driven, with evolving objectives and a strong dependence on data quality/availability [34,35]. These reviews ensured that the design of the model is evidence-based and methodologically sound.

3.1. DSRM Activity 1: Problem Identification

This activity aims at specifying the research problem and motivating the search for a solution [29]. Although data science initiatives are launched with strong expectations, many do not reach production and do not translate into clear business outcomes. Considering the specific nature of data science projects and their high failure rates, an assessment model for defining, measuring, and evaluating success that is specifically developed for data science projects is needed. Organizations engaging in data science initiatives need to assess their projects to see if the project is on the right track or if the deliverables are being used by the business as intended and deliver the targeted outcomes. By doing so, pain points can be identified and required actions can be taken, while the projects that will no longer be utilized by the business can be canceled. However, the literature lacks a formalized, operationalizable success assessment model that accounts for the distinct characteristics of data science projects, such as inherent uncertainty.

3.2. DSRM Activity 2: Objectives of the Solution

The second activity of DSRM focuses mainly on deriving solution objectives from the defined problem based on knowledge of existing solutions [29]. Although the success-related frameworks or models proposed in project management (e.g., [36,37,38]) or IS (e.g., [39,40,41]) domains are conceptually strong, they mostly lack implementation-level guidance for assessments and granularity, and do not account for the distinct characteristics of a data science project. On the other hand, there are only a few studies on data science project success [26,27,31,32] that offer a framework or model; these either fall short of offering the constructs required to measure success or focus only on a specific perspective (e.g., moral decision making) or project type (e.g., big data projects), which limits their applicability across diverse types of projects.
The main goal of this research in building this solution, DS PRO-S, is defined as developing a specific, comprehensive success assessment model and a methodology to operationalize it, to ensure data science investments deliver sustained business value. Based on the literature on project success and data science projects, this goal is broken down into objectives, each of which states a desired capability for DS PRO-S, as depicted in Table 1.

3.3. DSRM Activity 3: Design and Development

DSRM Activity 3 is the design and development stage, where the artifact is created. It involves specifying the intended functionality and architecture and building the artifact, drawing on theory to translate objectives into a feasible solution [29]. This stage was carried out in three main steps to design and develop DS PRO-S as a model that comprises a Meta-Model Blueprint and an Instantiation Toolkit & Implementation Guidance, as explained below.
First, we formulated the design rationale that explains why the artifact was needed and which principles guide its structure. For this purpose, insights from the literature were used to determine the High-Level Requirements (HLRs) of the artifact. Specifically, prior project success research indicates that a comprehensive assessment must span the full project lifecycle, including not only downstream outcomes (e.g., product usage) but also upstream “ex-ante” phases [42] such as ideation and assessment, where weak idea selection can lead to sub-optimal execution. This end-to-end perspective is especially critical for data science projects, given their experimental, iterative nature and cross-unit involvement; otherwise, success assessments remain partial and may fail to capture real value. At the same time, lifecycle phases should not be treated as “black boxes”: beyond evaluating outcomes, success management requires attention to CSFs that reduce failure risk across phases [30], especially because CSFs shift in relative importance over their lifecycle [43]. Reinforcing the need for a lifecycle-aware design, Khang and Moe [44] show that the success of a preceding phase can influence later phases, motivating a phased evaluation logic that explicitly models interdependencies.
The literature further suggests that success evaluation requires clear success criteria derived from project objectives, since success (or failure) is ultimately determined by the extent to which objectives are achieved [45]. Establishing a shared evaluation reference point early in the lifecycle supports team alignment and commitment to those objectives [46], which is especially important when multiple disciplines collaborate. However, success research also cautions against assuming “one-size-fits-all” success determinants: universalistic CSF views have been criticized and replaced with contingency-oriented perspectives that account for project differences [47,48,49,50]. Accordingly, evaluations should incorporate context-specific elements while retaining shared dimensions that enable cross-project comparison [51,52], implying a model design that combines common core components with data science-specific characteristics and flexibility for additional contextual circumstances.
Multiple studies emphasize that project conditions and success determinants can change over time; what is critical in one phase may become less relevant later [38]. Because KPIs are often defined early with limited information, they may be overly generic and require refinement as knowledge accumulates [53], a dynamic that is amplified in data science projects where requirements and goals evolve through data exploration [34]. This motivates a model that supports adaptation and iterative updating across the lifecycle, rather than a static, one-time evaluation.
The HLRs determined from these insights were linked to the objectives (in Table 1) for traceability and to the respective references from the literature to reflect theoretical foundations. These mappings are depicted in Appendix A (Table A1) and provide the basis for the design rationale of the proposed model. Key assumptions and exclusions were also documented (Table A2). Based on these, core design considerations were articulated as answers to the five W-questions, indicating what the model would cover (Table A3). Then, the principles and key decisions on design elements were defined, which explain how the artifact would be structured and delivered (Table A4). These top-level rules translated the HLRs into guiding prescriptions for functionality, architecture, and customization. As illustrated in Table A4, each decision traces back to its originating requirement to ensure that a coherent set of directives informs the blueprint construction in the next step. Illustrative excerpts are provided in Table 2, Table 3 and Table 4, while full tables are provided in Appendix A.
Based on the articulated design rationale, the model development continued by first identifying the key modeling needs and matching each need with an appropriate, structured technique selected from targeted reviews of both literature and practice. Established approaches were adopted where appropriate (e.g., GQM) or applicable ways were suggested (e.g., for deriving CSFs). These are depicted in Table A5 in Appendix A.
The main conceptual building blocks of the model were defined as CSFs and success criteria. Grounded in literature and informed by the views of a panel of eight domain experts, these constructs, together with other model elements (such as typical goals, phases, tasks, deliverables, and challenges), were assembled to form a meta-model comprising four core components (Project, Success, Measurement, and Evaluation). The basic properties of the developed model are summarized in Table 5.
The constructed meta-model includes CSFs and success criteria as conceptual constructs that define the aspects that need to be considered when assessing the success of data science projects. While these constructs are foundational at the conceptual level, their specific values are context-dependent and may vary across projects, organizations, and application domains. Accordingly, for each project, the concrete instantiations of these constructs need to be identified by practitioners. To facilitate the instantiation of the meta-model and support its practical application, an instantiation toolkit was developed. The toolkit comprises non-exhaustive, illustrative instantiations of configurable elements (e.g., success criteria with example metrics), synthesized from the literature and refined through expert input. Additionally, it includes templates (e.g., GQM worksheets) and applicable method explanations (e.g., weighting methods). A user guide was also prepared for the easy and traceable implementation of each evaluation dimension, following an executable define–measure–evaluate–update cycle. Worked examples in the user guide demonstrate the instantiation and calculation mechanisms for data science project use cases.
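To make the toolkit's instantiation logic tangible, the following minimal sketch encodes a goal-question-metric (GQM) decomposition in Python; every name, target, and data source here is hypothetical and serves only to illustrate the structure of such a worksheet, not to reproduce the toolkit's actual templates.

    # Hypothetical GQM worksheet for a data science project: one goal is
    # decomposed into questions, and each question is tied to candidate
    # metrics with targets and data sources.
    gqm_worksheet = {
        "goal": "Increase forecast-driven process efficiency by 15%",
        "questions": [
            {"question": "Is the model accurate enough for operational use?",
             "metrics": [{"name": "Mean Absolute Error",
                          "target": "<= 5.0", "source": "hold-out test set"}]},
            {"question": "Is the solution adopted by the business unit?",
             "metrics": [{"name": "Weekly active users",
                          "target": ">= 10", "source": "usage logs"}]},
        ],
    }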

3.4. DSRM Activity 4 and Activity 5: Demonstration and Evaluation

The demonstration (using the artifact to solve a problem) and evaluation (measuring its performance against objectives) activities defined in the DSRM [29] were implemented through an iterative "build-and-evaluate" cycle (Figure 1), to reflect their interdependence. To ensure the theoretical soundness and operational effectiveness of the artifact, appropriate design evaluation methods from Hevner et al.'s study [54] were selected. In the selection of these methods, the evaluation strategies from the FEDS Framework [55] were utilized, starting with artificial evaluations and progressing towards naturalistic evaluations in the artifact's real environment. First, the logical consistency and mathematical formulation of DS PRO-S were verified using literature-based use cases and simulations. Then, expert interviews were conducted for validation of the content, as well as structural and procedural validation. Finally, case studies were performed to evaluate the instantiability, completeness, and operational utility of DS PRO-S, as applied in data science projects in organizations, which are reported in detail in Section 5.
Each demonstration and evaluation cycle generated improvement inputs that were fed back to DSRM Activity 3 (Design and Development) to refine DS PRO-S before moving to the next cycle. Refinements primarily concerned (i) improving the operational clarity and usability of the toolkit (e.g., templates and user guidance), (ii) strengthening the measurement and evaluation logic to reduce ambiguity and improve interpretability, and (iii) adjusting methodological rules to better reflect the practical constraints observed. This iterative approach ensured that DS PRO-S was incrementally enhanced and stabilized before use by practitioners.
The next section explains the developed artifact, DS PRO-S, in detail.

4. DS PRO-S: Data Science Project Success Assessment Model

DS PRO-S features a framework-level structure comprising a meta-model for instantiation and an operational methodology for success assessments across the project and product lifecycle (Figure 2).
Grounded in established principles from IS, strategic management, and systems theory, DS PRO-S addresses the socio-technical nature of data science. Organizations can utilize DS PRO-S to determine if their projects are on the right track or achieving targeted objectives, allowing them to take the required corrective actions.
DS PRO-S is designed to operate alongside an organization’s established management processes, including those for projects, quality, and data science, as well as other relevant business functions. It does not replace or replicate prevailing standards such as ISO 9001:2015 Clause 9 [56] activities or the specific evaluation stage of CRISP-DM. Instead, it leverages existing outputs from those environments (e.g., project management information system (PMIS) reports, quality management system (QMS) documentation, financial records, user feedback, and analytics reports) and provides a success-focused evaluative perspective (Figure 3). DS PRO-S utilizes a manager-mediated approach. In this configuration, the Project Manager (or a designated lead) serves as the primary interface, responsible for executing the operational layer with inputs such as client satisfaction data, financial metrics, and test results.

4.1. DS PRO-S Meta-Model Components

The structural logic of DS PRO-S is defined by four interrelated components, each focusing on a specific aspect of the success assessment process. These components are operationalized through an Instantiation Toolkit, which transforms the abstract model into a customized assessment for a specific project context (Figure 4).
Project Component: Establishes the structural and strategic schema for the assessment environment by defining elements such as project phases, goals, specific objectives, and deliverables. These elements are instantiated to form the project structure and phase structure that ground subsequent measurement and evaluation activities.
Success Component: Defines the evaluative framework by distinguishing between the achievement of objectives (Success) and the underlying enablers (Health). It holds the hierarchical categories for success constructs.
Measurement Component: Operationalizes the assessment by defining the computational logic needed to translate raw observations into standardized indicators. Grounded in the Measurement Information Model (MIM) defined in ISO/IEC/IEEE 15939:2017 [57], it defines base measures, derived measures (such as normalized metric values), and the specific mathematical formulas used for aggregation.
Evaluation Component: Translates quantitative measurement results into qualitative judgments. It includes the rating scales, thresholds, and decision criteria (e.g., “Fully Achieved” vs. “Partially Achieved”) used to trigger management actions or corrective steps.

4.2. Integrated Lifecycle View

A comprehensive understanding of Project Success requires several stages of the project lifecycle to be taken into consideration [27,42]. DS PRO-S uses an integrated project and product lifecycle (Figure 5) because data science projects possess specific characteristics such as high uncertainty [58], continuous recalibration needs, and experimentation [6,59,60].
Both ends of the lifecycle matter in data science projects. Strong upstream work ensures the organization commits to the right projects. Inadequate problem framing, weak benefit hypotheses, superficial feasibility and data-readiness checks, or poor stakeholder alignment at this stage propagate avoidable risk and misallocate resources, yielding "not-optimal" projects that are costly to execute (due to their extra scope) and unlikely to deliver business value. The downstream side is equally critical: even when high-quality outputs are deployed and fulfill the requirements, stated goals remain unrealized without sustained adoption in operations. Continuous monitoring and adaptation beyond deployment are required to capture model performance degradation, or model drift, caused by evolving data, shifting user behavior, or changing external conditions.
Assessments in DS PRO-S can begin after project kick-off and continue until the latest planned date for the objectives of the project to be realized.

4.3. Success Constructs, Assessment Levels, and Dimensions

As briefly mentioned in Table 5, DS PRO-S is built upon two fundamental success constructs: CSFs and success criteria. CSFs represent the essential conditions or management practices that increase the likelihood of a positive outcome, such as securing top management support. In contrast, success criteria are the specific measures used to define and measure success, such as efficiency increase or model reliability. Based on these constructs, the model establishes two complementary evaluation dimensions: Success and Health. Success refers to the extent to which the intended objectives are achieved, assessed against explicit, measurable criteria. Health refers to the extent to which enabling conditions for success are in place.
Grounded in General Systems Theory [61], Success and Health are assessed at two levels: Project and Phase. In Phase-Level Assessments, the model introduces a mechanism to detect and mitigate issues during the phase at which they occur. This prevents the compounding effect where upstream failures (e.g., in data readiness) propagate into consecutive phases, creating “not-optimal” projects that are costly to execute.
The combinations of success constructs and evaluation levels create four modules of DS PRO-S, as presented in Figure 6.

4.4. Modules of DS PRO-S

Practitioners can adopt DS PRO-S through a modular approach that avoids the need for a total, “all-or-nothing” implementation. The individual modules presented in Figure 6 can be deployed independently, except for Project Health, which requires inputs from the Phase Success module. For example, Phase Health can be utilized only for high-risk development stages, or Project Success can be evaluated as a standalone unit. This structural flexibility allows the model to scale in alignment with an organization’s specific needs, resource availability, and current process maturity.

4.4.1. Project Success Structure

DS PRO-S adopts a leveled success structure (Figure 7). The Project Success Structure is composed of Output Success and Outcome Success and their subcategories.
Output Success is partitioned into two distinct subcategories. Adapted from ISO/IEC 25010 [62], Product Quality Success signifies that the final deliverable fulfills relevant quality characteristics and aligns with the requirements and expectations of key stakeholders [9,36,42,63]. Project Management Success is achieved when the project is completed within the traditional triple constraints of scope, time, and cost [36,38]. Furthermore, it also requires adherence to the organization's applicable processes and methodologies (e.g., its project management methodology). Both Project Management and Product Quality Success are critical components of Output Success, as meeting the planned time and budget alone does not guarantee that the product satisfies the requirements [64,65].
Outcome Success also has two subcategories. Business Value Success refers to achieving targeted, non-financial organizational benefits by utilizing the project’s deliverables, such as improvements in service levels, cycle-time reductions, or enhanced stakeholder satisfaction [66]. Financial Success measures the realized financial gains that occur as a direct consequence of those business benefits, such as significant cost reductions or revenue growth [66,67]. In practice, the assessment of Outcome Success primarily focuses on the short-to-medium term benefits. This boundary is necessary because, over the long term, it becomes increasingly difficult to isolate the project’s specific contribution from other external organizational or market factors.
To evaluate success, the overarching goal must be operationalized into measurable objectives under sub-categories within the Project Success Structure; each objective is then made measurable via success criteria and metrics. This mapping preserves traceability from goal to measurement and enables aggregation from the criterion level to category level and, ultimately, to overall Project Success. Once the success criteria (or criterion) are selected for each objective, the Project Success Structure becomes ready for measurement and evaluation.
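As a concrete illustration of this roll-up, the sketch below aggregates criterion-level scores into category scores and an overall Project Success Score through weighted averages; all scores and weights are invented for the example, and the exact formulations are those specified in the MIMs (Section 4.4.5).

    # Illustrative roll-up from success criteria to Project Success.
    def weighted_score(items):
        """items: list of (score, weight) pairs; returns the weighted average."""
        total = sum(w for _, w in items)
        return sum(s * w for s, w in items) / total

    # Criterion-level scores (0-1) aggregated within each Level-2 category.
    product_quality = weighted_score([(0.90, 0.6), (0.80, 0.4)])
    project_mgmt    = weighted_score([(0.70, 1.0)])
    business_value  = weighted_score([(0.85, 1.0)])
    financial       = weighted_score([(0.60, 1.0)])

    # Category scores aggregated to Output and Outcome Success, then to
    # overall Project Success.
    output_success  = weighted_score([(product_quality, 0.5), (project_mgmt, 0.5)])
    outcome_success = weighted_score([(business_value, 0.6), (financial, 0.4)])
    project_success = weighted_score([(output_success, 0.4), (outcome_success, 0.6)])
    print(f"Project Success Score: {project_success:.0%}")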

4.4.2. Phase Success Structure

Failure to perform successfully in the phases of a project eventually results in the project falling short of overall success. In DS PRO-S, Phase Success is assessed within each phase using two complementary subcategories (Figure 8):
  • Deliverable Success: The main phase deliverables meet their stated requirements and specifications, as well as the relevant stakeholders' expectations (e.g., validated data sets, engineered features, trained models).
  • Phase Management Success: The phase is completed within its planned scope, schedule, and cost. In phases where baselines are not yet applicable (e.g., pre-kickoff phases), management success is evidenced through process adherence and governance (e.g., decision cadence, risk/assumption logging, required reviews and sign-offs).
Figure 8. Phase Success structure in DS PRO-S.
These categories contribute to Phase Success via configurable weights, allowing their relative importance to vary by phase and context. For example, in exploratory phases (e.g., data understanding, modeling), where uncertainty makes schedule baselines less predictable, Deliverable Success (e.g., decision-ready problem framing, data suitability, reproducible experiments) typically receives greater weight.
The Phase Success Structure is completed by mapping the objective(s) of the phase to each Phase Success category and matching these with the success criteria and their metrics.

4.4.3. Phase Health Structure

CSFs’ importance varies with the project lifecycle stages [43]. DS PRO-S responds to this dynamic by allowing each project phase to be associated with a distinct set of CSFs, categorized under Project, Data, Organization, Technology, and Strategy (P-DOTS). This structure, shown in Figure 9, ensures a comprehensive representation of the CSFs in data science projects, tailored to the specific goals, context, and challenges of each phase.
The contribution of these categories to Phase Health is not static across the project lifecycle. Their relative importance (i.e., weight) is configurable, allowing teams to emphasize the enablers that are most critical in each phase based on the phase’s tasks and challenges. For instance, Strategy-related CSFs (e.g., regulations and legal aspects are adequately assessed, and project goals are aligned with the business strategy) may carry more weight during the early phases. On the other hand, Technology-related CSFs (e.g., the availability of suitable tools and systems, supportive IT governance, flexible infrastructure) may become more relevant in later phases involving execution, integration, or deployment. This flexibility enables DS PRO-S to account not only for what is critical to success, but also for when this becomes critical, supporting more targeted evaluation and timely interventions.
The Phase Health structure is completed once relevant CSFs are selected for the categories shown in Figure 9.

4.4.4. Project Health Structure

Building on the definition of CSFs as conditions that increase the likelihood of success, and the fact that CSFs are assessed at the phase level, DS PRO-S defines Project Health by treating success in individual project phases as CSFs for the overall project. In other words, achieving success in a given phase is considered a project-level enabler, a necessary condition that increases the likelihood of achieving project success. This formulation is grounded in General Systems Theory, specifically Bertalanffy’s concept of hierarchical order [61] (p. 69), which establishes that the overall state of a complex system is functionally dependent on the precise performance of its constituent parts.
This perspective is operationalized by modeling project-level CSFs to include “Achieving success in Phase i” for each defined project phase (Figure 10). Additionally, the contribution of each phase’s success to overall Project Health is not assumed to be equal. It is configurable based on the nature of the project. For instance, in a data science project with low exploratory uncertainty, early-phase performance (e.g., problem definition, data understanding) may carry less weight. Conversely, in research-driven or innovation-heavy projects, the early phases may carry more weight, since misalignment or failure at that stage could critically impact downstream phases.
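A minimal sketch of this formulation, with invented phase names, scores, and weights (not values from the model's actual tables), is shown below; each phase's success score enters Project Health as a weighted project-level CSF.

    # Illustrative Project Health computation: "Achieving success in Phase i"
    # acts as a project-level CSF with a configurable weight.
    phase_success = {"Business Understanding": 0.80,
                     "Data Preparation": 0.93,
                     "Modeling": 0.69}
    phase_weights = {"Business Understanding": 0.4,  # innovation-heavy project:
                     "Data Preparation": 0.3,        # early phases weighted higher
                     "Modeling": 0.3}
    project_health = sum(phase_success[p] * phase_weights[p] for p in phase_weights)
    print(f"Project Health Score: {project_health:.0%}")  # 81% in this example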

4.4.5. Measurement Information Models (MIMs) as Measurement and Evaluation Specifications

DS PRO-S adopts an MIM [57] to define the measurement and evaluation elements and describe how their functions work. MIMs of modules provide specific mathematical formulations for aggregating success criteria and CSFs into upper-level scores for Success and Health. This ensures that the measurement and evaluation elements (Base Measures, Derived Measures, Indicators, Rating Scales, and Decision Criteria) are traceable, replicable, and auditable. The MIMs of DS PRO-S modules comprise the definitions and formulations for the measures and indicators given in Appendix A, Table A6.
In DS PRO-S, aggregation is performed hierarchically using configurable weights. At each parent construct (e.g., a Success Category, a Health Category, Phase Success, or Project Success), practitioners assign weights to the child elements (e.g., success criteria, metrics, CSFs, or categories), to reflect their relative importance in the given project context or lifecycle phase. Weights are determined during instantiation using prioritization methods (e.g., AHP, $100 allocation, T-shirt sizing, or organization-defined approaches).
During measurement, base measures (e.g., the measured value of a metric) are collected, from which derived measures are calculated and converted into indicators for decision support. To arrive at scores at the parent level (e.g., the Phase Success Score), the weighted average of the child scores (e.g., the Deliverable Success and Phase Management Success Category Scores) is computed. In parallel, the indicators Weighted Average Coverage (showing the data completeness behind scores) and Maturity Share (showing the elapsed time window for scores) are also computed to enable decision-makers to interpret scores cautiously.
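The sketch below shows one plausible reading of these two companion indicators; the formulas here are assumptions consistent with the descriptions above, while the exact MIM formulations are given in Table A6.

    # Coverage: weight share of metrics whose data has been collected.
    # Maturity: weighted share of each metric's measurement window elapsed.
    metrics = [
        # (weight, has_data, elapsed_days, planned_window_days)
        (0.5, True,  30, 30),   # measured; window fully elapsed
        (0.3, False, 10, 90),   # data not yet collected
        (0.2, False,  0, 365),  # long-horizon outcome metric
    ]
    total_w  = sum(w for w, *_ in metrics)
    coverage = sum(w for w, has_data, *_ in metrics if has_data) / total_w
    maturity = sum(w * elapsed / window for w, _, elapsed, window in metrics) / total_w
    print(f"Weighted Average Coverage: {coverage:.0%}")  # 50%
    print(f"Maturity Share: {maturity:.0%}")             # 53%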
In any of the evaluation dimensions, the aggregation of base-level measurements into higher-level constructs (e.g., success criteria) and categories (e.g., Output Success, Outcome Success, Project Success) to reveal scores as indicators does not assume that the categories are fully mutually exclusive. Overlaps are expected; for instance, the technical quality of a product (an output) often serves as a prerequisite for generating business value (an outcome).
This structural logic aligns with established project success frameworks, where categories are treated as analytical lenses rather than isolated, independent variables. Aggregation is utilized as a decision-support mechanism to offer stakeholders a comprehensive overview and facilitate benchmarking, while the underlying disaggregated data is preserved to facilitate more granular analysis. Practitioners are encouraged to interpret both the aggregated scores and the individual results to capture insights that could be obscured at upper levels.

4.5. DS PRO-S Operational Methodology

The operational methodology of DS PRO-S outlines the end-to-end process of implementing the DS PRO-S model for a data science project. It enables a thorough walkthrough of the model’s core capabilities, covering all supported evaluation levels (Project and Phase) and dimensions (Health and Success). Practitioners can choose to implement the entire process as a structured and comprehensive approach to success management across the lifecycle of their data science projects. Alternatively, they may adopt only selected evaluation dimensions (e.g., Phase Health only) depending on their organizational needs and contextual constraints.
The methodology is designed as a dynamic cycle comprising four interrelated processes (Figure 11). Firstly, the DS PRO-S is instantiated for a specific project or phase, and the model for success or health is built; then, the measurements are conducted. For each measurement, evaluations are made of the results, which inform the decisions. After each evaluation or any trigger (e.g., iteration), the model is updated to reflect the changes or decisions.
Once the DS PRO-S model is instantiated and approved, it is set as the project’s DS PRO-S Baseline v0. This baseline serves as the formal reference point for all future measurements and evaluations.
Any major update to the model (e.g., changes to goals, success criteria, or evaluation logic) must fulfill the following criteria:
  • Be formally reviewed and approved by designated stakeholders (e.g., senior management, sponsor);
  • Be incrementally versioned (e.g., DS PRO-S Baseline v1, v2);
  • Be justified and documented, with a rationale linked to the evaluation results, project changes, or stakeholder input.
This governance practice ensures that updates are not used to bias the evaluation process or misrepresent project performance. Minor adjustments (e.g., correcting a metric label) may be logged without formal approval, but all changes that would affect measurement and evaluation results require approval.
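For illustration, a baseline-update record satisfying these governance rules might be logged as follows; the fields and values are hypothetical rather than a prescribed schema.

    # Hypothetical change-log entry for a DS PRO-S baseline update.
    baseline_update = {
        "version": "DS PRO-S Baseline v2",          # incrementally versioned
        "change": "Added a user-adoption success criterion to Business Value",
        "rationale": "Evaluation showed outcome-level coverage was too low",
        "approved_by": ["Sponsor", "Senior Management"],
        "date": "2026-02-01",
        "type": "major",  # minor edits (e.g., metric label fixes) need no approval
    }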

5. Multiple Case Studies

This section explains the multiple-case study design employed and conducted between 11 and 22 December 2025 to validate the applicability of DS PRO-S. The naturalistic validation [55] strategy was selected, which refers to testing the artifact within its intended environment involving real users, real projects, and actual data. To assess external validity within this naturalistic setting, the research adopted a design based on theoretical replication logic [68] and a “maximum variation” sampling strategy [69]. Therefore, two contrasting cases were selected to verify that the DS PRO-S model and methodology remain valid regardless of the data science project type or organizational context.

5.1. Objective and Research Questions

The primary objective of this validation study was to assess the operational validity of the DS PRO-S. The study sought to determine whether the full framework (comprising the meta-model, methodology, and instantiation toolkit) could be adopted by practitioners to not only construct context-specific assessment models but also operate them effectively for decision-making.
Consistent with DSR evaluation criteria [54] and case study research standards [68], the case studies were structured around three specific research questions:
RQ1. Instantiability: Can practitioners derive context-specific models from the meta-model using the provided Instantiation Kit and the User Guide without requiring the researcher to select and configure model elements on their behalf?
RQ2. Completeness: Do the artifact’s modular structure, evaluation levels, and success constructs sufficiently capture the success assessment needs for projects?
RQ3. Operational Utility: Does the instantiated model provide actionable insights that enable decision-making for project success?

5.2. Case Selection and Unit of Analysis

In line with the maximum variation strategy to prove generalizability, two disparate cases were selected. Table 6 summarizes the structural contrasts between the selected cases.
  • Unit of Analysis: The unit of analysis is a single data science project.
  • Case Selection Rationale:
    • Case A (Development Context): A predictive analytics project within a large energy enterprise (Company A) in Türkiye. Selected to validate the application of DS PRO-S in a high-uncertainty, internal environment.
    • Case B (Operations Context): A GenAI (LLM) project delivered by a vendor (Company B) in Türkiye. Selected to validate the application of DS PRO-S in a high-stability, commercial environment.
Table 6. Characteristics of selected cases (maximum variation strategy).

Characteristic | Case A (Internal/Predictive) | Case B (Vendor/GenAI)
Organization Size | Large enterprise (>5000 employees) | SME/vendor (~100 employees)
Industry | Energy | Information technology (IT) and software services
Project Development by | In-house Data Analytics department, delivering to a business unit of the company. | The company is a solution provider (i.e., vendor) delivering to a client.
Technology | Predictive Analytics: machine learning; deterministic outputs focusing on numerical precision. | Generative AI: Large Language Model (RAG); probabilistic outputs focusing on semantic relevance and user experience.
Lifecycle Phase | Development: two phases (data preparation and modeling). | Operations and Maintenance: focus on stability and SLA compliance.
The first case study was conducted in a large enterprise (Company A) that operates in the energy sector with more than 5000 employees. While producing energy products for customers in Europe and Asia, it advances its processes through data science projects. The company has a Data Analytics Department; however, there is no dedicated team in the department for data science project success assessment. All data science projects are evaluated quarterly through a standardized questionnaire with the aim of identifying problems in project execution.
The second case study was performed at an SME (Company B) that specializes in Analytics and Generative AI (GenAI) projects and has approximately 100 employees. The mission of the company is to solve complex problems through human-centered and AI-supported solutions. The AI department is structured to provide innovative, scalable, and integrated solutions for businesses undergoing digital transformation.

5.3. Case Study Protocol

To ensure methodological consistency and guard against researcher bias, a standardized protocol was defined. This protocol mandates that the researcher acts as a facilitator, while the practitioners (e.g., Project Lead) execute the operational methodology of DS PRO-S.
  • Materials and Instruments: The validation relied exclusively on the artifact’s components:
    • User Guide: Provided to the practitioners to explain the core concepts (Health vs. Success) and the operational cycle.
    • Instantiation Kit: A collection of global lists of model elements (such as CSFs, success criteria, phases, deliverables), templates, and derivation logic (e.g., GQM mappings) used to construct specific models.
    • Reference Examples: Pre-worked scenarios used to demonstrate the logic of the model.
  • The Execution Procedure: The protocol was structured as a three-stage process for each case, mirroring the operational methodology of DS PRO-S.
    • Stage 1 ("Contextual Setup and Training"): The researcher introduced the DS PRO-S framework to ensure the project team understood the rationale behind the model and methodology and the structural distinction between concepts such as CSFs (to define and measure health) vs. success criteria (to define and measure success).
    • Stage 2 ("Define", Instantiation): The practitioners were tasked with constructing their project-specific models (i.e., Project Success, Phase Success, Phase Health). For the Project Success and Phase Success modules, the practitioners mapped their project/phase goals to success categories and derived success criteria and specific metrics. For Phase Health, the practitioners elicited CSFs under related categories. They presented their measurement and evaluation specifications for each module.
    • Stage 3 ("Measure and Evaluate"): The practitioners executed the instantiated models using real project data (from the data sources and respondents specified in their models) via surveys or direct extracts from systems or reports, which yielded the measurement and evaluation results. They interpreted the calculated results (e.g., Health and Success Scores) and determined actions where necessary. Finally, they provided feedback on DS PRO-S based on their experience.
Given that the DS PRO-S artifact is available as a set of logical templates (meta-model) rather than a fully automated software tool, the researcher provided computational support. While the practitioners provided the information needed for building the models and defined the calculation logic (e.g., defining the normalization rule for the Absolute Error metric) required in the forms, the researcher performed manual data entry and the execution of calculations. This ensured that the validation focused on the utility of the model’s outputs, rather than the usability of the spreadsheet interface.
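As an example of such practitioner-defined calculation logic, one plausible normalization rule for an Absolute Error metric is a linear mapping between a target value and a tolerance limit; the rule and numbers below are illustrative assumptions, not the rule actually defined in Case A.

    # Hypothetical normalization: full score at or below the target error,
    # zero at or above the tolerance limit, linear in between.
    def normalize_abs_error(abs_error, target=2.0, limit=10.0):
        if abs_error <= target:
            return 1.0
        if abs_error >= limit:
            return 0.0
        return (limit - abs_error) / (limit - target)

    print(normalize_abs_error(4.0))  # 0.75 on a 0-1 success scale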

5.4. Results of Case Study A

Case Study A involved a predictive analytics project carried out by the Data Analytics team for a business unit (BU) of Company A. The goal of the project was defined as “developing a decision support solution to accurately predict critical accumulation levels within a processing unit (i.e., container), thereby increasing the processing potential in that unit”. At the start of the case study, two parallel phases were active: data preparation and modeling, which belonged to the second iteration cycle. By the measurement cut-off date, the data preparation phase was completed, and the modeling phase was still in progress.
The Case Study Protocol was followed (as noted in the previous subsection), and five sessions were conducted with the Senior Data Scientist, who acted as the project coordinator. In the first session, other members of the project team were also present to share insights. The outputs of the study are summarized for each module below.

5.4.1. Case A—Project Success Assessment

The Project Success Structure is depicted in Table 7. For each of the objectives, success criteria and metrics were defined. As per the MIM for Project Success, measurement and evaluation specifications were prepared.
Prepared questionnaires (Table A7) were sent to the identified respondents, and data was collected for all due metrics as of the measurement cut-off date. The measurement results were then mapped to the rating scales and decision criteria, as depicted in Figure 12. The success criteria and metrics (Figure A1) and the metric sheet (Figure A2) for Project Success are provided in Appendix B.
In the case study, the success criteria for the Project (or Phase) Management were established based on subjective metrics—specifically, the perceptions of stakeholders. This approach was chosen because a formal system for monitoring schedule or scope adherence was not currently in place.
The score for Project Success is provisional (i.e., the project is not yet complete, and only 4% of the aggregated measurement durations of the metrics has elapsed). The Relative Success Score of the project is 125%, indicating relatively high achievement with respect to the elapsed time. However, since the coverage is quite low (13%), this score is based on limited evidence from the metrics, as the solution was still in the development stage. The business and financial impacts were not yet realized. The results must be interpreted with caution: the "Exceeded" status reflects that the project is currently ahead of its management baseline but does not guarantee that the ultimate technical or financial goals are attainable.

5.4.2. Case A—Phase Success Assessment

Phase Success assessments were conducted for the Data Preparation and Modeling Phases. Based on the Phase Success Structures (Table 8), the success criteria and metrics were defined and prioritized, and the corresponding measurement and evaluation elements were instantiated. The assessment results were calculated from the data collected for each phase. For the Phase Management Success criteria, measured values were collected via questionnaires, and for the Deliverable Success-related criteria, the data source was the local database of the Data Scientist (Table A8 and Table A9). The success criteria and metrics (Figure A3 and Figure A4) and metric sheets (Figure A5 and Figure A6) used to assess the success of the two phases are given in Appendix B.
Data preparation was completed by the measurement cut-off date with a 93% Success Score (Figure 13). The Deliverable Success Score reached 100%, suggesting the technical output (prepared data) fully met the objectives. Time and Process Compliance Scores were relatively lower, with notable misalignment between the perceptions of the business unit (process engineer) and the analytics team (data scientists).
In the Modeling phase, Model Performance (SC. 1) has not yet reached its measurement date; the 69% score for Phase Success is derived only from the Phase Management-related criteria (Figure 14). The "Slightly Behind" status is triggered by lower-than-targeted scores for Scope Compliance (67%) and Process Efficiency (67%).

5.4.3. Case A—Phase Health Assessment

For the two phases, CSFs were selected and mapped to the P-DOTS categories. Perceptions of CSF achievement were collected via questionnaires using a 5-point Likert scale (Table A10 and Table A11).
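One common way to turn such 5-point Likert responses into percentage fulfillment scores is min-max rescaling of the mean rating, sketched below as an assumption; the case study itself followed the rating-scale specifications of the instantiated model.

    # Rescale Likert-5 responses so that 1 maps to 0% and 5 maps to 100%.
    def likert_to_pct(responses):
        mean = sum(responses) / len(responses)
        return (mean - 1) / 4 * 100

    print(f"{likert_to_pct([5, 4, 5]):.0f}%")  # ~92% fulfillment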
The Data Preparation Phase was completed with a 91% Health Score (Figure 15). Even though the data quality-related conditions were not fully met and relatively high misalignment existed among analytics team members, the phase was completed with 100% Deliverable Success. The results show that this performance was achieved through a highly competent team, strong communication, and the commitment and involvement of team members.
The enabling conditions for success were maintained for the Modeling Phase, with team competency, tool availability, and management support all at 100% (Figure 16). This high Health Score suggests that the “Slightly Behind” success status (Figure 14) was likely due to the inherent complexity of the modeling task rather than a lack of resources.

5.4.4. Actions for Improvement in Case A

Based on the assessment results, Company A identified the following actions for improvement:
  • Establish Alignment through Dedicated Meetings: To resolve the misalignment between the Analytics and Business Unit teams caused by the lack of formal tracking, weekly meetings will be used to establish a shared understanding of the completed work and the remaining schedule.
  • Manage Expectations to Control Timeline Slippage: It is necessary to manage stakeholder expectations more effectively by increasing the frequency of feedback loops. This action is required to communicate the delays caused by the inclusion of external topics or “extra” tasks and ensure that both technical and business stakeholders remain aligned even as the schedule is adjusted.
  • Standardize Iteration Tracking via Management Tools: Although data preparation is finalized, ongoing data quality concerns still pose potential risks for further iterations. Additionally, since projects start very quickly with limited Business Understanding (e.g., with few tags suggested by the Business Unit), many subsequent iterations and a high volume of "back-and-forth" information flow result. A task management tool will therefore be adopted to systematically track these technical inputs and manage these complex exchanges more efficiently.

5.5. Results of Case Study B

The second case was a GenAI project carried out by an SME for the Sales Department of an external client. The goal was defined as "to develop a secure, Generative AI-powered decision support assistant with a natural language interface that automates information retrieval from complex internal documentation, thereby significantly reducing analysis time for the client's sales force". By implementing this project, Company B aims to develop a long-term relationship with this major client and secure more projects. The solution was recently deployed, and the project is in the operations and maintenance phase.
Six sessions were conducted with the Technical Lead. The Project Management Office (PMO) supports the Technical Lead by tracking the schedule and contractual aspects, so she attended some of the sessions to share information. The studies conducted in line with the Case Study Protocol are explained in the following subsections.

5.5.1. Case B—Project Success Assessment

The Project Success Structure is presented in Table 9. The objectives were defined for Level-2 Success Categories, taking into account the “solution provider” perspective. Success criteria and metrics were determined in line with contractual terms and conditions. Measurement and evaluation specifications were prepared as per MIM for Project Success.
Measurements for Project Success were objective and retrieved from either the systems or reports of the company. The questionnaire (Table A12), success criteria and metrics (Figure A7), and metric sheet (Figure A8) are provided in Appendix B. The measurement results and their mapping to the rating scales and decision criteria are presented in Figure 17.
The current Success Score of 95% is based on 97% of the data from base measures. The project is on track; however, a drill-down to the success criteria-level results indicates relatively low performance on software (SW) compliance (66%). While the team has built a solution in which the model meets its criteria successfully (e.g., relevance, groundedness), the application side (e.g., version control, security headers) is lagging. Similarly, the system robustness tests were not run, which poses a safety risk. Project Management Success is ahead of plan.

5.5.2. Case B—Phase Success Assessment

The objectives of the Operations and Maintenance Phase were described under the Phase Success categories, as shown in Table 10, and success criteria with metrics were defined. Specifications for measurement and evaluation were prepared. All measurements were objective and drew on Azure monitoring, customer evaluation reports, e-mail archives, and Teams logs. The questionnaire (Table A13), success criteria and metrics (Figure A9), and metric sheet (Figure A10) for the Phase Success assessment are provided in Appendix B.
The phase is at an early stage, as the solution was recently deployed (Maturity: 10%) (Figure 18). Because all the metrics are steady-state metrics, calculated over a fixed duration against a stable target level (e.g., Mean Time to Respond ≤ 4 h), coverage is 100%; that is, the 91% score was calculated with data from every metric defined for Phase Success. Incident management efficiency is slightly behind, requiring corrective actions and closer monitoring.
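To make these mechanics concrete, the following minimal Python sketch shows one way such a score could be computed: each steady-state metric is normalized against its target (capped at 100%), the normalized values are aggregated with configurable weights, and coverage is the weighted share of metrics with available data. The metric names, targets, and weights below are hypothetical; the authoritative normalization and aggregation rules are those defined in the Phase Success MIM of DS PRO-S.

```python
# Illustrative sketch only: hypothetical metrics, targets, and weights.
# Follows the weighted-average aggregation logic described for DS PRO-S MIMs.

def normalize(value, target, higher_is_better=True):
    """Normalize a steady-state metric against its target, capped at 100%."""
    if value is None:                       # N/A: metric not yet measurable
        return None
    ratio = value / target if higher_is_better else target / value
    return min(ratio, 1.0) * 100.0

metrics = [
    # (name, measured value, target, higher_is_better, weight)
    ("Availability (%)",                99.2, 99.0, True,  0.4),
    ("Quality checks within threshold", 0.95, 1.00, True,  0.3),
    ("Progress reports submitted",      4,    4,    True,  0.1),
    ("Mean time to respond (h)",        7.0,  4.0,  False, 0.2),
]

scores  = [(normalize(v, t, hib), w) for _, v, t, hib, w in metrics]
covered = [(s, w) for s, w in scores if s is not None]

coverage    = sum(w for _, w in covered) / sum(w for _, w in scores) * 100
phase_score = sum(s * w for s, w in covered) / sum(w for _, w in covered)

print(f"Phase Success Score: {phase_score:.0f}% (coverage: {coverage:.0f}%)")
```

With these hypothetical inputs, the weakest contribution comes from the response-time metric (a 4 h target against a 7 h measurement, i.e., 57%), mirroring how the low incident management efficiency surfaced in the actual assessment.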

5.5.3. Case B—Phase Health Assessment

The Phase Health Structure was created by identifying CSFs for the P-DOTS categories. Achievements were collected through a questionnaire (with a Likert-5 scale) (Table A14) that was filled out by both the Client (Senior Data Analyst) and the PMO.
This phase received a “Fully Healthy” status, with 83% fulfillment of the defined CSFs (Figure 19). However, this high overall score masks the low score (38%) for configuration management, a CSF deemed important for the phase but unfulfilled at the time of measurement.
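As an illustration of how such CSF-level results can be derived from the questionnaire, the sketch below assumes that Likert-5 responses are mapped linearly onto 0–100%, averaged per CSF across the two evaluators, and rolled up with weights into the Phase Health Score, with the standard deviation serving as a perception-variability signal. The CSF names, responses, and weights are hypothetical stand-ins, not the case data.

```python
from statistics import mean, pstdev

def likert_to_pct(response):
    """Map a Likert-5 response (1..5) linearly onto 0..100%."""
    return (response - 1) / 4 * 100

# Hypothetical CSFs with responses from two evaluators (e.g., Client and PMO)
# and illustrative weights that sum to 1.
csfs = {
    "Transparent incident communication": ([5, 5], 0.3),
    "Effective configuration management": ([2, 3], 0.2),
    "User prompting proficiency":         ([2, 3], 0.2),
    "Stable source data formats":         ([5, 4], 0.3),
}

phase_health = 0.0
for name, (responses, weight) in csfs.items():
    pcts = [likert_to_pct(r) for r in responses]
    score, variability = mean(pcts), pstdev(pcts)
    phase_health += score * weight
    print(f"{name}: {score:.0f}% (perception variability: {variability:.0f} pp)")

print(f"Phase Health Score: {phase_health:.0f}%")
```

Note that responses of [2, 3] normalize to 25% and 50% and average to roughly the 38% level reported for configuration management above.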

5.5.4. Actions for Improvement in Case B

The assessments informed the following decisions by Company B to improve project and phase performance:
  • Prioritize Software Compliance: To mitigate the risk posed by the low software compliance score, a dedicated sprint will be run to close the outstanding software requirements.
  • Resolve System Robustness Issues Promptly: Although the target date for the stress tests has not yet passed, the schedule appears flawed; the tests will therefore be moved to an earlier date.
  • Train the Client Sales Team on Prompting: When the Phase Health and Phase Success Assessment results are evaluated together, the low Incident Management Efficiency Score (57%) is attributed to the Client Sales Team’s limited prompting proficiency (CSF score: 38%). A targeted training course on prompting will be arranged within two weeks.
  • Prepare the Configuration Management Process: Company B currently lacks a formal process for configuration management, as evidenced by the low CSF score (38%) for Effective Configuration Management. This process will be prepared, documented, and put into practice within two weeks.

5.6. Cross Case Study Analysis and Results

The meta-model layer of DS PRO-S enables comparisons between the two disparate cases.
In Case A, the absence of formal project management processes, such as a detailed schedule or scope documentation, was noted. Because the Analytics team was developing solutions for an internal business unit, the success criteria were primarily based on subjective metrics. Consequently, the evaluation data had to be collected through surveys eliciting stakeholder views on the criteria rather than from systems. In contrast, the involvement of an external client in Case B necessitated a higher level of accountability and evidence, requiring formal logs and records. This environment forced the project management and other processes to be more structured and documented. As a result, almost all base measures required for the success assessments (apart from the health module) could be retrieved directly from Company B’s existing reporting systems and internal databases.
Regarding Project Success (Figure 20), Case A shows a high relative Project Success Score of 125%, but this score rests on only 13% coverage and 4% maturity. In other words, while project management (especially scope management) exceeds expectations, an actual technical solution has not been deployed yet, and consequently the business or financial value associated with the project has not been created. Since Case B has much higher coverage and maturity than Case A, its results are more reliable for final judgments. Because the primary focus in Case B was on GenAI utilities, the software engineering aspect was treated as a secondary priority and remained behind the targets.
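The relative score above 100% becomes intuitive once relative achievement is read as progress against the expectation at the measurement date rather than against the final target. The following is a minimal sketch of this reading, assuming a cumulative metric’s relative achievement is its normalized value divided by the elapsed share of the planned duration (the Duration Consumption Rate of Table A6); the exact DS PRO-S formulation may differ in detail.

```python
# Illustrative: relative achievement compares progress to elapsed time,
# so early over-performance yields scores above 100%.

def relative_achievement(normalized_value_pct, duration_consumption_pct):
    """Relative achievement (%) of a cumulative metric at measurement time."""
    return normalized_value_pct / duration_consumption_pct * 100

# Hypothetical Case A-like figures: 5% of the target achieved after only
# 4% of the planned duration has elapsed.
print(f"{relative_achievement(5.0, 4.0):.0f}%")   # -> 125%
```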
For Case A, assessments of Phase Success and Phase Health (Figure 21) were conducted for two parallel phases. The Data Preparation phase (Case A-1) was completed with a 93% Success Score by the measurement date, and the Phase Health Score (91%) is consistent with this achievement. The Modeling phase (Case A-2) is rated at 69% (“slightly behind”), mainly because the accuracy of the model, the main output of the project, has not been measured yet and because of some difficulties in time and scope management. No negative issues were observed regarding the health of this phase. However, the high perception variability regarding data quality (in Case A-1) shows that the team has not reached a consensus on whether the data is sufficient.
While Case A is still in the pre-deployment stages, Case B recently went live, so assessments were performed for the Operations and Maintenance Phase. The success criteria for this phase differ from the pre-deployment stages, as they focus more on service-level outputs. The findings show that there are problems in meeting the contractual goals related to incident management.

Applicability of DS PRO-S

To gather insights from the practitioners who used DS PRO-S to build, measure, and evaluate their project-specific models for Project Success, Phase Success, and Phase Health, six questions were asked across three dimensions representing the three RQs of the case studies. Responses were collected on a 5-point Likert scale (where 5 indicates Strongly Agree and 1 indicates Strongly Disagree). The questions and the average response for each question (in parentheses) are as follows:
  • Instantiability:
    1. The Instantiation Kit (GQM templates, global lists) allowed for a context-specific model to be built without extensive external guidance (5).
    2. The logic for deriving metrics from higher-level success structures was clear and easy to operationalize (4.5).
  • Completeness:
    3. DS PRO-S’ modular structure adequately reflects the multi-dimensional nature of data science project success (Project, Phase, Success, Health) (5).
    4. The global lists of CSFs and success criteria provide sufficient reference content for deriving your own project’s success and health structure (5).
  • Operational Utility:
    5. The distinction between “Phase Health” (enablers) and “Phase Success” (goal achievement) provided a more accurate diagnosis of project status (5).
    6. The model outputs (assessment scores) provided actionable insights that improved decision-making compared to existing practices in your organization (5).
Additionally, the practitioners were asked open-ended questions, such as which model elements required the most effort. The Senior Data Scientist in Case A mentioned that she had put the most effort into identifying the metrics they could collect for the success criteria, which is understandable since Company A did not have documented project management processes. The Technical Lead from Company B suggested that a web application would be useful to operationalize DS PRO-S. Lastly, the PMO from Company B said that “These (assessment) results demonstrate what we actually go through in this project in a written and data-driven way”.

6. Discussion

This research proposes DS PRO-S, which is a framework-level artifact comprising a meta-model, an instantiation toolkit and an operational methodology designed to assess the success of data science projects. DS PRO-S serves as an assessment layer that can be adapted to an organization’s current maturity, complementing existing systems rather than replacing them. It facilitates a resource-light self-assessment process for projects, with clear traceability and a single point of accountability, in which the Project Manager (or a designated lead) executes DS PRO-S. Furthermore, it can be scaled to independent third-party evaluations or integrated with automated data ingestion where digital integrations are available.
Developed around the unique characteristics, implementation details, and success dimensions of data science projects, DS PRO-S differs from other models and frameworks suggested for the IS or project management domains. Unlike solutions that are primarily tailored to specific perspectives or project types in the data science domain, DS PRO-S provides a more comprehensive structure that can be instantiated for the unique characteristics of each data science initiative without losing its conceptual consistency and assessment logic. Additionally, DS PRO-S offers implementation-level guidance by supporting metric-level measurement and evaluation.
During the Demonstration and Evaluation stage of the DSRM, the DS PRO-S framework produced meaningful results for two cases with different projects and organizational settings. The meta-model formulation and elements were fully functional in processing (mainly) subjective data in the internal context of Case A (an internal predictive analytics project of a large company) and objective system logs or report extracts in the relatively more structured external context of Case B (the GenAI project of a vendor). Despite the natural information gaps (due to metrics that were not measurable at the time of evaluation), presented in the form of N/As (Not Available) in the results, the meta-model remained fully operational and provided complete and traceable assessments of success and health. The interviews conducted with the practitioners involved in the case studies strongly supported the instantiability, completeness, and operational utility of DS PRO-S. Based on the six Likert items (two per dimension; 5 = strongly agree, 1 = strongly disagree), the average scores were 4.75/5 for instantiability and 5.00/5 for both completeness and operational utility.
The main contributions of DS PRO-S can be summarized as follows:
  • Theoretical Contribution (Meta-Model Architecture):
    DS PRO-S proposes a meta-model that can be instantiated for various data science projects instead of relying on a fixed checklist of success criteria or CSFs. This approach strikes a balance between standardization, by providing a shared structure, and customization, by adapting to specific project objectives and constraints. Furthermore, it separates Success (the achievement of objectives via success criteria) from Health (enablers via CSFs). In addition, DS PRO-S broadens success beyond the “iron triangle” (i.e., scope, time, and cost) by including outcomes, such as Business Value and Financial Value, alongside outputs like Product Quality and Project Management.
  • Methodological contribution (Formal Measurement and Traceability):
    The approach keeps assessment at the metric level, with explicit metric types, normalization rules, and aggregation formulas, so scores can be traced back (“drilled down”) to base measures; an illustrative sketch of this roll-up follows this list. By aligning measurement and evaluation with ISO/IEC 15939 (MIM), indicators, rating scales, and decision criteria become more traceable and repeatable.
  • Practical Contribution (Modularity and Timely Interventions):
    DS PRO-S supports assessment during project execution rather than solely at project completion, which helps with early detection and intervention when issues arise. Its modular structure allows for partial adoption (e.g., focusing only on Phase Health during high-risk phases). Phase-level assessment reduces the risk that issues in earlier phases propagate and compound later. Finally, the provided toolkit, which includes global lists of configurable elements, templates, and method explanations, helps practitioners instantiate a valid model even when their organizational process maturity is limited.
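As an illustration of the metric-level traceability noted in the methodological contribution above, the sketch below rolls normalized metric values up through success criteria and categories to a project-level score while retaining the tree, so that any upper-level score can be drilled down to its base measures. The structure, figures, and weights are hypothetical, and weighted-average aggregation is assumed; the authoritative rules are defined in the DS PRO-S MIM specifications.

```python
# Illustrative roll-up with drill-down, assuming weighted-average aggregation
# at every level (metric -> success criterion -> category -> project).

def rollup(node):
    """Recursively compute a node's score as the weighted mean of its children."""
    if "score" in node:                     # leaf: a normalized metric value (%)
        return node["score"]
    total_weight = sum(w for _, w in node["children"])
    node["score"] = sum(rollup(child) * w
                        for child, w in node["children"]) / total_weight
    return node["score"]

project = {"name": "Project Success", "children": [
    ({"name": "Product Quality", "children": [
        ({"name": "SW compliance",  "score": 66.0}, 0.5),
        ({"name": "Model quality",  "score": 98.0}, 0.5),
    ]}, 0.6),
    ({"name": "Project Management", "children": [
        ({"name": "Schedule adherence", "score": 100.0}, 1.0),
    ]}, 0.4),
]}

print(f"Project score: {rollup(project):.0f}%")
for child, weight in project["children"]:       # drill-down, one level deep
    print(f"  {child['name']}: {child['score']:.0f}% (weight {weight})")
```

Because every intermediate node keeps its computed score, a weak project-level result (here pulled down by the hypothetical 66% SW compliance leaf) can be traced to its source, which is exactly the drill-down behavior exploited in the case studies.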
While DS PRO-S makes several contributions as outlined above, it also has limitations, as follows:
  • DS PRO-S was not applied across the whole lifecycle of a data science project; therefore, the Project Health module, which requires measuring the success of the phases from the beginning of the project, could not be evaluated in a real-world environment.
  • As in any other measurement, monitoring, or controlling system, the selection of indicators, their weights, and the target values against which measurements are made can affect the results. In DS PRO-S, this risk is minimized by introducing a project sponsor or upper-management approval mechanism for the baselines and versions of the models instantiated for projects. In the case studies, this approval mechanism could not be executed, as DS PRO-S was not part of the case study organizations’ formal project governance processes.
  • Subjective measurements are enabled in DS PRO-S to account for the diverse maturities of organizations, which may affect the objectivity of the assessment results. Requiring at least two respondents for subjective measurements was a tactic to reduce the effects of this limitation.

Threats to Validity

To reduce potential threats to validity and to maintain the quality of this research, we applied several strategies throughout:
  • Construct Validity: The core concepts and early outputs (e.g., design principles) were grounded in established theories and the prior literature using systematic and multivocal reviews. Before the case studies, expert interviews were used to check the comprehensiveness, consistency, and comprehensibility of the model elements and the operational methodology.
  • Internal Validity: During the case studies, assessment results were discussed with stakeholders to confirm that the output reflects the actual project situation. Scenario-based demonstrations were used to verify the internal logic and consistency of the measurement and evaluation calculations, and to remove potential errors before real-world use. Traceability from objectives and HLRs to design decisions and outputs also supported the controlled refinement of the model over iterations.
  • External Validity: Two different case organizations were selected to explore the model’s applicability in diverse settings. The assessments were considered meaningful by the stakeholders in both cases, suggesting that the approach can be transferred beyond a single project type or organizational context.
  • Reliability: Where possible, computations relied on objective measures; when subjective inputs were required, feedback from more than one stakeholder was collected to reduce single-person bias and to keep the evaluation more consistent.

7. Conclusions

DS PRO-S represents the first data science-specific project success assessment model to integrate a context-adaptive meta-model with a fully formalized and operationalized methodology. By bridging the gap between theoretical frameworks and practical application, it offers a scientifically grounded yet practically implementable solution for managing the unique complexities of data science projects.
Applying DS PRO-S in a predictive analytics project within a large energy enterprise and in a GenAI project delivered by a vendor provided strong support for its instantiability, completeness, and operational utility.
As future work, several studies and extensions of DS PRO-S could be pursued to mitigate the above limitations:
  • A longitudinal case study could be conducted to cover all phases of a data science project, from inception to completion.
  • Pre-configured template models for key sub-domains, project types, or specific organizational settings in the data science domain could be produced, with tailored instantiation kits, to increase adoption.
  • A decision support system could be developed to suggest CSFs and success criteria based on organizational processes and project documentation.
  • The DS PRO-S meta-model, instantiation kit, and methodology could be packaged as a software application so that organizations can adopt DS PRO-S more easily.
By conducting the highlighted future studies, it will be possible to further improve DS PRO-S and promote its widespread adoption, ultimately contributing to translating theoretical breakthroughs in data science into practical applications.

Author Contributions

Conceptualization, G.T.G., E.G. and P.E.E.; methodology, G.T.G.; investigation, G.T.G.; validation, G.T.G.; writing—original draft preparation, G.T.G.; writing—review and editing, E.G.; visualization, G.T.G.; supervision, E.G. and P.E.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Not Applicable.

Data Availability Statement

The original contributions of this research are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DS PRO-S: Data Science Projects Success Assessment Model
CSF: Critical Success Factor
DSR: Design Science Research
DSRM: Design Science Research Methodology
RQ: Research Question
MLR: Multivocal Literature Review
SLR: Systematic Literature Review
HLR: High-Level Requirements
P-DOTS: Project, Data, Organization, Technology, Strategy
MIM: Measurement Information Model
IS: Information Systems
PMO: Project Management Office(r)
GQM: Goal Question Metric

Appendix A

Appendix A includes detailed tables for Section 2 Methodology.
Table A1. High-level Requirements (HLR) of the model.
ID | High-Level Requirement | Objective | Reference
HLR-0 | Provide mechanisms for defining Project Success constructs. | O-0 | Basic Capability—Define success constructs
HLR-1 | Provide mechanisms and/or rules for measuring each defined success construct. | O-0 | Basic Capability—Measure success constructs
HLR-2 | Enable evaluation of measurement results to support decisions. | O-0 | Basic Capability—Evaluate results
HLR-3 | Provide a comprehensive definition of success constructs (i.e., beyond cost, time, and scope). | O-1 | “The project success concept has more criteria than that of the iron triangle” [65]
HLR-4 | Enable both result-based and enabler-based assessment so that interventions can be timely and targeted. | O-2 | Fundamental purpose of the model
HLR-5 | Reflect data science project characteristics. | O-3 | “Each project should be measured in a context-specific way.” [48,51]
HLR-6 | Enable customization to project-specific circumstances. | O-5 |
HLR-7 | Provide a core backbone of the shared categories of success and evaluation flows for benchmarking and comparison. | O-6 | “There is a need for common elements that can be compared between projects including project success assessment methods, monitoring and controlling projects.” [52]
HLR-8 | Incorporate stakeholder perspectives into the success assessment process, where feasible. | O-1 | “Therefore, it is very important for project success to be measured by taking into consideration the perceptions or business values of the project from the viewpoint of those stakeholders who are possible beneficiaries of the project.” [46]
HLR-9 | Enable assessment throughout the lifecycle of projects. | O-4 | “In practice, model/methods for success evaluation should be defined considering not only the performance during the project but also the impacts post-project …” [70]
HLR-10 | Enable ex-post assessment of the project. | O-4 |
HLR-11 | Enable assessment of interim phases related to the project. | O-4 | “… the relative importance of various critical success factors are subject to change at different phases of the project implementation.” [43]
HLR-12 | Account for the influence of a phase’s success in the evaluation of its subsequent phases. | O-4 | “In each life-cycle phase, the influence of the success of the preceding phase is always significant and, in fact, far exceeds that of other success factors listed in the model.” [44]
HLR-13 | Specify concrete metrics or their derivation mechanisms to measure success constructs. | O-7 | “In many cases, even though the project owners have a fairly good knowledge of the project, they do not use such knowledge effectively in the definition of KPI and success indicators, just because they do not follow a proper methodology that provides guidelines for the definition and computation of KPI and success indicators.” [53]
HLR-14 | Provide mechanisms for adapting the assessment process to changing project conditions. | O-5 | “Moreover, reflection on project success can also change as time progresses, conditions change, and the project is viewed in longer retrospect.” [38]
HLR-15 | Provide a comprehensive set of implementation resources. | O-8 | “… questions about “who,” “when” and “how” the evaluation of project success should be done, are not well answered yet …” [71]
Table A2. Exclusions (E) and assumptions (A) of the model.
ID | Context | Explanation
A-1 | Relationship with existing process, project or quality management models, methods or standards | The model shall not replace any process, project, and quality management frameworks.
E-1 | Success at the strategic level | Scope on assessing success of projects in terms of their contribution to the long-term strategy of the organization is excluded.
Table A3. 5W’s of the model.
W-Question | Answers
Why do we assess success of data science projects? | To identify pain points before it is too late or too expensive to take corrective actions; to make informed decisions on whether to cancel unused, outdated, or no longer useful projects or to continue to support them; to understand if the targets of the project are achieved; to ensure targeted functionality or the correct utilization of the outputs of the projects; (secondary) to compare performance across projects; (secondary) to drive continuous improvement
What aspects of the data science projects do we assess? | Meeting project objectives; ensuring enabling conditions
Who conducts and acts on the assessments? | Conducted by: Project Manager, Project Lead, PMO Analyst, or any relevant role at the organization. Actioned by: Project/Portfolio Managers, Department Heads, Executives, Sponsors, Steering Committee, or any relevant role at the organization
Where in the lifecycle of data science projects or organizational processes do assessments belong? | Pre-selection processes; pre-deployment phases; post-deployment phases
When do we run assessments? | (Options) After a phase closes; at predefined intervals (e.g., every month); on milestone dates (e.g., release); triggered by threshold breaches (e.g., 100% consumption of budget)
Table A4. Design principles and key design elements.
No. | Design Focus | Related HLRs | Design Principle | Design Elements (Meta-Model; Instantiation)
DP-1 | Success Constructs | HLR-0, HLR-4 | Build the model on two basic success constructs: success criteria (what defines success) and CSFs (what increases success likelihood) | Success Criterion; CSF
DP-2 | Project Component > Project Structure | HLR-0, HLR-9 | Represent each project with a goal and an ordered sequence of phases. | Project; Goal; Phase
DP-3 | Model Components | HLR-0, HLR-1, HLR-2 | Group model elements under four core components: Project, Success, Measurement, and Evaluation | Project Component; Success Component; Measurement Component; Evaluation Component
DP-4 | Defining Success Constructs > Categories as Backbone | HLR-0, HLR-3, HLR-7 | For benchmarking and comparison, maintain a fixed backbone of categories while allowing success constructs to fit into that backbone | Categories for success criteria; Categories for CSFs
DP-5 | Defining Success Constructs > Prioritization | HLR-6 | Reflect the context-specificity of success by enabling the prioritization of success constructs and categories | Weights; Weighting Methods
DP-6 | Defining Success Constructs > Goal and Success Category Connection | HLR-7 | Distribute the goal into a set of objectives which map to success categories; success criteria will be selected for each objective. | Objective; Categories for success criteria; Success Criterion
DP-7 | Defining Success Constructs > Success Criteria Elicitation | HLR-0, HLR-6, HLR-13 | Facilitate the selection of success criteria and their associated metrics by providing examples or derive them using an applicable derivation method | Success criterion metric; Derivation method of success criteria; Candidate success criterion/metric catalog; Template(s) for the derivation of success criteria
DP-8 | Defining Success Constructs > CSF Elicitation | HLR-0, HLR-6 | Facilitate the selection of CSFs by providing examples or derive them using an applicable derivation method | Critical success factor; Derivation method of CSFs; Candidate CSF catalog; Template(s) for the derivation of CSFs
DP-9 | Measuring Success Constructs | HLR-1, HLR-10, HLR-13 | Organize measurements and evaluation at both project and phase levels. | Phase-level assessments; project-level assessments
DP-10 | Measuring Success Constructs | HLR-1, HLR-13 | Support both quantitative measurement (defined metrics where data is available) and qualitative measurement (simple attainment or graded scales) | Measurement specification
DP-11 | Measuring Success Constructs | HLR-2, HLR-12, HLR-13 | Aggregate leaf-level measurements into upper-level scores via a defined measurement framework and configurable weighting methods | Measurement specification; Weighting method; Weights; Weighting method options
DP-12 | Evaluating Results | HLR-2, HLR-12, HLR-13 | Provide interpretation guidelines to support decision-making | Rating scales and decision criteria; Templates for rating scales and decision criteria
DP-13 | Defining Success Constructs | HLR-12 | Allow for reflections on a phase’s success to affect the successive phase (e.g., via a CSF) | CSF or success criterion
DP-14 | Assessment Frequency | HLR-9, HLR-10, HLR-11 | Establish a minimal managed frequency (e.g., once at the phase end) and allow for periodic plans to be defined (e.g., monthly) | Assessment schedule
DP-15 | Stakeholder Perspectives Integration | HLR-8 | Capture stakeholder views for subjective measurements (e.g., for CSFs) | Measurement specification
DP-16 | Adaptiveness | HLR-14 | Support re-baselining of configured constructs and elements as project conditions evolve. | Re-baselining; Versioning
DP-17 | Implementation Resources | HLR-15 | Provide all necessary implementation processes, templates, and recommended methods so practitioners know how to execute each assessment step. | Process diagrams (BPMN); User Guide; Template Library (Forms)
DP-18 | Data Science Specificness | HLR-5 | Enable the model to meet data science project-specific characteristics and allow candidate instantiations to be suggested for each core model element | To be determined (adjusted/new model elements, attributes, or functionality); Candidate model element (e.g., success criteria, metric, CSF, goal, phase, etc.) catalogs
Table A5. Applicable approaches and techniques.
Design Principle | Purpose | Method(s) | For Model Usage | For Model Development (To Do → Step)
DP-6, DP-7 | Success Criteria Elicitation | GQM | Practitioners can utilize this method to determine success criteria under each success category | Provide GQM templates as a guide in the Instantiation Toolkit → Prepare the Instantiation Toolkit
DP-8 | CSFs Elicitation | (Proposed) Phase-Challenge-CSF | Practitioners can utilize this method to determine CSFs for each phase by identifying tasks and challenges | Review and provide data science project typical challenges, project phases, and main tasks in the Instantiation Kit; provide a short description of this method → Investigate Domain Project Characteristics; Prepare the Instantiation Toolkit
DP-5 | Weighting Method | AHP; $100 Allocation; T-Shirt Sizing | Practitioners will select one of these methods (or use their organization’s own method) to determine the weights of success constructs or categories | Provide basic descriptions and links for the methods in the Instantiation Kit → Prepare the Instantiation Toolkit
DP-10, DP-11 | Measurement and Evaluation Method | Measurement Information Model (MIM) | N/A | Construct the meta-model’s measurement and evaluation specifications (including entities, rules, formulations, etc.); develop MIMs for phase- and project-level measurements and evaluations → Construct the Meta-Model as a Blueprint; Consider the developed MIMs in creating the operational methodology → Develop the Implementation Guidance
DP-12 | Rating Scales and Decision Criteria | RAG rating system; RAG + B rating system; Rating scale in ISO/IEC 33020 [72] | N/A | Suggest configurable rating scales with thresholds and decision criteria based on these approaches, and use them in developing MIMs → Construct the Meta-Model as a Blueprint
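As a pointer for practitioners choosing among the weighting options above, the sketch below derives weights with AHP (the first weighting method in Table A5) using the common geometric-mean approximation of the principal eigenvector of a pairwise comparison matrix. The categories and pairwise judgments are hypothetical; DS PRO-S only requires that some documented weighting method be applied.

```python
import math

# Illustrative AHP weighting via the geometric-mean approximation of the
# principal eigenvector. Categories and pairwise judgments are hypothetical.

categories = ["Product Quality", "Project Management", "Business Value"]

# pairwise[i][j]: how much more important category i is than category j
# (reciprocal matrix, Saaty-style judgments).
pairwise = [
    [1.0,   3.0, 0.5],
    [1 / 3, 1.0, 0.2],
    [2.0,   5.0, 1.0],
]

geo_means = [math.prod(row) ** (1 / len(row)) for row in pairwise]
weights   = [g / sum(geo_means) for g in geo_means]

for category, weight in zip(categories, weights):
    print(f"{category}: {weight:.2f}")   # weights sum to 1.0
```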
Table A6. Base measures, derived measures, and indicators in DS PRO-S modules.

Project Success
  Base measures (input data): measured value of success criterion metric $m$ at time $t$, $x_m^t$ or $r_{m,e}^t$; measurement timestamp $t$.
  Derived measures (calculated/normalized): normalized metric value (per metric, %), $NV_m^t$; standard deviation of normalized subjective responses (per metric, percentage points), $SD_m^t$; duration consumption rate (per metric, %), $DCR_m^t$; relative metric achievement (per metric, %), $RA_m^t$; array of possible final normalized values (per metric, array of %), $FNV_m^t$.
  Indicators (for decision support): absolute success scores (%), $SC_{abs}^{sc,t}$, $CAT_{abs}^{c,t}$, $GRP_{abs}^{g,t}$, $PRO_{abs}^{p,t}$; relative success scores (%), $SC_{rel}^{sc,t}$, $CAT_{rel}^{c,t}$, $GRP_{rel}^{g,t}$, $PRO_{rel}^{p,t}$; SC perception variability (percentage points), $PV_{sc}^t$; weighted evidence coverage (data completeness, %), $Cov_{sc}^t$, $Cov_c^t$, $Cov_g^t$, $Cov_p^t$; maturity shares (%), $Mat_{sc}^t$, $Mat_c^t$, $Mat_g^t$, $Mat_p^t$; array of possible final absolute success scores (for any level $L$), $PFS_L^t$.

Phase Success
  Base measures (input data): measured value of success criterion metric $m$ at time $t$, $x_m^t$ or $r_{m,e}^t$; measurement timestamp $t$.
  Derived measures (calculated/normalized): as for Project Success, i.e., $NV_m^t$, $SD_m^t$, $DCR_m^t$, $RA_m^t$, and $FNV_m^t$.
  Indicators (for decision support): absolute success scores (%), $SC_{abs}^{sc,t}$, $CAT_{abs}^{c,t}$, $PHA_{abs}^{ph,t}$; relative success scores (%), $SC_{rel}^{sc,t}$, $CAT_{rel}^{c,t}$, $PHA_{rel}^{ph,t}$; SC perception variability (percentage points), $PV_{sc}^t$; weighted evidence coverage (data completeness, %), $Cov_{sc}^t$, $Cov_c^t$, $Cov_{ph}^t$; maturity shares (%), $Mat_{sc}^t$, $Mat_c^t$, $Mat_{ph}^t$; array of possible final absolute success scores (for any level $L$), $PFS_L^t$.

Project Health
  Base measures (input data): Phase Success Scores (%) available at time $t$, $PHA_{abs}^{ph,t}$; measurement timestamp $t$.
  Derived measures (calculated/normalized): CSF achievement score (per CSF, %), $HCSF_{csf,t}$; duration consumption rate (per phase, %), $HDCR_{ph}^t$.
  Indicators (for decision support): Project Health Score (%), $HPRO_{p,t}$; project maturity share (time to target, %), $HMat_p^t$.

Phase Health
  Base measures (input data): raw evaluator ($e$) response to a question $q$ of a critical success factor at time $t$, $hr_{q,e}^t$; measurement timestamp $t$.
  Derived measures (calculated/normalized): normalized response value (per question, %), $HNV_q^t$; standard deviation of normalized responses (per question, percentage points), $HSD_q^t$; duration consumption rate (per phase, %), $HDCR_{ph}^t$.
  Indicators (for decision support): CSF Achievement Score (%), $HCSF_{csf,t}$; CSF achievement perception variability, $HPV_{csf,t}$; CSF category score (%), $HCAT_{c,t}$; Phase Health Score (%), $HPHA_{ph,t}$; phase maturity share (time to target, %), $HMat_{ph}^t$.
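Since the Project Health module could not be exercised in the case studies (see Section 6), the sketch below only illustrates how its Table A6 indicators might be computed, assuming the Project Health Score is a weighted average of the phase success scores available at the measurement time and the maturity share is the elapsed fraction of the planned project duration. All figures, weights, and dates are hypothetical; the authoritative formulation is the one defined in the DS PRO-S MIM.

```python
from datetime import date

# Hypothetical PHA_abs(ph, t) values available at time t, with phase weights.
phase_scores = {
    "Data Preparation": (93.0, 0.5),
    "Modeling":         (69.0, 0.5),
}

total_weight = sum(w for _, w in phase_scores.values())
h_pro = sum(score * w for score, w in phase_scores.values()) / total_weight

# Maturity as the elapsed share of the planned project duration (time to target).
start, end, today = date(2025, 1, 1), date(2025, 12, 31), date(2025, 7, 1)
h_mat = (today - start).days / (end - start).days * 100

print(f"Project Health Score (HPRO): {h_pro:.0f}%  |  Maturity (HMat): {h_mat:.0f}%")
```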

Appendix B

Appendix B includes detailed tables for Section 5: Multiple Case Studies.

Appendix B.1

Here, the tables and figures of Case Study A are provided.
Table A7. Case A—Questions for Project Success.
Question No. | Question | Respondents | Data Source
Q1 | What are the daily absolute errors for the last month? | Data Scientist | Monthly records in Analytics Dashboard
Q2 | How satisfied are you with the solution’s prediction performance? | At least two Control Operators (BU) | Survey
Q3 | How much of the main work packages and outputs (deliverables) defined within the project scope has been completed? | Data Scientists (2); Process Engineer (BU) | Survey
Q4 | Is the overall progress status of the project in alignment with the targeted project schedule (timeline)? | Data Scientists (2); Process Engineer (BU) | Survey
Q5 | To what extent are the project management processes (encompassing scope, resource, and risk management) being effectively executed within this specific phase? | Data Scientists (2); Process Engineer (BU) | Survey
Q6 | How much did the daily charged material increase? | Data Scientist | Monthly records in Analytics Dashboard
Q7 | How much revenue was made due to the daily charged material increase? | Data Scientist | Monthly records in Analytics Dashboard
Figure A1. Case A—Success criteria and metrics for Project Success.
Figure A2. Case A—Metric sheet for Project Success (S: Subjective, O: Objective, SS: Steady-state, CU: Cumulative).
Table A8. Case A—Questions for Data Preparation Phase Success.
Question No. | Question | Respondents | Data Source
Q1 | What is the ratio of Lines without N/A to Total Lines of data? | Data Scientist | Local Database
Q2 | What proportion of the data preparation tasks (e.g., cleaning, transformation) planned for this phase has been successfully completed? | Data Scientists (2); Process Engineer (BU) | Survey
Q3 | Is the current completion level of this phase in accordance with the targeted schedule? | Data Scientists (2); Process Engineer (BU) | Survey
Q4 | To what extent are the phase management processes (encompassing scope, resource, and risk management) being effectively executed within this specific phase? | Data Scientists (2); Process Engineer (BU) | Survey
Table A9. Case A—Questions for Modeling Phase Success.
Question No. | Question | Respondents | Data Source
Q1 | What is the Mean Absolute Error (of test cycles)? | Data Scientist | Local Database
Q2 | What proportion of the modeling tasks (e.g., testing alternative models) planned within this phase has been successfully completed? | Data Scientists (2); Process Engineer (BU) | Survey
Q3 | Is the current completion level of this phase in accordance with the targeted schedule? | Data Scientists (2); Process Engineer (BU) | Survey
Q4 | To what extent are the phase management processes (encompassing scope, resource, and risk management) being effectively executed within this specific phase? | Data Scientists (2); Process Engineer (BU) | Survey
Figure A3. Case A—Success criteria and metrics for Data Preparation Phase Success (N/A: Not available).
Figure A4. Case A—Success criteria and metrics for Modeling Phase Success.
Figure A5. Case A—Metric sheet for Data Preparation Phase Success (S: Subjective, O: Objective, MS: Milestone, SS: Steady-state, CU: Cumulative).
Figure A6. Case A—Metric sheet for Modeling Phase Success (S: Subjective, O: Objective, MS: Milestone, SS: Steady-state, CU: Cumulative).
Table A10. Case A—Questions for Data Preparation Phase Health.
Question No. | Statement (for Likert Scale Answers) | Respondents
Q1 | The quality of the data used in this phase is sufficient to perform the targeted analyses. | Data Scientists (2)
Q2 | Data governance policies (access, anonymization, GDPR, etc.) ensure access to the necessary data without slowing down the data preparation processes. | Data Scientists (2)
Q3 | Senior management actively provides the necessary support (resources, approval, motivation) for the intensive data processing and transformation processes. | Data Scientists (2); Process Engineer (BU)
Q4 | The outputs obtained in the previous phase (Data Understanding) are clear and sufficient to carry out our work in this phase smoothly. | Data Scientists (2)
Q5 | The business unit is willing and participatory in providing the necessary domain knowledge for the work in this phase. | Data Scientists (2)
Q6 | Communication within the project team (information flow, meeting efficiency, etc.) is open, timely, and effective. | Data Scientists (2); Process Engineer (BU)
Q7 | The project team has the technical and business competence required to successfully complete the tasks in this phase. | Data Scientists (2); Process Engineer (BU)
Q8 | The project team demonstrates high effort and participation in carrying out the work in this phase. | Data Scientists (2); Process Engineer (BU)
Q9 | The project team clearly understands the strategic objectives of the project and the business problem to be solved. | Data Scientists (2); Process Engineer (BU)
Q10 | The technical infrastructure (pipeline/integrations) providing data flow is working uninterrupted; there are no technical issues or delays in data access. | Data Scientists (2)
Table A11. Case A—Questions for Modeling Phase Health.
Question No. | Statement (for Likert Scale Answers) | Respondents
Q1 | The project team clearly understands the project’s strategic objectives and the business problem it aims to solve. | Data Scientists (2); Process Engineer (BU)
Q2 | The project team demonstrates a high level of participation and effort in model development and improvement. | Data Scientists (2); Process Engineer (BU)
Q3 | The project team possesses the necessary technical expertise to implement the selected algorithms and conduct the modeling process. | Data Scientists (2); Process Engineer (BU)
Q4 | The outputs obtained in the data preparation phase are clear and sufficient to ensure the smooth execution of our work in this phase. | Data Scientists (2)
Q5 | The business unit is willing and committed to reviewing the results and providing domain knowledge during the modeling phase. | Data Scientists (2)
Q6 | Intra-team and inter-stakeholder communication regarding technical decisions and results in the modeling process is open and effective. | Data Scientists (2); Process Engineer (BU)
Q7 | Senior management actively provides the necessary support (resources, approval, motivation) for the modeling phase. | Data Scientists (2); Process Engineer (BU)
Q8 | Tools supporting intra-team collaboration (e.g., Git, Dataiku, Azure DevOps, shared Notebook environments) are sufficient and accessible for the modeling process. | Data Scientists (2)
Q9 | Sufficient technical research is being conducted on different algorithms and approaches to select the most appropriate method for the problem. | Data Scientists (2)
Q10 | The necessary infrastructure and tools have been provided to automate repetitive modeling processes (training, testing, data flow). | Data Scientists (2)

Appendix B.2

This section includes the tables and figures of Case Study B.
Table A12. Case B—Questions for Project Success.
No. | Question | Respondents | Data Source
Q1 | What is the relevance score? | Tech Lead | Evaluation Reports
Q2 | What is the attribution ratio? | Tech Lead | RAG Logs
Q3 | What is the fluency score? | Tech Lead | Human Evaluation Reports
Q4 | What is the coherence score? | Tech Lead | Human Evaluation Reports
Q5 | What is the cosine similarity score? | Tech Lead | Evaluation Dataset
Q6 | What is the average end-to-end response latency recorded in system logs? | AI Team | System Logs
Q7 | What percentage of the software requirements are met? | Tech Lead | Technical Compliance Report
Q8 | What was the average robustness score recorded in the post-deployment adversarial stress test? | AI Team | Test Report
Q9 | What is the ratio of progress reports actually delivered vs. contractually required? | PMO | Email Archive
Q10 | What percentage of the work packages (WPs) has been delivered? | PMO | MS Project
Q11 | What is the variance (in days) between the planned and actual Go-Live date? | Tech Lead | Project Baseline
Q12 | What percentage of the total workforce uses the assistant daily? | Tech Lead | Database Logs
Q13 | What is the latest satisfaction rate by the Client? | Tech Lead | Feedback Survey
Q14 | What percentage of the total contract value was successfully invoiced without deduction? | PMO | Invoices/ERP
Figure A7. Case B—Success criteria and metrics for Project Success.
Figure A8. Case B—Metric sheet for Project Success (S: Subjective, O: Objective, MS: Milestone, SS: Steady-state, CU: Cumulative).
Table A13. Case B—Questions for Phase Success.
No. | Question | Respondents | Data Source
Q1 | What was the total realized downtime (in hours) for the system last month? | AI Engineer | Azure Monitoring
Q2 | What percentage of the periodic quality checks (Relevance, Latency, Groundedness, etc.) remained within the acceptable thresholds last month? | AI Engineer | Customer Evaluation Forms
Q3 | How many required progress reports were successfully submitted to the client last month? | PMO | E-mail Archives
Q4 | What was the average time to respond to critical incidents within the last month? | AI Engineer | Teams Logs
Figure A9. Case B—Success criteria and metrics for Phase Success.
Figure A10. Case B—Metric sheet for Phase Success.
Table A14. Case B—Questions for Phase Health.
No. | Statement (for Likert Scale Answers)
Q1 | User feedback and potential issues observed in the live environment are communicated to the technical team quickly, and the resolution process is handled with transparent communication.
Q2 | The system’s architectural health and operational performance (e.g., response time, availability) are monitored in real time and effectively using advanced monitoring tools.
Q3 | Frequently changing model versions, prompt templates, and configuration files are managed in a controlled and traceable way, so that they do not create unnecessary complexity.
Q4 | The Client (or business unit) has assigned a dedicated responsible person to follow the operational process and provide improvement suggestions.
Q5 | The team has the technical capability (e.g., cloud, artificial intelligence) required to deliver the project work packages and to produce customized solutions based on emerging customer needs.
Q6 | The data sources and document formats feeding the system (e.g., PDF, Excel) are stable and consistent enough not to disrupt the model’s operation.
Q7 | Users have the necessary prompt engineering skills to obtain correct outputs from the model, and they use the system effectively.
Q8 | This project is aligned with the (client) organization’s overall AI (artificial intelligence) adoption strategy and long-term vision.

References

  1. Sharma, S. Data Is Essential to Digital Transformation. Available online: https://www.forbes.com/councils/forbestechcouncil/2020/12/03/data-is-essential-to-digital-transformation/ (accessed on 13 January 2026).
  2. VentureBeat. Why Do 87% of Data Science Projects Never Make It Into Production? Available online: https://venturebeat.com/ai/why-do-87-of-data-science-projects-never-make-it-into-production/ (accessed on 15 July 2023).
  3. Challapally, A.; Pease, C.; Raskar, R.; Chari, P. State of AI in Business 2025; MIT NANDA: Cambridge, MA, USA, 2025.
  4. Gartner. Gartner Data & Analytics Summit 2024 London: Day 1 Highlights; Gartner: Stamford, CT, USA, 2024. Available online: https://www.gartner.com/en/newsroom/press-releases/2024-05-13-gartner-data-and-analytics-summit-london-2024-day-1-highlights (accessed on 12 February 2026).
  5. Saltz, J.S. The Need for New Processes, Methodologies and Tools to Support Big Data Teams and Improve Big Data Project Effectiveness. In Proceedings of the 2015 IEEE International Conference on Big Data, IEEE Big Data, Santa Clara, CA, USA, 29 October–1 November 2015; pp. 2066–2071.
  6. Aho, T.; Sievi-Korte, O.; Kilamo, T.; Yaman, S.; Mikkonen, T. Demystifying Data Science Projects: A Look on the People and Process of Data Science Today. In Proceedings of the Product-Focused Software Process Improvement, Turin, Italy, 25–27 November 2020; Morisio, M., Torchiano, M., Jedlitschka, A., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 153–167.
  7. Midler, C.; Alochet, M. Understanding the Phoenix Phenomenon: Can a Project Be Both a Failure and a Success? Proj. Manag. J. 2024, 55, 187–204.
  8. Ika, L.A.; Pinto, J.K. The “Re-Meaning” of Project Success: Updating and Recalibrating for a Modern Project Management. Int. J. Proj. Manag. 2022, 40, 835–848.
  9. Baccarini, D. The Logical Framework Method for Defining Project Success. Proj. Manag. J. 1999, 30, 25–32.
  10. Rode, A.L.G.; Svejvig, P.; Martinsuo, M. Developing a Multidimensional Conception of Project Evaluation to Improve Projects. Proj. Manag. J. 2022, 53, 416–432.
  11. Takagi, N.; Varajão, J.; Ventura, T.; Ubialli, D.; Silva, T. Implementing Success Management and PRINCE2 in a BPM Public Project. In Proceedings of the ACIS 2021, Sydney, Australia, 6–10 December 2021. Available online: https://aisel.aisnet.org/acis2021/4/ (accessed on 26 December 2025).
  12. Takagi, N.; Varajão, J. Success Management and the Project Management Body of Knowledge (PMBOK): An Integrated Perspective. Int. Res. Workshop IT Proj. Manag. 2020, 6.
  13. Takagi, N.; Varajão, J. ISO 21500 and Success Management: An Integrated Model for Project Management. Int. J. Qual. Reliab. Manag. 2021, 39, 408–427.
  14. Project Management Institute. PMI PMBOK Guide; Project Management Institute: Newtown Square, PA, USA, 2021.
  15. ISO 21502:2020; Project, Programme and Portfolio Management—Guidance on Project Management. ISO: Geneva, Switzerland, 2020.
  16. The Stationery Office. Managing Successful Projects with PRINCE2; The Stationery Office: Norwich, UK, 2017; ISBN 978-0-11-331533-8.
  17. Rigo, P.D.; Siluk, J.C.M.; Lacerda, D.P.; Rediske, G.; Rosa, C.B. A Model for Measuring the Success of Distributed Small-Scale Photovoltaic Systems Projects. Sol. Energy 2020, 205, 241–253.
  18. Zavadskas, E.K.; Vilutienė, T.; Turskis, Z.; Šaparauskas, J. Multi-Criteria Analysis of Projects’ Performance in Construction. Arch. Civ. Mech. Eng. 2014, 14, 114–121.
  19. Elkarmi, F.; Shikhah, N.A.; Alomari, Z.; Alkhatib, F. A Novel Methodology for Project Assessment and Evaluation. J. Serv. Sci. Manag. 2011, 4, 261–267.
  20. Barclay, C.; Osei-Bryson, K.-M. Project Performance Development Framework: An Approach for Developing Performance Criteria & Measures for Information Systems (IS) Projects. Int. J. Prod. Econ. 2010, 124, 272–292.
  21. McLeod, L.; Doolin, B.; MacDonell, S.G. A Perspective-Based Understanding of Project Success. Proj. Manag. J. 2012, 43, 68–86.
  22. Joseph, N.; Marnewick, C. The Continuum of Information Systems Project Success: Reflecting on the Correlation between Project Success Dimensions. S. Afr. Comput. J. 2021, 33, 37–58.
  23. Martinez, I.; Viles, E.; Olaizola, I.G. Data Science Methodologies: Current Challenges and Future Approaches. Big Data Res. 2021, 24, 100183.
  24. Dukino, C.; Kutzias, D.; Link, M. Roles and Competences of Data Science Projects. In Proceedings of the International Conference on the Human Side of Service Engineering 2022, New York, NY, USA, 24–28 July 2022.
  25. Guo, J.X. Measuring Information System Project Success through a Software-Assisted Qualitative Content Analysis. Inf. Technol. Libr. 2019, 38, 53–70.
  26. Gao, J.; Koronios, A.; Selle, S. Towards A Process View on Critical Success Factors in Big Data Analytics Projects. In Proceedings of the Twenty-First Americas Conference on Information Systems, Fajardo, Puerto Rico, 13–15 August 2015.
  27. Miller, G.J. Artificial Intelligence Project Success Factors—Beyond the Ethical Principles. In Proceedings of the Information Technology for Management: Business and Social Issues, Sofia, Bulgaria, 4–7 September 2022; Ziemba, E., Chmielarz, W., Eds.; Springer International Publishing: Cham, Switzerland, 2022; pp. 65–96.
  28. vom Brocke, J.; Hevner, A.; Maedche, A. Introduction to Design Science Research. In Design Science Research. Cases; vom Brocke, J., Hevner, A., Maedche, A., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 1–13; ISBN 978-3-030-46781-4.
  29. Peffers, K.; Tuunanen, T.; Rothenberger, M.A.; Chatterjee, S. A Design Science Research Methodology for Information Systems Research. J. Manag. Inf. Syst. 2007, 24, 45–77.
  30. Gökay, G.T.; Nazlıel, K.; Şener, U.; Gökalp, E.; Gökalp, M.O.; Gençal, N.; Dağdaş, G.; Eren, P.E. What Drives Success in Data Science Projects: A Taxonomy of Antecedents. In Proceedings of the International Conference on Computing, Intelligence and Data Analytics (ICCIDA 2022), Kocaeli, Turkey, 16–17 September 2022; García Márquez, F.P., Jamil, A., Eken, S., Hameed, A.A., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 448–462.
  31. Tsoy, M.; Staples, D.S. What Are the Critical Success Factors for Agile Analytics Projects? Inf. Syst. Manag. 2021, 38, 324–341.
  32. Demir, N.; Aysolmaz, B.; Özcan-Top, Ö. Critical Success Factors in Data Analytics Projects: Insights from a Systematic Literature Review. In Proceedings of the Disruptive Innovation in a Digitally Connected Healthy World, Heerlen, The Netherlands, 11–13 September 2024; van de Wetering, R., Helms, R., Roelens, B., Bagheri, S., Dwivedi, Y.K., Pappas, I.O., Mäntymäki, M., Eds.; Springer Nature Switzerland: Cham, Switzerland, 2024; pp. 129–141.
  33. Gökay, G.T.; Gökalp, E.; Eren, P.E. Data Science Projects: A Systematic Literature Review on Characteristics, Implementation, and Challenges. In Proceedings of the International Conference on Information Technology and Applications, ICITA 2025, Oslo, Norway, 14–16 October 2025; Lecture Notes in Networks and Systems; Springer: Singapore, 2026.
  34. Kraut, N.; Transchel, F. On the Application of SCRUM in Data Science Projects. In Proceedings of the 2022 7th International Conference on Big Data Analytics (ICBDA), Guangzhou, China, 4 March 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–9.
  35. Saltz, J.; Shamshurin, I. Big Data Team Process Methodologies: A Literature Review and the Identification of Key Factors for a Project’s Success. In Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 5–8 December 2016; p. 2879.
  36. Bannerman, P. Defining Project Success: A Multi-Level Framework. In Proceedings of the Project Management Institute Research Conference, Warsaw, Poland, 13–16 July 2008; pp. 1–14.
  37. Shenhar, A.J.; Dvir, D. Reinventing Project Management: The Diamond Approach to Successful Growth and Innovation; Harvard Business Review Press: Brighton, MA, USA, 2007; ISBN 978-1-59139-800-4.
  38. Zwikael, O.; Meredith, J. Evaluating the Success of a Project and the Performance of Its Leaders. IEEE Trans. Eng. Manag. 2021, 68, 1745–1757.
  39. DeLone, W.H.; McLean, E.R. Information Systems Success: The Quest for the Dependent Variable. Inf. Syst. Res. 1992, 3, 60–95.
  40. DeLone, W.H.; McLean, E.R. The DeLone and McLean Model of Information Systems Success: A Ten-Year Update. J. Manag. Inf. Syst. 2003, 19, 9–30.
  41. Nelson, R.R. Project Retrospectives: Evaluating Project Success, Failure, and Everything In Between. MIS Q. Exec. 2008, 4, 5.
  42. Varajão, J. The Many Facets of Information Systems (+projects) Success. Int. J. Inf. Syst. Proj. Manag. 2018, 6, 5–13.
  43. Pinto, J.K.; Prescott, J.E. Variations in Critical Success Factors Over the Stages in the Project Life Cycle. J. Manag. 1988, 14, 5–18.
  44. Khang, D.B.; Moe, T.L. Success Criteria and Factors for International Development Projects: A Life-Cycle-Based Framework. Proj. Manag. J. 2008, 39, 72–84.
  45. de Wit, A. Measurement of Project Success. Int. J. Proj. Manag. 1988, 6, 164–170.
  46. Siddique, L.; Hussein, B.A. A Qualitative Study of Success Criteria in Norwegian Agile Software Projects from Suppliers’ Perspective. Int. J. Inf. Syst. Proj. Manag. 2022, 4, 63–79.
  47. Dvir, D.; Lipovetsky, S.; Shenhar, A.; Tishler, A. In Search of Project Classification: A Non-Universal Approach to Project Success Factors. Res. Policy 1998, 27, 915–935.
  48. Shenhar, A.J.; Dvir, D.; Lechler, T.; Poli, M. One Size Does Not Fit All: True for Projects, True for Frameworks. In Proceedings of the PMI Research Conference, Seattle, WA, USA, 14–17 July 2002; Project Management Institute: Newtown Square, PA, USA, 2002; pp. 14–17.
  49. Ahimbisibwe, A.; Daellenbach, U.; Cavana, R.Y. Empirical Comparison of Traditional Plan-Based and Agile Methodologies: Critical Success Factors for Outsourced Software Development Projects from Vendors’ Perspective. J. Enterp. Inf. Manag. 2017, 30, 400–453.
  50. Crisan, E.L.; Dan, M.; Beleiu, I.N.; Ciocoiu, E.; Beudean, P. How Critical Success Factors Combine to Influence Success? A Configurational Theory Approach on Multiple Social Projects. Int. J. Manag. Proj. Bus. 2023, 16, 767–787.
  51. Ika, L.A. Project Success as a Topic in Project Management Journals. Proj. Manag. J. 2009, 40, 6–19.
  52. Castro, M.S.; Bahli, B.; Barcaui, A.; Figueiredo, R. Does One Project Success Measure Fit All? An Empirical Investigation of Brazilian Projects. Int. J. Manag. Proj. Bus. 2020, 14, 788–805.
  53. Lavazza, L.; Frumento, E.; Mazza, R. Defining and Evaluating Software Project Success Indicators—A GQM-Based Case Study. In Proceedings of the 10th International Conference on Software Engineering and Applications, Colmar, France, 20–22 July 2015; SCITEPRESS—Science and Technology Publications: Setúbal, Portugal, 2015; pp. 105–116.
  54. Hevner, A.R.; March, S.T.; Park, J.; Ram, S. Design Science in Information Systems Research. MIS Q. 2004, 28, 75–105.
  55. Venable, J.; Pries-Heje, J.; Baskerville, R. FEDS: A Framework for Evaluation in Design Science Research. Eur. J. Inf. Syst. 2016, 25, 77–89.
  56. ISO 9001:2015; Quality Management Systems—Requirements. ISO: Geneva, Switzerland, 2015.
  57. ISO/IEC/IEEE 15939:2017; Systems and Software Engineering—Measurement Process. ISO: Geneva, Switzerland, 2017.
  58. Saltz, J.; Shamshurin, I.; Connors, C. Predicting Data Science Sociotechnical Execution Challenges by Categorizing Data Science Projects. J. Assoc. Inf. Sci. Technol. 2017, 68, 2720–2728.
  59. Kelleher, J.D.; Tierney, B. Data Science; MIT Press: Cambridge, MA, USA, 2018; ISBN 978-0-262-34703-7.
  60. Al-Debei, M.M. The Era of Business Analytics: Identifying and Ranking the Differences between Business Intelligence and Data Science from Practitioners’ Perspective Using the Delphi Method. J. Bus. Anal. 2024, 7, 94–119.
  61. Bertalanffy, L.V. General System Theory: Foundations, Development, Applications; George Braziller Inc.: New York, NY, USA, 1968; ISBN 978-0-8076-0453-3.
  62. ISO/IEC 25010:2023; Product Quality Model. ISO: Geneva, Switzerland, 2023.
  63. Bannerman, P.L.; Thorogood, A. Celebrating IT Projects Success: A Multi-Domain Analysis. In Proceedings of the 2012 45th Hawaii International Conference on System Sciences, Maui, HI, USA, 4–7 January 2012; pp. 4874–4883.
  64. Marques, A.; Varajão, J.; Sousa, J.; Peres, E. Project Management Success I-C-E Model—A Work in Progress. Procedia Technol. 2013, 9, 910–914.
  65. Howsawi, E.; Eager, D.; Bagia, R.; Niebecker, K. The Four-Level Project Success Framework: Application and Assessment. Organ. Proj. Manag. 2014, 1, 1–15.
  66. Zwikael, O.; Smyrk, J. A General Framework for Gauging the Performance of Initiatives to Enhance Organizational Value. Br. J. Manag. 2012, 23, S6–S22.
  67. Badewi, A. The Impact of Project Management (PM) and Benefits Management (BM) Practices on Project Success: Towards Developing a Project Benefits Governance Framework. Int. J. Proj. Manag. 2016, 34, 761–778.
  68. Yin, R.K. Case Study Research and Applications; SAGE Publications: Thousand Oaks, CA, USA, 2018; Volume 6.
  69. Patton, M.Q. Qualitative Research & Evaluation Methods: Integrating Theory and Practice; SAGE Publications: Thousand Oaks, CA, USA, 2014; ISBN 978-1-4833-0145-7.
  70. Varajão, J.; Lourenço, J.C.; Gomes, J. Models and Methods for Information Systems Project Success Evaluation—A Review and Directions for Research. Heliyon 2022, 8, e11977.
  71. Teixeira, A.; Oliveira, T.; Varajão, J. Evaluation of Business Intelligence Projects Success—A Case Study. Bus. Syst. Res. Int. J. Soc. Adv. Innov. Res. Econ. 2019, 10, 1–12.
  72. ISO/IEC 33020:2015; Information Technology—Process Assessment—Process Measurement Framework for Assessment of Process Capability. ISO: Geneva, Switzerland, 2015.
Figure 1. Research methodology followed (based on [29]).
Figure 2. DS PRO-S structure.
Figure 3. Interactions of DS PRO-S with organizational entities (solid line: input to, dashed line: output from DS PRO-S).
Figure 4. Transition from Meta-Model to customized project models via the Instantiation Toolkit.
Figure 5. Integrated lifecycle of DS PRO-S.
Figure 6. Evaluation dimensions of DS PRO-S.
Figure 7. Project Success structure in DS PRO-S.
Figure 9. Phase Health structure in DS PRO-S.
Figure 10. Project Health structure in DS PRO-S.
Figure 11. Operational methodology of DS PRO-S.
Figure 12. Case A—Project Success Assessment results (N/A: Not available).
Figure 13. Case A—Data Preparation Phase Success Assessment results.
Figure 14. Case A—Modeling Phase Success Assessment results (N/A: Not available).
Figure 15. Case A—Data Preparation Phase Health Assessment results.
Figure 16. Case A—Modeling Phase Health Assessment results.
Figure 17. Case B—Project Success Assessment results (N/A: Not available).
Figure 18. Case B—Operations and Maintenance Phase Success Assessment results.
Figure 19. Case B—Operations and Maintenance Phase Health Assessment results.
Figure 20. Comparison of the Project Success Scores of cases.
Figure 21. Comparison of Phase Success and Health scores for cases.
Table 1. Objectives for the solution.
No. | Objective | Explanation
O-0 | Provide Holistic Assessment Capability | As the core functionality and the reason for this model’s existence, the model will enable an end-to-end success assessment process, allowing practitioners to define success constructs, measure them with concrete metrics, and evaluate the results so that informed decisions can be made.
O-1 | Define Success Constructs Comprehensively | Moving beyond the traditional cost, time, and scope triangle, the model will define success constructs that address the multidimensional nature of data science value.
O-2 | Enable Dual-Faceted Assessment | The model will include an evaluation of both the fulfillment of goals and the conditions enabling success, allowing for timely and targeted interventions.
O-3 | Reflect Data Science Characteristics | The solution will explicitly account for the distinct properties and challenges of data science (e.g., uncertainty) that distinguish it from traditional software projects.
O-4 | Ensure End-to-End Lifecycle Coverage | The model will align evaluation activities with the distinct phases of a data science project, guaranteeing that no critical transition or risk area is omitted.
O-5 | Allow for Flexible Adaptation | To ensure its applicability across any data science initiative and any organizational setting, the model will be supported by configurable resources such as templates, guidelines, and mechanisms that can be adapted to the specific attributes of a given project.
O-6 | Facilitate Cross-Project Comparability | By providing a standardized evaluation framework, the model will allow for the fair and comparable measurement of different data science projects, which, in turn, will create a foundation for benchmarking, identifying best practices, and organizational learning.
O-7 | Provide Element-Level Actionable Granularity | To avoid remaining merely theoretical and unused, the model will make abstract concepts measurable, explaining how to derive them and establishing rules for how to calculate them.
O-8 | Offer Directly Implementable Guidance | Beyond specifying metrics, the model will be accompanied by step-by-step instructions, ready-to-use templates, and worked examples so that practitioners can readily apply the model in their own contexts.
Table 2. High-level requirements (HLRs) of the model (illustrative excerpt).
ID | High-Level Requirement | Related Objective | Reference
HLR-7 | Provide a core backbone of shared categories of success and evaluation flows for benchmarking and comparison. | O-6 | “There is a need for common elements that can be compared between projects, including project success assessment methods, monitoring, and controlling projects” [52]
HLR-8 | Incorporate stakeholder perspectives into the success evaluation process, where feasible. | O-1 | “Therefore, it is very important for project success to be measured by taking into consideration the perceptions or business values of the project from the viewpoint of those stakeholders who are possible beneficiaries of the project.” [46]
Table 3. 5Ws of the model (illustrative excerpt).
W-Question | Answer
What aspects of the data science project do we evaluate? | Meeting success objectives; ensuring enabling conditions
Who conducts and acts on the evaluations? | Conducted by: Project Manager, Project Lead, PMO Analyst, or any relevant role. Actioned by: Project/Portfolio Managers, Department Heads, Executives, Sponsors, Steering Committee
Table 4. Design principles and key design elements (illustrative excerpt).
No. | Design Focus | Related HLR | Design Principle | Design Elements (Meta-Model) | Design Elements (Instantiation)
DP-4 | Defining Success Constructs—Success Criteria Derivation | HLR-0, HLR-6, HLR-13 | Allow practitioners to select success criteria and their associated metrics or to derive them using an applicable derivation method. | Success criterion; metric; derivation method of success criteria | Candidate success criterion/metric catalogs; template(s) for the derivation method of success criteria
Table 5. Basic properties of DS PRO-S.
Property | Values | Logic
Evaluation Level | 1. Project Level; 2. Phase Level | The model can be instantiated to assess a single discrete phase or the project as a holistic system.
Evaluation Dimensions | 1. Success (Achievement); 2. Health (Enablers) | Success measures the achievement of objectives via success criteria. Health measures the achievement of enabling conditions for success via CSFs.
Success Constructs | 1. Success Criteria; 2. Critical Success Factors (CSFs) | Success criteria are used specifically to measure the Success dimension; CSFs are used specifically to measure the Health dimension.
Implementation Modularity | 1. Independent (Standalone); 2. Integrated (Dependent) | Phase Success, Phase Health, and Project Success can be implemented as standalone modules. Project Health depends on Phase Success outputs and requires phases to be evaluated in order to function.
Application Type | 1. Self-Assessment (Default); 2. Third-Party (Scalable) | The default mode is self-assessment by the project team. However, the model is architected to support third-party assessment when scaled for audit or governance purposes.
Input Sources | 1. Objective Data (Reports/Extracts); 2. Subjective Data (Surveys/Perceptions) | Objective data (e.g., budget variance); subjective data (e.g., top management support).
Evaluation Timing | 1. Ad hoc (Snapshot); 2. Lifecycle (Longitudinal) | Ad hoc: instant assessments. Lifecycle: continuous surveillance throughout the integrated lifecycle.
Primary Interface | Project Manager | The primary user is the Project Manager, though data may be ingested from organizational systems or stakeholders in automated settings.
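To make the modularity rule in Table 5 concrete, the minimal Python sketch below checks a practitioner’s module selection before an assessment run. The dependency map mirrors the table (only Project Health consumes Phase Success outputs), but the validate_selection helper and its interface are illustrative assumptions, not part of DS PRO-S itself.

# Hypothetical validation helper for the modularity rule in Table 5.
DEPENDENCIES = {
    "Phase Success": set(),
    "Phase Health": set(),
    "Project Success": set(),
    # Project Health consumes Phase Success outputs, so it cannot run alone.
    "Project Health": {"Phase Success"},
}

def validate_selection(selected_modules):
    """Return, per selected module, any dependencies missing from the selection."""
    return {
        module: DEPENDENCIES[module] - selected_modules
        for module in selected_modules
        if DEPENDENCIES[module] - selected_modules
    }

# Selecting Project Health without Phase Success surfaces the unmet dependency:
print(validate_selection({"Project Health", "Project Success"}))
# -> {'Project Health': {'Phase Success'}}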
Table 7. Case A—Project Success Structure.
Level-1 Project Success Category (Weight) | Level-2 Project Success Category | Project Objective (Weight)
Output Success (0.44) | Product Quality Success (PQS) | Develop a solution compliant with the desired requirements (0.71)
Output Success (0.44) | Project Management Success (PMS) | Develop the solution on time and within scope (0.29)
Outcome Success (0.56) | Business Value Success (BVS) | Maximize the unit’s processing potential and operational efficiency (0.50)
Outcome Success (0.56) | Financial Value Success (FVS) | Increase revenue (0.50)
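As a worked illustration of how the weights in Table 7 roll up, the sketch below assumes a simple normalized weighted sum at each level, in the spirit of the model’s aggregation of success criteria into upper-level scores. The objective achievement scores used here are hypothetical placeholders, not the case study’s reported results.

def weighted_sum(items):
    """Aggregate (weight, score) pairs; weights are renormalized to sum to 1."""
    total = sum(weight for weight, _ in items)
    return sum(weight * score for weight, score in items) / total

# Hypothetical achievement scores (0.0-1.0) for Case A's project objectives.
output_success = weighted_sum([
    (0.71, 0.90),  # PQS: solution compliant with the desired requirements
    (0.29, 0.75),  # PMS: solution delivered on time and within scope
])
outcome_success = weighted_sum([
    (0.50, 0.80),  # BVS: processing potential and operational efficiency
    (0.50, 0.60),  # FVS: revenue increase
])

# Level-1 category scores roll up into the overall Project Success Score.
project_success = weighted_sum([
    (0.44, output_success),
    (0.56, outcome_success),
])
print(f"Output Success: {output_success:.2f}")        # 0.86
print(f"Outcome Success: {outcome_success:.2f}")      # 0.70
print(f"Project Success Score: {project_success:.2f}")  # 0.77

The same two-step roll-up applies at the phase level (Tables 8 and 10), with deliverable and phase-management objectives taking the place of the Level-2 categories.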
Table 8. Case A—Phase Success Structures for the Data Preparation and Modeling Phases.
Phase | Deliverable Success Objective (Weight) | Phase Management Success Objective (Weight)
Data Preparation | Extract specific process tags from the source system (PHD Historian) and integrate them into the master dataset, ensuring all new features are normalized and validated (0.67) | Execute the data preparation and feature engineering tasks within the defined scope and timeline (0.33)
Modeling | Develop a predictive model that quantitatively outperforms the baseline performance established in the first modeling iteration (0.67) | Execute the second modeling iteration within the defined scope and timeline (0.33)
Table 9. Case B—Project Success Structure.
Level-1 Project Success Category (Weight) | Level-2 Project Success Category | Project Objective (Weight)
Output Success (0.44) | Product Quality Success (PQS) | Develop the solution in full conformance with contract requirements (0.56)
Output Success (0.44) | Project Management Success (PMS) | Deliver the solution within scope and schedule (0.44)
Outcome Success (0.56) | Business Value Success (BVS) | (Strategic Client Retention) Secure a strategic partnership and “referenceable” status by demonstrating the solution’s impact, increasing the potential for contract renewal and opening opportunities for upselling future projects (0.63)
Outcome Success (0.56) | Financial Value Success (FVS) | Secure full contract value (0.38)
Table 10. Case B—Phase Success Structure for the Operations and Maintenance Phase.
Phase Success Category | Operations and Maintenance Phase Objective (Weight)
Deliverable Success | Maintain high system availability and ensure that the GenAI agent system delivers accurate, reliable responses throughout the period (0.63)
Phase Management Success | Ensure strict compliance with contractual reporting schedules and achieve rapid response times for all system incidents (0.38)