A Scoping Review and Assessment Framework for Technical Debt in the Development and Operation of AI/ML Competition Platforms †
Abstract
Featured Application
1. Introduction
Research Questions
- RQ1: What are the most significant types of technical debt recorded in AI-based systems?
- RQ2: How can we measure the technical debt of an AI-based competition platform?
2. Background and Related Work
2.1. Technical Debt in AI/ML Systems
2.1.1. Requirements Debt in AI/ML Systems
2.1.2. Architectural Debt in AI/ML Systems
2.1.3. Design Debt in AI/ML Systems
2.1.4. Data Debt in AI/ML Systems
2.1.5. Algorithm Debt in AI/ML Systems
2.1.6. Model Debt in AI/ML Systems
2.1.7. Infrastructure Debt in AI/ML Systems
2.1.8. Test Debt and Quality Assurance in AI/ML Systems
2.1.9. Build Debt in AI/ML Systems
2.1.10. Versioning Debt in AI/ML Systems
2.1.11. Configuration Debt in AI/ML Systems
2.1.12. Code Debt in AI/ML Systems
2.1.13. Process Debt in AI/ML Systems
2.1.14. Documentation Debt in AI/ML Systems
2.1.15. People and Social Debt in AI/ML Systems
2.1.16. Ethics Debt in AI/ML Systems
2.1.17. Self-Admitted Technical Debt (SATD) in AI/ML Systems
2.1.18. Defect Debt in AI/ML Systems
2.2. Software Engineering for AI (SE4AI) and Software Engineering for Machine Learning (SE4ML)
- Requirements Engineering: In AI/ML systems, requirements are not fully specified up-front but evolve through iterative learning cycles based on data and model feedback [86]. Techniques such as incremental requirement refinement and data-centric requirement analysis are essential.
- Software Design and Architecture: Managing complex data pipelines, model versioning, deployment workflows, and continuous retraining requires specialized design patterns tailored for AI systems [23]. To prevent the accumulation of technical debt, AI/ML architectures should prioritize modularity, scalability, and the ability to adapt across evolving data and model requirements.
- Testing and Quality Assurance: Traditional software testing approaches fall short in the face of non-deterministic AI behavior. Ensuring system reliability demands the use of AI-specific testing techniques, including data validation, model evaluation, metamorphic testing, and uncertainty quantification [23].
- Deployment and Maintenance: Addressing real-world phenomena such as data drift, model degradation, and evolving user feedback requires continuous monitoring, automated retraining pipelines, and adaptive update mechanisms. MLOps frameworks offer the foundational infrastructure to support these ongoing lifecycle management needs efficiently [87]. A minimal drift-monitoring sketch is given after this list.
- Ethical and Social Considerations: The development of AI systems raises important ethical concerns, including algorithmic bias, lack of transparency in decision-making, and fairness in outcomes. Responsible AI frameworks advocate for embedding ethical safeguards throughout the design, training, and deployment phases to ensure systems align with broader societal values [79].
- Training and Skill Development: Contemporary SE4AI curricula emphasize the need for interdisciplinary competencies that combine technical software engineering skills with principles from AI governance, ethics, and human-centered design [88].
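As a concrete illustration of the monitoring practices mentioned above, the following minimal sketch compares a training-time (reference) feature sample against a live (serving) sample using the Population Stability Index. The function name, the synthetic data, and the 0.2 alert threshold are illustrative assumptions rather than elements of the reviewed studies.

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between a reference (training) sample and a live (serving) sample."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip to avoid log(0) when a bin is empty.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Hypothetical usage with synthetic data standing in for one model feature.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)  # feature values seen at training time
live = rng.normal(0.3, 1.1, 5000)       # feature values observed in production
psi = population_stability_index(reference, live)
if psi > 0.2:  # 0.2 is a commonly used heuristic threshold for a significant shift
    print(f"Data drift detected (PSI = {psi:.3f}); consider retraining or alerting.")
```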
2.3. Explainable Artificial Intelligence (XAI) in AI-Based Systems
- Intrinsic Interpretability: Designing models that are inherently understandable, such as decision trees, linear models, and rule-based systems. These models trade some performance for enhanced transparency.
- Post Hoc Explainability: Applying interpretability techniques after model training, using methods such as feature attribution (e.g., SHAP, LIME), surrogate models, counterfactual explanations, and visualization tools to provide insights into model behavior.
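To make the post hoc option more concrete, the sketch below applies model-agnostic permutation importance, used here as a stand-in for attribution methods such as SHAP or LIME, to a scikit-learn classifier. The dataset and model choices are illustrative assumptions, not part of the reviewed studies.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Illustrative data and model; any trained black-box estimator could be used instead.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Post hoc, model-agnostic attribution: how much does shuffling each feature hurt accuracy?
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: t[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.4f}")
```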
2.4. AI/ML Competition Platforms
2.4.1. Technical Infrastructure and Platform Design
2.4.2. AI Competitions as a Catalyst for Research and Innovation
2.4.3. Educational Benefits and Skill Development
2.4.4. Challenges and Considerations in AI Competitions
2.4.5. Future Directions
3. Materials and Methods
3.1. Scoping Review Framework
3.2. Search Strategy and Data Sources
- “Technical Debt”
- “Artificial Intelligence”
- “Machine Learning”
- “Software Engineering”
- “AI-Based Systems”
3.3. Study Selection and Screening
3.3.1. Inclusion Criteria
- Studies published between January 2012 and February 2024.
- Peer-reviewed journal articles and conference papers.
- Studies explicitly addressing technical debt in AI-based systems or AI/ML competition platforms.
- Studies providing empirical, theoretical, or practical insights into AI-related technical debt management.
3.3.2. Exclusion Criteria
- Non-English publications.
- Publications prior to 2012.
- Non-peer-reviewed articles, editorials, or non-scientific sources.
- Studies focused exclusively on general software engineering technical debt without AI-specific considerations.
3.3.3. PRISMA-Based Selection Process
3.4. Supplementary Search Strategy
3.5. Data Extraction and Classification
- Technical debt types addressed (e.g., data debt, model debt, algorithm debt, etc.).
- Research methodologies employed (e.g., case studies, empirical analysis, theoretical frameworks).
- Identified challenges and proposed mitigation strategies.
- Application contexts relevant to AI-based competition platforms.
No | Title | Ref. | Author(s) | Year | Technical Debt Type |
---|---|---|---|---|---|
1 | Algorithm Debt: Challenges and Future Paths | [114] | Simon, E.I.O., Vidoni, M. & Fard, F.H. | 2023 | Algorithm |
2 | Machine Learning Algorithms, Real-World Applications and Research Directions | [115] | Sarker | 2021 | Algorithm |
3 | Toward understanding deep learning framework bugs | [116] | Chen, J., Liang, Y., Shen, Q., Jiang, J. & Li, S. | 2023 | Algorithm |
4 | Understanding software-2.0: A study of machine learning library usage and evolution | [117] | Dilhara, M., Ketkar, A. & Dig, D. | 2021 | Algorithm |
5 | A survey on deep reinforcement learning architectures, applications and emerging trends | [118] | Balhara et al. | 2022 | Architectural |
6 | Adapting Software Architectures to Machine Learning Challenges | [20] | Serban, A. & Visser, J. | 2022 | Architectural |
7 | An Empirical Study of Software Architecture for Machine Learning | [119] | Serban, A. & Visser, J. | 2021 | Architectural |
8 | Architecting the Future of Software Engineering | [120] | Carleton, A., Shull, F. & Harper, E. | 2022 | Architectural |
9 | Architectural Decisions in AI-Based Systems: An Ontological View | [121] | Franch, X. | 2022 | Architectural |
10 | Architecture Decisions in AI-based Systems Development: An Empirical Study | [122] | Zhang, B., Liu, T., Liang, P., Wang, C., Shahin, M. & Yu, J. | 2023 | Architectural |
11 | Engineering AI Systems: A Research Agenda | [123] | Bosch, J., Olsson, H. H. & Crnkovic, I. | 2021 | Architectural |
12 | Machine Learning Architecture and Design Patterns | [124] | Washizaki et al. | 2020 | Architectural |
13 | Software Architecture for ML-based Systems: What Exists and What Lies Ahead | [125] | Muccini, H. & Vaidhyanathan, K. | 2021 | Architectural |
14 | Software Engineering for AI-Based Systems: A Survey | [1] | Martínez-Fernández, S., Bogner, J., Franch, X., Oriol, M., Siebert, J., Trendowicz, A. & Wagner, S. | 2022 | Architectural |
15 | Comprehending the Use of Intelligent Techniques to Support Technical Debt Management | [57] | Albuquerque, D., Guimaraes, E., Tonin, G., Perkusich, M., Almeida, H. & Perkusich, A. | 2022 | Build |
16 | Searching for Build Debt: Experiences Managing Technical Debt at Google | [126] | Morgenthaler et al. | 2012 | Build |
17 | Better Code, Better Sharing: On the Need of Analyzing Jupyter Notebooks | [67] | Wang, J., Li, L. & Zeller, A. | 2020 | Code |
18 | Characterizing TD and Antipatterns in AI-Based Systems: A Systematic Mapping Study | [30] | Bogner, J., Verdecchia, R. & Gerostathopoulos, I. | 2021 | Code |
19 | Code and Architectural Debt in Artificial Intelligence Systems | [19] | Recupito, G., Pecorelli, F., Catolino, G. et al. | 2024 | Code |
20 | Code Smells for Machine Learning Applications | [127] | Zhang, H., Cruz, L. & Van Deursen, A. | 2022 | Code |
21 | Code Smells in Machine Learning Systems | [66] | Gesi, J., Liu, S., Li, J., Ahmed, I., Nagappan, N., Lo, D. & Bao, L. | 2022 | Code |
22 | How does machine learning change software development practices? | [86] | Wan, Z., Xia, X., Lo, D. & Murphy, G. C. | 2019 | Code |
23 | Software Engineering for Machine Learning: A Case Study | [21] | Amershi et al. | 2019 | Code |
24 | Studying the Machine Learning Lifecycle and Improving Code Quality of Machine Learning Applications | [69] | Haakman, M. P. A. | 2020 | Code |
25 | The prevalence of code smells in machine learning projects | [65] | Van Oort, B., Cruz, L., Aniche, M. & Van Deursen, A. | 2021 | Code |
26 | A Software Engineering Perspective on Engineering Machine Learning Systems: State of the Art and Challenges | [62] | Giray | 2021 | Code |
27 | An Empirical Study of Refactorings and Technical Debt in Machine Learning Systems | [4] | Tang, Y., Khatchadourian, R., Bagherzadeh, M., Singh, R., Stewart, A. & Raja, A. | 2021 | Configuration |
28 | Challenges in Deploying Machine Learning: A Survey of Case Studies | [64] | Paleyes, Urma, Lawrence | 2022 | Configuration |
29 | Software engineering challenges for machine learning applications: A literature review | [3] | Kumeno, F. | 2019 | Configuration |
30 | Software Engineering Challenges of Deep Learning | [60] | Arpteg, A., Brinne, B., Crnkovic-Friis, L. & Bosch, J. | 2018 | Configuration |
31 | Data collection and quality challenges in deep learning: a data-centric AI perspective | [41] | Whang et al. | 2023 | Data |
32 | Data Lifecycle Challenges in Production Machine Learning: A Survey | [40] | Polyzotis, N., Roy, S., Whang, S. E. & Zinkevich, M. | 2018 | Data |
33 | Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems | [27] | Foidl, H., Felderer, M. & Ramler, R. | 2022 | Data |
34 | Data Validation for Machine Learning | [28] | Polyzotis, N., Zinkevich, M., Roy, S., Breck, E. & Whang, S. | 2019 | Data |
35 | Data Validation Process in Machine Learning Pipeline | [43] | Vadavalasa | 2021 | Data |
36 | Risk-Based Data Validation in Machine Learning-Based Software Systems | [44] | Foidl, Felderer | 2019 | Data |
37 | Software Quality for AI: Where We Are Now? | [128] | Lenarduzzi, V., Lomio, F., Moreschini, S., Taibi, D. & Tamburri, D. A. | 2021 | Data |
38 | Technical Debt in Data-Intensive Software Systems | [129] | Foidl, H., Felderer, M. & Biffl, S. | 2019 | Data |
39 | Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure | [42] | Hutchinson, B., Smart, A., Hanna, A., Denton, E., Greer, C., Kjartansson, O. & Mitchell, M. | 2021 | Data |
40 | Debugging Machine Learning Pipelines | [130] | Lourenço, Freire, Shasha | 2019 | Defect |
41 | Is using deep learning frameworks free? Characterizing and Measuring Technical Debt in Deep Learning Applications | [29] | Liu, Jiakun and Huang, Qiao and Xia, Xin and Shang, Weiyi | 2020 | Defect |
42 | Machine Learning: The High-Interest Credit Card of Technical Debt | [131] | Sculley et al. | 2014 | Defect |
43 | Technical Debt in AI-Enabled Systems: On the Prevalence, Severity, Impact, and Management Strategies for Code and Architecture | [19] | Recupito, Gilberto et al. | 2024 | Defect |
44 | Technical Debt Payment and Prevention Through the Lenses of Software Architects | [132] | Pérez, B., Castellanos, C., Correal, D., Rios, N., Freire, S., Spínola, R. & Izurieta, C. | 2021 | Defect |
45 | Studying Software Engineering Patterns for Designing ML Systems | [38] | Washizaki, H., Uchida, H., Khomh, F. & Guéhéneuc, Y. G. | 2019 | Design |
46 | Common problems with Creating Machine Learning Pipelines from Existing Code | [133] | O’Leary, K. & Uchida, M. | 2020 | Design |
47 | Design Patterns for AI-based Systems: A Multivocal Literature Review and Pattern Repository | [37] | Heiland, L., Hauser, M. & Bogner, J. | 2023 | Design |
48 | Software-Engineering Design Patterns for Machine Learning Applications | [31] | Washizaki, H., Khomh, F., Guéhéneuc, Y. G., Takeuchi, H., Natori, N., Doi, T. & Okuda, S. | 2022 | Design |
49 | Correlating Automated and Human Evaluation of Code Documentation Generation Quality | [134] | Hu et al. | 2022 | Documentation |
50 | Maintainability Challenges in ML: A Systematic Literature Review | [75] | Shivashankar, K. & Martini, A. | 2022 | Documentation |
51 | Software Documentation is not Enough! Requirements for the Documentation of AI | [135] | Königstorfer, F. & Thalmann, S. | 2021 | Documentation |
52 | Understanding Implementation Challenges in Machine Learning Documentation | [74] | Chang, J. & Custis, C. | 2022 | Documentation |
53 | “This is Just a Prototype”: How Ethics Are Ignored in Software Startup-Like Environments | [79] | Vakkuri, V., Kemell, K. K., Jantunen, M. & Abrahamsson, P. | 2020 | Ethics |
54 | Explainable Deep Reinforcement Learning: State of the Art and Challenges | [90] | Vouros, G.A. | 2022 | Ethics |
55 | Managing bias in AI | [136] | Roselli, D., Matthews, J. & Talagala, N. | 2019 | Ethics |
56 | Patterns and Anti-Patterns, Principles and Pitfalls: Accountability and Transparency | [81] | Matthews, J. | 2020 | Ethics |
57 | Principles alone cannot guarantee ethical AI | [26] | Mittelstadt, B. | 2019 | Ethics |
58 | Who pays for ethical debt in AI? | [82] | Petrozzino, C. | 2021 | Ethics |
59 | AI Competitions as Infrastructures: Examining Power Relations on Kaggle and Grand Challenge in AI-Driven Medical Imaging | [6] | Luitse, Blanke, Poell | 2023 | Infrastructure |
60 | Infrastructure for Usable Machine Learning: The Stanford DAWN Project | [50] | Bailis et al. | 2017 | Infrastructure |
61 | Practices and Infrastructures for Machine Learning Systems: An Interview Study in Finnish Organizations | [137] | Muiruri, D., Lwakatare, L. E., Nurminen, J. K. & Mikkonen, T. | 2022 | Infrastructure |
62 | A Meta-Summary of Challenges in Building Products with ML Components—Collecting Experiences from 4758+ Practitioners | [138] | Nahar et al. | 2023 | Model |
63 | A Taxonomy of Software Engineering Challenges for Machine Learning Systems—An Empirical Investigation | [47] | Lwakatare, L. E., Raj, A., Bosch, J., Olsson, H. H. & Crnkovic, I. | 2019 | Model |
64 | Clones in Deep Learning Code: What, where, and why? | [139] | Jebnoun, H., Rahman, M. S., Khomh, F. & Muse, B. A. | 2022 | Model |
65 | Empirical Analysis of Hidden Technical Debt Patterns in Machine Learning Software | [140] | Alahdab, M. & Çalıklı, G. | 2019 | Model |
66 | Hidden Technical Debt in Machine Learning Systems | [5] | Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D. & Dennison, D. | 2015 | Model |
67 | Machine Learning Model Development from a Software Engineering Perspective: A Systematic Literature Review | [141] | Lorenzoni et al. | 2021 | Model |
68 | Quality issues in Machine Learning Software Systems | [52] | Côté, P. O., Nikanjam, A., Bouchoucha, R., Basta, I., Abidi, M. & Khomh, F. | 2023 | Model |
69 | Synergy Between Machine/Deep Learning and Software Engineering: How Far Are We? | [142] | Wang, S., Huang, L., Ge, J., Zhang, T., Feng, H., Li, M. & Ng, V. | 2020 | Model |
70 | The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction | [7] | Breck, E., Cai, S., Nielsen, E., Salib, M. & Sculley, D. | 2017 | Model |
71 | Towards CRISP-ML(Q): A ML Process Model with Quality Assurance Methodology | [49] | Studer, S., Bui, T. B., Drescher, C., Hanuschkin, A., Winkler, L., Peters, S. & Müller, K. R. | 2021 | Model |
72 | Towards Guidelines for Assessing Qualities of Machine Learning Systems | [143] | Siebert, J., Joeckel, L., Heidrich, J., Nakamichi, K., Ohashi, K., Namba, I. & Aoyama, M. | 2020 | Model |
73 | Understanding Development Process of Machine Learning Systems: Challenges and Solutions | [70] | de Souza Nascimento, E., Ahmed, I., Oliveira, E., Palheta, M. P., Steinmacher, I. & Conte, T. | 2019 | Model |
74 | What Is Really Different in Engineering AI-Enabled Systems? | [46] | Ozkaya, I. | 2020 | Model |
75 | Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process | [71] | Nahar, N., Zhou, S., Lewis, G. & Kästner, C. | 2022 | People |
76 | How Do Engineers Perceive Difficulties in Engineering of Machine-Learning Systems?—Questionnaire Survey | [78] | Ishikawa, F. & Yoshioka, N. | 2019 | People |
77 | What is Social Debt in Software Engineering? | [76] | Tamburri, D. A., Kruchten, P., Lago, P. & van Vliet, H. | 2013 | People |
78 | Exploring the Impact of Code Clones on Deep Learning Software | [72] | Mo, R., Zhang, Y., Wang, Y., Zhang, S., Xiong, P., Li, Z. & Zhao, Y. | 2023 | Process |
79 | Studying Software Engineering Patterns for Designing ML Systems | [38] | Washizaki, H., Uchida, H., Khomh, F. & Guéhéneuc, Y. G. | 2019 | Process |
80 | It Takes Three to Tango: Requirement, Outcome/data, and AI Driven Development | [144] | Bosch, J., Olsson, H. H. & Crnkovic, I. | 2018 | Requirements |
81 | MLife: A Lite Framework for Machine Learning Lifecycle Initialization | [145] | Yang et al. | 2021 | Requirements |
82 | Requirements Engineering Challenges in Building AI-based Complex Systems | [146] | Belani, H., Vukovic, M. & Car, Ž. | 2019 | Requirements |
83 | Requirements Engineering for Artificial Intelligence Systems: A Systematic Mapping Study | [34] | Ahmad, K., Abdelrazek, M., Arora, C., Bano, M. & Grundy, J. | 2023 | Requirements |
84 | Requirements Engineering for Machine Learning: Perspectives from Data Scientists | [35] | Vogelsang, A. & Borg, M. | 2019 | Requirements |
85 | 23 Shades of Self-Admitted Technical Debt: An Empirical Study on Machine Learning Software | [63] | OBrien, D., Biswas, S., Imtiaz, S., Abdalkareem, R., Shihab, E. & Rajan, H. | 2022 | Self-Admitted (SATD) |
86 | A Large-Scale Empirical Study on Self-Admitted Technical Debt | [84] | Bavota, G. & Russo, B. | 2016 | Self-Admitted (SATD) |
87 | An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software | [83] | Bhatia, A., Khomh, F., Adams, B. & Hassan, A. E. | 2023 | Self-Admitted (SATD) |
88 | Automating Change-level Self-Admitted Technical Debt Determination | [147] | Yan, M., Xia, X., Shihab, E., Lo, D., Yin, J. & Yang, X. | 2018 | Self-Admitted (SATD) |
89 | Self-Admitted Technical Debt in R: Detection and Causes | [148] | Sharma, R., Shahbazi, R., Fard, F.H., Codabux, Z. & Vidoni, M. | 2022 | Self-Admitted (SATD) |
90 | Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We? | [149] | Mastropaolo, A., Di Penta, M. & Bavota, G. | 2023 | Self-Admitted (SATD) |
91 | A Systematic Mapping Study on Testing of Machine Learning Programs | [150] | Sherin, S. & Iqbal, M. Z. | 2019 | Test |
92 | Machine Learning Testing: Survey, Landscapes and Horizons | [51] | Zhang, J. M., Harman, M., Ma, L. & Liu, Y. | 2020 | Test |
93 | On Testing Machine Learning Programs | [55] | Braiek, H. B. & Khomh, F. | 2020 | Test |
94 | Testing Machine Learning based Systems: A Systematic Mapping | [151] | Riccio, V., Jahangirova, G., Stocco, A., Humbatova, N., Weiss, M. & Tonella, P. | 2020 | Test |
95 | “We Have No Idea How Models Will Behave in Production until Production”: How Engineers Operationalize Machine Learning | [152] | Shankar, S., Garcia, R., Hellerstein, J. M. & Parameswaran, A. | 2024 | Versioning |
96 | On Challenges in Machine Learning Model Management | [48] | Schelter, S., Biessmann, F., Januschowski, T., Salinas, D., Seufert, S. & Szarvas, G. | 2015 | Versioning |
97 | On the Challenges of Migrating to Machine Learning Life Cycle Management Platforms | [58] | Njomou, A. T., Fokaefs, M., Silatchom Kamga, D. F. & Adams, B. | 2022 | Versioning |
98 | Software Engineering Challenges of Deep Learning | [60] | Arpteg, A., Brinne, B., Crnkovic-Friis, L. & Bosch, J. | 2018 | Versioning |
99 | The Story in the Notebook: Exploratory Data Science using a Literate Programming Tool | [61] | Kery, M. B., Radensky, M., Arya, M., John, B. E. & Myers, B. A. | 2018 | Versioning |
100 | Versioning for End-to-End Machine Learning Pipelines | [59] | Van Der Weide, T., Papadopoulos, D., Smirnov, O., Zielinski, M. & Van Kasteren, T. | 2017 | Versioning |
3.6. Summary of Materials and Methods
4. Results
4.1. Overview
4.2. Mapping of Technical Debt Categories
4.3. Findings per Technical Debt Type
4.3.1. Requirements Debt
4.3.2. Architectural Debt
4.3.3. Design Debt
4.3.4. Data Debt
4.3.5. Algorithm Debt
4.3.6. Model Debt
4.3.7. Infrastructure Debt
4.3.8. Test Debt
4.3.9. Build Debt
4.3.10. Versioning Debt
4.3.11. Configuration Debt
4.3.12. Code Debt
4.3.13. Process Debt
4.3.14. Documentation Debt
4.3.15. People—Social Debt
4.3.16. Ethics Debt
4.3.17. Self-Admitted Technical Debt (SATD)
4.3.18. Defect Debt
4.4. Observed Patterns and Gaps
4.4.1. Recurring Co-Occurrences
4.4.2. Underrepresented Technical Debt Categories
- Process Debt and People (Social) Debt: Despite their operational impact, these categories receive limited formal treatment. Their manifestations—such as unstructured workflows, poor coordination, or unclear stakeholder roles—are often implicit, making them difficult to quantify yet highly influential in competition outcomes.
- Build Debt and Infrastructure Debt: While highly relevant in reproducibility and deployment contexts, these are seldom addressed explicitly, particularly in the context of educational or open-source competition platforms.
4.4.3. Educational and Human-Centered Contexts
4.4.4. Research and Practical Implications
- Greater empirical focus on underrepresented debt types, especially social and process-oriented debt.
- Design of competition platforms and guidelines that explicitly address debt-prone areas (e.g., versioning enforcement, configuration validation).
- Inclusion of metrics and documentation requirements that encourage sustainable and reproducible AI practices, even in time-constrained or educational settings.
4.5. Stakeholder Roles and Technical Debt Responsibility
- Competition Organizers: Those responsible for designing and maintaining the platform, defining evaluation protocols, providing infrastructure, and ensuring transparency and fairness.
- Participants: Teams or individuals who submit AI models or agents to the competition and are accountable for code quality, reproducibility, and ethical compliance.
- Both: Debt types where responsibility is shared between organizers and participants due to their interdependent nature.
4.6. Early Academic Deployment of the Questionnaire
4.7. Illustrative Use Case: Applying the Questionnaire to a Hypothetical Platform
- In the Infrastructure Debt category (Q33–Q34), both items receive a NO response, indicating that the submission environment lacks container-based isolation and resource monitoring. Given their severity weight of 5, this yields a subtotal of 10 in this category.
- For Documentation Debt (Q26–Q28), two items are answered NO and one I Don’t Know, with weights 4, 3, and 2, respectively, resulting in a subtotal of 9.
- In Accessibility Debt (Q58–Q60), all three items receive NO responses. With assigned severity weights of 3, 2, and 2, the subtotal is 7. (A short sketch reproducing these subtotals is given at the end of this subsection.)
- Conversely, in categories such as Model Debt (Q35–Q36) and Configuration Debt (Q7–Q9), most responses are YES or Not Applicable, contributing minimally or neutrally to the total debt score.
- Containerizing the submission pipeline and introducing system-level resource controls.
- Updating the documentation portal with clear API usage instructions and dataset versioning details.
- Integrating an accessibility audit checklist into the UI design process, including keyboard navigation and WCAG-compliant color contrast.
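The arithmetic behind this hypothetical assessment can be reproduced with a few lines of code. The question identifiers, answers, and weights below are taken from the scenario described in this subsection, and the sign convention (NO and I Don't Know add the weight, YES subtracts it, Not Applicable is neutral) follows Section 5.2; the script itself is a minimal sketch rather than a tool used in the study.

```python
# Sign convention from Section 5.2: YES = -1, NO / I Don't Know = +1, Not Applicable = 0.
SIGN = {"YES": -1, "NO": 1, "IDK": 1, "NA": 0}

# (answer, weight) pairs for the hypothetical platform in this use case.
responses = {
    "Infrastructure Debt": [("NO", 5), ("NO", 5)],             # Q33-Q34
    "Documentation Debt": [("NO", 4), ("NO", 3), ("IDK", 2)],  # Q26-Q28
    "Accessibility Debt": [("NO", 3), ("NO", 2), ("NO", 2)],   # Q58-Q60
}

for category, items in responses.items():
    subtotal = sum(SIGN[answer] * weight for answer, weight in items)
    print(f"{category}: {subtotal}")
# Output: Infrastructure Debt: 10, Documentation Debt: 9, Accessibility Debt: 7
```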
5. Questionnaire and Quantification Method Approach
5.1. Overview and Purpose
5.2. Structure and Scoring Logic
- YES—Best practice applied (reduces debt).
- NO—Best practice not applied (adds debt).
- I Don’t Know/I Don’t Answer—Indicates uncertainty (adds debt).
- Not Applicable—Neutral, scored as zero.
- YES: reduces the total debt score, yields a negative value equal to the weight (−w).
- NO or I Don’t Know/I Don’t Answer: increases the score, yields a positive value equal to the weight (+w).
- Not Applicable: scored as zero (neutral).
- A total score ≤0 indicates low or no observable technical debt.
- A score between 1 and 10 suggests moderate technical debt.
- A score above 10 indicates significant technical debt accumulation, signaling a need for focused mitigation.
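A minimal sketch of this scoring logic is shown below. The sign convention and the interpretation bands follow the rules listed above; the data structure and the example answers are illustrative, with weights mirroring Appendix A where possible.

```python
from typing import Dict, Tuple

# YES applies the best practice (subtracts the weight); NO and
# "I Don't Know / I Don't Answer" add the weight; Not Applicable is neutral.
SIGN = {"YES": -1, "NO": 1, "IDK": 1, "NA": 0}

def total_debt_score(answers: Dict[str, Tuple[str, int]]) -> int:
    """answers maps a question id to (answer, severity weight in 1-5)."""
    return sum(SIGN[answer] * weight for answer, weight in answers.values())

def interpret(score: int) -> str:
    if score <= 0:
        return "low or no observable technical debt"
    if score <= 10:
        return "moderate technical debt"
    return "significant technical debt accumulation"

# Hypothetical excerpt of a filled-in questionnaire.
answers = {"Q7": ("YES", 3), "Q8": ("NO", 4), "Q26": ("IDK", 5), "Q33": ("NO", 5)}
score = total_debt_score(answers)
print(score, "->", interpret(score))  # 11 -> significant technical debt accumulation
```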
5.3. Examples of Use
5.4. Full Set of Questions
- The associated technical debt category,
- The stakeholder role (Organizer or Participant),
- The severity weight (1–5),
- A brief justification, and
- A realistic AI competition-specific scenario.
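One possible way to represent such an entry programmatically is sketched below. The field names and the example justification and scenario strings are illustrative assumptions, while the question wording, category, stakeholder, and weight for Question 33 are taken from Appendix A.

```python
from dataclasses import dataclass
from enum import Enum

class Stakeholder(Enum):
    ORGANIZERS = "Organizers"
    PARTICIPANTS = "Participants"
    BOTH = "Organizers and Participants"

@dataclass
class QuestionnaireItem:
    number: int
    text: str
    category: str             # associated technical debt category
    stakeholder: Stakeholder  # who is responsible for answering and acting
    weight: int               # severity weight, 1-5
    justification: str        # brief rationale for the severity weight
    scenario: str             # realistic AI competition-specific scenario

# Question 33 (Infrastructure Debt) from Appendix A; the justification and
# scenario strings below are placeholders, not quotes from the article.
q33 = QuestionnaireItem(
    number=33,
    text="Have you developed a robust data pipeline for easy experimentation with AI algorithms?",
    category="Infrastructure Debt",
    stakeholder=Stakeholder.ORGANIZERS,
    weight=5,
    justification="Fragile pipelines slow experimentation and accumulate debt.",
    scenario="Organizers expose a shared ingestion pipeline reused across competition tracks.",
)
```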
5.5. Accessibility Debt
6. Discussion and Implications
6.1. Synthesis of Key Insights
6.2. Practical Implications
- Competition Organizers: Organizers should proactively assess and mitigate debt types under their responsibility, including infrastructure, configuration, documentation, and accessibility-related aspects. Tools like the stakeholder-specific questionnaire can support pre-competition audits, platform design iterations, and post-competition reflection. As shown in Figure 4, certain debt types (e.g., Data and Infrastructure Debt) have high organizer-side impact, highlighting the importance of early planning, communication, and versioning strategies.
- Competition Participants: Participants can use the questionnaire to self-assess their submissions and development workflows, focusing on areas like code hygiene, test coverage, and algorithmic transparency. Identifying hidden debt (e.g., SATD, configuration or versioning issues) before submission enhances reproducibility and ethical compliance.
- Educators and Instructors: Academic institutions can embed the questionnaire in AI engineering curricula to teach responsible development practices. The results from the pilot study (Section 4.6) show that students gained awareness of platform-level responsibilities and software-engineering-for-AI challenges, beyond mere model performance optimization.
6.3. Research Implications and Gaps
- Underexplored categories such as People Debt, Process Debt, and Accessibility Debt merit dedicated empirical investigation, especially in educational and open-source platforms.
- Quantitative evaluation of the questionnaire’s validity and sensitivity will be a priority for future work, once the collected responses are analyzed.
- Interdependencies between debt types (e.g., between data, infrastructure, and model) should be formally modeled to better understand risk propagation. For example, poor documentation may directly hinder onboarding and inclusivity, thereby amplifying Accessibility Debt. Similarly, weak test practices can increase the likelihood of Defect Debt, while Build and Configuration Debt often co-occur in poorly modularized pipelines. These preliminary observations, drawn from the thematic analysis, support our view that certain debt types may be clustered and co-managed. While a formal dependency model is beyond the scope of this work, illustrative relationships are presented in Appendix B (Table A2) to support future investigations into structured cause-effect mappings among debts; a minimal sketch of such a dependency graph follows this list.
- Expansion of stakeholder roles beyond organizers and participants (e.g., reviewers, maintainers, platform developers) could reveal additional systemic concerns.
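As a starting point for such modeling, the toy sketch below encodes a subset of the relationships listed in Appendix B (Table A2) as a directed graph and computes the debt types that may be affected downstream. The plain-dictionary representation and the traversal function are illustrative choices, not a formal dependency model.

```python
# Directed edges: source debt type -> debt types it may affect (subset of Table A2).
DEBT_EDGES = {
    "Documentation": ["Accessibility"],
    "Infrastructure": ["Model", "Data"],
    "Process": ["Versioning"],
    "Test": ["Defect"],
    "People": ["Ethics", "Documentation"],
    "Design": ["Defect"],
    "Code": ["Test"],
}

def downstream(debt: str, edges=DEBT_EDGES, seen=None) -> set:
    """Debt types reachable from `debt`, directly or transitively."""
    seen = set() if seen is None else seen
    for affected in edges.get(debt, []):
        if affected not in seen:
            seen.add(affected)
            downstream(affected, edges, seen)
    return seen

print(sorted(downstream("People")))  # ['Accessibility', 'Documentation', 'Ethics']
```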
6.4. Threats to Validity
- Selection Bias: Although the PRISMA-ScR process was followed (Section 3), some relevant but non-indexed studies may have been missed.
- Publication Bias: The review relies on peer-reviewed and indexed literature, which may underrepresent studies with inconclusive, negative, or incomplete results. Additionally, the exclusion of non-English publications may limit insights from other linguistic and regional contexts. While this is common in scoping reviews, it remains a relevant limitation.
- Subjectivity in Categorization: The thematic classification of technical debt types was grounded in the literature and expert interpretation, but alternative taxonomies may be proposed in future work.
- Limited Evaluation Data: The pilot study in Section 4.6 involved a relatively small number of participants in an educational setting, and the broader deployment (currently at 60+ responses) has not yet been statistically analyzed.
- Context-Specific Observations: Some examples and interpretations are tailored to academic or gamified platforms, and may require adaptation for industrial or large-scale AI competitions.
- Limited Access to Commercial Platforms: The lack of access to proprietary codebases and internal infrastructures from commercial competition providers (e.g., Kaggle, DrivenData) restricts our ability to assess organizer-side technical debt in those environments. Nonetheless, the questionnaire remains applicable from the participant perspective, paving the way for future investigations into how such competitions influence development workflows and debt accumulation on the submission side.
6.5. Future Work
7. Conclusions
Broader Applicability and Design Implications
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AI | Artificial intelligence |
CPU | Central Processing Unit |
DSLE | Data Science Lab Environment |
GBG | General Board Game |
GPU | Graphics Processing Unit |
GVGAI | General Video Game Artificial Intelligence |
ML | Machine Learning |
PRISMA-ScR | Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews |
RL | Reinforcement Learning |
RQ | Research Question |
SATD | Self-Admitted Technical Debt |
TD | Technical Debt |
XAI | Explainable Artificial Intelligence |
Appendix A. Full Questionnaire (Updated Version—60 Questions)
Appendix A.1. Algorithm Debt—Question 1
- Question 1: Have you checked if the framework you are using has technical debt or may introduce glitches or incompatibility in your application? Stakeholder: Organizers and Participants. Score: 3
Appendix A.2. Architectural Debt—Questions 2 to 3
- Question 2: Have you effectively separated concerns and ensured that code reuse does not lead to tightly coupled components? Stakeholder: Organizers. Score: 5
- Question 3: Have you designed the environment for prototyping ML models to prevent the need to re-implement from scratch for production? Stakeholder: Organizers. Score: 4
Appendix A.3. Build Debt—Questions 4 to 5
- Question 4: Have you checked your app for bad or suboptimal dependencies on internal and external artificial intelligence models? Stakeholder: Organizers and Participants. Score: 4
- Question 5: Have you tested the ability to build the app on different platforms? Stakeholder: Organizers. Score: 3
Appendix A.4. Code Debt—Question 6
- Question 6: Have you identified and refactored low-quality, complex, and duplicated code sections, including eliminating dead code paths and centralizing scattered code, while ensuring clear component and code APIs? Stakeholder: Organizers. Score: 5
Appendix A.5. Configuration Debt—Questions 7 to 9
- Question 7: Is it easy to specify a configuration as a small change from a previous configuration? Stakeholder: Organizers. Score: 3
- Question 8: Are your configuration files adequately documented? Stakeholder: Organizers. Score: 4
- Question 9: Have the configuration files been thoroughly reviewed and tested? Stakeholder: Organizers. Score: 4
Appendix A.6. Data Debt—Questions 10 to 15
- Question 10: Have you checked for spurious data? Stakeholder: Participants. Score: 4
- Question 11: Have you checked your data for accuracy? Stakeholder: Participants. Score: 5
- Question 12: Have you checked your data for completeness? Stakeholder: Participants. Score: 4
- Question 13: Have you checked your data for trustworthiness? Stakeholder: Participants. Score: 4
- Question 14: Have you performed testing on the input features? Stakeholder: Participants. Score: 3
- Question 15: Have you checked your data for relevance? Stakeholder: Participants. Score: 3
Appendix A.7. Design Debt—Questions 16 to 18
- Question 16: Pipeline Jungle—Is it possible to maintain a single, controllable, straightforward pipeline of ML components? Stakeholder: Organizers and Participants. Score: 3
- Question 17: Have you avoided relying on glue code in your system? Stakeholder: Organizers and Participants. Score: 2
- Question 18: Have you avoided reusing a slightly modified complete model (correction cascades)? Stakeholder: Participants. Score: 3
Appendix A.8. Defect Debt—Questions 19 to 23
- Question 19: Have you checked that there is no error in the training data collection that would cause a significant training dataset to be lost or delayed? Stakeholder: Participants. Score: 5
- Question 20: Have you made the right choice in the hyperparameter values? Stakeholder: Participants. Score: 4
- Question 21: Have you made sure that there is no degradation in prediction quality due to data changes, different code paths, etc.? Stakeholder: Participants. Score: 4
- Question 22: Have you quality-inspected and validated the model for adequacy before releasing it to production? Stakeholder: Participants. Score: 5
- Question 23: Have you implemented mechanisms for rapid adaptation and regular updates to maintain the model’s efficiency and relevance in response to changes in data, features, modeling, or infrastructure? Stakeholder: Organizers and Participants. Score: 4
Appendix A.9. Documentation Debt—Questions 24 to 28
- Question 24: Is Requirement Documentation available? Stakeholder: Organizers. Score: 5
- Question 25: Is Technical Documentation available? Stakeholder: Organizers. Score: 5
- Question 26: Is End-user Documentation available? Stakeholder: Organizers and Participants. Score: 5
- Question 27: Is the documentation clear? Stakeholder: Organizers and Participants. Score: 5
- Question 28: Is the documentation up to date? Stakeholder: Organizers and Participants. Score: 5
Appendix A.10. Ethics Debt—Questions 29 to 30
- Question 29: Are you familiar with the implementation guidelines, the process for submitting clarification requests, and how to address conflicting interpretations of complex or ambiguous concepts? Stakeholder: Organizers and Participants. Score: 5
- Question 30: Do you know the consequences of non-compliance? Stakeholder: Organizers and Participants. Score: 5
Appendix A.11. Infrastructure Debt—Questions 31 to 34
- Question 31: Are there mechanisms in place for automated monitoring and alerting of infrastructure performance metrics (e.g., Central Processing Unit (CPU) usage, memory utilization, network throughput)? Stakeholder: Organizers. Score: 4. Justification:
  - Maintainability: Automated monitoring can reduce technical debt by making it easier to maintain the system.
  - Future-proofing: Early detection of performance issues can prevent the accumulation of technical debt related to system degradation.
- Question 32: Has provision been made in the infrastructure for sufficient computing resources? Stakeholder: Organizers and Participants. Score: 3. Justification:
  - Scalability: While important, over-provisioning resources can lead to unnecessary complexity and costs, contributing to technical debt.
  - Cost Management: Balancing resources with actual needs can minimize expenses and reduce the risk of investing in technologies that may become obsolete.
- Question 33: Have you developed a robust data pipeline for easy experimentation with AI algorithms? Stakeholder: Organizers. Score: 5
- Question 34: Have you automated pipelines for model training, deployment, and integration? Stakeholder: Organizers and Participants. Score: 4
Appendix A.12. Model Debt—Questions 35 to 37
- Question 35: Are you detecting direct feedback loops or hidden feedback loops? Stakeholder: Organizers and Participants. Score: 4
- Question 36: Is model quality validated before serving? Stakeholder: Participants. Score: 5
- Question 37: Does the model allow debugging by observing the step-by-step computation of training or inference on a single example? Stakeholder: Participants. Score: 3
Appendix A.13. People—Social Debt—Questions 38 to 39
- Question 38: Is there a system in place to ensure project continuity through team member overlap and retention of the original development team’s knowledge? Stakeholder: Organizers. Score: 5
- Question 39: Has a project support community been created? Stakeholder: Organizers. Score: 3
Appendix A.14. Process Debt—Questions 40 to 42
- Question 40: Have you correctly described your data handling procedures? Stakeholder: Organizers. Score: 4
- Question 41: Have you correctly described your model development processes? Stakeholder: Organizers. Score: 4
- Question 42: Have you correctly described the deployment processes of your model? Stakeholder: Organizers. Score: 4
Appendix A.15. Requirements Debt—Questions 43 to 45
- Question 43: Have you thoroughly defined the objectives, scope, stakeholder needs, expectations, decision goals, and insights of the AI system to ensure alignment with business objectives and user expectations? Stakeholder: Organizers. Score: 5
- Question 44: Have you thoroughly addressed the technical aspects of the AI system, including the selection of appropriate AI techniques, algorithms, and models to achieve desired functionality and performance, as well as specifying quality attributes, trade-offs, metrics, and indicators to measure and evaluate system performance effectively? Stakeholder: Organizers. Score: 5
- Question 45: Have you monitored and retrained the AI system with new data as needed? Stakeholder: Organizers and Participants. Score: 4
Appendix A.16. Self-Admitted Technical Debt (SATD)—Questions 46 to 47
- Question 46: Do you systematically record and track self-admitted technical debt (SATD) comments in the code of Artificial Intelligence (AI/ML/RL) models, using backlog management, issue tracking, or other technical debt management tools? Stakeholder: Organizers. Score:
- Question 47: Do you regularly plan improvements for areas of the code marked as SATD, such as gradual model refactoring, pipeline component redesign, or documentation of experimental setups? Stakeholder: Organizers. Score: 5
Appendix A.17. Test Debt—Questions 48 to 54
- Question 48: Have the hyperparameters been properly tuned and validated to ensure optimal performance within the game environment? Stakeholder: Participants. Score: 5
- Question 49: Has the reproducibility of agent training and environment dynamics been tested to ensure consistency? Stakeholder: Organizers and Participants. Score: 4
- Question 50: Is there a fully automated test regularly running to validate the entire pipeline, ensuring data and code move through each stage successfully and result in a well-performing model? Stakeholder: Organizers. Score: 5
- Question 51: Do the data invariants hold for the inputs in the game environment? Stakeholder: Organizers and Participants. Score: 3
- Question 52: Are there mechanisms in place to ensure that training and serving are not skewed in the game? Stakeholder: Organizers and Participants. Score: 4
- Question 53: Are the models numerically stable for effective gameplay? Stakeholder: Organizers and Participants. Score: 5
- Question 54: Has the prediction quality of the game remained stable (i.e., not regressed) over time? Stakeholder: Organizers and Participants. Score: 5
Appendix A.18. Versioning Debt—Questions 55 to 57
- Question 55: Have you set up a proper version control system for models, training data, and test data? Stakeholder: Organizers. Score: 5
- Question 56: Have you used an appropriate policy for marking the versions of your software components? Stakeholder: Organizers. Score: 3
- Question 57: Do you maintain a consistent data structure for game state representation throughout iterations, ensuring compatibility between different versions of the RL game? Stakeholder: Organizers and Participants. Score: 4
Appendix A.19. Accessibility Debt—Questions 58 to 60
- Question 58: Have you conducted usability testing to identify and address potential barriers in the platform setup process? Stakeholder: Organizers. Score: 5
- Question 59: Have you implemented adaptive user interfaces that tailor the setup experience based on participants’ skill levels and preferences? Stakeholder: Organizers. Score: 4
- Question 60: Have you implemented feedback mechanisms that allow participants to report accessibility issues and suggest improvements? Stakeholder: Organizers. Score: 3
Appendix B
No | Title | Rater 1 | Rater 2 |
---|---|---|---|
1 | Machine Learning Algorithms, Real-World Applications and Research Directions | Algorithm | Algorithm |
2 | Adapting Software Architectures to Machine Learning Challenges | Architectural | Architectural |
3 | Architecture Decisions in AI-based Systems Development: An Empirical Study | Architectural | Architectural |
4 | Machine Learning Architecture and Design Patterns | Architectural | Architectural |
5 | Searching for Build Debt: Experiences Managing Technical Debt at Google | Build | Build |
6 | Code and Architectural Debt in Artificial Intelligence Systems | Code | Code |
7 | Code Smells for Machine Learning Applications | Code | Code |
8 | The prevalence of code smells in machine learning projects | Code | Code |
9 | A Software Engineering Perspective on Engineering Machine Learning Systems: State of the Art and Challenges | Configuration | Process |
10 | Challenges in Deploying Machine Learning: A Survey of Case Studies | Configuration | Infrastructure |
11 | Data collection and quality challenges in deep learning: A data-centric AI perspective | Data | Data |
12 | Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems | Data | Data |
13 | Debugging machine learning pipelines | Defect | Defect |
14 | Common problems with Creating Machine Learning Pipelines from Existing Code | Design | Code |
15 | Understanding Implementation Challenges in Machine Learning Documentation | Documentation | Documentation |
16 | Patterns and Anti-Patterns, Principles and Pitfalls: Accountability and Transparency | Ethics | Ethics |
17 | Infrastructure for Usable Machine Learning: The Stanford DAWN Project | Infrastructure | Infrastructure |
18 | A Meta-Summary of Challenges in Building Products with ML Components—Collecting Experiences from 4758+ Practitioners | Model | Model |
19 | Machine Learning Model Development from a Software Engineering Perspective: A Systematic Literature Review | Model | Model |
20 | Quality issues in Machine Learning Software Systems | Model | Model |
21 | Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process | People | People |
22 | Studying Software Engineering Patterns for Designing ML Systems | Process | Architectural |
23 | MLife: A Lite Framework for Machine Learning Lifecycle Initialization | Requirements | Requirements |
24 | Requirements Engineering for Artificial Intelligence Systems: A Systematic Mapping Study | Requirements | Requirements |
25 | 23 Shades of Self-Admitted Technical Debt: An Empirical Study on Machine Learning Software | SATD | SATD |
26 | Self-Admitted Technical Debt in R: Detection and Causes | SATD | SATD |
27 | Machine Learning Testing: Survey, Landscapes and Horizons | Test | Test |
28 | On Testing Machine Learning Programs | Test | Test |
29 | On the Challenges of Migrating to Machine Learning Life Cycle Management Platforms | Versioning | Versioning |
30 | Versioning for End-to-End Machine Learning Pipelines | Versioning | Versioning |
Source Debt Type | Affected Debt Type(s) | Relationship Type | Explanation |
---|---|---|---|
Documentation Debt | Accessibility Debt | Causal | Lack of clear or multilingual documentation hinders access for non-native or novice users. |
Infrastructure Debt | Model Debt, Data Debt | Enabling Constraint | Poor infrastructure restricts the deployment, scalability, and integrity of models and data. |
Process Debt | Versioning Debt | Triggering | Opaque or informal processes often lead to versioning inconsistencies and traceability loss. |
Test Debt | Defect Debt | Amplifying | Insufficient testing increases the likelihood of undetected bugs or model failures. |
Configuration Debt | Reproducibility Issues | Blocking | Non-reproducible configurations block validation and result reuse by participants. |
People Debt | Ethics Debt, Documentation | Reinforcing | Knowledge silos and turnover weaken compliance practices and reduce documentation quality. |
Design Debt | Maintainability, Defect Debt | Structural | Poor design decisions propagate technical issues and raise maintenance overhead. |
Code Debt | Test Debt | Indirect | Unstructured or entangled code often reduces test coverage or inhibits testability. |
References
- Martínez-Fernández, S.; Bogner, J.; Franch, X.; Oriol, M.; Siebert, J.; Trendowicz, A.; Vollmer, A.M.; Wagner, S. Software Engineering for AI-Based Systems: A Survey. ACM Trans. Softw. Eng. Methodol. 2022, 31, 1–59. [Google Scholar] [CrossRef]
- Felderer, M.; Ramler, R. Quality Assurance for AI-based Systems: Overview and Challenges. arXiv 2021, arXiv:2102.05351. [Google Scholar]
- Kumeno, F. Software engineering challenges for machine learning applications: A literature review. Intell. Decis. Technol. 2020, 13, 463–476. [Google Scholar] [CrossRef]
- Tang, Y.; Khatchadourian, R.; Bagherzadeh, M.; Singh, R.; Stewart, A.; Raja, A. An Empirical Study of Refactorings and Technical Debt in Machine Learning Systems. In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, 25–28 May 2021; pp. 238–250. [Google Scholar]
- Sculley, D.; Holt, G.; Golovin, D.; Davydov, E.; Phillips, T.; Ebner, D.; Chaudhary, V.; Young, M.; Crespo, J.F.; Dennison, D. Hidden technical debt in machine learning systems. Adv. Neural Inf. Process. Syst. 2015, 28, 2503–2511. [Google Scholar]
- Luitse, D.M.R.; Blanke, T.; Poell, T. AI Competitions as Infrastructures: Examining Power Relations on Kaggle and Grand Challenge in AI-Driven Medical Imaging. AoIR Sel. Pap. Internet Res. 2022. [Google Scholar]
- Breck, E.; Cai, S.; Nielsen, E.; Salib, M.; Sculley, D. The ML test score: A rubric for ML production readiness and technical debt reduction. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 1123–1132. [Google Scholar] [CrossRef]
- Konen, W. General board game playing for education and research in generic AI game learning. In Proceedings of the 2019 IEEE Conference on Games (CoG), London, UK, 20–23 August 2019; pp. 1–8. [Google Scholar] [CrossRef]
- Hong, C.; Jeong, I.; Vecchietti, L.F.; Har, D.; Kim, J.H. AI World Cup: Robot-Soccer-Based Competitions. IEEE Trans. Games 2021, 13, 330–341. [Google Scholar] [CrossRef]
- Attanasio, G.; Giobergia, F.; Pasini, A.; Ventura, F.; Baralis, E.; Cagliero, L.; Garza, P.; Apiletti, D.; Cerquitelli, T.; Chiusano, S. DSLE: A Smart Platform for Designing Data Science Competitions. In Proceedings of the 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Madrid, Spain, 13–17 July 2020; pp. 133–142. [Google Scholar] [CrossRef]
- Zobernig, V.; Saldanha, R.A.; He, J.; van der Sar, E.; van Doorn, J.; Hua, J.-C.; Mason, L.R.; Czechowski, A.; Indjic, D.; Kosmala, T.; et al. RangL: A Reinforcement Learning Competition Platform. SSRN Electron. J. 2022, 1–10. [Google Scholar] [CrossRef]
- Stephenson, M.; Piette, E.; Soemers, D.J.N.J.; Browne, C. Ludii as a competition platform. In Proceedings of the 2019 IEEE Conference on Games (CoG), London, UK, 20–23 August 2019; pp. 1–8. [Google Scholar] [CrossRef]
- Kempka, M.; Wydmuch, M.; Runc, G.; Toczek, J.; Jaskowski, W. ViZDoom: A Doom-based AI research platform for visual reinforcement learning. In Proceedings of the 2016 IEEE Conference on Computational Intelligence and Games (CIG), Santorini, Greece, 20–23 September 2016; pp. 1–8. [Google Scholar] [CrossRef]
- Kalles, D. Artificial intelligence meets software engineering in computing education. In Proceedings of the 9th Hellenic Conference on Artificial Intelligence, Thessaloniki, Greece, 18–20 May 2016; pp. 1–5. [Google Scholar] [CrossRef]
- Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W. OpenAI Gym. arXiv 2016, arXiv:1606.01540. [Google Scholar]
- Togelius, J. How to Run a Successful Game-Based AI Competition. IEEE Trans. Comput. Intell. AI Games 2016, 8, 95–100. [Google Scholar] [CrossRef]
- Kästner, C.; Kang, E. Teaching software engineering for AI-enabled systems. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering Education and Training, Seoul, Republic of Korea, 5–11 October 2020; pp. 45–48. [Google Scholar] [CrossRef]
- Chesani, F.; Galassi, A.; Mello, P.; Trisolini, G. A game-based competition as instrument for teaching artificial intelligence. In Proceedings of the AI* IA 2017 Advances in Artificial Intelligence: XVIth International Conference of the Italian Association for Artificial Intelligence, Bari, Italy, 14–17 November 2017; pp. 72–84. [Google Scholar] [CrossRef]
- Recupito, G.; Pecorelli, F.; Catolino, G.; Lenarduzzi, V.; Taibi, D.; Di Nucci, D.; Palomba, F. Technical Debt in AI-Enabled Systems: On the Prevalence, Severity, Impact, and Management Strategies for Code and Architecture. J. Syst. Softw. 2024, 216, 112151. [Google Scholar] [CrossRef]
- Serban, A.; Visser, J. Adapting Software Architectures to Machine Learning Challenges. In Proceedings of the 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Honolulu, HI, USA, 15–18 March 2022; pp. 152–163. [Google Scholar] [CrossRef]
- Amershi, S.; Begel, A.; Bird, C.; DeLine, R.; Gall, H.; Kamar, E.; Nagappan, N.; Nushi, B.; Zimmermann, T. Software Engineering for Machine Learning: A Case Study. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Montreal, QC, Canada, 25–31 May 2019; pp. 291–300. [Google Scholar] [CrossRef]
- Sklavenitis, D.; Kalles, D. Measuring Technical Debt in AI-Based Competition Platforms. In Proceedings of the 13th Hellenic Conference on Artificial Intelligence (SETN 2024), Athens, Greece, 11–13 September 2024. [Google Scholar] [CrossRef]
- Isbell, C.; Littman, M.L.; Norvig, P. Software Engineering of Machine Learning Systems. Commun. ACM 2023, 66, 35–37. [Google Scholar] [CrossRef]
- Kolek, L.; Mochocki, M.; Gemrot, J. Review of Educational Benefits of Game Jams: Participant and Industry Perspective. Homo Ludens 2023, 1, 115–140. [Google Scholar] [CrossRef]
- Meriläinen, M.; Aurava, R.; Kultima, A.; Stenros, J. Game jams for learning and teaching: A review. Int. J. Game Based Learn. 2020, 10, 54–71. [Google Scholar] [CrossRef]
- Mittelstadt, B. Principles alone cannot guarantee ethical AI. Nat. Mach. Intell. 2019, 1, 501–507. [Google Scholar] [CrossRef]
- Foidl, H.; Felderer, M.; Ramler, R. Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-Based Systems; Association for Computing Machinery: New York, NY, USA, 2022; Volume 1. [Google Scholar] [CrossRef]
- Polyzotis, N.; Zinkevich, M.; Roy, S.; Breck, E.; Whang, S. Data Validation for Machine Learning. SysML 2019, 1, 334–347. [Google Scholar]
- Liu, J.; Huang, Q.; Xia, X.; Shihab, E.; Lo, D.; Li, S. Is using deep learning frameworks free? Characterizing technical debt in deep learning frameworks. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society, Seoul, Republic of Korea, 5–11 October 2020; pp. 1–10. [Google Scholar] [CrossRef]
- Bogner, J.; Verdecchia, R.; Gerostathopoulos, I. Characterizing Technical Debt and Antipatterns in AI-Based Systems: A Systematic Mapping Study. In Proceedings of the 2021 IEEE/ACM International Conference on Technical Debt (TechDebt), Madrid, Spain, 19–21 May 2021; pp. 64–73. [Google Scholar] [CrossRef]
- Washizaki, H.; Khomh, F.; Gueheneuc, Y.G.; Takeuchi, H.; Natori, N.; Doi, T.; Okuda, S. Software-Engineering Design Patterns for Machine Learning Applications. Comput. Long. Beach. Calif. 2022, 55, 30–39. [Google Scholar] [CrossRef]
- Li, Z.; Avgeriou, P.; Liang, P. A systematic mapping study on technical debt and its management. J. Syst. Softw. 2015, 101, 193–220. [Google Scholar] [CrossRef]
- Rios, N.; Neto, M.G.d.M.; Spínola, R.O. A tertiary study on technical debt: Types, management strategies, research trends, and base information for practitioners. Inf. Softw. Technol. 2018, 102, 117–145. [Google Scholar] [CrossRef]
- Ahmad, K.; Abdelrazek, M.; Arora, C.; Bano, M.; Grundy, J. Requirements engineering for artificial intelligence systems: A systematic mapping study. Inf. Softw. Technol. 2023, 158, 107176. [Google Scholar] [CrossRef]
- Vogelsang, A.; Borg, M. Requirements engineering for machine learning: Perspectives from data scientists. In Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW), Jeju, Republic of Korea, 23–27 September 2019; pp. 245–251. [Google Scholar] [CrossRef]
- Warnett, S.J.; Zdun, U. Architectural Design Decisions for Machine Learning Deployment. In Proceedings of the 2022 IEEE 19th International Conference on Software Architecture (ICSA), Honolulu, HI, USA, 12–15 March 2022; pp. 90–100. [Google Scholar] [CrossRef]
- Heiland, L.; Hauser, M.; Bogner, J. Design Patterns for AI-based Systems: A Multivocal Literature Review and Pattern Repository. arXiv 2023, arXiv:2303.13173. [Google Scholar]
- Washizaki, H.; Uchida, H.; Khomh, F.; Guéhéneuc, Y.G. Studying Software Engineering Patterns for Designing Machine Learning Systems. In Proceedings of the 2019 10th International Workshop on Empirical Software Engineering in Practice (IWESEP), Tokyo, Japan, 13–14 December 2019; pp. 49–54. [Google Scholar] [CrossRef]
- Menzies, T. The Five Laws of SE for AI. IEEE Softw. 2020, 37, 81–85. [Google Scholar] [CrossRef]
- Polyzotis, N.; Roy, S.; Whang, S.E.; Zinkevich, M. Data lifecycle challenges in production machine learning: A survey. SIGMOD Rec. 2018, 47, 17–28. [Google Scholar] [CrossRef]
- Whang, S.E.; Roh, Y.; Song, H.; Lee, J.G. Data collection and quality challenges in deep learning: A data-centric AI perspective. VLDB J. 2023, 32, 791–813. [Google Scholar] [CrossRef]
- Hutchinson, B.; Smart, A.; Hanna, A.; Denton, E.; Greer, C.; Kjartansson, O.; Barnes, P.; Mitchell, M. Towards accountability for machine learning datasets: Practices from software engineering and infrastructure. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual, 3–10 March 2021; pp. 560–575. [Google Scholar] [CrossRef]
- Vadavalasa, R.M. Data Validation Process in Machine Learning Pipeline. Int. J. Sci. Res. Dev. 2021, 8, 449–452. [Google Scholar]
- Foidl, H.; Felderer, M. Risk-based data validation in machine learning-based software systems. In Proceedings of the 3rd ACM SIGSOFT International Workshop on Machine Learning Techniques for Software Quality Evaluation, Tallinn, Estonia, 27 August 2019; pp. 13–18. [Google Scholar] [CrossRef]
- Chen, Z.; Wu, M.; Chan, A.; Li, X.; Ong, Y.S. Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges [Review Article]. IEEE Comput. Intell. Mag. 2023, 18, 60–77. [Google Scholar] [CrossRef]
- Ozkaya, I. What Is Really Different in Engineering AI-Enabled Systems? IEEE Softw. 2020, 37, 3–6. [Google Scholar] [CrossRef]
- Lwakatare, L.E.; Raj, A.; Bosch, J.; Olsson, H.H.; Crnkovic, I. A taxonomy of software engineering challenges for machine learning systems: An empirical investigation. Lect. Notes Bus. Inf. Process. 2019, 355, 227–243. [Google Scholar] [CrossRef]
- Schelter, S.; Biessmann, F.; Januschowski, T.; Salinas, D.; Seufert, S.; Szarvas, G. On Challenges in Machine Learning Model Management. Bull. IEEE Comput. Soc. Tech. Comm. Data Eng. 2018, 5–13. Available online: http://sites.computer.org/debull/A18dec/p5.pdf (accessed on 18 June 2025).
- Studer, S.; Bui, T.B.; Drescher, C.; Hanuschkin, A.; Winkler, L.; Peters, S.; Müller, K.R. Towards CRISP-ML (Q): A Machine Learning Process Model with Quality Assurance Methodology. Mach. Learn. Knowl. Extr. 2021, 3, 392–413. [Google Scholar] [CrossRef]
- Bailis, P.; Olukotun, K.; Re, C.; Zaharia, M. Infrastructure for Usable Machine Learning: The Stanford DAWN Project. arXiv 2017, arXiv:1705.07538. [Google Scholar]
- Zhang, J.M.; Harman, M.; Ma, L.; Liu, Y. Machine Learning Testing: Survey, Landscapes and Horizons. IEEE Trans. Softw. Eng. 2022, 48, 1–36. [Google Scholar] [CrossRef]
- Côté, P.O.; Nikanjam, A.; Bouchoucha, R.; Basta, I.; Abidi, M.; Khomh, F. Quality Issues in Machine Learning Software Systems. Empir. Softw. Eng. 2024, 29, 149. [Google Scholar] [CrossRef]
- Murphy, C.; Kaiser, G.; Arias, M. A Framework for Quality Assurance of Machine Learning Applications; 2020; pp. 1–10. Available online: https://www.researchgate.net/publication/228687118_A_Framework_for_Quality_Assurance_of_Machine_Learning_Applications (accessed on 18 June 2025).
- Barr, E.T.; Harman, M.; McMinn, P.; Shahbaz, M.; Yoo, S. The oracle problem in software testing: A survey. IEEE Trans. Softw. Eng. 2015, 41, 507–525. [Google Scholar] [CrossRef]
- Braiek, H.B.; Khomh, F. On testing machine learning programs. J. Syst. Softw. 2020, 164, 110542. [Google Scholar] [CrossRef]
- Golendukhina, V.; Lenarduzzi, V.; Felderer, M. What is Software Quality for AI Engineers? Towards a Thinning of the Fog. In Proceedings of the 1st International Conference on AI Engineering: Software Engineering for AI, Pittsburgh, PA, USA, 16–17 May 2022; pp. 1–9. [Google Scholar] [CrossRef]
- Albuquerque, D.; Guimaraes, E.; Tonin, G.; Perkusich, M.; Almeida, H.; Perkusich, A. Comprehending the Use of Intelligent Techniques to Support Technical Debt Management. In Proceedings of the International Conference on Technical Debt, Pittsburgh, PA, USA, 16–18 May 2022; pp. 21–30. [Google Scholar] [CrossRef]
- Njomou, A.T.; Fokaefs, M.; Silatchom Kamga, D.F.; Adams, B. On the Challenges of Migrating to Machine Learning Life Cycle Management Platforms. In Proceedings of the 32nd Annual International Conference on Computer Science and Software Engineering, Toronto, ON, Canada, 15–17 November 2022; pp. 42–51. [Google Scholar]
- Van Der Weide, T.; Papadopoulos, D.; Smirnov, O.; Zielinski, M.; Van Kasteren, T. Versioning for end-to-end machine learning pipelines. In Proceedings of the 1st Workshop on Data Management for End-to-End Machine Learning, Chicago, IL, USA, 14–19 May 2017; pp. 1–9. [Google Scholar] [CrossRef]
- Arpteg, A.; Brinne, B.; Crnkovic-Friis, L.; Bosch, J. Software engineering challenges of deep learning. In Proceedings of the 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Prague, Czech Republic, 29–31 August 2018; pp. 50–59. [Google Scholar] [CrossRef]
- Kery, M.B.; Radensky, M.; Arya, M.; John, B.E.; Myers, B.A. The story in the notebook: Exploratory data science using a literate programming tool. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–11. [Google Scholar] [CrossRef]
- Giray, G. A software engineering perspective on engineering machine learning systems: State of the art and challenges. J. Syst. Softw. 2021, 180, 111031. [Google Scholar] [CrossRef]
- OBrien, D.; Biswas, S.; Imtiaz, S.; Abdalkareem, R.; Shihab, E.; Rajan, H. 23 Shades of Self-Admitted Technical Debt: An Empirical Study on Machine Learning Software; Association for Computing Machinery: New York, NY, USA, 2022; Volume 1. [Google Scholar] [CrossRef]
- Paleyes, A.; Urma, R.G.; Lawrence, N.D. Challenges in Deploying Machine Learning: A Survey of Case Studies. ACM Comput. Surv. 2022, 55, 1–29. [Google Scholar] [CrossRef]
- Van Oort, B.; Cruz, L.; Aniche, M.; Van Deursen, A. The prevalence of code smells in machine learning projects. In Proceedings of the 2021 IEEE/ACM 1st Workshop on AI Engineering–Software Engineering for AI (WAIN), Madrid, Spain, 30–31 May 2021; pp. 35–42. [Google Scholar] [CrossRef]
- Gesi, J.; Liu, S.; Li, J.; Ahmed, I.; Nagappan, N.; Lo, D.; de Almeida, E.S.; Kochhar, P.S.; Bao, L. Code Smells in Machine Learning Systems. arXiv 2022, arXiv:2203.00803. [Google Scholar]
- Wang, J.; Li, L.; Zeller, A. Better Code, Better Sharing: On the Need of Analyzing Jupyter Notebooks. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results, Seoul, Republic of Korea, 27 June–19 July 2020; pp. 53–56. [Google Scholar] [CrossRef]
- Pimentel, J.F.; Murta, L.; Braganholo, V.; Freire, J. A large-scale study about quality and reproducibility of jupyter notebooks. In Proceedings of the 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), Montreal, QC, Canada, 25–31 May 2019; pp. 507–517. [Google Scholar] [CrossRef]
- Haakman, M.P.A. Studying the Machine Learning Lifecycle and Improving Code Quality of Machine Learning Applications. 2020. Available online: https://repository.tudelft.nl/islandora/object/uuid%3A38ff4e9a-222a-4987-998c-ac9d87880907 (accessed on 18 June 2025).
- De Souza Nascimento, E.; Ahmed, I.; Oliveira, E.; Palheta, M.P.; Steinmacher, I.; Conte, T. Understanding Development Process of Machine Learning Systems: Challenges and Solutions. In Proceedings of the 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Porto de Galinhas, Brazil, 19–20 September 2019; pp. 1–6. [Google Scholar] [CrossRef]
- Nahar, N.; Zhou, S.; Lewis, G.; Kastner, C. Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 413–425. [Google Scholar] [CrossRef]
- Mo, R.; Zhang, Y.; Wang, Y.; Zhang, S.; Xiong, P.; Li, Z.; Zhao, Y. Exploring the Impact of Code Clones on Deep Learning Software. ACM Trans. Softw. Eng. Methodol. 2023, 32, 1–34. [Google Scholar] [CrossRef]
- Rios, N.; Mendes, L.; Cerdeiral, C.; Magalhães, A.P.F.; Perez, B.; Correal, D.; Astudillo, H.; Seaman, C.; Izurieta, C.; Santos, G.; et al. Hearing the Voice of Software Practitioners on Causes, Effects, and Practices to Deal with Documentation Debt. In Requirements Engineering: Foundation for Software Quality: 26th International Working Conference; Springer: Cham, Switzerland, 2020; pp. 55–70. [Google Scholar] [CrossRef]
- Chang, J.; Custis, C. Understanding Implementation Challenges in Machine Learning Documentation. In Proceedings of the 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Gran Canaria, Spain, 31 August–2 September 2022; pp. 1–8. [Google Scholar] [CrossRef]
- Shivashankar, K.; Martini, A. Maintainability Challenges in ML: A Systematic Literature Review. In Proceedings of the 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Gran Canaria, Spain, 31 August–2 September 2022; pp. 60–67. [Google Scholar] [CrossRef]
- Tamburri, D.A.; Kruchten, P.; Lago, P.; Van Vliet, H. What is social debt in software engineering? In Proceedings of the 2013 6th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), San Francisco, CA, USA, 25 May 2013; pp. 93–96. [Google Scholar] [CrossRef]
- Mailach, A.; Siegmund, N. Socio-Technical Anti-Patterns in Building ML-Enabled Software: Insights from Leaders on the Forefront. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 14–20 May 2023; pp. 690–702. [Google Scholar] [CrossRef]
- Ishikawa, F.; Yoshioka, N. How Do Engineers Perceive Difficulties in Engineering of Machine-Learning Systems?—Questionnaire Survey. In Proceedings of the 2019 IEEE/ACM Joint 7th International Workshop on Conducting Empirical Studies in Industry (CESI) and 6th International Workshop on Software Engineering Research and Industrial Practice (SER&IP), Montreal, QC, Canada, 28 May 2019; pp. 2–9. [Google Scholar] [CrossRef]
- Vakkuri, V.; Kemell, K.K.; Jantunen, M.; Abrahamsson, P. “This is Just a Prototype”: How Ethics Are Ignored in Software Startup-Like Environments. In Proceedings of the 2019 IEEE/ACM Joint 7th International Workshop on Conducting Empirical Studies in Industry (CESI) and 6th International Workshop on Software Engineering Research and Industrial Practice (SER&IP), Montreal, QC, Canada, 28 May 2019; pp. 195–210. [Google Scholar] [CrossRef]
- Hagendorff, T. The Ethics of AI Ethics: An Evaluation of Guidelines. Minds Mach. 2020, 30, 99–120. [Google Scholar] [CrossRef]
- Matthews, J. Patterns and antipatterns, principles, and pitfalls: Accountability and transparency in artificial intelligence. AI Mag. 2020, 41, 81–89. [Google Scholar] [CrossRef]
- Petrozzino, C. Who pays for ethical debt in AI? AI Ethics 2021, 1, 205–208. [Google Scholar] [CrossRef]
- Bhatia, A.; Khomh, F.; Adams, B.; Hassan, A.E. An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software. arXiv 2023, arXiv:2311.12019. [Google Scholar]
- Bavota, G.; Russo, B. A large-scale empirical study on self-admitted technical debt. In Proceedings of the 13th International Conference on Mining Software Repositories, Austin, TX, USA, 14–15 May 2016; pp. 315–326. [Google Scholar] [CrossRef]
- Nascimento, E.; Nguyen-Duc, A.; Sundbø, I.; Conte, T. Software engineering for artificial intelligence and machine learning software: A systematic literature review. arXiv 2020, arXiv:2011.03751. [Google Scholar]
- Wan, Z.; Xia, X.; Lo, D.; Murphy, G.C. How does machine learning change software development practices? IEEE Trans. Softw. Eng. 2021, 47, 1857–1871. [Google Scholar] [CrossRef]
- Serban, A.; Van Der Blom, K.; Hoos, H.; Visser, J. Adoption and effects of software engineering best practices in machine learning. In Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Bari, Italy, 5–9 October 2020; p. 12. [Google Scholar] [CrossRef]
- Abdellatif, A.; Ghiasi, G.; Costa, D.E.; Shihab, E.; Tajmel, T. SE4AI: A Training Program Considering Technical, Social, and Professional Aspects of AI-Based Software Systems. IEEE Softw. 2024, 41, 44–51. [Google Scholar] [CrossRef]
- Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
- Vouros, G.A. Explainable Deep Reinforcement Learning: State of the Art and Challenges. ACM Comput. Surv. 2022, 55, 39. [Google Scholar] [CrossRef]
- Speith, T. A Review of Taxonomies of Explainable Artificial Intelligence (XAI) Methods. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, 21–24 June 2022; pp. 2239–2250. [Google Scholar] [CrossRef]
- Dwivedi, R.; Dave, D.; Naik, H.; Singhal, S.; Omer, R.; Patel, P.; Qian, B.; Wen, Z.; Shah, T.; Morgan, G.; et al. Explainable AI (XAI): Core Ideas, Techniques, and Solutions. ACM Comput. Surv. 2023, 55, 1–33. [Google Scholar] [CrossRef]
- Du, M.; Liu, N.; Hu, X. Techniques for interpretable machine learning. Commun. ACM 2020, 63, 68–77. [Google Scholar] [CrossRef]
- Puiutta, E.; Veith, E.M.S.P. Explainable Reinforcement Learning: A Survey. In Machine Learning and Knowledge Extraction. CD-MAKE 2020. Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2020; pp. 77–95. [Google Scholar] [CrossRef]
- Mizrahi, M. AI4Science and the context distinction. AI Ethics 2025, 1–6. [Google Scholar] [CrossRef]
- Reinke, A.; Tizabi, M.D.; Eisenmann, M.; Maier-Hein, L. Common Pitfalls and Recommendations for Grand Challenges in Medical Artificial Intelligence. Eur. Urol. Focus 2021, 7, 710–712. [Google Scholar] [CrossRef]
- Pavao, A.; Guyon, I.; Letournel, A.-C.; Baró, X.; Escalante, H.; Escalera, S.; Thomas, T.; Xu, Z. CodaLab Competitions: An Open Source Platform to Organize Scientific Challenges. 2022. Available online: https://hal.inria.fr/hal-03629462/document (accessed on 18 June 2025).
- Guss, W.H.; Castro, M.Y.; Devlin, S.; Houghton, B.; Kuno, N.S.; Loomis, C.; Milani, S.; Mohanty, S.; Nakata, K.; Salakhutdinov, R.; et al. The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors. arXiv 2021, arXiv:2101.11071. [Google Scholar]
- Perez-Liebana, D.; Samothrakis, S.; Togelius, J.; Lucas, S.M.; Schaul, T. General video game AI: Competition, challenges, and opportunities. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 4335–4337. [Google Scholar] [CrossRef]
- Bojer, C.S.; Meldgaard, J.P. Kaggle forecasting competitions: An overlooked learning opportunity. Int. J. Forecast. 2021, 37, 587–603. [Google Scholar] [CrossRef]
- Kultima, A. Game Jam Natives?: The rise of the game jam era in game development cultures. In Proceedings of the 6th Annual International Conference on Game Jams, Hackathons, and Game Creation Events, Montreal, QC, Canada, 2 August 2021; pp. 22–28. [Google Scholar] [CrossRef]
- Koskinen, E. Pizza and coffee make a game jam—Learnings from organizing an online game development event. In Proceedings of the 6th Annual International Conference on Game Jams, Hackathons, and Game Creation Events, Montreal, QC, Canada, 2 August 2021; pp. 74–77. [Google Scholar] [CrossRef]
- Giagtzoglou, K.; Kalles, D. A gaming ecosystem as a tool for research and education in artificial intelligence. In Proceedings of the 10th Hellenic Conference on Artificial Intelligence, Patras, Greece, 9–12 July 2018; p. 2. [Google Scholar] [CrossRef]
- Salta, A.; Prada, R.; Melo, F.S. A Game AI Competition to Foster Collaborative AI Research and Development. IEEE Trans. Games 2021, 13, 398–409. [Google Scholar] [CrossRef]
- Genter, K.; Laue, T.; Stone, P. Three years of the robocup standard platform league drop-in player competition: Creating and maintaining a large scale ad hoc teamwork robotics competition (JAAMAS extended abstract). Proc. Int. Jt. Conf. Auton. Agents Multiagent Syst. AAMAS 2017, 1, 520–521. [Google Scholar] [CrossRef]
- Johnson, M.; Hofmann, K.; Hutton, T.; Bignell, D. The Malmo platform for artificial intelligence experimentation. IJCAI Int. Jt. Conf. Artif. Intell. 2016, 16, 4246–4247. [Google Scholar]
- Aurava, R.; Meriläinen, M.; Kankainen, V.; Stenros, J. Game jams in general formal education. Int. J. Child Comput. Interact. 2021, 28, 100274. [Google Scholar] [CrossRef]
- Abbott, D.; Chatzifoti, O.; Ferguson, J.; Louchart, S.; Stals, S. Serious “Slow” Game Jam—A Game Jam Model for Serious Game Design. In Proceedings of the 7th International Conference on Game Jams, Hackathons and Game Creation Events, Virtual, 30 August 2023; pp. 28–36. [Google Scholar] [CrossRef]
- Kim, K.J.; Cho, S.B. Game AI competitions: An open platform for computational intelligence education. IEEE Comput. Intell. Mag. 2013, 8, 64–68. [Google Scholar] [CrossRef]
- Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA extension for scoping reviews (PRISMA-ScR): Checklist and explanation. Ann. Intern. Med. 2018, 169, 467–473. [Google Scholar] [CrossRef]
- Wohlin, C. Guidelines for snowballing in systematic literature studies and a replication in software engineering. ACM Int. Conf. Proceeding Ser. 2014, 38, 1–10. [Google Scholar] [CrossRef]
- Jalali, S.; Wohlin, C. Systematic Literature Studies: Database Searches vs. Backward Snowballing. In Proceedings of the ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Lund, Sweden, 20–21 September 2012; pp. 29–38. [Google Scholar] [CrossRef]
- Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159–174. [Google Scholar] [CrossRef]
- Simon, E.I.O.; Vidoni, M.; Fard, F.H. Algorithm Debt: Challenges and Future Paths. In Proceedings of the 2023 IEEE/ACM 2nd International Conference on AI Engineering—Software Engineering for AI (CAIN), Melbourne, Australia, 15–16 May 2023; pp. 90–91. [Google Scholar] [CrossRef]
- Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.; Liang, Y.; Shen, Q.; Jiang, J.; Li, S. Toward Understanding Deep Learning Framework Bugs. ACM Trans. Softw. Eng. Methodol. 2023, 32, 1–31. [Google Scholar] [CrossRef]
- Dilhara, M.; Ketkar, A.; Dig, D. Understanding Software-2.0: A Study of Machine Learning Library Usage and Evolution. ACM Trans. Softw. Eng. Methodol. 2021, 30, 1–42. [Google Scholar] [CrossRef]
- Balhara, S.; Gupta, N.; Alkhayyat, A.; Bharti, I.; Malik, R.Q.; Mahmood, S.N.; Abedi, F. A survey on deep reinforcement learning architectures, applications and emerging trends. IET Commun. 2022, 16, 1–16. [Google Scholar] [CrossRef]
- Serban, A.; Visser, J. An Empirical Study of Software Architecture for Machine Learning. arXiv 2021, arXiv:2105.12422. [Google Scholar]
- Carleton, A.; Shull, F.; Harper, E. Architecting the Future of Software Engineering. Comput. Long. Beach. Calif. 2022, 55, 89–93. [Google Scholar] [CrossRef]
- Franch, X.; Martínez-Fernández, S.; Ayala, C.P.; Gómez, C. Architectural Decisions in AI-Based Systems: An Ontological View. Commun. Comput. Inf. Sci. 2022, 1621, 18–27. [Google Scholar] [CrossRef]
- Zhang, B.; Liu, T.; Liang, P.; Wang, C.; Shahin, M.; Yu, J. Architecture Decisions in AI-based Systems Development: An Empirical Study. In Proceedings of the 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Taipa, Macao, 21–24 March 2023; pp. 616–626. [Google Scholar] [CrossRef]
- Bosch, J.; Olsson, H.H.; Crnkovic, I. Engineering AI systems: A Research Agenda. In Artificial Intelligence Paradigms for Smart Cyber-Physical Systems; Mahmood, Z., Ed.; IGI Global: Hershey, PA, USA, 2021; pp. 1–19. [Google Scholar] [CrossRef]
- Washizaki, H.; Uchida, H.; Khomh, F.; Guéhéneuc, Y.-G. Machine Learning Architecture and Design Patterns. IEEE Softw. 2020, 37, 8. Available online: http://www.washi.cs.waseda.ac.jp/wp-content/uploads/2019/12/IEEE_Software_19__ML_Patterns.pdf (accessed on 18 June 2025).
- Muccini, H.; Vaidhyanathan, K. Software architecture for ML-based Systems: What exists and what lies ahead. In Proceedings of the 2021 IEEE/ACM 1st Workshop on AI Engineering—Software Engineering for AI (WAIN), Madrid, Spain, 30–31 May 2021; pp. 121–128. [Google Scholar] [CrossRef]
- Morgenthaler, J.D.; Gridnev, M.; Sauciuc, R.; Bhansali, S. Searching for build debt: Experiences managing technical debt at Google. In Proceedings of the 2012 Third International Workshop on Managing Technical Debt (MTD), Zurich, Switzerland, 5 June 2012; pp. 1–6. [Google Scholar] [CrossRef]
- Zhang, H.; Cruz, L.; Van Deursen, A. Code Smells for Machine Learning Applications. In Proceedings of the 1st International Conference on AI Engineering: Software Engineering for AI, Pittsburgh, PA, USA, 16–17 May 2022; pp. 217–228. [Google Scholar] [CrossRef]
- Lenarduzzi, V.; Lomio, F.; Moreschini, S.; Taibi, D.; Tamburri, D.A. Software Quality for AI: Where We Are Now? In Lecture Notes in Business Information Processing; Springer International Publishing: Cham, Switzerland, 2021; Volume 404, pp. 43–53. ISBN 9783030658533. [Google Scholar]
- Foidl, H.; Felderer, M.; Biffl, S. Technical Debt in Data-Intensive Software Systems. In Proceedings of the 2019 45th Euromicro Conference on Software Engineering and Advanced Applications, Thessaloniki, Greece, 28–30 August 2019; pp. 338–341. [Google Scholar] [CrossRef]
- Lourenço, R.; Freire, J.; Shasha, D. Debugging machine learning pipelines. In Proceedings of the 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Kallithea, Greece, 28–30 August 2019; p. 10. [Google Scholar] [CrossRef]
- Sculley, D.; Holt, G.; Golovin, D.; Davydov, E.; Phillips, T.; Ebner, D.; Chaudhary, V.; Young, M. Machine Learning: The High-Interest Credit Card of Technical Debt. NIPS 2014 Work. Softw. Eng. Mach. Learn. 2014, 8, 1–9. [Google Scholar]
- Pérez, B.; Castellanos, C.; Correal, D.; Rios, N.; Freire, S.; Spínola, R.; Seaman, C.; Izurieta, C. Technical debt payment and prevention through the lenses of software architects. Inf. Softw. Technol. 2021, 140, 106692. [Google Scholar] [CrossRef]
- O’Leary, K.; Uchida, M. Common Problems with Creating Machine Learning Pipelines from Existing Code. In Proceedings of the Third Conference on Machine Learning and Systems, Bangalore, India, 25–28 October 2023; pp. 1387–1395. [Google Scholar]
- Hu, X.; Chen, Q.; Wang, H.; Xia, X.; Lo, D.; Zimmermann, T. Correlating Automated and Human Evaluation of Code Documentation Generation Quality. ACM Trans. Softw. Eng. Methodol. 2022, 31, 1–28. [Google Scholar] [CrossRef]
- Königstorfer, F.; Thalmann, S. Software documentation is not enough! Requirements for the documentation of AI. Digit. Policy Regul. Gov. 2021, 23, 475–488. [Google Scholar] [CrossRef]
- Roselli, D.; Matthews, J.; Talagala, N. Managing bias in AI. In Proceedings of the Companion Proceedings of the 2019 World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 539–544. [Google Scholar] [CrossRef]
- Muiruri, D.; Lwakatare, L.E.; Nurminen, J.K.; Mikkonen, T. Practices and Infrastructures for Machine Learning Systems: An Interview Study in Finnish Organizations. Comput. Long. Beach. Calif. 2022, 55, 18–29. [Google Scholar] [CrossRef]
- Nahar, N.; Zhang, H.; Lewis, G.; Zhou, S.; Kastner, C. A Meta-Summary of Challenges in Building Products with ML Components—Collecting Experiences from 4758+ Practitioners. In Proceedings of the 2023 IEEE/ACM 2nd International Conference on AI Engineering—Software Engineering for AI (CAIN), Melbourne, Australia, 15–16 May 2023; pp. 171–183. [Google Scholar] [CrossRef]
- Jebnoun, H.; Rahman, M.S.; Khomh, F.; Muse, B.A. Clones in deep learning code: What, where, and why? Empir. Softw. Eng. 2022, 27, 84. [Google Scholar] [CrossRef]
- Alahdab, M.; Çalıklı, G. Empirical Analysis of Hidden Technical Debt Patterns in Machine Learning Software. Lect. Notes Comput. Sci. Incl. Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinform. 2019, 11915, 195–202. [Google Scholar] [CrossRef]
- Lorenzoni, G.; Alencar, P.; Nascimento, N.; Cowan, D. Machine Learning Model Development from a Software Engineering Perspective: A Systematic Literature Review. arXiv 2021, arXiv:2102.07574. [Google Scholar]
- Wang, S.; Huang, L.; Ge, J.; Zhang, T.; Feng, H.; Li, M.; Zhang, H.; Ng, V. Synergy between Machine/Deep Learning and Software Engineering: How Far Are We? arXiv 2020, arXiv:2008.05515. [Google Scholar]
- Siebert, J.; Joeckel, L.; Heidrich, J.; Nakamichi, K.; Ohashi, K.; Namba, I.; Yamamoto, R.; Aoyama, M. Towards guidelines for assessing qualities of machine learning systems. Commun. Comput. Inf. Sci. 2020, 1266, 17–31. [Google Scholar] [CrossRef]
- Bosch, J.; Olsson, H.H.; Crnkovic, I. It takes three to tango: Requirement, outcome/data, and AI driven development. CEUR Workshop Proc. 2018, 2305, 177–192. [Google Scholar]
- Yang, C.; Wang, W.; Zhang, Y.; Zhang, Z.; Shen, L.; Li, Y.; See, J. MLife: A lite framework for machine learning lifecycle initialization. Mach. Learn. 2021, 110, 2993–3013. [Google Scholar] [CrossRef]
- Belani, H.; Vukovic, M.; Car, Z. Requirements engineering challenges in building ai-based complex systems. In Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW), Jeju, Republic of Korea, 23–27 September 2019; pp. 252–255. [Google Scholar] [CrossRef]
- Yan, M.; Xia, X.; Shihab, E.; Lo, D.; Yin, J.; Yang, X. Automating Change-Level Self-Admitted Technical Debt Determination. IEEE Trans. Softw. Eng. 2019, 45, 1211–1229. [Google Scholar] [CrossRef]
- Sharma, R.; Shahbazi, R.; Fard, F.H.; Codabux, Z.; Vidoni, M. Self-admitted technical debt in R: Detection and causes. Autom. Softw. Eng. 2022, 29, 53. [Google Scholar] [CrossRef]
- Mastropaolo, A.; Di Penta, M.; Bavota, G. Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We? In Proceedings of the 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), Luxembourg, 11–15 September 2023; pp. 585–597. [Google Scholar] [CrossRef]
- Sherin, S.; Khan, M.U.; Iqbal, M.Z. A Systematic Mapping Study on Testing of Machine Learning Programs. arXiv 2019, arXiv:1907.09427. [Google Scholar]
- Riccio, V.; Jahangirova, G.; Stocco, A.; Humbatova, N.; Weiss, M.; Tonella, P. Testing machine learning based systems: A systematic mapping. Empir. Softw. Eng. 2020, 25, 5193–5254. [Google Scholar] [CrossRef]
- Shankar, S.; Garcia, R.; Hellerstein, J.M.; Parameswaran, A.G. “We Have No Idea How Models will Behave in Production until Production”: How Engineers Operationalize Machine Learning. Proc. ACM Hum. Comput. Interact. 2024, 8, 1–34. [Google Scholar] [CrossRef]
- Wan, C.; Liu, S.; Hoffmann, H.; Maire, M.; Lu, S. Are machine learning cloud APIs used correctly? In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, 25–28 May 2021; pp. 125–137. [Google Scholar] [CrossRef]
- Falk, J.; Mose Biskjaer, M.; Halskov, K.; Kultima, A. How organisers understand and promote participants’ creativity in Game Jams. In Proceedings of the 6th Annual International Conference on Game Jams, Hackathons, and Game Creation Events, Montreal, QC, Canada, 2 August 2021; pp. 12–21. [Google Scholar] [CrossRef]
- Scott, M.J.; Ghinea, G. Promoting Game Accessibility: Experiencing an Induction on Inclusive Design Practice at the Global Games Jam. arXiv 2013, arXiv:1305.4359. [Google Scholar]
Data Source | Date Range | Search String | Results |
---|---|---|---|
Google Scholar | 2012–2024 | (“Technical Debt” AND (“Artificial Intelligence” OR “AI” OR “Machine Learning” OR “ML”) AND “Software Engineering”) OR (“Technical Debt” AND “AI-Based Systems”) | 214 |
ACM | -//- | “Technical Debt” AND (“Artificial Intelligence” OR “AI” OR “Machine Learning” OR “ML”) AND (“Software Engineering” OR “SE”) OR (“Technical Debt” AND “AI-Based System *”) 1 | 413 |
IEEE Xplore | -//- | (“Technical Debt” AND (“Artificial Intelligence” OR “AI” OR “Machine Learning” OR “ML”) AND (“Software Engineering” OR “SE”)) OR (“Technical Debt” AND “AI-Based Systems”) | 49 |
Scopus | -//- | TITLE-ABS-KEY (“Technical Debt” AND (“Artificial Intelligence” OR “AI” OR “Machine Learning” OR “ML”) AND (“Software Engineering” OR “SE”)) OR TITLE-ABS-KEY (“Technical Debt” AND “AI-Based Systems”) | 46 |
Springer | -//- | (“Technical Debt” AND (“Artificial Intelligence” OR “AI” OR “Machine Learning” OR “ML”) AND (“Software Engineering” OR “SE”)) OR (“Technical Debt” AND “AI-Based Systems”) | 148 |
Technical Debt Type | # Papers | Papers |
---|---|---|
Algorithm | 4 | [114,115,116,117] |
Architectural | 10 | [1,20,118,119,120,121,122,123,124,125] |
Build | 2 | [57,126] |
Code | 10 | [19,21,30,62,65,66,67,69,86,127] |
Configuration | 4 | [3,4,60,64] |
Data | 9 | [27,28,40,41,42,43,44,128,129] |
Defect | 5 | [19,29,130,131,132] |
Design | 4 | [31,37,38,133] |
Documentation | 4 | [74,75,134,135] |
Ethics | 6 | [26,79,82,90,136,153] |
Infrastructure | 3 | [6,50,137] |
Model | 13 | [5,7,46,47,49,52,70,138,139,140,141,142,143] |
People | 3 | [71,76,78] |
Process | 2 | [38,72] |
Requirements | 5 | [34,35,144,145,146] |
Self-Admitted (SATD) | 6 | [63,83,84,147,148,149] |
Test | 4 | [51,55,150,151] |
Versioning | 6 | [48,58,59,60,61,152] |
No. | Technical Debt Type | Primary Stakeholder(s) | Key Impact on Stakeholders | Suggested Mitigation | Code Ref. in Questionnaire |
---|---|---|---|---|---|
1 | Algorithm | Participant | Sub-optimal or unvalidated algorithmic choices may reduce reproducibility, increase complexity, and affect performance consistency. | Encourage use of baseline models, include validation protocols, and require algorithmic documentation. | Q1 |
2 | Architectural | Organizer | Poor modularization and ad hoc integration decisions lead to tight coupling, limited scalability, and long-term maintainability issues in the platform infrastructure. | Apply early refactoring, adopt architectural patterns (e.g., MVC), and document integration points clearly. | Q2–Q3 |
3 | Build | Organizer | Fragile or undocumented build scripts hinder platform portability, onboarding, and collaboration between contributors or instructors. | Use standardized build tools (e.g., Docker, Maven), document build process, and automate CI/CD pipelines. | Q4–Q5 |
4 | Code | Organizer/Participant | Unreadable, duplicated, or overly complex code increases onboarding time, inhibits reuse, and introduces hidden bugs in competition solutions. | Promote coding standards, enforce linters, use peer review and code refactoring practices. | Q6 |
5 | Configuration | Organizer/Participant | Hard-coded or undocumented configuration settings reduce reproducibility, cause deployment failures, and hinder experiment replication. | Use centralized configuration files, version configuration artifacts, and document parameter effects. | Q7–Q9 |
6 | Data | Organizer | Poor-quality, biased, or evolving datasets affect model training validity, generalization, and fair scoring across submissions. | Apply data versioning, include data validation checks, and provide metadata with provenance and bias analysis. | Q10–Q15 |
7 | Design | Organizer | Poor architectural decisions lead to rigid systems that are hard to extend with new tasks, metrics, or pipelines. | Encourage modular design, document architecture decisions, and use design patterns suited for AI/ML platforms. | Q16–Q18 |
8 | Defect | Organizer | Unresolved or recurring bugs in competition submissions affect scoring fairness, participant confidence, and usability of reference implementations. | Implement test-driven development, integrate automated testing frameworks, and provide reproducible bug reports. | Q19–Q23 |
9 | Documentation | Organizer | Incomplete or outdated documentation causes misunderstandings, onboarding delays, and misuse of platform functionalities. | Maintain up-to-date documentation, provide code-level comments, and supply example workflows for both organizers and participants. | Q24–Q28 |
10 | Ethics | Organizer/Participant | May raise concerns regarding fairness, bias, or lack of transparency in evaluation processes or training data disclosure. | Integrate fairness audits, ethics checklists, and stakeholder feedback loops. | Q29–Q30 |
11 | Infrastructure | Organizer | Can cause instability or performance issues due to outdated or insufficient hosting and computing infrastructure. | Use scalable cloud services and regularly monitor infrastructure health. | Q31–Q34 |
12 | Model | Organizer | Leads to reduced reproducibility, maintainability, or performance when models are undocumented, overfitted, or opaque. | Document model architecture, training routines, and evaluation protocols clearly. | Q35–Q37 |
13 | People | Organizer/Participant | Knowledge silos, turnover, or miscommunication can reduce platform consistency and maintainability. | Establish onboarding documents, cross-training, and transparent collaboration norms. | Q38–Q39 |
14 | Process | Organizer | Unstructured or ad hoc processes can cause delays, confusion, or lack of traceability across platform tasks. | Adopt reproducible workflows with defined pipelines, task ownership, and versioning. | Q40–Q42 |
15 | Requirements | Organizer | Missing or unclear requirements may result in mismatched expectations between organizers and participants. | Define clear and testable requirements early, using templates or user stories. | Q43–Q45 |
16 | SATD | Participant | Unaddressed TODOs or FIXME comments in codebases may indicate areas of known debt left unresolved. | Systematically review SATD comments and integrate into refactoring plans. | Q46–Q47 |
17 | Test | Organizer/Participant | Lack of automated or manual testing increases the risk of defects, regressions, and scalability issues. | Develop and maintain testing protocols, including unit and system-level tests. | Q48–Q54 |
18 | Versioning | Organizer | Failure to track platform versions, data updates, or model submissions can undermine reproducibility and auditability. | Introduce version control systems with changelogs and reproducibility tags. | Q55–Q57 |
19 | Accessibility | Organizer | Lack of accessibility features may prevent equal participation, particularly for users with visual, cognitive, or language-related limitations. This can result in reduced inclusiveness, participation drop-off, and limited feedback from diverse user groups. | Provide accessible documentation (e.g., plain language, multilingual support), enforce UI design standards (e.g., contrast ratios, keyboard navigation), and validate platform usability through accessibility audits or participant surveys. | Q58–Q60 |
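The mapping above lends itself to a simple machine-readable representation that organizers could use to drive an automated assessment tool. The Python sketch below is illustrative only and is not part of the published framework: the dictionary name, the helper function, and the encoding of stakeholders and question ranges are assumptions introduced here to show how the table could be operationalized.

```python
# Illustrative sketch (assumed names and layout): encodes the table rows
# "debt type -> primary stakeholder(s) -> questionnaire item range".

QUESTIONNAIRE_MAP = {
    # debt type: (primary stakeholder(s), (first question, last question))
    "Algorithm":      ("Participant",               (1, 1)),
    "Architectural":  ("Organizer",                 (2, 3)),
    "Build":          ("Organizer",                 (4, 5)),
    "Code":           ("Organizer and Participant", (6, 6)),
    "Configuration":  ("Organizer and Participant", (7, 9)),
    "Data":           ("Organizer",                 (10, 15)),
    "Design":         ("Organizer",                 (16, 18)),
    "Defect":         ("Organizer",                 (19, 23)),
    "Documentation":  ("Organizer",                 (24, 28)),
    "Ethics":         ("Organizer and Participant", (29, 30)),
    "Infrastructure": ("Organizer",                 (31, 34)),
    "Model":          ("Organizer",                 (35, 37)),
    "People":         ("Organizer and Participant", (38, 39)),
    "Process":        ("Organizer",                 (40, 42)),
    "Requirements":   ("Organizer",                 (43, 45)),
    "SATD":           ("Participant",               (46, 47)),
    "Test":           ("Organizer and Participant", (48, 54)),
    "Versioning":     ("Organizer",                 (55, 57)),
    "Accessibility":  ("Organizer",                 (58, 60)),
}

def questions_for(debt_type: str) -> list[str]:
    """Return the questionnaire codes (e.g., 'Q10'..'Q15') for a debt type."""
    _, (first, last) = QUESTIONNAIRE_MAP[debt_type]
    return [f"Q{i}" for i in range(first, last + 1)]

if __name__ == "__main__":
    print(questions_for("Data"))  # ['Q10', 'Q11', 'Q12', 'Q13', 'Q14', 'Q15']
```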
Technical Debt Type | Primary Responsible Stakeholder |
---|---|
Algorithm | Participant |
Architectural | Organizer |
Build | Organizer |
Code | Both (Organizer and Participant) |
Configuration | Both (Organizer and Participant) |
Data | Organizer |
Defect | Organizer |
Design | Organizer |
Documentation | Organizer |
Ethics | Both (Organizer and Participant) |
Infrastructure | Organizer |
Model | Organizer |
People | Both (Organizer and Participant) |
Process | Organizer |
Requirements | Organizer |
Self-Admitted (SATD) | Participant |
Test | Both (Organizer and Participant) |
Versioning | Organizer |
Question | Score | Answer | Calculated Score |
---|---|---|---|
Have you conducted usability testing to identify and address potential barriers? | 5 | YES | −5 |
Have you integrated adaptive user interfaces based on participants’ skill levels? | 4 | NO | 4 |
Have you implemented feedback mechanisms to report accessibility issues? | 3 | I Don’t Know / I Don’t Answer | 3 |
Overall Rating | | | 2 |
Question | Score | Answer | Calculated Score |
---|---|---|---|
Are you detecting direct or hidden feedback loops in your model? | 4 | NO | 4 |
Is model quality validated before serving? | 5 | YES | −5 |
Does the model allow debugging by observing step-by-step inference on a single example? | 3 | I Don’t Know / I Don’t Answer | 3 |
Overall Rating | | | 2 |
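The two example tables above share the same aggregation rule: a YES answer subtracts the question's weight from the category total, whereas a NO or an "I Don't Know / I Don't Answer" response adds it, and the overall rating is the sum of the signed scores. The short sketch below reproduces that arithmetic; the function name and the answer encoding are assumptions made for illustration rather than an implementation taken from the published framework.

```python
# Minimal sketch of the per-category scoring illustrated above (assumed
# aggregation: YES -> -weight, NO or DON'T KNOW -> +weight; rating = sum).

def overall_rating(responses):
    """responses: iterable of (weight, answer) pairs, answer in {'YES', 'NO', 'DONT_KNOW'}."""
    total = 0
    for weight, answer in responses:
        if answer == "YES":
            total -= weight  # practice already in place reduces the debt score
        else:                # 'NO' or 'DONT_KNOW' counts toward the debt score
            total += weight
    return total

# Reproduces both example tables: (5, YES), (4, NO), (3, DONT_KNOW) -> 2
print(overall_rating([(5, "YES"), (4, "NO"), (3, "DONT_KNOW")]))
```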
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).