Towards a Reference Architecture for Machine Learning Operations
Abstract
1. Introduction
- RQ1: How has the use of ML evolved in industrial environments, and what role does MLOps play in relation to Industry 4.0/5.0?
- RQ2: What MLOps principles and best practices are reported in the literature for industrial applications?
- RQ3: What are the most frequently cited technical and organisational challenges when adopting MLOps in industrial plants, and how do they affect scalability, legacy system integration, security/privacy, and costs/operations?
- RQ4: What architectures, platforms, and technological tools are used in industrial environments to implement MLOps?
- RQ5: How is the lifecycle of ML models managed in production environments, from development to monitoring and retraining?
- RQ6: What strategies and industrial use cases demonstrate the integration of MLOps with emerging technologies?
Contributions and Novelty
2. Background
- Predictive maintenance scenarios: MLOps coordinates the lifecycle of time-series models that predict machinery failures from vibration signals, acoustic patterns, sensor readings, and historical maintenance records. Synchronising IoT sensor inputs through data pipelines ensures data integrity before these datasets trigger automated training jobs [25,26]. Depending on latency tolerances, the models are then deployed either as online services queried periodically by plant control systems or as components embedded in the machine controllers themselves. A key advantage is that a system's digital representation stays synchronised with its changing conditions, especially when Digital Twin infrastructures extract real-time data streams to recalibrate model parameters [27]. Structuring this within an MLOps framework helps organisations avoid situations in which models become obsolete without any clear operational signal and then fail in service.
- Logistics and supply chain: this sector offers a different context in which MLOps provides measurable strategic value. In this case, the complexity stems not so much from latency constraints as from handling fluctuating demand signals, transport delays, and multi-level supplier dependencies [28]. Predictive models can incorporate data streams ranging from sales forecasts to weather predictions to anticipate potential bottlenecks. The implementation of these models in decision-support layers requires strict governance to ensure that only validated versions with known performance limits are promoted to real-time route optimisation systems [7,29].
- Quality control: This is another manufacturing-focused area where MLOps adoption is accelerating. Automated feedback loops allow thresholds or classifier weights to be recalibrated without manual intervention when batch properties or raw material composition change. This responsiveness depends on containerised deployments that preserve execution consistency on edge devices distributed across multiple facilities [5].
- Advanced robotics: Especially in Industry 5.0 contexts involving human–machine collaboration and the use of cobots, the deployment of machine learning-based control architectures under an MLOps regime adds reliability to autonomy functions while enabling the explainability mechanisms required by safety regulators [25,26]. Robotic arms equipped with force-sensitive grippers can leverage reinforcement learning policies refined offline and continuously evaluated in actual operation via a monitoring interface that provides feedback on reward function fulfilment. If environmental changes degrade success rates beyond acceptable levels, pipelines automatically activate restricted retraining routines with updated real-world simulation datasets.
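The governance requirement in the logistics scenario, where only validated versions with known performance limits are promoted, can be illustrated with a minimal promotion gate. This is a sketch under stated assumptions: the `ModelVersion` record, its field names, the metric choices (MAE and p95 latency) and the limits are all hypothetical and do not come from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVersion:
    """Hypothetical registry record; field names are illustrative only."""
    name: str
    version: int
    validated: bool        # passed the offline validation suite
    mae: float             # mean absolute error on a hold-out set
    p95_latency_ms: float  # 95th percentile inference latency


def can_promote(candidate, max_mae, max_latency_ms):
    """Return True only if the candidate is validated and within both limits."""
    return (
        candidate.validated
        and candidate.mae <= max_mae
        and candidate.p95_latency_ms <= max_latency_ms
    )


def select_for_production(versions, max_mae, max_latency_ms):
    """Pick the newest version that passes the gate, or None if none qualifies."""
    eligible = [v for v in versions if can_promote(v, max_mae, max_latency_ms)]
    return max(eligible, key=lambda v: v.version, default=None)


versions = [
    ModelVersion("route-optimiser", 1, True, 4.2, 80.0),
    ModelVersion("route-optimiser", 2, True, 3.1, 95.0),
    ModelVersion("route-optimiser", 3, False, 2.9, 70.0),  # not yet validated
]
chosen = select_for_production(versions, max_mae=3.5, max_latency_ms=100.0)
print(chosen.version if chosen else "no eligible version")
```

In this toy registry, version 3 has the best error but is skipped because it has not been validated, so the gate falls back to version 2.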
3. Literature Review Methodology
3.1. Review Design and Descriptive Scope
3.2. Bibliographic Sources, Search Strategy and Restrictions
3.3. Descriptive Analysis
3.4. Content Analysis
- Core (mlops-machine learning), which is linked to the rest of the concepts and explains their centrality.
- Operational cluster (devops, dataops, pipeline, with lighter CI/CD), which supports coverage in Principles (20.5%) by reflecting engineering practices focused on traceability and repeatability.
- Industrial/IIoT cluster (Industry 4.0, IIoT, manufacturing, big data, with connections to edge computing), responsible for the highest percentage in Background/Context (34.1%) due to its systematic co-occurrence with the core.
- Lifecycle and quality cluster, visible in implementation and with weaker links to monitoring and concept drift, in line with the moderate coverage of Lifecycle (11.4%) and the fact that some maintenance practices (retraining, model registry, validation/testing) are not explicitly labelled as keywords.
- Future directions cluster (edge computing, federated learning, digital twin, Industry 5.0, big data), which justifies the high coverage in Future (22.7%) and points to distributed architectures and collaborative learning that preserve local data.
3.5. Quality Assessment, Risk of Bias and Threats to Validity
- Clarity of the industrial context and problem formulation.
- Study design and evaluation method (e.g., case study, experiment, deployment, analytical evaluation).
- Adequacy of data and infrastructure description (sources, pre-processing, execution environment).
- Operational detail and lifecycle governance (deployment, observability, retraining, rollback).
- Replicability and transparency (code/artefacts, parameters, sufficient procedures).
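As a rough illustration of how the five criteria could be aggregated, the sketch below sums per-criterion scores for three rows taken from the quality assessment table (P01, P03, P17). The 0 to 2 range is inferred from the values that appear in that table; the plain sum and the normalisation are assumptions made here for illustration, not a scoring rule stated by the review.

```python
CRITERIA = ("C1", "C2", "C3", "C4", "C5")
MAX_PER_CRITERION = 2  # assumed upper bound, inferred from the table values

# Rows copied from the quality assessment table (C1..C5).
scores = {
    "P01": (1, 0, 1, 1, 0),
    "P03": (2, 2, 2, 1, 2),
    "P17": (2, 2, 2, 2, 2),
}


def total_score(row):
    """Sum the per-criterion scores after basic range checks."""
    if len(row) != len(CRITERIA):
        raise ValueError("expected one score per criterion")
    if any(not 0 <= s <= MAX_PER_CRITERION for s in row):
        raise ValueError("score outside the assumed 0..2 range")
    return sum(row)


def normalised(row):
    """Total score as a fraction of the maximum attainable score."""
    return total_score(row) / (len(CRITERIA) * MAX_PER_CRITERION)


for paper_id, row in scores.items():
    print(paper_id, total_score(row), f"{normalised(row):.0%}")
```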
4. Results of the Systematic Review: State of the Art Analysis
4.1. Assessment of the Quality and Maturity of the Evidence
4.2. Requirements Taxonomy for Industrial MLOps
4.3. Life Cycle Capability Analysis and Tools
4.4. Deployment Architectural Patterns
4.5. Research Gap
- The Architectural Gap (Centralisation vs. Physical Reality): As demonstrated by the prevalence of pattern P-01 (65.3%), academic inertia continues to favour centralised architectures inherited from web development, where inference occurs in the cloud. This approach is incompatible with cyber-physical systems that require millisecond decisions and offline operation. Although the hybrid pattern P-02 is theoretically identified as the optimal solution, it lacks a standardised reference implementation that resolves the complexity of synchronising models and metadata between the cloud and a fleet of heterogeneous devices.
- The Closed-Loop Gap (Passive vs. Active Monitoring): There is a disconnect between anomaly detection and corrective action. Most of the architectures reviewed implement monitoring as a passive dashboard for human operators and fail to close the automated continuous training (CT) loop. Without a mechanism that connects on-site drift detection directly to the retraining pipeline in the cloud, industrial models become static assets that depreciate rapidly, raising maintenance costs to unsustainable levels.
- The Replicability Gap (Black Boxes vs. Open Artefacts): Finally, the literature is polarised between proprietary “black box” solutions (hyperscaler platforms) that generate vendor lock-in and simplistic academic proofs of concept that do not scale. The almost total absence (81.6%) of open-source technology stacks that integrate industrial-grade components (such as Kubernetes, Kafka, or MLflow) into a coherent architecture forces professionals to reinvent system integration from scratch for each project.
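The closed-loop gap can be made concrete with a small sketch in which a drift score computed on site directly triggers a retraining request instead of only feeding a dashboard. The Population Stability Index (PSI) is used here as one possible drift measure; it is a choice made for this example, not the method of the reviewed studies. The function names, the equal-width binning and the 0.2 threshold (a commonly quoted rule of thumb) are assumptions.

```python
import math


def psi(reference, current, bins=10):
    """Population Stability Index between two samples, using equal-width
    bins over the reference range. Empty bins are floored at a small
    epsilon to avoid log(0)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0
    eps = 1e-6

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        return [max(c / len(sample), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def check_and_trigger(reference, current, trigger_retraining, threshold=0.2):
    """Call trigger_retraining(score) when the drift score exceeds the threshold."""
    score = psi(reference, current)
    if score > threshold:
        trigger_retraining(score)
        return True
    return False


# Toy data: a sensor reading whose distribution shifts upwards by 5 units.
reference = [20.0 + 0.1 * i for i in range(100)]
drifted = [x + 5.0 for x in reference]

requests = []  # stands in for a message sent to the cloud retraining pipeline
print(check_and_trigger(reference, reference, requests.append))
print(check_and_trigger(reference, drifted, requests.append))
```

In a real deployment the callback would publish a retraining request to the cloud pipeline rather than append to a list; that transport is deliberately left out of the sketch.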
5. Hybrid Reference Architecture and Implementation Stack
5.1. Logical View: Dual Loop and Lifecycle Automation
5.2. Development View: Open Implementation Technology Stack
5.3. Process View: Data Flow and Continuous Adaptation Mechanism
5.4. Evaluation View: Evaluation Framework and Industrial Alignment
6. Implementation and Feasibility-Oriented Validation: Predictive Maintenance Use Case
- MLflow: used as an experiment tracking and model registry system. Each trained model version is stored with its metrics and parameters, facilitating traceability and retrieval.
- Apache Airflow: responsible for orchestrating data and machine learning pipelines through automated DAGs.
- Python: the base language for large-scale training and advanced analytics.
- MinIO: deployed as S3-compatible object storage for historical data and artefacts.
- Prometheus: integrated for real-time monitoring of infrastructure metrics and model performance at the Edge, with samples kept in its own time-series (non-relational) store.
- TimescaleDB/PostgreSQL: used for efficient time series storage, enabling fast analytical queries on historical machinery behaviour.
7. Discussion: Summary and Positioning of the Proposal
7.1. From Experimentation to Resilient Operation (RQ1, RQ2, RQ3)
7.2. Comparative Analysis with Consolidated Frameworks (RQ4, RQ5)
7.3. MLOps as an Enabler of Industry 5.0 and Emerging Technologies (RQ6)
7.4. Academic and Managerial Implications
8. Conclusions and Future Work
8.1. Conclusions
8.2. Future Research Directions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Papageorgiou, A.V.; Symeonidis, G.; Nerantzis, E.; Papakostas, G.A. Agile MLOps: Bridging the Gap Between Agility and Machine Learning Operations. In Proceedings of the IFIP International Conference on Artificial Intelligence Applications and Innovations, Limassol, Cyprus, 26–29 June 2025; Springer Nature: Cham, Switzerland, 2025; pp. 15–27.
- Mateo-Casalí, M.A.; Boza, A.; Fraile, F. Digital assets in zero-defect manufacturing: Literature review and proposed framework. Int. J. Prod. Res. 2025, 1–28.
- Mohammed, W.M.; Ferrer, B.R.; Martinez, J.L.; Sanchis, R.; Andres, B.; Agostinho, C. A Multi-Agent Approach for Processing Industrial Enterprise Data. In Proceedings of the 2017 International Conference on Engineering, Technology and Innovation (ICE/ITMC), Funchal, Portugal, 27–29 June 2017; IEEE: New York, NY, USA, 2018; pp. 1209–1215.
- Paul, A.; Son, R.Y.; Balodi, S.A.; Crooks, K. MLOps FMEA: A proactive & structured approach to mitigate failures and ensure success for machine learning operations. In Proceedings of the 2024 Annual Reliability and Maintainability Symposium (RAMS), Albuquerque, NM, USA, 22–25 January 2024; IEEE: New York, NY, USA, 2024; pp. 1–7.
- Oyucu, S.; Aksöz, A. Integrating machine learning and MLOps for wind energy forecasting: A comparative analysis and optimization study on Türkiye’s wind data. Appl. Sci. 2024, 14, 3725.
- Venanzi, R.; Dahdal, S.; Solimando, M.; Campioni, L.; Cavalucci, A.; Govoni, M.; Tortonesi, M.; Foschini, L.; Attana, L.; Tellarini, M.; et al. Enabling adaptive analytics at the edge with the Bi-Rex Big Data platform. Comput. Ind. 2023, 147, 103876.
- Colombi, L.; Gilli, A.; Dahdal, S.; Boleac, I.; Tortonesi, M.; Stefanelli, C.; Vignoli, M. A machine learning operations platform for streamlined model serving in industry 5.0. In Proceedings of the NOMS 2024–2024 IEEE Network Operations and Management Symposium, Seoul, South Korea, 6–10 May 2024; IEEE: New York, NY, USA, 2024.
- Raffin, T.; Reichenstein, T.; Werner, J.; Kühl, A.; Franke, J. A reference architecture for the operationalization of machine learning models in manufacturing. Procedia CIRP 2022, 115, 130–135.
- Zimelewicz, E.; Kalinowski, M.; Mendez, D.; Giray, G.; Santos Alves, A.P.; Lavesson, N.; Azevedo, K.; Villamizar, H.; Escovedo, T.; Lopes, H.; et al. Ml-enabled systems model deployment and monitoring: Status quo and problems. In Proceedings of the International Conference on Software Quality, Vienna, Austria, 23–25 April 2024; Springer Nature: Cham, Switzerland, 2024; pp. 112–131.
- Schreier, U.; Reimann, P.; Mitschang, B. A Kanban-based approach to manage machine learning projects in manufacturing. Procedia CIRP 2025, 134, 109–114.
- Dahdal, S.; Tortonesi, M. Enabling Big Data and Machine Learning Applications in High-Stakes Environments. In Proceedings of the NOMS 2024–2024 IEEE Network Operations and Management Symposium, Seoul, South Korea, 6–10 May 2024; IEEE: New York, NY, USA, 2024.
- Wewer, C.R.; Mahapatra, H.; Esterle, L.; Larsen, P.G. Using FactoryML for Deployment of Machine Learning Models in Industrial Production. In Proceedings of the 2024 IEEE 29th International Conference on Emerging Technologies and Factory Automation (ETFA), Padova, Italy, 10–13 September 2024; IEEE: New York, NY, USA, 2024.
- Faubel, L.; Woudsma, T.; Methnani, L.; Ghezeljhemeidan, A.G.; Buelow, F.; Schmid, K.; Van Driel, W.D.; Kloepper, B.; Theodorou, A.; Nosratinia, M.; et al. A mlops architecture for XAI in industrial applications. In Proceedings of the 2024 IEEE 29th International Conference on Emerging Technologies and Factory Automation (ETFA), Padova, Italy, 10–13 September 2024; IEEE: New York, NY, USA, 2024.
- Bachinger, F.; Zenisek, J.; Affenzeller, M. Automated machine learning for industrial applications–challenges and opportunities. Procedia Comput. Sci. 2024, 232, 1701–1710.
- Marinova, S.; Tian, Y.; Leon-Garcia, A. E2E network slice assurance for B5G/6G: Realizing data collection and management, MLOps, and closed-loop control. IEEE Open J. Commun. Soc. 2025, 6, 759–774.
- Martínez-Arellano, G.; Ratchev, S. Towards Frugal Industrial AI: A framework for the development of scalable and robust machine learning models in the shop floor. Int. J. Adv. Manuf. Technol. 2025, 138, 169–191.
- Rigas, S.; Tzouveli, P.; Kollias, S. An end-to-end deep learning framework for fault detection in marine machinery. Sensors 2024, 24, 5310.
- Ruf, P.; Reich, C.; Ould-Abdeslam, D. Aspects of module placement in machine learning operations for cyber physical systems. In Proceedings of the 2022 11th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 7–11 June 2022; IEEE: New York, NY, USA, 2022; pp. 1–6.
- Watson, H.J.; Larson, D. MLOps: From a Cottage Industry to a Factory Approach. Int. J. Bus. Intell. Res. IJBIR 2024, 15, 1–22.
- Antonini, M.; Pincheira, M.; Vecchio, M.; Antonelli, F. An adaptable and unsupervised TinyML anomaly detection system for extreme industrial environments. Sensors 2023, 23, 2344.
- Metcalfe, B.; Acosta-Pavas, J.C.; Robles-Rodriguez, C.E.; Georgakilas, G.K.; Dalamagas, T.; Aceves-Lara, C.A.; Daboussi, F.; Koehorst, J.J.; Corrales, D.C. Towards a machine learning operations (MLOps) soft sensor for real-time predictions in industrial-scale fed-batch fermentation. Comput. Chem. Eng. 2025, 194, 108991.
- Kreuzberger, D.; Kühl, N.; Hirschl, S. Machine learning operations (mlops): Overview, definition, and architecture. IEEE Access 2023, 11, 31866–31879.
- Grilo, A.; Figueiras, P.; Rêga, B.; Lourenço, L.; Khodamoradi, A.; Costa, R.; Jardim-Gonçalves, R. Data analytics environment: Combining visual programming and mlops for ai workflow creation. In Proceedings of the 2024 IEEE International Conference on Engineering, Technology, and Innovation (ICE/ITMC), Funchal, Portugal, 24–28 June 2024; IEEE: New York, NY, USA, 2024; pp. 1–9.
- Maier, R.; Schlattl, A.; Guess, T.; Mottok, J. CausalOps—Towards an industrial lifecycle for causal probabilistic graphical models. Inf. Softw. Technol. 2024, 174, 107520.
- Hegedűs, C.; Varga, P. Tailoring mlops techniques for industry 5.0 needs. In Proceedings of the 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 30 October–2 November 2023; IEEE: New York, NY, USA, 2023; pp. 1–7.
- Varga, P.; Kővári, Á.; Herkules, M.; Hegedűs, C. MLOps in CPS–a use-case for image recognition in changing industrial settings. In Proceedings of the NOMS 2024–2024 IEEE Network Operations and Management Symposium, Seoul, South Korea, 6–10 May 2024; IEEE: New York, NY, USA, 2024.
- Kruschinski, D.; Ngassam, D.T.; Durak, U.; Hartmann, S. An MLOps Framework to Data-Driven Modelling of Digital Twins with an Application to Virtual Test Rigs. In Proceedings of the International Conference on Conceptual Modeling, Pittsburgh, PA, USA, 28–31 October 2024; Springer Nature: Cham, Switzerland, 2024; pp. 71–86.
- Mateo-Casalí, M.Á.; Gil, F.F.; Boza, A.; Nazarenko, A. An Industry Maturity Model for Implementing Machine Learning Operations in Manufacturing. Int. J. Prod. Manag. Eng. 2023, 11, 179–186.
- Chakraborty, A.; Das, S.; Gary, K. Machine Learning Operations: A Mapping Study. In Proceedings of the World Congress in Computer Science, Computer Engineering & Applied Computing, Las Vegas, NV, USA, 22–25 July 2024; Springer Nature: Cham, Switzerland, 2024; pp. 3–21.
- Leest, J.; Gerostathopoulos, I.; Raibulet, C. Evolvability of machine learning-based systems: An architectural design decision framework. In Proceedings of the 2023 IEEE 20th International Conference on Software Architecture Companion (ICSA-C), L’Aquila, Italy, 13–17 March 2023; IEEE: New York, NY, USA, 2023; pp. 106–110.
- Faubel, L.; Woudsma, T.; Kloepper, B.; Eichelberger, H.; Buelow, F.; Schmid, K.; Ghezeljehmeidan, A.G.; Methnani, L.; Theodorou, A.; Bång, M. MLOps for Cyberphysical Production Systems: Challenges and Solutions. IEEE Softw. 2024, 42, 65–73.
- Andres, B.; Diaz-Madronero, M.; Soares, A.L.; Poler, R. Enabling Technologies to Support Supply Chain Logistics 5.0. IEEE Access 2024, 12, 43889–43906.
- Rani, F.; Chollet, N.; Vogt, L.; Urbas, L. Industrial Edge MLOps: Overview and Challenges. Comput. Aided Chem. Eng. 2024, 53, 3019–3024.
- Chadli, K.; Botterweck, G.; Saber, T. The environmental cost of engineering machine learning-enabled systems: A mapping study. In Proceedings of the 4th Workshop on Machine Learning and Systems, Athens, Greece, 22 April 2024; ACM: New York, NY, USA, 2024; pp. 200–207.
- Raffin, T.; Reichenstein, T.; Klier, D.; Kühl, A.; Franke, J. Qualitative assessment of the impact of manufacturing-specific influences on machine learning operations. Procedia CIRP 2022, 115, 136–141.
- Safdar, M.; Paul, P.P.; Lamouche, G.; Wood, G.; Zimmermann, M.; Hannesen, F.; Bescond, C.; Wanjara, P.; Zhao, Y.F. Fundamental requirements of a machine learning operations platform for industrial metal additive manufacturing. Comput. Ind. 2024, 154, 104037.
- ISO/IEC/IEEE 42010:2022; Software, Systems and Enterprise Architecture Description. ISO: Geneva, Switzerland, 2022.
- von Hahn, T.; Mechefske, C.K. Machine learning in cnc machining: Best practices. Machines 2022, 10, 1233.
- Manickam, D.D.; Mohamed, S.; Jain, V.; Goswami, D.; Lensink, L. A structured inference optimization approach for vision-based DNN deployment on legacy systems. In Proceedings of the 2023 IEEE 28th International Conference on Emerging Technologies and Factory Automation (ETFA), Sinaia, Romania, 12–15 September 2023; IEEE: New York, NY, USA, 2023; pp. 1–8.
- Sood, I.; Kaushik, A.; Bulgerin, T.; Kumar, P.; Rath, S.; Khemiri, A.; Chang, J.; Hsu, S.; Bedorf, J. Supporting fab operations using multi-agent reinforcement learning. In Proceedings of the 2024 35th Annual SEMI Advanced Semiconductor Manufacturing Conference (ASMC), Albany, NY, USA, 13–16 May 2024; IEEE: New York, NY, USA, 2024; pp. 1–6.
- Bodor, A.; Hnida, M.; Najima, D. MLOps: Overview of current state and future directions. In Proceedings of the International Conference on Smart City Applications, Castelo Branco, Portugal, 19–21 October 2022; Springer International Publishing: Cham, Switzerland, 2022; pp. 156–165.
- Garrone, A.; Minisi, S.; Oneto, L.; Dambra, C.; Borinato, M.; Sanetti, P.; Vignola, G.; Papa, F.; Mazzino, N.; Anguita, D. Simple non regressive informed machine learning model for prescriptive maintenance of track circuits in a subway environment. In Proceedings of the International Conference on System-Integrated Intelligence, Genova, Italy, 7–9 September 2022; Springer International Publishing: Cham, Switzerland, 2022; pp. 74–83.
- Li, P.; Mavromatis, I.; Khan, A. Past, present, future: A comprehensive exploration of ai use cases in the umbrella iot testbed. In Proceedings of the 2024 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events (PerCom Workshops), Biarritz, France, 11–15 March 2024; IEEE: New York, NY, USA, 2024; pp. 787–792.
- Chen, H.; Liu, C.T.; Hsu, H.Y.; Hsieh, J.Y. A Federated implementation for MLOps framework based on non-intrusive load monitoring. In Proceedings of the 2023 IEEE 5th Eurasia Conference on IOT, Communication and Engineering (ECICE), Yunlin, Taiwan, 27–29 October 2023; IEEE: New York, NY, USA, 2023; pp. 284–289.
- Luley, P.P.; Deriu, J.M.; Yan, P.; Schatte, G.A.; Stadelmann, T. From concept to implementation: The data-centric development process for AI in industry. In Proceedings of the 2023 10th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 22–23 June 2023; IEEE: New York, NY, USA, 2023; pp. 73–76.
- Amirkhanova, G.; Amirkhanov, B.; Tyulepberdinova, G.; Ishmurzin, T. Application of Machine Learning Algorithms in Digital Twin Monitoring Systems: An Overview of Approaches, Methods, and Prospects. In Proceedings of the 2024 International Conference on Intelligent Computing and Next Generation Networks (ICNGN), Bangkok, Thailand, 23–25 November 2024; IEEE: New York, NY, USA, 2024; pp. 1–5.
- Martínez-Arellano, G.; Ratchev, S. Improving the development and reusability of Industrial AI through Semantic Models. In Proceedings of the Conference on Learning Factories, Enschede, The Netherlands, 17–19 April 2024; Springer Nature: Cham, Switzerland, 2024; pp. 179–186.
- Liao, Q.; Kesters, M.; Landuyt, D.V.; Joosen, W. Data Chameleon: A Self-adaptive Synthetic Data Management System. In Proceedings of the IFIP Annual Conference on Data and Applications Security and Privacy, Gjøvik, Norway, 23–24 June 2025; Springer Nature: Cham, Switzerland, 2025; pp. 44–56.
- Kukkaro, A.; Moreschini, S.; Hästbacka, D. Continuous Training vs. Transfer Learning on Edge and Fog Environments: A Steam Detection use Case. In Proceedings of the 2024 50th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Paris, France, 28–30 August 2024; IEEE: New York, NY, USA, 2024; pp. 138–141.
- Cheng, Q.; Long, G. Federated learning operations (flops): Challenges, lifecycle and approaches. In Proceedings of the 2022 International Conference on Technologies and Applications of Artificial Intelligence (TAAI), Tainan, Taiwan, 1–3 December 2022; IEEE: New York, NY, USA, 2022; pp. 12–17.
- Bayram, F.; Ahmed, B.S. Towards trustworthy machine learning in production: An overview of the robustness in mlops approach. ACM Comput. Surv. 2025, 57, 1–35.
- Antonini, M.; Pincheira, M.; Vecchio, M.; Antonelli, F. Tiny-MLOps: A framework for orchestrating ML applications at the far edge of IoT systems. In Proceedings of the 2022 IEEE international Conference on Evolving and Adaptive Intelligent Systems (EAIS), Larnaca, Cyprus, 25–27 May 2022; IEEE: New York, NY, USA, 2022; pp. 1–8.

| Paper ID | Title | Year |
|---|---|---|
| P01 | A reference architecture for the operationalisation of machine learning models in manufacturing | 2022 |
| P02 | Federated Learning Operations (FLOps): Challenges, Lifecycle and Approaches | 2022 |
| P03 | Machine Learning in CNC Machining: Best Practices | 2022 |
| P04 | Tiny-MLOps: a framework for orchestrating ML applications at the far edge of IoT systems | 2022 |
| P05 | Qualitative assessment of the impact of manufacturing-specific influences on Machine Learning Operations | 2022 |
| P06 | A Federated implementation for the MLOps framework based on non-intrusive load monitoring | 2023 |
| P07 | An Adaptable and Unsupervised TinyML Anomaly Detection System for Extreme Industrial Environments | 2023 |
| P08 | An Industry Maturity Model for Implementing Machine Learning Operations in Manufacturing | 2023 |
| P09 | Data Analytics Environment Combining Visual Programming and MLOps for AI workflow creation | 2023 |
| P10 | Enabling adaptive analytics at the edge with the Bi-Rex Big Data platform | 2023 |
| P11 | Evolvability of Machine Learning-based Systems: An Architectural Design Decision Framework | 2023 |
| P12 | From Concept to Implementation: The Data-Centric Development Process for AI in Industry | 2023 |
| P13 | Machine Learning Operations (MLOps): Overview, Definition, and Architecture | 2023 |
| P14 | MLOps: Overview of Current State and Future Directions | 2023 |
| P15 | Tailoring MLOps Techniques for Industry 5.0 Needs | 2023 |
| P16 | Simple Non-Regressive Informed Machine Learning Model for Prescriptive Maintenance of Track Circuits… | 2023 |
| P17 | A Machine Learning Operations Platform for Streamlined Model Serving in Industry 5.0 | 2024 |
| P18 | An MLOps Architecture for XAI in Industrial Applications | 2024 |
| P19 | A Structured Inference Optimisation Approach for Vision-Based DNN Deployment on Legacy Systems | 2024 |
| P20 | An End-to-End Deep Learning Framework for Fault Detection in Marine Machinery | 2024 |
| P21 | CausalOps—Towards an industrial lifecycle for causal probabilistic graphical models | 2024 |
| P22 | Application of Machine Learning Algorithms in Digital Twin Monitoring Systems: An Overview of Approaches, Methods, and Prospects | 2024 |
| P23 | Aspects of Module Placement in Machine Learning Operations for Cyber-Physical Systems | 2024 |
| P24 | An MLOps Framework to Data-Driven Modelling of Digital Twins with an Application to Virtual Test Rigs | 2024 |
| P25 | Automated Machine Learning for Industrial Applications—Challenges and Opportunities | 2024 |
| P26 | Continuous Training vs. Transfer Learning on Edge and Fog Environments: A Steam Detection Use Case | 2024 |
| P27 | Deploying a Sustainable Deep Learning Pipeline for Poison Ivy Image Classification | 2024 |
| P28 | E2E Network Slice Assurance for B5G/6G: Realising Data Collection and Management, MLOps and Closed Loop Control | 2024 |
| P29 | Enabling Big Data and Machine Learning Applications in High-Stakes Environments | 2024 |
| P30 | Fundamental requirements of a machine learning operations platform for industrial metal additive manufacturing | 2024 |
| P31 | Improving the Development and Reusability of Industrial AI Through Semantic Models | 2024 |
| P32 | Industrial Edge MLOps: Overview and Challenges | 2024 |
| P33 | Integrating Machine Learning and MLOps for Wind Energy Forecasting | 2024 |
| P34 | ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems | 2024 |
| P35 | MLOps FMEA: A Proactive & Structured Approach to Mitigate Failures | 2024 |
| P36 | MLOps for Cyberphysical Production Systems: Challenges and Solutions | 2024 |
| P37 | MLOps in CPS: a use-case for image recognition in changing industrial settings | 2024 |
| P38 | MLOps: From a cottage industry to a factory approach | 2024 |
| P39 | Past, Present, Future: A Comprehensive Exploration of AI Use Cases in the UMBRELLA IoT Testbed | 2024 |
| P40 | The Environmental Cost of Engineering Machine Learning-Enabled Systems: A Mapping Study | 2024 |
| P41 | Using FactoryML for the Deployment of Machine Learning Models in Industrial Production | 2024 |
| P42 | Supporting Factory Operations Using Multi-Agent Reinforcement Learning | 2024 |
| P43 | A Kanban-based Approach to Managing Machine Learning Projects in Manufacturing | 2025 |
| P44 | Agile MLOps: Bridging the Gap Between Agility and Machine Learning Operations | 2025 |
| P45 | Machine Learning Operations: A Mapping Study | 2025 |
| P46 | Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach | 2025 |
| P47 | Towards Frugal Industrial AI: a framework for the development of scalable and robust machine learning models | 2025 |
| P48 | Towards a machine learning operations (MLOps) soft sensor for real-time predictions in industrial-scale fed-batch fermentation | 2025 |
| P49 | Data Chameleon: A Self-adaptive Synthetic Data Management System | 2025 |

| Paper ID | C1 (context) | C2 (design/evaluation) | C3 (data/infrastructure) | C4 (operations/governance) | C5 (replicability) | Brief Justification |
|---|---|---|---|---|---|---|
| P01 | 1 | 0 | 1 | 1 | 0 | Conceptual proposal for reference architecture. Describes necessary components (Docker, MQTT) but does not present an experimental implementation or validation with real data. |
| P02 | 1 | 0 | 0 | 1 | 0 | Methodological proposal. Coins the term “FLOps”. Defines the life cycle and challenges of MLOps in federated (cross-silo) environments. Purely theoretical/conceptual, with no case study or technical implementation. |
| P03 | 2 | 2 | 2 | 1 | 2 | Excellent replicability. Real-world case of CNC tool wear + public dataset. Shares code and data. Focuses on model construction “best practices,” although it acknowledges that continuous deployment (C4) is a future task. |
| P04 | 2 | 2 | 1 | 2 | 1 | TinyML/Edge. Extends MLOps to microcontrollers (Far Edge). Evaluates deployment and inference latencies on limited devices. Addresses the challenge of orchestration on low-power hardware. |
| P05 | 1 | 0 | 0 | 1 | 0 | Qualitative study. Cross-references MLOps capabilities with manufacturing requirements through literature review. Identifies semantic gaps between OT and IT but does not present implementation or data. |
| P06 | 2 | 2 | 2 | 2 | 1 | Clear technical case (NILM/Smart Grid) with public dataset (AMPds). Federated implementation with GitHub Actions and Docker. The code is mentioned, but there is no direct link to the specific repo. |
| P07 | 2 | 2 | 2 | 1 | 1 | Proposes an unsupervised TinyML system for anomaly detection in extreme industrial environments running on a microcontroller (e.g., ESP32), reporting “real” edge metrics (memory, inference/training times, footprint), with a practical approach to deployment. |
| P08 | 1 | 1 | 1 | 2 | 0 | Maturity model (IMM-MLOps) validated by experts (interviews). Very strong in defining operational practices (C4), but without technical validation through actual deployment. |
| P09 | 2 | 2 | 2 | 2 | 1 | No-Code platform for SMEs. Use case: Injection moulding (real data). Integrates Node-RED with MLflow and Docker. Supports full cycle but does not include public repo. |
| P10 | 2 | 1 | 2 | 1 | 1 | It presents an OT/IT industrial platform for adaptive analysis, articulating an OT layer at the industrial edge (close to the machine) and an IT layer with services and storage (MQTT/Kafka-type connectivity), useful as a reference architecture for interoperability and deployment in the plant. |
| P11 | 1 | 1 | 0 | 1 | 0 | Architectural Framework. Proposes a decision-making framework for managing evolvability (drift). Focuses on design strategy (“when to retrain”), illustrated with examples, but without detailed actual deployment. |
| P12 | 2 | 2 | 2 | 2 | 1 | Data-Centric AI for SMEs. Describes a specific process applied to manufacturing/machining. Detailed implementation using Airflow, DVC, and MLflow. Excellent description of the data lifecycle and management. |
| P13 | 1 | 0 | 0 | 0 | 0 | Fundamental reference. Defines MLOps through mixed review and expert interviews. Establishes principles and roles. Being theoretical/definitional, it scores low on technical implementation but is key to the theoretical framework. |
| P14 | 1 | 0 | 0 | 1 | 0 | Overview. Introduces basic concepts, tools (MLflow, Kubeflow) and the lifecycle. It is introductory, with no novel contributions in terms of design, data or implementation. |
| P15 | 1 | 1 | 1 | 1 | 0 | Architectural Proposal. Proposes the “Olympics Model” to integrate MLOps with Systems Engineering (CPS). It is based on the requirements of the AIMS 5.0 project, but the article is conceptual/propositional without detailed experimental validation. |
| P16 | 2 | 2 | 2 | 2 | 1 | Real case (Hitachi Rail). Metro track maintenance. It stands out in C4 for addressing a critical operational problem: ensuring that weekly retraining is not regressive (does not introduce new errors), using constraints in XGBoost. |
| P17 | 2 | 2 | 2 | 2 | 2 | Real case study at a gear manufacturing company (Bonfiglioli). Complete MLOps infrastructure (K8s, Jenkins) and comparative performance evaluation (BentoML vs. TorchServe). Code available. |
| P18 | 1 | 1 | 1 | 2 | 2 | Architecture focused on XAI and feedback loops. General industrial context (EXPLAIN project). It excels in operation (C4) and has a repository in GitLab, although validation is preliminary. |
| P19 | 2 | 2 | 2 | 1 | 2 | Specific industrial case: deployment of vision on legacy hardware. Clear technical methodology (OpenVINO, quantisation). Repository available. |
| P20 | 2 | 2 | 2 | 1 | 2 | Clear naval context with real ship data. Complete (end-to-end) fault detection pipeline. The focus is on modelling, with less detail on continuous operation (feedback loops). |
| P21 | 2 | 1 | 1 | 2 | 1 | Defines an “industrial” lifecycle for causal models (causal PGMs) with an emphasis on roles, phases, artefacts and governance, supported by practical experience (e.g., automotive/safety), which makes it very robust for arguing organisational maturity. |
| P22 | 1 | 0 | 0 | 0 | 0 | Review. Analyses ML methods for Digital Twins from a theoretical perspective. Proposes a general monitoring scheme but does not present experimental implementation, proprietary data or deployment. |
| P23 | 2 | 2 | 1 | 1 | 1 | Addresses distributed deployment (Edge vs. Cloud) in CPS. Validated through Proof of Concept (PoC) in a factory simulation (Fischertechnik). Infrastructure described (K3s, Zenoh), but synthetic simulation data. |
| P24 | 2 | 2 | 2 | 2 | 1 | Proposes an MLOps framework for data-driven Digital Twins with a very explicit stack (e.g., Kubeflow Pipelines/Katib/MinIO + MQTT/ETL-type ingestion + REST/gRPC serving), incorporates monitoring and triggers for degradation or distribution changes to update/retrain the model, and also evaluates operational aspects (including serving load/latency tests). |
| P25 | 2 | 1 | 1 | 1 | 1 | Work focused on the requirements and challenges of industrialising AutoML/ML (data heterogeneity, drift, monitoring, interpretability, functional safety, traceability), with support from industrial experience/partners. |
| P26 | 2 | 2 | 2 | 2 | 1 | Clear industrial case (detection of steam leaks in sterilisation). Experimentally compares retraining strategies (CT vs. TL) in Edge. Detailed infrastructure (K3s), but no link to the repo. |
| P27 | 1 | 2 | 2 | 1 | 2 | Green AI approach and quantisation at the edge (Jetson/Rpi). Non-industrial-manufacturing context (environmental), therefore C1 = 1. Notable for having code available. |
| P28 | 2 | 2 | 2 | 2 | 1 | Telecom sector (5G slicing). Implements a closed loop to ensure SLAs through automatic retraining. Very comprehensive architecture (ZSM, Kafka, MinIO). |
| P29 | 1 | 0 | 0 | 0 | 0 | Doctoral Symposium. Describes the research plan (PhD journey) on data management in critical environments (Humanitarian/Industry 5.0). It is a conceptual proposal with no implementation or validation reported yet. |
| P30 | 2 | 1 | 1 | 2 | 0 | Requirements Engineering. Defines the functional requirements and roles for a specific MLOps platform in Metal Additive Manufacturing (MAM). Very valuable for defining the operational architecture (C4), although it does not implement the final platform. |
| P31 | 1 | 1 | 1 | 1 | 0 | Proposes a semantic framework (ontology) to capture context and facilitate reuse. Validation through a preliminary conceptual “scenario,” with no reported full industrial deployment. |
| P32 | 1 | 0 | 0 | 0 | 0 | Review and survey. Analyses tools and defines a base architecture for Edge MLOps. Useful for a theoretical framework, but does not present experimentation or proprietary data. |
| P33 | 2 | 2 | 2 | 2 | 1 | Real case with SCADA data from a wind turbine in Turkey. Implements End-to-End MLOps pipeline (Docker, Jenkins) and measures inference latencies and accuracy (RMSE). Very comprehensive from a technical standpoint. |
| P34 | 1 | 1 | 1 | 1 | 2 | Survey of 188 professionals. Identifies actual practices and problems (e.g., legacy integration). Does not implement a system but stands out for sharing the dataset of responses and scripts (Open Science). |
| P35 | 2 | 2 | 1 | 2 | 1 | Risk Management. Adapts FMEA (Failure Mode and Effects Analysis) to the CRISP-ML(Q) life cycle. Validated with a Predictive Maintenance use case (maintenance text classification). |
| P36 | 2 | 1 | 1 | 2 | 1 | Multi-sector experience. Describes challenges and solutions based on three real scenarios (electronics manufacturing, metallurgy, chemistry). Proposes the “oktoflow” platform and discusses Edge/On-premises architectures. |
| P37 | 2 | 2 | 2 | 2 | 1 | Technical Implementation. Security use case (geofencing of humans/forklifts). Automated pipeline with Jenkins, Docker, and YOLOv5. Details the retraining and deployment flow (CI/CD/CT). |
| P38 | 1 | 1 | 1 | 1 | 0 | Book chapter/tutorial. Uses an analogy (“Craft Industry vs. Factory”) and an e-commerce scenario to illustrate concepts. Describes roles and processes well, but is not a real industrial manufacturing case study and does not present technical experimentation. |
| P39 | 2 | 2 | 2 | 2 | 1 | Real Implementation (Testbed). Describes four use cases (smart lighting, digital twin, etc.) in a real IoT network with 200 nodes. Deployment with Kubernetes and Federated Learning. Dataset published, but does not provide a direct link to the system code. |
| P40 | 1 | 0 | 0 | 0 | 0 | Systematic Mapping (SMS). Analyses 52 studies on the energy cost of MLOps. Useful for identifying sustainability metrics (Green AI), but as it is a secondary review, it scores low on its own technical implementation. |
| P41 | 2 | 2 | 1 | 2 | 2 | Open-Source Framework. Proposes “FactoryML” to deploy models in PLCs and air-gapped environments. Real-world case study. Notable for C5 = 2 (code available) and focus on rigid industrial infrastructure. |
| P42 | 2 | 2 | 2 | 2 | 0 | Production deployment (Micron). Use of RL for scheduling in semiconductor factory. Detailed MLOps section: automatic retraining, acceptance tests (UAT) and cluster deployment. Actual throughput improvement results (+2%). |
| P43 | 1 | 1 | 0 | 1 | 0 | Project management approach (adapted Kanban). Validation based on a simulated “use case” rather than on a technical implementation with real data or infrastructure. |
| P44 | 1 | 1 | 0 | 1 | 0 | Theoretical methodological proposal based on literature review. Analyses the integration of Scrum/Kanban in MLOps but lacks technical implementation or empirical validation. |
| P45 | 1 | 0 | 0 | 0 | 0 | Mapping Study. Classifies 32 studies into Data, Modelling and Deployment pipelines. Useful for understanding research trends and taxonomy, but does not provide technical implementation or proprietary data. |
| P46 | 1 | 0 | 0 | 0 | 0 | Review (Survey). Explores the intersection between “Trustworthy AI” and MLOps. Very theoretical, organises concepts of robustness, but does not present implementation or use cases of its own. |
| P47 | 2 | 1 | 1 | 1 | 0 | Semantic Approach. Proposes an ontology-based framework for “Frugal AI” (little data) and reusability. Validation through conceptual monitoring scenarios, with no reported continuous operational deployment. |
| P48 | 2 | 2 | 2 | 2 | 1 | Industrial Case (Biotech). Complete pipeline for a penicillin “Soft Sensor” (IndPenSim). Implements automatic retraining based on drift detection (PSI). Very strong in design and operation. |
| P49 | 1 | 2 | 1 | 2 | 1 | Adaptive architecture for synthetic data management. Implements a MAPE-K loop to detect drift and retrain generators. Evaluation through simulation with a retail dataset (not a physical industrial one). |
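
The P48 entry above names the Population Stability Index (PSI) as its drift trigger. As a minimal sketch of how such a score can be computed, assuming equal-width bins derived from the reference sample and a small floor `eps` to avoid logarithms of empty bins (both are illustrative choices, not details reported by P48):

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample."""
    lo, hi = min(expected), max(expected)
    # Fall back to a unit width if the reference sample is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped to the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion so that log() stays defined for empty bins.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical data: a uniform reference and a shifted copy of it.
reference = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in reference]
score_same = psi(reference, reference)
score_shifted = psi(reference, shifted)
```

Identical samples give a score of zero, and the score grows as the two distributions diverge. The threshold at which retraining fires is deployment-specific and is not stated in the table.
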
| ID | Constraint/Requirement | Operational Definition | Articles | Coverage (%) | Architectural Implication |
|---|---|---|---|---|---|
| R01 | Adaptability and Drift | Management of changes in data distribution (drift) or physical processes over time. | P11, P14, P15, P16, P22, P46, P48, P49 | 16.3 | Continuous monitoring pipeline and automatic retraining triggers (CT). |
| R02 | Data Quality | Challenges of costly labelling, sample scarcity (Small Data) or synthetic data. | P15, P24, P29, P43, P46 | 10.2 | Integration of Data Engineering (ETL) and Data-Centric tools into the MLOps loop. |
| R03 | Edge/TinyML | Deployment on devices with severe computing and power constraints. | P06, P18, P20, P27, P45 | 10.2 | Model optimisation (quantisation) and lightweight remote orchestration (OTA). |
| R04 | OT/IT integration | Interoperability with plant systems (PLCs, SCADA) and legacy hardware. | P06, P08, P20, P34, P40, P42 | 12.2 | Use of industrial connectors (OPC-UA, MQTT) and containerised gateways (Docker) at a Fog point. |
| R05 | Real-time latency | Strict response time requirement for inference in process control. | P01, P28, P44, P45 | 8.2 | On-device inference, hardware acceleration, and low-latency architectures. |
| R06 | Continuous cycle (CI/CD) | Complete automation of the flow from code to deployment and validation. | P04, P14, P17, P28, P35, P44 | 12.2 | Complex orchestration (Jenkins, Kubeflow) and strict model version control. |
| R07 | Scalability/Federated | Distributed fleet management or collaborative training without data sharing. | P04, P14, P15, P23, P38, P49 | 12.2 | Decentralised architectures, fleet configuration management and secure aggregation. |
| R08 | Robustness and reliability | Guaranteed safe operation in the event of failures and regression prevention. | P33, P41, P48 | 6.1 | Failure mode analysis (FMEA), rollback strategies and regression tests. |
| R09 | Explainability (XAI) | Need for human operators to understand and validate decisions. | P05, P14, P15, P30, P36 | 10.2 | User interfaces for explainability and interpretability metrics in monitoring. |
| R10 | Security and Privacy | Protection of sensitive data, model IP, and defence against attacks. | P04, P21, P46 | 6.1 | Encryption, differential privacy, access control (RBAC) and security auditing. |
| R11 | Sustainability | Minimisation of energy consumption for training and inference. | P18, P39 | 4.1 | Energy efficiency metrics and selection of green hardware/algorithms. |
| R12 | Governance and Methodology | Organisational frameworks, roles, maturity and alignment with business. | P03, P07, P10, P11, P25, P30, P47 | 14.3 | Definition of standards, business KPIs and IT/OT collaboration structures. |
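
The Coverage (%) column is consistent with the number of articles citing a requirement divided by the size of the corpus. A minimal sketch of that calculation, assuming a corpus of 49 primary studies (P01 to P49) and rounding to one decimal place; the function name is illustrative:

```python
# Assumption: the corpus holds 49 primary studies (IDs P01..P49).
TOTAL_STUDIES = 49

def coverage_pct(article_ids, total=TOTAL_STUDIES):
    """Share of the corpus (in %) that addresses a requirement, to one decimal."""
    unique_ids = set(article_ids)  # guard against an ID listed twice
    return round(100 * len(unique_ids) / total, 1)

# Article lists copied from rows R01 and R11 of the table above.
r01 = ["P11", "P14", "P15", "P16", "P22", "P46", "P48", "P49"]
r11 = ["P18", "P39"]

print(coverage_pct(r01))  # 16.3
print(coverage_pct(r11))  # 4.1
```

Because one article can address several requirements, the percentages in this table are not expected to sum to 100.
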
| Phase/Capacity | Articles (IDs) | Coverage (%) | Most Cited Tools |
|---|---|---|---|
| Data Ingestion | P01, P04, P06, P08, P09, P13, P14, P16, P17, P19, P20, P24, P28, P29, P35, P38, P42, P44, P45, P46, P48, P49 | 44.9 | Kafka, MQTT, Spark, Node-RED, OPC-UA |
| Validation and Quality | P11, P14, P15, P24, P29, P43, P46 | 14.3 | DVC, Synthetic Generators, Manual Scripts |
| Feature Engineering | P09, P14, P28, P29, P44 | 10.2 | Pandas/Python (Custom), Feature Stores (rare) |
| Training | P01, P04, P06, P08, P09, P14, P15, P16, P18, P19, P24, P28, P29, P35, P38, P42, P44, P45, P46, P48, P49 | 42.9 | TensorFlow, PyTorch, Scikit-learn, XGBoost |
| Model Registry | P01, P14, P15, P17, P24, P28, P35 | 14.3 | MLflow Registry, DVC, Git-based |
| CI/CD and Orchestration | P01, P04, P14, P15, P17, P19, P28, P35, P44, P49 | 20.4 | Jenkins, GitHub Actions, Airflow, Kubeflow |
| Deployment/Serving | P01, P06, P08, P13, P14, P15, P17, P18, P20, P28, P35, P42, P45, P49 | 28.6 | Docker, K8s, BentoML, TorchServe, OpenVINO |
| Monitoring | P01, P06, P14, P15, P16, P19, P20, P28, P34, P35, P38, P44, P48, P49 | 28.6 | Prometheus, Grafana, ELK Stack |
| Drift Detection | P14, P16, P22, P44, P46, P48 | 14.3 | Alibi Detect, Evidently, Tests |
| Retraining | P11, P14, P16, P19, P35, P44, P46, P49 | 16.3 | Airflow DAGs, Jenkins Triggers, Custom Loops |
| Governance | P03, P07, P10, P11, P25, P30, P33, P47 | 16.3 | Excel, Kanban Boards, FMEA, Manual |
| Rollback/Insurance | P01, P34, P35, P48 | 8.2 | K8s Rollouts, Manual Scripts |
| Pattern ID | Deployment Architecture | Coverage (%) | Pros (+) and Cons (−) | Representative Studies |
|---|---|---|---|---|
| P-01 | Centralised (Cloud-Centric): training and inference in cloud/server. | 65.3 | (+) Simplifies management and scaling. (−) High latency, dependency on internet connection, privacy risks. | P01, P02, P03, P05, P07, P09, P10, P11, P12, P14, P15, P17, P19, P21, P22, P24, P25, P26, P30, P31, P32, P33, P36, P37, P39, P40, P41, P43, P44, P46, P47, P48 |
| P-02 | Hybrid (Cloud–Fog–Edge): training in cloud, deployment to edge. | 16.3 | (+) Real-time response, operational autonomy. (−) Complex orchestration and synchronisation challenges. | P08, P13, P16, P20, P28, P34, P35, P49 |
| P-03 | TinyML/Far Edge: inference on microcontrollers (MCU). | 8.2 | (+) Ultra-low power consumption. (−) Severe hardware constraints, difficult to update. | P06, P18, P27, P45 |
| P-04 | Federated Learning: training is distributed across nodes. | 6.1 | (+) Maximum data privacy (data never leaves the plant). (−) High network overhead, slow convergence. | P04, P23, P38 |
| P-05 | Air-Gapped/Isolated: manual deployment via physical media. | 4.1 | (+) Critical infrastructure security. (−) No monitoring, obsolete models, slow updates. | P29, P42 |
| Indicator | Observed Value | Unit | Evidence Interpretation |
|---|---|---|---|
| Seeded company reference events | 4848 | events | Company-derived reference dataset loaded into the validation environment |
| Persisted company sensor events | 21 | events | Company-mode ingestion path was exercised and persisted |
| Persisted edge predictions | 1 | predictions | Edge inference was exercised and recorded in the operational database |
| Observed edge inference latency | 99.6613 | ms | Latency of the persisted edge inference event |
| Closed-loop drift score | 0.178127 | score | Repeatedly observed in governance-side drift evaluation |
| Closed-loop drift severity/drifted features | low/1 | categorical/features | Repeated across observed drift runs |
| Indicator | Observed Value | Unit | Evidence Interpretation |
|---|---|---|---|
| Valid synthetic events sent | 300 | events | Controlled profiling load injected through the deployed edge-inference path |
| Persisted inference events | 300 | events | All profiling inputs produced persisted inference records in the experimental window |
| Prediction counter | 300 | predictions | metrics prediction counter matched the persisted inference count |
| Schema mismatches | 0 | events | No schema mismatches were observed during the profiling run |
| Edge inference latency (mean) | 19.0321 | ms | Mean latency from persisted inference_events.latency_ms |
| Edge inference latency (p50/p95/p99) | 9.713/75.6004/85.7648 | ms | Percentile summary from persisted edge inference events |
| Edge inference latency (min/max) | 2.6077/133.7507 | ms | Observed latency range in the profiling run |
| Container CPU usage (mean/p95/max) | 293.75/421.76/425.44 | % | Container-level CPU from docker stats on a multi-core host |
| Container memory usage (mean/p95/max) | 127.46/127.74/127.80 | MiB | Container-level memory usage from docker stats |
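
The latency rows above summarise one latency value per persisted inference event. A minimal sketch of such a summary, assuming the values have already been read into a plain list of milliseconds and using the nearest-rank percentile definition (the article does not state which definition was applied, and the sample values below are hypothetical):

```python
import math
from statistics import mean

def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def summarise_latency(latencies_ms):
    """Mean, percentiles and range of a list of latency samples in milliseconds."""
    ordered = sorted(latencies_ms)
    return {
        "mean": mean(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "min": ordered[0],
        "max": ordered[-1],
    }

# Hypothetical samples, not the measurements reported in the table.
samples = [3.0, 5.0, 8.0, 9.0, 10.0, 12.0, 20.0, 40.0, 80.0, 130.0]
summary = summarise_latency(samples)
```

Reporting percentiles alongside the mean matters for a distribution like the one in the table, where the mean (19.0321 ms) sits well above the median (9.713 ms): a small number of slow events pulls the mean up, so the tail is better described by p95 and p99.
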
| Indicator | Observed Value | Unit | Evidence Interpretation |
|---|---|---|---|
| Initial accepted generation/version | 2/2 | generation/version | Edge started from an already accepted deployment state |
| Applied promoted generation/version | 3/3 | generation/version | A newly promoted model generation was successfully applied |
| Persisted inferences during OTA run | 1200 | events | Inference stream remained active throughout the experimental window |
| Predictions before/after generation switch | 288/912 | predictions | Persisted sequence shows a single clean transition from generation 2 to generation 3 |
| First cycle served by new generation | 289 | cycle | Generation switch became visible at a single transition point |
| OTA application latency | 167.20 | ms | Observed from persisted edge_sync_status |
| OTA commands accepted | 1 | commands | Prometheus-compatible metrics confirmed one accepted OTA command |
| Sync failures | 0 | failures | No failed synchronization was observed during the experiment |
| Schema mismatches during OTA run | 0 | events | No schema mismatches were observed during the generation transition |
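
The "single clean transition" reading of the table can be checked mechanically by scanning the ordered sequence of model generations that served each inference cycle. A minimal sketch, assuming the persisted events expose the serving generation as an integer per cycle (the list representation and function name are illustrative, not the article's schema):

```python
def generation_transitions(generations):
    """Return (cycle, old, new) for each change between consecutive cycles (1-indexed)."""
    return [
        (i + 1, prev, cur)
        for i, (prev, cur) in enumerate(zip(generations, generations[1:]), start=1)
        if prev != cur
    ]

# Synthetic sequence shaped like the reported run:
# 288 predictions from generation 2, then 912 from generation 3.
served = [2] * 288 + [3] * 912
transitions = generation_transitions(served)

print(len(served))   # 1200
print(transitions)   # [(289, 2, 3)]
```

A clean OTA switch yields exactly one transition; any flapping between generations or a rollback would appear as additional entries in the list.
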
| Pattern | Placement of Training/Inference Workloads | Suitability for Latency-Sensitive Industrial Execution | Lifecycle Governance and Controlled Adaptation | Openness/Replicability | Positioning in This Study |
|---|---|---|---|---|---|
| P-01 Centralised (Cloud-Centric) | Training and inference are mainly concentrated in cloud/server environments | Low to medium. Easy to manage centrally, but vulnerable to network dependency and latency in shop-floor settings | Medium. Can support orchestration and monitoring, but often with weaker local continuity and limited OT-side autonomy | Variable. Depends strongly on the chosen platform and deployment model | Treated as the dominant baseline in the literature, but potentially misaligned with low-latency and continuity requirements in brownfield industrial environments |
| P-02 Hybrid (Cloud–Fog–Edge) | Training and higher-level governance remain cloud-side, while inference is deployed close to the process, with fog/on-premises support | High. Better aligned with local inference, continuity, and timing constraints near the machine | High, but more complex. Enables separation of inference and learning loops, promotion, rollback, monitoring, and on-premises persistence, at the cost of cross-tier orchestration complexity | High when implemented with open tools and explicit cross-tier responsibilities | Positioned in this work as the most suitable general-purpose pattern for bounded industrial MLOps operationalisation when low-latency inference, OT/IT integration, and lifecycle governance must coexist |
| P-03 TinyML/Far Edge | Inference is pushed to ultra-constrained devices such as MCUs | Very high for extreme edge proximity | Low to medium. Strongly constrained updateability, observability, and lifecycle tooling | Medium. Often constrained by hardware-specific deployment choices | Better suited to ultra-constrained execution scenarios than to the broader CNC brownfield setting addressed here |
| P-04 Federated Learning | Training is distributed across multiple nodes/sites; inference placement may vary | Variable. Useful for privacy-preserving collaboration, but not primarily designed to solve near-machine latency in single-cell operation | Medium. Strong for decentralised collaboration, but introduces coordination, aggregation, and synchronisation complexity | Medium to high depending on implementation maturity | Interpreted here as a promising future extension for privacy-preserving multi-site collaboration, not as a validated capability of the present implementation |
| P-05 Air-Gapped/Isolated | Manual deployment via physically isolated infrastructure | Variable. Can support isolated execution, but with poor flexibility and slow update cycles | Low. Closed-loop adaptation, online monitoring, and fast retraining are severely limited | Low to medium. High isolation reduces operational openness and reuse | Relevant for highly isolated infrastructures, but misaligned with the closed-loop lifecycle-governance objective pursued in this work |
| Proposed architecture | Cloud-side training and governance; fog/on-premises persistence and integration; edge-side low-latency inference | High within the validated scope. Designed to support bounded local low-latency execution under industrial constraints | High within the validated scope. Explicitly structures promotion, rollback, monitoring, versioning, and separation between learning and inference loops | High. Vendor-neutral and open-source oriented by design | Positioned as a reproducible hybrid reference instantiation for industrial environments where OT/IT integration, low latency, and controlled lifecycle management must coexist |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Mateo-Casalí, M.Á.; Boza, A.; Fraile, F. Towards a Reference Architecture for Machine Learning Operations. Computers 2026, 15, 218. https://doi.org/10.3390/computers15040218

