Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab
Abstract
1. Introduction
- A conceptual framework for D2K and K2D pipelines built on networks of DSs, positioning them as the foundational elements of the WWL.
- A fully realized cross-organizational D2K pipeline that aggregates semantically annotated trajectory data from three independent institutions using an existing FAIR-compliant research data infrastructure and trains a reusable inverse-dynamics foundation model.
- Quantitative benchmark evidence showing that fine-tuning the foundation model reduces training time by approximately 85% while achieving accuracy within the torque sensor noise floor, compared to an end-to-end baseline following the architecture of Schneider et al. [11].
- A hybrid pipeline orchestration combining scheduled and event-driven data flows, validated in a live multi-institutional deployment.
- An analysis of the resulting governance, scalability, and sustainability implications, including an identification of open challenges for K2D feedback and federated extensions.
2. Background and Related Work
2.1. From Siloed Data to Networked Production Ecosystems
2.2. From Standalone Systems to Unified Data Pipelines
2.3. From Digital Twins to Digital Shadows
2.4. Federated Learning as a Related Paradigm
2.5. Research Gap
3. Materials and Methods
3.1. Conceptual Framework: Data, Knowledge, Agents, and Pipelines
- Data is raw, context-dependent information collected from sources like sensors, machines, and humans. It exists in structured, semi-structured, or unstructured formats, including measurements, logs, and images. Data undergoes transformations such as cleaning and semantic annotation to become actionable knowledge.
- Knowledge is actionable, semantically enriched information derived from data or inherent in autonomous agents. It supports decision-making, system optimization, and continuous improvement. Knowledge is operationalized here as a trained model (or model parameters) together with its provenance metadata: the data it was trained on, the training procedure, and performance bounds. This definition aligns with foundational knowledge management literature [49,50].
- Autonomous agent is an entity—human or artificial—capable of perceiving its environment, processing data, making decisions, and adapting over time [51]. Agents mediate between data and knowledge: they execute D2K transformations (e.g., a training pipeline) and apply K2D feedback (e.g., reconfiguring sensor sampling based on a trained model’s uncertainty). This dual role is represented in Figure 1.
- Data-to-Knowledge pipeline is a directed sequence of transformations from raw sensor data to actionable knowledge:
  - Data collection from various sources.
  - Data processing, which transforms the collected data through cleaning and semantic annotation.
  - Knowledge generation using analytics and learning methods.
  - Action application for process optimization and decision-making.
- Each step may be automated, human-in-the-loop, or hybrid. The full pipeline forms a DAG that may branch, merge, and be triggered by schedules or events.
- Knowledge-to-Data pipeline is the inverse transformation, applying existing knowledge to guide data collection:
  - Existing knowledge used to inform data collection (e.g., a foundation model).
  - Influence on data collection strategies (e.g., targeted trajectory sampling).
  - Data processing shaped by knowledge (e.g., filtering based on model uncertainty).
  - Modified production activities (e.g., reconfigured robot motion).
- K2D outputs may themselves trigger new D2K cycles, creating a closed adaptive loop.
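The DAG structure described for the D2K pipeline can be illustrated with a minimal sketch. The stage names below are hypothetical and only mirror the four listed steps (collection, processing, knowledge generation, action application); the sketch is not the orchestration used in the deployment. It shows a merge (three sites feeding one processing stage), a branch (one foundation model feeding several fine-tuning stages), and the acyclicity check that a valid pipeline must pass.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical stage names; each key maps to the set of stages it depends on.
d2k_dag = {
    "collect_llt": set(),
    "collect_ita": set(),
    "collect_wzl": set(),
    "process": {"collect_llt", "collect_ita", "collect_wzl"},  # merge
    "train_foundation": {"process"},
    "fine_tune_llt": {"train_foundation"},  # branch
    "fine_tune_ita": {"train_foundation"},
    "fine_tune_wzl": {"train_foundation"},
    "apply": {"fine_tune_llt", "fine_tune_ita", "fine_tune_wzl"},
}


def execution_order(dag):
    """Return one valid execution order of the stages.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(d2k_dag)
print(order)
```

A closed K2D loop would be modeled as a new D2K cycle triggered by the output of the previous one, rather than as an edge back into the same graph, so that each individual cycle stays acyclic.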
3.2. System Architecture
3.3. Domain-Specific Use Cases
- Laser Material Processing (Lehrstuhl für Lasertechnik (Chair for Laser Technology) (LLT)/Fraunhofer Institute for Laser Technology (ILT)): A Franka Emika robot equipped with a fast beam steering device performs laser engraving on steel, where trajectory inaccuracies directly affect geometric tolerances in laser–matter interaction [11,55]. The robot operates with relatively few workspace restrictions, resulting in a broad, near-uniform joint-space distribution (Figure 4).
- Textile Fiber Draping (Institut für Textiltechnik (Institute for Textile Technology) (ITA)): A Franka Emika robot automates fiber composite preform draping for flexible manufacturing in small and medium-sized enterprises (SMEs), where careful force control is required to avoid damaging delicate textiles [56]. The constrained draping geometry yields a narrower joint-space distribution.
- Gear Assembly (Werkzeugmaschinenlabor (Chair for Machine Tools) (WZL)): A Franka Emika robot performs peg-in-hole gear assembly using inverse-dynamics-based torque control for precise positioning [57]. The highly structured assembly environment similarly constrains joint-space coverage.
3.4. Inverse-Dynamics Data-to-Knowledge Pipeline
- Event-driven data ingestion: Following each trajectory execution, robot data are pushed to the Coscine repository. Data are modeled as FDOs with RDF-based metadata including velocity and acceleration scaling factors, robot instance identifiers, and git commit hashes to ensure traceability.
- Nightly training sweep (schedule-based): At 2 a.m., the current DSs are pulled from the repository. A sweep agent initiates n training runs, each with a new hyperparameter configuration H sampled by the sweep server (Weights and Biases). Dataset statistics are analyzed and uploaded back to the repository.
- Model selection: If a new model achieves a cross-validation loss below that of the current champion, the repository is updated with the new foundation model parameters and H. The selected model is evaluated on the held-out test set and results are stored.
- On-demand fine-tuning: Instance models are fine-tuned on demand by adapting up to five layers of the foundation model.
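The model selection step can be sketched as a simple champion/challenger rule. This is a minimal illustration under stated assumptions: the `Candidate` record, its field names, and the loss values are hypothetical, and the repository update is reduced to returning the selected record. Only the rule itself (replace the champion when a run reaches a lower cross-validation loss) comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    params_id: str     # hypothetical identifier of the stored model parameters
    hyperparams: dict  # hyperparameter configuration H of the run
    cv_loss: float     # cross-validation loss used for model selection


def select_champion(champion: Optional[Candidate], runs: list) -> Optional[Candidate]:
    """Return the champion after one sweep cycle.

    A run replaces the champion only if its loss is strictly lower,
    so ties keep the incumbent model.
    """
    best = champion
    for run in runs:
        if best is None or run.cv_loss < best.cv_loss:
            best = run
    return best


# Illustrative values only; these are not results from the benchmark.
champion = Candidate("model-0", {"lstm_units": 64}, cv_loss=0.30)
nightly_runs = [
    Candidate("model-1", {"lstm_units": 128}, cv_loss=0.35),
    Candidate("model-2", {"lstm_units": 256}, cv_loss=0.22),
]
champion = select_champion(champion, nightly_runs)
print(champion.params_id)
```

Keeping the incumbent on ties avoids rewriting the repository entry when a new run brings no measurable improvement.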
3.5. Benchmark Setup
3.5.1. Dataset and Splits
3.5.2. Hyperparameter Search
3.5.3. Success Criterion
4. Results
4.1. Dataset Characteristics
4.2. Training Time
4.3. Validation Accuracy
5. Discussion
5.1. Summary of Findings and Limitations
5.2. Comparison with State-of-the-Art
5.3. Implications
5.3.1. Technical Implications
5.3.2. Organizational Implications
5.3.3. Governance Implications
6. Conclusions
- Ablation studies: The current benchmark cannot isolate the individual contributions of cross-site data aggregation, parameter transfer from the foundation model, and semantic annotation to the observed training-time reduction and accuracy. Controlled ablation experiments (for example, comparing cross-site aggregation without fine-tuning, single-site fine-tuning without a cross-organizational foundation model, and systematically varying the number of fine-tuned layers) would clarify which factors drive the gains and inform the design of future D2K deployments.
- Realizing the K2D feedback loop: Safe write-back to legacy production equipment, real-time model inference at data collection time, and semantic drift detection as ontologies evolve are the concrete technical hurdles identified.
- Extending to heterogeneous platforms: The current proof of concept is limited to a single robot family; extending to different robot platforms or other production domains requires additional standardization and may yield different transfer dynamics.
- Automated semantic alignment: Reducing the manual effort of onboarding new machine types via tools such as OPC UA to RDF mapping [36] is a prerequisite for broader WWL adoption.
- Governance frameworks: Federally auditable provenance and enforceable data usage agreements are critical non-technical enablers for multi-actor WWL expansion, as the organizational and liability challenges may prove to be as demanding as the technical ones.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| D2K | Data-to-Knowledge |
| DAG | Directed Acyclic Graph |
| deep RL | Deep Reinforcement Learning |
| DS | Digital Shadow |
| DT | Digital Twin |
| ERP | Enterprise Resource Planning |
| FAIR | Findable, Accessible, Interoperable, Reusable |
| FDO | FAIR Digital Object |
| FedCSF | Federated graph learning via Constructing and Sharing Feature spaces |
| FedHA | Federated Heterogeneity-aware Adaptive framework |
| ILT | Fraunhofer Institute for Laser Technology |
| IoP | Internet of Production |
| IoT | Internet of Things |
| ITA | Institut für Textiltechnik (Institute for Textile Technology) |
| K2D | Knowledge-to-Data |
| LLT | Lehrstuhl für Lasertechnik (Chair for Laser Technology) |
| LSTM | Long Short-Term Memory |
| MAE | Mean Absolute Error |
| MES | Manufacturing Execution System |
| PINN | Physics-Informed Neural Network |
| QOMOU | Querying of Ontology Mapping-based OPC UA |
| RDF | Resource Description Framework |
| SME | Small and Medium-sized Enterprise |
| WWL | World Wide Lab |
| WWW | World Wide Web |
| WZL | Werkzeugmaschinenlabor (Chair for Machine Tools) |
References
1. Bruner, J. Industrial Internet—The Machines Are Talking; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2013.
2. Kagermann, H. Change Through Digitization—Value Creation in the Age of Industry 4.0. In Management of Permanent Change; Albach, H., Meffert, H., Pinkwart, A., Reichwald, R., Eds.; Springer Fachmedien Wiesbaden: Wiesbaden, Germany, 2015; pp. 23–45.
3. Lu, Y. Industry 4.0: A survey on technologies, applications and open research issues. J. Ind. Inf. Integr. 2017, 6, 1–10.
4. Zhong, R.Y.; Xu, X.; Klotz, E.; Newman, S.T. Intelligent Manufacturing in the Context of Industry 4.0: A Review. Engineering 2017, 3, 616–630.
5. Nargesian, F.; Zhu, E.; Miller, R.J.; Pu, K.Q.; Arocena, P.C. Data lake management: Challenges and opportunities. Proc. VLDB Endow. 2019, 12, 1986–1989.
6. Nambiar, A.; Mundra, D. An Overview of Data Warehouse and Data Lake in Modern Enterprise Data Management. Big Data Cogn. Comput. 2022, 6, 132.
7. Goedegebuure, A.; Kumara, I.; Driessen, S.; Van Den Heuvel, W.J.; Monsieur, G.; Tamburri, D.A.; Nucci, D.D. Data Mesh: A Systematic Gray Literature Review. ACM Comput. Surv. 2024, 57, 1–36.
8. Behery, M.; Glawe, F.; Koren, I.; Ziefle, M.; Lakemeyer, G.; Brauner, P. Vision Paper: Leveraging Industrial Big Data—Past, Present, and Future of the World Wide Lab. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 1308–1313.
9. Brauner, P.; Dalibor, M.; Jarke, M.; Kunze, I.; Koren, I.; Lakemeyer, G.; Liebenberg, M.; Michael, J.; Pennekamp, J.; Quix, C.; et al. A Computer Science Perspective on Digital Transformation in Production. ACM Trans. Internet Things 2022, 3, 1–32.
10. Liebenberg, M.; Jarke, M. Information Systems Engineering with Digital Shadows: Concept and Case Studies: An Exploratory Paper. In Proceedings of the Advanced Information Systems Engineering: 32nd International Conference, CAiSE 2020, Grenoble, France, 8–12 June 2020; pp. 70–84.
11. Schneider, J.N.; Gorissen, L.; Kaster, T.; Walderich, P.; Hinke, C. LSTM-based Inverse Dynamics Learning for Franka Emika Robot. In Proceedings of the 2024 International Conference on Control, Automation and Diagnosis (ICCAD), Lyon, France, 1–3 July 2024; pp. 1–6.
12. Inmon, W.H. Building the Data Warehouse, 3rd ed.; Wiley Computer Publishing: Hoboken, NJ, USA, 2002.
13. Dixon, J. Pentaho, Hadoop, and Data Lakes. James Dixon’s Blog, 2010. Available online: https://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes/ (accessed on 13 May 2026).
14. Harby, A.A.; Zulkernine, F. From Data Warehouse to Lakehouse: A Comparative Review. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17–20 December 2022; pp. 389–395.
15. IBM. What is a Data Fabric? IBM Website: Armonk, NY, USA, 2024.
16. Schuh, G.; Prote, J.P.; Gützlaff, A.; Thomas, K.; Sauermann, F.; Rodemann, N. Internet of Production: Rethinking production management. In Proceedings of the Production at the Leading Edge of Technology; Wulfsberg, J.P., Hintze, W., Behrens, B.A., Eds.; Springer: Berlin/Heidelberg, Germany, 2019; pp. 533–542.
17. Pallasch, C.; Hoffmann, N.; Storms, S.; Herfs, W. ProducTron: Towards Flexible Distributed and Networked Production. In Proceedings of the 2018 IEEE 22nd International Conference on Intelligent Engineering Systems (INES), Las Palmas de Gran Canaria, Spain, 21–23 June 2018; pp. 000287–000292.
18. Gleim, L.; Pennekamp, J.; Liebenberg, M.; Buchsbaum, M.; Niemietz, P.; Knape, S.; Epple, A.; Storms, S.; Trauth, D.; Bergs, T.; et al. FactDAG: Formalizing Data Interoperability in an Internet of Production. IEEE Internet Things J. 2020, 7, 3243–3253.
19. Pennekamp, J.; Matzutt, R.; Kanhere, S.S.; Hiller, J.; Wehrle, K. The Road to Accountable and Dependable Manufacturing. Automation 2021, 2, 202–219.
20. Auer, M.; Zutin, D.G. A grid of online laboratories based on the iLab shared architecture. In Proceedings of the ASEE Annual Conference and Exposition, San Antonio, TX, USA, 10–13 June 2012.
21. Salzmann, C.; Gillet, D.; Esquembre, F.; Dormido, S. Web 2.0 open remote and virtual laboratories in engineering education. In Cyber Behavior: Concepts, Methodologies, Tools, and Applications; IGI Global Scientific Publishing: Palmdale, PA, USA, 2014.
22. Titov, I.; Glotov, A.; Mikolnikov, J. Labicom.net—The online laboratories platform demonstration 2014. In Proceedings of the 2014 International Conference on Interactive Collaborative Learning (ICL), Dubai, United Arab Emirates, 3–5 December 2014.
23. Carnegie Mellon University. Manufacturing Futures Institute—Building the Factory of the Future; Carnegie Mellon University: Pittsburgh, PA, USA, 2023.
24. The Smart Manufacturing Institute. Smart Manufacturing Innovation Platform; The Smart Manufacturing Institute: Los Angeles, CA, USA, 2023.
25. Gaia-X. Gaia-X: A Federated Data Infrastructure for Europe. Gaia-X Project Website, 2020. Available online: https://www.gaia-x.eu (accessed on 1 October 2024).
26. Catena-X. Catena-X: The Automotive Network. Catena-X, 2021. Available online: https://catena-x.net/ (accessed on 1 October 2024).
27. Plattform Industrie 4.0. Manufacturing-X: Data Ecosystem for Manufacturing. Plattform Industrie 4.0 Website, 2022. Available online: https://www.plattform-i40.de/IP/Navigation/DE/Manufacturing-X/Initiative/initiative-manufacturing-x.html (accessed on 1 October 2024).
28. NVIDIA. Omniverse—Plattform für Open USD. Available online: https://www.nvidia.com/de-de/omniverse/ (accessed on 8 August 2025).
29. Munappy, A.R.; Bosch, J.; Olsson, H.H. On the Trade-off Between Robustness and Complexity in Data Pipelines. In Proceedings of the Quality of Information and Communications Technology; Paiva, A.C.R., Cavalli, A.R., Ventura Martins, P., Pérez-Castillo, R., Eds.; Springer: Cham, Switzerland, 2021; pp. 401–415.
30. Munappy, A.R.; Bosch, J.; Olsson, H.H. Maturity Assessment Model for Industrial Data Pipelines. In Proceedings of the 2023 30th Asia-Pacific Software Engineering Conference (APSEC), Los Alamitos, CA, USA, 4–7 December 2023; pp. 503–513.
31. Yadranjiaghdam, B.; Pool, N.; Tabrizi, N. A Survey on Real-Time Big Data Analytics: Applications and Tools. In Proceedings of the 2016 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 15–17 December 2016; pp. 404–409.
32. Tu, D.; He, Y.; Cui, W.; Ge, S.; Zhang, H.; Han, S.; Zhang, D.; Chaudhuri, S. Auto-Validate by-History: Auto-Program Data Quality Constraints to Validate Recurring Data Pipelines. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 6–10 August 2023; pp. 4991–5003.
33. Song, J.; He, Y. Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes. In Proceedings of the 2021 International Conference on Management of Data, New York, NY, USA, 20–25 June 2021; pp. 1678–1691.
34. Mesbah, S.; Fragkeskos, K.; Lofi, C.; Bozzon, A.; Houben, G.J. Semantic Annotation of Data Processing Pipelines in Scientific Publications. In Proceedings of the Semantic Web; Blomqvist, E., Maynard, D., Gangemi, A., Hoekstra, R., Hitzler, P., Hartig, O., Eds.; Springer: Cham, Switzerland, 2017; pp. 321–336.
35. Zheng, Z.; Zhou, B.; Zhou, D.; Soylu, A.; Kharlamov, E. ExeKG: Executable Knowledge Graph System for User-friendly Data Analytics. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, New York, NY, USA, 17–21 October 2022; pp. 5064–5068.
36. Bi, J.; Wu, R.; Yuan, H.; Wang, Z.; Zhang, J.; Zhou, M. Ontology-Based Semantic Reasoning for Multisource Heterogeneous Industrial Devices Using OPC UA. IEEE Internet Things J. 2025, 12, 25020–25032.
37. Bodenbenner, M.; Pennekamp, J.; Montavon, B.; Wehrle, K.; Schmitt, R.H. FAIR Sensor Ecosystem: Long-Term (Re-)Usability of FAIR Sensor Data through Contextualization. In Proceedings of the 2023 IEEE 21st International Conference on Industrial Informatics (INDIN), Helsinki, Finland, 13–16 June 2023; pp. 1–8.
38. Date, C.J. Database Design and Relational Theory: Normal Forms and All That Jazz; Apress: Berkeley, CA, USA, 2019.
39. Behery, M.; Brauner, P.; Kluge-Wilkes, A.; Baier, R.; Mertens, A.; Schmitt, R.H.; Ziefle, M.; Lakemeyer, G. Digital Shadows for Robotic Assembly in the World Wide Lab. Procedia CIRP 2023, 120, 165–170.
40. Bauernhansl, T.; Hartleif, S.; Felix, T. The Digital Shadow of Production—A Concept for the Effective and Efficient Information Supply in Dynamic Industrial Environments. Procedia CIRP 2018, 72, 69–74.
41. Kulkarni, V.; Reddy, S. Separation of concerns in model-driven development. IEEE Softw. 2003, 20, 64–69.
42. Nadareishvili, I.; Mitra, R.; McLarty, M.; Amundsen, M. Microservice Architecture: Aligning Principles, Practices, and Culture, 1st ed.; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2016.
43. Heithoff, M.; Hopmann, C.; Köbel, T.; Michael, J.; Rumpe, B.; Sapel, P. Application of digital shadows on different levels in the automation pyramid. Data Knowl. Eng. 2025, 158, 102442.
44. Michael, J.; Koren, I.; Dimitriadis, I.; Fulterer, J.; Gannouni, A.; Heithoff, M.; Hermann, A.; Hornberg, K.; Kröger, M.; Sapel, P.; et al. A Digital Shadow Reference Model for Worldwide Production Labs. In Internet of Production: Fundamentals, Applications and Proceedings; Brecher, C., Schuh, G., van der Aalst, W., Jarke, M., Piller, F.T., Padberg, M., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 1–28.
45. Tong, C.; Zhang, L.; Ding, Y.; Yue, D. A Heterogeneity-Aware Adaptive Federated Learning Framework for Short-Term Forecasting in Electric IoT Systems. IEEE Internet Things J. 2025, 12, 15388–15403.
46. Chen, J.; Zhuo, S.; He, J.; Qiu, W.; Zhang, Q.; Xiong, Z.; Zheng, Z.; Tang, Y.; Chen, M.; Wang, C.; et al. Federated Graph Learning via Constructing and Sharing Feature Spaces for Cross-Domain IoT. IEEE Internet Things J. 2025, 12, 26200–26214.
47. Chahoud, M.; Sami, H.; Mourad, A.; Otrok, H.; Bentahar, J.; Guizani, M. On-Demand Model and Client Deployment in Federated Learning with Deep Reinforcement Learning. IEEE Internet Things J. 2025, 12, 26685–26698.
48. Wang, X.; Chen, T.; Dai, H.N.; Long, P.; Yang, H.; Xiong, Z.; Susilo, W. A Privacy-Enhanced Method for Privacy-Preserving and Verifiable Federated Learning. IEEE Internet Things J. 2025, 12, 26768–26781.
49. Alavi, M.; Leidner, D.E. Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues. MIS Q. 2001, 25, 107–136.
50. National Institute of Standards and Technology. NIST Big Data Interoperability Framework (NBDIF) Version 3.0; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2019.
51. Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach, 3rd ed.; Pearson Education: London, UK, 2016.
52. Coscine. Coscine. Coscine Project Website, 2016. Available online: https://about.coscine.de/en/ (accessed on 17 March 2024).
53. Bjorck, J.; Castañeda, F.; Cherniadev, N.; Da, X.; Ding, R.; Fan, L.J.; Fang, Y.; Fox, D.; Hu, F.; Huang, S.; et al. GR00T N1: An Open Foundation Model for Generalist Humanoid Robots. arXiv 2025.
54. Ball, P.J.; Bauer, J.; Belletti, F.; Brownfield, B.; Ephrat, A.; Fruchter, S.; Gupta, A.; Holsheimer, K.; Holynski, A.; Hron, J.; et al. Genie 3: A New Frontier for World Models, 2025. Available online: https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/ (accessed on 8 August 2025).
55. Gorißen, L.; Schneider, J.N.; Kaster, T.; Hinke, C.; Häfner, C. Towards the Application of Low-Cost Collaborative Robots in Laser Materials Processing. J. Laser Micro/Nanoeng. 2026, 21, 91–102.
56. Dammers, H.; Lennartz, M.; Liebe, P.; Gries, T. AI-Driven Robotic-Tool Selection for Draping Composite Preforms Based on a Geometric Surface Segmentation Approach. In Proceedings of the SAMPE 2024; NA SAMPE: Diamond Bar, CA, USA, 2024.
57. Arents, J.; Abolins, V.; Judvaitis, J.; Vismanis, O.; Oraby, A.; Ozols, K. Human–Robot Collaboration Trends and Safety Aspects: A Systematic Review. J. Sens. Actuator Netw. 2021, 10, 48.
58. Siciliano, B.; Khatib, O. (Eds.) Springer Handbook of Robotics; Springer International Publishing: Berlin/Heidelberg, Germany, 2016.
59. Gorißen, L.M.; Schneider, J.N.; Behery, M.; Brauner, P.; Lennartz, M.; Kötter, E.D.; Kaster, T.; Petrovic, O.; Hinke, C.R.; Gries, T.; et al. Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab: Source Code; RWTH Aachen University: Aachen, Germany, 2025.
60. Gorißen, L.M.; Schneider, J.N.; Behery, M.; Brauner, P.; Lennartz, M.; Kötter, E.D.; Kaster, T.; Petrovic, O.; Hinke, C.R.; Gries, T.; et al. Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab: Trajectory Data; RWTH Aachen University: Aachen, Germany, 2025.
61. Plattform Industrie 4.0. Reference Architecture Model Industrie 4.0 (RAMI 4.0); Plattform Industrie 4.0: Berlin/Heidelberg, Germany, 2015.
62. Otto, B.; Jarke, M. Designing a Multi-sided Data Platform: Findings from the International Data Spaces Case. Electron. Mark. 2019, 29, 561–580.
63. Behery, M.; Brauner, P.; Zhou, H.A.; Uysal, M.S.; Samsonov, V.; Bellgardt, M.; Brillowski, F.; Brockhoff, T.; Farhang Ghahfarokhi, A.; Gleim, L.; et al. Actionable Artificial Intelligence for the Future of Production. In Internet of Production; Springer: Cham, Switzerland, 2023; pp. 1–46.
64. Endsley, M.R. From Here to Autonomy: Lessons Learned from Human–Automation Research. Hum. Factors 2017, 59, 5–27.
65. Bernhard, S.; Pütz, S.; Röhl, C.; Baier, R.; Brauner, P.; Christou, E.; Dammers, H.; Flaig, R.; Gorißen, L.M.; Heilinger, J.C.; et al. Sustainability in the Internet of Production: Interdisciplinary Opportunities and Challenges. In Proceedings of the 2023 IEEE International Symposium on Technology and Society (ISTAS), Cape Town, South Africa, 15–17 November 2023; pp. 1–8.
66. Sweeney, L. k-anonymity: A model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 2002, 10, 557–570.
67. Li, N.; Li, T.; Venkatasubramanian, S. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering, Istanbul, Turkey, 15–20 April 2007; pp. 106–115.
68. Machanavajjhala, A.; Gehrke, J.; Kifer, D.; Venkitasubramaniam, M. L-diversity: Privacy beyond k-anonymity. In Proceedings of the 22nd International Conference on Data Engineering (ICDE’06), Atlanta, GA, USA, 3–7 April 2006; p. 24.
69. Zheng, X.; Cai, Z. Privacy-Preserved Data Sharing Towards Multiple Parties in Industrial IoTs. IEEE J. Sel. Areas Commun. 2020, 38, 968–979.
70. Abraham, R.; Schneider, J.; vom Brocke, J. Data governance: A conceptual framework, structured review, and research agenda. Int. J. Inf. Manag. 2019, 49, 424–438.
71. Hummel, P.; Braun, M.; Tretter, M.; Dabrock, P. Data sovereignty: A review. Big Data Soc. 2021, 8, 2053951720982012.
72. Cuñat, S.; Julian, M.; Belsa, A.; Valero, C.I.; Esteve, M.; Palau, C.E. Secure, Trusted, Privacy-Protected Data Exchange in an Edge-Cloud Continuum Environment. In Internet of Things; Springer: Cham, Switzerland, 2024; pp. 201–231.
73. Shoomal, A.; Jahanbakht, M.; Componation, P.J. An analytical framework for evaluating blockchain and IoT use cases in sustainable supply chains. Supply Chain. Anal. 2026, 13, 100198.
74. Gorißen, L.M.; Schneider, J.N.; Behery, M.; Brauner, P.; Lennartz, M.; Kötter, E.D.; Kaster, T.; Petrovic, O.; Hinke, C.R.; Gries, T.; et al. Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab: Benchmark Models; RWTH Aachen University: Aachen, Germany, 2025.






| Institute | Measurements | Train | Val | Subtotal | Test |
|---|---|---|---|---|---|
| LLT | 230,627 | 1027 | 257 | 1284 | 28 |
| ITA | 87,950 | 252 | 64 | 316 | – |
| WZL | 236,102 | 746 | 187 | 933 | – |
| Total | 554,679 | 2025 | 508 | 2533 | 28 |
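The split sizes in the table can be cross-checked with a few lines of arithmetic. The sketch below copies the Train, Val, and Test counts from the table and treats the "–" entries as zero, which is an assumption made only for the summation. It suggests that each site's train/validation partition is close to an 80/20 ratio.

```python
# Counts copied from the table: (train, val, test).
# The "–" test entries for ITA and WZL are treated as 0 (assumption).
splits = {
    "LLT": (1027, 257, 28),
    "ITA": (252, 64, 0),
    "WZL": (746, 187, 0),
}


def train_fraction(train, val):
    """Share of the train + val subtotal that is used for training."""
    return train / (train + val)


fractions = {
    site: round(train_fraction(train, val), 3)
    for site, (train, val, _test) in splits.items()
}
total_train = sum(train for train, _val, _test in splits.values())
total_val = sum(val for _train, val, _test in splits.values())
total_test = sum(test for _train, _val, test in splits.values())

print(fractions)                           # {'LLT': 0.8, 'ITA': 0.797, 'WZL': 0.8}
print(total_train, total_val, total_test)  # 2025 508 28
```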
| Hyperparameter | Range/Values | Distribution |
|---|---|---|
| Optimizer | Adam, SGD | Categorical |
| Learning rate | – | Log-uniform |
| Clipnorm | 1– | Log-uniform |
| Window size | 0–100 | Integer-uniform |
| Batch size | 2048, 4096 | Categorical |
| LSTM units | 1–1000 | Integer-uniform |
| Dropout | –1 | Log-uniform |
| LSTM layers | 1–100 | Integer-uniform |
| Epochs | 100 (fixed) | — |
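To illustrate how one configuration H could be drawn from a search space of this shape, the following sketch samples each hyperparameter with the distribution named in the table. It uses only the Python standard library and is not the Weights and Biases sweep configuration used in the benchmark. The learning-rate bounds, the upper clipnorm bound, and the lower dropout bound are not recoverable from the table, so the values passed in the usage lines are hypothetical placeholders.

```python
import math
import random


def log_uniform(rng, low, high):
    """Sample x from [low, high] such that log(x) is uniformly distributed."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_config(rng, lr_bounds, clipnorm_bounds, dropout_bounds):
    """Draw one hyperparameter configuration H.

    Integer ranges and categorical values follow the table; the three
    log-uniform bounds are supplied by the caller.
    """
    return {
        "optimizer": rng.choice(["Adam", "SGD"]),
        "learning_rate": log_uniform(rng, *lr_bounds),
        "clipnorm": log_uniform(rng, *clipnorm_bounds),
        "window_size": rng.randint(0, 100),   # inclusive on both ends
        "batch_size": rng.choice([2048, 4096]),
        "lstm_units": rng.randint(1, 1000),
        "dropout": log_uniform(rng, *dropout_bounds),
        "lstm_layers": rng.randint(1, 100),
        "epochs": 100,                        # fixed
    }


rng = random.Random(0)  # seeded for reproducibility
config = sample_config(
    rng,
    lr_bounds=(1e-5, 1e-1),       # HYPOTHETICAL: both bounds missing in the table
    clipnorm_bounds=(1.0, 10.0),  # lower bound from the table, upper bound HYPOTHETICAL
    dropout_bounds=(1e-3, 1.0),   # upper bound from the table, lower bound HYPOTHETICAL
)
print(config)
```

Log-uniform sampling spreads draws evenly across orders of magnitude, which is why it is commonly paired with scale-like parameters such as the learning rate.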
| Symbol | Description |
|---|---|
| Inverse dynamics | |
| | Joint configuration vector |
| | Joint velocity vector |
| | Joint acceleration vector |
| | Joint torque vector |
| | Inverse dynamics mapping |
| Data and datasets | |
| | Trajectory dataset from a specific institute |
| | Aggregated cross-organizational dataset |
| n | Number of training runs per sweep cycle |
| Models and training | |
| | Foundation model parameters trained on the aggregated dataset |
| | Instance-specific model parameters fine-tuned from the foundation model parameters |
| H | Hyperparameter configuration (architecture, learning rate, etc.) |
| | Cross-validation loss used for model selection |
| Evaluation | |
| MAE | Mean absolute error on predicted joint torques (Nm) |
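As a concrete reading of the MAE entry, the sketch below computes the mean absolute error between predicted and measured joint torques. Averaging jointly over all samples and all joints is an assumption; the evaluation may instead report the error per joint. The function name and the example values are illustrative only.

```python
def mean_absolute_error(predicted, measured):
    """Mean absolute torque error in Nm.

    Both arguments are equal-length sequences of rows, one row per sample,
    each row holding one torque value per joint. The average is taken over
    all samples and all joints (assumption).
    """
    if not predicted or len(predicted) != len(measured):
        raise ValueError("inputs must be non-empty and of equal length")
    total, count = 0.0, 0
    for pred_row, meas_row in zip(predicted, measured):
        if len(pred_row) != len(meas_row):
            raise ValueError("rows must contain the same number of joints")
        for pred, meas in zip(pred_row, meas_row):
            total += abs(pred - meas)
            count += 1
    return total / count


# Two samples of a two-joint toy example (values in Nm, illustrative only).
predicted = [[1.0, 2.0], [3.0, 4.0]]
measured = [[1.5, 2.0], [2.0, 4.5]]
mae = mean_absolute_error(predicted, measured)
print(mae)  # 0.5
```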
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Gorissen, L.; Schneider, J.-N.; Behery, M.; Brauner, P.; Lennartz, M.; Kötter, D.; Kaster, T.; Petrovic, O.; Hinke, C.; Gries, T.; et al. Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab. Mach. Learn. Knowl. Extr. 2026, 8, 136. https://doi.org/10.3390/make8050136
APA Style: Gorissen, L., Schneider, J.-N., Behery, M., Brauner, P., Lennartz, M., Kötter, D., Kaster, T., Petrovic, O., Hinke, C., Gries, T., Lakemeyer, G., Ziefle, M., Brecher, C., & Häfner, C. (2026). Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab. Machine Learning and Knowledge Extraction, 8(5), 136. https://doi.org/10.3390/make8050136

