Article

Use-Case-Driven Architectures for Data Platforms in Manufacturing

Institute of General Mechanical Engineering, Faculty of Computer Science and Engineering Science, TH Köln University of Applied Sciences, Steinmüllerallee 1, 51643 Gummersbach, Germany
*
Author to whom correspondence should be addressed.
Platforms 2025, 3(3), 15; https://doi.org/10.3390/platforms3030015
Submission received: 27 May 2025 / Revised: 3 August 2025 / Accepted: 7 August 2025 / Published: 11 August 2025

Abstract

Since the term “Industry 4.0” was coined in 2011, machine data retrieval, storage and processing have been among the major drivers for process optimization and factory management. Data platforms have been introduced as a key resource to process, align and enhance data from machines, sensors and other sources. At the same time, different use cases and applications vary greatly in their technical demands regarding data volumes, formats, retrieval rates, scalability, latency and more. Holistic data platforms are thus often a compromise between these requirements. This contribution therefore looks into the requirements and needs of different use cases and applications for data usage in manufacturing—from factory management to process control and auxiliaries monitoring. From these use cases, specific requirements are deduced to propose architectures that reflect the needs of these distinctive applications. Based on a structured literature analysis of more than 100 scientific publications, archetypes for data management platforms in manufacturing are condensed and explained in detail. Overall, eight distinctive archetypes have been identified and clustered, with the major distinguishing feature being their reliance on extensive models or digital twins. This contribution closes with application examples for some of these archetypes.

1. Introduction

In recent years, the amount of data created globally has risen exponentially and is currently measured in zettabytes [1]. While most of this data is probably created by and for consumers, the area of manufacturing plays a minor yet important role: as a necessary raw ingredient for optimization and control, manufacturing data can contribute greatly to company and supply chain efficiency, competitiveness or other goals such as sustainability or resilience. Ever since the term “Industry 4.0” was coined in 2011, machine data retrieval, storage and processing have thus been among the major drivers for process optimization and factory management. Data pipelines and platforms have been introduced as key elements to process, align and enhance data from machines, sensors and other sources. Historically, most data sources such as automation equipment and sensors have been equipped with proprietary protocols, creating a wide variety of different data formats to be collected, aligned and stored. At the same time, different use cases and applications vary greatly in their technical demands regarding data volumes, formats, retrieval rates, robustness, latency, security and more. If introduced as a “one size fits all” solution, holistic data platforms often represent a compromise between these requirements. At the same time, different technologies, including those used for data transmission or storage, also vary greatly in their capacity and performance.
Early on, industrial and academic contributors tried to define a common high-level architecture [2]. This architecture was even transferred into the international standard IEC PAS 63088 [3]. While this helped to set overarching definitions for technical terms and a common understanding, its translation into technical building blocks remained open. Ref. [4] described an architecture enabling the so-called “Internet of Production”, which was again referenced in several publications such as [5,6]. Depicting several layers of data, this model can be used to understand data aggregation and refinement, yet it remains a high-level architecture with little technical detail. Several authors have proposed architectures for Cyber-Physical Production Systems (CPPS), such as [7,8], with the first publications going back more than 10 years (e.g., [9]). Applications ranged from scheduling [10] to the retro-fitting of existing old machines (e.g., [11]). To achieve the data integration of machines and sensors, the Asset Administration Shell (AAS) was developed as a quasi-standard for data modeling and structuring [12,13], with a focus on the integration of automation control systems, the so-called Programmable Logic Controllers (PLCs), as in [14,15]. During the last years, the term Digital Twin has gained significant attention, bringing together live data from the shop floor with models of machine or supply chain behavior, as, for example, in [16,17].
As can be seen, an abundance of architectures for data management platforms has been developed under different names to solve a variety of distinctive use cases in manufacturing. At the same time, several architectures proposed in the scientific literature seem to have significant overlap or at least similarities. From the perspective of application, no clear guidance could be identified as to when to apply which kind of architecture. This contribution thus provides an overview of different architecture archetypes and when to apply them. Based on a classification scheme, the best-fitted archetype for specific boundary conditions is identified. Thus, the added value of this contribution lies in a classification and identification scheme for distinctive archetypes for data management in manufacturing.
We start by researching the requirements and needs of different use cases and applications for data usage in manufacturing—from quality management, factory organization or resource coordination to energy optimization and condition monitoring. From these use cases, specific requirements are deduced for architectures that reflect the needs of these distinctive applications. Based on a holistic literature survey, archetypes for data management in manufacturing are condensed and classified with respect to their boundary conditions and characteristics. To further illustrate these archetypes, example cases for some of them are presented.

2. Materials and Methods

This study is based on a systematic literature review aimed at identifying the architectural characteristics required to meet the technical demands of various data-driven use cases in manufacturing. The overarching goal is to develop a decision logic that supports the context-specific selection of suitable data platform architectures. To identify the most relevant publications in a structured review, the key technical terms are introduced in Section 2.1. Afterwards, the literature review itself is presented in Section 2.2.

2.1. Key Terms and Concepts

Latency: The ability of a digital system to respond to certain inputs within a pre-defined time span with high certainty is often described as “real time”. This definition can be found in the ISO/IEC 2382 [18] and simply acknowledges the fact that latency requirements depend on the application. In the context of automated systems such as CNC mills or robots, strong latency requirements might thus translate into a few milliseconds. In the case of IT systems for business processes, weak (or relaxed) latency requirements often translate into the need to respond within several seconds [19].
Data volume: Low-volume applications in manufacturing often comprise readings from single sensors (bytes) or text protocols from controls (a few kilobytes). Applications with medium data volume might comprise pictures or program code (megabytes), while high-volume applications sometimes range into gigabytes [19]. At the same time, large numbers of low-volume data producers might add up to medium-volume data streams, as in the case of large-scale sensor networks (e.g., [20]). Sometimes, data reduction techniques such as aggregation come into play here.
Data consistency: Factories typically feature heterogeneous data sources, ranging from codified sensor data to vendor-specific automation protocols, pictures and even process management data; ensuring data consistency thus becomes a very important topic. At the same time, this requirement might conflict with data reduction techniques used to reduce volume.
Data integrity: While data consistency looks more at the availability of correct data for model building or control, integrity focuses on ensuring that data has not been manipulated. This becomes very important when sharing sensitive quality data with customers and suppliers or managing orders across a value chain.
Security: Manufacturing companies often employ data security systems in accordance with international standards such as ISO 27001 to prevent data breaches or unwanted intrusions [21]. In some cases, they might even be forced to consider tighter regulations, as in the case of the EU Cyber Resilience Act [22]. At the same time, manufacturing-specific standards such as IEC 62443 have not yet been adopted in large numbers [23].
Scalability: Scalability describes the ability of a system to grow considerably in size. In the case of manufacturing systems, this refers to the ability to seamlessly integrate new data producers such as sensors, as well as resources, machines and employees, without the need to manually change parts of the architecture.

2.2. Systematic Literature Review

The methodology follows the general principles of systematic literature reviews but does not explicitly apply formal frameworks such as PRISMA (short for ‘Preferred Reporting Items for Systematic Reviews and Meta-Analyses’) and SPAR-4-SLR (short for ‘Scientific Procedures and Rationales for Systematic Literature Reviews’). Our approach mirrors central elements of these frameworks—such as transparent filtering, clear inclusion criteria, and replicability—but deviates from them in specific areas. Since we used only the Scopus database, where a large number of relevant publications were already available, duplicate screening was not necessary. Moreover, to focus on identifying architectural requirements that are often mentioned only implicitly, we conducted a full-text analysis without an abstract-based screening in advance. This made it possible to extract nuanced technical details from a manageable number of papers, which justified the methodological divergence.
The search strategy was developed to systematically narrow down the literature corpus toward practically relevant publications at the intersection of industrial data usage and architectural design. A stepwise filtering approach was applied in the Scopus database, covering English-language publications from 2015 to 2025, including journal articles, review papers, and conference proceedings. The query was executed on 8 April 2025.
Figure 1 provides an overview of this four-step filtering process and illustrates the reduction in results across each stage.
In the first step (S1), the query combined general architectural terms with industrial manufacturing contexts: (“Data Architecture” OR “System Architecture” OR “Information Architecture”) AND (“Manufacturing” OR “Production Systems” OR “Industry 4.0” OR “Smart Manufacturing”). This resulted in 1213 initial publications.
In the second step (S2), the query was extended by integrating well-established architectural paradigms: S1 AND (“Edge Computing” OR “Cloud Computing” OR “Hybrid Architecture” OR “Data Mesh” OR “Monolithic Architecture” OR “On-Premise” OR “Digital Twin” OR “Centralized Architecture” OR “Distributed Architecture”). The inclusion of broader terms such as “Data Mesh” was intentional, as these papers often also cover concrete technical architectures that support such socio-technical concepts. This narrowed the results to 195 publications.
In the third step (S3), additional decision-relevant architectural factors were incorporated: S2 AND (“Requirements” OR “Design Criteria” OR “Influencing Factors” OR “Architecture Selection” OR “Real-time” OR “Latency” OR “Scalability” OR “Security” OR “Data Quality” OR “Complexity” OR “AI Readiness” OR “Data Privacy”). This final query yielded 123 publications, all of which were subjected to full-text review.
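For reproducibility, the three stages can be read as one nested boolean search string. The sketch below assembles this combined query as a Python constant; the term groups are taken verbatim from the steps above, whereas the TITLE-ABS-KEY field wrapper and the year limits are assumptions about how the search was entered into Scopus, not a copy of the original query log.

```python
# Illustrative reconstruction of the stepwise Scopus query (S1 -> S2 -> S3).
# Term groups are quoted from the text above; field codes and limits are assumptions.

S1 = ('("Data Architecture" OR "System Architecture" OR "Information Architecture") '
      'AND ("Manufacturing" OR "Production Systems" OR "Industry 4.0" OR "Smart Manufacturing")')

S2_EXT = ('("Edge Computing" OR "Cloud Computing" OR "Hybrid Architecture" OR "Data Mesh" '
          'OR "Monolithic Architecture" OR "On-Premise" OR "Digital Twin" '
          'OR "Centralized Architecture" OR "Distributed Architecture")')

S3_EXT = ('("Requirements" OR "Design Criteria" OR "Influencing Factors" OR "Architecture Selection" '
          'OR "Real-time" OR "Latency" OR "Scalability" OR "Security" OR "Data Quality" '
          'OR "Complexity" OR "AI Readiness" OR "Data Privacy")')

FULL_QUERY = (f'TITLE-ABS-KEY({S1} AND {S2_EXT} AND {S3_EXT}) '
              'AND PUBYEAR > 2014 AND PUBYEAR < 2026')

if __name__ == "__main__":
    print(FULL_QUERY)
```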
These 123 papers were analyzed using a table-based content analysis along pre-defined technical dimensions, including boundary conditions, databases, physical transmission technologies, communication protocols, cloud and edge components, and programming frameworks. The analysis was conducted collaboratively by three researchers and cross-validated through regular consensus sessions to ensure objectivity and consistency.
In a fourth step, each identified use case was additionally evaluated for its applicability and architectural relevance. A scoring system from 0 to 100% was applied, and only studies with a fit score of 60% or higher were retained for in-depth analysis, resulting in a final set of 61 studies. The rationale behind the 60% cutoff was based on an internal evaluation scale in 10% steps, where 50% marked the boundary between weak and adequate fit. Only papers with a use-case fit above this threshold—starting at 60%—were retained, ensuring that the final sample consisted of clearly relevant studies with sufficient architectural detail.
The evaluation focused on the practical and technically grounded description of each use case, particularly regarding data flow, system integration, and architectural components. For example, use cases without an active connection between data acquisition and processing were excluded, as such scenarios typically do not require a data platform and were therefore considered out of scope for this study.
Based on the results, thematic use-case clusters were derived inductively according to their application. In parallel, standardized architecture archetypes were extracted purely inductively from the technological configurations presented in the studies. These archetypes will be discussed in detail in a subsequent section.
The aim of this methodological approach was to identify architecture-specific requirements for various use case types and derive context-sensitive recommendations for data architecture design in manufacturing environments.
A list of all analyzed publications, including their assessed fit and detailed evaluations across relevant technical dimensions, is available via the link provided under ‘Supplementary Materials’.

3. Archetypes for Data Management Platforms in Manufacturing

The following section identifies the most relevant technical requirements and boundary conditions for data-driven use cases in manufacturing, based on the analysis of the 61 selected research papers. These include factors such as latency, data volume, and scalability. Building on this foundation, different use cases for data-driven optimization, monitoring, and control are subsequently discussed in relation to these requirements and constraints. These technical requirements and use cases are then used to cluster and organize different data management archetypes in manufacturing. Afterwards, each archetype is presented and discussed in detail.

3.1. Requirements of Archetypes

In the course of the systematic literature research, a number of highly relevant requirements and boundary conditions were identified: latency, data volume, scalability, application environment, data integrity, and the use of model-driven approaches. The characteristics and definitions of these requirements are not always formulated clearly and consistently in the literature, which can lead to confusion and misinterpretation. For this reason, definitions are formulated below that largely align with the majority of the literature reviewed.

3.1.1. Latency

Latency, real-time capability and high-frequency data transmission are mentioned as very important by nearly every analyzed paper; unfortunately, these terms are interpreted very inconsistently, as there is no exact scientific definition. While most of the 61 analyzed papers refer to low latency or real-time capabilities, only a few specify the required latency in absolute terms. In fact, the numbers quoted also vary widely: some papers, such as [24,25], describe systems that require reaction times of a few seconds or minutes; others, such as [26,27], write about milliseconds and even less. For a clear definition, this paper uses the terms ‘low latency’ and ‘strong latency requirement’ to refer to time-critical controls or data transmissions of a few milliseconds or less.

3.1.2. Data Volume (and Heterogeneity)

In most of the analyzed papers, a large amount of data was mentioned as a boundary condition. However, none of them provided specific quantitative metrics regarding the volume of data. While some papers did not really address this issue, others used data preprocessing at the edge (e.g., [28,29]) or different database technologies (e.g., [25,30]) as a solution. Along with the volume of data, the challenge of data heterogeneity or the handling of data from different sources and formats is often mentioned, as [31,32] show.

3.1.3. Scalability

Scalability is another term that is relevant for many use cases and data platforms, and it is important to consider how it is understood in these contexts. Articles such as [33,34] focus on gateways or Internet of Things (IoT) devices that enable scalable data collection and recording, referring to computational scalability. A more common description of ‘scalability’ is aligned to data throughput, emphasizing the possibilities of transmitting and storing data using data streaming (e.g., Apache Kafka [35]) or standardized transmission protocols and centralized databases; suitable examples include [24,36]. This latter understanding of scalability is used throughout the rest of this paper.

3.1.4. Application Environment

Looking at the different applications analyzed in the literature review, it is notable that some focus on fixed systems and machines, while others focus on movable entities and mobile robots. In [30,37], for example, applications focus on digital twins of machines or production lines and are fixed in nature. Such systems work closely with PLCs and therefore use wired transmission technologies such as bus systems (e.g., Profibus [38]) or Ethernet. In connection with logistics optimization, by contrast, data transmission from mobile entities is essential for creating digital twins, as can be found in [26,39,40]. These applications work with Automated Guided Vehicles (AGVs), requiring fast processing and transmission via 5G as well as tracking and tracing technologies such as Radio-Frequency Identification (RFID) and Ultra-Wideband (UWB).

3.1.5. Data Integrity

Besides the common challenge of cybersecurity—which will not be discussed in detail here—it is notable that some publications highlighted the importance of data integrity. This refers not only to the challenge of ensuring that data is correct, complete and consistent over time and across different formats, but also to protecting it against unauthorized modification. While blockchain technologies are one approach used to support tamper-proof and verifiable data records [41,42], data integrity is also addressed through other mechanisms, such as secure gateway solutions at the edge of the network [43].

3.1.6. Model-Driven Approach

The analysis of data archetypes in this literature review revealed a clear distinction: they can be broadly grouped into two categories—model-driven and non-model-driven approaches.
Model-driven approaches refer to data archetypes that are linked with advanced data analytics models or robotic control systems. These approaches are designed to enable highly analytical and autonomous systems directly on the shop floor. The data archetypes in this category are, therefore, tightly integrated with the operational logic and control mechanisms of these systems, facilitating real-time decision-making and automated actions. Their primary objective is to leverage data to drive sophisticated models that directly influence physical processes or automated functions, for example, refs. [44,45].
Conversely, non-model-driven approaches primarily focus on applications related to monitoring, data provision, and data sharing. These examples tend to operate at the management or business level, rather than directly controlling shop floor operations. While they are crucial for providing insights, supporting strategic decisions, and enabling information flow across an organization, they do not necessarily involve complex analytical models or direct robotic control. Instead, their emphasis is on ensuring data availability, consistency, and accessibility for broader oversight and collaborative purposes. Examples can be found in [46,47].

3.2. Use Cases

In recent years, the use of data management platforms in manufacturing has grown significantly. However, the objectives that companies aim to achieve with these platforms vary widely. As a result, the associated requirements and boundary conditions differ substantially across use cases. Through the literature study, several use cases and application scenarios with distinctive boundary conditions were identified. In the following sub-chapters, these are presented together with their technical characteristics regarding data volume, data variety, real-time constraints and others.

3.2.1. Factory Management and Coordination

Factory management typically means the coordination of people, resources and assets on the scale of complete supply chains or at least several machines and processes. Typically, these stretch across a medium to large physical space. Due to different vendors and purchasing periods of equipment, data formats and technologies may vary. At the same time, data volume can be considered medium to high, depending on the amount of data provided by single resources. Real-time constraints are typically relaxed (several seconds or even minutes), as people are in the loop to make management decisions.

3.2.2. Process Management and Monitoring

Monitoring and assessing single machines or processes is relevant in areas such as quality or operations management as well as planning and scheduling. The amount of data can be medium to high, for example, when data is taken from several sensors and PLCs to give a holistic assessment of the current situation of an asset. At the same time, extensive process models are sometimes used for predictions. Real time constraints for process management and monitoring are typically more relaxed, as human operators still make decisions and thus are part of a feedback cycle.

3.2.3. Process Control of Robots and Machines

As opposed to process monitoring or management, process control of robots and machines requires an active response to control a technical process and handle deviations and disturbances. With a focus on a small number of processes (or even just one), the data volume is typically lower than in other use cases. At the same time, real-time requirements are usually stronger due to short reaction times. However, depending on the process, these vary greatly. Motion control, as in robots, requires responses in the magnitude of milliseconds, while batch processes, as in plastic molding, might require an update only once per batch or cycle, thus leaving several seconds for a response. Process control typically goes hand in hand with extensive process models on which to make predictions and draw conclusions.

3.2.4. Management of Auxiliaries

In recent years, the consumption of auxiliaries such as water, electrical energy or pressurized air has gained in importance due to rising prices as well as a shift in political and regulatory focus. Thus, monitoring and managing these has become a task for manufacturing companies. Typically, monitoring the consumption is a first step towards later improvements. Typical meters provide small amounts of data (several bytes per measurement) spread over relatively long periods of time. For example, in energy management, the fifteen-minute billing interval used by electricity providers leads to measurement intervals of one minute or more. Thus, real-time constraints are very relaxed.

3.3. Clustering of Archetypes

Based on the requirements derived in Section 3.1 and the use cases identified in Section 3.2, the research team has defined and clustered eight generic archetypes. The main differentiation is between non-model-driven and model-driven approaches, which guides the subsequent clustering. A detailed description of these archetypes follows in Section 3.4.

3.3.1. Archetypes Without Model-Driven Approaches

Drawing on the previously defined requirements, four generic archetypes have been identified that primarily support monitoring, data provision, and data sharing within data management platforms. While model-driven approaches are possible in these contexts and are sometimes partially utilized, these archetypes do not primarily focus on complex analytical models or direct robotic control. Clustering these archetypes without models was the initial step. One archetype—blockchain—stood clearly apart from the others, as data integrity was the driver in the publications addressing it. Several distinguishing characteristics were tested to separate the remaining architectures, from the amount of data handled to latency requirements, safety, scalability and more. The authors found it most effective to consider three key characteristics—data volume, scalability and data integrity—each divided into low and high. As shown in Figure 2, this three-dimensional classification provides a clear framework for understanding the varying technical demands and operational priorities of these diverse applications.

3.3.2. Archetypes with Model-Driven Approaches

Using the identified use cases and derived requirements as a basis, the research outlines four separate model-driven archetypes for data management platforms in manufacturing. What stands out is that the systematic classification based on latency requirements (ranging from weak to strong) and the application environment (fixed vs. mobile) proves to be particularly useful. This two-dimensional classification matrix provides a clear and effective categorization of diverse operational characteristics, offering a robust framework for understanding the unique demands of each archetype, as illustrated in Figure 3.
In relation to these four archetypes, digital twins are often employed. If there are strong latency requirements, one thing in particular should be emphasized: edge–cloud dualism. The advantage is that time-critical controls and the execution of data analysis models can be carried out directly on the machines at shop floor level, which usually have a direct connection to the PLC. Edge devices are used for this purpose and also enable data pre-processing. A central cloud or server operates as a service coordinator and supervisor where, among various other functions, data analysis models can be created, trained and optimized. An overview of the edge–cloud-dualized architecture is provided in Figure 4, alongside descriptions of it in articles such as [27,28].
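To make the two classification schemes more tangible, the following sketch encodes them as a small selection function. The two-dimensional mapping of the model-driven archetypes follows Figure 3 and Sections 3.4.5, 3.4.6, 3.4.7 and 3.4.8 directly; the branch for the non-model-driven archetypes is a simplified reading of Figure 2 and Sections 3.4.1, 3.4.2, 3.4.3 and 3.4.4, and should be treated as illustrative rather than normative.

```python
# Illustrative decision logic for the eight archetypes (simplified; see Figures 2 and 3).

MODEL_DRIVEN = {
    # (latency requirement, application environment) -> archetype
    ("weak", "fixed"):    "Management and Monitoring",
    ("weak", "mobile"):   "Resource Orchestration",
    ("strong", "fixed"):  "Process Control",
    ("strong", "mobile"): "Large-Scale, Time-Critical Control",
}

def select_archetype(model_driven: bool, latency: str = "weak", environment: str = "fixed",
                     data_volume: str = "low", scalability: str = "low",
                     data_integrity: str = "low") -> str:
    """Return a plausible archetype for a use case (illustrative only)."""
    if model_driven:
        return MODEL_DRIVEN[(latency, environment)]
    # Non-model-driven branch: pick the archetype whose dominant property matches.
    if data_integrity == "high":
        return "Blockchain"                 # tamper-proof, low-volume use cases
    if data_volume == "high" and scalability == "high":
        return "Streaming"                  # high-throughput ingestion pipelines
    if data_volume == "high":
        return "Gateway"                    # local preprocessing, limited scalability
    return "Low Energy, Wide Range"         # sparse, battery-powered sensing

# Example: a fixed robot cell with millisecond reaction times and process models
print(select_archetype(model_driven=True, latency="strong", environment="fixed"))
# -> Process Control
```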

3.4. Archetypes in Detail

Based on a total of 61 relevant publications identified in the literature review, the following sub-sections explore each architecture type in depth, drawing on the technological characteristics identified during the literature analysis. While the number of sources per archetype and per feature is often limited, these patterns still allow for meaningful classification, as the extracted characteristics are both consistent across cases and representative of core design decisions. This reflects the high diversity of use cases, but also the methodological decision to include only clearly assignable and explicitly described technological features per archetype.

3.4.1. Gateway

Gateway-based architectures represent a lightweight and modular approach to industrial data acquisition, particularly suited for decentralized environments with heterogeneous device landscapes and moderate real-time demands. A total of 13 publications implement a gateway pattern, characterized by intermediary nodes that mediate between field-level devices and higher-level systems. These nodes—typically implemented via Raspberry Pi [34,48] and comparable microcontrollers—facilitate protocol translation, local preprocessing, and secure transmission, often via Message Queueing Telemetry Transport (MQTT) and Open Platform Communications Unified Architecture (OPC UA) [33,49]. The gateway architecture is illustrated in Figure 5.
Common use cases include sensor integration for condition monitoring [34] and visualization dashboards [50]. Several implementations rely on edge-level inference using lightweight machine learning models [43], while others focus on reliable logging and visualization without local analytics. The architectural role is often passive or semi-active, with limited capacity for model-based control or autonomous decision-making.
Overall, gateway systems are primarily deployed in small to mid-sized setups, where low integration effort, cost-efficiency, and compatibility with legacy devices are critical. Architecturally, they are not very scalable due to the technical limitations of the edge devices used, but they are suitable for large volumes of data. This archetype is defined by its intermediary role between field and platform layers, using lightweight hardware, open communication protocols, and limited edge intelligence for basic preprocessing and integration tasks.
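A minimal sketch of such a gateway node is given below: it polls a value from an OPC UA server, applies light preprocessing, and republishes the reading via MQTT. The python-opcua and paho-mqtt libraries as well as the endpoint address, node ID, broker and topic are assumptions for illustration; the sketch reflects the intermediary role described above rather than any specific implementation from the reviewed studies.

```python
# Minimal gateway sketch: poll an OPC UA node, preprocess, forward via MQTT.
# Endpoint URL, node ID, broker host and topic are placeholders.
import json
import time

from opcua import Client            # python-opcua
import paho.mqtt.client as mqtt     # paho-mqtt (v1.x callback API)

OPCUA_ENDPOINT = "opc.tcp://192.168.0.10:4840"
NODE_ID = "ns=2;s=Machine1.SpindleTemperature"
BROKER, TOPIC = "broker.factory.local", "factory/machine1/spindle_temperature"

opc = Client(OPCUA_ENDPOINT)
opc.connect()
bus = mqtt.Client()
bus.connect(BROKER, 1883)
bus.loop_start()

try:
    while True:
        raw = opc.get_node(NODE_ID).get_value()
        # Light edge preprocessing: round and timestamp the reading.
        payload = {"value": round(float(raw), 2), "ts": time.time()}
        bus.publish(TOPIC, json.dumps(payload), qos=1)
        time.sleep(1.0)              # relaxed latency: one reading per second
finally:
    opc.disconnect()
    bus.loop_stop()
```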

3.4.2. Low Energy, Wide Range

The low-energy, wide-range archetype comprises architectures that emphasize energy-efficient, long-range communication protocols such as Long Range Wide Area Network (LoRaWAN) and Zigbee. These systems typically target non-critical, large-area deployments in industrial settings where real-time constraints are relaxed. Five studies implement such architectures [25,36,46,47,51].
A typical application scenario is environmental or infrastructure monitoring as described in [46]. Such setups frequently combine time-series databases (such as InfluxDB [52]) with lightweight protocols like MQTT and Representational State Transfer Application Programming Interfaces (REST APIs), as seen in [36]. Due to the reliance on low-bandwidth communication, data is often transmitted at regular intervals rather than continuously, making these architectures suboptimal for high-frequency analytics or safety-critical control.
Architecturally, the edge is often minimal, with most processing tasks shifted to centralized platforms or clouds [36]. This communication and architecture pattern is illustrated in Figure 6, which shows temperature and infrastructure sensors connected via LoRaWAN. The systems primarily collect and forward data for visualization, condition monitoring, or usage analysis. As such, data volume is moderate to low, while data integrity mechanisms are usually absent or basic compared to other non-model-driven archetypes such as blockchain, which emphasize strong integrity guarantees as discussed in Section 3.4.3. This archetype is clearly characterized by low-frequency data capture, minimal edge capabilities, and the exclusive use of long-range, low-power protocols such as LoRaWAN or Zigbee.
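On the platform side, such a setup can be reduced to subscribing to the broker and persisting each reading in a time-series database. The sketch below assumes the paho-mqtt and influxdb-client packages with placeholder broker, topic, credentials and measurement names; it only illustrates the low-frequency collect-and-forward pattern described above.

```python
# Sketch: persist low-frequency sensor readings (forwarded via MQTT) in InfluxDB.
# Broker, topic, URL, token, org and bucket are placeholders.
import json

import paho.mqtt.client as mqtt
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

influx = InfluxDBClient(url="http://localhost:8086", token="my-token", org="factory")
write_api = influx.write_api(write_options=SYNCHRONOUS)

def on_message(client, userdata, msg):
    reading = json.loads(msg.payload)        # e.g. {"sensor": "hall-01", "temperature": 21.4}
    point = (Point("building_climate")
             .tag("sensor", reading["sensor"])
             .field("temperature", float(reading["temperature"])))
    write_api.write(bucket="building", record=point)

bus = mqtt.Client()
bus.on_message = on_message
bus.connect("broker.factory.local", 1883)
bus.subscribe("building/+/climate")
bus.loop_forever()                           # readings arrive roughly once per minute
```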

3.4.3. Blockchain

Blockchain architectures are designed to provide tamper-proof, transparent, and decentralized data management, making them particularly attractive in collaborative manufacturing networks where data provenance and auditability are essential. Four studies implemented blockchain-based solutions in industrial settings [41,42,53,54].
These architectures typically rely on distributed ledger technologies such as Hyperledger Fabric, paired with WORM (short for ‘Write Once Read Many’) storage [53], and are often integrated with edge or fog computing layers to compensate for blockchain’s latency and throughput limitations [41,42]. While peer-to-peer communication protocols and consensus mechanisms ensure data integrity and non-repudiation, they impose performance trade-offs that restrict the applicability of blockchain to low-data-volume, high-integrity use cases, such as traceability documentation.
Architectural implementations are often hybrid: edge devices such as Raspberry Pi [55] and Jetson boards [56] act as blockchain clients [41], while cloud or fog nodes assume ledger synchronization and consensus roles. The systems evaluated are less performant in their real-time capability than other standard architectures but show strong guarantees in data authenticity and immutability, as illustrated by [53] in scenarios requiring long-term traceability across supply chains. Blockchain-based architectures stand out by combining distributed consensus mechanisms with immutable data storage, ensuring tamper-proof documentation and verifiable data exchange across multiple stakeholders, as exemplified in Figure 7.
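The core integrity guarantee of this archetype, namely that each record cryptographically commits to its predecessor, can be illustrated without a full ledger stack. The following deliberately simplified hash chain in plain Python shows why tampering with an earlier record becomes detectable; production systems in the reviewed studies rely on distributed ledgers such as Hyperledger Fabric with consensus among multiple peers, which this sketch does not attempt to reproduce.

```python
# Simplified hash chain illustrating tamper-evident traceability records.
# Real deployments use a distributed ledger (e.g., Hyperledger Fabric), not this.
import hashlib
import json

def make_block(record: dict, previous_hash: str) -> dict:
    body = {"record": record, "previous_hash": previous_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify_chain(chain: list) -> bool:
    for i, block in enumerate(chain):
        body = {"record": block["record"], "previous_hash": block["previous_hash"]}
        if block["hash"] != hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest():
            return False                     # block content was modified
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False                     # chain linkage broken
    return True

chain = [make_block({"order": "A-104", "step": "milling", "result": "pass"}, previous_hash="0" * 64)]
chain.append(make_block({"order": "A-104", "step": "inspection", "result": "pass"}, chain[-1]["hash"]))

print(verify_chain(chain))                   # True
chain[0]["record"]["result"] = "fail"        # tampering with an earlier record...
print(verify_chain(chain))                   # ...is detected: False
```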

3.4.4. Streaming

Streaming architectures enable continuous, near real-time ingestion and processing of high-frequency data streams in industrial contexts. Despite being supported by only two sources [31,57], this archetype represents a distinct and theoretically coherent architectural pattern. Its empirical limitations are acknowledged, but its inclusion is justified by the clarity and consistency of its defining technological characteristics.
Typically, these systems are employed where process data must be acquired and analyzed in parallel—for example, for scientific workflow orchestration [31]. The architectural backbone often includes distributed data flow engines such as Spark [58], storage systems such as the Hadoop Distributed File System (HDFS) [59], or actor-based workflow environments such as Kepler [60]. Communication is facilitated via standard protocols (e.g., OPC UA, MTConnect, Hypertext Transfer Protocol Secure (HTTPS)), and data is either directly streamed to cloud-based platforms or routed through intermediate orchestration layers, as also reflected in the producer–consumer setup shown in Figure 8.
The streaming paradigm supports high-volume, high-velocity applications with tight temporal coupling, meaning minimal delay between data capture and analysis. However, latency constraints vary depending on the process. From an architectural classification perspective, streaming systems offer the processing of large volumes of data coupled with high scalability. Yet this benefit comes at the cost of lower data integrity. Streaming architectures are uniquely defined by their continuous data ingestion pipelines, distributed processing layers, and high-throughput capabilities, tailored for time-critical and data-intensive applications.
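As a brief sketch of the producer side of such a pipeline, the snippet below pushes high-frequency process samples into a Kafka topic using the kafka-python client; the broker address, topic name and the simulated signal are placeholders, and the downstream consumer (e.g., an analytics or Spark job) is only hinted at in the comments.

```python
# Sketch of the producer side of a streaming pipeline using kafka-python.
# Broker address, topic name and the data source are placeholders.
import json
import random
import time

from kafka import KafkaProducer    # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="kafka.factory.local:9092",
    value_serializer=lambda record: json.dumps(record).encode("utf-8"),
)

while True:
    # Stand-in for a high-frequency process signal (e.g., spindle load sampled at 100 Hz).
    sample = {"machine": "mill-07", "signal": "spindle_load",
              "value": random.random(), "ts": time.time()}
    producer.send("process-signals", value=sample)
    time.sleep(0.01)

# A consumer (e.g., a Spark or analytics job) would subscribe to the same topic:
# KafkaConsumer("process-signals", bootstrap_servers="kafka.factory.local:9092")
```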

3.4.5. Management and Monitoring

The archetype for digital twins in factory management and monitoring mostly focuses on fixed systems with relaxed latency requirements. These architectures are primarily used for tasks such as planning, scheduling, quality management and operations management, thus referring to the use case outlined in Section 3.2.2. Five studies clearly belong to this category: [61,62,63,64,65].
The technical implementation often involves edge devices such as Raspberry Pis [55] (as indicated in [61,62,63]) for localized data capture and preprocessing. For broader storage, analytics, and the deployment of machine learning models, cloud platforms (e.g., [62,63]) are commonly employed. Data persistence is managed through various database technologies, including both traditional SQL databases (as widely seen in [65]) and Not only Structured Query Language (NoSQL) solutions (e.g., [62]). Communication between these components typically leverages lightweight protocols like MQTT or REST APIs [61,62], with OPC UA also being a common standard for industrial interoperability [63]. This integration of data sources into factory management models is schematically outlined in Figure 9.
These architectures are characterized by their support for fixed systems with relaxed latency demands, using hybrid storage, moderate edge processing, and industry-standard protocols for factory-wide transparency and coordination.

3.4.6. Resource Orchestration

This archetype, focusing on resource orchestration, also exhibits relaxed real-time requirements, similar to the “Management and Monitoring” archetype. However, its distinctive feature lies in its emphasis on large-scale systems involving mobile entities and the efficient management of dynamic resources. Relevant literature supporting this archetype includes sources [40,66,67,68].
These systems are typically used for logistics planning, scheduling, and dynamic coordination tasks, including the simulation of production and supply chains (e.g., [66,67]). A representative overview of how spatially distributed assets are coordinated is provided in Figure 10.
Technologically, the utilization of SQL databases is a common characteristic. Crucially, these systems leverage various spatial positioning technologies, such as RFID, Bluetooth Low Energy (BLE), UWB, barcodes, and the Global Positioning System (GPS), to track and manage mobile assets. The data generated by these entities is typically transmitted wirelessly via Wireless Fidelity (WiFi) and 4G networks (supported by [40,67,68]). Resource orchestration architectures combine flexible connectivity and localization technologies to manage dynamic, mobile elements across large-scale industrial environments with moderate latency demands. Orchestrating and managing resources often also entails the necessity to deal with data veracity, meaning that decisions must be made based on contradictory information or demands from these resources.
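A minimal sketch of the data side of such orchestration is shown below: position updates from mobile assets, for example from UWB or RFID readers, are stored in a relational table, and the latest known position per asset is queried as input for dispatching or simulation. The table layout and identifiers are illustrative assumptions and not taken from the cited studies.

```python
# Sketch: store position updates of mobile assets and query their latest known position.
# Uses SQLite for brevity; the reviewed systems typically use a server-based SQL database.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE asset_position (
                  asset_id TEXT, x_m REAL, y_m REAL, source TEXT, ts REAL)""")

updates = [
    ("agv-01", 12.4, 3.1, "UWB", 1000.0),
    ("agv-02", 40.2, 7.8, "RFID", 1000.5),
    ("agv-01", 13.0, 3.4, "UWB", 1001.0),    # newer fix supersedes the first one
]
db.executemany("INSERT INTO asset_position VALUES (?, ?, ?, ?, ?)", updates)

# Latest position per asset, e.g., as input for dispatching or simulation.
latest = db.execute("""SELECT asset_id, x_m, y_m, MAX(ts)
                       FROM asset_position GROUP BY asset_id""").fetchall()
for asset_id, x, y, ts in latest:
    print(f"{asset_id}: ({x} m, {y} m) at t={ts}")
```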

3.4.7. Process Control

The archetype for process control demands strong latency requirements despite primarily operating with fixed machines and robots. It directly refers to the process control use case formulated in Section 3.2.3, focusing on active responses to handle deviations and disturbances. These systems are frequently employed in autonomous robotic control systems, where immediate reactions are paramount. Therefore, such architectures typically leverage the advantages of the edge–cloud dualism, as explained in Section 3.3.2. A total of 12 papers were identified in the literature review that fit very well into this category: [27,28,30,44,69,70,71,72,73,74,75,76].
Structured data is usually stored in SQL databases (e.g., [27,71,75]), complemented by NoSQL databases such as MongoDB [77] (e.g., [30,71]) and graph databases like Neo4j [78] for semantic knowledge representation (e.g., [27]). The edge layer predominantly uses wired Ethernet connections (e.g., [44,69,71]) to robot controllers to achieve low latency. Communication protocols typically include REST APIs (e.g., [28,69]) and OPC UA for PLC access (e.g., [73]), while 5G wireless connectivity is occasionally used (e.g., [70,73]). Data is transferred to the cloud via MQTT [27].
These architectures are defined by high-frequency edge processing, direct PLC integration, and deterministic low-latency communication between industrial resources and control systems to enable sub-second control, as exemplified in Figure 11, which illustrates a manufacturing resource connected to an edge computing device using fieldbus protocols such as EtherCAT.
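The edge–cloud split typical of this archetype can be sketched as a fast local loop that reacts to a PLC value directly on the edge device, while only aggregated status is forwarded to the cloud. The snippet below assumes the python-opcua and paho-mqtt libraries, placeholder node IDs, endpoints and a hypothetical threshold rule; in real deployments, the hard real-time part of the reaction remains on the PLC or a deterministic runtime rather than in Python.

```python
# Sketch of an edge control loop: react locally to an OPC UA process value,
# forward only aggregated status to the cloud via MQTT. Node IDs, threshold
# and endpoints are illustrative placeholders.
import json
import time

from opcua import Client
import paho.mqtt.client as mqtt

plc = Client("opc.tcp://192.168.0.20:4840")
plc.connect()
force_node = plc.get_node("ns=2;s=Press1.ForceKN")
stop_node = plc.get_node("ns=2;s=Press1.StopCommand")

cloud = mqtt.Client()
cloud.connect("broker.factory.local", 1883)
cloud.loop_start()

FORCE_LIMIT_KN = 250.0
last_upload = 0.0

while True:
    force = force_node.get_value()
    if force > FORCE_LIMIT_KN:              # local, low-latency reaction on the edge
        stop_node.set_value(True)
    if time.time() - last_upload > 5.0:     # relaxed, aggregated reporting to the cloud
        cloud.publish("factory/press1/status", json.dumps({"force_kn": force}), qos=0)
        last_upload = time.time()
    time.sleep(0.01)                        # ~10 ms loop; hard real time stays on the PLC
```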

3.4.8. Large-Scale, Time-Critical Control

This archetype represents highly demanding scenarios requiring both very low latency for real-time transmission and the ability to manage mobile entities throughout the factory. Applications include logistics and safety-critical systems involving human–machine interaction, such as AGV control [26], collision detection [20], worker-aware systems [32], and mobile robotic platforms [79].
Edge–cloud dual architectures, as also seen in the “Process Control” archetype, are typical in this context. Additionally, high-performance wireless communication—especially 5G as noted by [26]—is essential to meet latency requirements. As with “Resource Orchestration”, positioning technologies like RFID or UWB are frequently used for precise localization [32]. OPC UA is commonly used for control system connectivity, while MQTT supports data exchange [20,32]. These architectures often employ a combination of various database types. For example, Ref. [32] utilizes a Neo4j graph database for instance data and local storage, alongside MySQL for shipment data.
These architectures uniquely combine real-time edge analytics, mobile coordination, and ultra-low latency communication to ensure responsive, safe operations in highly dynamic production environments, as visualized in Figure 12.

4. Example Cases

In the following chapter, examples are provided for some of the archetypes described above. These example cases provide better insight into the different architectures, their respective applications and boundary conditions. All example cases have been taken from the InnovationHub Bergisches Rheinland in Gummersbach and were developed under the research grant mentioned at the end of this contribution.

4.1. Gateway for Computerized Numerical Control Data

Computerized Numerical Control (CNC) mills are widely used in manufacturing, especially for metal cutting. In Germany and other European countries, controls provided by Siemens are prevalent, but other manufacturers such as Rockwell, Heidenhain and Mitsubishi also hold a relevant market share. In the case of our model factory, we use a DMG Mori 3-axis CNC mill [80] with a Siemens control system [81] that is able to act as an OPC UA server. To extract data from the CNC and to pre-process, accumulate and push this information to a database, a Raspberry Pi 4 (8 GB) [82] running Node-RED [83] is used.
The data from the CNC is pushed to a time-series database, in our case an InfluxDB [52] cloud installation hosted by the provider. Data is requested and processed once per second and used to gain insights into classical operations KPIs such as process time and Overall Equipment Effectiveness (OEE). Dashboards for machine surveillance were developed in Grafana [84] and are currently displayed in physical proximity to the CNC. Theoretically, the dashboards could be displayed anywhere, as the data is stored in a cloud environment and latency requirements are in the range of several seconds. This classical gateway architecture is typical for such use cases, moderately easy to install and relatively cheap. All software components are open source, and the required hardware costs less than EUR 100.
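As a brief illustration of how such dashboard KPIs are derived from the collected machine states, the snippet below computes OEE in the standard way from availability, performance and quality; the numbers are invented and the calculation is a sketch, not an excerpt from the dashboards themselves.

```python
# Standard OEE calculation from aggregated machine data (numbers are invented).

planned_time_min = 480          # one shift
downtime_min = 45               # derived from machine status signals
ideal_cycle_time_s = 30         # per part
parts_produced = 700
parts_good = 680

availability = (planned_time_min - downtime_min) / planned_time_min
performance = (ideal_cycle_time_s * parts_produced) / ((planned_time_min - downtime_min) * 60)
quality = parts_good / parts_produced

oee = availability * performance * quality
print(f"Availability {availability:.2%}, Performance {performance:.2%}, "
      f"Quality {quality:.2%}, OEE {oee:.2%}")
```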

4.2. Low Energy, Wide Range for Building Resource Monitoring

The model factory in Gummersbach has been equipped with a system for monitoring the energy and resource consumption of the building. In our specific case, a network of LoRaWAN-based sensors has been installed. The overall energy consumption is monitored, as well as the status of several windows and doors (open/closed) and the humidity and temperature of the main hall. All sensors transmit their data packets via the MQTT protocol to a central antenna, with a proven range of up to 100 m. As data is collected and handled only once per minute and the overall energy consumption of the physical transmission is low, all sensors are battery-operated and last for several years.
From the antenna, all data is pushed to a cloud-deployed back-end employing a time-series database as well as dashboards based on Grafana [84]. As was the case with the other architecture presented before, this dashboard is available globally.

5. Discussion

This paper aimed to identify use-case-driven requirements for industrial data architectures and derive context-specific archetypes based on a systematic literature analysis. The motivation was rooted in the observation that manufacturing use cases—ranging from factory management to predictive maintenance and real-time control—entail vastly different demands regarding latency, data volume, data integrity, scalability, and architectural complexity. By analyzing 61 publications and clustering technical configurations into eight archetypes, the study offers a structured framework for understanding and evaluating architectural choices in industrial contexts.
The archetypes developed in this study are not just descriptive patterns—they highlight recurring technological trade-offs and reflect functional priorities across industrial scenarios. Model-driven archetypes emphasize edge–cloud dualism and latency-critical capabilities, making them essential for control-related use cases. Non-model-driven archetypes, in contrast, prioritize scalability, broad data accessibility, or integrity over real-time responsiveness, and are typically used in applications where reactive control is not required. This conceptual distinction allows practitioners to navigate the complexity of platform design by aligning requirements with feasible architectural models.
A key insight is that no universal architecture satisfies all boundary conditions. This becomes particularly evident when considering non-model-driven archetypes, which inherently reflect necessary compromises between data volume, latency, integrity, and implementation effort. For example, the blockchain archetype ensures tamper-proof logging but is limited in throughput, while streaming architectures offer high scalability at the cost of data integrity. These archetypes are each optimized for specific objectives—such as trust, volume, or reach—but cannot meet all requirements simultaneously. As a result, future data platforms in manufacturing will likely require hybrid or layered combinations of these non-model-driven archetypes to achieve broader applicability and technical balance.
Beyond the technical classification, this study supports strategic decision-making in manufacturing IT. The archetypes provide orientation for both greenfield implementations and incremental modernization strategies, where legacy constraints or application-specific priorities shape the technical solution space.
Future work should expand on several fronts. First, the integration of hybrid architectures that combine the strengths of different archetypes appears promising. Secondly, the development of a standardized architecture archetype that balances high scalability, large data volume handling, and strong data integrity in non-model-driven contexts should be explored. This would address the critical trade-offs highlighted in Figure 2 and support broader applicability of data platforms in manufacturing. Third, future work could explore how modular and reconfigurable architectures can be designed to allow for incremental expansion or adaptation as manufacturing requirements evolve—without necessitating a complete system overhaul.
In addition, a longitudinal evaluation of archetype implementations across industrial sectors could help verify the generalizability and refine the classification logic over time.

6. Conclusions

This paper systematically classified eight archetypes for data platform architectures in manufacturing, based on a review of 61 peer-reviewed publications. The classification differentiates between model-driven and non-model-driven architectures, each aligned with specific use-case characteristics and technical requirements.
Key results include:
  • A clear distinction between latency-critical architectures (e.g., process control) and data-sharing-oriented architectures (e.g., blockchain, gateway);
  • A three- and two-dimensional framework for organizing archetypes according to requirements such as data integrity, scalability, latency, and mobility;
  • The identification of trade-offs between architectural properties, illustrating that hybrid or modular solutions may be necessary in real-world settings.
This typology offers both researchers and practitioners a structured reference for aligning technical choices with industrial needs. Future work should focus on hybrid archetype combinations and empirically validating implementation outcomes.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/platforms3030015/s1. Table S1: List of available and analyzed literature including the specific evaluations in the relevant aspects.

Author Contributions

Conceptualization, E.P.; methodology, T.K.; formal analysis, E.P., C.W. and T.K.; data curation, T.K.; writing—original draft preparation, E.P., C.W. and T.K.; writing—review and editing, E.P., C.W. and T.K.; visualization, C.W. and T.K.; supervision, E.P.; funding acquisition, E.P. All authors have read and agreed to the published version of the manuscript.

Funding

The authors would like to thank the European Union and the Ministry of Economic Affairs, Innovation, Digitalization and Energy of the state of North Rhine-Westphalia for their support of the project InnoFaktur under the grant EFRE 205000002.

Institutional Review Board Statement

This study was conducted under the research permission from the research ethics council of the TH Köln University of Applied Sciences following a self-assessment.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data relevant to the study are included in the article or uploaded as online Supplemental Information.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AAS: Asset Administration Shell
AGV: Automated Guided Vehicle
API: Application Programming Interface
BLE: Bluetooth Low Energy
CNC: Computerized Numerical Control
CPPS: Cyber-Physical Production Systems
e.g.: For Example
GPS: Global Positioning System
HTTPS: Hypertext Transfer Protocol Secure
IoT: Internet of Things
KPI: Key Performance Indicator
LoRaWAN: Long Range Wide Area Network
MQTT: Message Queueing Telemetry Transport
NoSQL: Not only Structured Query Language
OEE: Overall Equipment Effectiveness
OPC UA: Open Platform Communications Unified Architecture
PLC: Programmable Logic Controller
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
REST: Representational State Transfer
RFID: Radio-Frequency Identification
SPAR-4-SLR: Scientific Procedures and Rationales for Systematic Literature Reviews
SQL: Structured Query Language
UWB: Ultra-Wideband
WiFi: Wireless Fidelity
WORM: Write Once Read Many

References

  1. Taylor, P. Volume of Data/Information Created, Captured, Copied, and Consumed Worldwide from 2010 to 2023, with Forecasts from 2024 to 2028. Available online: https://www.statista.com/statistics/871513/worldwide-data-created/ (accessed on 27 May 2025).
  2. Hankel, M. The Reference Architectural Model Industrie 4.0 (RAMI 4.0); ZVEI—German Electrical and Electronic Manufacturers’ Association: Frankfurt am Main, Germany, 2015; Available online: https://www.zvei.org/fileadmin/user_upload/Presse_und_Medien/Publikationen/2015/april/Das_Referenzarchitekturmodell_Industrie_4.0__RAMI_4.0_/ZVEI-Industrie-40-RAMI-40-English.pdf (accessed on 27 May 2025).
  3. International Electrotechnical Commission. Smart Manufacturing: Reference Architecture Model Industry 4.0 (RAMI4.0), 1st ed.; IEC PAS 63088; International Electrotechnical Commission: Geneva, Switzerland, 2017. [Google Scholar]
  4. Pennekamp, J.; Glebke, R.; Henze, M.; Meisen, T.; Quix, C.; Hai, R.; Gleim, L.; Niemietz, P.; Rudack, M.; Knape, S.; et al. Towards an Infrastructure Enabling the Internet of Production. In Proceedings of the 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS), Taipei, Taiwan, 6–9 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 31–37, ISBN 978-1-5386-8500-6. [Google Scholar]
  5. Schuh, G.; Prote, J.-P.; Gützlaff, A.; Thomas, K.; Sauermann, F.; Rodemann, N. Internet of Production: Rethinking production management. In Production at the Leading Edge of Technology; Wulfsberg, J.P., Hintze, W., Behrens, B.-A., Eds.; Springer: Berlin/Heidelberg, Germany, 2019; pp. 533–542. ISBN 978-3-662-60416-8. [Google Scholar]
  6. Brecher, C.; Padberg, M.; Jarke, M.; van der Aalst, W.; Schuh, G. The Internet of Production: Interdisciplinary Visions and Concepts for the Production of Tomorrow. In Internet of Production; Brecher, C., Schuh, G., van der Aalst, W., Jarke, M., Piller, F.T., Padberg, M., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 1–12. ISBN 978-3-030-98062-7. [Google Scholar]
  7. Cardin, O. Classification of cyber-physical production systems applications: Proposition of an analysis framework. Comput. Ind. 2019, 104, 11–21. [Google Scholar] [CrossRef]
  8. Uhlemann, T.H.-J.; Lehmann, C.; Steinhilper, R. The Digital Twin: Realizing the Cyber-Physical Production System for Industry 4.0. Procedia CIRP 2017, 61, 335–340. [Google Scholar] [CrossRef]
  9. Monostori, L. Cyber-physical Production Systems: Roots, Expectations and R&D Challenges. Procedia CIRP 2014, 17, 9–13. [Google Scholar] [CrossRef]
  10. Rossit, D.A.; Tohmé, F.; Frutos, M. Production planning and scheduling in Cyber-Physical Production Systems: A review. Int. J. Comput. Integr. Manuf. 2019, 32, 385–395. [Google Scholar] [CrossRef]
  11. Lins, T.; Oliveira, R.A.R. Cyber-physical production systems retrofitting in context of industry 4.0. Comput. Ind. Eng. 2020, 139, 106193. [Google Scholar] [CrossRef]
  12. Wei, K.; Sun, J.Z.; Liu, R.J. A Review of Asset Administration Shell. In Proceedings of the 2019 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Macao SAR, China, 15–18 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1460–1465, ISBN 978-1-7281-3804-6. [Google Scholar]
  13. Tantik, E.; Anderl, R. Integrated Data Model and Structure for the Asset Administration Shell in Industrie 4.0. Procedia CIRP 2017, 60, 86–91. [Google Scholar] [CrossRef]
  14. Cavalieri, S.; Salafia, M.G. Asset Administration Shell for PLC Representation Based on IEC 61131–3. IEEE Access 2020, 8, 142606–142621. [Google Scholar] [CrossRef]
  15. Wenger, M.; Zoitl, A.; Muller, T. Connecting PLCs With Their Asset Administration Shell For Automatic Device Configuration. In Proceedings of the 2018 IEEE 16th International Conference on Industrial Informatics (INDIN), Porto, Portugal, 18–20 July 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 74–79, ISBN 978-1-5386-4829-2. [Google Scholar]
  16. Ding, K.; Chan, F.T.; Zhang, X.; Zhou, G.; Zhang, F. Defining a Digital Twin-based Cyber-Physical Production System for autonomous manufacturing in smart shop floors. Int. J. Prod. Res. 2019, 57, 6315–6334. [Google Scholar] [CrossRef]
  17. Soori, M.; Arezoo, B.; Dastres, R. Digital twin for smart manufacturing, A review. Sustain. Manuf. Serv. Econ. 2023, 2, 100017. [Google Scholar] [CrossRef]
  18. ISO/IEC 2382:2015; Information Technology—Vocabulary. ISO/IEC JTC 1: Geneva, Switzerland, 2015.
  19. Brecher, C.; Weck, M. Machine Tools Production Systems 3; Springer Fachmedien: Wiesbaden, Germany, 2022; ISBN 978-3-658-34621-8. [Google Scholar]
  20. Yang, C.; Yu, H.; Zheng, Y.; Ala-Laurinaho, R.; Feng, L.; Tammi, K. Towards Human-Centric Manufacturing: Leveraging Digital Twin for Enhanced Industrial Processes. In Proceedings of the IECON 2024—50th Annual Conference of the IEEE Industrial Electronics Society, Chicago, IL, USA, 3–6 November 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7, ISBN 978-1-6654-6454-3. [Google Scholar]
  21. DIN EN ISO/IEC 27001; Information Security Management System. DIN German Institute for Standardization: Berlin, Germany, 2024.
  22. European Commission. Cyber Resilience Act. Available online: https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act (accessed on 3 July 2025).
  23. International Society of Automation. ISA/IEC 62443 Series of Standards. Available online: https://www.isa.org/standards-and-publications/isa-standards/isa-iec-62443-series-of-standards (accessed on 3 July 2025).
  24. Canizo, M.; Onieva, E.; Conde, A.; Charramendieta, S.; Trujillo, S. Real-time predictive maintenance for wind turbines using Big Data frameworks. In Proceedings of the 2017 IEEE International Conference on Prognostics and Health Management (ICPHM), Dallas, TX, USA, 19–21 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 70–77, ISBN 978-1-5090-5710-8. [Google Scholar]
  25. Yang, C.-T.; Chen, S.-T.; Yan, Y.-Z. The implementation of a cloud city traffic state assessment system using a novel big data architecture. Clust. Comput. 2017, 20, 1101–1121. [Google Scholar] [CrossRef]
  26. Mu, N.; Gong, S.; Sun, W.; Gan, Q. The 5G MEC Applications in Smart Manufacturing. In Proceedings of the 2020 IEEE International Conference on Edge Computing (EDGE), Beijing, China, 19–23 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 45–48, ISBN 978-1-7281-8254-4. [Google Scholar]
  27. Yang, C.; Guo, Q.; Yu, H.; Chen, Y.; Taleb, T.; Tammi, K. Semantic-Enhanced Digital Twin for Industrial Working Environments. In Global Internet of Things and Edge Computing Summit; Presser, M., Skarmeta, A., Krco, S., González Vidal, A., Eds.; Springer Nature: Cham, Switzerland, 2025; pp. 3–20. ISBN 978-3-031-78571-9. [Google Scholar]
  28. Kim, J.; Lee, J.Y. Server-Edge dualized closed-loop data analytics system for cyber-physical system application. Robot. Comput. Integr. Manuf. 2021, 67, 102040. [Google Scholar] [CrossRef]
  29. Chae, C.; Kim, H.; Sim, B.; Yoon, D.; Kang, J. Poster: Real-Time Data-Driven Optimization in Semiconductor Manufacturing: An Edge-Computing System Architecture for Continuous Model Improvement. In Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services, Minato-ku, Tokyo, Japan, 2–6 June 2024; Okoshi, T., Ko, J., LiKamWa, R., Eds.; ACM: New York, NY, USA, 2024; pp. 630–631, ISBN 9798400705816. [Google Scholar]
  30. Angrish, A.; Starly, B.; Lee, Y.-S.; Cohen, P.H. A flexible data schema and system architecture for the virtualization of manufacturing machines (VMM). J. Manuf. Syst. 2017, 45, 236–247. [Google Scholar] [CrossRef]
  31. Li, X.; Song, J.; Huang, B. A scientific workflow management system architecture and its scheduling based on cloud service platform for manufacturing big data analytics. Int. J. Adv. Manuf. Technol. 2016, 84, 119–131. [Google Scholar] [CrossRef]
  32. Yang, C.; Yu, H.; Zheng, Y.; Feng, L.; Ala-Laurinaho, R.; Tammi, K. A digital twin-driven industrial context-aware system: A case study of overhead crane operation. J. Manuf. Syst. 2025, 78, 394–409. [Google Scholar] [CrossRef]
  33. Safronov, G.; Theisinger, H.; Sahlbach, V.; Braun, C.; Molzer, A.; Thies, A.; Schuba, C.; Shirazi, M.; Reindl, T.; Hänel, A.; et al. Data Acquisition Framework for spatio-temporal analysis of path-based welding applications. Procedia CIRP 2024, 130, 1644–1652. [Google Scholar] [CrossRef]
  34. Guo, C.; Yang, S.; Thiede, S. An Autonomous Edge Box System Architecture for Industrial IoT Applications. In Proceedings of the 2024 IEEE 22nd International Conference on Industrial Informatics (INDIN), Beijing, China, 18–20 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–6, ISBN 979-8-3315-2747-1. [Google Scholar]
  35. Apache Software Foundation. Apache Kafka. Available online: https://kafka.apache.org/ (accessed on 2 July 2025).
  36. Durão, L.F.C.; Zancul, E.; Schützer, K. Digital Twin data architecture for Product-Service Systems. Procedia CIRP 2024, 121, 79–84. [Google Scholar] [CrossRef]
  37. Prist, M.; Monteriu, A.; Freddi, A.; Pallotta, E.; Cicconi, P.; Giuggioloni, F.; Caizer, E.; Verdini, C.; Longhi, S. Cyber-Physical Manufacturing Systems for Industry 4.0: Architectural Approach and Pilot Case. In Proceedings of the 2019 II Workshop on Metrology for Industry 4.0 and IoT (MetroInd4.0&IoT), Naples, Italy, 4–6 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 219–224, ISBN 978-1-7281-0429-4. [Google Scholar]
  38. PROFIBUS Nutzerorganisation e.V. Profibus. Available online: https://www.profibus.de/ (accessed on 2 July 2025).
  39. Pan, Y.H.; Qu, T.; Wu, N.Q.; Khalgui, M.; Huang, G.Q. Digital Twin Based Real-time Production Logistics Synchronization System in a Multi-level Computing Architecture. J. Manuf. Syst. 2021, 58, 246–260. [Google Scholar] [CrossRef]
  40. Kong, X.T.; Fang, J.; Luo, H.; Huang, G.Q. Cloud-enabled real-time platform for adaptive planning and control in auction logistics center. Comput. Ind. Eng. 2015, 84, 79–90. [Google Scholar] [CrossRef]
  41. Nguyen, T.; Nguyen, H.; Nguyen Gia, T. Exploring the integration of edge computing and blockchain IoT: Principles, architectures, security, and applications. J. Netw. Comput. Appl. 2024, 226, 103884. [Google Scholar] [CrossRef]
  42. Lee, J.; Azamfar, M.; Singh, J. A blockchain enabled Cyber-Physical System architecture for Industry 4.0 manufacturing systems. Manuf. Lett. 2019, 20, 34–39. [Google Scholar] [CrossRef]
  43. Charania, Z.; Vogt, L.; Klose, A.; Urbas, L. Bringing Human Cognition to Machines: Introducing Cognitive Edge Devices for the Process Industry. In Proceedings of the 2024 IEEE 22nd International Conference on Industrial Informatics (INDIN), Beijing, China, 18–20 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7, ISBN 979-8-3315-2747-1. [Google Scholar]
  44. Klingel, L.; Kübler, K.; Verl, A. A Multicore Control System Architecture as an Operating Platform for Industrial Digital Twins. Procedia CIRP 2023, 118, 294–299. [Google Scholar] [CrossRef]
  45. Santos, R.; Piqueiro, H.; Dias, R.; Rocha, C.D. Transitioning trends into action: A simulation-based Digital Twin architecture for enhanced strategic and operational decision-making. Comput. Ind. Eng. 2024, 198, 110616. [Google Scholar] [CrossRef]
  46. Yan, S.; Imran, M.A.; Flynn, D.; Taha, A. Architecting Internet-of-Things-Enabled Digital Twins: An Evaluation Framework. In Proceedings of the 2024 IEEE 10th World Forum on Internet of Things (WF-IoT), Ottawa, ON, Canada, 10–13 November 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 649–653, ISBN 979-8-3503-7301-1. [Google Scholar]
  47. Cui, X. Cyber-Physical System (CPS) architecture for real-time water sustainability management in manufacturing industry. Procedia CIRP 2021, 99, 543–548. [Google Scholar] [CrossRef]
  48. Yeh, C.-S.; Chen, S.-L.; Li, I.-C. Implementation of MQTT protocol based network architecture for smart factory. Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. 2021, 235, 2132–2142. [Google Scholar] [CrossRef]
  49. Hastbacka, D.; Jaatinen, A.; Hoikka, H.; Halme, J.; Larranaga, M.; More, R.; Mesia, H.; Bjorkbom, M.; Barna, L.; Pettinen, H.; et al. Dynamic and Flexible Data Acquisition and Data Analytics System Software Architecture. In Proceedings of the 2019 IEEE SENSORS, Montreal, QC, Canada, 27–30 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–4, ISBN 978-1-7281-1634-1. [Google Scholar]
  50. Lin, W.D.; Low, M.Y.H. Design and Development of a Digital Twin Dashboards System Under Cyber-physical Digital Twin Environment. In Proceedings of the 2021 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Singapore, Singapore, 13–16 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1716–1720, ISBN 978-1-6654-3771-4. [Google Scholar]
  51. Wang, J.; Wang, X.; Dong, S. Construction of MBSE-based smart beam yard system. In Proceedings of the International Conference on Smart Transportation and City Engineering (STCE 2023), Chongqing, China, 16–18 December 2023; Mikusova, M., Ed.; SPIE: Bellingham, WA, USA, 2023; p. 62, ISBN 9781510673540. [Google Scholar]
  52. InfluxData Inc. InfluxDB. Available online: https://www.influxdata.com/lp/influxdb-database/ (accessed on 2 July 2025).
  53. Adhikari, A.; Winslett, M. A Hybrid Architecture for Secure Management of Manufacturing Data in Industry 4.0. In Proceedings of the 2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Kyoto, Japan, 11–15 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 973–978, ISBN 978-1-5386-9151-9. [Google Scholar]
  54. Bhattacharjee, A.; Badsha, S.; Sengupta, S. Blockchain-based Secure and Reliable Manufacturing System. In Proceedings of the 2020 International Conferences on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData) and IEEE Congress on Cybermatics (Cybermatics), Rhodes, Greece, 2–6 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 228–233, ISBN 978-1-7281-7647-5. [Google Scholar]
  55. Raspberry Pi Ltd. Raspberry Pi. Available online: https://www.raspberrypi.com/ (accessed on 2 July 2025).
  56. NVIDIA Corporation. Jetson—Embedded AI Computing Platform. Available online: https://developer.nvidia.com/embedded-computing (accessed on 2 July 2025).
  57. Wu, B. Research and Design of Cloud-Based Digital Three-Dimensional Scanning System for Denture Digitization. In Proceedings of the 2023 IEEE 3rd International Conference on Data Science and Computer Application (ICDSCA), Dalian, China, 27–29 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 439–443, ISBN 979-8-3503-4154-6. [Google Scholar]
  58. Apache Software Foundation. Apache Spark™—Unified Engine for Large-Scale Data Analytics. Available online: https://spark.apache.org/?ref=producthunt (accessed on 2 July 2025).
  59. Apache Software Foundation. Apache Hadoop. Available online: https://hadoop.apache.org/ (accessed on 2 July 2025).
  60. Barseghian, D.; Altintas, I.; Jones, M.B.; Crawl, D.; Potter, N.; Gallagher, J.; Cornillon, P.; Schildhauer, M.; Borer, E.T.; Seabloom, E.W.; et al. Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis. Ecol. Inform. 2010, 5, 42–50. [Google Scholar] [CrossRef]
  61. Park, K.T.; Im, S.J.; Kang, Y.-S.; Noh, S.D.; Kang, Y.T.; Yang, S.G. Service-oriented platform for smart operation of dyeing and finishing industry. Int. J. Comput. Integr. Manuf. 2019, 32, 307–326. [Google Scholar] [CrossRef]
  62. Havard, V.; Sahnoun, M.; Bettayeb, B.; Duval, F.; Baudry, D. Data architecture and model design for Industry 4.0 components integration in cyber-physical production systems. Proc. Inst. Mech. Eng. Part. B J. Eng. Manuf. 2021, 235, 2338–2349. [Google Scholar] [CrossRef]
  63. Nahar, P.; Ghuraiya, A.; Voronkov, I.M.; Kharlamov, A.A. IoT System Architecture for a Smart City. In Proceedings of the 2024 8th International Conference on Information, Control, and Communication Technologies (ICCT), Vladikavkaz, Russia, 1–5 October 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–5, ISBN 979-8-3315-1756-4. [Google Scholar]
  64. Tang, J.; Emmanouilidis, C.; Salonitis, K. Reconfigurable Manufacturing Systems Characteristics in Digital Twin Context. IFAC-PapersOnLine 2020, 53, 10585–10590. [Google Scholar] [CrossRef]
  65. Liu, J.; Liu, J.; Zhuang, C.; Liu, Z.; Miao, T. Construction method of shop-floor digital twin based on MBSE. J. Manuf. Syst. 2021, 60, 93–118. [Google Scholar] [CrossRef]
  66. Kim, D.-H.; Kim, G.-Y.; Noh, S.D. Digital Twin-Based Prediction and Optimization for Dynamic Supply Chain Management. Machines 2025, 13, 109. [Google Scholar] [CrossRef]
  67. Pan, Y.H.; Wu, N.Q.; Qu, T.; Li, P.Z.; Zhang, K.; Guo, H.F. Digital-twin-driven production logistics synchronization system for vehicle routing problems with pick-up and delivery in industrial park. Int. J. Comput. Integr. Manuf. 2021, 34, 814–828. [Google Scholar] [CrossRef]
  68. Wu, W.; Shen, L.; Zhao, Z.; Li, M.; Huang, G.Q. Industrial IoT and Long Short-Term Memory Network-Enabled Genetic Indoor-Tracking for Factory Logistics. IEEE Trans. Ind. Inf. 2022, 18, 7537–7548. [Google Scholar] [CrossRef]
  69. Chen, Y.; Feng, Q.; Shi, W. An industrial robot system based on edge computing: An early experience. In Proceedings of the USENIX Workshop on Hot Topics in Edge Computing, HotEdge 2018, Co-Located with USENIX ATC 2018, Boston, MA, USA, 10 July 2018. [Google Scholar]
  70. Wang, Z.; Gong, S.; Liu, Y. Convolutional Neural Network based Digital Twin of Rolling Bearings for CNC Machine Tools in Cloud Computing. In Proceedings of the 2023 10th International Conference on Dependable Systems and Their Applications (DSA), Tokyo, Japan, 10–11 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 890–895, ISBN 979-8-3503-0477-0. [Google Scholar]
  71. Zhu, X.; Li, C.; Mai, J.; Yang, J.; Kuang, Z. An Empirical Study on the Digital Twin System of Intelligent Production Line. In Proceedings of the 2024 4th International Conference on Computer Science and Blockchain (CCSB), Shenzhen, China, 6–8 September 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 280–286, ISBN 979-8-3503-5650-2. [Google Scholar]
  72. Lin, T.Y.; Shi, G.; Yang, C.; Zhang, Y.; Wang, J.; Jia, Z.; Guo, L.; Xiao, Y.; Wei, Z.; Lan, S. Efficient container virtualization-based digital twin simulation of smart industrial systems. J. Clean. Prod. 2021, 281, 124443. [Google Scholar] [CrossRef]
  73. Barzegaran, M.; Pop, P. The FORA European Training Network on Fog Computing for Robotics and Industrial Automation. In Proceedings of the 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), Antwerp, Belgium, 17–19 April 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6. [Google Scholar]
  74. Pabst, R.G.; De Souza, A.F.; Brito, A.G.; Ahrens, C.H. A new approach to dynamic forecasting of cavity pressure and temperature throughout the injection molding process. Polym. Eng. Sci. 2022, 62, 4055–4069. [Google Scholar] [CrossRef]
  75. Guo, M.; Fang, X.; Hu, Z.; Li, Q. Design and research of digital twin machine tool simulation and monitoring system. Int. J. Adv. Manuf. Technol. 2023, 124, 4253–4268. [Google Scholar] [CrossRef]
  76. Lalik, K.; Flaga, S. A Real-Time Distance Measurement System for a Digital Twin Using Mixed Reality Goggles. Sensors 2021, 21, 7870. [Google Scholar] [CrossRef]
  77. MongoDB, Inc. MongoDB: The World’s Leading Modern Database. Available online: https://www.mongodb.com/ (accessed on 2 July 2025).
  78. Neo4j, Inc. Neo4j Graph Database & Analytics. Available online: https://neo4j.com/ (accessed on 2 July 2025).
  79. Simion, G.; Filipescu, A.; Ionescu, D.; Filipescu, A. Cloud/VPN Based Remote Control of a Modular Production System Assisted by a Mobile Cyber Physical Robotic System—Digital Twin Approach. Sensors 2025, 25, 591. [Google Scholar] [CrossRef] [PubMed]
  80. DMG MORI Deutschland. CMX 600 V-Vertikal-Fräsen-DMG MORI Deutschland. Available online: https://de.dmgmori.com/produkte/maschinen/fraesen/vertikal-fraesen/cmx-v/cmx-600-v (accessed on 2 July 2025).
  81. DMG MORI Deutschland. SIEMENS Operate 4.8. Available online: https://de.dmgmori.com/produkte/steuerungen/19-slimline-control-and-siemens-dmp (accessed on 2 July 2025).
  82. Raspberry Pi Ltd. Raspberry Pi 4 Model B. Available online: https://www.raspberrypi.com/products/raspberry-pi-4-model-b/ (accessed on 2 July 2025).
  83. OpenJS Foundation. Low-Code Programming for Event-Driven Applications: Node-RED. Available online: https://nodered.org/ (accessed on 2 July 2025).
  84. Grafana Labs. Grafana: The Open and Composable Observability Platform|Grafana Labs. Available online: https://grafana.com/ (accessed on 2 July 2025).
Figure 1. Funnel-based visualization of the four-stage filtering and selection process.
Figure 2. Three-dimensional matrix of archetypes with non-model-driven approaches (and their 2D projection).
Figure 3. A two-dimensional matrix of archetypes with model-driven approaches.
Figure 4. Overview of edge–cloud-dualized architecture.
Figure 5. A simple Gateway architecture.
Figure 6. Simplified example of temperature and other sensors connected via LoRaWAN.
Figure 7. Manufacturing resources, local and global resources sharing a public ledger.
Figure 8. Producers and consumers connected through channel feeds.
Figure 9. Connecting resources to factory management models such as MES.
Figure 10. Managing resources in large areas for non-time-critical applications.
Figure 11. Managing resources in time-critical applications.
Figure 12. Managing resources in time-critical applications over large areas.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
