Big Data in the Metal Processing Value Chain: A Systematic Digitalization Approach under Special Consideration of Standardization and SMEs

Abstract: With the rise of the fourth industrial revolution, Big Data has become increasingly important for a successful digital transformation in the manufacturing environment. The acquisition, analysis, and utilization of data by means of this key technology can be regarded as a driver for decision-making support and for process and operation optimization, and can therefore increase the efficiency and effectiveness of a complete manufacturing site. Furthermore, if the corresponding interfaces within the supply chain can be connected with reasonable effort, this technology can boost the competitive advantage of all stakeholders involved. These developments face some barriers: SMEs in particular have to be able to connect to the typically more evolved IT systems of their bigger counterparts. To support SMEs with the development of such a system, this paper provides an innovative approach for the digitalization of the value chain of an aluminum component, from casting to end-of-life recycling, taking the RAMI 4.0 model into account as the foundation for a standardized development that ensures compatibility within the complete production value chain. Furthermore, the key role of Big Data within digitalized value chains consisting of SMEs is analytically highlighted, demonstrating the importance of associated technologies in the future of metal processing and of manufacturing in general.


Introduction
Since the onset of the fourth industrial revolution, the majority of industry sectors have been compelled to undergo a digital transformation. Compared with industries that are able to adapt more rapidly, heavy industry, and within this sector metal processing facilities in particular, has additional obstacles to overcome. This applies especially to small and medium-sized enterprises (SMEs) in this specific industry segment. The required skills as well as the necessary budget for the implementation of a state-of-the-art digitalization solution are conditions that these companies often cannot fulfill, which slows down the digital transformation [1][2][3][4]. Additionally, machine systems in use within this environment tend to have significantly longer life spans than those in other industry segments, resulting in a larger number of mostly more complex brownfield digitization and digitalization approaches [4][5][6].
To utilize the full potential of the fourth industrial revolution, all enterprises within a supply chain must be able to communicate with other involved entities. Therefore, a common basis for communication is imperative, as this results in a holistic compatibility and extensibility of such. For this purpose, the Reference Architecture Model Industry 4.0 (RAMI 4.0) supports a standardized technical communication within and between different

Theoretical Fundamentals and State of the Art
To successfully support a digital transformation and to achieve a better comprehension of the production processes, the first step is to collect a large amount of data from the production sites in order to understand the operational sequences and to initiate the digital transformation within a company [15]. One of the key factors promoting this transformation is the utilization of the Big Data concept [16]. According to [17][18][19][20][21][22][23], Big Data technologies have to fulfill the three criteria of volume, variety, and velocity. These three requirements are extended by [24,25] with two further criteria to five characteristics, referred to as the 5Vs: volume, velocity, variety, veracity, and value. As new technologies become established, new opportunities, challenges, and threats arise. The opportunities opened up by Big Data initially lie in operational efficiency, leading to a variety of advantages for businesses and, therefore, for manufacturing operations [26]. At the production level, this might lead to an improvement in production planning. At the executive level, targeted data integration can support decision-making, strategy development and execution as well as supply chain management [27,28]. All these improvements can subsequently be used to improve customer service [26]. Difficulties occur in the acquisition, transmission, storage, management, analysis, visualization, integration, privacy, and security of data as well as in risk management [26,29,30]. These difficulties can be traced back to the 5Vs. The first V, volume, presents a major obstacle in two respects. According to Zikopoulos [21], on the one hand, a large amount of data of at least one petabyte must be processed. On the other hand, the corresponding infrastructure has to be available to process these data volumes in a reasonable amount of time, leading to the next characteristic, velocity.
Velocity is defined as the ability to generate, process, analyze, and store data at high speeds, continuously or discretely. It refers to the time data takes to get from source to destination, including all necessary operations. Variety arises from the different file structures, which can be divided into structured, semi-structured, and unstructured data sets. In more than 70% of cases, data is present in unstructured form [26]. Veracity concerns the quality of the data: data of insufficient quality cannot be used because it lacks meaningfulness or is subject to uncertainty. Since data analysis inevitably depends on the quality of data, low-quality data can lead to an unintentional distortion of the result. Value describes the added value generated by analyzing and linking data [24,25]. An additional threat concerns the privacy of data, which could be leaked through cyberattacks due to a lack of security. Other concerns arise from the data itself, should one criterion of the 5Vs not be met [26][31][32][33][34][35].
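To put the volume criterion into perspective, the following minimal Python sketch estimates how long a continuously sampling plant would need to accumulate one petabyte. The sensor count, sampling rate, and sample size are hypothetical values chosen for illustration and are not taken from the cited works; a decimal petabyte (10^15 bytes) is assumed.

```python
# Assumption: decimal petabyte; the source does not state decimal vs. binary units.
PETABYTE_BYTES = 10**15
SECONDS_PER_DAY = 86_400


def days_to_reach_volume(sensors: int, samples_per_second: float,
                         bytes_per_sample: int,
                         target_bytes: int = PETABYTE_BYTES) -> float:
    """Days of continuous acquisition needed to accumulate target_bytes."""
    bytes_per_day = sensors * samples_per_second * bytes_per_sample * SECONDS_PER_DAY
    if bytes_per_day <= 0:
        raise ValueError("data rate must be positive")
    return target_bytes / bytes_per_day


# Hypothetical plant: 200 sensors, 1 kHz sampling, 8 bytes per sample.
days = days_to_reach_volume(sensors=200, samples_per_second=1_000.0, bytes_per_sample=8)
print(f"{days:.0f} days")  # prints "7234 days"
```

Under these assumed values, raw sensor data alone would take roughly two decades to reach the one-petabyte mark, which illustrates why the volume and velocity criteria are closely linked to the available acquisition infrastructure.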
One of the major challenges of Big Data analysis in the metal processing environment is the high variety of processes with regard to geometry, material, and the processing steps required for the application of the respective workpiece, especially for specialized SMEs and in the metal forming sector. Considering the supply chain of a metal-based product from casting through the forming of semi-finished products up to component manufacturing, this variety increases even further. A typical supply chain for high-quality metal components, e.g., in the aerospace industry, includes multiple specialized SMEs, resulting in a processing chain that involves a large number of different companies [2]. In this case, digitalization solutions are often planned and implemented as stand-alone solutions, especially given the internal confidentiality regulations of these companies. For a globalized and interconnected supply chain, standardized interfaces are therefore crucial for further superordinate supply chain optimization.
To make use of new technologies and support the premises of Industry 4.0 (individualization, flexibility, decentralization, and resource efficiency), various technologies have to be combined. At the scale of a production site, these digitalization developments are combined and referred to as a Smart Factory [36][37][38][39][40]. According to [40], a Smart Factory can be described as a compound of Cyber-Physical Systems (CPS) connected by the Internet of Things (IoT) to support humans and machines in their activities [34]. As stated by [2,35][41][42][43][44], key technologies, especially for the metal forming industry, include a generic infrastructure consisting of Cyber-Physical Production Systems (CPPSs), the Industrial Internet of Things (IIoT), Digital Twins (DTs), Big Data, and Cloud Computing. Considering the infrastructure that already exists within the heavy industry segment this work focuses on, integrating existing technology by means of a brownfield approach results in a mixed form of this theoretical construct: expensive layer 0-4 solutions that are already implemented but do not fulfill these new requirements cannot be exchanged without unreasonable investments. This is especially important for SMEs, which tend to have a generally lower budget for innovations that usually amortize over a medium- to long-term period.
To be able to unite already existing structures with these Industry 4.0 (I 4.0) related technologies, a generic infrastructure serving as a standard for internal and external technological communication must be implemented. To create a uniform understanding of I 4.0 technologies and their standards, the RAMI 4.0 framework, based on the Smart Grid Architecture Model (SGAM) [2][45][46][47], was developed. RAMI 4.0 can therefore be understood as a structured approach to I 4.0 that enables uniform communication between its users. The most important interrelationships between key aspects of I 4.0 are visualized by three axes. For a digitalized process chain to succeed, there must be a holistic, corresponding vertical and horizontal integration along the life cycle and value stream. As shown in Figure 1, the "Life Cycle and Value Stream" axis represents the life cycle of physical entities, including the product, along the process chain over which the horizontal and vertical integration takes place [2,45,46]. Horizontal integration, represented by the hierarchy levels, ensures cross-border communication with other entities, representing one of the fundamental premises of I 4.0 [2,45,46]. Vertical integration takes place across the layers and is used for data integration and communication between them [2,45,46]. The RAMI 4.0 model consists of the following layers (Figure 1):

• The Asset layer describes the lowest layer in RAMI 4.0 and contains all physical objects;
• The Integration layer is representative for the connection of physical objects with the digital domain and contains the required hardware and software;
• The Communication layer executes the digital connection and thus can be seen as an IIoT equivalent;
• The Information layer contains all process-relevant data and information in different formats;
• The Functional layer contains all functions of a value chain. Depending on their determination, these functions can be of a logistical or data processing character;
• The Business layer houses the business logic and deals with the optimization of products and processes.
The hierarchy level is distinguished as follows (Figure 1):
• Product describes the product to be manufactured;
• Field Device includes entities for collecting data, such as sensors and data acquisition (DAQ);
• Control Device describes those operating elements that are used to control the system;
• Station describes the machine or station used for the production step;
• Work Center is to be understood as the production environment;
• Enterprise describes the host enterprise itself.
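The layers and hierarchy levels listed above can be represented as a simple data structure. The following Python sketch is an illustration under stated assumptions rather than part of the RAMI 4.0 specification: the class and function names are hypothetical, the numeric values of the enum members only mirror the order of the two lists, and the life cycle axis is omitted.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    """RAMI 4.0 layers in the order listed in the text (lowest first)."""
    ASSET = 1
    INTEGRATION = 2
    COMMUNICATION = 3
    INFORMATION = 4
    FUNCTIONAL = 5
    BUSINESS = 6


class HierarchyLevel(Enum):
    """Hierarchy levels in the order listed in the text."""
    PRODUCT = 1
    FIELD_DEVICE = 2
    CONTROL_DEVICE = 3
    STATION = 4
    WORK_CENTER = 5
    ENTERPRISE = 6


@dataclass(frozen=True)
class RamiPosition:
    """Hypothetical helper that locates an entity on the two axes."""
    name: str
    layer: Layer
    level: HierarchyLevel


def layers_between(lower: Layer, upper: Layer) -> list:
    """Layers traversed when data is integrated vertically from lower to upper."""
    return [layer for layer in Layer if lower.value <= layer.value <= upper.value]


# Hypothetical entity: a thermocouple is a physical object acting as a field device.
thermocouple = RamiPosition("furnace thermocouple", Layer.ASSET, HierarchyLevel.FIELD_DEVICE)
path = layers_between(thermocouple.layer, Layer.INFORMATION)
print([layer.name for layer in path])
# prints ['ASSET', 'INTEGRATION', 'COMMUNICATION', 'INFORMATION']
```

Locating each entity on both axes in this way makes explicit which layers a measurement has to pass before it becomes available as process-relevant information.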
A digitalized supply chain in the metal processing industry usually consists of multiple smart factories, which are often owned by different companies and are generally geographically distributed; each smart factory in turn includes several CPPSs. A CPPS serves as an extension of a CPS, referring to a system capable of acquiring, storing, and analyzing data in real time using Internet technologies and of reintegrating information from the virtual into the physical world [48], partly dismantling the classical automation pyramid [49]. The reintegration of information involves human-machine interaction, which can be realized with Human Machine Interfaces (HMIs) as shown in, e.g., [2,35]. Building upon the concept of the CPS, a CPPS uses automation technology to a greater extent [2,48]. According to [48,50,51], a CPPS must fulfill the following characteristics: it (i) consists of superordinate systems within systems; (ii) consists of connected and cooperative elements acting in a situationally appropriate manner between all layers of a production environment; (iii) enhances real-time decision making.
A further extension of these criteria is proposed by [52], adding two additional conditions to be met, especially considering SMEs: (iv) a user-centered CPPS consists of HMIs that are tailored to the application and the respective end-user; (v) a user-centered CPPS for SMEs is resilient and has a short amortization time.
The establishment of such a CPPS can be assisted by the implementation of DTs, which serve as decision-making support systems for the further optimization of the respective production process.
According to [53], a common definition of a DT is a virtual representation of a physical product. This definition can be further refined according to [2,53], depending on the field of application. In the metal industry, a DT can be seen either as a partial or complete representation of a production chain, shifting the focus more onto production planning and optimization, or as a representation of one or more process steps in the production chain for the manufacturing of a semi-finished or finished product, whereby the focus shifts to simulation [2]. To benefit from a DT, data from the physical space and the information derived from it must be reintegrated from the digital back into the physical space. This data transfer between a physical and a virtual entity leads to three distinctions depending on its degree of automation [53]. The Digital Model (DM) involves data transfer between the physical and digital space, but no automated data transfer in either direction [53]. A Digital Shadow (DS) involves a unilateral automated data transfer, in the sense that data is automatically transferred from one domain to the other, while the reverse has to be executed manually [53]. The DT, on the other hand, involves an automated bilateral data transfer, which results in an algorithm-based process adaptation and allows the digital domain to be adapted using near real-time process data [53]. The foundation of a DM, DS, or DT can be a White Box Model (WBM), a Black Box Model (BBM), or a mixture of both, defined as a Grey Box Model (GBM) [2].
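The distinction between DM, DS, and DT described above depends only on how many of the two data flow directions are automated. A minimal Python sketch of this rule follows; the function and parameter names are hypothetical and serve illustration only.

```python
def classify_digital_representation(physical_to_digital_automated: bool,
                                    digital_to_physical_automated: bool) -> str:
    """Classify a virtual representation by the number of automated data flows.

    0 automated directions -> Digital Model
    1 automated direction  -> Digital Shadow (either direction, as in the text)
    2 automated directions -> Digital Twin
    """
    automated_directions = (int(physical_to_digital_automated)
                            + int(digital_to_physical_automated))
    return ("Digital Model", "Digital Shadow", "Digital Twin")[automated_directions]


# Sensor data streams automatically into the simulation, but set points are
# still entered by an operator: this corresponds to a Digital Shadow.
print(classify_digital_representation(True, False))  # prints "Digital Shadow"
```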
WBMs are based on real physical relationships built upon known parameters and mathematical correlations. Therefore, the output and the way output-related results are obtained are comprehensible [2][54][55][56]. BBMs are often used where a mathematical description based on real physical relationships is too complex [54,55,57] or not available in the required depth and time. As a result, their logic and working methods are not transparent in comparison to a WBM [54]. Unlike that of a WBM, the output of a BBM is not derived from real physical mechanisms but is based on stochastic approaches and the correlation of data [2]. The GBM represents the combination of a WBM and a BBM, aiming to merge the benefits of both [2,54]. However, the results may vary depending on the modeling method used, so the choice between a WBM, GBM, or BBM has to be made individually depending on the given circumstances and resources [56,58]. Taking Machine Learning (ML) into account, the models can be developed into White Box Machine Learning Models (WBMLMs), Grey Box Machine Learning Models (GBMLMs), and Black Box Machine Learning Models (BBMLMs), as demonstrated in [52,56]. In the case of SMEs, initial WBMs are especially important to consider, as there may be neither enough experimental investigations nor the resources to generate enough test data to meet the 5Vs of Big Data, which would lead to an unsatisfactory result. In the further production process, data can be collected and fed into the initial WBM, which transforms it into a GBMLM, as shown in [56]. Large enterprises, on the other hand, may have already collected enough data to meet the 5Vs of Big Data and can implement a BBM that is also fed with data collected in the process, transforming it into a BBMLM [52]. Nevertheless, where access to sufficient data is limited, they can adopt an approach similar to that of SMEs. Data to be used by the models has to be classified and, in the case of an ML approach, supervised learning is recommended [23,54].
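One possible reading of the transition from an initial WBM to a grey box model is a physical relation whose systematic deviation from measurements is corrected by a data-driven term. The Python sketch below illustrates this idea only and is not the method of [56]: the white box relation (ideal forming force as flow stress times contact area), the linear residual model, and all numerical values are assumptions, and the "measured" forces are synthetic.

```python
def white_box_force(flow_stress_mpa: float, area_mm2: float) -> float:
    """Assumed white box relation: ideal force in N (MPa equals N/mm^2)."""
    return flow_stress_mpa * area_mm2


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


FLOW_STRESS_MPA = 150.0                           # hypothetical material value
areas = [100.0, 200.0, 300.0, 400.0]              # hypothetical contact areas in mm^2
measured = [16500.0, 32500.0, 48500.0, 64500.0]   # synthetic process data in N

# Data-driven part: learn the residual between measurement and white box output.
residuals = [m - white_box_force(FLOW_STRESS_MPA, a) for a, m in zip(areas, measured)]
intercept, slope = fit_line(areas, residuals)


def grey_box_force(area_mm2: float) -> float:
    """White box prediction plus the learned residual correction."""
    return white_box_force(FLOW_STRESS_MPA, area_mm2) + intercept + slope * area_mm2


print(grey_box_force(250.0))  # prints 40500.0
```

The physical relation remains interpretable, while the residual term absorbs effects that the simple relation does not capture; with more process data, the residual model could be replaced by a more expressive supervised learner.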
To obtain the necessary data in the required quality, time, and structure, IIoT solutions, in addition to classical level 2 automation schemes, are required to realize a true BBM or GBM as well as their more advanced ML-based extensions.
The IIoT is a derivative of the IoT, a strongly consumer-related term that describes the attempt to network smart devices across the board [59,60]. The architecture associated with the IoT has to be adapted when implemented in an industrial environment, especially considering the higher IT security and resilience requirements [60]. As concluded by [60], the IIoT can be described as a superordinate system of connected cyber-physical entities that enable in-situ data acquisition, analysis, and exchange in an industrial environment, leading to process and production optimization and thus serving as a major enabler for leaner production [60,61]. The resulting benefits are improved productivity and efficiency, reduced cost and energy consumption, and a strengthened customer relationship [59,60,62]. A major challenge arises from the heterogeneous application of protocols [2,60]. The general term protocols can be divided into data protocols (e.g., XMPP, MQTT), discovery protocols (e.g., mDNS), and infrastructure protocols (e.g., IPv4, IPv6), all of which enable communication and each of which has individual advantages and disadvantages [60]. As stated by [2,60], Message Queuing Telemetry Transport (MQTT) lends itself to industrial use due to efficient data storage. Furthermore, [2] pointed out the suitability of the Extensible Messaging and Presence Protocol (XMPP) for HMIs, which serve as a key component in a smart factory. The integration of a large number of smart devices into the IIoT also poses several risks in terms of IT security and makes production sites particularly vulnerable to cyberattacks [63,64]. A smart factory, consisting of a multitude of cyber-physical entities, offers attack points in the areas of software (viruses, trojans), protocols (man-in-the-middle, denial-of-service), and hardware [63,64]. Such attacks can not only paralyze production but also lead to data theft or targeted manipulation of processes [32,59].
As stated by [32], WB(ML)Ms, GB(ML)Ms, and BB(ML)Ms, when connected, e.g., as a DS or DT, can also be affected by cyberattacks that manipulate the general model, the underlying ML algorithm, or the related data sets. To reduce or, at best, completely avert cyberattacks and the associated potential intellectual and physical damage, security experts should always be involved when implementing a smart factory. As stated by [65][66][67], the IIoT furthermore serves as a major enabler for Industry 5.0, which focuses on a higher degree of human-machine interaction and enables a virtualized, customer-driven manufacturing environment [68].
Considering the limited human and architectural IT resources within most SMEs, cloud computing solutions can add significant economic benefits and therefore additionally serve as an accelerator for the digital transformation of these entities [69]. The basic definition of cloud computing is given by the National Institute of Standards and Technology (NIST): "a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [70]. A Service Level Agreement (SLA) regulates the services to be provided between the consumer and the provider [71]. Through the combination of IIoT and cloud computing, it is possible to implement decentralized, on-demand data computation [59,72,73]. As stated by [59], with centralized cloud computation, the probability of delays in high-priority data in the event of high data traffic cannot be neglected. However, highly specialized SMEs that already have such an infrastructure and know-how may not be willing to outsource computing activities to cloud services because of legal uncertainties regarding data protection and privacy across different jurisdictions. Despite these uncertainties, decentralized computing resources, in combination with Big Data, make it possible to monitor and optimize entire process chains in real time [63].
As already mentioned, through the interaction of networked smart devices subordinate to a CPPS and linked by an IIoT, collected data can be processed by suitable computing resources, such as cloud computing services. To enable regulated access to, availability of, and storage of data, suitable databases and corresponding database management systems (DBMSs) must be implemented [3,26]. When selecting a suitable DBMS, it is important to pay particular attention to the system limits, as the data to be processed can be too large for specific DBMSs and thus impair performance [74]. Corresponding database models include relational, object-oriented, and document-oriented databases [75,76]. Two approaches to structuring and accessing the data of a database are the Structured Query Language (SQL) and not-only SQL (NoSQL) [76]. Relational databases store data in structured tables linked together by keys and can be accessed with SQL [75]. Due to the large number of links between the tables, performance problems occur with large amounts of data, as the relationship models become increasingly complicated [75]. The less commonly used object-oriented databases manage data in objects, which internally take over the data management [76,77]. Depending on the database model, data can be accessed with suitable object-oriented languages. Document-oriented databases follow a non-relational approach in which data is stored in documents of different formats and addressed by an identifier [76,78]. Data in a document can be accessed via key-value pairs using NoSQL [76][77][78][79]. As stated by [78], DBMSs based on NoSQL are suitable for large amounts of data if the data does not demand a relational model, and are thus gaining popularity, especially for unstructured data sets [79]. As stated by [79,80], which DBMS to use is highly dependent on the use case, software, and requirements. I 4.0 is also changing the security requirements for the systems involved [81,82].
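The difference between the relational and the document-oriented model described above can be illustrated with Python's standard library. The sketch below stores the same hypothetical press measurement once in two SQL tables linked by a key (using the built-in sqlite3 module) and once as a JSON document addressed by an identifier. The plain dictionary merely stands in for a document store; table, column, and key names are assumptions made for illustration.

```python
import json
import sqlite3

# Relational model: structured tables linked by a key, accessed with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE machine (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE measurement ("
    " id INTEGER PRIMARY KEY,"
    " machine_id INTEGER NOT NULL REFERENCES machine(id),"
    " quantity TEXT NOT NULL,"
    " value REAL NOT NULL)"
)
conn.execute("INSERT INTO machine (id, name) VALUES (1, 'press_1')")
conn.executemany(
    "INSERT INTO measurement (machine_id, quantity, value) VALUES (?, ?, ?)",
    [(1, "force_kN", 512.0), (1, "temperature_C", 430.5)],
)
rows = conn.execute(
    "SELECT m.name, s.quantity, s.value"
    " FROM measurement AS s JOIN machine AS m ON m.id = s.machine_id"
    " ORDER BY s.id"
).fetchall()
conn.close()

# Document-oriented model: one self-contained document per identifier.
# A dict stands in for the document store (assumption for illustration).
document_store = {}
document_store["press_1:run_0001"] = json.dumps({
    "machine": "press_1",
    "measurements": {"force_kN": 512.0, "temperature_C": 430.5},
    "operator_note": "free text that would not fit a fixed table schema",
})
document = json.loads(document_store["press_1:run_0001"])

print(rows)
print(document["measurements"]["force_kN"])
```

The relational variant requires a join to reassemble the measurement with its machine, whereas the document variant keeps structured and unstructured content together at the cost of a fixed schema.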
With the increased utilization of HMIs in the manufacturing environment, it is essential to verify the validity of the input and the trustworthiness of the operator to ensure operational safety [81]. This risk is further increased by principles such as Bring Your Own Device (BYOD) or Choose Your Own Device (CYOD), which expose networks to a higher risk of infiltration, e.g., due to a lower level of standardization caused by a higher degree of heterogeneity of the devices used [82]. Therefore, data should be classified according to their confidentiality, integrity, and availability (CIA) to prevent unauthorized access and data manipulation [60,81][83][84][85]. Manipulation or corruption of data could lead to malfunctions, miscalculations, and misinterpretation and thus to wrong decisions in the upper two layers of the automation pyramid, the Manufacturing Execution System (MES) and the superordinate Enterprise Resource Planning (ERP) system. As stated by [81], another security risk is the age of the infrastructure, as there is a frequent replacement of equipment over time, making constant planning and updating of security measures necessary. Therefore, a multi-layer architecture approach with multiple safety layers should be implemented, dividing and analyzing the signal flow of each CPPS and cross-validating the signals with those of corresponding CPPSs [81]. Other security measures stated by [81], such as Side-Channel Analysis [86,87] or Post-Production Analysis, can further enhance operational security despite the challenges they pose. A further approach, as mentioned by [88], would be to embed a standardized certification into the embedded systems themselves, letting these systems check their own security. Furthermore, [89] proposes that RAMI 4.0 should take greater account of safety and human factors. Another weakness in security planning was pointed out by [90], who stated that a lack of disaster recovery planning persists in I 4.0.
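The cross-validation of signals between corresponding CPPSs mentioned above can be sketched as a simple plausibility check. The following Python example is a deliberately minimal illustration and not the procedure of [81]: the function name, the absolute tolerance criterion, and the signal values are assumptions.

```python
def cross_validate(signal_a, signal_b, tolerance):
    """Return the sample indices at which two corresponding signals
    disagree by more than the given absolute tolerance."""
    if len(signal_a) != len(signal_b):
        raise ValueError("signals must have the same number of samples")
    return [index
            for index, (a, b) in enumerate(zip(signal_a, signal_b))
            if abs(a - b) > tolerance]


# Hypothetical force readings (kN) of the same forming step from two CPPSs.
force_cpps_a = [100.0, 101.0, 250.0, 99.5]
force_cpps_b = [100.5, 100.8, 100.2, 99.9]

suspicious = cross_validate(force_cpps_a, force_cpps_b, tolerance=5.0)
print(suspicious)  # prints [2]
```

Samples flagged in this way could indicate a sensor fault as well as a manipulated signal, so a flagged index is a trigger for further analysis rather than proof of an attack.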
Other security measures, as noted by [88,91], include the implementation of a firewall and a virtual private network (VPN) that can only be accessed by devices with authorized IP addresses. Table 1 summarizes the most important key factors for the implementation of a digitalized metal processing value chain.
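The restriction of network access to devices with authorized IP addresses can be illustrated with Python's standard ipaddress module. The sketch covers only the address check itself; the listed networks and the function name are hypothetical, and a real deployment would enforce such rules in the firewall or VPN gateway rather than in application code.

```python
import ipaddress

# Hypothetical allowlist: one shop floor subnet and one single engineering host.
AUTHORIZED_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/24"),
    ipaddress.ip_network("192.168.50.17/32"),
]


def is_authorized(address: str) -> bool:
    """Return True if the address belongs to one of the authorized networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in AUTHORIZED_NETWORKS)


print(is_authorized("10.20.0.42"))   # prints True
print(is_authorized("10.21.0.42"))   # prints False
```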
Based on the theory elaborated, the authors propose the following hypothesis:
Hypothesis 1 (H1). When developing a digitalized supply chain within the metal processing environment which is horizontally interconnectable, the 5V condition and therefore the existence of Big Data is automatically fulfilled.

CPPS: connection and cooperation of layers; acquisition and analysis of data in real-time using Internet technologies; human-machine interaction; enhance real-time decision making [48][49][50]
DT: decision making support; partial or complete representation of a production chain/process steps; simulation [53]
ML/Artificial Intelligence (AI): supervised learning to enhance decision making within the production environment [92][93][94][95][96]
IIoT: enables in-situ data acquisition, analysis and exchange for process and production optimization [59,60,

Digitalization and Development of a Metal Processing Value Chain: Framework and Corresponding Case Study for SMEs
The presence of SMEs in a value chain varies with the industry segment and country considered. In the case of Austria, 99.7% of the industry consists of SMEs [99], while 18.1% of Austria's industry belongs to the field of manufacturing [100]. Consequently, it is most likely that SMEs are involved in a metal processing value chain. This is particularly evident in the manufacturing sector of the automotive and aerospace industries, where many specialized SMEs supply big enterprises with components that have to fulfill high quality standards and require specific production-related expertise [101][102][103][104].
To support SMEs within this environment, this section describes the further concretization of the RAMI 4.0 framework for the development of a simplified digitalized value chain for the metal processing industry that especially considers the additional restrictions SMEs face. The resulting concretized framework is put into operation in a smaller-scale use case at the Montanuniversität Leoben. This use case, which is part of the MUL 4.0 project, not only serves as a practical test of the stated hypothesis but also aims to contribute a foundation for state-of-the-art engineering education for future experts in the manufacturing field [105,106].
According to [105], the two main cooperating instances of MUL 4.0 can be characterized as follows: the Chair of Metal Forming (MF), with 15 employees, represents a small enterprise, whereas the Chair of Non-Ferrous Metallurgy (NFM), with 72 employees, can be seen as a medium enterprise. These chairs are able to provide the required infrastructure for the development of an integrated I 4.0 standard value chain. As a result, by digitalizing and connecting these entities while providing interfaces for state-of-the-art software used in large enterprises, the steps undertaken to realize these objectives can be utilized by other SMEs within the metal processing environment.
Depending on the product to be manufactured, the respective part has to pass various steps according to RAMI 4.0 [89]. These process steps are carried out in production facilities of various sizes, which, for the sake of simplicity, will be distinguished into SMEs and large enterprises within this work, as visualized in Figure 2.

Figure 2. Scheme of a simplified metal forming value chain with different enterprises (E).
To ensure the complete digital transformation of a value chain, both SMEs and large enterprises have to meet the requirements of I 4.0. A failure to meet these requirements, e.g., a lack of data to be integrated into the value chain, would lead to an interruption in the processing route. SMEs in particular, which often lack the financial resources and expertise to carry out a successful digital transformation process, pose a risk of such a disruption. Appropriate software and hardware are often expensive and can be a major financial barrier for some SMEs. To overcome this obstacle in terms of hardware and the associated computing resources, a cloud computing approach is a preferable option. Another advantage is that the respective expertise does not have to be acquired by the SMEs themselves but is already available from the provider of the cloud computing service. Furthermore, providers offer additional services such as data processing, which can thus be outsourced. Software solutions, another barrier, can be managed with open- or closed-source products. The choice between open-source and closed-source products has to be considered from several points of view. Once again, internal know-how is a key factor. If internal human resources with IT expertise are available to implement a customized solution and the requirements and tasks are very specific, an open-source approach may be appropriate. If the requirements are non-specific, i.e., common in the industry, it would be preferable to use closed-source products, pay license fees, and thus have access to support and updates. By closing gaps in the value chain, data from other stations or companies can be accessed, depending on authorization, to enable more flexible supply chain management and process planning in the downstream and upstream steps.
Figure 3 shows an exemplary supply chain network for the parallel operation of an SME and a larger enterprise. Digitization and digitalization of both enterprises are based on the model of the automation pyramid. Layer 0 covers all physical production processes. Layer 1 is the DAQ level and contains sensors, actuators and programmable logic controllers (PLCs). Layer 2 is the Supervisory Control and Data Acquisition (SCADA) level, including HMIs. Layer 3 contains the MES and layer 4 the ERP. From layer 3, the data of the respective enterprises can be integrated into the higher-level system and thus act as a lower layer in the higher-level layer system of the value chain. The network includes several servers, which execute several services. If possible, running only one service per server should be considered, following the "One server, one service" principle. In the event of a server failure, only an individual service rather than several services would then be affected, and it can be taken over by a backup server. Depending on the available resources, different approaches regarding WBM, GBM, and BBM can be followed. In the case of enterprise 4 (Figure 3, E4), an SME, an initial WBM was chosen using data acquired from a Finite Element Analysis (FEA). Using the external computing resources of a cloud computing service, the data obtained from FEA can be integrated into a WBM, which in turn can be assimilated with further FEA data for refinement, resulting in a WBMLM approach. Another possibility would be to continuously feed process data into the initial WBM, resulting in a GBMLM, as shown in [56].
Enterprise 5 (Figure 3, E5), a large enterprise that already has data meeting the 5Vs criteria, has a structure similar to that of E4 (Figure 2). Due to the existing amount and value of data, an initial BBM approach is applied in this example. The BBM can be fed and refined with further process data recorded during production, thus establishing a BBMLM, as shown in [52]. If required data cannot be obtained from the DAQ of the process and would require a WBM-based simulation, e.g., FEA for material models or microstructure models, a GBMLM approach would be followed. In any case, supervised ML should be pursued at the beginning of the implementation to support a successful establishment of the system.
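The model choices described for E4 and E5 can be summarized as a small lookup. The following Python sketch is only an illustration under simplifying assumptions: the function names, the boolean input and the two refinement labels ("simulation" for further FEA data, "process" for data recorded by the DAQ) are hypothetical and are not part of the implementation described in this paper.

```python
def initial_model(has_5v_data: bool) -> str:
    """Pick the starting model type (hypothetical helper).

    Assumption: an enterprise that already holds data meeting the 5Vs
    starts with a BBM, otherwise with a simulation-based WBM.
    """
    return "BBM" if has_5v_data else "WBM"


def refined_model(initial: str, refinement_source: str) -> str:
    """Map an initial model and its refinement data source to the resulting approach."""
    outcomes = {
        ("WBM", "simulation"): "WBMLM",  # WBM refined with further FEA data
        ("WBM", "process"): "GBMLM",     # WBM continuously fed with process data
        ("BBM", "process"): "BBMLM",     # BBM refined with recorded process data
        ("BBM", "simulation"): "GBMLM",  # BBM complemented by WBM-based simulation
    }
    key = (initial, refinement_source)
    if key not in outcomes:
        raise ValueError(f"Unsupported combination: {key!r}")
    return outcomes[key]


# E4 (SME without 5V data) refined with further FEA data
e4 = refined_model(initial_model(has_5v_data=False), "simulation")
# E5 (large enterprise with 5V data) refined with process data
e5 = refined_model(initial_model(has_5v_data=True), "process")
```

Keeping the mapping in a single table makes the four combinations named in the text explicit and lets unsupported inputs fail early.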
For this exemplary framework, the 5V criteria of Big Data are fulfilled, as demonstrated in Table 2.

Results and Discussion
As part of the MUL 4.0 project, four machines were integrated into a value chain [107]. These are positioned at two different locations. The continuous caster is located at the NFM and is equipped as standard with sensors and DAQ by the manufacturer. The MF houses the furnace, hydraulic press, and rolling mill and serves as the second production site in the process. The rolling mill from 1954 was transformed into a CPPS by means of low-cost retrofitting with suitable sensors, such as a Linear Variable Differential Transformer (LVDT) and load cells, in order to integrate the required data into the process [52]. In cooperation with the Chair of Industrial Logistics (IL), a cross-process database was set up to make data available between the cooperating parties. Table 3 shows the technical specifications of the sensors associated with the corresponding machines and their location.
As visualized in Figure 4, the process chain consists of continuous casting of the aluminum specimens, followed by a variable sequence of forming processes. At the MF, the specimen can be cold-formed or rolled. In the hot forming process, the specimen is preheated in the furnace before rolling or upsetting. Subsequently, the samples are subjected to quality control and recycled in the final stage.
To transfer the data recorded by the sensors into the production network, DAQs were implemented (Table 4). At the MF, a low-cost approach was pursued with DAQ systems from Wago GmbH (Brunn am Gebirge, Austria) and the Wago e!Cockpit software. In addition, data processing as well as the creation and execution of GUIs were carried out with the open-source programming language Python [52]. To pursue the low-cost and open-source approach, MariaDB was chosen as the SQL-based database, as it was deemed suitable for the amount of data generated. In the processing chain of MUL 4.0, a distinction can be made between process-related time series data and logistical data. Process-related time series data, e.g., sensor data, data obtained from an FEA or finite volume analysis (FVA), photos and videos, need to be accessed and computed in near real-time to create appropriate process DTs.
To keep the performance of the database optimal, not all raw data are stored in the database, but only selected, filtered data. Data stored in the database can be accessed by authorized users for further data processing or investigation. Furthermore, the MariaDB database is linked to an MES and an ERP to enable dynamic process chain monitoring, planning and control.
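One simple way to store only selected data is a deadband filter, which keeps a sample only if it deviates from the last stored value by more than a threshold. The Python sketch below is an assumption made for illustration; the filter actually used in MUL 4.0 is not specified here, and the function name, sample layout and threshold are hypothetical.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float]  # (timestamp in s, measured value)


def deadband_filter(samples: Iterable[Sample], threshold: float) -> List[Sample]:
    """Keep a sample only if it differs from the last kept value by more than threshold."""
    kept: List[Sample] = []
    for timestamp, value in samples:
        # The first sample is always kept as the reference value.
        if not kept or abs(value - kept[-1][1]) > threshold:
            kept.append((timestamp, value))
    return kept


# Hypothetical load cell readings; only significant changes are retained.
raw = [(0.0, 10.0), (0.1, 10.02), (0.2, 10.5), (0.3, 10.51), (0.4, 9.0)]
filtered = deadband_filter(raw, threshold=0.1)
```

With these example readings, three of the five samples remain, which would then be the rows written to the database.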
As mentioned in Section 2, cyber security plays a key role in the context of I 4.0. For this reason, an IT-layer architecture was designed and implemented, as shown qualitatively in Figure 5. The IIoT in layer 2 can be considered a closed system, from which data from a Node-RED server and the MariaDB database are transferred to layer 3. Layer 3 contains the remote admin host, the webserver dashboard, the chosen low-cost ERP system ERPNext and the file server in a virtual environment, which clients can access from authorized workstations. To prevent unauthorized access and cyber-attacks, a firewall was installed between layers 2 and 3, which only allows layer 3 to query layer 2.
Table 5. Asset layer fragments from Figure 6 and corresponding specifications.
Table 6. Integration layer fragments from Figure 6 and corresponding specifications.

Layer Fragment    Specification
It1.1    RFID chip
It1.2    Continuous caster DAQ according to [...]

Table 8. Information layer fragments from Figure 6 and corresponding specifications.
Layer Fragment    Specification
In1.1    Location of specimen
In1.2    Sensor data of continuous caster according to Table 3
In1.3    User input from continuous caster GUI
In1.4    Status of continuous caster
In1.5    Process data acquired by data processing (e.g., utilization factor)
In1.6    Economic data acquired by data processing (e.g., price per unit)
In2.1    Location of specimen
In2.2    Sensor data of furnace according to Table 3
In2.3    User input from furnace GUI
In2.4    Status of furnace
In2.5    Process data acquired by data processing (e.g., utilization factor)
In2.6    Economic data acquired by data processing (e.g., price per unit)
In3.1    Location of specimen
In3.2    Sensor data of rolling mill|hydraulic press according to Table 3
In3.3    User input from rolling mill|hydraulic press GUI
In3.4    Status of rolling mill|hydraulic press
In3.5    Process data acquired by data processing (e.g., utilization factor)
In3.6    Economic data acquired by data processing (e.g., price per unit)
In4.1    Location of specimen
In4.2    Sensor data of tensile test machine|recycling aggregate
In4.3    User input from tensile test machine|recycling aggregate GUI
In4.4    Status of tensile test machine|recycling aggregate
In4.5    Process data acquired by data processing (e.g., utilization factor)
In4.6    Economic data acquired by data processing (e.g., price per unit)

Table 9. Functional layer fragments from Figure 6 and specifications.
Layer Fragment    Specification
F1.1    Statistical data by data processing
F1.2    Statistical data by data processing
F1.3    Statistical data by data processing
F1.4    Machine status
F1.5    MES data processing
F1.6    ERP data processing
F2.1    Statistical data by data processing
F2.2    Statistical data by data processing
F2.3    Statistical data by data processing
F2.4    Machine status
F2.5    MES data processing
F2.6    ERP data processing
F3.1    Statistical data by data processing
F3.2    Statistical data by data processing
F3.3    Statistical data by data processing
F3.4    Machine status
F3.5    MES data processing
F3.6    ERP data processing
F4.1    Statistical data by data processing
F4.2    Statistical data by data processing
F4.3    Statistical data by data processing
F4.4    Machine status
F4.5    MES data processing
F4.6    ERP data processing

Table 10. Business layer fragments from Figure 6 and corresponding specifications.
Layer Fragment    Specification
[...]    Human resources optimization
B4.4    Downtime risk minimization
B4.5    Process chain optimization
B4.6    Cost optimization
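The fragment identifiers in the tables above appear to follow a common pattern: a layer prefix, a station number, and a fragment number (e.g., In3.2). The Python sketch below parses such identifiers; it is an illustration only. The interpretation of the first number as a station index is inferred from Table 8 and is an assumption, the names `Fragment` and `parse_fragment` are hypothetical, and only the prefixes that actually occur in the tables shown here are covered.

```python
import re
from typing import NamedTuple

# Prefixes as they occur in Tables 6, 8, 9 and 10 (other layers are not covered).
LAYER_PREFIXES = {
    "It": "Integration",
    "In": "Information",
    "F": "Functional",
    "B": "Business",
}


class Fragment(NamedTuple):
    layer: str    # RAMI 4.0 layer name
    station: int  # assumed: index of the process station
    index: int    # running number of the fragment within the station


# Optional whitespace after the dot tolerates spellings such as "In3. 2".
_PATTERN = re.compile(r"^(It|In|F|B)(\d+)\.\s*(\d+)$")


def parse_fragment(identifier: str) -> Fragment:
    """Split a layer fragment identifier into layer, station and index."""
    match = _PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"Unrecognized fragment identifier: {identifier!r}")
    prefix, station, index = match.groups()
    return Fragment(LAYER_PREFIXES[prefix], int(station), int(index))


rolling_mill_sensor_data = parse_fragment("In3.2")
```

A structured representation of this kind could make it easier to check that every station has an entry on each layer.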
According to state-of-the-art literature, Big Data analytics is an integral part of the fourth industrial revolution. As this work demonstrates, this technology can also be seen as an essential component of the integration of SMEs into the metal processing value chain. Furthermore, by including the characteristics of Big Data in the initial planning phase of an interconnected metal processing supply chain project, the risk of misplanning can be minimized. The combination of structured and standardized planning, relying on the RAMI 4.0 model and the 5V definition of Big Data, can thus support SMEs and their stakeholders within the value chain in an accelerated integration approach. The additional consideration of IT security, as well as cloud computing solutions, further increases the resilience of a planned value chain integration with mixed enterprise sizes. Due to the financial and resource challenges that SMEs in particular have to overcome, the utilization of low-cost but industry-standard solutions, as demonstrated in Section 3, can lead to a significant boost in the digitalization, digital transformation, and finally value chain integration of these company types. The fulfillment of the 5Vs of Big Data analytics, however, is not mandatory for all types of SMEs. In the metal processing environment, the required volume can easily be reached, as this industry segment heavily relies on FEA-based WBM. For SMEs or even larger companies operating in industry fields where the process variety is low and/or the process parameters are stable and not complex, volume as well as variety are not necessarily high. Based on the results of the theory and case study shown in this paper, it can be stated that a fully digitalized metal processing value chain must always include the Big Data concept; therefore, H1 cannot be neglected.
Despite this conclusion, using the RAMI 4.0 model as a foundation for further concretization of the digital transformation of an SME can add value within all manufacturing-related industry segments. Considering the broader perspective of international supply and value chains, the authors argue that the RAMI 4.0 framework should place additional focus on the legal aspects of international collaborations and networks, especially with regard to legal differences in responsibilities and liabilities in the event of ML-involved accidents at the shopfloor level.

Conclusions and Outlook
In this paper, an approach for a systematic, standardized digitalization of a value chain is presented. For this purpose, key enablers for a digital transformation were identified according to state-of-the-art literature. The utilization of these key technologies and systems is then elaborated in more depth by applying possible configurations to the theoretical integration of SMEs into a digitalized metal processing value chain, especially considering the requirements of these enterprises operating with heterogeneous and complex processes. Based on this concretization, the resulting concept is applied to a small-scale value chain developed at the Montanuniversität Leoben for further analysis and validation. On this basis, the authors state that Big Data is a core element of a fully digitalized value chain within the metal processing environment. With increasing digitalization among all involved stakeholders within the metal processing sector, digital transformation and, therefore, digitalized value chains will subsequently increase, leading to further utilization of Big Data and corresponding technologies. As demonstrated within this work, the use of a suitable digital transformation framework can furthermore contribute to a more resilient digital transformation process and decrease the implementation and optimization time required to fulfill the requirements of an I 4.0 compatible supply chain. As already observable in other industry segments, technologies like blockchain can boost this development even more, e.g., by increasing data security [108][109][110]. Furthermore, the rise of quantum computing and its expected utilization in the manufacturing context can further boost the demand for Big Data technologies and the required know-how, which increases the importance of corresponding technologies and frameworks in the future even more.

Data Availability Statement: Data is available by request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.