Digital Twin Applications: A Survey of Recent Advances and Challenges

Abstract: Industry 4.0 integrates a series of emerging technologies, such as the Internet of Things (IoT), cyber-physical systems (CPS), cloud computing, and big data, and aims to improve operational efficiency and accelerate productivity in industrial environments. This article surveys the structure required to adopt Industry 4.0 approaches, briefly reviews related concepts, and identifies challenges and research opportunities for the adoption of so-called digital twins. Particular attention is paid to upgrading older systems so that legacy systems can obtain the well-known advantages of Industry 4.0: reduced production costs, increased efficiency, more robust equipment, and advanced process connectivity.


Introduction
Industry 4.0 emerged as a breakthrough in industrial development and can be defined as the connection of production entities through technologies such as big data [1,2], the Internet of Things (IoT) [3][4][5][6], cyber-physical systems (CPS) [7,8], machine learning [9,10], and others. CPS is a technology that integrates interconnected components on the shop floor. With Industry 4.0, this integration takes place through the IoT and cloud computing on the shop floor itself, mixing traditional manufacturing process technologies with modern computing (mostly embedded devices), software development approaches [11], and communication artifacts.
Digital twin (DT) systems integrate physical and virtual data throughout a product life cycle. A DT can exist along the product's lifecycle, from the initial concepts to the design stages. As a result, simulations may be executed before the final product is manufactured, facilitating real-time monitoring and quality evaluation. The result is a digital representation of a physical system, whose real-world counterpart is known as the physical twin [4].
The DT of a physical system in the natural world can simulate the system's life cycles and reflect the synchronized action of the virtual and physical twins, coupling the virtual and real worlds [12,13]. The product life cycle is thus an action in the physical world represented by a combination of the system's processes, data, and components. A DT can reproduce the state of physical entities through physical models, and because DTs update data in real time, the virtual models improve continuously through data updates from the physical assets.
Such systems show considerable improvements in agility, autonomy, and efficiency [14]. They are characterized by an interconnected network, regular data exchange, and communication links between all of the system's objects. In contrast stand the so-called legacy systems: among their many definitions, a legacy system can be described as an old application or system that may be based on outdated technologies but remains fundamental to day-to-day operations.
Considering the various applications of the DT, there is a promising scenario for further research into DT architecture, Industry 4.0 applications and concepts, legacy system upgrades, intelligent systems, and others. These technologies help improve performance and solve problems in the modern world. In this context, this paper summarizes research on digital twins and their principal characteristics, presents the main challenges and trends among them, and discusses the possibility of updating a legacy system using the retrofitting technique.
The remainder of this paper is organized as follows: Section 2 addresses the contextualization and definitions; Section 3 presents the management using DT and related works. In Section 4, an example of the architecture and model representations of a digital twin is given. Finally, Section 5 provides remarks and discusses the trends and challenges of the digital twin.

Contextualization and Definitions
The main concepts about DT are addressed in this section, together with the conceptualization and a brief presentation of the principal technologies investigated in the literature.

Industrial Cyber-Physical System
CPS may be seen as the basis of Industry 4.0, which allows the combination of software components with machines' mechanical or electronic parts. The technical integration of all CPS elements in production and planning is processed using Industrial IoT approaches. Here, data can be transmitted through diverse technologies, including digital subscriber lines (DSLs), robust wireless methods, or high-speed interconnections. Such integration represents an emerging trend in technology and engineering [15].
CPS refers to a system integrating modern computing and communication technologies with cyber and physical components through sensors, actuators, communication networks, and other technologies. Thus, a CPS can be described as a set of physical systems connected to a network (embedded systems and the internet) with a superior level of integration between products and processes, where the elements of the production chain have the characteristic of being "smart" as a prerequisite [16].
Embedded systems are the origin of cyber-physical systems; they perform specific computational functions and link mechanical and electrical components. A CPS can be described as physical computing devices or equipment interacting with virtual space through a data network, system components, and structure [17]. It is thus possible to register the state information of the physical element and ensure that it is secure and works efficiently, resulting in the intelligent operation of physical devices.
In a cyber-physical DT, virtual models are developed in cyberspace to mirror their corresponding physical objects' behaviors in the real world [18]. The cyber-physical DT has a separate simulated entity of an operating model to provide a digital footprint of the operational processes and characteristics, such as interoperability, configurability, and programmability, in implementing predictive maintenance and quality management systems [19,20]. This can be used in various applications and serves objectives at the facility level [21], shop floor level [22,23], and product level [24].

Digital Twin
A digital twin can be considered an essential advancement in the technology field, and there are diverse definitions of DT. The concept began to take shape when NASA sought to deepen the idea of a mirrored system to reduce costs and resources, which is why NASA started to investigate and develop DTs for its space assets. These definitions are still a work in progress. Several recently proposed concepts and meanings are shown in Table 1, which presents DT approaches related to vehicles, products, asset diagnostics and prognostics, digital representations of diverse assets, and simulations.

Table 1. Definitions of the digital twin in the literature.

"Ultrahigh fidelity physical models of the materials and structures that control the life of a vehicle." [25]
"Product digital counterpart of a physical product." [26]
"Virtual substitutes of real-world objects consisting of virtual representations and communication capabilities making up smart objects acting as intelligent nodes inside the internet of things and services." [27]
"Digital representation of a real-world object with focus on the object itself." [28]
"The simulation of the physical object itself to predict future states of the system." [29]
"Virtual representation of a real product in the context of Cyber-Physical Systems." [30]
"A digital twin is a virtual representation of a physical object or system-but it is much more than a high-tech lookalike. Digital twins use data, machine learning, and the Internet of Things (IoT) to help companies optimize, innovate, and deliver new services." [31]
"A digital twin is a virtual representation of a physical entity or system. The digital twin is much more than a picture, blueprint, or schematic: It is a dynamic, simulated view of a physical product continuously updated throughout the design, build, and operation lifecycle. The digital twin and its corresponding physical object exist in parallel, evolving together as the physical product progresses and matures." [32]
"A digital twin is a virtual representation of a physical object or system across its lifecycle, using real-time data to enable understanding, learning, and reasoning." [33]
"A digital twin application to monitor the CPS aiming to support aspects such as verification and validation of the running of complex systems through simulation." [34]

Although no single definition is universally accepted, the characteristics summarized in Table 1, taken together, amount to a reasonably clear working definition.
Specifically, a practice can be treated as the DT technique if it results in reduced operational costs, increased safety in risky processes, increased efficiency of products and systems, the anticipation of problems and risks, optimization of techniques and strategies, and others.
The life cycles of the physical system are reflected in the DT by obtaining data from the system, which in turn reproduces the operation of the model with its data, functions, and communication capabilities in the digital space [16]. Usually, if changes occur in the physical system, the models are automatically updated to reproduce the same event.
Thus, DT refers to a digital replica of physical assets (physical twin), in this case, which follows the life cycle to monitor, control, or optimize the processes. Moreover, an intelligent digital twin consists of a digital twin with all relevant characteristics and additional algorithms to implement artificial intelligence solutions inserted as part of the process [35].
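As a minimal illustration of this replica relationship, the sketch below mirrors state updates pushed from a physical asset into its virtual counterpart and keeps a history for monitoring. The class and field names are hypothetical, chosen only to make the idea concrete:

```python
from dataclasses import dataclass, field

@dataclass
class DigitalTwin:
    """Minimal virtual replica that mirrors the state of a physical asset.

    The physical twin pushes sensor readings; the virtual model keeps a
    state snapshot and a history so monitoring, control, or optimization
    services can query it across the life cycle.
    """
    asset_id: str
    state: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def sync(self, sensor_reading: dict) -> None:
        # Mirror the change that occurred in the physical system.
        self.state.update(sensor_reading)
        self.history.append(dict(self.state))

    def latest(self, key: str):
        return self.state.get(key)

# The physical asset reports readings; the twin updates automatically.
twin = DigitalTwin(asset_id="pump-01")
twin.sync({"temperature_c": 71.5, "rpm": 1450})
twin.sync({"temperature_c": 73.0})
print(twin.latest("temperature_c"))  # 73.0
print(len(twin.history))             # 2
```

A real DT would of course attach models and services to this mirrored state; the sketch shows only the synchronization core.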
We found an application of DT in a manufacturing context that proposes a model with three types of shop floor applications, one focusing on the product, the other on the process, and the last on the operation [36].

• Product Digital Twins: related to the outcome of the production process;
• Process Digital Twins: a structure described in steps that represents how the process works;
• Operation Digital Twins: an operational procedure that regulates how each part of the process works.
DT provides a high-fidelity model and updates the product life cycle. Thus, the DT system can reproduce the current state of a physical object in the virtual world with data so that every time information is added, the DT model improves. In [37], there is a proposal to create computable virtual abstractions of complex manufacturing arrangements denoted as phenomena twins to develop the CPS needed for work in Industry 4.0.
In this context, the DT simulation can reduce unpredicted and undesirable conditions caused by diverse factors such as communication error, network failure, false human interactions, and others. This ability is explained by the fact that DT captures and processes the current information about the running system to predict those conditions [38]. DT can process data to implement diagnostics, optimization, and prediction tasks. However, it is challenging to predict structurally complex systems accurately because a considerable data flow is needed to provide a suitable data representation.
Some complex cases are linked to the physical system's characteristics, depending on the application or process (manufacturing management, among others). One example is real-time indoor localization, which remains a considerable problem in diverse applications [39]. Such situations have accelerated the realization of edge intelligence in the industrial Internet of Things (IIoT); [40] proposed a solution that improves the reliability and security of the system and formulated an optimization for edge association using the digital twin.
The DT can learn and update itself from various sources to represent its status, working condition, or position in real time [41]. The more sophisticated the DT is, the larger the demand for increased connectivity, which is also necessary to expand the network's reachability for indoor environments [42,43] and process a large amount of data in distinct phases of the product lifecycle [44].
Intelligent manufacturing merges the virtual and physical worlds through CPS and the IoT [45,46], common concepts within Industry 4.0, which proposed next-generation smart manufacturing to achieve high adaptability, fast changes in design, digital information technology, and a more flexible process. According to [35], DT definitions contain three parts in the literature: models and data, interconnection, and simulation. Table 2 summarizes existing DT applications by classifying them based on concepts and definitions, management, cybersecurity, shop floor, manufacturing, and formal requirements. The main aspects of the research goals are highlighted in each group, and a brief presentation is made.

Table 2. Summary of existing DT applications, grouped by research goal.

Concepts and definitions (definitions of digital twins and applications):
• Extract and infer knowledge from large-scale production line data. [42]
• The perspectives that shape the intelligent factory, with approaches and technical support to realize them. [47]
• Benefits of integrating digital twins with system simulation and the Internet of Things (IoT) in support of MBSE. [48]
• Analysis of state-of-the-art definitions of DT, the main characteristics a DT should possess, and the domains in which DT applications are currently being developed. [49]
• Methodologies and techniques for constructing digital twins, mostly from a modeling perspective. [50]

Management (applications and management systems to support the digital twin):
• A generic reference architecture based on these concepts and a concrete implementation methodology using AutomationML. [30]
• Development of digital twin technology focused on Prognostics and Health Management. [51]

Cybersecurity (digital twins related to networks and interlinked components):
• Research on cyber-physical systems and digital twins in Industry 4.0 to determine obstacles to adopting cybersecurity for digital twins/CPS in the built environment. [52]
• A methodology to create and execute cybersecurity test cases on the fly in a black-box setting, using pattern-matching-based binary analysis, translation mechanisms to formal attack descriptions, and model-checking techniques. [53]

Shop floor (digital twins in industrial applications):
• A shop floor designed to reach the interaction and convergence between physical and virtual spaces, which is now the imperative demand of intelligent manufacturing. [54]
• A detailed representation of a complex shop-floor organization system employing automated guided vehicles (AGVs). [55]

Manufacturing (digital twins in industrial applications):
• A systematic method for constructing a digital twin for customized products, including 3D modeling, mechanism modeling, and real-time synchronization. [56]
• Graph learning as a potential pathway toward enabling cognitive functionalities in manufacturing digital twins. [57]

Formal requirements (modeling, simulation, verification, and validation):
• Because of the characteristics of physical systems, handling requirements and constraints across the CPS lifecycle is highly challenging; an integrated solution to formally define system requirements and automate their verification through simulation was proposed. [58]
• Steps toward defining a graphical syntax to express requirements for intelligent industries; a formal constraint-based language was proposed for modeling assumptions, requirements, and preliminary designs. [34]

Despite the different applications and specificity of each study, we note that DT can represent, simulate, and interpret collected data.
In many cases, there is communication with other DTs and schemes to formalize and validate requirements. It is also common to use IoT, CPS, and artificial intelligence (AI) or machine learning to preprocess and predict system behavior [59].
In addition, AI methodologies support essential functionalities of the system, in some cases providing the capacity to learn on its own from all the environment and process data that represent the actual operating conditions of the running system [17]. Operators such as engineers, technicians, or specialists with deep and relevant domain knowledge of the devices and machines from similar industrial structures then analyze and interpret these data. This experience and expertise are applied to improve factors related to production, demand management, and cost reduction.

Internet of Things and Big Data
The IoT is a networking infrastructure that connects various devices and monitors and controls them using modern technologies in cyberspace. All interactions generate multiple data sources: human experts with deep and relevant industry domain knowledge, other similar machines, other fleets of similar devices, and the larger systems and environments of which a device may be a part.
The collected data must be stored and analyzed to obtain the desired results. To be useful, they must be managed through a sequence of steps that extract the relevant information, and various analysis methods and tools exist to translate data [60][61][62]. These studies require massive amounts of data and computing capability to produce a DT model, a progression that will rely on advances in computing and communications technology.
In the DT process, the virtual-physical interaction and data acquisition are challenging tasks. Data from the physical world are transmitted to the virtual world by various sensors and involve all steps of the process, such as simulation, validation, and commissioning [63]. Data are thus exchanged between the physical and virtual worlds, so that changes occurring in the physical world also happen in the virtual one.
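One early step in this data exchange is cleaning and summarizing raw sensor windows before they reach the virtual model. A minimal sketch, assuming readings arrive as flagged samples (the field names `value` and `ok` are illustrative, not from any cited work):

```python
import statistics

def clean_and_aggregate(readings):
    """Drop invalid samples and summarize a window of sensor data.

    readings: list of dicts like {"value": float | None, "ok": bool},
    where "ok" marks whether the sensor flagged the sample as valid.
    Returns (mean, stdev, n_valid), or None if no sample survives.
    """
    values = [r["value"] for r in readings
              if r.get("ok") and r.get("value") is not None]
    if not values:
        return None
    return statistics.fmean(values), statistics.pstdev(values), len(values)

window = [
    {"value": 20.1, "ok": True},
    {"value": None, "ok": True},   # dropped: missing value
    {"value": 19.9, "ok": True},
    {"value": 55.0, "ok": False},  # dropped: sensor flagged a fault
    {"value": 20.0, "ok": True},
]
print(clean_and_aggregate(window))
```

The aggregate (rather than every raw sample) is what typically crosses from the physical to the virtual side, which keeps the data flow manageable.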

DT-Driven Management
DT-driven management can, in some cases, predict potential problems and support integration between process and operation. Management activities include defining an organization's strategy and coordinating the roles and duties of its employees to meet its demands through the application of available resources (financial, technological, and human).
The digital twin supports features such as the decision-making process, configuration, reconfiguration, planning, commissioning, and condition monitoring. Therefore, DTs can function as a real-time management system.
The development of science and technology necessitates the development of novel quality management methods. Innovative approaches are emerging along with existing processes and quality management systems [64,65].
There are distinct types of management DT, and the concept of an intelligent inspection management system has emerged. This type of system integrates modern technical components, such as the Production Management System (PMS) [66], Geographic Information Systems (GIS) [24], Radio-Frequency Identification (RFID), and Personal Digital Assistants (PDA) [67], used in process and maintenance work. The goal is to improve the efficiency of inspection work by controlling and analyzing the process through data provided by various sensors and devices, supporting the correct measurement of variables and displaying this information on the monitoring platform [22].
In the literature, there are several works on management using DT. Examples include a DT model of an equipment management system applied to an intelligent railway station [59] and a prediction system using a scene frame based on digital twin technology [68], among other applications.

Model Representation and Architecture
In general, the adopted DT architectures are hierarchical. They are based on DTs as a prerequisite for developing a CPS, where the DTs offer services to the CPS for enabling the CPS' control. Hierarchical architectures are usually multilayers, where the bottom layer is related to the physical system, and the top layer is associated with the decision layer. Figure 1 shows a generic hierarchical architecture that details the DT.
Each layer has a specific function. The physical layer depicts the physical description of the system or process. The data layer represents the different information about the system. The integration layer supports and guarantees integration between the layers. The service layer provides tools to control, monitor, and manage the system. The business layer is the last layer and presents centralized information about the process, supporting the decision-making process. Each layer of this architecture is briefly explained below.
Physical layer (L0). Physical elements such as sensors and actuators, conveyors, robotic arms, pumps, and other devices are monitored. All the data on the process are acquired and curated in the next layer, i.e., the data layer.
Data layer (L1). The data flow between the physical components and their digital representations is continuous in both directions. In particular, the large amount of data from this layer must be translated to obtain meaningful information. For this purpose, big data, clustering, and granular computing techniques can be used. Several data-driven approaches can be proposed for control, simulation, fault diagnosis, and health management. Otherwise, those services can also be performed based on parametric models obtained through regression and systems identification to enable a model-based approach. Moreover, machine learning and AI can also allow decision-making and data analysis in this layer.
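As a sketch of the model-based route mentioned above, the following fits a first-order parametric model to logged input/state data by least squares. The model structure and the synthetic data are illustrative assumptions, not a method from the cited works:

```python
import numpy as np

def identify_first_order(x, u):
    """Fit x[k+1] = a*x[k] + b*u[k] to logged data by least squares.

    x: N+1 state samples; u: N input samples. Returns (a, b).
    This is the kind of parametric model the data layer can hand to
    model-based services such as simulation or fault diagnosis.
    """
    X = np.column_stack([x[:-1], u])  # regressors [x[k], u[k]]
    y = x[1:]                          # targets x[k+1]
    (a, b), *_ = np.linalg.lstsq(X, y, rcond=None)
    return a, b

# Synthetic log generated with a = 0.9, b = 0.5 (noise-free).
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
x = np.zeros(201)
for k in range(200):
    x[k + 1] = 0.9 * x[k] + 0.5 * u[k]

a_hat, b_hat = identify_first_order(x, u)
print(round(float(a_hat), 3), round(float(b_hat), 3))  # ≈ 0.9 and 0.5
```

With noisy plant data, the same regression still applies, but the estimates carry uncertainty that downstream services must account for.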
Integration layer (L2). It guarantees integration between layers and bridges the gap between the physical and virtual layers. In this layer, network technologies such as the IoT are relevant to efficiently connecting processes and systems. However, the CPS's dependence on this layer highlights issues of cybersecurity and data integrity, since networked systems are more vulnerable to cyberattacks.
Service layer (L3). Here, the integration provided by L2 may be synchronized with the data and physical layers, enabling DT development. The DT can provide different services through data processing, modeling tools, automatic reasoning, AI, and optimization algorithms. DT-driven services include monitoring, diagnosis and prognostics, networked control, verification and validation, fault tolerance, and cybersecurity.

Decision layer (L4). Based on DT-driven services, the decision layer allows control of the physical system by considering specific goals, e.g., increasing production, reducing system degradation, or reducing costs. The main control algorithms can be implemented using DT services in this layer. For example, a DT can provide a failure prognostics service by estimating the remaining useful life of a system's component. The maintenance condition can then be set to reduce losses, and a health-aware control algorithm can be used to extend the useful life while ensuring the control requirements.
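A deliberately simple instance of the failure-prognostics service named above is linear extrapolation of a degradation indicator to a failure threshold. The indicator, threshold, and data below are hypothetical; real prognostic models are far richer:

```python
def remaining_useful_life(times, health, failure_threshold):
    """Estimate remaining useful life (RUL) by fitting a straight line
    to a degradation indicator and extrapolating it to the threshold.

    times: sample times; health: degradation indicator growing toward
    failure_threshold. Returns the estimated time remaining after the
    last sample, or None if no degradation trend is detected.
    """
    n = len(times)
    t_mean = sum(times) / n
    h_mean = sum(health) / n
    num = sum((t - t_mean) * (h - h_mean) for t, h in zip(times, health))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den
    if slope <= 0:
        return None  # not degrading: RUL undefined under this model
    intercept = h_mean - slope * t_mean
    t_fail = (failure_threshold - intercept) / slope
    return max(0.0, t_fail - times[-1])

# Wear grows ~0.2 units/hour; failure is declared at wear = 10.
hours = [0, 10, 20, 30, 40]
wear = [1.0, 3.0, 5.0, 7.0, 9.0]
print(remaining_useful_life(hours, wear, failure_threshold=10.0))
```

The decision layer (L4) would consume such an estimate, e.g., scheduling maintenance or tightening a health-aware control constraint as the RUL shrinks.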
The CPS may integrate all these layers and use the DT as a core function to ensure the control and operation of the entire process based on the virtual product lifecycle model. In short, the DT needs the CPS to integrate the physical and virtual worlds. The layers discussed above define a basic DT that can be used to develop diverse applications, although the required complexity depends on each case.
Minerva et al. [69] propose a generic approach to designing the DT architecture model, detailing the elements that compose it, such as essential features, types of layer structure, layer communications, and the technologies used, and present compelling examples. In [70], a DT network was proposed that can operate under a fault in the communication network. This approach virtualizes physical objects to monitor the main process variables and maintains a continuous data flow through a preprocessing stage, a specific layer of the method that ensures high data quality and realizes the dynamic interaction and synchronized evolution of the physical and virtual objects.
In [71], other technologies (digital twin networks, the IoT, CPS, and further technologies used in the Industry 4.0 concept) were used to integrate top-down requirements. This integration presents challenges such as integration at distinct levels and support for fluid, synchronized processes. These challenges may be addressed by technologies such as edge intelligence, which integrates edge computing with AI, providing low-latency, high-security computing services and improving process performance.

Trends and Challenges of the Digital Twin
This paper has summarized concepts, definitions, approaches, and characterizations of different technologies applied in DT. DT has become increasingly widespread with the growth of the Industry 4.0 concept; here, we outline the trends and obstacles that might be encountered.
In general, the models proposed in the DT framework are built by observing the physical system, that is, by investigating and understanding the entire system. Modern needs include the open problem of precise indoor localization and the proper representation of virtual reality, giving the user a true sensation of what is happening on the shop floor.
Faithful DT representations that can predict unmodeled and emergent events remain a challenge for DT models. Emergence occurs when unplanned or unpredictable behavior arises in the production process. When such a situation occurs, the system must measure this information and respond with countermeasures such as prediction and detection of abnormalities. This is an interesting problem because an undesired event may or may not already be described in the model, and handling it combines concepts from AI, machine learning, and DT networks to adapt the system to the new situation.
The main idea is to reproduce/simulate such conditions on the premise that they may occur. However, when unwanted behavior does occur, the system must manage the resulting changes. These changes are parameter variations that the system cannot always absorb because they are too numerous, unforeseen, or beyond the available processing power, which constitutes an obstacle worth highlighting in this context.
The emergence problem mentioned earlier is related to system complexity. It can be dealt with by obtaining a larger amount of data in the model-building phase or by "front-running" a simulation in real time. The situation can then be handled before it occurs, yielding a more straightforward solution that depends only on the designer, in contrast to the computational effort demanded by solutions that integrate AI and machine learning. However, such a solution is specific rather than general and increases dependency on the designer's knowledge to adapt and improve it.
To mitigate this complexity, the simulation could serve as a window into future system states, running slightly ahead of the real system on a real-time data feed [64]. Such an approach would make it possible to predict behavior with a certain degree of anticipation, since the simulated behavior precedes the reproduction of the entire system, opening up various application possibilities.
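The "window into the future" idea can be sketched as follows: a model of the process is stepped forward from the live state faster than real time, and a limit violation is reported before the physical system reaches it. The thermal model, parameters, and limits below are invented purely for illustration:

```python
def front_run(state, control, model, horizon, limit):
    """Simulate the model forward from the current (live) state and
    report the first step at which the limit would be violated.

    model(state, control) -> next state (one discrete step).
    Returns the step index of the predicted violation, or None.
    """
    for step in range(1, horizon + 1):
        state = model(state, control)
        if state > limit:
            return step
    return None

# Hypothetical first-order thermal model: temperature drifts toward an
# equilibrium set by the heater duty cycle.
def thermal_model(temp, duty):
    return temp + 0.1 * (30.0 + 60.0 * duty - temp)

# Fed with the live temperature, the simulation warns several steps
# before the real system would reach the 80 degree alarm limit.
print(front_run(state=70.0, control=1.0, model=thermal_model,
                horizon=50, limit=80.0))  # 7
```

Because the simulated steps cost far less than real process time, the DT gains exactly the anticipation window described above.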
Another possibility in this sense is applying retrofitting techniques to bring legacy systems up to Industry 4.0 technologies. Retrofitting is often the most cost-effective path, providing advantages such as reduced production costs, efficiency, and reliability. The main idea is to update a legacy system's process to meet specific demands of the Industry 4.0 concept.
In this situation, the strategy should identify similarities related to technology and resources in the retrofitting process to facilitate and create a basic overall model to increase the number of old systems that may migrate to the Industry 4.0 concept. The retrofitting also allows the inclusion of more advanced management models and real-time monitoring of each asset in the process.

Conclusions
This paper summarizes concepts and definitions and presents a generic architecture described in five layers for DT applications in Industry 4.0. The proposed structure in five layers aims to describe and specify a basic design and detail the functionality of physical objects, related technologies, and the integration interlayer process. Then, the challenges related to the retrofitting process of the legacy system were discussed, i.e., updating the process in the context of Industry 4.0 and improving characteristics such as increasing productivity, reducing costs, and updating the devices of the process.
A retrofitting process is an essential approach in industry, and it is possible to define steps to achieve the desired update: first, analyze the equipment or process and its environment; then, describe the control procedure and monitor the variables and the new data flow; finally, use technologies such as the IoT, cloud computing, DT, and CPS to update the legacy system.
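These steps can be sketched as a thin adapter around a legacy device, assuming the device only exposes raw register reads. The register interface, scaling factor, topic-like payload, and field names are stand-ins for whatever the legacy equipment actually offers, not a real PLC API:

```python
import json
import time

class RetrofitAdapter:
    """Wraps a legacy device that only exposes raw register reads and
    republishes its state as timestamped JSON for an IoT/DT backend.

    read_register is an illustrative stand-in for the legacy interface
    (e.g., a fieldbus or serial poll).
    """
    def __init__(self, device_id, read_register):
        self.device_id = device_id
        self.read_register = read_register

    def poll(self):
        raw = self.read_register()
        # Retrofit steps: describe the monitored variable and its
        # scaling, then expose it to the new data flow as JSON.
        message = {
            "device": self.device_id,
            "timestamp": time.time(),
            "temperature_c": round(raw * 0.1, 1),  # register holds tenths of a degree
        }
        return json.dumps(message)

# A fake legacy register standing in for real device access.
adapter = RetrofitAdapter("press-07", read_register=lambda: 653)
payload = json.loads(adapter.poll())
print(payload["temperature_c"])  # 65.3
```

From here, the JSON payload could feed a DT data layer or cloud platform, which is precisely the migration path the retrofitting strategy aims at.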
Our primary goal in future works is to investigate and deeply formalize a retrofitting process to a specific legacy system to achieve the technologies and concepts used in Industry 4.0. Moreover, a DT framework will be proposed to attain this goal and evaluate whether the legacy system achieves the Industry 4.0 concept.