# Defining a Digital Twin: A Data Science-Based Unification

## Abstract


## 1. Introduction

## 2. Grand Vision of DT and Derived Application

“This book describes an event that will happen someday soon: You will look into a computer screen and see reality. Some part of your world the town you live in, the company you work for, your school system, the city hospital… This Mirror World you are looking at is fed by a steady rush of new data pouring in through cables.”

“Mirror Worlds aren’t ordinary programs. They are software ensembles, glued-together out of many separate programs all chattering at once.”

“Consider Darwin’s twin processes of speciation and evolution. Ensembles evolve: ensembles develop species. Individuals don’t.”

“why not capture an entire country?”

#### Even Earlier Visions

## 3. What Is a Digital Twin: A Literature Survey

- “The Digital Twin is an integrated multiphysics, multiscale, probabilistic simulation of an as-built vehicle or system that uses the best available physical models, sensor updates, fleet history, etc., to mirror the life of its corresponding flying twin” [16].
- “The Digital Twin is a set of virtual information constructs that fully describes a potential or actual physical manufactured product from the micro atomic level to the macro geometrical level. At its optimum, any information that could be obtained from inspecting a physical manufactured product can be obtained from its Digital Twin” [17].
- “Digital twin is an integrated multi-physics, multi-scale, probabilistic simulation of a complex product and uses the best available physical models, sensor updates, etc., to mirror the life of its corresponding twin. Meanwhile, digital twin consists of three parts: physical product, virtual product, and connected data that tie the physical and virtual product” [18].
- “A complete DT should include five dimensions: physical part, virtual part, connection, data, and service” [18].
- “The digital twin is composed of three components, which are the physical entities in the physical world, the virtual models in the virtual world, and the connected data that tie the two worlds” [19].
- “Digital twin is a virtual, dynamic model in the virtual world that is fully consistent with its corresponding physical entity in the real world and can simulate its physical counterpart’s characteristics, behavior, life, and performance in a timely fashion” [20].
- “A Digital Twin is a virtual instance of a physical system (twin) that is continually updated with the latter’s performance, maintenance, and health status data throughout the physical system’s life cycle” [21].
- “The digital twin is actually a living model of the physical asset or system, which continually adapts to operational changes based on the collected online data and information, and can forecast the future of the corresponding physical counterpart” [22].
- “A digital twin is defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision making” [23].
- “In health care, the ‘digital twin’ denotes the vision of a comprehensive, virtual tool that integrates coherently and dynamically the clinical data acquired over time for an individual using mechanistic and statistical models. This borrows but expands the concept of ‘digital twin’ used in engineering industries, where in silico representations of a physical system, such as an engine or a wind farm, are used to optimize design or control processes, with a real-time connection between the physical system and the model” [2].
- “A Digital Twin is a dynamic and self-evolving digital/virtual model or simulation of a real-life subject or object (part, machine, process, human, etc.) representing the exact state of its physical twin at any given point of time via exchanging the real-time data as well as keeping the historical data. It is not just the Digital Twin which mimics its physical twin but any changes in the Digital Twin are mimicked by the physical twin too” [6].

## 4. Comparison with Physical Theories

#### 4.1. Adaptive System

#### 4.2. Real-Time System

#### 4.3. Data Stream

- Evolution equations (or dynamical system)
- Parameters
- Initial conditions
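The three ingredients listed above are enough to generate a data stream from a model. The following is a minimal sketch under stated assumptions: the damped oscillator, the parameter names `k` and `c`, the forward Euler scheme, and the helper `data_stream` are illustrative choices and are not taken from the paper.

```python
from __future__ import annotations

from typing import Callable, Iterator


def data_stream(
    evolve: Callable[[list[float], dict[str, float], float], list[float]],
    params: dict[str, float],
    initial_state: list[float],
    dt: float,
    n_steps: int,
) -> Iterator[tuple[float, list[float]]]:
    """Yield (time, state) pairs by stepping the evolution equations with forward Euler."""
    state = list(initial_state)
    for step in range(n_steps + 1):
        t = step * dt
        yield t, state
        derivative = evolve(state, params, t)
        # A new list is built at every step, so states already yielded are never mutated.
        state = [x + dt * dx for x, dx in zip(state, derivative)]


def damped_oscillator(state: list[float], params: dict[str, float], t: float) -> list[float]:
    """Hypothetical evolution equations: x' = v, v' = -k*x - c*v."""
    x, v = state
    return [v, -params["k"] * x - params["c"] * v]


# Evolution equations + parameters + initial conditions -> data stream.
stream = list(
    data_stream(
        evolve=damped_oscillator,
        params={"k": 1.0, "c": 0.1},
        initial_state=[1.0, 0.0],
        dt=0.01,
        n_steps=1000,
    )
)
```

Changing any one of the three ingredients (the function `damped_oscillator`, the dictionary of parameters, or the initial state) changes the generated stream, which is why all three need to be specified before a model can produce data.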

#### 4.4. Simulations

#### 4.5. Special Case of a Digital Twin: Physical Theory

## 5. Defining Digital Twin and Digital Twin System

#### 5.1. Digital Twin

**Remark 1.**

**Definition 1** (Digital twin)**.**

**Definition 2** (Identification of a digital twin)**.**

#### 5.2. Digital Twin System

**Definition 3** (Digital twin system)**.**

## 6. Discussion

#### 6.1. Characteristics of a DT

#### 6.2. Types of DT Models

- Physics-based models
- Data-driven models
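To make the contrast between the two model types concrete, here is a minimal sketch. It assumes a cooling object as the physical entity, Newton's law of cooling as the physics-based model, and a least-squares line as the data-driven model. All function names and numeric values are hypothetical, and the observations are synthetic.

```python
import math


def physics_based_model(t, initial_temp, ambient_temp, k):
    """Newton's law of cooling: the functional form is dictated by physics."""
    return ambient_temp + (initial_temp - ambient_temp) * math.exp(-k * t)


def fit_data_driven_model(times, values):
    """Fit a straight line by ordinary least squares using observations only."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t

    def predict(t):
        return intercept + slope * t

    return predict


# Synthetic observations (an assumption for illustration, not measured data).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
observations = [physics_based_model(t, 90.0, 20.0, 0.3) for t in times]
data_driven_model = fit_data_driven_model(times, observations)

# Compare both models inside (t = 2) and far outside (t = 20) the observed range.
for t in (2.0, 20.0):
    print(t, physics_based_model(t, 90.0, 20.0, 0.3), data_driven_model(t))
```

In this toy setting the fitted line tracks the observations within the observed time range but extrapolates below the ambient temperature far outside it, whereas the physics-based form stays bounded. A more flexible data-driven model would behave differently, so the comparison illustrates the distinction and nothing more.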

#### 6.3. Connections to CAS and GST

#### 6.4. What-If Scenarios

#### 6.5. Differences to Previous Definitions of DT

**Application-driven approach:** From the literature, one can see that many attempts to define a digital twin approach the problem from an application-driven perspective. This means that a problem, usually from manufacturing or engineering, is identified and then discussed using expert knowledge. Unfortunately, this makes the transfer of information to other fields challenging for scientists who lack domain-specific knowledge of the field underlying such studies. Considering that digital twin research is of interdisciplinary interest [59], this is a severe issue.

**Technology-driven approach:** Other studies approach the problem in a technology-driven manner. This means they focus on the implementation and realization of various parts of a digital twin and discuss algorithmic or technological difficulties. Frequently, this is done in combination with an application-driven discussion. One major drawback of such approaches is their tendency to be cumbersome and overloaded with non-essential details, which makes it difficult to grasp the core concepts of the bare-bones theory. In this context, it is worth reflecting on our discussion of physical theories in Section 4, where we observed that physical theories do not establish a direct connection to their technological implementation. Rather, they concentrate solely on providing a theoretical description of phenomena. This observation inspired our approach.

**Theory-driven approach:** In order to avoid the above problems, we adopted a theory-driven approach to define our digital twin framework. This avoids both the limited transferability of application-driven presentations to other fields and the cumbersome presentations of technology-driven studies. As a consequence, our theory-driven framework is not only more abstract and formal but also more general.

#### 6.6. Application Benefits of Our Framework

#### 6.7. Interpretations of DT and DTS

- What is the difference between a digital twin and a digital twin system? A digital twin is a mathematical model whose measurable characteristics are indistinguishable from those of its physical counterpart. Hence, a DT is just a mechanism for generating data that are practically identical to the data generated by its physical counterpart. The DT is part of a digital twin system (DTS), which outlines a decision-making framework. That is, a DTS processes the data generated by a DT, converts them into information, and translates this information into decisions.
- Does the definition of a DTS as given in this paper, or any other characterization of a DT from the literature, tell you how to specify a digital twin? No, it assumes you have a mathematical model for a DT that is sufficient to serve as a virtual representation of a physical entity.
- Does the definition of a DTS as given in this paper provide general standards? Yes, our definition provides a flexible functional framework that allows one to see a digital twin as part of a wider process for analyzing data. Here, the standards are set for the connectivity of the involved components.
- Does the definition of a DTS as given in this paper provide specifications for what technologies should be used? No, this information can only be provided by addressing a particular problem. Hence, such an answer is not generic but application domain-specific. In contrast, the data science-based definition of our digital twin system is application domain-independent.
- Why do physical theories not require updates? Because mathematical models that deserve to be called “theory” provide faithful descriptions of (physical) problems. That means they are “good” even without an update mechanism. In contrast, the problems addressed by mathematical models in engineering, manufacturing, health, medicine or climate science are much more complex than physical problems and, currently, no theories are known for them that would be comparable in quality to physical theories. Hence, the need to introduce mathematical models with an updating mechanism in such areas, which we called a DT, is due to quality deficits of the current models in these areas.
- Is a DTS a jack of all trades? No, for two reasons. First, the underlying problem is very difficult and no single mathematical model is known to provide sufficient quality. Second, the predictions of a DTS (A) refer to future states of the physical object and (B) involve interventions on it; see Figure 3. That means a DTS aims to answer “what if” questions that require the underlying system (corresponding to PO) to go through significant changes.
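The division of labor between a DT and a DTS described in the first item above can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not the paper's formal definition: the class `DigitalTwin`, the parameter `wear_rate`, the intervention `load`, the decision threshold, and the decision labels are all hypothetical.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class DigitalTwin:
    """Hypothetical DT: a generative model that only produces data."""

    wear_rate: float  # model parameter (assumed name)

    def generate(self, n, load=1.0, seed=0):
        """Simulate n noisy wear measurements under a given load (the intervention)."""
        rng = random.Random(seed)
        return [self.wear_rate * load * (1.0 + rng.gauss(0.0, 0.05)) for _ in range(n)]


def to_information(data):
    """Data -> information: summarize the simulated measurements."""
    return statistics.mean(data)


def to_decision(mean_wear, threshold):
    """Information -> decision: a simple threshold rule."""
    return "schedule maintenance" if mean_wear > threshold else "continue operation"


def digital_twin_system(twin, load, threshold=1.5):
    """Hypothetical DTS: wraps the DT in a data -> information -> decision chain."""
    data = twin.generate(n=200, load=load)
    information = to_information(data)
    return to_decision(information, threshold)


twin = DigitalTwin(wear_rate=1.0)
baseline = digital_twin_system(twin, load=1.0)  # current operating conditions
what_if = digital_twin_system(twin, load=2.0)   # "what if the load were doubled?"
print(baseline, "|", what_if)
```

The DT itself never decides anything; it only returns data. The what-if question is posed by changing the `load` argument, which stands in for an intervention on the physical object.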

## 7. Outlook

## 8. Conclusions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

1. Wang, J.; Li, X.; Wang, P.; Liu, Q. Bibliometric analysis of digital twin literature: A review of influencing factors and conceptual structure. Technol. Anal. Strateg. Manag. **2022**, 1–15.
2. Corral-Acero, J.; Margara, F.; Marciniak, M.; Rodero, C.; Loncaric, F.; Feng, Y.; Gilbert, A.; Fernandes, J.F.; Bukhari, H.A.; Wajdan, A.; et al. The ‘Digital Twin’ to enable the vision of precision cardiology. Eur. Heart J. **2020**, 41, 4556–4564.
3. Bauer, P.; Stevens, B.; Hazeleger, W. A digital twin of Earth for the green transition. Nat. Clim. Chang. **2021**, 11, 80–83.
4. Voosen, P. Europe builds ‘digital twin’ of Earth to hone climate forecasts. Science **2020**, 370, 16.
5. Duan, H.; Gao, S.; Yang, X.; Li, Y. The development of a digital twin concept system. Digit. Twin **2023**, 2, 10.
6. Singh, M.; Fuenmayor, E.; Hinchy, E.P.; Qiao, Y.; Murray, N.; Devine, D. Digital twin: Origin to future. Appl. Syst. Innov. **2021**, 4, 36.
7. Newrzella, S.R.; Franklin, D.W.; Haider, S. 5-Dimension cross-industry digital twin applications model and analysis of digital twin classification terms and models. IEEE Access **2021**, 9, 131306–131321.
8. Hassani, H.; Huang, X.; MacFeely, S. Impactful Digital Twin in the Healthcare Revolution. Big Data Cogn. Comput. **2022**, 6, 83.
9. Grieves, M.W. Product lifecycle management: The new paradigm for enterprises. Int. J. Prod. Dev. **2005**, 2, 71–84.
10. Hernandez-Boussard, T.; Macklin, P.; Greenspan, E.J.; Gryshuk, A.L.; Stahlberg, E.; Syeda-Mahmood, T.; Shmulevich, I. Digital twins for predictive oncology will be a paradigm shift for precision cancer care. Nat. Med. **2021**, 27, 2065–2066.
11. Pobuda, P. The digital twin of the economy: Proposed tool for policy design and evaluation. Real-World Econ. Rev. **2020**, 94, 1–9.
12. Gelernter, D. Mirror Worlds: Or the Day Software Puts the Universe in A Shoebox… How It Will Happen and What It Will Mean; Oxford University Press: Oxford, UK, 1991.
13. Aheleroff, S.; Zhong, R.Y.; Xu, X. A digital twin reference for mass personalization in industry 4.0. Procedia CIRP **2020**, 93, 228–233.
14. Mitchell, J.; Moore, J.; Trauboth, H.H. Digital simulation of an aerospace vehicle. In Proceedings of the 1967 22nd National Conference, Washington, DC, USA, 1 January 1967; pp. 13–18.
15. Trauboth, H.; Prasad, N. MARSYAS: A software system for the digital simulation of physical systems. In Proceedings of the Spring Joint Computer Conference, New York, NY, USA, 5–7 May 1970; pp. 223–235.
16. Glaessgen, E.; Stargel, D. The digital twin paradigm for future NASA and US Air Force vehicles. In Proceedings of the 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference 20th AIAA/ASME/AHS Adaptive Structures Conference 14th AIAA, Honolulu, HI, USA, 23–26 April 2012; p. 1818.
17. Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems: New Findings and Approaches; Springer: Berlin/Heidelberg, Germany, 2017; pp. 85–113.
18. Tao, F.; Zhang, H.; Liu, A.; Nee, A.Y. Digital twin in industry: State-of-the-art. IEEE Trans. Ind. Inform. **2018**, 15, 2405–2415.
19. Qi, Q.; Tao, F. Digital twin and big data towards smart manufacturing and industry 4.0: 360 degree comparison. IEEE Access **2018**, 6, 3585–3593.
20. Zhuang, C.; Liu, J.; Xiong, H. Digital twin-based smart production management and control framework for the complex product assembly shop-floor. Int. J. Adv. Manuf. Technol. **2018**, 96, 1149–1163.
21. Madni, A.M.; Madni, C.C.; Lucero, S.D. Leveraging digital twin technology in model-based systems engineering. Systems **2019**, 7, 7.
22. Liu, Z.; Meyendorf, N.; Mrad, N. The role of data fusion in predictive maintenance using digital twin. In AIP Conference Proceedings; AIP Publishing LLC: Melville, NY, USA, 2018; Volume 1949, p. 020023.
23. Rasheed, A.; San, O.; Kvamsdal, T. Digital twin: Values, challenges and enablers from a modeling perspective. IEEE Access **2020**, 8, 21980–22012.
24. Emmert-Streib, F.; Yli-Harja, O. What Is a Digital Twin? Experimental Design for a Data-Centric Machine Learning Perspective in Health. Int. J. Mol. Sci. **2022**, 23, 13149.
25. Wang, K.; Wang, Y.; Li, Y.; Fan, X.; Xiao, S.; Hu, L. A review of the technology standards for enabling digital twin. Digit. Twin **2022**, 2, 4.
26. Moyne, J.; Qamsane, Y.; Balta, E.C.; Kovalenko, I.; Faris, J.; Barton, K.; Tilbury, D.M. A requirements driven digital twin framework: Specification and opportunities. IEEE Access **2020**, 8, 107781–107801.
27. Ashtekar, A.; Pawlowski, T.; Singh, P. Quantum nature of the big bang: An analytical and numerical investigation. Phys. Rev. D **2006**, 73, 124038.
28. Van Rienen, U. Numerical Methods in Computational Electrodynamics: Linear Systems in Practical Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2001; Volume 12.
29. Higham, D.J. An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev. **2001**, 43, 525–546.
30. Childs, A.M. Equation solving by simulation. Nat. Phys. **2009**, 5, 861.
31. Hinton, G.E.; Ghahramani, Z. Generative models for discovering sparse distributed representations. Philos. Trans. R. Soc. Lond. Ser. Biol. Sci. **1997**, 352, 1177–1190.
32. Sundberg, J.; Lindblom, B. Generative theories in language and music descriptions. Cognition **1976**, 4, 99–122.
33. Holland, J.H. Studying complex adaptive systems. J. Syst. Sci. Complex. **2006**, 19, 1–8.
34. Tesfatsion, L. Agent-based computational economics: Modeling economies as complex adaptive systems. Inf. Sci. **2003**, 149, 262–268.
35. Buckley, W. Society as a complex adaptive system. In Systems Research for Behavioral Science; Routledge: England, UK, 2017; pp. 490–513.
36. Sheskin, D.J. Handbook of Parametric and Nonparametric Statistical Procedures, 3rd ed.; CRC Press: Boca Raton, FL, USA, 2004.
37. Emmert-Streib, F.; Dehmer, M. Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference. Mach. Learn. Knowl. Extr. **2019**, 1, 945–961.
38. Björnsson, B.; Borrebaeck, C.; Elander, N.; Gasslander, T.; Gawel, D.R.; Gustafsson, M.; Jörnsten, R.; Lee, E.J.; Li, X.; Lilja, S.; et al. Digital twins to personalize medicine. Genome Med. **2020**, 12, 1–4.
39. Golse, N.; Joly, F.; Combari, P.; Lewin, M.; Nicolas, Q.; Audebert, C.; Samuel, D.; Allard, M.A.; Cunha, A.S.; Castaing, D.; et al. Predicting the risk of post-hepatectomy portal hypertension using a digital twin: A clinical proof of concept. J. Hepatol. **2021**, 74, 661–669.
40. Henriksen, H.J.; Schneider, R.; Koch, J.; Ondracek, M.; Troldborg, L.; Seidenfaden, I.K.; Kragh, S.J.; Bøgh, E.; Stisen, S. A New Digital Twin for Climate Change Adaptation, Water Management, and Disaster Risk Reduction (HIP Digital Twin). Water **2023**, 15, 25.
41. Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital Twin in manufacturing: A categorical literature review and classification. IFAC-PapersOnLine **2018**, 51, 1016–1022.
42. Zhang, H.; Liu, Q.; Chen, X.; Zhang, D.; Leng, J. A digital twin-based approach for designing and decoupling of hollow glass production line. IEEE Access **2017**, 5, 26901–26911.
43. Lindström, J.; Larsson, H.; Jonsson, M.; Lejon, E. Towards intelligent and sustainable production: Combining and integrating online predictive maintenance and continuous quality control. Procedia CIRP **2017**, 63, 443–448.
44. Wright, L.; Davidson, S. How to tell the difference between a model and a digital twin. Adv. Model. Simul. Eng. Sci. **2020**, 7, 13.
45. Kochunas, B.; Huan, X. Digital twin concepts with uncertainty for nuclear power applications. Energies **2021**, 14, 4235.
46. Prawiranto, K.; Carmeliet, J.; Defraeye, T. Physics-based digital twin identifies trade-offs between drying time, fruit quality, and energy use for solar drying. Front. Sustain. Food Syst. **2021**, 4, 606845.
47. Fahim, M.; Sharma, V.; Cao, T.V.; Canberk, B.; Duong, T.Q. Machine learning-based digital twin for predictive modeling in wind turbines. IEEE Access **2022**, 10, 14184–14194.
48. Ghosh, A.K.; Ullah, A.S.; Kubo, A. Hidden Markov model-based digital twin construction for futuristic manufacturing systems. AI EDAM **2019**, 33, 317–331.
49. Klimontovich, Y.L. Statistical Theory of Open Systems; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1994; Volume 67.
50. Chick, V.; Dow, S. The meaning of open systems. J. Econ. Methodol. **2005**, 12, 363–381.
51. von Bertalanffy, L. The Theory of Open Systems in Physics and Biology. Science **1950**, 111, 23–29.
52. Skyttner, L. General Systems Theory: Problems, Perspectives, Practice; World Scientific: Singapore, 2005.
53. Klir, G.J. Facets of Systems Science; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; Volume 7.
54. Krivov, S.; Dahiya, A.; Ashraf, J. From equations to patterns: Logic-based approach to general systems theory. Int. J. Gen. Syst. **2002**, 31, 183–205.
55. Kapp, K.W. The open-system character of the economy and its implications. In Economics in the Future; Springer: Berlin, Germany, 1976; pp. 90–105.
56. Rebolledo, R.; Navarrete, S.A.; Kéfi, S.; Rojas, S.; Marquet, P.A. An open-system approach to complex biological networks. SIAM J. Appl. Math. **2019**, 79, 619–640.
57. Caddy, I.N.; Helou, M.M. Supply chains and their management: Application of general systems theory. J. Retail. Consum. Serv. **2007**, 14, 319–327.
58. Adams, K.M.; Hester, P.T.; Bradley, J.M.; Meyers, T.J.; Keating, C.B. Systems theory as the foundation for understanding systems. Syst. Eng. **2014**, 17, 112–123.
59. Emmert-Streib, F.; Tripathi, S.; Dehmer, M. Analyzing the Scholarly Literature of Digital Twin Research: Trends, Topics and Structure. IEEE Access **2023**, 11, 69649–69666.
60. Emmert-Streib, F.; Moutari, S.; Dehmer, M. The process of analyzing data is the emergent feature of data science. Front. Genet. **2016**, 7, 12.
61. Elsken, T.; Metzen, J.H.; Hutter, F. Neural architecture search: A survey. J. Mach. Learn. Res. **2019**, 20, 1997–2017.
62. Ren, P.; Xiao, Y.; Chang, X.; Huang, P.Y.; Li, Z.; Chen, X.; Wang, X. A comprehensive survey of neural architecture search: Challenges and solutions. ACM Comput. Surv. CSUR **2021**, 54, 1–34.
63. Hoi, S.C.; Sahoo, D.; Lu, J.; Zhao, P. Online learning: A comprehensive survey. Neurocomputing **2021**, 459, 249–289.
64. Emmert-Streib, F.; Dehmer, M. Taxonomy of machine learning paradigms: A data-centric perspective. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. **2022**, 12, e1470.
65. Ma, L.; Jiang, B.; Xiao, L.; Lu, N. Digital twin-assisted enhanced meta-transfer learning for rolling bearing fault diagnosis. Mech. Syst. Signal Process. **2023**, 200, 110490.

**Figure 1.** Data science definition of a digital twin. Shown are the functional relations between analysis components that define a digital twin system.

**Figure 2.** Special cases of a DTS. (**A**): Ordinary data analysis system based on experimental data as used in machine learning, artificial intelligence, or statistics. (**B**): A DTS can even assume the form of a physical theory. (**C**): Special case of a digital twin system which utilizes experimental data (data-EX) only for parameter updating of the DT. In engineering, such a system would be called a digital shadow system (DSS).

**Figure 3.** Synchronization of a DT with the physical object (PO) it describes. Shown are three updating time points allowing for the calibration of the parameters, $\alpha$, of the DT to adjust to changes in the states, $\gamma$, of PO.
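The synchronization idea in the Figure 3 caption can be imitated with a toy simulation. This is a minimal sketch under stated assumptions: the linear drift of $\gamma$, the noise level, the moving-window recalibration of $\alpha$, and the three update times are all hypothetical choices and do not reproduce the paper's figures.

```python
import random
import statistics


def physical_object(t, rng):
    """Hypothetical PO: its state gamma drifts slowly; measurements are noisy."""
    gamma = 1.0 + 0.01 * t
    return gamma + rng.gauss(0.0, 0.02)


def mean_tracking_error(update_times, horizon=300, window=20, seed=1):
    """Run the DT alongside the PO and return the mean absolute prediction error."""
    rng = random.Random(seed)
    alpha = 1.0  # DT parameter, calibrated to the PO state at t = 0
    history = []
    errors = []
    for t in range(horizon):
        measurement = physical_object(t, rng)
        errors.append(abs(measurement - alpha))  # the DT predicts the constant alpha
        history.append(measurement)
        if t in update_times:
            # Updating time point: recalibrate alpha on the most recent measurements.
            alpha = statistics.mean(history[-window:])
    return statistics.mean(errors)


without_updates = mean_tracking_error(update_times=set())
with_updates = mean_tracking_error(update_times={75, 150, 225})
print(without_updates, with_updates)
```

Because the PO keeps drifting between updates, each recalibration only restores agreement temporarily, which is the reason a DT needs repeated synchronization and not a single calibration.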

**Figure 4.** Visualization of the effects of updates. (**A**) Improvement of the performance. (**B**) Stabilization of the performance. In both figures, the dashed red line indicates the model’s performance if updating stops at the time step shown in red.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Emmert-Streib, F.
Defining a Digital Twin: A Data Science-Based Unification. *Mach. Learn. Knowl. Extr.* **2023**, *5*, 1036-1054.
https://doi.org/10.3390/make5030054
