Digital Twins: A Computational Realization of the Scientific Method in Dynamical Systems

Emmert-Streib, Frank

doi:10.3390/make8060159

Open Access Perspective

by Frank Emmert-Streib ¹,²

¹ College of Health and Life Sciences, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar

² Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33720 Tampere, Finland

Mach. Learn. Knowl. Extr. 2026, 8(6), 159; https://doi.org/10.3390/make8060159 (registering DOI)

Submission received: 18 March 2026 / Revised: 5 June 2026 / Accepted: 7 June 2026 / Published: 10 June 2026

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

Abstract

The scientific method is widely acknowledged as an authoritative framework that provides guiding principles for empirical research across disciplines. Despite this central role, it is rarely examined explicitly as a conceptual framework. In this paper, we revive attention to its role by revealing a connection to digital twins, which have received considerable attention in recent years. Specifically, we argue that the digital twins framework can be interpreted as a computational realization of the scientific method in the context of dynamical systems. This connection is rooted in the dynamical nature of models, since dynamical systems arise across many scientific fields, from physics to economics, and also constitute a core component of digital twins. The main benefits of this connection include a common scientific language for knowledge transfer, a systematic approach that emphasizes the mechanisms of continuous learning and model selection, and a practical framework for implementing the scientific method computationally across disciplines.

Keywords: digital twins; scientific method; scientific discovery; dynamical systems

1. Introduction

How do we make scientific discoveries? This centuries-old question remains central to any scientific inquiry conducted today [1]. Since antiquity, philosophers have sought a rational understanding of nature, and the origins of deductive and inductive reasoning are traced back to Aristotle. When scientists such as Galileo Galilei in the late 16th and early 17th centuries recognized the necessity of observations, the question evolved into: How do we systematically learn from empirical observations? The answer is provided by the so-called scientific method [2,3,4].

While the scientific method is the current gold standard, offering guidance to scientific discovery, it is more of a conceptual framework than a strict methodology [5]. This means that the methodological approaches to solving a problem must be determined on a case-by-case basis. As a result, each scientific endeavor that applies the scientific method tends to look different, and cross-references between problems from different domains are rarely attempted. Another consequence is that, although the scientific method is present in nearly all empirical studies, its role can become obscured, leading to a blurred perception of this foundational framework and diminishing its recognized significance.

In response to these challenges, we propose a systematic realization of the scientific method through the digital twins framework. This framework not only highlights the scientific method’s role in conducting studies but also provides a universal methodological implementation for general research using digital twins [6,7]. It allows methods from machine learning, data science, artificial intelligence, and statistics [8,9] to be combined with modeling approaches for complex systems [10] that can learn over time, so that the model itself evolves. We will show that the digital twins framework can be interpreted as a computational realization of the scientific method in the context of dynamical systems.

This paper is organized as follows. Section 2 introduces background information for digital twins, and Section 3 provides an overview of the scientific method and the Hypothetico-Deductive Model. In Section 4, we establish the connection between digital twins and the scientific method and in Section 5 we discuss a computational realization. Section 6 presents an additional example. The paper concludes with a discussion in Section 7 and final remarks in Section 8.

2. What Is a Digital Twin

We start our discussion by introducing digital twins. The concept of digital twins has generated much excitement in recent years [11], although its origins date back several decades and are generally attributed to Michael Grieves [12] and David Gelernter [13]. Essentially, a digital twin is a digital representation of a real-world object, accurately mirroring its features over time through iterative updates. In simple terms, a digital twin can therefore be understood as a learned dynamical system [14].

In [15], a data science-based definition was provided, describing a digital twin as a structured system that processes data from a physical twin and a digital twin through analysis methods and decision-making. This showed that a digital twin is just one component of a larger ecosystem termed a digital twin system (DTS).

The main benefits of digital twins are their ability to improve cost efficiency, shorten development time, and enhance safety, whether applied to product development or medical treatment testing [16,17]. Consequently, the potential applications of digital twins span many domains, including manufacturing, healthcare, economics, and climate science [18,19,20].

A key characteristic of a digital twin is its integration of simulation (or modeling) and learning [21]. This means the model has the ability to improve over time and is not solely dependent on initial data estimates. Whether this learning process occurs in real-time or at longer intervals depends on the specific problem being addressed. For example, in a manufacturing process, sensor data might be available on a sub-second scale, while in healthcare, patient data, such as biopsy or blood sample results, would naturally have much longer intervals. Overall, there are five distinct features of a digital twin that are instrumental for the modeling of complex problems:

1. Predictability
2. Explainability
3. Intervenability
4. Learnability
5. Diversability

In the following, we discuss each of these features of a digital twin briefly to provide a clear understanding of their roles and significance.

1. Predictability: A digital twin is a generative model that produces observable behavior. These observations can be considered predictions of a digital twin because they are comparable to the observable behavior of the physical twin.
2. Explainability: The model of a digital twin is based on a dynamical system, such as a system of ordinary differential equations (ODEs) or an agent-based model (ABM). This approach fundamentally differs from machine learning methods, like support vector machines (SVMs) or neural networks, because the components of the model represent meaningful entities with phenomenological correspondences. For example, in systems biology, a regulatory network connects proteins, while in economics, companies form trading networks. All these examples illustrate mechanistic models with an inherently explainable structure.
3. Intervenability: The mechanistic model represented by a digital twin carries causal relations among the system variables that are not only interpretable but also changeable. For this reason, virtual interventions in a digital twin model make it possible to study what-if scenarios as if a real-world experiment were being conducted.
4. Learnability: The information extractable from data is finite, which means that the parameter estimates of a digital twin have limited accuracy. Continuous learning, using additional data obtained over time, allows the accuracy of the model to improve.
5. Diversability: Given the limitations of learning from data, there is a need to quantify the resulting uncertainties. By repeatedly estimating the parameters of a digital twin at a particular time step, we can obtain a population of digital twins, each with an observable trajectory. Summarizing these outcomes provides probabilistic predictions that correspond to uncertainty quantification. That means digital twins enable an ensemble approach.
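To make the first four features concrete, the following is a minimal sketch of a toy digital twin. It is an illustration under stated assumptions, not a formulation taken from the cited literature: the class name `DecayTwin`, the one-parameter model dx/dt = -k x, and the simple update rule in `learn` are all hypothetical choices.

```python
import math
from dataclasses import dataclass


@dataclass
class DecayTwin:
    """Toy digital twin (hypothetical): dx/dt = -k * x, solved in closed form."""
    k: float           # decay rate, the learned parameter
    x0: float = 1.0    # initial state

    def predict(self, t: float) -> float:
        # Predictability: generate an observable value at time t.
        return self.x0 * math.exp(-self.k * t)

    def intervene(self, factor: float) -> "DecayTwin":
        # Intervenability: a virtual what-if that rescales the rate.
        # A new twin is returned; the original is left untouched.
        return DecayTwin(k=self.k * factor, x0=self.x0)

    def learn(self, t: float, x_obs: float, rate: float = 0.5) -> None:
        # Learnability: move k toward the value implied by one new
        # observation (assumes x_obs > 0 and t != 0).
        k_obs = -math.log(x_obs / self.x0) / t
        self.k += rate * (k_obs - self.k)


twin = DecayTwin(k=0.5)
what_if = twin.intervene(2.0)             # hypothetical intervention doubling k
twin.learn(t=1.0, x_obs=math.exp(-0.3))   # observation consistent with k = 0.3
```

Explainability enters through the single parameter `k`, which has a direct mechanistic meaning (a decay rate), in contrast to the weights of a black-box model.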

In Figure 1, we present a healthcare example that illustrates the working mechanism of a patient’s digital twin, highlighting one of its key features: learnability. Over the lifetime of a patient, data is continuously gathered from the patient (physical twin) and used to update the parameters of the digital twin model. This allows the digital twin to learn and adapt as new information is acquired, ensuring a continuous calibration to the physical twin. Importantly, the digital twin can suggest interventions, such as treatment options for a patient, which can inform the actual treatment and potentially lead to different outcomes [22]. This forms a closed-loop flow of information between the physical twin and the digital twin and shows that the life trajectory of patients could benefit from such a model.

3. The Scientific Method

The scientific method provides guidance for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge [2,3]. It begins with observation, where a researcher notices an interesting phenomenon or problem that requires further understanding. From these observations, a theory may emerge, which is often a broad generalization of the available evidence. Based on this theory, a hypothesis can be formulated as a testable prediction or explanation. The next step involves experimentation, where controlled tests are designed to gather new data that either supports or refutes the hypothesis. These experiments must be carefully planned to isolate variables and measure outcomes, ensuring reliable and unbiased data collection. Once the data has been gathered, it undergoes analysis, during which researchers interpret the results to determine whether they align with the initial hypothesis. If the hypothesis is supported, it may contribute to the development of broader scientific theories; if not, the theory may need to be revised or rejected.

A key element of the scientific method is reproducibility, meaning that other researchers must be able to replicate the experiments and obtain comparable results to validate the findings. This iterative process of forming hypotheses, testing, and refining theories constitutes the core of how scientific knowledge advances, ensuring objectivity, reliability, and continuous improvement. Figure 2 illustrates the overall mechanism of the scientific method. Importantly, it encompasses all three fundamental forms of reasoning and operates as a cyclic process that progressively enhances a model over time.

The scientific method is characterized by several key properties that ensure the process is systematic, objective, and reliable. Here are the five most important properties:

1. Empiricism: Knowledge is gained through observation and experimentation. Data collection from experiments or observations forms the basis for drawing conclusions.
2. Falsifiability: A scientific hypothesis must be framed in a way that it can be tested and potentially proven false. This allows for robust validation or rejection of theories.
3. Reproducibility: Scientific findings must be repeatable by others under the same conditions. This ensures that results are not due to random chance or unique circumstances.
4. Objectivity: The process should be free from personal biases or subjective opinions. Conclusions should be based on evidence rather than assumptions or beliefs.
5. Systematic Exploration: The scientific method follows a structured approach that includes hypothesis formulation, experimentation, analysis, and conclusion. This ensures a clear and logical progression of inquiry.

The scientific method rests on three principal modes of reasoning that serve as its primary instruments of inquiry:

Inductive reasoning
Deductive reasoning
Abductive reasoning

Inductive reasoning is used to generalize from specific observations or experimental results to form broader theories or laws. An example is observing a pattern in data and concluding a general principle from it [23]. In contrast, deductive reasoning is used to make predictions or draw specific conclusions from general principles or theories [24]. For instance, a hypothesis is tested by deriving predictions from an established theory, which are then subject to experimental testing. Finally, abductive reasoning is used to generate hypotheses or infer the best possible explanation for an observed phenomenon, especially when information is incomplete or ambiguous [25]. This form of reasoning also encompasses the creative process of arriving at a particular hypothesis. These three forms of reasoning complement each other, providing a robust framework for exploring, testing, and refining scientific knowledge.

Before discussing an important component of the scientific method, we provide a note of caution regarding its algorithmic realization. The key strength of the scientific method lies in its general and abstract formulation, which is not tied to a specific practical implementation. This should be viewed not as a limitation, but as an advantage, as it allows for flexible, case-by-case realization across different problem domains. A fully specified methodological or algorithmic formulation would inevitably restrict its applicability, since not all problems can be addressed within a single fixed procedure. Consequently, any individual case study can illustrate only a particular instantiation of the scientific method and cannot fully capture its generality, as different problem settings may require fundamentally different workflows.

3.1. Hypothetico-Deductive Model

A particular component of the scientific method is the Hypothetico-Deductive Model (HDM) [26]. The HDM focuses on formulating hypotheses and testing them through experimentation, which makes it a central approach in many scientific disciplines. The HDM was popularized by Hempel and Popper [27,28], with early contributions dating back to William Whewell (1794–1866), William Stanley Jevons (1835–1882) and Charles S. Peirce (1839–1914). The basic idea of the HDM is the formulation of a (testable) hypothesis and its testing [29,30].

There are variations of the HDM, but its basic components are as follows [31]: (1) formulate a hypothesis, typically derived from an underlying theory; (2) deduce testable predictions from the hypothesis that can be examined through observation or experimentation; and (3) conduct experiments to test these predictions. If the predictions are confirmed by the data, the hypothesis is (provisionally) supported and further predictions can be derived and tested. If they are not confirmed, the hypothesis is falsified, requiring its rejection and a revision of the underlying theory. Importantly, the HDM presupposes an existing theoretical framework [32]. The HDM is visualized in Figure 2, where it forms an integral component of the scientific method.
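The three HDM steps can be sketched in a few lines of code. This is only an illustration: the helper `hdm_test`, the free-fall hypothesis, the rival hypothesis, the measurements, and the tolerance are all made-up assumptions chosen to show a confirmed and a falsified case.

```python
def hdm_test(predict, observations, tolerance):
    """Return True if every deduced prediction agrees with the observed
    value within `tolerance` (the hypothesis is provisionally supported)."""
    return all(abs(predict(x) - y) <= tolerance for x, y in observations)


# Step 1: hypotheses (the first derived from a theory of uniform acceleration).
g = 9.81  # m/s^2
hypothesis = lambda t: 0.5 * g * t ** 2   # distance grows with time squared
rival = lambda t: g * t                   # distance grows linearly with time

# Step 2: deduce predictions at the measurement times.
# Step 3: compare with experimental data (made-up values: seconds, metres).
observations = [(1.0, 4.9), (2.0, 19.6), (3.0, 44.2)]

supported = hdm_test(hypothesis, observations, tolerance=0.2)
rival_supported = hdm_test(rival, observations, tolerance=0.2)
```

Here `supported` marks a hypothesis that survives the test and can be used to derive further predictions, whereas a failed test (as for `rival`) corresponds to falsification.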

There are several extended methods that aim to enhance the hypothetico-deductive approach. Notably, we would like to highlight the Cyclic Deductive-Abductive (CDA) model proposed in [33]. The CDA model integrates a hypothetico-deductive framework with an abductive epistemological framework in a continuous cycle. In this model, prediction and postdiction occur in an ongoing process, where prediction aligns with the hypothetico-deductive method, and postdiction is rooted in abduction. All exploratory analyses are inherently abductive, and all hypothetico-deductive experiments begin with a postdiction, which entails preliminary evidence suggesting a plausible hypothesis for testing. Through deduction, hypotheses generate new data and findings, which, in turn, refine the hypothesis space via abduction. Applications and discussions of the CDA method have been documented across various domains, such as in [34,35]. Further examples of extended models include hypothetico-inductive inference [36], strong inference [37], and allochthonous models [38].

In general, it is important to highlight that regardless of the specific form of an approach, each is based on (or a subset of) the three base forms of scientific reasoning: induction, deduction and abduction [39]. This is because all aspects of inference—data-driven, theory-driven, and explanation-driven—are essential for effectively corroborating a theory using all available means.

3.2. Limitations of the Scientific Method

From the description of the scientific method, it becomes clear that it represents a conceptual framework rather than a directly specified methodology. It provides general principles and guidelines but does not prescribe concrete procedures for their implementation. As a result, the application of the scientific method differs significantly across fields such as Physics, Chemistry, Biology, Medicine, and Economics.

On the other hand, a common feature of many phenomena in these fields is their temporal evolution, which is often modeled using dynamical systems. These are mathematical models that describe how a system’s state changes over time according to specified rules [40]. Consequently, many theories in Physics, Chemistry, Biology, Medicine, and Economics can naturally be expressed through dynamical systems. Examples include general relativity, quantum mechanics, population dynamics in ecology, and macroeconomic models of market behavior. This observation allows for a connection to digital twins.
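As a small illustration of such a dynamical system, the sketch below integrates the logistic growth equation dx/dt = r x (1 - x/K), a standard population-dynamics model, with the explicit Euler method. The function name `simulate` and the values of `r`, `K`, the step size, and the initial state are assumptions chosen for illustration.

```python
def simulate(f, x0, dt, steps):
    """Explicit Euler integration of dx/dt = f(x); returns the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return xs


r, K = 1.0, 100.0                           # growth rate, carrying capacity
logistic = lambda x: r * x * (1.0 - x / K)  # the rule governing state change

trajectory = simulate(logistic, x0=10.0, dt=0.1, steps=200)
```

The state (population size) evolves according to a fixed rule, and the trajectory approaches the carrying capacity `K`; more realistic models would use an adaptive ODE solver rather than a fixed-step Euler scheme.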

4. Connection to Digital Twins

From the above description of the scientific method, we can make several observations. First, the scientific method is not a rigid, detailed procedure but rather a set of guiding principles. As a result, it does not directly align with any specific statistical, machine learning, or artificial intelligence methods. This explains the wide variety of such methods found in the scientific literature for addressing different problems across disciplines. Second, despite its somewhat diffuse nature, we can identify a connection between the scientific method and the five features of a digital twin, as discussed in Section 2.

1. Predictability: The scientific method, through experimentation and modeling, aims to make accurate predictions about future outcomes based on observed data and tested hypotheses. Similarly, in a digital twin, predictability refers to the model’s ability to forecast future behavior of the real-world system it mirrors, relying on dynamic models like ordinary differential equations (ODEs) or agent-based models.
2. Explainability: The scientific method emphasizes understanding and explaining phenomena through theories grounded in evidence. In digital twins, explainability is key because the model components correspond to real-world entities or processes, providing a clear, interpretable framework that mirrors physical, biological or economic systems.
3. Intervenability: Just as the scientific method allows for controlled interventions through experiments to test hypotheses, digital twins enable simulations of interventions in the virtual model. This allows for exploring "What-If" scenarios and observing the effects of changes on the system, without disrupting the real-world counterpart.
4. Learnability: In the scientific method, knowledge evolves through continuous learning from new data and refined theories. Similarly, a digital twin improves over time by continuously learning from new data, refining its accuracy and adaptability to better represent the real system.
5. Diversability (Uncertainty Quantification): The scientific method involves assessing and quantifying uncertainties in experimental results and models. In digital twins, diversability refers to the ability to quantify uncertainties, providing probabilistic predictions by running multiple simulations and capturing the range of potential outcomes based on different parameter sets.
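The ensemble idea behind diversability can be sketched with a bootstrap. The example is a hedged illustration only: the exponential-decay model, the synthetic noisy data, the helper `fit_k`, and the choice of 200 resamples are assumptions, and the bootstrap is one of several ways to obtain a population of fitted twins.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic observations of x(t) = exp(-k t) with k = 0.5 and
# multiplicative noise (assumed data, for illustration only).
times = [0.5 * i for i in range(1, 21)]
data = [math.exp(-0.5 * t) * math.exp(random.gauss(0.0, 0.05)) for t in times]


def fit_k(ts, xs):
    # Least squares through the origin on log-transformed data: log x = -k t.
    return -sum(t * math.log(x) for t, x in zip(ts, xs)) / sum(t * t for t in ts)


# Each bootstrap resample yields one member of the population of twins.
ensemble = []
for _ in range(200):
    idx = [random.randrange(len(times)) for _ in times]
    ensemble.append(fit_k([times[i] for i in idx], [data[i] for i in idx]))

k_mean = statistics.mean(ensemble)
k_sd = statistics.stdev(ensemble)

# Probabilistic prediction at t = 3: summarize the ensemble of trajectories.
preds = sorted(math.exp(-k * 3.0) for k in ensemble)
interval = (preds[5], preds[194])  # approximately a central 95% band
```

The spread `k_sd` and the band `interval` are the uncertainty summaries; each element of `ensemble` parameterizes one twin with its own trajectory.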

This discussion indicates that each of these features reflects an aspect of the scientific method, showing how the two are fundamentally aligned in their approach to understanding, predicting, and refining knowledge of complex systems.

Third, despite the versatility of the scientific method and the diversity of its representations, its dynamical systems character unifies these approaches and provides a direct interface to digital twins. This connection arises not only because digital twins are also formulated as dynamical systems, but also because they are iteratively updated. Crucially, both paradigms rely on feedback loops that guide continuous refinement. To emphasize this link, we reproduce a visualization of the scientific method in Figure 3, now framed within the workflow of digital twins.

From Figure 3, it is evident that the digital twin system (DTS) encapsulates the digital twin along with related components that support decision-making. Importantly, a digital twin can undergo both iterative parameter updating and model selection, enabling not only fine-tuning but also substantial revisions whenever the existing model conflicts with new evidence.

On a technical note, we would like to emphasize that the underlying concepts of digital twins and DTS provide a framework rather than a specific mathematical formulation. The reason for this is the versatility of digital twins, which allows their application across a wide range of dynamical phenomena.

Applications of Digital Twins

For the purpose of this paper, it is important to demonstrate that the fields discussed above can be addressed through digital twin studies. Therefore, in Table 1, we present example studies that encompass the fields accessible to the scientific method, as discussed in the preceding section. Interestingly, in [41], a systematic literature review demonstrated that the distribution of digital twin studies is highly uneven across different scientific domains. Specifically, the vast majority of studies are concentrated in engineering and manufacturing, while the remaining fields are still in their infancy. However, regardless of their developmental state, all of these fields have recognized the importance of digital twins.

5. Digital Twins as a Computational Realization of the Scientific Method

At this point, the individual components can be connected into a coherent framework. Revisiting the scientific method and its components revealed that digital twins and their broader ecosystem—the digital twin system—closely align with this framework. On one hand, this alignment supports the expectations widely associated with digital twins. On the other hand, it indicates that digital twins are not an entirely new concept but rather a paradigm for a computational realization of the scientific method.

To gain a more detailed understanding, we examine this connection further. In general, a scientific inquiry begins with a hypothesis derived either from a formal theory or from an informal cognitive model that may not be fully specified. For the purposes of this discussion, we focus on formal models formulated as dynamical systems. From these dynamical systems, predictions can be made either directly or after modifying the system according to rules consistent with the underlying phenomenon. In medicine, such modifications may result from the administration of a drug that alters the binding between proteins in regulatory networks; in economics, they may stem from policy adjustments that facilitate trade between countries. These alterations generally correspond to virtual interventions because they are made within the dynamical system, allowing for consequential predictions. This process is visualized in the lower part of Figure 2.

In the discussion above, we assumed a model in the form of a dynamical system for two main reasons. First, all fundamental physical theories, such as mechanics, electrodynamics, quantum mechanics, and statistical physics, are based on equations of change. These equations describe the time evolution of system variables, like position or momentum, through differential equations, which constitute a type of dynamical system [51,52]. Importantly, digital twins have been identified as special cases of dynamical systems for frozen learning that embody physical theories [15]. Second, even in disciplines outside physics, such as biology, epidemiology, sociology, or economics, models aim to describe the dynamic behavior of systems [53,54,55,56]. However, due to the complexity of these fields, only approximate descriptions are often feasible, commonly through static network models like gene regulatory networks, social networks, or economic networks [57,58,59,60], which provide the foundational structure for dynamical systems. Despite these limitations, dynamical systems remain fundamental across disciplines, as all phenomena share the characteristic of having system variables that evolve over time. From this description one can see that the scientific method aligns with the representation of digital twins.

From the above discussion, two important practical questions remain.

1. How is a digital twin obtained?
2. What makes a digital twin an accurate model?

As we will see below, the first question relates to abductive and inductive reasoning, the second to inductive reasoning.

5.1. Model Identification and Selection

We begin our discussion by addressing the question, “How is a digital twin obtained?” To answer this, we assume that we do not yet have a digital twin, but we do have data and prior knowledge. We will answer this question in two steps involving (1) model identification and (2) model selection.

Based on the available data or experience, we can generate several models that align with this evidence using either abductive or inductive reasoning. This model identification step produces a list of candidate models, which then undergoes model selection. By selecting the “best” model from these candidates, we arrive at a digital twin. In Figure 4, these two steps correspond to the lower part of the illustration.

From a data science perspective, choosing the best model from a list of candidates is a model selection problem. There are several approaches to model selection, including cross-validation, which evaluates models by testing them on different subsets of data, and the Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC), both of which balance model fit with model complexity [61,62]. The two criteria differ in their penalty: the AIC adds 2 per parameter, whereas the BIC adds ln(n) per parameter for n observations, so the BIC penalizes additional parameters more strongly whenever n is at least 8.
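The difference between the two criteria can be illustrated numerically using the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of parameters, n the number of observations, and L the maximized likelihood. The sketch below assumes i.i.d. Gaussian errors; the two candidate models and their residual sums of squares are hypothetical numbers, not results from any study.

```python
import math


def gaussian_log_likelihood(rss, n):
    # Maximized log-likelihood for i.i.d. Gaussian errors with the
    # variance set to its maximum likelihood estimate rss / n.
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)


def aic(log_lik, k):
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    return k * math.log(n) - 2 * log_lik


n = 50  # number of observations (assumed)
# Hypothetical candidates: residual sum of squares and parameter count.
candidates = {
    "simple": {"rss": 12.0, "k": 2},
    "complex": {"rss": 10.0, "k": 5},
}

scores = {}
for name, m in candidates.items():
    ll = gaussian_log_likelihood(m["rss"], n)
    scores[name] = (aic(ll, m["k"]), bic(ll, m["k"], n))

best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])
```

With these made-up numbers the two criteria disagree: the AIC prefers the better-fitting model with five parameters, while the BIC prefers the model with two parameters because of its larger penalty.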

We would like to note that there are also other realizations of model selection, which are typically based on various forms of optimization. These include:

Equation discovery
Heuristic Search Methods

Equation Discovery refers to the process of identifying mathematical models or equations from data [63,64]. The process of finding equations is about selecting those that not only fit the data well but also provide a plausible explanation for the underlying processes governing the observed relationships. This technique is particularly useful in situations where the governing equations of a system may not be known in advance. The primary goal is to derive equations that accurately describe the behavior of the system based on observed data. Methods for equation discovery often involve symbolic regression, which uses algorithms to search for mathematical expressions that fit the data, or more recently, Physics-Informed Neural Networks (PINNs) [65,66,67]. Additionally, the process of model selection can be represented as a Heuristic Search Method within a space of potential models [68]. In [69], such topics have been discussed in the broader context of Complexity Data Science (CDS).
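A very reduced form of equation discovery can be sketched as a search over a small library of candidate right-hand sides. This is a toy illustration under strong assumptions: the data are noise-free and synthetic, the library contains only three one-term candidates, and the names `library` and `results` are hypothetical. Practical methods such as symbolic regression search far larger expression spaces and must handle noise.

```python
import math

# Synthetic trajectory sampled from dx/dt = -0.8 * x (assumed ground truth).
dt = 0.01
ts = [i * dt for i in range(301)]
xs = [2.0 * math.exp(-0.8 * t) for t in ts]

# Central finite differences approximate dx/dt at the interior points.
dxdt = [(xs[i + 1] - xs[i - 1]) / (2 * dt) for i in range(1, len(xs) - 1)]
x_mid = xs[1:-1]

# Candidate right-hand sides, each with a single coefficient a.
library = {
    "a": lambda x: 1.0,
    "a*x": lambda x: x,
    "a*x^2": lambda x: x * x,
}

results = {}
for name, term in library.items():
    phi = [term(x) for x in x_mid]
    # One-parameter least squares fit of dx/dt = a * term(x).
    a = sum(p * d for p, d in zip(phi, dxdt)) / sum(p * p for p in phi)
    rss = sum((d - a * p) ** 2 for p, d in zip(phi, dxdt))
    results[name] = (a, rss)

best = min(results, key=lambda name: results[name][1])
```

The candidate with the smallest residual is selected, and its fitted coefficient is an estimate of the governing rate; with candidates of different complexity, a penalized criterion would be more appropriate than the raw residual.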

5.2. Model Evaluation and Parameter Estimation

The second question left unanswered was, “What makes a digital twin an accurate model?”. To answer this, it is important to note that every model typically consists of three key components: variables, parameters, and dynamic rules. Of these components, the parameters need to be specified. One way to do this is by estimating them from data. In Figure 2, this corresponds to the top-left section of the illustration of the scientific method.

From a data science perspective, the estimation of parameters in a model is accomplished via statistical inference. There are several approaches to statistical inference for parameter estimation, including:

Maximum Likelihood Estimation (MLE): A method that estimates parameters by finding values that maximize the likelihood of the observed data given the model.
Bayesian Inference: This approach estimates parameters by combining prior distributions with the likelihood of observed data to produce a posterior distribution.
Least Squares Estimation: A method that minimizes the sum of squared differences between observed data and model predictions.
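As a minimal sketch of the least squares approach, the example below estimates the rate k of the model x(t) = exp(-k t) by a simple grid search. The measurements, the model, and the grid are made-up assumptions for illustration; under i.i.d. Gaussian measurement noise the least squares estimate coincides with the maximum likelihood estimate.

```python
import math

times = [0.0, 1.0, 2.0, 3.0, 4.0]
obs = [1.00, 0.62, 0.36, 0.22, 0.14]   # made-up measurements


def sse(k):
    # Sum of squared differences between observations and model predictions.
    return sum((x - math.exp(-k * t)) ** 2 for t, x in zip(times, obs))


# Grid search over candidate rates k in (0, 2] with resolution 0.001.
grid = [i / 1000 for i in range(1, 2001)]
k_hat = min(grid, key=sse)
```

A grid search is used only to keep the sketch dependency-free; in practice a numerical optimizer or a Bayesian sampler would replace it.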

These methods offer distinct advantages depending on the nature of the data and the assumptions of the model. Parameter estimation is closely linked to model evaluation: once a model’s parameters have been estimated, its quality must be assessed. Typically, this is accomplished by testing it against unseen data. This connection highlights the importance of not just identifying and specifying a digital twin, but also thoroughly evaluating it using methods from data science. In other words, the concept of a digital twin alone is insufficient and requires a broader framework. In [69], Complexity Data Science (CDS) is proposed to fulfill this role.

Figure 4 also includes a case in which a digital twin model must undergo a radical change beyond parameter fine-tuning. In practice, such a transition can be operationalized using statistical hypothesis testing. Specifically, a significance level, often called α, defines the tolerated Type I error rate when testing whether predictive performance remains within acceptable accuracy bounds. If the null hypothesis of adequate predictive accuracy is rejected, this indicates that parameter tuning within the current model class is insufficient, thereby signaling the need for model revision or replacement.
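One possible way to operationalize such a test is sketched below. It is an assumption-laden illustration rather than the procedure of the paper: the function `needs_revision`, the use of the mean absolute prediction error, the tolerance, and the error values are hypothetical, and the one-sided z-test is a large-sample approximation.

```python
import math
from statistics import NormalDist, mean, stdev


def needs_revision(abs_errors, tolerance, alpha=0.05):
    """One-sided z-test of H0: mean absolute error <= tolerance.

    Returns (reject_h0, p_value); rejecting H0 signals model revision."""
    n = len(abs_errors)
    z = (mean(abs_errors) - tolerance) / (stdev(abs_errors) / math.sqrt(n))
    p_value = 1.0 - NormalDist().cdf(z)
    return p_value < alpha, p_value


# Made-up absolute prediction errors before and after a drift in the system.
ok_errors = [0.08, 0.11, 0.09, 0.10, 0.12, 0.07, 0.10, 0.09, 0.11, 0.10] * 3
drift_errors = [e + 0.05 for e in ok_errors]

revise_ok, _ = needs_revision(ok_errors, tolerance=0.10)
revise_drift, _ = needs_revision(drift_errors, tolerance=0.10)
```

In the first case the null hypothesis is retained and parameter updating continues; in the second it is rejected, which would trigger a return to model identification and selection.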

6. Example: Hospital-Based Digital Twin System

In Figure 5, we illustrate the evolution of a hospital-based digital twin system over time, highlighting how the proposed framework operationalizes the scientific method in a data-driven setting, including data-driven model evolution and structural changes triggered by new data modalities.

At the core of this process is the interaction between the physical twin, for example a patient undergoing chemotherapy, and the digital twin, which is embedded within a digital twin system (DTS). The process begins with data acquisition from the patient, initially limited to electronic health records (EHR) [70,71]. These data streams are used to construct a first-generation digital twin model that supports clinical decision-making, such as chemotherapy treatment planning. Importantly, this intervention is not automated, but rather functions as a support system for the medical doctor.

As time progresses, the system evolves through continuous data collection and intervention cycles. The doctor interacts with the system by applying treatments, while the digital twin generates predictions, for example about patient response. A key feature highlighted in the figure is the transition in data modalities. At a later stage, multi-omics data [72,73], such as genomic or transcriptomic information, become available and are integrated with existing EHR data. This requires the development of a new multimodal model [74,75], which can fundamentally differ from the earlier model that relied solely on clinical data.

Importantly, this transition does not represent a simple parameter update but rather a structural change in the model class. This implies that the previously adequate model, based solely on EHR data, may no longer capture the underlying biological mechanisms revealed by multi-omics data. Consequently, new candidate models must be generated and evaluated, corresponding to a model re-identification and selection step. In this context, the scientist plays a critical role in guiding model development, particularly during transitions that require abductive reasoning and the formulation of new hypotheses, which cannot be fully automated.

Another important aspect of this example is that the digital twin model is assumed to be used over a prolonged period of time. This extends beyond the typical funding horizon, which often spans three to five years, and instead reflects a long-term, continuous application. Such an approach helps to avoid a piecemeal treatment of related problems, where individual aspects are addressed in isolation. Instead, it enables a coordinated and sustained strategy that remains flexible through iterative learning and, when necessary, structural changes of the model itself.

From Adaptive Modeling to Scientific Discovery

The hospital-based example above also illustrates a broader possibility of digital twin systems, namely their potential contribution not only to adaptive modeling but also to the discovery of new regularities and governing mechanisms. In this context, model identification and model selection can be interpreted as computational processes operating within a space of candidate dynamical systems. Approaches such as symbolic regression, equation discovery, heuristic search in model spaces, causal discovery, and physics-informed machine learning provide concrete mechanisms for generating and evaluating alternative models directly from observational data [76,77,78]. Rather than restricting digital twins to parameter updating within a fixed model class, such methods allow the exploration of structurally different hypotheses, potentially revealing previously unknown relationships or dynamical principles underlying the observed phenomena.
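The idea of searching a space of candidate dynamical systems can be illustrated with a deliberately small sketch. It is not an implementation of the symbolic regression or equation discovery methods cited above, but a much simpler stand-in: least-squares fitting over a hand-written library of three candidate update rules, scored with the Bayesian information criterion (BIC). The candidate library, the choice of BIC, the synthetic data, and all function names are assumptions introduced here for illustration.

```python
import math
import random

# Hypothetical candidate model classes for the increment x[t+1] - x[t].
# Each maps the current state x to the features of a linear-in-parameters rule.
CANDIDATES = {
    "constant drift": lambda x: [1.0],   # x[t+1] - x[t] = a
    "exponential": lambda x: [x],        # x[t+1] - x[t] = a * x
    "logistic": lambda x: [x, x * x],    # x[t+1] - x[t] = a * x + b * x^2
}


def least_squares(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(k):
            if row != col:
                factor = A[row][col] / A[col][col]
                A[row] = [a - factor * b for a, b in zip(A[row], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def bic(series, features):
    """Fit one candidate to the observed increments; return (BIC, coefficients)."""
    X = [features(x) for x in series[:-1]]
    y = [nxt - cur for cur, nxt in zip(series, series[1:])]
    coef = least_squares(X, y)
    rss = sum((yi - sum(c * f for c, f in zip(coef, row))) ** 2
              for row, yi in zip(X, y))
    n, k = len(y), len(coef)
    # Gaussian-likelihood BIC up to an additive constant.
    return n * math.log(rss / n) + k * math.log(n), coef


def select_model(series):
    """Return the candidate with the lowest BIC together with all scores."""
    scores = {name: bic(series, feats)[0] for name, feats in CANDIDATES.items()}
    return min(scores, key=scores.get), scores


# Synthetic observations: logistic growth (r = 0.3, K = 100) plus measurement noise.
rng = random.Random(0)
state, observations = 5.0, []
for _ in range(40):
    observations.append(state + rng.gauss(0.0, 0.2))
    state += 0.3 * state * (1.0 - state / 100.0)

best, scores = select_model(observations)
print("selected:", best)
```

Here the three candidates differ in structure, not only in parameter values, so the selection step compares hypotheses rather than tuning a single one. Extending the library is where human judgment enters in this sketch.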

At the same time, the extent to which scientific discovery can be fully automated remains an open question. While computational approaches can support the generation and evaluation of candidate models, the formulation of fundamentally new hypotheses may still require abductive reasoning, creativity, and domain-specific intuition. For this reason, digital twin systems should not be viewed as fully autonomous scientific agents, but rather as computational frameworks that augment scientific discovery through iterative interaction between data, models, and human reasoning. In this sense, digital twins may provide a practical environment for partially operationalizing aspects of scientific discovery, while still relying on scientists to guide substantial conceptual innovations and reinterpretations of the underlying model space.

7. Discussion

In this paper, we have presented a systematic alignment between the scientific method and the broader framework in which digital twins are embedded, namely digital twin systems. Although digital twins have gained significant attention in recent years, their connection to the scientific method has so far been largely overlooked. Importantly, recognizing this connection reveals that digital twins constitute a framework rather than a specific method, unlike techniques such as a support vector machine or a Mann–Whitney U test. In this sense, digital twins are much like the scientific method itself, which also represents a general framework rather than a single methodological procedure. This explains some of the difficulties in understanding digital twins (or the scientific method), since a framework cannot be fully understood by examining individual examples. Instead, its meaning must be appreciated at the systems level, beyond the scope of individual methods.

In the following, we discuss several questions that arise from this connection.

Are digital twins grounded in a fixed methodology, or do they represent a general framework applicable across domains?
To what extent does the proposed framework generalize beyond dynamical systems?
Is there a conceptual misalignment between general scientific theories and system-specific digital twin models?
Is there a practical benefit to connecting digital twins to the scientific method?
Is it possible to automate the scientific method using digital twins?
Are digital twins black-box prediction models, or do they provide explainability?
What happens if a digital twin model is incorrect?

Are digital twins grounded in a fixed methodology, or do they represent a general framework applicable across domains?

To clarify this important point, we emphasize that the scientific method is not a fixed methodology, but rather a general framework, despite its name. If it were a methodology, it would be limited to a specific set of procedures or approaches. Instead, it provides a set of guiding principles that can be applied across a wide range of domains, from physics to economics.

By analogy, any attempt to provide a computational realization of the scientific method cannot be reduced to a single methodology without restricting its generality. Instead, it must take the form of a framework that accommodates different models, data types, and problem settings. This highlights that generalizability and adaptability inherently require moving beyond a single methodology toward a flexible framework that integrates multiple methodologies. For this reason, digital twins are not tied to a fixed methodology but instead represent a general framework [69].

To what extent does the proposed framework generalize beyond dynamical systems?

The interpretation proposed in this paper is explicitly restricted to problems that can be represented as dynamical systems. Accordingly, we do not claim that all forms of scientific inquiry can be reduced to digital twins. Rather, our argument is that, for phenomena characterized by temporal evolution, digital twin systems provide a computational framework that operationalizes key components of the scientific method, including prediction, experimentation, iterative updating, and model revision. Importantly, this still encompasses a broad range of domains, since many physical, biological, medical, and economic phenomena fundamentally involve evolving states over time.

Is there a conceptual misalignment between general scientific theories and system-specific digital twin models?

A common view is that scientific laws aim to characterize the fundamental principles governing the dynamical behavior of phenomena as described by scientific theories. In physics, such theories typically take the form of (ordinary or partial) differential equations. However, this should be understood as an aspirational goal rather than a strictly verifiable claim, given the inherent limitations of inductive inference [79]. A classical example is Newtonian mechanics, which was long regarded as a correct description of mechanics until it was superseded by the theory of general relativity. In light of these limitations, scientific theories should be regarded as provisional rather than definitive [80]. Consequently, scientific inquiry is better understood as an ongoing process of model refinement and selection, in which theories are iteratively evaluated and revised in response to empirical evidence. Importantly, this perspective aligns with digital twin systems, as discussed in Section 4, suggesting that both scientific theories and digital twin models are ultimately evaluated in terms of their empirically achieved performance rather than their status as definitive representations of underlying reality.

On the other hand, there is a distinction between a theory and a model. A theory specifies general principles and relationships, whereas a model is a simplified, concrete representation of a system that can be used to describe or predict particular phenomena. In this sense, the scientific method and the digital twin framework are closely aligned, as both operate through the construction, evaluation, and refinement of models based on empirical observations.

Is there a practical benefit to connecting digital twins to the scientific method?

A key benefit is that this connection provides a clear computational framework for operationalizing the scientific method. The scientific method has long served as a structured approach for generating and validating knowledge through hypothesis formulation, experimentation, and iterative refinement. In this light, digital twins, through workflows of data integration, model construction, prediction, and continuous updating, can be understood as a computational realization of these principles. This perspective makes the scientific method more concrete by enabling systematic implementation and testing of hypothesis generation, prediction, and model refinement in practice.

A second benefit is that this connection highlights the mechanisms that enable continuous learning and improvement. Within the scientific method, hypotheses are repeatedly tested against new evidence, leading either to refinement or replacement of the underlying theory. Similarly, digital twins undergo iterative parameter updates and model selection as new data becomes available. Viewing this process through the lens of the scientific method emphasizes the importance of feedback loops, systematic experimentation, and evidence-based model revision, thereby strengthening the reliability and interpretability of digital twin systems in practical applications.

Another advantage of connecting digital twins to the scientific method is the establishment of a common scientific language. Modern science is divided into numerous disciplines, such as physics, chemistry, biology, and medicine, each characterized by its own terminology and methodological traditions. From a data science perspective [21], however, digital twins can be viewed as a computational realization of the scientific method, providing a unified framework for representing scientific inquiry across disciplines. This, in turn, facilitates knowledge transfer by offering a common conceptual language that transcends disciplinary boundaries and naturally promotes interdisciplinary and transdisciplinary research and education.

A final benefit is the potential unifying role of digital twins. Since the Enlightenment (roughly the late seventeenth to late eighteenth century), science has undergone an increasing degree of specialization, resulting in the emergence of numerous disciplines with distinct concepts, methods, and terminologies. While this specialization has been instrumental in advancing scientific knowledge, it has also contributed to the fragmentation of scientific inquiry across fields. Digital twins offer the possibility of addressing this challenge by providing a common computational framework that can be applied across domains. In this sense, digital twins may contribute to a renewed form of scientific integration, analogous to the role of the scientific method as a common foundation for inquiry across disciplines. For example, medical digital twins could provide a systematic framework for disease modeling, diagnosis, and treatment, thereby supporting precision medicine and related activities across the health sciences [22,81].

Is it possible to automate the scientific method using digital twins?

At first glance (Figure 4), one might be inclined to answer this question positively, as it seems feasible to formulate an optimization problem over all possible models. However, in our view, a fully automated realization of the scientific method is currently not achievable. A key reason is that model identification, the precursor to model selection, may involve not only inductive reasoning but also abductive reasoning. Abductive reasoning, which is closely related to hypothesis generation, relies on creativity and intuition [32] and currently lacks a fully formalized mathematical characterization.

Consequently, when model identification depends on abductive steps, it is not possible to exhaustively specify the space of candidate models in a fully automated manner, leading to an inherently incomplete representation of the model space. This suggests that abductive reasoning necessitates a human-in-the-loop approach [82,83], which in turn limits the extent to which the scientific method can be fully automated in a closed-loop system.

As a result, digital twins should not be interpreted as fully autonomous learners capable of addressing all possible scenarios without intervention. Instead, they should be viewed as systems that are augmented by human reasoning, for example, by incorporating human input when substantial model revisions are required in response to novel empirical evidence.

Are digital twins black-box prediction models, or do they provide explainability?

It is important to recognize that, due to the versatile nature of the digital twin framework, digital twins are not inherently limited to either black-box models or fully transparent, explainable models. Instead, their degree of interpretability depends on the specific implementation and problem setting. Accordingly, it is not surprising that the literature contains examples of both approaches. For explainable digital twins, mechanistic models are typically used [84,85], often based on differential equations. At the same time, mechanistic modeling does not exclude the use of data-driven methods such as deep learning. For instance, physics-informed neural networks (PINNs) combine a mechanistic structure with neural parameter estimation [66]. In this context, the mechanistic component remains interpretable, while the learning component may itself operate as a black box. Overall, this illustrates an interplay between mechanistic and data-driven approaches, which underlies the flexible and application-dependent nature of digital twins.
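To make the combination of a mechanistic structure with a learned component more concrete, a generic PINN objective for a system governed by an ordinary differential equation can be written as follows. This is a commonly used generic form, not a formulation taken from this paper. The symbols are notation introduced here for illustration: x_φ is a neural network with weights φ, f is the mechanistic right-hand side with parameters θ, (t_i, x_i) are observations, τ_j are collocation points, and λ is a weighting factor.

```latex
\mathcal{L}(\phi, \theta)
  = \frac{1}{N} \sum_{i=1}^{N} \bigl( x_\phi(t_i) - x_i \bigr)^2
  + \lambda \, \frac{1}{M} \sum_{j=1}^{M}
    \Bigl( \frac{\mathrm{d} x_\phi}{\mathrm{d} t}(\tau_j)
           - f\bigl( x_\phi(\tau_j); \theta \bigr) \Bigr)^2
```

The first term fits the observations, while the second penalizes violations of the mechanistic equation dx/dt = f(x; θ). Under these assumptions, f and θ form the interpretable part, whereas the network x_φ is the component that may behave as a black box.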

What happens if a digital twin model is incorrect?

To address this question, we begin by outlining the characteristics of a model that represents a dynamical system. The most general way to describe a dynamical system is to assume it consists of the following components: a structure, dynamic rules, and parameters. Here, “structure” refers to the system’s configuration or state space, “dynamic rules” define how the system evolves over time, and “parameters” influence the behavior within these rules. In our discussion, we presented digital twins as realizations of dynamical systems, possessing the same characteristics as a dynamical system itself. This raises an important question: what if a digital twin model used to describe a physical object is actually incorrect? More specifically, does this reveal limitations in the framework?
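The three components named above can be made explicit in a small data structure. The following sketch is only one possible encoding and rests on assumptions introduced here: discrete-time dynamics, a state represented as a list of numbers, and hypothetical names such as `DynamicalSystem`, `tumor_size`, and `drug_level`. It contrasts a parameter update, which keeps structure and rule fixed, with a structural change, which replaces them.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass
class DynamicalSystem:
    state_names: List[str]                                         # structure (state space)
    rule: Callable[[List[float], Dict[str, float]], List[float]]   # dynamic rule
    parameters: Dict[str, float]                                    # parameters

    def simulate(self, initial_state: List[float], steps: int) -> List[List[float]]:
        """Iterate the rule and return the trajectory, including the initial state."""
        trajectory = [list(initial_state)]
        for _ in range(steps):
            trajectory.append(self.rule(trajectory[-1], self.parameters))
        return trajectory


def logistic_rule(state, p):
    """Hypothetical one-variable rule: logistic growth."""
    (x,) = state
    return [x + p["r"] * x * (1.0 - x / p["K"])]


def treated_rule(state, p):
    """Hypothetical two-variable rule: growth reduced by a decaying drug level."""
    x, d = state
    growth = p["r"] * x * (1.0 - x / p["K"])
    return [x + growth - p["kill"] * d * x, d * p["decay"]]


# Model A: one state variable, logistic dynamics.
model_a = DynamicalSystem(["tumor_size"], logistic_rule, {"r": 0.3, "K": 100.0})

# Parameter update: same structure and rule, new parameter values.
model_a_tuned = replace(model_a, parameters={"r": 0.25, "K": 100.0})

# Structural change: a new state variable and a new rule (model B).
model_b = DynamicalSystem(
    ["tumor_size", "drug_level"],
    treated_rule,
    {"r": 0.3, "K": 100.0, "kill": 0.05, "decay": 0.8},
)

print(model_a.simulate([10.0], 3))
print(model_b.simulate([10.0, 1.0], 3))
```

Separating the three fields makes it visible which kind of revision a given update performs: only `parameters` changes in ordinary tuning, whereas a transition between model classes changes `state_names`, `rule`, or both.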

This problem is visualized in Figure 4. Suppose there is a generally accepted model of a physical object (physical twin) that represents a realization of a dynamical system at each point in time. Here, “physical” is not limited to problems in physics but also includes objects in chemistry, biology, medicine, and other domains, essentially referring to real-world entities. Starting with model A, an experiment conducted at time t_c might reveal that this model is inconsistent with newly acquired experimental data. This discovery could necessitate not only adjustments to its parameters but also more fundamental changes. As a result, model B may emerge as the new best-accepted representation of the physical object. Importantly, even such a radical transition from model A to model B falls within the scope of a digital twins framework. The reasoning for this has been discussed in Section 5.1, where we explored how a digital twin is created by evaluating different model realizations against available data, thereby establishing a model selection process. Crucially, this process is iterative and continuous over time, making repetition an integral feature of the digital twin. Hence, the key point is not to prove that a given digital twin represents the true model, but rather that iterative updating progressively improves its representation over time.

We conclude by noting that Galileo is often credited as one of the first to explicitly assert that science requires a structured, formal language to systematically describe and understand the universe. Importantly, Galileo saw mathematics as such a language, as stated in his book Il Saggiatore (The Assayer, 1623). Wittgenstein later echoed this idea, famously stating, “The limits of my language mean the limits of my world” [86]. In this sense, the paradigm underlying digital twins offers a language that shapes our understanding and systematic expression of scientific ideas and is not limited to a particular domain.

8. Conclusions

In conclusion, connecting digital twins and digital twin systems to the scientific method provides both conceptual clarity and practical benefits. In particular, it offers a clear computational framework for operationalizing the scientific method, making hypothesis generation, prediction, and iterative model refinement more concrete and practically implementable. This connection also emphasizes the mechanisms of continuous learning and improvement through feedback loops and evidence-based model revision, thereby enhancing the reliability of digital twin systems. Moreover, digital twins provide a common scientific language that facilitates knowledge transfer across disciplines and promotes interdisciplinary research. Finally, by bridging diverse scientific fields, digital twins have the potential to foster a renewed form of scientific integration, reflecting the role of the scientific method as a common foundation for scientific inquiry while addressing the increasing specialization of modern science. Overall, interpreting digital twins through the lens of the scientific method provides a coherent conceptual foundation for understanding their role in scientific inquiry and their growing adoption across domains. To the best of our knowledge, although individual components such as system identification, control, simulation, and physics-informed modeling have been extensively studied, their integration into a unified conceptual framework that views digital twins as a computational realization of the scientific method has not previously been articulated in this form.

Funding

This work was in part supported by the Academy of Finland (352266). The funders had no role in study design, data collection and analysis, publication decision, or manuscript preparation.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Simon, H.A. Does scientific discovery have a logic? Philos. Sci. 1973, 40, 471–480.
2. Cohen, M.F. An Introduction to Logic and Scientific Method; Read Books Ltd.: Chicago, IL, USA, 2011.
3. Achinstein, P. Science Rules: A Historical Introduction to Scientific Methods; JHU Press: Baltimore, MD, USA, 2004.
4. Lee, H.N. Scientific method and knowledge. Philos. Sci. 1943, 10, 67–74.
5. Gauch, H.G. Scientific Method in Practice; Cambridge University Press: Cambridge, UK, 2003.
6. Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems; Springer: Abingdon, UK, 2017; pp. 85–113.
7. Kapteyn, M.G.; Pretorius, J.V.; Willcox, K.E. A probabilistic graphical model foundation for enabling predictive digital twins at scale. Nat. Comput. Sci. 2021, 1, 337–347.
8. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference and Prediction; Springer: New York, NY, USA, 2009.
9. Emmert-Streib, F.; Moutari, S.; Dehmer, M. Elements of Data Science, Machine Learning, and Artificial Intelligence Using R; Springer: Berlin/Heidelberg, Germany, 2023.
10. Ladyman, J.; Lambert, J.; Wiesner, K. What is a complex system? Eur. J. Philos. Sci. 2013, 3, 33–67.
11. Jones, D.; Snider, C.; Nassehi, A.; Yon, J.; Hicks, B. Characterising the Digital Twin: A systematic literature review. CIRP J. Manuf. Sci. Technol. 2020, 29, 36–52.
12. Grieves, M.W. Product lifecycle management: The new paradigm for enterprises. Int. J. Prod. Dev. 2005, 2, 71–84.
13. Gelernter, D. Mirror Worlds: Or the Day Software Puts the Universe in a Shoebox. How It Will Happen and What It Will Mean; Oxford University Press: Oxford, UK, 1991.
14. Wright, L.; Davidson, S. How to tell the difference between a model and a digital twin. Adv. Model. Simul. Eng. Sci. 2020, 7, 13.
15. Emmert-Streib, F. Defining a Digital Twin: A Data Science-Based Unification. Mach. Learn. Knowl. Extr. 2023, 5, 1036–1054.
16. Hernandez-Boussard, T.; Macklin, P.; Greenspan, E.J.; Gryshuk, A.L.; Stahlberg, E.; Syeda-Mahmood, T.; Shmulevich, I. Digital twins for predictive oncology will be a paradigm shift for precision cancer care. Nat. Med. 2021, 27, 2065–2066.
17. Emmert-Streib, F.; Yli-Harja, O. What Is a Digital Twin? Experimental Design for a Data-Centric Machine Learning Perspective in Health. Int. J. Mol. Sci. 2022, 23, 13149.
18. Glaessgen, E.; Stargel, D. The digital twin paradigm for future NASA and US Air Force vehicles. In Proceedings of the 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference 20th AIAA/ASME/AHS Adaptive Structures Conference 14th AIAA, Honolulu, HI, USA, 23–26 April 2012; p. 1818.
19. Corral-Acero, J.; Margara, F.; Marciniak, M.; Rodero, C.; Loncaric, F.; Feng, Y.; Gilbert, A.; Fernandes, J.F.; Bukhari, H.A.; Wajdan, A.; et al. The ‘Digital Twin’ to enable the vision of precision cardiology. Eur. Heart J. 2020, 41, 4556–4564.
20. Tuegel, E.J.; Ingraffea, A.R.; Eason, T.G.; Spottswood, S.M. Reengineering aircraft structural life prediction using a digital twin. Int. J. Aerosp. Eng. 2011, 2011, 154798.
21. Emmert-Streib, F.; Cherifi, H.; Kauffman, S.; Yli-Harja, O. Moving beyond simulation and learning: Unveiling the potential of complexity data science. PLoS Complex Syst. 2024, 1, e0000002.
22. Emmert-Streib, F.; Parkkila, S.; Laubenbacher, R.; Mannermaa, A.; Hood, L.; Yli-Harja, O. The role of digital twins in P4 medicine: A paradigm for modern healthcare. NPJ Digit. Med. 2025, 8, 735.
23. Hayes, B.K.; Heit, E.; Swendsen, H. Inductive reasoning. Wiley Interdiscip. Rev. Cogn. Sci. 2010, 1, 278–292.
24. Holland, J.H.; Holyoak, K.J.; Nisbett, R.E.; Thagard, P.R. Deductive reasoning. In Readings in Philosophy and Cognitive Science; MIT Press: Cambridge, MA, USA, 1993; pp. 23–41.
25. McAuliffe, W.H. How did abduction get confused with inference to the best explanation? Trans. Charles Peirce Soc. Q. J. Am. Philos. 2015, 51, 300–319.
26. Lawson, A.E. Hypothetico-deductive Method. In Encyclopedia of Science Education; Gunstone, R., Ed.; Springer: Dordrecht, The Netherlands, 2015; pp. 471–472.
27. Hempel, C.G.; Oppenheim, P. Studies in the Logic of Explanation. Philos. Sci. 1948, 15, 135–175.
28. Popper, K. The Logic of Scientific Discovery; Basic Books: New York, NY, USA, 1959.
29. Ayala, F.J. Darwin and the scientific method. Proc. Natl. Acad. Sci. USA 2009, 106, 10033–10039.
30. Mahootian, F.; Eastman, T.E. Complementary frameworks of scientific inquiry: Hypothetico-deductive, hypothetico-inductive, and observational-inductive. World Futur. 2009, 65, 61–75.
31. Godfrey-Smith, P. Theory and Reality: An Introduction to the Philosophy of Science; Science and Its Conceptual Foundations Series; University of Chicago Press: Chicago, IL, USA, 2003.
32. Mirza, N.A.; Akhtar-Danesh, N.; Noesgaard, C.; Martin, L.; Staples, E. A concept analysis of abductive reasoning. J. Adv. Nurs. 2014, 70, 1980–1994.
33. Ramoni, M.; Stefanelli, M.; Magnani, L.; Barosi, G. An epistemological framework for medical knowledge-based systems. IEEE Trans. Syst. Man Cybern. 1992, 22, 1361–1375.
34. Riva, A.; Nuzzo, A.; Stefanelli, M.; Bellazzi, R. An automated reasoning framework for translational research. J. Biomed. Inform. 2010, 43, 419–427.
35. Prosperi, M.; Bian, J.; Buchan, I.E.; Koopman, J.S.; Sperrin, M.; Wang, M. Raiders of the lost HARK: A reproducible inference framework for big data science. Palgrave Commun. 2019, 5, 125.
36. Niiniluoto, I.; Tuomela, R. Theoretical Concepts and Hypothetico-Inductive Inference; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; Volume 53.
37. Platt, J.R. Strong Inference: Certain systematic methods of scientific thinking may produce much more rapid progress than others. Science 1964, 146, 347–353.
38. Voit, E.O. Perspective: Dimensions of the scientific method. PLoS Comput. Biol. 2019, 15, e1007279.
39. Kalinichenko, L.A.; Kovalev, D.Y.; Kovaleva, D.A.; Malkov, O.Y. Methods and tools for hypothesis-driven research support: A survey. Inform. Primen. 2015, 9, 28–54.
40. Brin, M.; Stuck, G. Introduction to Dynamical Systems; Cambridge University Press: Cambridge, UK, 2002.
41. Emmert-Streib, F.; Tripathi, S.; Dehmer, M. Analyzing the Scholarly Literature of Digital Twin Research: Trends, Topics and Structure. IEEE Access 2023, 11, 69649–69666.
42. Pylianidis, C.; Osinga, S.; Athanasiadis, I.N. Introducing digital twins to agriculture. Comput. Electron. Agric. 2021, 184, 105942.
43. Laubenbacher, R.; Niarakis, A.; Helikar, T.; An, G.; Shapiro, B.; Malik-Sheriff, R.S.; Sego, T.; Knapp, A.; Macklin, P.; Glazier, J.A. Building digital twins of the human immune system: Toward a roadmap. NPJ Digit. Med. 2022, 5, 64.
44. Björnsson, B.; Borrebaeck, C.; Elander, N.; Gasslander, T.; Gawel, D.R.; Gustafsson, M.; Jörnsten, R.; Lee, E.J.; Li, X.; Lilja, S.; et al. Digital twins to personalize medicine. Genome Med. 2020, 12, 4.
45. Vallée, A. Digital twin for healthcare systems. Front. Digit. Health 2023, 5, 1253050.
46. Barat, S.; Parchure, R.; Darak, S.; Kulkarni, V.; Paranjape, A.; Gajrani, M.; Yadav, A.; Kulkarni, V. An agent-based digital twin for exploring localized non-pharmaceutical interventions to control covid-19 pandemic. Trans. Indian Natl. Acad. Eng. 2021, 6, 323–353.
47. Li, L.; Lei, B.; Mao, C. Digital twin in smart manufacturing. J. Ind. Inf. Integr. 2022, 26, 100289.
48. Pobuda, P. The digital twin of the economy: Proposed tool for policy design and evaluation. Real-World Econ. Rev. 2020, 94, 140–148.
49. Jiang, F.; Ma, L.; Broyd, T.; Chen, K. Digital twin and its implementations in the civil engineering sector. Autom. Constr. 2021, 130, 103838.
50. Caldarelli, G.; Arcaute, E.; Barthelemy, M.; Batty, M.; Gershenson, C.; Helbing, D.; Mancuso, S.; Moreno, Y.; Ramasco, J.; Rozenblat, C.; et al. The role of complexity for digital twins of cities. Nat. Comput. Sci. 2023, 3, 374–381.
51. Longair, M.S. Theoretical Concepts in Physics: An Alternative View of Theoretical Reasoning in Physics; Cambridge University Press: Cambridge, UK, 2003.
52. Frieden, B.R.; Frieden, R. Physics from Fisher Information: A Unification; Cambridge University Press: Cambridge, UK, 1998.
53. Furusawa, C.; Kaneko, K. A dynamical-systems view of stem cell biology. Science 2012, 338, 215–217.
54. Tyson, J.J.; Novak, B. A dynamical paradigm for molecular cell biology. Trends Cell Biol. 2020, 30, 504–515.
55. Finkenstädt, B.F.; Grenfell, B.T. Time series modelling of childhood diseases: A dynamical systems approach. J. R. Stat. Soc. Ser. C Appl. Stat. 2000, 49, 187–205.
56. Axtell, R.L.; Farmer, J.D. Agent-based modeling in economics and finance: Past, present, and future. J. Econ. Lit. 2022, 63, 197–287.
57. Hache, H.; Lehrach, H.; Herwig, R. Reverse Engineering of Gene Regulatory Networks: A Comparative Study. EURASIP J. Bioinform. Syst. Biol. 2009, 2009, 617281.
58. Karlebach, G.; Shamir, R. Modelling and analysis of gene regulatory networks. Nat. Rev. Mol. Cell. Biol. 2008, 9, 770–780.
59. Freeman, L. The Development of Social Network Analysis: A Study in the Sociology of Science. 2004, Volume 1. Available online: https://www.researchgate.net/publication/239228599_The_Development_of_Social_Network_Analysis (accessed on 6 June 2026).
60. Emmert-Streib, F.; Tripathi, S.; Yli-Harja, O.; Dehmer, M. Understanding the world economy in terms of networks: A survey of data-based network science approaches on economic networks. Front. Appl. Math. Stat. 2018, 4, 37.
61. Ding, J.; Tarokh, V.; Yang, Y. Model Selection Techniques: An Overview. IEEE Signal Process. Mag. 2018, 35, 16–34.
62. Emmert-Streib, F.; Dehmer, M. Evaluation of Regression Models: Model Assessment, Model Selection and Generalization Error. Mach. Learn. Knowl. Extr. 2019, 1, 521–551.
63. Makke, N.; Chawla, S. Interpretable scientific discovery with symbolic regression: A review. Artif. Intell. Rev. 2024, 57, 2.
64. Zhang, M.; Kim, S.; Lu, P.Y.; Soljačić, M. Deep learning and symbolic regression for discovering parametric equations. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 16775–16787.
65. Kim, S.; Lu, P.Y.; Mukherjee, S.; Gilbert, M.; Jing, L.; Čeperić, V.; Soljačić, M. Integration of neural network-based symbolic regression in deep learning for scientific discovery. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4166–4177.
66. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440.
67. Farea, A.; Yli-Harja, O.; Emmert-Streib, F. Understanding Physics-Informed Neural Networks: Techniques, Applications, Trends, and Challenges. AI 2024, 5, 1534–1557.
68. Poole, D.L.; Mackworth, A.K. Artificial Intelligence: Foundations of Computational Agents; Cambridge University Press: Cambridge, UK, 2010.
69. Emmert-Streib, F.; Cherifi, H.; Kaski, K.; Kauffman, S.; Yli-Harja, O. Complexity data science: A spin-off from digital twins. PNAS Nexus 2024, 3, 456.
70. Tayefi, M.; Ngo, P.; Chomutare, T.; Dalianis, H.; Salvi, E.; Budrionis, A.; Godtliebsen, F. Challenges and opportunities beyond structured data in analysis of electronic health records. Wiley Interdiscip. Rev. Comput. Stat. 2021, 13, e1549.
71. Yang, X.; Chen, A.; PourNejatian, N.; Shin, H.C.; Smith, K.E.; Parisien, C.; Compas, C.; Martin, C.; Costa, A.B.; Flores, M.G.; et al. A large language model for electronic health records. NPJ Digit. Med. 2022, 5, 194.
72. Hasin, Y.; Seldin, M.; Lusis, A. Multi-omics approaches to disease. Genome Biol. 2017, 18, 83.
73. Subramanian, I.; Verma, S.; Kumar, S.; Jere, A.; Anamika, K. Multi-omics data integration, interpretation, and its application. Bioinform. Biol. Insights 2020, 14, 1177932219899051.
74. Acosta, J.N.; Falcone, G.J.; Rajpurkar, P.; Topol, E.J. Multimodal biomedical AI. Nat. Med. 2022, 28, 1773–1784.
75. Liang, Z.; Xu, Y.; Hong, Y.; Shang, P.; Wang, Q.; Fu, Q.; Liu, K. A survey of multimodel large language models. In Proceedings of the 3rd International Conference on Computer, Artificial Intelligence and Control Engineering, Xi’an, China, 26–28 January 2024; pp. 405–409.
76. Jagtap, A.D.; Kawaguchi, K.; Karniadakis, G.E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. J. Comput. Phys. 2020, 404, 109136.
77. Rudy, S.H.; Brunton, S.L.; Proctor, J.L.; Kutz, J.N. Data-driven discovery of partial differential equations. Sci. Adv. 2017, 3, e1602614.
78. Gong, C.; Zhang, C.; Yao, D.; Bi, J.; Li, W.; Xu, Y. Causal discovery from temporal data: An overview and new perspectives. ACM Comput. Surv. 2024, 57, 100.
79. Salmon, W.C. The Foundations of Scientific Inference; University of Pittsburgh Press: Pittsburgh, PA, USA, 2017.
80. Van Fraassen, B.C. The Scientific Image; Oxford University Press: Oxford, UK, 1980.
81. Sadée, C.; Testa, S.; Barba, T.; Hartmann, K.; Schuessler, M.; Thieme, A.; Church, G.M.; Okoye, I.; Hernandez-Boussard, T.; Hood, L.; et al. Medical digital twins: Enabling precision medicine and medical artificial intelligence. Lancet Digit. Health 2025, 7, 100864.
82. Mosqueira-Rey, E.; Hernández-Pereira, E.; Alonso-Ríos, D.; Bobes-Bascarán, J.; Fernández-Leal, Á. Human-in-the-loop machine learning: A state of the art. Artif. Intell. Rev. 2023, 56, 3005–3054.
Holzinger, A. Interactive machine learning for health informatics: When do we need the human-in-the-loop? Brain Inform. 2016, 3, 119–131. [Google Scholar] [CrossRef]
Laubenbacher, R.; Mehrad, B.; Shmulevich, I.; Trayanova, N. Digital twins in medicine. Nat. Comput. Sci. 2024, 4, 184–191. [Google Scholar] [CrossRef]
Metzcar, J.; Jutzeler, C.R.; Macklin, P.; Köhn-Luque, A.; Brüningk, S.C. A review of mechanistic learning in mathematical oncology. Front. Immunol. 2024, 15, 1363144. [Google Scholar] [CrossRef]
Wittgenstein, L. Tractatus Logico-Philosophicus; Routledge: London, UK, 1922. [Google Scholar]

Figure 1. A key feature of digital twins is learnability. This example illustrates the connection between a physical twin and a digital twin over the lifetime of a patient. Continuous data collection improves the digital twin model over time, enabling treatment suggestions that can influence the patient’s life trajectory.

Figure 2. Illustration of the Scientific Method. The Hypothetico-Deductive Model (HDM) emphasizes rigorous hypothesis testing, while iterative feedback loops enable continuous learning and model refinement.

Figure 3. The digital twin system (DTS) can be interpreted as a computational representation of the scientific method. Its overall structure mirrors the conceptual framework shown in Figure 2 but is embedded within the workflow of digital twins.

Figure 4. Progression of the model development process over time. The underlying model of a dynamical system is constantly tested. In the shown example, a radical model change is required at time $t_c$. Such a transition is facilitated by model selection, as described within Complexity Data Science.

Figure 5. Evolution of a hospital-based digital twin system over time. Initial models based on electronic health records (EHRs) are iteratively updated through data collection, prediction, and clinical intervention. The integration of multi-omics data requires a transition to a new multimodal model, representing a structural change rather than a parameter update, and may require a scientist in the loop to apply abductive reasoning.

Table 1. Example applications of digital twins across a variety of areas.

Subject	Digital Twin Modeling	References
Agriculture	Agricultural systems: crops, soil, livestock, farm operations	[42]
Immunology	Immune system: cells, signaling, pathways, responses, dynamics	[43]
Medicine	Personalized medicine: patients, biomarkers, treatments, outcomes	[44]
Health	Healthcare systems: hospitals, workflows, patients, resources	[45]
Epidemiology	Pandemic control: population, infection, interventions, spread	[46]
Manufacturing	Manufacturing systems: machines, production lines, processes, products	[47]
Economics	Economy: policies, sectors, agents, financial flows	[48]
Engineering	Civil infrastructure: buildings, bridges, construction processes, assets	[49]
Urban planning	Cities: infrastructure, traffic, population, services, dynamics	[50]
