How to Set Up the Pillars of Digital Twins Technology in Our Business: Entities, Challenges and Solutions

Digitalizing business processes through Digital Twins helps companies keep pace with technological development and, accordingly, improve their outcomes. To take full advantage of Digital Twins, the role of the creative phase as a pillar of this technology, and its influence on the performance of the other phases and the overall outcome, should not be overlooked. This research addresses the lack of an integrated framework for setting up the creative phase of Digital Twins. To design a proper framework, a qualitative empirical method was adopted: interviews with experts in the Digital Twin area were organized to collect information about all explicit and hidden aspects of this phase and to establish which entities participate in it, which potential challenges and obstacles exist and which solutions are effective in overcoming them. The structural features of the proposed framework continuously prepare the system for change, aiming to embed improvement within it. The findings of this study can serve as a guide for companies that want to take the first steps toward the digital representation of physical assets, as well as for those that already work with Digital Twins and want to improve their systems' interactions.


Introduction
Digital Twin is the virtual representation of a physical system, including not only its structural parts but also the dynamics and behaviors of the system along the lifecycle [1]. An interesting point about this definition is that it includes all the basic components related to Digital Twin: real space, virtual space, the link for data flow from real space to virtual space, the link for information flow from virtual space to real space and virtual sub-spaces. The concept can be implemented in processes in various ways. The most general one is the digital model, where there is only a virtual representation and automated communication with the physical system is absent. A deeper implementation is the digital shadow, where changes in the physical system are propagated to the virtual system, but not vice versa. At the highest level, implementing the entire concept of Digital Twins requires bidirectional automated communication, which allows changes to start from both the physical and the virtual system [1].
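The three implementation levels described above (digital model, digital shadow, Digital Twin) can be sketched as a small class hierarchy. This is an illustrative sketch only; the class and attribute names are hypothetical and not taken from the cited literature.

```python
from dataclasses import dataclass

@dataclass
class PhysicalAsset:
    """Stand-in for a sensed physical system."""
    temperature: float = 20.0

@dataclass
class VirtualModel:
    temperature: float = 20.0

class DigitalModel:
    """Level 1: no automated link; the virtual model is updated manually."""
    def __init__(self):
        self.virtual = VirtualModel()

class DigitalShadow(DigitalModel):
    """Level 2: one-way automated link, physical -> virtual."""
    def sync_from_physical(self, asset: PhysicalAsset):
        self.virtual.temperature = asset.temperature

class DigitalTwin(DigitalShadow):
    """Level 3: bidirectional link; changes may also start in the virtual model."""
    def push_to_physical(self, asset: PhysicalAsset):
        asset.temperature = self.virtual.temperature
```

The inheritance mirrors the idea that each level strictly extends the communication capabilities of the previous one.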
The main motivations that encourage the industrial sector to employ Digital Twin technology can be traced to its striking benefits. Continuous improvement along the product lifecycle phases, monitoring the physical system to prevent problems [1], simulating the physical system and tracking its performance under a series of parameter changes in the digital system before adopting them in the physical system, predicting problems or possible future scenarios, analyzing the operation and manufacturing process in real time [2] and resolving problems originating from insufficient information are only a small sample of the advantages of Digital Twin technology [3].
One of the big challenges companies face is maintaining, and in further steps enhancing, their market share, which depends on how well they can identify customers' requirements and accordingly upgrade current products or develop new ones. Overcoming this challenge entails a huge amount of information and data, from the client's order to the product delivery [4]. The main objective of most companies working in digitalization is to digitalize the information lifecycle process [4]. Digital Twin assists companies in achieving this goal in different phases.
Digital Twin technology, like any other technology, has its own lifecycle, which involves four phases: the creative phase, the productive phase, the support phase and the disposal phase [5].
The creative phase is where a huge amount of work has to be done. It is the fundamental stage of using Digital Twin and has many details that need to be carefully considered. This phase involves four emergent concepts that should be taken into account: predicted desirable (PD), predicted undesirable (PU), unpredicted desirable (UD) and unpredicted undesirable (UU).
The productive phase is about the construction of physical systems with the specific configurations that are the result of the previous phase. The main goal of this step is to allow and optimize the communication of the information between the virtual world and the physical one.
In the support phase, there are several analyses about the behavioral concepts of the system that were defined in the creative phase. The connections between the real and the virtual system are maintained. Here, it is important to take into account possible changes in order to optimize the actual situation or to eliminate the unpredicted undesirable behaviors that were not identified in the creative phase [6].
The disposal phase is taken into account because information about a product is of huge value for the business, even after the product is retired from the market. Next-generation products could have similar problems, and companies can save cost and time by preserving this information.
Digital Twin can be classified into two types: Digital Twin Prototype (DTP) and Digital Twin Instance (DTI). Digital Twins operate within a Digital Twin Environment (DTE) [5].
DTP considers the prototypical physical artefact. It takes into account all the information needed to reproduce the product in the physical world after defining the prototype in the virtual one. Here, the information flow ideally goes from the virtual model to the physical model. The goal of this type is to improve the efficiency in terms of time and cost.
DTI means the digital representation of a physical product. This process needs continuous connections throughout the entire lifecycle to optimize the virtual model over time. Here, the information flow has the opposite direction of DTP because in this case, it goes from the physical to the digital model.
DTE is a multi-domain application environment used for managing the Digital Twin for different purposes. The predictive purpose refers to the possibility of forecasting future scenarios and the performance of the physical product. We can analyze the predictive purpose in the two types of Digital Twin described before: prediction for DTP might be the analysis of product tolerances in order to be sure that the as-designed product meets the proposed requirements in the physical one, while predictions for DTI take into account the historical processes of product components and performance, providing a range of possible future states. The interrogative purpose applies basically to DTI and concerns querying the states of the system; for example, the aggregate of actual failures could provide probabilities for predictive uses [5].
Like the traditional method of designing a new prototype, DTP aims to verify and validate the PD results and eliminate the PU outputs. In addition, DTP provides the capability of identifying UU scenarios that might create problems for the system.
The creative phase has great significance because it is the first stage where validation and verification must take place to ensure that every decision made will actually have a positive effect on the investment and maximize it. Moreover, data is collected and analyzed in this phase to obtain the information needed to define areas of improvement and to develop all possible solutions. Therefore, Digital Twin in the creative phase allows simulations, real-time analyses of operations and the manufacturing process, predictions about problems or possible future scenarios and other applications that create several benefits for the organization [2]. Due to the undeniable importance of this phase and the lack of studies and research in this area [7], the main focus of the current study is the development of a comprehensive framework for implementing Digital Twins in the creative phase. Real-time understanding and simulation of changes in data is considered one of the main challenges of implementing Digital Twins, and it is handled mainly in the creative phase. This reflects the novelty of this research: the presentation of a first-of-its-kind, fully integrated framework that assists in the process of implementing Digital Twin technologies in any business and guides the implementation through an agile and precise step-by-step approach that takes into consideration the real-time analysis of data and its effect on overall performance.
The rest of the paper is structured as follows. Section 2 includes theoretical background analysis. Section 3 introduces the methodology which has been utilized in this research work. Section 4 discusses the results, and finally, Section 5 provides a discussion of research directions for future work.

Theory
In this section, a detailed analysis of the importance of the creative stage of Digital Twin is presented. The main part of this section is attributed to data as the fuel of the Digital Twin. Fuel quality and distribution have a direct effect on the final performance of the system. In order to ensure the mentioned characteristics, all aspects related to how data should be collected, analyzed and managed and in what situations data should be used have been scrutinized under the following headings.

Creative Stage of Digital Twin
The creative (or design) phase in Digital Twins sets its foundation on the creation of the digital model [8]. The digital model could represent something that exists in the real world, such as a process, or something that does not yet exist in the real world, such as a product prototype. The latter might be based on data and information that come from previous experience or from different physical models with similar characteristics. In this stage, the main goal is a digital creation that allows analyses and virtual verification, which provides its user with the possibility of identifying the defects, behaviors and structures of the digital prototype [9].
The creation of the Digital Twin starts with a new pipeline of manufacturing data. In fact, the integration of historical data about operations, performance and system interaction acts as the support for the Digital Twin [10]. These data provide industrial owners with a special opportunity to simulate the system and its processes and evaluate how well they will perform in the physical world under various what-if scenarios. In this regard, avoiding the high cost of physical prototyping is the notable point that encourages companies to use this opportunity. The concept of feedback to design is compatible with this situation; it also presents solutions based on historical design systems that were used in the past [11].
To apply Digital Twin where the physical system has an external presence, various technologies have been developed that assist the simulation process. One of the most popular is the Internet of Things (IoT). IoT is the concept of connecting different objects to the Internet. It has accelerated the movement from simply connecting devices to the Internet toward collecting and analyzing data, using sensors to extract data throughout the product lifecycle in order to create value and knowledge from the huge amount of collected data, such as knowledge of product performance and conditions [12]. This has enabled a transformation in several industries, which are moving toward selling these value-added data to obtain more revenue and higher margins instead of selling only their final product [13,14].
The related sensors and devices are positioned and are in charge of exploiting data from products and processes. RFID is one of the most commonly used sensors in IoT. The RFID implementation is relatively cheap, and it improves not only the communication quality between physical and digital models, but also the practicability of sensing systems [15].
One fascinating detail in the creative phase of Digital Twin is following the design-driven approach, which compensates for the inefficiency of traditional design approaches. By identifying the behaviors of the system, Digital Twin forms digital prototypes that align with the requirements of the considered business. The goal of this procedure is to improve decision-making by generating information. Meaningful information comes from data, and data come from the physical model through sensors and devices. Given the undeniable importance of data quality, placing the sensors responsible for optimally collecting data in the right position is highly sensitive. Fortunately, the simulation capability of the digital model can guide this placement in the right direction.
The role of Digital Twin in the product design process is also outstanding. The process of designing a product can be divided into three stages: conceptual design, detailed design and virtual verification [9]. The conceptual design stage is where the designers identify characteristics such as the aesthetics and main functions of the final product. Here, designers have to take into account several data types, such as customer satisfaction, investment plans, product sales, etc. In this situation, Digital Twin can help not only by taking all these data into account but also by establishing more efficient and transparent communication, such as communication between clients and designers to get feedback and identify problems. The detailed design stage focuses on completing the product design prototype, taking into account appearance, functions, configuration, parameters and tests. Digital Twin is helpful for simulating this prototype in a faster and cheaper way. Virtual verification is conducted by pointing out performance and defects through the use of Digital Twin.
Xu et al. [16] analyzed how Digital Twin can assist fault diagnosis using deep transfer learning (DTL), which allows the extraction of knowledge from one or more source domains for use in another target domain. Here, the Digital Twin creation process starts with the creation of a virtual system that obtains data and insights from the simulation of the physical system that will be built. This is the intelligent development phase. As always in statistical scenarios, high-quality input data are needed to obtain high-quality output. When the virtual entities achieve a satisfactory level, the second phase is the construction of the physical entity, while DTL is used to create a diagnosis model using the knowledge of the previous phase. The problem of insufficient training data for high-quality diagnosis is addressed here by the DTL solution. Therefore, DTL improves the diagnosis quality in the first stages, and it could also be used to establish new working conditions in the future [16].

Data Acquisition for Digital Twin
The role of data in today's digitized systems is more prominent than ever. In fact, data plays the role of fuel for the system and causes it to run and survive. Since fuel quality plays a significant role in system performance, data quality will also affect system performance and output. As a result, the process of data acquisition needs to be carefully taken into account.
Systems have their own data sources. Most of the time, the structure of the data coming out of these sources is not the same. In addition to sources that generate and store data, there are other important sources whose data are valuable but difficult to store, such as human experience and personal behavior. Collecting all these types of data is necessary for the creation of the digital representation [17] and for better simulations of the physical system in different scenarios [18].
For better implementation of digital representation, the acquisition of information should not be confined to only data originating from machines and should encompass a comprehensive perspective of the process through a multi-modal data acquisition [19].
Data acquisition can be conducted by two main approaches: sensor-based tracking and machine vision. The sensor-based tracking approach provides information about the position and movements of employees, products and devices around the organization, while machine vision is used to identify products in certain processes, avoiding the need for a dedicated sensor or tag on each product. The machine vision approach can be employed in certain conditions, for example, when the number of products is high or the process characteristics are not compatible with sensor-based tracking. From the economic point of view, the cost of the sensor-based tracking approach is proportional to the number of devices that have to be tracked, while the machine vision approach has different cost components depending on the setup; generally, however, its cost is fixed and does not depend on quantity [19].
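The economic trade-off described above (linear per-device cost versus a roughly fixed cost) can be expressed as a minimal break-even sketch. The cost figures below are hypothetical placeholders, not values from the cited study.

```python
def tracking_cost(n_products, tag_unit_cost=5.0, vision_fixed_cost=5000.0):
    """Return (sensor-based cost, machine-vision cost) for n_products.

    Sensor-based tracking scales linearly with the number of tagged
    items; machine vision is modeled as quantity-independent.
    All cost parameters are illustrative assumptions.
    """
    return tag_unit_cost * n_products, vision_fixed_cost

def cheaper_approach(n_products, tag_unit_cost=5.0, vision_fixed_cost=5000.0):
    """Pick the approach with the lower modeled cost."""
    sensor, vision = tracking_cost(n_products, tag_unit_cost, vision_fixed_cost)
    return "sensor-based" if sensor < vision else "machine vision"
```

Under these assumptions, the break-even point is simply `vision_fixed_cost / tag_unit_cost` products; above it, machine vision becomes the cheaper option.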
The challenge of data storage needs to be addressed in addition to data collection. Companies can decide to store data in-house or use a cloud-based system. The latter scenario is well described by Alam and El Saddik, who analyzed the cloud-based system for the scalability of storage, computation and cross-domain communication capabilities [20]. In this case, every physical system has a Digital Twin hosted in the cloud. They also argued for a one-to-one connection between physical and digital systems: changes in the real world are propagated through sensors to the virtual one. In their study, there is a unique ID for each object and a relation ID for every communication.
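The ID scheme described by Alam and El Saddik (a unique ID per object, a relation ID per communication link) can be illustrated with a minimal in-memory registry. This is a sketch under our own naming assumptions, not their actual architecture.

```python
import uuid

class TwinRegistry:
    """Toy registry: one cloud-hosted twin per physical object,
    identified by a unique ID, plus relation IDs for links."""

    def __init__(self):
        self.twins = {}      # object_id -> twin state (dict)
        self.relations = {}  # relation_id -> (object_id_a, object_id_b)

    def register(self, state=None):
        """Create a twin for a new physical object; return its unique ID."""
        object_id = str(uuid.uuid4())
        self.twins[object_id] = state if state is not None else {}
        return object_id

    def relate(self, object_id_a, object_id_b):
        """Record a communication link between two twins; return its ID."""
        relation_id = str(uuid.uuid4())
        self.relations[relation_id] = (object_id_a, object_id_b)
        return relation_id

    def update(self, object_id, **readings):
        """Sensor-driven update: push real-world changes into the twin."""
        self.twins[object_id].update(readings)
```

The one-to-one mapping is enforced simply by generating a fresh UUID per registered object.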
Hofmann and Branding described four different data sources that are used for simulation in Digital Twin. These sources include an existing system that contains daily useful information, a historical database that includes data from the past, external sources from the surrounding environment and real-time data from sensors technologies [21].
While an abundance of data (big data) improves the quality of simulation in digital models, its analysis faces greater challenges. The three main challenges that need to be managed are volume (storage), velocity and variety. To overcome them, answers are required to three questions, respectively: how to reduce the stored set of data, how to speed up data collection and real-time analysis, and how to combine multiple data sources [22].
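Two of the three challenges above, volume and variety, can be illustrated with small helper functions: windowed aggregation to shrink the stored data set, and normalization of heterogeneous source records into one schema. The source names and field names are invented for illustration.

```python
from statistics import mean

def reduce_volume(samples, window=10):
    """Volume: store one mean value per window instead of every raw sample."""
    return [mean(samples[i:i + window]) for i in range(0, len(samples), window)]

def normalize(record, source):
    """Variety: map records from different (hypothetical) sources
    onto a single common schema before they enter the twin."""
    if source == "plc":
        return {"ts": record["t"], "value": record["val"]}
    if source == "mes":
        return {"ts": record["timestamp"], "value": record["reading"]}
    raise ValueError(f"unknown source: {source}")
```

Velocity, the third challenge, would typically be handled at the infrastructure level (streaming ingestion) rather than in application code like this.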
The data collection could be established on two platforms, one is concerned with the prediction service and the other one is related to the production management and control service. Enterprise information system, big data-based prediction and analyses system and Digital Twin technology-based prediction and analyses system are all requirements for setting up these platforms [23].

Data Management for Digital Twin
Data collection will have no added value without the right implementation. This is why the data management process needs to be meticulously taken into account.
After data collection, all these resources need to be implemented efficiently in the digital system in order to have a continuous data flow. Nowadays, software technologies make Digital Twin data implementation more feasible and affordable [24]. The key to empowering Digital Twin simulation is a complete data model that includes the features of the Cyber-Physical System (CPS) [25]. In order to establish the data flow, a Centralized Support Infrastructure is needed; it supports the semantic metadata model, the simulation framework and the communication layer. Data acquisition and data evaluation are characterized by location independence. This helps to counteract the low level of knowledge about Industry 4.0 and Digital Twin: companies do not need employees to be inside the company to do their job, since technologies support remote working [19].
Beyond the manner of collecting, storing and transmitting data, it is important to clarify the purpose for which these data will be applied. The purpose has a direct effect on the type of data. For example, if the purpose is maintenance, the Digital Twin needs historically available knowledge, and it needs to be close to the process [26].
Digital Twin's performance is not limited to its entities; the people who put these components together and manage them also have a significant impact on the final performance. The bottom level of Digital Twin implementation is, of course, composed of people: engineers and scientists who have both the software development skills and the engineering skills to implement these technologies. West and Blackburn support the idea that an engineer with deep engineering knowledge about the physical system and basic software development skills can do a better job than a professional software developer with a limited understanding of the engineering part of the system that has to be modeled [27]. A comprehensive understanding of the process allows the data to be transformed into meaningful information and insights and implemented properly. Effective information management across the built environment provides overall improvement [28] and several benefits, the most important of which are better decision-making and financial savings.
One factor that affects the correct use of data in the digital model is aligning the upstream data (from the physical to the digital system) with the requirements of the downstream (from the digital to the physical system) process operation. This wide view leads to effective data management [29]. Blockchain, one of the most creative solutions for managing the data flow of the digital currency Bitcoin, provides the possibility of tracing information through the entire lifecycle and protecting the security of artifacts through hashes [30]. A comprehensive assessment of whether the concept of Blockchain could be implemented in Digital Twins is provided by Heber and Groll [31].
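The traceability-through-hashes idea mentioned above can be demonstrated with a minimal hash chain: each record stores the hash of its predecessor, so any later tampering invalidates the chain. This is a generic sketch of the mechanism, not the design assessed by Heber and Groll.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first block

def block_hash(record, prev_hash):
    """Hash a record together with its predecessor's hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, record):
    """Append a lifecycle record (e.g., a manufacturing step) to the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev_hash,
                  "hash": block_hash(record, prev_hash)})
    return chain

def verify(chain):
    """Recompute every hash; any altered record breaks verification."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["record"], prev):
            return False
        prev = block["hash"]
    return True
```

Because each hash covers both the record and the previous hash, modifying any historical entry requires recomputing every subsequent block, which is what makes the lifecycle information tamper-evident.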
Continuous improvement of the Digital Twin depends on rectifying the deviation between the simulated signal and the measured signal using data management, which eventually leads to an upgrade of the Digital Twin system. There are two main phases in keeping a Digital Twin updated: local data processing and global data processing. The first is used for simple feedback, such as data cleaning, storing data in a private database and using data to make local decisions. The second is used for shop floor-level management; it includes the transmission of clean data to public databases and data exploitation in order to extract information and knowledge [32].
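The division of labor between the two phases can be sketched as follows: a local stage that cleans raw readings and makes a simple local decision, and a global stage that aggregates the clean batches from many local nodes. The value ranges and thresholds are hypothetical.

```python
def local_processing(raw_readings):
    """Local phase: clean the data (drop physically impossible readings,
    assumed valid range 0-100) and make a simple local decision."""
    clean = [x for x in raw_readings if 0 <= x <= 100]
    alarm = any(x > 90 for x in clean)  # illustrative local threshold
    return clean, alarm

def global_processing(clean_batches):
    """Global phase (shop floor-level): aggregate clean data from many
    local nodes to extract higher-level information."""
    values = [x for batch in clean_batches for x in batch]
    return {"count": len(values), "mean": sum(values) / len(values)}
```

In a real deployment the global stage would write to a shared database and feed analytics, but the shape of the split, cheap feedback locally and knowledge extraction centrally, is the same.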
Data management provides assistance to the built-in data flow to improve the actual situation of the physical and the digital systems. In order to achieve these expectations, several aspects of data flow need to be adjusted properly according to the strategy behind data management. Data synchronization between different resources, low latency data transfer to the cloud, data security and privacy, reliability in data transfer, low energy consumption in data transfer and interconnectivity between different smart objects are all entities associated with data flow that need to undergo special kinds of regulation [32,33].

Digital Twin-Driven Product Design
Product design using Digital Twin requires establishing certain entities. In the initial step, the main ideas behind the considered product should be determined; following this, the related criteria need to be structured into the Digital Twin vision. In order to set up the virtual model, the main resources that translate the physical part into a virtual one should be identified, and product data coming from tests, prototypes and similar components from the past should be organized to support the digital model [22].
The design phase of Digital Twins is supported by two pillars: design theory and methodology (DTM) and data lifecycle management (DLM).
The Digital Twin in this phase takes into account existing DTMs with the aim of optimizing and customizing them for the specific product. DTMs are also helpful in order to identify conflicts and contacts through the different design phases. DLM exploits data that comes from the physical model. It improves data quality through cleaning and mining processes. The Digital Twin-Driven Product Design (DTPD) is an engine that enables the use of a huge quantity of data transformed by the DLM into beneficial information to be used in order to make decisions aligned with the DTM [34].
An interesting design approach is the FBS framework. FBS stands for function (F), behavior (B) and structure (S). Function means "what the object is for", behavior means "what it does" and structure means "what the object is". The framework aims to integrate the knowledge of the design agent in order to have a dynamic and open vision of the system. The design phase of a Digital Twin needs to consider the connections between these aspects in order to have a logical and rational system view.
Gero and Kannengiesser describe the FBS framework as a process for designing. The process includes eight steps (need, analysis of the problem, statement of the problem, conceptual design, selected schemes, embodiment of schemes, detailing and, finally, working drawings) based on the information flow between different entities that contain sets of information. These sets include the expected behavior, the behavior derived from the structure, the function and the design description. In the design phase, the external world, the interpreted world and the expected world are three domains that constantly interact with each other. They are connected by an interpretation process, which creates the interpreted world based on the external world; a focusing process, which allows the designer to create actions based on the interpreted world in order to reach the goals of the expected world; and an action process, which is the result of these connections and is responsible for changing the external world in order to achieve the predefined objective [35].
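The core FBS loop, comparing the expected behavior with the behavior derived from the structure, can be encoded in a few lines. The attribute names and the toy "physics" used to derive behavior are our own illustrative assumptions, not part of Gero and Kannengiesser's formulation.

```python
from dataclasses import dataclass

@dataclass
class FBSDesign:
    """Minimal encoding of a function-behavior-structure view of a design."""
    function: str            # F: what the object is for
    expected_behavior: dict  # Be: what it should do
    structure: dict          # S: what the object is

def behavior_from_structure(structure):
    """Bs: derive behavior from structure (toy model: flow = area * velocity)."""
    return {"flow": structure["area"] * structure["velocity"]}

def evaluate(design: FBSDesign):
    """Compare expected behavior (Be) with derived behavior (Bs);
    a nonzero gap signals that the structure must be reformulated."""
    derived = behavior_from_structure(design.structure)
    return {key: abs(derived[key] - value)
            for key, value in design.expected_behavior.items()}
```

In FBS terms, a nonzero result of `evaluate` is what drives the reformulation steps of the design process.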
In order to deeply define the design of a final product or process, it is important to consider the concept of Virtual Commissioning (VC). This is the verification and validation of the virtual model against the real one [36]. VC can be carried out during the development of the physical system, and it is helpful to define the real commissioning.

Materials and Methods
The more realistic the information, the more enforceable the framework based on it. Accordingly, for the current research, interviews with fourteen experts in the Digital Twin area were organized. These specialists are aware of all aspects and challenges of implementing Digital Twin technology, and since their information is highly valuable and applicable, the Mayring content analysis approach was applied to their responses to extract meaningful information. The goal of the interviews was strictly related to the research question: "For the creative phase of implementing Digital Twin as the first step, what framework should be adopted?" The aims of this research are achievable through the personal experience of experts [37]; therefore, in order to answer this question, interviews with experts in the field of Digital Twin implementation were organized, because the literature does not cover all aspects of the creative phase of Digital Twin technology. Moreover, in order to interpret the collected data accurately and categorize it precisely, qualitative content analysis was used throughout this research, since it has proven its efficiency over the quantitative method for this purpose. In particular, qualitative content analysis using Mayring's inductive category development approach was adopted to analyze the interviews, because classical quantitative content analysis cannot provide clear answers on how the categories are defined and developed [38].
The qualitative approach, by contrast, has proven its capability in content analysis, the interpretation of data and the development of categories that stay highly similar to the original content/material. One of the procedures developed within qualitative content analysis is inductive category development, which is based on a group of reductive processes applied to the original text in order to develop and extract the final categories [38].
The inductive approach was used in order to create a general framework from specific cases. The presentation of new information in a real scenario provides the possibility to easily link new knowledge with an existing cognitive structure [39]. The reason behind adopting an inductive approach is that Digital Twin is a quite new technology, so there are different points of view about certain concepts of this technology. Using a deductive approach (from a general theory to a specific case) was not feasible, because it is not compatible with the purpose of the research, which is related to designing a general framework for the creation of a Digital Twin. The inductive approach required analyses about state-of-the-art Digital Twin technology. It was a fundamental step in order to have a wider view of this topic and to be more specific about the inductive research [39].
Moreover, the qualitative content analysis based on the Mayring approach was used to carry out a step-by-step systematic text analysis and interpret texts within any kind of recorded communication, such as interviews, by dividing the text into content analytical units, in order to be able to develop the appropriate categories [38].
The questionnaire was strictly designed around the research gap in the creation phase of Digital Twins. The answer to the question of how a Digital Twin implementation framework should be set up in the first stages of Digital Twin creation, "considering the data collection process", is the foundation of the proposed framework. It was designed according to the hypotheses and reflections that emerged during the literature review. In order to eliminate redundancy and make the answers more efficient, only three questions were presented in the interviews. These questions are as follows:

1. There are two main scenarios for the creation of a Digital Twin: the first is when we have a physical system and the second is when we do not. What is the difference in the framework for Digital Twin data collection between these two scenarios?

2. Data is the fuel of Digital Twin. Therefore, data collection systems need to be implemented in order to enable an efficient data flow. How do companies choose the right systems for Digital Twin data collection? Which are the main systems? (e.g., sensor-based systems, machine vision systems).

3. One of the main challenges in Digital Twin data collection is the diversity of data coming from the physical world (e.g., structure, size, frequency, noise). How can different types of data be implemented into a single Digital Twin system? (e.g., lake databases).
After defining the interview questions, 14 interviews were conducted, with an average duration of 26 min. These interviews were recorded and stored for analysis. The software application used for the interviews was Skype, which provides high-quality video and audio interviews [40].
The next step after recording the responses was the analysis, that is, extracting the meaningful insights and directions. It was structured following the Mayring inductive approach to qualitative content analysis, whose main steps are as follows:

1. Definition of the material, which is the interview text in our case.

2. Determination of the units of analysis.

3.1. Determination of the envisaged level of abstraction; generalization of paraphrases below this level of abstraction.

3.2. First reduction through selection; erasure of semantically identical paraphrases.

3.3. Second reduction through bundling, construction and integration of paraphrases at the envisaged level of abstraction.

4. Generalization and collection of the new statements as a category system.

5. Re-testing of the new statements as a category system.
Therefore, the development of the categories was a step-by-step systematic qualitative text analysis process that was performed during the text interpretation using an inductive approach (Mayring approach).
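The reduction steps above can be illustrated with a minimal sketch. The paraphrases, the generalization mapping and the resulting category names are entirely hypothetical; in the study, these steps were performed manually by the researcher, not by software.

```python
# Illustrative sketch of Mayring-style inductive reduction (hypothetical data).
# Step 3.2: first reduction by erasing semantically identical paraphrases.
# Steps 3.3-4: second reduction by bundling paraphrases into generalizations
#              and collecting them into a category system.

from collections import defaultdict

paraphrases = [
    "start from the purpose of the digital twin",
    "Start from the purpose of the Digital Twin",   # semantically identical
    "choose sensors based on the data you need",
    "pick sensors according to required data",
]

# Hypothetical mapping from paraphrase to a more abstract generalization.
generalize = {
    "start from the purpose of the digital twin": "purpose-driven approach",
    "choose sensors based on the data you need": "data-driven sensor selection",
    "pick sensors according to required data": "data-driven sensor selection",
}

def first_reduction(items):
    """Erase duplicates after normalizing case and whitespace (step 3.2)."""
    seen, kept = set(), []
    for p in items:
        key = " ".join(p.lower().split())
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept

def second_reduction(items):
    """Bundle paraphrases under their generalizations (steps 3.3-4)."""
    categories = defaultdict(list)
    for p in items:
        categories[generalize.get(p, "uncategorized")].append(p)
    return dict(categories)

categories = second_reduction(first_reduction(paraphrases))
print(sorted(categories))  # the category system to be re-tested (step 5)
```

The point of the sketch is the shape of the pipeline: raw text shrinks to paraphrases, duplicates are erased, and the survivors are bundled into a small category system that is then re-tested against the original material.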
In more detail, the identification of the right Digital Twin experts was one of the main challenges, and it is a critical step for the quality of qualitative research. Choosing the right interviewees was a step-by-step process. The search for these experts was carried out using the network of researchers that emerged from the academic papers reviewed for the theoretical background. A message was composed that described who the researcher is, what the researcher is doing and the purpose of this research; it was sent to the e-mail addresses of this network of researchers, whose contacts were available online or directly in the research papers. Social media platforms such as LinkedIn and Twitter were used to discover further experts and Digital Twin researchers. This research comprises fourteen qualitative interviews, and each one was preceded by a phone call in order to verify that the respondent had the background required by the interview questions. This was a crucial step to ensure the quality of the answers.
After recording the interviews, the first action was the transcription of the fourteen interviews into text, initially divided according to the three questions. The transcription was made by the researcher following the clean read (smooth verbatim) transcription method described in the Mayring approach [38]. This method retains all the words of the respondents but leaves out non-verbal utterances. This transcription rule system was chosen based on the requirements of the research question: all the information provided by the experts can be useful for improvements and for the creation of the framework.
After choosing the transcription method, the next step was the analysis of the text content. The analysis started by dividing the whole text of each interview into segments, called units, at three levels. The first was the coding unit, which represents the smallest portion of text that can fall into one category. The second was the context unit, which represents the largest portion of text that can fall into one category. The third was the recording unit, which determines the text portions confronted with the system of categories. This classification was performed inductively during the first round of reduction of the entire text, meaning that the different units were defined throughout the content analysis process.
The original text was reduced into paraphrases, these paraphrases were generalized, and finally the generalizations were classified into categories. The coding unit was around 40 words, because there were cases where the text was so clear and valuable that it had to be highlighted as a category on its own.
The context unit was the answer to a question, which comprises approximately 180 words for the shortest answers (e.g., the answers to question number three). This choice was based on the fact that some interviewees had a very clear vision of the Digital Twin and delivered a meaningful, linear message in response to a specific question; such answers did not need anything more to be added and formed a category by themselves. A context unit larger than this would have made the level of detail too general and out of line with the research question.
The recording unit was the same as the context unit, for the same reason: some answers were so clear and logical that they could be confronted with one system of categories as a whole. It never happened that a shorter piece of text had to be confronted with another system of categories.
The reduction process was performed twice, because the amount of information was huge and the first reduction was not enough to create balanced categories. The second reduction took the output of the first reduction, i.e., the first categories, as its input text; these were reduced into paraphrases, then generalized, and finally the generalizations fell into five categories. These five categories are the pillars of the framework. The quality of the final results was ensured by repeating the content analysis. This process led to the categories addressed in the next section. In Table 1, an example of how the Mayring approach was applied to the research analysis is provided.

Results
The results of the qualitative analysis and the Mayring approach are classified into five categories: Digital Twin definition and classifications; Digital Twin framework; Data collection process; Data diversity problem; and the analysis of Scenario 1 ("DT based on an existing physical object") and Scenario 2 ("no physical object"). These categories align with the research question, which asks how a Digital Twin implementation framework should be set up in the first stages of Digital Twin creation, "considering the data collection process". All these categories have to take into account the aspects of the creation phase of the Digital Twin. The level of detail differs between categories in some cases, because what emerged from the interviews is that it is important both to have a wide awareness of the process (a low level of detail) and to have more precise analyses of specific parts of the Digital Twin.
All the following considerations are based on the information that emerged from the 14 interviews with Digital Twin experts. Based on the collected answers, we were able to define four main concepts that need to be covered by any framework for Digital Twin implementation in the creation phase: a purpose-driven approach, an agile and step-by-step approach, trade-offs throughout the framework, and the feedback loop. Each of these concepts is discussed in the following sections.

Digital Twin Framework
As mentioned above, this section defines the answers to the research question. It contains the four important concepts, which emerged from the qualitative interviews, that a Digital Twin framework needs to be able to cover.

Purpose-Driven Approach
The main concept, underlined by almost all interviewees, concerns the purpose of the Digital Twin: all the decisions made during every phase of the framework must be consistent with the overall purpose of the Digital Twin. At this point, the questions are "How should a company set the purpose of the Digital Twin?" and "Is a Digital Twin the right investment/solution for the considered business?" According to the insights obtained through the analysis, companies should start from market needs. Once companies identify the market needs they want to satisfy, it is critically important to consider the possible Digital Twin directions based on the digital capabilities required for those specific market needs. This means that companies need to take into consideration both the market and the digital system needs for the data collection process, in order to collect the right data related to their previously defined purpose.

Agile and Step-by-Step Approach
Digital Twin creation is a long and time-consuming process. What emerged from the interviews is that an agile approach is necessary for the correct creation of a Digital Twin, because an agile approach is continuously prepared for changes throughout the process in order to improve the actual situation [41]. Another concept that emerged from the interviews is the step-by-step method, whose benefit is that it allows for a meaningful digital representation. All the interviewees believe that everything starts from a raw representation of the ideal physical system, which can be created digitally using simulation software and different data sources [42]. Note that, in order to ensure a decent quality level for this first representation, awareness and knowledge of the physical system are required.

Trade-Offs through the Framework
The Digital Twin framework involves various possible scenarios, and in some cases there is no strictly right or wrong decision. A common approach noted by interviewees in such situations is making trade-offs. The pros and cons of each option depend entirely on the company's priorities, which can be attributed to three main entities: the data storage system (cloud storage vs. on-premise storage), the sensors (value received vs. costs) and the data collection system (machine vision system vs. sensor-based system).
Some interviewees considered these data analyses to be outside the company's domain: what the company must do is understand the purpose of the Digital Twin and the data needed, while the technical parts should be developed by external partners. This is a key part of the interview results because it adds a certain level of detail to the framework. It shows that companies do not face a technology problem, but rather a value- and purpose-definition problem, in the process of implementing the Digital Twin. Therefore, companies need to focus on the purpose of extracting value-added data instead of thinking about and investing in the technical solutions to extract and analyze this data, because nowadays technical solutions can easily be obtained from external partners.

Feedback Loop
Feedback is the ability to continuously extract and feed data to and from the system in a closed loop, in order to improve the digital representation.
Continuous improvement of the digital representation relies on feedback loops. The procedure of the feedback loop is as follows: in the first step, a Digital Twin data workflow is simulated; in the second step, the deviation between the physical and digital systems is calculated; and in the last step, according to the deviation, a new possible variable is proposed, after which all these steps are repeated.
Two important points noted by interviewees should be considered in the feedback loop. The first relates to deviation: a deviation should be considered only if it could affect the purpose of the Digital Twin. If the predefined purpose is achieved, the value received from the Digital Twin is already at its highest level, so a deviation between the physical and the digital systems does not matter. As claimed by the interviewees, physical changes and digital changes are the two main causes of deviation between the systems, so it is important to search for the problem in both the physical and the digital system to identify where it lies. The second point for improving the feedback loop concerns including variables related to the purpose. This means that the level of detail of the data analysis can be increased or decreased based on the purpose defined by the company. In other words, if the Digital Twin only needs to simulate the prototype of the object (in the pre-design phase of the object's lifecycle), a basic data analysis is sufficient; if, instead, the purpose of the Digital Twin is to perform a bi-directional aggregation of the data and to mirror performance in real time, the level of detail of the data analysis increases during the Digital Twin lifecycle in order to support more sophisticated analyses related to the purpose.
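The three-step loop described above (simulate, measure the deviation, propose a new variable and repeat) can be sketched in a few lines. The simulation model, the gain parameter, the tolerance value and the adjustment rule below are all hypothetical; the only part taken from the interviews is the logic that a deviation triggers a correction only when it exceeds a purpose-defined tolerance.

```python
# Illustrative feedback-loop sketch: simulate, measure the deviation between
# the physical reading and the digital prediction, and react only when the
# deviation threatens the defined purpose. All values are hypothetical.

def simulate(digital_state):
    """Stand-in for the Digital Twin data-workflow simulation (step 1)."""
    return digital_state["gain"] * digital_state["input"]

def feedback_loop(physical_reading, digital_state, purpose_tolerance, max_iters=10):
    deviation = physical_reading - simulate(digital_state)
    for _ in range(max_iters):
        predicted = simulate(digital_state)
        deviation = physical_reading - predicted  # step 2
        # A deviation matters only if it affects the purpose (interview insight).
        if abs(deviation) <= purpose_tolerance:
            break
        # Step 3: propose a new parameter value, then repeat the loop.
        digital_state["gain"] += deviation / digital_state["input"]
    return digital_state, deviation

state, dev = feedback_loop(physical_reading=10.0,
                           digital_state={"gain": 1.0, "input": 8.0},
                           purpose_tolerance=0.1)
print(round(state["gain"], 3), round(dev, 3))
```

The key design choice, echoing the first interview point, is the `purpose_tolerance` guard: the loop stops correcting as soon as the residual deviation no longer affects the purpose, rather than chasing a perfect match between the physical and digital systems.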

Framework for the Creation of a Digital Twin
After carrying out this research and obtaining the results discussed above, it is important to have a structured theoretical framework for the implementation of the Digital Twin in the creative phase. The proposed framework is discussed in further detail in this section.
The goal of this theoretical framework (Figure 1) is the creation of the Digital Twin, and it includes all the concepts that emerged from the interviews. As shown in Figure 1, the cornerstone of the proposed framework for the creation phase of the Digital Twin is the purpose: it affects all subsequent components and needs to be precisely identified. According to Figure 2, the purpose can be identified from different points of view: identifying market needs, creating new business assets and solving problems throughout processes are all dimensions through which purposes can be defined. Digital Twin implementation requires a considerable workforce across the processes. To properly organize all the components and the interactions among the various entities, the identified purpose needs to be analyzed from two aspects, the time horizon (long, medium, short) and the decision level (strategic, tactical, operational), with continuous checking of the correctness of the purpose. In order to achieve the purpose, it is important to have awareness of the process to be digitally reproduced. According to Figure 3, achieving this awareness depends on two prerequisites: acquiring knowledge about the process workflow and analyzing the required data. The latter should undergo the minimum viable concept, in order to hierarchically classify the importance of the data that the system could reproduce. The resulting minimum viable dataset includes the data required to enable a first raw digital representation. In this phase, the company starts considering hypotheses about the structure of the digital system; the agile approach and a step-by-step vision come into play to evaluate the different variables needed for the digital representation. Three data sources, human knowledge, IT systems and sensors, support this phase.
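The minimum viable concept mentioned above can be sketched as a simple ranking exercise. The candidate data items, the importance scores and the threshold below are purely hypothetical; the sketch only illustrates the idea of hierarchically classifying data by importance and keeping the subset needed for a first raw representation.

```python
# Illustrative sketch of the "minimum viable dataset" idea: rank candidate
# data items by importance with respect to the purpose and keep only what
# is needed for a first raw representation. Items and scores are hypothetical.

candidates = {
    "injector_pressure": 0.9,   # importance score w.r.t. the purpose (0..1)
    "ambient_humidity": 0.2,
    "valve_position": 0.7,
    "cabinet_color": 0.0,
}

def minimum_viable_dataset(scored, threshold=0.5):
    """Hierarchically classify by importance and keep the top of the hierarchy."""
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]

print(minimum_viable_dataset(candidates))
```

In practice, the scores would come from the purpose analysis and from human knowledge of the process, not from a fixed table; the hierarchy, rather than the exact numbers, is what drives the first raw digital representation.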
The company's role in implementing the Digital Twin consists of identifying the purpose, implementing the technologies throughout the processes and continuously improving the Digital Twin. The company that wants to create a Digital Twin should therefore leave the technological aspects to external partners that can offer the right solutions for the case at hand. The technological aspects demand time and cost, so by outsourcing this part, companies have more opportunity to focus on the rest of the Digital Twin. According to Figure 4, all the technical aspects, such as how to collect, store and transmit data, are developed by the technology partners.
The external partners provide the company with solutions regarding data collection methods, data storage methods, and software and platforms for data management. The platforms enable the transmission of the data, and here it is important to consider the compatibility of the new software with the existing software.
According to the interviewees' responses, a technology partner faces two main challenges in this phase: data diversity and cyber security. For the former, the interviewees underlined that it is better to look for a solution from the beginning, to prevent the problem from arising. A common solution is to standardize the data structures and communication layers in order to enable the linking of different data; nowadays, various kinds of software and technological solutions facilitate handling data diversity [43]. Cyber security is a real challenge that requires a lot of attention from the company: all kinds of processes and knowledge are digitized, and there is the possibility that they get hacked. Technology partners are therefore responsible for offering an appropriate cyber-security solution that ensures that the Digital Twin implementation will not endanger the company's systems. Once the partners are identified and the technological solutions are selected, the process can move forward and the technology solutions can be implemented. According to Figure 5, there are three main operations to which the company has to pay particular attention in order to organize them well. The first is choosing the right type and placement of sensors, which are the foundation of data collection. The second is establishing the proper way to ensure the expected quality level and timing requirements for data transmission. The last, but not least, is training the workers, who need to learn how to work with the new technology; without training, interaction with the system will be confusing for them, and incorrect actions could cause trouble for the system.
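The standardization of data structures mentioned above can be illustrated with a small sketch that maps two heterogeneous sources onto one common record schema. The field names, units and source payload formats below are assumptions for illustration only, not part of the study.

```python
# Illustrative sketch: standardizing heterogeneous source data into one
# common record schema before it enters the Digital Twin data layer.
# Field names, units and the two raw payload formats are hypothetical.

from dataclasses import dataclass

@dataclass
class TwinRecord:
    source: str        # where the data came from
    signal: str        # what is being measured
    value: float       # numeric value in a canonical unit (here: Celsius)
    timestamp: float   # epoch seconds

def from_sensor(raw: dict) -> TwinRecord:
    """Sensor payloads arrive as {'t': ..., 'temp_c': ...} (assumed format)."""
    return TwinRecord("sensor", "temperature", float(raw["temp_c"]), float(raw["t"]))

def from_vision(raw: dict) -> TwinRecord:
    """Machine-vision output uses different keys and Fahrenheit (assumed)."""
    celsius = (float(raw["reading_f"]) - 32.0) * 5.0 / 9.0
    return TwinRecord("vision", "temperature", celsius, float(raw["ts"]))

records = [
    from_sensor({"t": 1_700_000_000, "temp_c": 21.5}),
    from_vision({"ts": 1_700_000_060, "reading_f": 70.7}),
]
# Both rows now share one schema, so a single store or data lake can hold them.
print([round(r.value, 2) for r in records])
```

Once every source is converted at the boundary, the rest of the Digital Twin pipeline only ever sees one schema, which is the practical payoff of standardizing structures and communication layers from the beginning.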
Interviewees emphasized that there are pivotal principles that must be followed in technology implementation. Data is the fuel of the Digital Twin system, and data loss is an inevitable problem in this regard; it is therefore better to employ standards throughout the system from the beginning to avoid problems. Digital Twin systems can take advantage of a step-by-step approach and establish a balance between the overview of the system and operational actions, so as to make the process standardized, connected and integrated from the beginning of the implementation until the end.
The next step, according to Figure 6, is the Digital Twin representation. At this stage, the entire infrastructure for collecting, storing and managing the data is in place, so the company can start with the first raw representation of the Digital Twin. This interpretation reveals what is going on in the system. In order to obtain a realistic interpretation of the system, the helpful data must be selected and interpreted by relying on human knowledge. From the interviews, it emerged that in some cases finding the right level of detail is not easy, and in complex systems it can be useful to divide the overall system into smaller ones. This approach enables more precise results, with the outputs of the small systems generating the overall Digital Twin. The selection of the right data is important both for the digital representation and for the continuous improvement in the feedback loop. The use of human knowledge allows the data to be exploited to obtain an efficient digital representation; the training of the employees in using the software and the new technologies is an activity of the previous phase.
Figure 7 indicates the feedback loop as the prime part of the Digital Twin. This loop is extremely important because it directs the continuous improvement of the Digital Twin. In the first stages of Digital Twin creation, the first representations are raw and imprecise. Using this loop, the company can identify the deviation between the measured value and the expected value and analyze where the problem lies. After these analyses, the company can modify and improve the performance of its physical products based on the real-time representation of the collected data. The use of visual and design software, such as CAD, is important for gaining an overview of how the system works and which data the company needs in order to achieve the purpose.
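The decomposition of a complex system into smaller twins, mentioned above, can be sketched as a simple composition: each sub-twin produces its own partial view, and the overall Digital Twin merges them. The subsystems, field names and values below are hypothetical placeholders.

```python
# Illustrative sketch: decomposing a complex system into smaller sub-twins
# whose outputs are merged into the overall Digital Twin view.
# The subsystems and their state values are hypothetical.

def pump_twin():
    """Sub-twin for an assumed pump subsystem."""
    return {"pump_flow_lpm": 120.0}

def motor_twin():
    """Sub-twin for an assumed motor subsystem."""
    return {"motor_rpm": 1450.0}

def overall_twin(subsystems):
    """Merge the sub-twin outputs into one overall representation."""
    view = {}
    for sub in subsystems:
        view.update(sub())
    return view

view = overall_twin([pump_twin, motor_twin])
print(sorted(view))
```

The benefit of this structure, as the interviewees observed, is that each small system can be modeled at its own appropriate level of detail, while the overall Digital Twin is simply the combination of the validated parts.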

Discussion
Digitalization is an intelligent response by business owners to the rapid changes resulting from technological development, and the Digital Twin is one of the leading technologies that enables companies to align themselves with the competitive flow of the market. In order to take full advantage of the Digital Twin, the pivotal step of the creation phase should be precisely organized. Since the influence of this step on the final performance of the Digital Twin is undeniable, the main motivation behind this research work was to identify the influential entities and organize them into an integrated framework. A qualitative empirical method formed the basis of the adopted methodology: interviews with fourteen specialists in the Digital Twin area were performed, and the Mayring content analysis approach was applied to their responses to extract helpful information. The questionnaire was founded on the question: "How should a Digital Twin implementation framework in the first stages of the Digital Twin creation be set up, 'considering the data collection process'?" Analyzing the interviewees' responses revealed striking insights related to the successful implementation of a Digital Twin. Based on a careful and deep understanding of these responses, a comprehensive framework comprising five main phases was designed. This framework provides a guideline for implementing the creation phase of the Digital Twin. The implementation of each phase faces challenges, and in order to overcome them, the relevant dimensions need to be properly evaluated and defined. One striking advantage of the proposed framework is that the possible challenges were identified via the interviews, together with appropriate responses to them.
In this regard, users can address a challenge before it occurs, which saves time and costs in the long run.
One of the main philosophical aspects that emerged is that failures in the implementation of digital technologies, such as the Digital Twin, are due to mistakes in the prioritization of activities: you cannot look for a solution if you do not know the problem. This behavior originates in the IT industry, which has to sell products and therefore advises standard solutions for non-standard problems. As one of the experts said: "The biggest reason why innovations fail is because people jump into solutions too quickly, without first analyzing the problem they want to solve. It is necessary to analyze the problem deeply before jumping into solutions".
There are several small- and medium-sized enterprises (SMEs) that need to make their first move into digitalization. The Digital Twin not only continuously improves the system, but also provides an exceptional opportunity for business owners to learn about possible scenarios that could lead to the development of existing businesses. An increased profit margin stemming from continuous improvement, plus additional profit originating from business development, are two distinguished benefits of Digital Twin implementation.
Implementing the proposed framework in real-life practical cases, such as the manufacturing industry, can help managers and decision-makers prioritize their solutions and improve the quality of products/services and the performance of their processes, especially the maintenance process. As mentioned above, this framework focuses on real-time data analysis and continuous feedback improvements to the system, which is considered one of the main challenges of implementing Digital Twin technology. Therefore, the right implementation of the creative phase through this proposed integrated framework will improve the performance of manufacturing companies through the right data collection and analysis feedback loops proposed in the framework.
Some viewpoints were not covered by the current research study and could be considered and evaluated in future research work. The first concerns the boundary of the proposed framework and its internal feedback loop: the framework starts from the purpose of implementing the Digital Twin and ends with the feedback loop about the performance of the products reflected by this Digital Twin (starting from the pre-design phase of the product and ending in the service phase). Future studies could go through the disposal phase of the product lifecycle (after the product's life ends) and develop a more detailed framework. The second relates to the technological nature of the system: since the technological work was outsourced to technology partners, the companies did not carry out all the technological work by themselves. Future work could consider companies that need to keep all the technological aspects within their own boundaries, without any outsourcing. Some businesses implement a network of Digital Twins, more than one, for different purposes and different objectives; the third research direction could therefore be to investigate the practical implementation of this framework in businesses with more than one Digital Twin, and to improve it to consider the connection of different Digital Twins within the same business use case. The fourth concerns cyber security, which is one of the main challenges in the implementation of such a technology: future research could analyze the implementation of this framework in a secure way, taking into consideration a security audit for data encryption and the identification of different users and devices. The last, but not least, relates to the influence of government on Digital Twin implementation.
In fact, government regulations for technological implementations differ from one country to another, and each regulation imposes a particular kind of burden on its users. The Digital Twin requires future research, and the current research is just a step toward the next considerations, analyses and extensions.

Acknowledgments:
We have started the implementation of this framework within the boundaries of a startup in the maritime industry called Digital-OMT. It is an innovative and agile start-up, spun off from OMT Spa in 2018, that provides tailor-made digital solutions enabling value creation through data analytics, with a particular focus on the marine transport and propulsion markets. Its vision is to provide services based on its products, and it is based in Turin, Italy. It has created an intelligent injector able to communicate its operative characteristics to a local processing unit, which performs fast data analytics and provides immediate feedback to the engine control unit and to the engine room crew, as well as transmitting the processed data to cloud-based storage for further analysis and knowledge generation (Marco C. & Marco F., 2019). The literature shows some differences in the development of Digital Twins between the manufacturing domain and the maritime domain; research and interest in the maritime domain are still new, while a lot of research has focused on the manufacturing domain over the past few years (Nicole T. et al., 2020). This startup was chosen for this research because its aim is to support its customers' digital evolution, which is equivalent to the aim of implementing the Digital Twin. Starting from the purpose of implementing Digital Twin technology within its processes, it turned out that its purpose is to facilitate the design of injector prototypes, to validate certain decisions about the system and to identify possible problems or risks. For this, it does not need an intelligent Digital Twin but a basic one, at the pre-digital-twin level: it is using the Digital Twin as a simulation tool to support the design process of the injector before the actual physical model exists.
This means that there is no data acquisition from the physical twin at the moment; however, Digital-OMT mentioned that, in the near future, it plans to invest more in Digital Twin technology and to move toward the implementation of a higher-level Digital Twin, such as an intelligent Digital Twin, in order to improve after-sales services and to perform adaptive and condition-based maintenance. Therefore, the Digital Twin framework produced by this research could not yet be fully implemented and investigated in a real industrial case such as Digital-OMT, for the reasons discussed above.

Conflicts of Interest:
The authors declare no conflict of interest.