Augmented Reality and the Digital Twin: State-of-the-Art and Perspectives for Cybersecurity

The rapid advancements of technology related to the Internet of Things and Cyber-Physical Systems mark an ongoing industrial revolution. Digital Twins (DT) and Augmented Reality (AR) play a significant role in this technological advancement. They are highly complementary concepts, enabling the representation of physical assets in the digital space (Digital Twin) and the augmentation of physical space with digital information (Augmented Reality). Throughout the last few years, research has picked up on this and explored the possibilities of combining DT and AR. However, cybersecurity scholars have not yet paid much attention to this combined-arms approach, despite its potential. Especially concerning contemporary security challenges, such as developing cyber situational awareness and including human factors into cybersecurity, AR and DT offer tremendous potential for improvement. In this work, we systematize existing knowledge on AR-powered DTs and shed light on why and how cybersecurity could benefit from this combination.


Introduction
The Digital Twin (DT) is a key enabling technology in the digital transformation of industries to digitalize physical assets. Over the past few years, the demand for DT solutions has increased significantly, with its potential advantages attracting attention from aerospace to healthcare. The digital representation of physical assets often allows pursuing new business opportunities or gaining highly relevant new insights. Therefore, the DT market is expected to increase its volume from an estimated 3.1 billion USD in 2020 to 48.2 billion USD by 2026 [1]. In contrast to the relatively new DT trend, the concept of Augmented Reality (AR) has been around for several decades [2]. However, it is experiencing new attention since, for the first time, the respective technology is affordable and capable of leveraging AR's full potential [3]. Today, the available hardware to realize AR applications allows implementing new ways of human-machine interaction and information visualization in various use cases.
Both concepts, DT and AR, play a significant role in the currently ongoing industrial revolution [4,5]. Most interestingly, DT and AR are highly complementary concepts in this context. While the DT allows representing physical assets in the digital space, AR enables augmenting the physical space with digital information. Therefore, applying these two concepts in combination unlocks a new way of human-machine interaction and, especially, a new way of situational awareness [6]. They help to realize a seamless integration of users into the cyber-physical space. Thus, users' situational awareness is not limited to a digital visualization of relevant data but can be directly contextualized with the respective real-world objects. Digital data and physical data are blended, allowing users to be aware of the situation regarding both cyberspace-related aspects and the physical world [7]. In recent years, research has picked up on the combination of AR and DT, and a significant amount of work has been done to explore its possibilities.
Nevertheless, very little work has been done to unleash the potential of applying AR combined with a DT in the cybersecurity context. While the DT concept is becoming more prevalent in security research, its application is mainly limited to simulating attacks on physical assets or similar one-dimensional use cases [4]. However, the rise of Cyber-Physical Systems (CPS) and the Internet of Things (IoT), with an accompanying increase of DT applications, requires using the DT for cybersecurity, as well as securing the DTs themselves [8]. Although many automated approaches for detecting and preventing security incidents exist, experts remain essential to make final decisions and interpret results. In addition to their domain knowledge, situational awareness plays an essential role in this decision-making process [9]. The use of AR for DT offers excellent opportunities to sharpen the situational awareness of security professionals through the direct connection of real-world objects with cyberspace. However, security research currently does not seem to be fully aware of this potential and is, thus, lacking a clear vision and perspective to leverage the potential slumbering in the combined-arms approach of AR and DT.
With our work, we make a first two-fold contribution to this problem. To develop a tenable perspective on the combined use of AR and DT in cybersecurity, we first capture the current state of research on the AR/DT combination through a structured literature review (SLR). In the second part of the paper, we leverage the insights gained in the SLR to develop a vision of how the AR/DT combination can contribute to improving an enterprise's security posture.
The remainder of this work is structured as follows. Section 2 briefly introduces background knowledge on the concepts of AR and DT. Throughout Section 3, we describe a structured literature analysis regarding the applied methodology, as well as both relevant quantitative and qualitative results. We follow this up in Section 4 with perspectives for cybersecurity when the combination of AR and DT is applied, resulting benefits, possible application areas, an architectural blueprint, and aspects on securing related technology. Section 5 concludes our work with a summary and possible directions for future research.

Background
This section lays the foundations of the two main concepts. Section 2.1 provides the background on AR; afterwards, the foundations on DTs are explained in Section 2.2.

Augmented Reality
AR is a concept and, by now, a technology that augments real-world objects with additional digital information. This augmentation of reality is the main benefit of AR as it allows additional information to be added to a user's perception. AR moves the simple digital display of data towards a more immersive, contextualized experience in the real world. Examples of this technology can now be found in many areas. However, applications are always based on the principle that information is superimposed on users' reality with the corresponding end devices. Thus, users perceive both reality and the additional information. For example, navigation information is mapped directly onto the environment, or augmented assembly instructions directly highlight the next required component and placement.
The concept of AR is very closely related to Virtual Reality (VR). Both terms describe some form of perceptual combination of the virtual world and the real world. However, they are clearly differentiated by the Reality-Virtuality Continuum by Milgram et al. [10], depicted in Figure 1. In VR, a user's perception is entirely dominated by virtual, computer-generated information. Users are immersed in a virtual environment and do not perceive the real world around them. Thus, VR is located on the virtuality end of Milgram's continuum. In contrast, AR is located towards the other, real-world end of the continuum. It superimposes virtual information onto the real world, so that users perceive a combination of both the virtual and the real world. This immersion is achieved through three main functional features (FF) of AR [11,12]:
• FF 1: Combining the virtual and the real world in a single perceptual space.
• FF 2: Allowing users to interact with this perceptual space in real-time.
• FF 3: Registering virtual and real objects accurately in a 3D space.
Initial ideas and concepts for AR have been around for more than five decades. In 1968, Ivan Sutherland created the first Head-Mounted Display (HMD) to let users explore simple wire-frame drawings [2]. AR's most common understanding in today's world is the virtual, transparent visualization of information superimposed on the real world [13]. Enhancing the real world with additional computer-generated information is not limited to visual perception but can span other sensory modalities, such as auditory, somatosensory, or haptic [14]. However, these approaches are less common.
There are various hardware solutions for realizing AR. They are categorized into five main types [15]: Head-Mounted Displays (HMD), Hand-Held Displays (HHD), projectors, 2D smart glasses, and monitors. An example of a common HHD is any modern smartphone. Several respective AR applications for smartphones have been trending and thriving in recent years [16]. The most advanced HMD is the HoloLens built by Microsoft [17]. In addition to these examples, many other technical approaches exist. A device can be explicitly characterized as AR hardware based on the following four criteria:
• Mobility: Allow users to be mobile and not fixed to a specific space.
• 3D-space: Spatially register the 3D-space in which it is used.
• Hands-free: Allow hands-free operation of, and interaction with, the hardware.
• Real-time: Transmit data in real-time.
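The four criteria can be read as a simple checklist. The following minimal Python sketch illustrates this reading; the `Device` class, its field names, and the two example profiles are assumptions made for illustration only. The only property taken from the literature discussed later in this work is that HMDs can fulfill all four criteria, while hand-held displays miss the hands-free criterion.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical capability profile of a display device."""
    name: str
    mobile: bool              # Mobility: user is not fixed to a specific space
    registers_3d_space: bool  # 3D-space: spatially registers its surroundings
    hands_free: bool          # Hands-free: operable without holding the device
    real_time: bool           # Real-time: transmits data in real-time


def unmet_criteria(device: Device) -> list[str]:
    """Return the names of the AR hardware criteria the device does not meet."""
    checks = {
        "Mobility": device.mobile,
        "3D-space": device.registers_3d_space,
        "Hands-free": device.hands_free,
        "Real-time": device.real_time,
    }
    return [name for name, fulfilled in checks.items() if not fulfilled]


def is_ar_hardware(device: Device) -> bool:
    """A device qualifies only if all four criteria are met."""
    return not unmet_criteria(device)


# Illustrative profiles (assumed values, not measurements).
hmd = Device("head-mounted display", True, True, True, True)
hhd = Device("hand-held display", True, True, False, True)

print(is_ar_hardware(hmd), unmet_criteria(hhd))
```

Returning the list of unmet criteria, rather than a single Boolean, makes it visible why a device such as a smartphone is a trade-off solution rather than full AR hardware in the sense of these criteria.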
AR technology has been explored and applied for many different use cases throughout recent years and is considered one of the key enabling technologies for Industry 4.0 [15,18]. Another key technology for the industry's future is the Digital Twin, which we introduce in the following section.

Digital Twin
Similar to AR, the DT is a concept that, in different forms, has been known for quite some time. The first visions of systems that are today called Digital Twins were presented as early as 2002 [19]. However, only over the last five years have DTs gained increasing interest throughout several research and industry domains. The reasons for this renewed interest in the DT are similar to those leading to the rise of AR: the technology needed to implement a DT is now, for the first time, mature enough to render DTs possible at all. However, a wide variety of existing challenges in the DT realm highlight that the concept itself can still be considered to be in its infancy [20].
The DT is a concept that applies in many different fields [21]. Owing to these vast application domains [22], definitions of the term itself vary widely. Most generically, a DT can be regarded as a virtual representation of a real-world counterpart along its entire lifecycle. The dominant DT features are its sole focus on one asset (i.e., the real-world counterpart, or physical twin), its purely virtual embodiment, and the novel insights [23] the concept can provide. One of the primary purposes of DTs is, thus, to monitor [24] and manage real-world assets.
Besides, several related terms exist which describe similar aspects and are erroneously used as synonyms to the term "Digital Twin". For example, the term "digital shadow" does not comprise the entire lifecycle of a real-world asset but, rather, refers to a digital footprint. Therefore, throughout this work, we define a Digital Twin as an asset's virtual counterpart enabling an organization to monitor and analyze the asset during its entire lifecycle.
Novel insights can be achieved with the DT, thanks to its key characteristics presented in Figure 2. Data concerning the real-world counterpart is enhanced with semantic technologies in order to represent the counterpart virtually [25], e.g., by creating an emulation or model. On this basis, simulations, analyses, etc., are conducted [26]. These, in turn, provide deep insights into the real-world counterpart's present state and potential future.

[Figure 2. Labels: Real-world counterpart (physical); Data enhanced with semantics; Virtual representation; Simulation, analyses, etc. (virtual).]

Literature Review and State-of-the-Art
To synthesize existing research at the intersection of DT and AR, we carry out a literature analysis following an Information Systems Research methodology [27] and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (cf. Figure 3) [28,29]. These systematic review techniques allow us to select and review relevant research literature in order to analyze the results presented there. Section 3.1 describes our process to identify relevant literature, while Section 3.2 gives a brief insight into the quantitative analysis of this literature. Section 3.3 summarizes the main qualitative results that can be drawn from the identified literature.

Methodology
We describe the applied methodology in detail to ensure reproducibility and transparency of the SLR. The search process is sequential and is applied to five databases covering a wide range of Information Systems literature. We design the SLR to reach a representative coverage of the research on AR combined with DT. Thus, we combine keyword, backward, and forward search to provide the foundation for the following knowledge systematization.

Search Terms
The first important step in the SLR is to identify an appropriate keyword string. For this purpose, we screened existing work on DT or AR to identify keywords and relevant synonyms. Especially in the context of the DT, several synonyms, such as "digital shadow" or "digital model", exist (see Section 2.2). However, an extensive study shows that the keyword "digital twin" is sufficient within an SLR to cover both DT approaches and approaches more related to one of the respective synonyms [30].
"Augmented reality", in contrast, is a well-used and established keyword and research term. Thus, we apply the rather simplistic search term ("digital twin" AND "augmented reality") to identify related research.

Information Sources
We apply this search term to the databases ACM Digital Library (DL), IEEE Xplore, Springer Link, ScienceDirect, and Web of Science (WoS). These information sources are chosen because they cover a wide body of peer-reviewed research on information systems, as well as computer science. To acquire a comprehensive collection of the relevant research, we apply the query ("digital twin" AND "augmented reality") to the title, abstract, and key (index) terms within these information sources. In addition, we include Google Scholar, although its relevance as a scientific database is debated. Google Scholar nevertheless provides a good, holistic overview of existing research. Thus, we apply our search term to this search engine in a final step to ensure that we do not miss any relevant research.
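To make the semantics of the query explicit, the following Python sketch shows one way the conjunction could be evaluated against a bibliographic record. The record layout (a dictionary with `title`, `abstract`, and `keywords`) and the function name are assumptions for illustration; the actual databases implement their own search syntax and matching rules.

```python
from __future__ import annotations

# Both phrases must occur (logical AND) in the searched fields.
SEARCH_TERMS = ("digital twin", "augmented reality")


def matches_query(record: dict) -> bool:
    """Check whether title, abstract, or keywords together contain both phrases.

    Matching is a case-insensitive substring test, which is a simplification
    of what a real bibliographic database does.
    """
    searchable = " ".join(
        [
            record.get("title", ""),
            record.get("abstract", ""),
            " ".join(record.get("keywords", [])),
        ]
    ).lower()
    return all(term in searchable for term in SEARCH_TERMS)


# Hypothetical records, invented for this example.
hit = {
    "title": "A Digital Twin for shop-floor monitoring",
    "abstract": "We visualize twin data using Augmented Reality.",
    "keywords": ["industry 4.0"],
}
miss = {
    "title": "Digital twin architectures: a survey",
    "abstract": "We review modelling approaches.",
    "keywords": ["simulation"],
}

print(matches_query(hit), matches_query(miss))
```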

Eligibility Criteria
To ensure high-quality research on the topic of interest, we formulate and assert two vital inclusion criteria (IC) for relevant publications:
• IC1: Article is written in English.
• IC2: Article meets IC1 and is not a "buzzword paper", i.e., a paper that only mentions DT or AR in the title, abstract, or introduction.
The latter criterion was introduced during a first screening of the identified publications, as we came upon a significant issue: although the publications mention DT and AR, most of them do not actually conduct research related to those terms at their core. We label these publications "buzzword papers", as they often only mention DT or AR once in their title, abstract, or introduction and do not relate to these concepts in the work's main body. Thus, if a paper only mentions the relevant keywords (i.e., digital twin or augmented reality) within its title, abstract, or introduction but not in the remaining full text, we exclude this work from our further analysis.
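The two inclusion criteria can be expressed as a small filter. The sketch below is only an illustration of the rule: the `Paper` structure, the split into front matter and main body, and the decision to require both keywords in the main body are assumptions, and a plain substring test does not catch abbreviations such as "DT" or "AR". It does not describe how the screening was actually carried out.

```python
from __future__ import annotations

from dataclasses import dataclass

KEYWORDS = ("digital twin", "augmented reality")


@dataclass
class Paper:
    """Hypothetical, simplified view of a publication's full text."""
    language: str
    front_matter: str  # title, abstract, and introduction
    main_body: str     # remaining full text


def meets_ic1(paper: Paper) -> bool:
    """IC1: the article is written in English."""
    return paper.language.lower() == "english"


def is_buzzword_paper(paper: Paper) -> bool:
    """Assumed reading: a keyword that never appears in the main body
    marks the paper as a buzzword paper."""
    body = paper.main_body.lower()
    return not all(keyword in body for keyword in KEYWORDS)


def meets_ic2(paper: Paper) -> bool:
    """IC2: the article meets IC1 and is not a buzzword paper."""
    return meets_ic1(paper) and not is_buzzword_paper(paper)


# Invented examples.
core = Paper(
    "English",
    "An augmented reality interface for a digital twin",
    "The digital twin streams data to the augmented reality client.",
)
buzzword = Paper(
    "English",
    "Towards the digital twin and augmented reality in Industry 4.0",
    "We evaluate a scheduling heuristic for job shops.",
)

print(meets_ic2(core), meets_ic2(buzzword))
```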

Study Selection
Our study selection process is schematically depicted in Figure 3 and follows the four PRISMA phases:

1. Identification: First, for the period under consideration, the information sources are searched using the search term to identify potentially relevant publications (n = 435).
2. Duplicate Removal & Initial Screening: In the second step, we remove duplicates (n = 80) that occurred within or between databases and check the title, keywords, and abstract of the publications to remove those that are not relevant to our study (n = 260). The full texts of the remaining records (n = 95) are retrieved to assess their eligibility.
3. Eligibility Criteria Application: In the next step, the remaining publications (n = 91) are reviewed according to the eligibility criteria. At the end of this step, we obtain a comprehensive list of relevant scientific publications on the use of AR for DT (n = 32).
4. Backward & Forward Search: Finally, we review the reference lists of the relevant literature from Step 3 to identify papers that may have been previously overlooked (Backward Search) and screen papers that reference one of the relevant publications (Forward Search). These publications are also quality-checked according to Steps 2 and 3 of the process described here and included in the record collection if they are relevant (n = 1).
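The record counts reported in the steps above can be cross-checked with simple arithmetic. The following Python sketch recomputes the number of unique records, the number of full texts retrieved, and the final number of relevant publications; the variable names are chosen here for illustration.

```python
# Counts as reported in the selection steps.
identified = 435            # Step 1: records found via the search term
duplicates = 80             # Step 2: duplicates within or between databases
screened_out = 260          # Step 2: removed after title/keyword/abstract check
relevant_after_criteria = 32  # Step 3: publications meeting IC1 and IC2
added_by_search = 1         # Step 4: found via backward and forward search

unique_records = identified - duplicates
full_texts_retrieved = unique_records - screened_out
total_relevant = relevant_after_criteria + added_by_search

print(unique_records)        # 355
print(full_texts_retrieved)  # 95
print(total_relevant)        # 33
```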

Quantitative Results
This four-step selection process to determine relevant academic literature within the area of interest was carried out in June 2021. The search initially resulted in 355 unique publications related to the specified search term. After applying the quality criteria, 32 papers are labeled relevant to the literature analysis. Based on these papers, we perform a forward and a backward search. This additional search unveils one relevant publication. Therefore, a total of 33 relevant publications is identified (cf. Figure 3). The small number of papers identified through the last step of our selection process (i.e., Backward & Forward Search) confirms the comprehensiveness of the previous process steps.
The first publication we determine relevant was published in 2016. Over the following years, the number of published papers per year increases, as displayed in Figure 4. This trend seems to change with a smaller number of papers in 2020 and 2021. However, the number of relevant publications from 2021 is not yet representative, as we only include the first half of the year in our search. Thus, the final number of publications from 2021 is very likely to be higher than shown in Figure 4. Additionally, the small number of years for which we were able to identify papers does not allow for a robust interpretation of the seemingly decreasing number of publications. It could just as well be that a disproportionately high number of papers was published in 2019 and that the approximately linear growth of previous years continues in 2020 and 2021. However, due to the short time span for which data are available, such interpretations must be regarded as hypotheses that cannot currently be verified.
Even considering the initial 355 unique results before applying the inclusion criteria, no publications were published before 2014. This result can be interpreted from two perspectives. First, the research on combining DTs with AR is relatively young and was established around 2016. It is even possible to identify this field's origin, as Schröder et al.'s work [5] is the first identified relevant publication and is cited by many of the other papers. Second, identifying only publications from 2016 onward can be related to the rise of the DT concept, which dates back to this time.
Therefore, there may be additional, relevant academic work from before 2016 that tackled the combination of DTs with AR without using the term "Digital Twin" or any of the often synonymously used terms. This possible bias needs to be addressed in further studies. In this context, identifying significant keywords to perform a comprehensive literature analysis poses a major challenge. Although "Digital Twin" has been proven to cover most of the respective research, even covering similar approaches as described in Section 3.1.1, the keyword does not cover relevant approaches introduced before the term "Digital Twin" gained relevance.

Qualitative Results
In this section, we systematize the knowledge derived from the literature identified in Section 3.1. Therefore, we answer the questions of Why AR is applied in combination with the DT (Section 3.3.1), Where this combination is applied (Section 3.3.2), How these concepts can be combined (Section 3.3.3), and What technologies are used to realize AR in a DT context (Section 3.3.4).

Why?-Motivation for Combining AR with the Digital Twin
The overarching, central motivation for applying AR to DT concepts within the identified literature is to allow easy access to the underlying data and improve understanding through visual representation [6]. AR is utilized to support the connection between the DT's physical and digital part, enabling a new way of monitoring and analyzing the physical part [31]. Although respective simulations, analyses, etc., are possible via standard (primarily web-based) user interfaces, AR offers a more intuitive and efficient interaction with the DT data [32]. The literature sees AR as a unique way to contextualize the digital information of a DT with the actual real-world twin, enabling a new way to improve situational awareness. The improved human-machine interaction, achieved by using AR and this perceptual combination of digital and physical twin, makes AR beneficial in various use cases. These range from advanced analysis to training and educating staff members [33]. Overall, motivations for combining AR with the DT can be summarized under monitoring, training, and malfunction detection [34].

Where?-Main Application Areas
The application areas in which AR is applied in combination with the DT overlap with the core DT areas. Rasheed et al. summarize the DT focus areas under three topics: (1) Manufacturing, (2) Education, and (3) Cities, Transportation, and Energy [4]. We mostly follow this classification to describe the different areas where AR and DT are jointly applied. However, we extend the topic Education to Education and Training. We summarize work that does not fit under the previously mentioned topics under the General category. Table 1 matches the identified literature with the respective application areas. Please note that several publications address more than one application area, which is why the overall count in Table 1 is higher than the 33 identified unique, relevant papers.
Manufacturing: This application area is the most popular for AR/DT combinations, as it is currently the primary use case for the DT itself. Due to the evolution of Industry 4.0, the use of CPS and, respectively, of DTs will increase drastically [5,17]. To ensure a practical and convenient workflow in this context, workers are required to use portable wearable devices, voice, or gestures to interact with CPS [35]. For example, AR and DT are used to improve predictive maintenance. The combination enables industrial processes to predict and, thus, detect anomalies or prevent imminent failures [36]. The resulting maintenance procedures are launched through the DT and can be carried out in real-time by workers using AR. Thus, the maintenance procedures selected within the DT are made available to workers, who can then be guided by AR [36]. Besides the manufacturing workers, higher-level employees (i.e., factory managers or supervisors) can benefit from AR applications displaying DT data. Kritzler et al., for example, present a Virtual Twin of a smart factory for managers, allowing full remote accessibility and control of the factory, as well as the respective processes [37]. Tran et al. see a development towards a Lean manufacturing approach based on DTs and AR, enabling various advantages [41].
Besides manufacturing itself, Digital Assembly is a related research area where AR and DT are currently raising expectations for significant improvements [38,42]. AR is used to realize virtual reality fusion, assembly operation navigation, assembly scene perception, and many others. It is meant to improve quality and efficiency through the interaction of virtual assembly objects and real assembly objects. The DT is used for the simulation of behavior in the assembly process of physical entities. This allows the prediction of assembly performance and accuracy. DT technology enables assembly instructions in real-time through AR and still guarantees the overall assembly performance [38]. Assembly processes driven by the DT gain efficiency and flexibility and can provide multi-modal feedback by adding AR [39]. Thus, AR displays just-in-time information, creates a more natural interaction, and improves the user experience for assembly design, training, and guidance [32]. Digital assembly benefits from these approaches through increased quality, as well as reduced costs and cycle times, from initiation to execution [40].
Education and Training: In the context of education and training, AR is a widely accepted concept to improve learning effectiveness, motivation, and engagement [42,49]. AR can make learning processes more engaging and can further improve training when combined with digital technology trends, such as the DT concept [50]. Combining DT and AR allows the visualization of digital data in a physical context and thereby supports remote teaching concepts [33]. These teaching concepts are highly inclusive for students, even if they are not physically on-site.
In a more business-oriented context, AR and DT are applied to enhance training and workplace learning. AR enables participants to perceive data provided by the DT that would otherwise not be available directly. The DT makes it possible to trace quality issues or predict future stresses. All this information can be displayed via AR. For example, augmenting the real manufacturing environment with this additional information and process steps supports inexperienced operators in their learning process with step-by-step guidance throughout various scenarios. This setting enables an efficient, guided, learning-by-doing approach for workplace learning [51]. This idea is also picked up by applying AR to train staff members with visual manuals. For example, AR presents a machine's status data based on its DT and shows technicians how to maintain the machine [52].
Raybourn and Trechter also use DT and AR for training [53]. The authors apply AR in combination with a DT to provide training and situational awareness in physical security systems, such as military installations or security command posts. The DT is a model of the site under consideration connected to sensors. AR is used to make suggestions determined by the DT (e.g., best maneuver routes, courses of action with minimal damage) easily accessible to security forces.
Cities, Transportation, and Energy: For cities, one potential application area is construction. AR-DT technology supports the interaction with construction machinery, improving operation precision, as well as safety [35,54]. In transportation or, more specifically, in the aerospace sector, AR is a commonly applied technology [13,57]. With the DT, AR becomes a vital tool to facilitate (predictive) maintenance of aircraft [55]. An aircraft's DT collects data used for analysis over a certain timeframe of the vehicle's lifetime or even its complete lifespan. Markers attached to the physical components allow synchronizing AR with the DT and displaying relevant information based on the DT's analyses. Thus, maintenance tasks become more efficient, and the optimal point in time to perform these tasks becomes predictable.
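The marker-based synchronization described for aircraft maintenance can be illustrated with a small lookup. In the following Python sketch, a detected marker is resolved to a component, and the DT's latest analysis for that component is returned for display. All identifiers (`AircraftTwin`, the marker IDs, the component names, and the analysis texts) are hypothetical and only serve to show the idea; the cited approaches may work differently.

```python
from __future__ import annotations

from typing import Dict, Optional


class AircraftTwin:
    """Hypothetical DT that stores one analysis result per component."""

    def __init__(self) -> None:
        self._analyses: Dict[str, str] = {}

    def record_analysis(self, component_id: str, summary: str) -> None:
        """Store the latest analysis result for a component."""
        self._analyses[component_id] = summary

    def analysis_for(self, component_id: str) -> Optional[str]:
        """Return the latest analysis, or None if nothing is known."""
        return self._analyses.get(component_id)


def overlay_for_marker(
    marker_id: str, twin: AircraftTwin, marker_map: Dict[str, str]
) -> Optional[str]:
    """Resolve a detected marker to its component and fetch the DT analysis.

    Returns None for unknown markers so the AR client can skip the overlay.
    """
    component_id = marker_map.get(marker_id)
    if component_id is None:
        return None
    return twin.analysis_for(component_id)


# Invented example data.
twin = AircraftTwin()
twin.record_analysis("hydraulic-pump-2", "inspection recommended")
markers = {"marker-017": "hydraulic-pump-2"}

print(overlay_for_marker("marker-017", twin, markers))
```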
In the energy sector, combining AR with DT technology is used to monitor and analyze critical processes. Pargmann et al. implement a DT of a wind farm, allowing users to analyze the complete farm within an AR environment [56]. The AR displays the wind farm landscape with all data required to plan necessary (predicted) maintenance work or coordinate contracts and customers.
General: Beyond the application areas mentioned above, other areas do not yet seem to have adopted the AR-DT combination, although many of them could benefit from it. More general approaches [4,34] and surveys [6] are therefore fundamental to making the concept of using AR to visualize DT data more accessible for other application areas as well. For example, Paripooranan et al. use the combination for real-time interaction with a 3D printer [59].

How?-Architecture for an AR-Powered Digital Twin
This section summarizes our findings regarding different frameworks and architectural approaches to implement AR and DT. Several publications present respective work but on different levels of detail. Another noticeable aspect within literature is that no best practice framework or architecture exists yet. Many authors introduce proprietary architectures. However, some common themes can be identified.
Some authors present generic, holistic frameworks for IoT architectures or platforms, including AR and DTs [6,34]. Other authors are more focused on specific architectures for enabling the visualization of DT data with AR [35,37,46]. The central theme within these specific setups is that they pivot towards mainly service-oriented, layered architectures. Naturally, these architectures' fundamental parts are a physical layer, representing the physical hardware related to the use case, and a visual layer to augment the physical hardware with additional data [32,55]. The DT itself is part of these architectures as a layer including several processes, such as a control process to remotely control the physical layer via the visual layer and an augmentation process to realize the AR visualization [15,55]. Some authors separately define a DT layer including all the necessary components (databases, simulation models, computing units, knowledge ontologies) and an interaction layer enabling the interaction of users with the DT via web services [5,46].
As these layers are often labeled differently, and the identified literature tends to introduce proprietary architectures, we propose a minimal, common architecture for AR-powered DTs, as shown in Figure 5. The underlying Physical Layer includes any hardware related to the use case under consideration. This physical layer is reflected in a Digital Twin. The DT is connected to the physical layer via synchronization processes that transport all necessary data from the physical layer to the DT and a control process that allows the DT to interact with its physical twin. The information and functionalities contained in the DT are accessed through applications or services located within the Application Layer. Within the application layer, any service can retrieve data from the DT by actively querying it. However, the applications can also be connected to the DT via a data provisioning process directly pushing new information from the DT to the application layer. Finally, Augmented Reality allows the representation of the DT on the physical layer. It augments the physical layer with the information retrieved from the services within the application layer. This layered, service-based architecture allows a bi-directional, synchronous, and remote interaction of the user with the physical layer powered and contextualized by AR and DT, and vice versa.
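The data paths of the proposed architecture can be sketched in a few lines of Python. The class and method names below are assumptions chosen to mirror the terms used above (synchronization, control, querying, data provisioning, augmentation); the sketch is not an implementation from the reviewed literature and omits networking, persistence, and security entirely.

```python
from __future__ import annotations

from typing import Any, Callable, Dict, List


class PhysicalLayer:
    """Stand-in for the hardware: exposes sensor state, accepts commands."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}
        self.commands: List[str] = []

    def read_sensors(self) -> Dict[str, Any]:
        return dict(self.state)

    def apply_command(self, command: str) -> None:
        self.commands.append(command)


class DigitalTwin:
    """Mirrors the physical layer and mediates all access to it."""

    def __init__(self, physical: PhysicalLayer) -> None:
        self._physical = physical
        self._mirror: Dict[str, Any] = {}
        self._subscribers: List[Callable[[Dict[str, Any]], None]] = []

    def synchronize(self) -> None:
        """Synchronization process: physical layer -> DT, then push updates."""
        self._mirror = self._physical.read_sensors()
        for notify in self._subscribers:
            notify(dict(self._mirror))  # data provisioning (push)

    def control(self, command: str) -> None:
        """Control process: DT -> physical layer."""
        self._physical.apply_command(command)

    def query(self, key: str) -> Any:
        """Active querying (pull) by application-layer services."""
        return self._mirror.get(key)

    def subscribe(self, callback: Callable[[Dict[str, Any]], None]) -> None:
        self._subscribers.append(callback)


class ApplicationService:
    """Application-layer service that both pulls from and is pushed to."""

    def __init__(self, twin: DigitalTwin) -> None:
        self._twin = twin
        self.latest: Dict[str, Any] = {}
        twin.subscribe(self._on_push)

    def _on_push(self, snapshot: Dict[str, Any]) -> None:
        self.latest = snapshot

    def get(self, key: str) -> Any:
        return self._twin.query(key)

    def send(self, command: str) -> None:
        self._twin.control(command)


class ARView:
    """Augmentation: renders service data as text and forwards user actions."""

    def __init__(self, service: ApplicationService) -> None:
        self._service = service

    def overlay(self, key: str) -> str:
        return f"{key}: {self._service.get(key)}"

    def user_action(self, command: str) -> None:
        self._service.send(command)


physical = PhysicalLayer()
twin = DigitalTwin(physical)
service = ApplicationService(twin)
view = ARView(service)

physical.state["temperature"] = 71.5
twin.synchronize()
print(view.overlay("temperature"))
view.user_action("stop")
```

In this sketch, the AR view never touches the physical layer directly. Every read and every command passes through the application layer and the DT, which is what makes the interaction bi-directional while keeping the DT as the single point of access.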

What?-AR Technologies in the Digital Twin Environment
This section considers technologies in terms of possible technological solutions to realize an AR visualization of DT data. Besides the main characteristics of AR that need to be fulfilled by respective technology (cf. Mobility, 3D-space, Hands-free, Real-time defined in Section 2.1), there is a set of additional, more DT-specific features. These additional requirements need to be met by an AR technology, allowing it to display DT information effectively [6]:
• Continuous data flow: The data flow needs to be bi-directional between the AR component and the DT [55]. A direct and synchronous interaction between users and the DT is only possible if this feature is addressed.
• Continuous tracking of physical and logical objects: To ensure the ongoing and correct contextualization of DT data, both physical and logical objects need to be continuously tracked. Current placements of virtual objects need to be matched with user movement, as well as the possible movement of physical objects [15,55].
• Access to behavioral models: If necessary, behavioral representation models for physical objects need to be accessible for logical objects.
• Adherence to physical constraints: The representation and alignment of logical objects based on physical constraints need to be ensured.
There are three different ways the 3D models of an AR visualization can be positioned in the user's view. This relative positioning can be realized based on markers, devices, or the surrounding physical room [34].
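The interplay between relative positioning and continuous tracking can be illustrated as follows. The Python sketch below attaches a virtual object to one of the three reference types and recomputes its position whenever the tracked reference moves. The names (`AnchorType`, `VirtualObject`, `world_position`) are assumptions, and the sketch handles translation only; a real AR runtime would also account for rotation and scale.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


class AnchorType(Enum):
    """The three references for relative positioning named above."""
    MARKER = "marker"
    DEVICE = "device"
    ROOM = "room"


@dataclass(frozen=True)
class VirtualObject:
    """A logical object placed at a fixed offset from its anchor."""
    label: str
    anchor: AnchorType
    offset: Vec3


def world_position(obj: VirtualObject, tracked: Dict[AnchorType, Vec3]) -> Vec3:
    """Recompute the object's position from the latest tracked anchor position."""
    ax, ay, az = tracked[obj.anchor]
    ox, oy, oz = obj.offset
    return (ax + ox, ay + oy, az + oz)


status_label = VirtualObject("machine status", AnchorType.MARKER, (0.0, 0.5, 0.0))

# Two tracking updates: the marker (and thus the physical object) has moved.
frame_1 = {AnchorType.MARKER: (1.0, 1.0, 2.0)}
frame_2 = {AnchorType.MARKER: (2.0, 1.0, 2.0)}

print(world_position(status_label, frame_1))
print(world_position(status_label, frame_2))
```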
Analyzing the applied AR technologies in the identified literature reveals the dominance of Microsoft's HoloLens solution. A large proportion of authors use this HMD for their solutions in a wide variety of use cases [15,34,37,40,43,44,52,55]. HMDs can fulfill all AR hardware criteria and, therefore, are the preferred technology to realize respective applications [15]. Additionally, the HoloLens is currently one of the few commercially available HMDs, mainly explaining its dominance. The HoloLens has been proven to be a very efficient and accessible solution for interacting in real-time with a DT [17]. However, studies reveal some constraints, such as wearing and viewing comfort, which should be further improved. Additionally, the first generation of HoloLens regularly faces significant issues with retrieving real-time data and ensuring a continuous data flow, thus violating two core criteria in the context of AR for the DT. Since the initial study by Pusch and Noël was conducted with the first commercially available version of the HoloLens [17], it might be interesting to see how the HoloLens 2 performs in a similar experiment, as it is promoted to be much more ergonomic and secure.
Although most businesses use the Microsoft HoloLens, an increasing number of AR-DT combinations are implemented with other technologies, especially in the public sector. For example, hand-held displays (HHDs) are primarily applied in training and education contexts [33,49]. Commonly, in these use cases, smartphones or tablets are used as AR HHDs. Although they do not fulfill the third criterion for AR hardware, which is the possibility for hands-free interaction [15], they are ubiquitous and accessible to almost everyone. This makes them a reasonable trade-off for education and training, where many people use AR technology simultaneously and the often more affordable HHDs are easier to provide. Other authors implement the AR-based visualization of DT data using standard displays and user interfaces [5,35,53]. In these applications, precaptured models of reality or webcam streams are augmented with virtual information.
Standard displays were more popular in the earlier days of combining AR with the DT, but their use has decreased with the maturing of technologies, such as the HoloLens.
The potential and feasibility of applying AR technology for visualizing DT information are increasingly examined empirically. Studies and interviews with expert users confirm that this combined-forces approach improves interaction with the physical product or object and helps increase awareness [44,48,60].

Perspectives for Cybersecurity
The results of our SLR highlight the lack of research leveraging the AR/DT combination in the context of cybersecurity, as none of the analyzed articles deals with this aspect. However, we argue that cybersecurity research and practice can benefit immensely from this combined-arms approach. As there is no existing practical work showcasing the implementation of the AR/DT combination for cybersecurity, our perspective is merely theoretical. It should point out possible directions and application scenarios for future practical cybersecurity applications, especially in areas already using AR and DTs.
We flesh out our cybersecurity perspectives on the combined application of Augmented Reality and Digital Twins within the following section. This perspective is based on the conviction that vital parts of cybersecurity can benefit significantly from the AR/DT combination, particularly security monitoring, training, or analytics. Generally speaking, combining AR with DT enables a more direct and intuitive interaction of users with cybersecurity-related processes and strengthens their cyber situational awareness. This is especially true for cyber-physical systems where the borders between the digital and physical world are blurring. In this context, situational awareness of both security experts and novices is crucial but even more challenging to achieve than in purely digital security use cases.
Concerning security, the DT needs to be viewed from two sides. On the one hand, security needs to be ensured for the DT itself, while, on the other hand, DTs can be utilized for security. At first, confidentiality, availability, and integrity are required to be ensured for the DT itself. This describes what we call security for the DT (Sec4DT). Recent work focuses on how to provide secure synchronization between digital and physical twin [61] and how DT data management can be secured [62]. However, while the twin might require adequate security measures, it can also function as a security enhancer [63]. This provides the second side: Digital Twin for security (DT4Sec). Various works propose mechanisms for intrusion detection with DTs [64,65]. Others focus on simulating attacks [66,67], or providing a concept for security testing [68]. As AR offers an immersive and intuitive interface to the DT for users, these viewpoints also apply to the combined usage of AR and DT.
With these two viewpoints regarding cybersecurity and the AR-DT combination set up, we will now expound perspectives for cybersecurity on AR and the DT. First, we discuss why an AR-DT combination is necessary for cybersecurity. Afterward, several security-relevant aspects within the main application areas (as identified in Section 3.3.2) are considered. In the next step, we propose a conceptual architecture highlighting how to utilize the benefits of an AR-powered DT for cybersecurity. Together, these three steps outline how AR and DT can be leveraged for security (AR-DT4Sec). Sec4AR-DT is, to some extent, part of Section 4.4, where we discuss technologies for the AR-DT combination under a security premise. However, Sec4AR-DT is not the focus of this work.

Benefits for Cyber Security
We identify several benefits when DTs enhanced with AR technology are applied for security-related use cases, which we describe in more detail throughout the following paragraphs.

Contextualizing Physical Surroundings
First of all, using AR to inspect physical systems by providing data from the respective DT enables novel security-related insights contextualized on physical surroundings. This is especially true since the data is imposed on the physical object by AR. Thus, security operators not only have the data at their hands but, instead, have a contextualized visual representation of the data. Therefore, they can directly associate data to the actual physical objects to which the data logically belongs. As mentioned in Section 3.3.2, this offers an immense advantage when dealing with cybersecurity in a CPS-dominant context. Cybersecurity breaches and threats in a cyber-physical system are often particularly hard to identify when the physical system seems to work correctly. There is no direct indication for operators that the respective industrial process might be compromised.
For example, it might be easy for human operators to detect single cases of misbehavior regarding physical objects (e.g., bottles being overfilled in an industrial filling plant), which may occasionally occur. However, this malfunction might not be merely incidental but could result from a cybersecurity breach where the perpetrator is slowly and cautiously moving towards doing actual physical harm. This behavior is common for Advanced Persistent Threats (APTs), where attackers refrain from significant physical impact for quite some time to remain undetected for longer within the infested system [69,70]. They use this time for probing security measures and lateral movement towards systems where they can deliver significant damage. This so-to-speak final blow becomes apparent through overrotating centrifuges [69], unintended shutdowns of petrochemical processes [70], or a spill of tremendous amounts of fluid within our exemplary filling plant. With DT information, especially system and sensor data collected from the real-world system, human operators using AR devices might detect suspicious patterns. For example, the overfilled bottles could be marked with a warning, such as an AR-based overlay on the real-world bottle, leveraging the pattern-detection capabilities of humans to identify the anomaly.
As described within the qualitative results of our literature review (see Sections 3.3.1 and 3.3.2), similar approaches are, for example, becoming popular to detect manufacturing malfunctions. However, a contextualization of physical surroundings with primarily security-relevant information from DTs would benefit the detection of APTs within CPS.
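As a purely illustrative sketch of the filling-plant example, the following snippet derives AR warnings from fill-level readings that a DT might hold. The data layout, the `Overlay` type, the station names, and the thresholds are assumptions made for this illustration. A simple windowed mean stands in for whatever detection logic a real DT would apply.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Overlay:
    """A warning to be rendered by the AR device on a physical asset."""
    asset_id: str
    message: str


def drift_overlays(readings, nominal_ml, tolerance_ml, window=5):
    """Flag assets whose recent mean fill level drifts from the nominal value.

    readings maps an asset id to its fill levels in ml, oldest first.
    A single outlier barely moves the windowed mean, whereas a slow and
    sustained drift (as an attacker might cause) does.
    """
    overlays = []
    for asset_id, levels in readings.items():
        recent = levels[-window:]
        if len(recent) < window:
            continue  # not enough data to judge this asset yet
        deviation = mean(recent) - nominal_ml
        if abs(deviation) > tolerance_ml:
            overlays.append(
                Overlay(asset_id, f"fill level drift {deviation:+.1f} ml")
            )
    return overlays


# Hypothetical readings: filler-2 drifts upwards by 3 ml per bottle.
readings = {
    "filler-1": [500, 501, 499, 500, 500, 501],
    "filler-2": [500, 503, 506, 509, 512, 515],
}
warnings = drift_overlays(readings, nominal_ml=500, tolerance_ml=5)
```

In this toy data set only `filler-2` is flagged. The AR device would then render the overlay on the corresponding filling station, leaving the final judgment to the operator.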

Improving Cyber Situational Awareness
The previous aspect leads to another benefit when AR and DT are jointly applied for security, as they help to increase cyber situational awareness. With CPSs becoming an increasingly important part of an organization's IT infrastructure, they also need to be appropriately monitored to detect security incidents. In this context, applying a combined approach with AR and DT ensures that situational awareness no longer remains a vague concept that analysts may achieve mainly through monitoring dashboards. Instead, the actual physical context of the data becomes directly visible and intertwined with the related data. While the DT can perform in-depth analyses and simulations with the data at hand, the results can be presented to human operators through AR by either displaying well-known dashboards or imposing relevant information and alerts directly on the related physical objects.
This aspect seems to be strongly interconnected with the previously presented benefit. However, it differs in terms of the effect on the operator. While digital insights might be provided and contextualized in their physical surroundings, this does not necessarily entail correctly grasping the situation. Cyber situational awareness ensures that the current cybersecurity posture of the physical system is recognized correctly. This might also include additional information on insights, such as the comparison with similar incidents or the inclusion of additional security databases, such as the Common Vulnerabilities and Exposures (CVE) (https://cve.mitre.org/, accessed on 20 July 2021).
In addition, this concept to improve situational awareness should provide relatively easy access to the simulation capabilities of the DT. This could help operators to directly analyze the possible impact of changing specific parameters through the AR display. Thus, humans can gain a faster and more in-depth understanding of a CPS's inner workings, helping them assess its security posture correctly.

Integrating Domain Knowledge
The last and most crucial benefit for cybersecurity when AR is applied together with a DT is the integration of domain knowledge. AR and the DT allow straightforward and efficient integration of experts and their domain knowledge into security processes. While the human factor has long been considered the weakest link in cybersecurity, experts' knowledge is indispensable to secure modern IT architectures and infrastructures [9]. Security experts can correlate the information they use for their security monitoring and analyses directly to the physical assets where the DT collects the respective data. Additionally, results, warnings, and alerts produced by the security analyses performed by the DT itself can be displayed in their context (see Section 4.1.1). This AR-based visualization of security-relevant data enables a far more in-depth security analysis combined with the physical context of the respective data. Security alerts can be displayed onto the asset they concern, and experts can be integrated directly into the analysis process. Furthermore, while operators or system engineers know how to interact with the physical system, security analysts most certainly lack this knowledge. AR combined with the DT can provide instructions on how to interact appropriately or which actions to take to prevent harm resulting from a cyber attack. In addition, this solution might provide functions that issue commands to the physical system via the DT, without the human having to touch the physical system.
Besides the security domain knowledge of security experts, the domain knowledge of security novices is also highly relevant for incident detection, especially when dealing with CPS. Security experts might understand the cyber-aspect of CPS, but other experts, such as engineers, are needed to understand and evaluate the physical part of the system. Therefore, using an AR-powered DT approach, security novices can participate in security analyses. They can inspect the physical system while the AR device adds the security perspective. This input comes from the digital-twin component and is properly visualized with AR. This benefits not only security users but also supports the actual operative staff with little focus on security. Although many security analytics processes are automated by now, it is not possible to entirely rely on automated systems to decide whether an anomaly is an actual incident or a benign activity. To make this decision, human domain knowledge (i.e., a clear and deep understanding of the system under consideration), the current situation, and contextual factors are necessary. When users are directly enabled to visualize security issues and related data on the respective physical asset, this necessary integration of domain knowledge is strengthened.

Security-Relevant Application Areas for DTs with AR
The identified application areas (cf. Section 3.3.2) can be a starting point for security-related AR-DT approaches, as well. This section describes how security plays a role and how the AR-powered DT may add to the current body of knowledge in these application areas. However, the applications of an AR-DT4Sec approach are, by far, not limited to the following variations.
Manufacturing: The industrial/manufacturing area is potentially the main application domain using the AR-based DT for cybersecurity. Rather than acknowledging confidentiality and integrity, the industry has been solely focusing on their systems' availability over the last few decades. Ever since industry-based attacks, such as Stuxnet [69], appeared, the manufacturing industry and operators of critical infrastructures have been becoming aware of their systems' security issues. Additionally, moving towards Industry 4.0, Industrial Control Systems (ICS) converge with information technology (IT). This further increases the attack surface of industrial environments [71]. While the DT's main application area is the industry [26], recent works have also discussed applying DTs for security in industrial environments [8].
Therefore, AR-supported DTs can benefit industrial security, as well. For instance, while a system's imminent failure due to attacks might not be visible to an operator's eye on site, adding additional information from the DT via an AR device might reveal previously invisible misoperations. Moreover, AR-DT-intertwined devices might display visual representations of the system's logs or network traffic between systems imposed on the physical devices. This enables visual and contextual intrusion detection right next to the physical systems and allows for direct intervention with them. Furthermore, instructions for prompt mitigation procedures might be given that can be carried out at an operational site even by personnel with low or missing security knowledge.
Previously, we identified predictive maintenance as the primary usage of DT-AR combined solutions in general. In this context, security can benefit from the combined solutions, too: Predictive maintenance approaches might be adapted to foresee potential security incidents and threats and throw alerts whenever an indication for a future attack is detected.
Education and Training: In terms of education, the concept of combining AR with DTs can be especially valuable for cybersecurity education. Using the AR technology together with DTs can provide hands-on and visual effects for security learning [51]. Generally, the combination can provide a highly effective virtual training environment for cybersecurity. Well-established training methods, such as cyber ranges, can be extended with AR enabling a more immersive and intuitive training experience. For these scenarios, a cyber range appliance could be part of the AR-DT architecture.
This application area could prove especially valuable with respect to cybersecurity situational awareness training and campaigns as it helps to highlight security topics and issues directly within the participants' domain. Thus, it becomes clear how cybersecurity is also a concern for their domain and could improve their willingness to adopt a more security-aware behavior. For example, approaches, such as Vielberth et al.'s, which already benefit greatly from the use of a DT, could be further extended with AR [72].
Moreover, AR, together with DT, can be used to develop security-aware DT workshops. To that end, the PDCA-learning cycle for AR-DT solutions [51] might be adapted to fit cybersecurity training. Furthermore, teaching how AR might support security-operating DTs could itself become a topic of such education.
Cities, Transportation, and Energy Sector: In contrast to the manufacturing area, the potentials for cybersecurity in the energy sector and critical infrastructures, including those for smart cities, slightly vary. Nevertheless, the underlying idea is to represent the system under consideration (e.g., energy grid, smart city) with the DT and visualize the representation with AR on top of the real system or parts of it [58]. Thereby, interaction functionalities should be provided to recognize attacks. For instance, incident detection could be implemented, and proper responses for handling the issue might be offered. In terms of differences to the manufacturing area, smart cities, energy grids, and transportation present much larger heterogeneous systems with multiple sub-systems. In this regard, the DT, the security interactions, and the AR on top are affected by this heterogeneity, as well: A system-of-systems approach, e.g., as already proposed with DTs [20], is necessary to take all considerable sub-systems into account and provide sufficient semantics to replicate their relationships properly. As such, the security aspect on the interaction layer can, on the one hand, provide a holistic view of sub-systems that work together. This might, for instance, be realized by implementing a Security Information and Event Management (SIEM) system that combines system logs of the sub-systems. On the other hand, each sub-system might require a different security application depending on its usage and attack surface.
Moreover, similar sub-systems can also be managed in terms of fleet management. If, for instance, a wind park is to be represented in the energy sector, each windmill might have a sub-system DT, and together these sub-system DTs make up the wind park DT. On this basis, security applications can be provided on the whole wind park (e.g., a SIEM tool) or on each windmill (e.g., log anomaly detection), whereby each windmill might require similar security applications.
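The wind park example can be sketched as a small composite structure. The class names, the keyword-based anomaly check, and the log lines are hypothetical and only serve to show the two levels at which security applications may operate. A real deployment would rely on a proper anomaly detector and a SIEM system.

```python
from dataclasses import dataclass, field


@dataclass
class WindmillTwin:
    """Sub-system DT of a single windmill holding its mirrored log lines."""
    name: str
    logs: list = field(default_factory=list)

    def anomalies(self, keywords=("failed login", "firmware changed")):
        """Windmill-level security application: naive log anomaly detection."""
        return [
            line for line in self.logs
            if any(keyword in line.lower() for keyword in keywords)
        ]


@dataclass
class WindParkTwin:
    """Fleet-level DT composed of the windmill sub-system DTs."""
    windmills: list

    def correlated_events(self):
        """Park-level (SIEM-like) view combining the findings of all windmills."""
        return [
            (mill.name, line)
            for mill in self.windmills
            for line in mill.anomalies()
        ]


# Hypothetical log data for two windmills.
park = WindParkTwin([
    WindmillTwin("mill-1", ["rotor started", "Failed login for user admin"]),
    WindmillTwin("mill-2", ["rotor started", "rotor stopped"]),
])
events = park.correlated_events()
```

Because every windmill runs the same check, the park-level view reduces to collecting the per-windmill results, which mirrors the fleet-management idea of similar sub-systems requiring similar security applications.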

Blueprint Architecture for DTs with AR in Security
To enable all these benefits in the coming age of CPS, it is necessary to have a clear vision of the underlying conceptual architecture. In this section, we discuss a high-level concept based on the general academic architecture described in Section 3.3.3 and first concepts from practitioners [73]. The concept we introduce is not a technical architecture but, rather, an abstract blueprint for respective architectures. This blueprint is displayed in Figure 6 and shows how an AR-powered DT approach can be intertwined into modern cybersecurity architectures. In terms of architecture, the application of AR and the DT for cybersecurity especially comes into play in the Application Layer (see Figure 6). While the Physical Layer might provide security-relevant data (e.g., system logs or network traffic) to the Digital Twin, the real benefit for security can be provided when the DT is utilized for security applications. Typically, these security-relevant functionalities manifest in the application layer, where information from the DT is retrieved and analyzed. AR, as part of the Presentation Layer on top of the application layer, finally, provides a visual representation for this information and enhances the security applications through its augmentation potential.
Physical Layer: The foundation of the architectural blueprint is the physical layer, where the real-world system (i.e., real twin) under consideration is operating. Within this layer, the real-world system data originates. From a security perspective, we differentiate between two types of data relevant to any security measure: Dynamic data, such as logs, sensor data, or network traffic data, which is produced continuously; and Static data, which occurs rather non-frequently and primarily describes the physical system, e.g., its topology, configurations, etc.
Digital Twin: Through its replication mode, the DT aims to mirror the current events in the real-world twin and induce the same states virtually as are present in the real-world counterpart. Thus, the DT layer's data lake is continuously synchronized with the most recent data from the physical layer. Additionally, the collected information is enriched with semantics and ontologies to give context to the data. On this basis, the Digital Twin can build models to represent the real-world system virtually. With these models, it can run in different operation modes. The analysis mode is mainly based on dynamic data to conduct analytical operations, such as predictive maintenance or anomaly detection. The simulation mode allows testing scenarios that might occur in the future (e.g., an attack or the implementation of another sub-component). Such simulations could, for instance, help to build a cyber range. With these functionalities, the DT is a crucial component to enable situational awareness in its full meaning.
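The three operation modes can be made concrete with a minimal sketch. The class below is an assumption made for illustration, not part of the blueprint: the attribute names, the event format, and the threshold-based analysis are placeholders for the far richer models a real DT would maintain.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalTwin:
    static_data: dict                               # e.g., topology, configured limits
    data_lake: list = field(default_factory=list)   # dynamic data, appended continuously
    state: dict = field(default_factory=dict)       # current mirrored state

    def replicate(self, event):
        """Replication mode: mirror a physical-layer event into the twin."""
        self.data_lake.append(event)
        self.state[event["sensor"]] = event["value"]

    def analyze(self):
        """Analysis mode: list sensors whose mirrored value exceeds its limit."""
        limits = self.static_data["limits"]
        return [
            sensor for sensor, value in self.state.items()
            if value > limits.get(sensor, float("inf"))
        ]

    def simulate(self, sensor, value):
        """Simulation mode: test a what-if value on a copy of the twin."""
        trial = DigitalTwin(self.static_data, list(self.data_lake), dict(self.state))
        trial.replicate({"sensor": sensor, "value": value})
        return trial.analyze()


twin = DigitalTwin(static_data={"limits": {"rpm": 1000}})
twin.replicate({"sensor": "rpm", "value": 900})
current_findings = twin.analyze()
what_if_findings = twin.simulate("rpm", 1500)
```

The simulation works on copies of the data lake and state, so exploring a hypothetical attack scenario does not disturb the replicated state of the real-world system.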
Application Layer: The first two layers of this concept (i.e., Physical Layer and Digital Twin) are universally applicable for any use case of the AR-DT combination. However, the application layer of the conceptual architecture is highly dependent on the use case. When applying the AR-DT combination for cybersecurity use cases, the application layer holds all necessary security measures. Security Information and Event Management (SIEM) systems, Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), etc., can be connected to the DT via respective interfaces to evaluate and ensure the security posture of CPS. Although these security systems might also rely on other data sources, the DT will become a crucial part of CPS-oriented security analytics. Besides the aforementioned systems, other security-oriented tools might be applied in this layer. Among them are training- and education-oriented security applications, such as cyber ranges built using the DT.
Presentation Layer: Existing security tools, such as SIEM systems, incorporate traditional visual representations of the considered data. However, these visualizations are solely displayed on classical screens and do not allow direct contextualization in the user's physical surroundings. Including AR as a core element of the presentation layer on top of the existing security measures residing in the application layer enables a new security analysis level in the context of CPS. It provides a much more intuitive and direct interaction (cf. Section 2.1) with both the security mechanisms and the physical asset. Indicators of compromise, alerts, and threats can be analyzed, including real-world and digital information. Besides better decision-making, this also allows for direct, physical counter-measures (if necessary) guided by information displayed through AR. Security experts and novices get a better, more direct view of what is going on in the regarded CPS by overlaying the real asset with security-related information from the DT via AR. Especially for situational awareness and the integration of human domain knowledge, the application of AR in combination with DTs for security analysis is not only a possibility but a necessity.
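How the presentation layer might attach application-layer alerts to physical assets can be sketched as follows. The `Alert` type, the severity scale, the asset identifiers, and the position lookup are hypothetical. In practice, the positions would come from the AR device's tracking and the alerts from tools such as a SIEM system or an IDS.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """An alert raised by a security tool in the application layer."""
    asset_id: str
    severity: int   # assumed scale: higher means more severe
    text: str


def build_overlays(alerts, asset_positions, min_severity=2):
    """Turn alerts into AR overlays anchored at the tracked asset positions.

    Alerts below min_severity and alerts for assets that are not currently
    tracked are skipped. The most severe alerts come first.
    """
    overlays = []
    for alert in sorted(alerts, key=lambda a: a.severity, reverse=True):
        position = asset_positions.get(alert.asset_id)
        if position is None or alert.severity < min_severity:
            continue
        overlays.append(
            {"position": position, "label": f"[{alert.severity}] {alert.text}"}
        )
    return overlays


# Hypothetical inputs: two tracked assets and three alerts.
positions = {"plc-7": (1.0, 0.5, 2.0), "pump-3": (4.0, 0.0, 1.5)}
alerts = [
    Alert("pump-3", 1, "unusual polling interval"),
    Alert("plc-7", 3, "unexpected firmware write"),
    Alert("hmi-2", 3, "login from unknown host"),
]
overlays = build_overlays(alerts, positions)
```

Only the alert for `plc-7` is rendered here. The low-severity alert is filtered out, and `hmi-2` has no tracked position in the user's surroundings, so it would have to be shown elsewhere, for example in a conventional dashboard.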

Securing AR and DT Technologies
Although the presented technologies in Section 3.3.4 are proven suitable for AR implementation in DTs, they might carry adverse effects when used in security. The dominant Microsoft HoloLens solution addresses several security and privacy concerns in its second version. The implemented security framework ensures, for instance, end-to-end encryption. Thus, it is highly recommended to utilize the second version instead of HoloLens 1 for any security-focused applications. In terms of HHDs, such as smartphones or tablets, a variety of security vulnerabilities are known. These depend on the type of the device, including hardware, the underlying software, and installed applications. Such vulnerabilities can be easily found in databases, such as the CVE (https://cve.mitre.org/, accessed on 20 July 2021). Thus, the risk of attackers exploiting these exposures is relatively high. Depending on the use case, current HHDs might lack sufficient security measures and provide an easy entry point for attackers. Especially, in the context of security applications, it may not be advisable to use such unsecured devices.
Moreover, some technologies might be added to be able to conduct proper security analyses with the AR-DT combination. This includes simulation-enabling tools, but first and foremost, the implementation of intrusion detection mechanisms with visual representations to detect system intruders. Thus, next to integrating system logs, network traffic and other environmental data of the regarded system, technologies for anomaly detection, correlation engines, or SIEM tools are to be integrated.
To provide sufficient confidentiality, integrity, and availability for the AR-DT combination, additional tools or concepts might be implemented. These security-fostering technologies might provide proper access control mechanisms, data encryption, and secure synchronization between the physical system and the AR-based DT. These security enhancements are especially required for features of (1) Continuous data flow and (2) Continuous tracking of virtual and physical objects (cf. Section 3.3.4): The continuous data flow requires secure synchronization capabilities and might even need data encryption technologies. The tracking might also require encryption or anonymization techniques to ensure privacy of the end user.
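One building block of secure synchronization, integrity protection of the continuous data flow, can be sketched with Python's standard library. The message layout, the sequence-number scheme, and the pre-shared key are assumptions made for illustration. The sketch covers integrity and replay protection only. Confidentiality would additionally require encryption, and a real deployment needs proper key management.

```python
import hashlib
import hmac
import json


def sign_update(key: bytes, seq: int, payload: dict) -> dict:
    """Serialize a synchronization update and attach an HMAC-SHA256 tag."""
    body = json.dumps({"seq": seq, "payload": payload}, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_update(key: bytes, message: dict, last_seq: int):
    """Return the decoded update, or None if it was tampered with or replayed."""
    expected = hmac.new(key, message["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["tag"]):
        return None  # integrity check failed
    data = json.loads(message["body"])
    if data["seq"] <= last_seq:
        return None  # replayed or outdated update
    return data


key = b"demo-key-for-illustration-only"  # hypothetical pre-shared key
message = sign_update(key, seq=1, payload={"rpm": 900})
accepted = verify_update(key, message, last_seq=0)

# An attacker altering the sensor value invalidates the tag.
tampered = dict(message, body=message["body"].replace("900", "9000"))
rejected = verify_update(key, tampered, last_seq=0)

# Re-sending an already processed update is rejected as well.
replayed = verify_update(key, message, last_seq=1)
```

The same check applies in both directions of the bi-directional data flow, that is, to sensor data sent towards the DT as well as to commands sent from the AR component back to the physical system.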

Conclusions
In this work, existing knowledge on the combined application of Augmented Reality and Digital Twins is systematized. We analyze academic literature to understand why, where, how, and with what technologies this combination is realized. We are able to identify the possibility of improving monitoring, education, and analysis as the major reason why AR is applied with DT. Several application areas already implement respective approaches, and the leading area is the manufacturing sector, followed by education, as well as others, such as the energy sector. The existing approaches are mostly built on proprietary architectures but can be generalized using a layered architecture, including a physical layer, the DT, an application layer, and AR. To realize individual use cases, currently, the majority of existing applications build on the Microsoft HoloLens.
The systematized knowledge and the findings drawn from the existing work illuminate a path to how AR, together with DT, can also play a significant role in cybersecurity, especially in the context of CPS. Therefore, we discuss several main benefits which can be realized with AR-powered DTs. The combined approach enables real situational awareness, including the actual physical situation, and allows respective insights derived from DT data and simulations to be analyzed together with the physical assets. These aspects significantly improve the integration of human domain knowledge into any security measure. In a final step, we discuss a general, conceptual architecture of how AR and DT can be integrated with existing security mechanisms.
When used for security, AR-powered DTs will only be beneficial if they are sufficiently secured. Otherwise, an attacker will find an easy target that provides especially relevant knowledge of a real-world system, violating confidentiality and potentially turning into an existential threat for the company. Additionally, research in improving cybersecurity with a combination of AR and DT is in its infancy. There is little to no existing conceptual or empirical work on the topic. Nevertheless, we show that following this road can result in significant progress for cybersecurity.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following technical abbreviations are used in this manuscript:

AR
Augmented