This section presents the core outcomes of this study and discusses their relevance to the stated research objectives. The presentation is structured around two key areas. First, it examines a conceptual definition of the metaverse derived from a thematic analysis of existing scholarly literature. The goal of this analysis is to clarify ambiguities and highlight the multidimensional nature of the concept. Second, it provides a comparative analysis of established metaverse architectures, focusing on their structural composition, conceptual logic and implementation relevance. In combination, these two approaches offer a cohesive synthesis that addresses both theoretical understanding and practical system design. By integrating qualitative synthesis with architectural analysis, the aim is to elucidate both the shared foundational elements and the conceptual distinctions that inform our understanding of the metaverse. These insights contribute to refining the definitional framework and highlight the strengths and limitations of current structural models in addressing the evolving technical, social, and governance needs within metaverse ecosystems, thus providing a basis for further discussion.
4.1. The Definition of Metaverse
The definitional diversity shown in the data indicates that the metaverse is a concept in active evolution, shaped by technical advancements and societal creativity. The findings predominantly highlight immersion, avatar embodiment, and communal interaction; however, a closer analysis uncovers tensions between opposing conceptual frameworks, particularly concerning whether the metaverse constitutes an extension of existing digital systems or a completely separate reality paradigm. This disparity signifies an unresolved debate: some definitions regard the metaverse as the subsequent evolution of the internet [68, 78], whilst others view it as a sociotechnical disruption that transforms the interplay between physical and digital realities [65, 75].
Definitions of the metaverse often prioritise user experience, particularly spatial computing and social interaction via avatars. However, this emphasis sometimes neglects structural and governance considerations. For instance, few definitions explicitly address decentralisation, digital asset ownership, platform interoperability, or algorithmic governance, all of which are essential for the effective deployment of metaverse platforms. This omission reduces the utility of existing definitions for architectural design, regulatory frameworks, and ethical governance.
Given these gaps, this study proposes a definition that harmonises theoretical orientations with the practical imperatives of contemporary systems. Drawing upon eight core themes identified through thematic analysis (as outlined in Table 2), the metaverse is reconceptualised as a multifaceted domain that transcends a purely digital interface. This integrated perspective encompasses social systems, technological infrastructures, and immersive spatial experiences.
Based on these results, the following definition of the metaverse is proposed:
“The metaverse is a three-dimensional virtual environment that encompasses a vast universe of interconnected digital realms, enabled by advanced technologies, that exist parallel to the physical world, where users are represented by digital representatives, providing an immersive social, economic and interactive experience that transcends traditional internet applications by blurring the boundaries between real and virtual worlds.”
This formulation directly addresses the deficiencies of prior definitions by explicitly incorporating socio-economic, experiential, and technological dimensions. It presents the metaverse as a technologically enabled ecosystem that facilitates a variety of interactions and value exchanges, rather than only as a visual or spatial environment. The definition is neutral concerning platform governance and ownership structures, rendering it applicable to both centralised and federated systems. Furthermore, it conforms to contemporary frameworks that emphasise scalability, modularity, and cross-platform compatibility as fundamental design concepts.
4.2. Metaverse Architecture
A total of 25 architectural models have been identified in the literature, categorised into three primary groups: general layered architectures, domain-focused architectures, and technology-based architectures. Although all three categories contribute to the architectural debate regarding the metaverse, this study focuses primarily on the comprehensive analysis and comparative assessment of general layered architectures.
Layered metaverse architectures, comprising three to six layers, provide modular, abstract frameworks that enable the methodical conceptualisation and structural organisation of the metaverse across many areas. In contrast, domain-specific architectures are typically designed for specialised industries such as healthcare, education, or manufacturing, and hence, they lack the versatility needed for wider applications. Although technology-based architectures effectively highlight enabling infrastructures like artificial intelligence, blockchain, extended reality, 6G, and digital twins, they frequently lack a clearly defined layered structure that systematically organises system components. Thus, while these architectures elucidate the technological foundations of the metaverse, they fail to offer comprehensive architectural frameworks appropriate for cross-domain applications.
The classification of architectural models revealed instances where design intent and application domain were not perfectly aligned. One architecture [101], which targeted the e-commerce domain, was included under technology-based architecture because of its explicit emphasis on blockchain infrastructure. This highlights the analytical tension between functional deployment and conceptual orientation, necessitating interpretation based on authorial framing and structural design focus, rather than domain alone.
Moreover, both domain-specific and technology-oriented architectures frequently fail to satisfy critical needs such as modularity, scalability, interoperability, and data security, which are vital for extensive industrial use. Consequently, there is a considerable demand for versatile metaverse designs that can support sectoral diversity while ensuring a comprehensive and adaptive framework. This paper analyses how layered architectures conceptualise the metaverse and aims to discover recurring functional components, structural similarities, and their progression across different approaches.
The study analyses general-purpose metaverse architectures by concentrating on a set of well-established models that exhibit conceptual clarity, structural coherence, and widespread acceptance in the literature. These frameworks were selected for their impact on metaverse design discourse and their embodiment of fundamental, domain-agnostic architectural methodologies. By focusing on a selection of exemplary models that provide modular layered frameworks, the study enables a targeted comparative examination. This method supports the recognition of recurring structural patterns, thematic uniformities, and essential conceptual components that underpin the building of overarching metaverse systems.
4.2.1. Hierarchical Structure of the Metaverse Architectures
The study investigates general-purpose metaverse architectures employing models with layered, modular, and ordered frameworks. These designs help to enable the methodical conceptualisation of the metaverse and ease the comparison of fundamental structural elements. The analysis considered three sample models: the three-layer framework by Duan et al. [48], the five-layer architecture introduced by Wang et al. [5], and the four-layer structure jointly presented by Lim et al. [78] and Xu et al. [31]. The selection of these models was based on their established recognition within the academic literature as general-purpose metaverse architectures, their conceptual clarity, and structural integrity. Their recurrent presence in comparative studies and their fundamental impact on the development of metaverse design principles additionally substantiate their incorporation.
To ensure a systematic and meaningful comparison across these hierarchical frameworks, this analysis uses three comparison criteria: (1) functional domain coverage, (2) granularity of layer differentiation, and (3) support for modularity, scalability, and interoperability. Each of the three metaverse architectural models employs a hierarchical structure to incrementally refine the division of system responsibilities, which enables conceptual clarity and architectural adaptability across diverse implementation scenarios.
Figure 3, Figure 4 and Figure 5 show representations of the three-layer, four-layer and five-layer metaverse architectures, respectively, arranged from the top layer to the bottom layer. Each of these figures is discussed separately below.
- A. Three-Layer Architecture:
The three-layer architecture is the most basic of the three structures. As Figure 3 illustrates, it is organised into three principal components: infrastructure, interactivity, and ecosystem, emphasising essential functionality.
- 3. Ecosystem Layer: It supports the seamless operation of the entire metaverse, creating a parallel, dynamic, and interactive virtual world.
- 2. Interactivity Layer: This layer serves as an interface between the virtual and real worlds, allowing the real world to be connected to the metaverse and the metaverse to be realised.
- 1. Infrastructure Layer: The infrastructure layer contains the fundamental requirements for supporting the operation of a virtual world, including computation, communication, blockchain and storage. Blockchain is often incorporated at a fundamental level to guarantee decentralisation, transparency, and security within virtual environments. In this architecture, blockchain not only ensures distributed data integrity but also facilitates smart contracts and tokenised interactions that support higher-level functions, such as digital economies and decentralised governance. Although its structural role is within the infrastructure, its influence extends across layers, particularly within the ecosystem layer, highlighting the merging of technical and socio-economic design in metaverse systems.
- B. Four-Layer Architecture:
The four-layer metaverse architecture offers a disciplined framework that differentiates important elements engaged in the interaction between the real and virtual worlds. As illustrated in Figure 4, this architecture is composed of the following layers:
- 4. Physical World: The physical world layer includes the key stakeholders: users, IoT and sensor networks, virtual service providers, and physical space providers.
- 3. Virtual World: The virtual world layer contains a virtual environment designed for education, trading, goods and services. Users are represented as digital avatars, allowing them to engage in activities that closely resemble those in the real world.
- 2. Metaverse Engine: The metaverse engine orchestrates inputs from physical and virtual environments to create immersive experiences. It improves user interaction through AR/VR interfaces, haptic technologies, and the Tactile Internet. Interoperability standards ensure consistency across platforms. Digital twin systems replicate physical assets in real time, while AI manages functionalities like 3D rendering and avatar creation. Blockchain technologies support decentralised ownership and asset management. The engine also incorporates a virtual economy layer for transactions and content monetisation. These components collectively enable the creation of dynamic, responsive, and persistent digital environments.
- 1. Infrastructure: The infrastructure layer is crucial for enabling efficient access by distributing computing resources closer to end-users, which significantly enhances user experiences through low latency and high-speed data transfer. It is designed to support high-density networks capable of managing the ever-increasing data demands of modern applications. By employing advanced communication technologies, this layer enhances bandwidth and overall efficiency, while a multi-layered computing approach balances performance with scalability. Basic tasks can be processed directly on user devices, while complex tasks are offloaded to distributed servers, effectively optimising resource usage. Furthermore, AI-driven optimisation techniques are applied to improve efficiency and reduce network load. Security is paramount: the layer ensures the management and verification of digital assets while also supporting peer-to-peer transactions and interoperability across various virtual environments. This infrastructure layer facilitates cross-platform accessibility and resource-sharing, creating a cohesive and efficient distributed ecosystem.
- C. Five-Layer Architecture:
The five-layer metaverse architecture is a detailed and flexible framework intended to cover the entire range of virtual environment creation and engagement. As Figure 5 illustrates, this architecture comprises the following layers:
- 5. Digital Life: The digital life layer represents the diverse virtual experiences, services, and economies within the metaverse. It encompasses multiple interconnected sub-metaverses, each offering unique digital environments, applications, and services that cater to various user needs, such as gaming, social networking, education, virtual commerce, and digital entertainment.
- 4. Interconnected Virtual Worlds: This layer is an expansive digital landscape comprising multiple interconnected, distributed virtual worlds, often referred to as sub-metaverses. Each of these sub-metaverses caters to specific interests by providing a range of virtual goods and services, including gaming, social networking, online museums, and virtual concerts. Users can explore diverse environments within these sub-metaverses, immersing themselves in unique game scenes or navigating vibrant virtual cities. Central to the experience are avatars, which act as digital representations of human users, allowing them to engage and interact within these virtual realms.
- 3. Metaverse Engine: The metaverse engine merges physical, digital, and virtual elements into a cohesive experience, leveraging big data to generate and continuously update expansive virtual worlds. With the capability for real-time rendering, users can seamlessly interact between their physical surroundings and the digital landscape. This immersive environment is supported by technologies such as extended reality, human–computer interaction (HCI), and brain–computer interfaces (BCI), allowing individuals to control avatars through their senses and physical movements. This multifaceted platform not only facilitates gaming and social interactions but also enables virtual commerce. AI analytics play a crucial role by enhancing user experiences, optimising system performance, and sustaining large-scale simulations. The environment ensures high-definition rendering and creates realistic virtual settings, while also supporting decentralised trade of virtual goods and services through transparent, trust-free transactions. Additionally, user-generated content and diverse avatar activities contribute to a spontaneous and thriving virtual economy.
- 2. The Physical Infrastructure: This layer is crucial as it enables seamless interactions between the digital and human worlds by providing essential support for data perception, transmission, processing, and caching. It includes components like smart objects, sensors, and actuators, which facilitate physical control for these interactions. Additionally, this layer supports networking capabilities through both wired and wireless systems and offers the computing power and storage necessary for various metaverse applications. Its ability to ensure real-time, high-performance processing is vital for the smooth operation of the metaverse.
- 1. Human Society: The human society layer of the metaverse emphasises the centrality of users, their psychological makeup, and their social interactions. This layer is designed to create intuitive and immersive experiences that cater to the behaviours and needs of individuals. By leveraging technologies such as VR and AR helmets, smart glasses, and a range of wearable devices, the metaverse offers seamless interactions within its digital realms. Users can control their digital avatars, allowing them to participate in diverse activities that foster connection and engagement. The layer promotes intuitive communication between humans and the virtual environment, effectively blending virtual, augmented, and mixed reality to provide rich and immersive experiences that reflect the complexities of human society.
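To make the three layer stacks easier to compare side by side, they can be written down as ordered data. The following is a minimal illustrative sketch, not part of the reviewed models: the `Architecture` class and its method name are hypothetical, and the layer names and numbering are taken from the descriptions above (layer 1 is the bottom layer).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Architecture:
    """Hypothetical container for one layered metaverse model."""
    name: str
    source: str
    layers: Tuple[str, ...]  # ordered from layer 1 (bottom) upwards

    def numbered_top_down(self) -> List[str]:
        """Return the layers as listed in the text: highest number first."""
        n = len(self.layers)
        return [f"{n - i}. {layer}" for i, layer in enumerate(reversed(self.layers))]


THREE_LAYER = Architecture(
    "three-layer", "Duan et al. [48]",
    ("Infrastructure", "Interactivity", "Ecosystem"),
)
FOUR_LAYER = Architecture(
    "four-layer", "Lim et al. [78]; Xu et al. [31]",
    ("Infrastructure", "Metaverse Engine", "Virtual World", "Physical World"),
)
FIVE_LAYER = Architecture(
    "five-layer", "Wang et al. [5]",
    ("Human Society", "Physical Infrastructure", "Metaverse Engine",
     "Interconnected Virtual Worlds", "Digital Life"),
)

if __name__ == "__main__":
    for arch in (THREE_LAYER, FOUR_LAYER, FIVE_LAYER):
        print(arch.name, arch.numbered_top_down())
```

Storing the layers bottom-up keeps the numbering used in the lists above, while `numbered_top_down` reproduces the top-to-bottom presentation order.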
4.2.2. Conceptual Similarities Across Metaverse Architectures
This section examines the conceptual similarities among three-layer, four-layer, and five-layer metaverse architectures. The accompanying figures visually illustrate how these frameworks, despite differences in structural detail and terminology, frequently converge on functionally equivalent components. Each comparison underscores shared core functionalities, including user interaction, computational processing, and infrastructure support.
Figure 6, Figure 7 and Figure 8 illustrate the interrelations among the three architectural models: the three-layer, four-layer, and five-layer architectures. The illustrations highlight the hierarchical structure of each architecture and the conceptual commonalities between layers that perform analogous functions but are named differently across architectures. Bidirectional arrows mark areas of conceptual overlap or functional equivalence by linking layers that serve similar purposes in the compared structures.
The three-layer and four-layer architectures, as shown in Figure 6, exhibit significant structural similarities. Both frameworks establish a fundamental infrastructure layer responsible for processing, communication, storage, and the integration of enabling technologies. Each architecture utilises blockchain to facilitate secure digital ownership, economic transactions, and trust management inside the virtual ecosystem.
A specific layer in both models enables interaction between the physical and virtual realms, referred to as “Interactivity” in the three-layer architecture and manifested through the metaverse engine and virtual world in the four-layer architecture. Moreover, both architectures prioritise immersive user experiences by utilising technologies such as virtual reality, augmented reality, and spatial computing, and depend on digital representations to facilitate avatar-based or system-level interactions.
Ultimately, each framework facilitates a virtual environment or ecosystem layer that fosters social, cultural, and economic activities, supporting enduring and multi-user digital communities. These common traits demonstrate a fundamental congruence in design principles, emphasising interoperability, scalability, and user-centric engagement across platforms.
A comparative analysis of the three-layer and five-layer architectures, as shown in Figure 7, highlights several patterns of conceptual equivalence. Both architectures include an infrastructure layer that serves as the technical foundation of the metaverse, providing essential computing and networking resources. In both architectures, blockchain technology underpins secure digital ownership, trustless transactions, and the development of decentralised virtual economies.
The three-layer approach incorporates blockchain in the infrastructure layer to enable smart contracts and distributed governance, whereas the five-layer architecture integrates it into the metaverse engine to enhance asset management and ensure transparent economic exchange. In the three-layer architecture, the ecosystem layer consolidates functions related to user engagement, application services, and social interaction.
In contrast, the five-layer architecture adopts a more distributed approach, separating these functions across three distinct layers: the digital life layer, the interconnected virtual worlds layer, and the metaverse engine layer. Together, these layers manage immersive environments, real-time interactivity, and content rendering. Additionally, the five-layer architecture explicitly includes a human society layer, which emphasises user-centric aspects that are present, but not formally defined, in the three-layer architecture.
Similarly, as shown in Figure 8, the four-layer and five-layer architectures exhibit closely related functional characteristics. The alignment of infrastructure components is once again evident. The metaverse engine layer in the four-layer architecture corresponds directly to the metaverse engine in the five-layer framework. Additionally, the virtual world layer in the four-layer architecture overlaps with the digital life and interconnected virtual worlds layers in the five-layer architecture.
The five-layer architecture’s enhanced differentiation of social dynamics and digital presence, through the human society layer, signifies an advancement in structural clarity and societal modelling. Moreover, blockchain technology is featured in both frameworks to support secure asset ownership, decentralised trade, and digital economy management.
Across all three architectural frameworks, recurring functional domains are evident, particularly in the areas of infrastructure support, immersive systems, and user interaction layers. Despite variations in terminology and layer divisions, these models converge in addressing the fundamental requirements of metaverse design, such as real-time processing, virtual environment management, and social engagement. These conceptual similarities demonstrate a developmental continuity from simpler to more specialised structures, underscoring that evolving architectures, while diverse in structure, remain anchored in a set of shared foundational principles.
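The layer correspondences described in this subsection can be summarised as a small lookup structure. The sketch below is illustrative only: the pairings are one reading of the prose above (not a transcription of Figures 6 to 8), and the names `CORRESPONDENCES` and `counterparts` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (layer in first architecture, layer in second)

# Assumed pairings, read from the text of this subsection.
CORRESPONDENCES: Dict[Tuple[str, str], List[Edge]] = {
    ("three", "four"): [
        ("Infrastructure", "Infrastructure"),
        ("Interactivity", "Metaverse Engine"),
        ("Interactivity", "Virtual World"),
        ("Ecosystem", "Virtual World"),
    ],
    ("three", "five"): [
        ("Infrastructure", "Physical Infrastructure"),
        ("Ecosystem", "Digital Life"),
        ("Ecosystem", "Interconnected Virtual Worlds"),
        ("Ecosystem", "Metaverse Engine"),
    ],
    ("four", "five"): [
        ("Infrastructure", "Physical Infrastructure"),
        ("Metaverse Engine", "Metaverse Engine"),
        ("Virtual World", "Digital Life"),
        ("Virtual World", "Interconnected Virtual Worlds"),
    ],
}


def counterparts(source: str, layer: str, target: str) -> Set[str]:
    """Layers in `target` that overlap functionally with `layer` in `source`."""
    if (source, target) in CORRESPONDENCES:
        return {b for a, b in CORRESPONDENCES[(source, target)] if a == layer}
    if (target, source) in CORRESPONDENCES:
        # The arrows are bidirectional, so the stored pairs are read in reverse.
        return {a for a, b in CORRESPONDENCES[(target, source)] if b == layer}
    raise KeyError(f"no comparison recorded for {source!r} and {target!r}")


if __name__ == "__main__":
    print(counterparts("three", "Ecosystem", "five"))
    print(counterparts("five", "Human Society", "three"))  # empty set
```

An empty result, as for the human society layer, corresponds to a function that is present in one model but not formally defined as a layer in the other.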
4.2.3. Merits and Shortcomings of the Different Models
Although the architectures share foundational elements, they exhibit significant divergence in their approaches to managing user interaction and experience. The three-layer architecture integrates interactivity, user environments, and economic activity within broad, consolidated layers, specifically, the interactivity layer and the ecosystem layer. In contrast, the four-layer architecture refines this structure by segregating these functions into more specialised components. For example, the interactivity layer is divided into a physical world layer, which manages real-world user inputs and sensor data, and a metaverse engine layer, which addresses immersive technologies, AI, and data synchronisation. This enhanced granularity supports modularity and scalability, facilitating more flexible system design and a clearer delineation of responsibilities. While the simplicity of the three-layer architecture may be advantageous in small-scale or closed environments, the expanded structure of the four-layer architecture offers greater adaptability in more complex use cases [105].
The four-layer architecture offers enhanced structural clarity and functional specialisation relative to the three-layer architecture, yet both frameworks demonstrate considerable limitations when applied to extensive, socially integrated metaverse settings. A significant limitation is the lack of structural support for interoperability. Both the three-layer and four-layer architectures lack a defined architectural mechanism to enable cross-platform interaction. Rather, these models typically presume isolated or self-sufficient ecosystems. This design methodology restricts users’ capacity to transfer assets, identities, or social connections between platforms, leading to disjointed experiences that impede the overarching goal of a cohesive metaverse [106]. This constraint highlights the essential requirement for architectural improvements that emphasise interoperability as a primary design element [107].
The five-layer approach mitigates several of these constraints through its more detailed and modular framework. By incorporating distinct layers, including the human society layer and the interconnected virtual worlds layer, it provides improved support for social complexity and cross-platform functioning. The interconnected virtual worlds layer explicitly promotes interoperability by allowing seamless transitions and data interchange among distributed sub-metaverses. This structural inclusion differs from the three-layer and four-layer architectures, which presume isolated ecosystems and lack mechanisms for identity transfer, asset continuity, or cross-platform connections. The separation of once-integrated functions into distinct, purpose-driven layers enhances modularity, scalability, and system clarity.
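As a rough illustration of what asset continuity between sub-metaverses means in practice, the sketch below moves an asset record from one world to another while preserving its identifier and owner. Every name here (`Asset`, `SubMetaverse`, `transfer`) is hypothetical; none of the reviewed architectures specifies such a mechanism, and a real system would also need authentication and consistency guarantees.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Asset:
    """Hypothetical portable asset record."""
    asset_id: str
    owner_id: str
    kind: str


@dataclass
class SubMetaverse:
    """Hypothetical sub-metaverse holding a registry of assets by identifier."""
    name: str
    assets: Dict[str, Asset] = field(default_factory=dict)


def transfer(asset_id: str, source: SubMetaverse, target: SubMetaverse) -> Asset:
    """Move an asset between worlds, keeping its identifier and owner unchanged."""
    if asset_id not in source.assets:
        raise KeyError(f"{asset_id!r} is not registered in {source.name!r}")
    if asset_id in target.assets:
        raise ValueError(f"{asset_id!r} already exists in {target.name!r}")
    asset = source.assets.pop(asset_id)
    target.assets[asset_id] = asset
    return asset


if __name__ == "__main__":
    gallery = SubMetaverse("gallery")
    arena = SubMetaverse("arena")
    gallery.assets["item-1"] = Asset("item-1", "alice", "wearable")
    moved = transfer("item-1", gallery, arena)
    print(moved.owner_id, sorted(arena.assets))
```

In an isolated ecosystem of the kind presumed by the three-layer and four-layer models, there would be no shared identifier space in which such a transfer could be expressed.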
A significant shortcoming in the three-, four-, and five-layer metaverse architectures is the lack of a dedicated governance layer. Although all three frameworks incorporate blockchain technology, primarily to support decentralised data integrity, digital ownership, and secure transactions, this integration serves as a technological enabler rather than a formal governance mechanism. Blockchain facilitates elements such as smart contracts and peer-to-peer exchanges, but it does not architecturally model processes like regulatory compliance, content moderation, or ethical oversight.
As metaverse platforms continue to expand in both scope and societal influence, the absence of integrated governance mechanisms has emerged as a significant vulnerability [108]. Regulatory compliance, content moderation, and user rights management are often perceived as external or secondary concerns, resulting in a structural governance gap [109]. The assumption that legal and ethical oversight can be managed externally to the system design is increasingly untenable, given the rising expectations for platform accountability and responsible data stewardship [77, 110].
Among these three architectures, the five-layer architecture comes closest to addressing governance by incorporating a human society layer, which integrates elements of user behaviour and social presence [5, 111]. Nevertheless, this layer predominantly emphasises psychological and experiential aspects, without formally modelling legal, regulatory, or ethical governance structures. Consequently, even the most advanced of the three models lacks a comprehensive governance framework [31].
While the three-layer architecture may remain appropriate in controlled or experimental settings due to its simplicity, its structural limitations impede its applicability in large-scale or ethically complex environments. Conversely, the five-layer architecture signifies not only a technical advancement but also a conceptual shift towards metaverse systems that are resilient, interoperable, and ethically aligned.
The evolution of the metaverse underscores a vital point: for platforms to be sustainable, they must not only incorporate governance structures but also explicitly integrate user-focused design and systemic interoperability into their foundational framework. As the metaverse becomes more intertwined with social, economic, and regulatory spheres, maintaining these platforms demands both technological advancements and institutional responsibility. Future metaverse frameworks need to go beyond simple interaction and immersion by embedding governance, privacy, data management, and ethical principles into their core design, thereby fostering robust, inclusive, and scalable digital ecosystems.
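To indicate what embedding governance into the core design could look like structurally, the following sketch models a governance layer as a set of policy checks that every user action passes through before reaching lower layers. It is a hypothetical illustration under simplifying assumptions: the class and function names, the action kinds, and the toy moderation rule are invented for this example and are not drawn from any of the reviewed architectures.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Action:
    """Hypothetical user action submitted to the platform."""
    user_id: str
    kind: str      # e.g. "post_content" or "trade_asset" (illustrative values)
    payload: str


# A policy returns None to allow the action, or a reason string to block it.
Policy = Callable[[Action], Optional[str]]


def content_moderation(action: Action) -> Optional[str]:
    """Toy moderation rule: block posted content containing a banned term."""
    banned_terms = {"spam"}
    if action.kind == "post_content":
        text = action.payload.lower()
        if any(term in text for term in banned_terms):
            return "content policy violation"
    return None


def consent_required(consented_users: Set[str]) -> Policy:
    """Build a policy that blocks users without recorded data-processing consent."""
    def policy(action: Action) -> Optional[str]:
        if action.user_id not in consented_users:
            return "missing data-processing consent"
        return None
    return policy


class GovernanceLayer:
    """Hypothetical layer that reviews each action against all policies."""

    def __init__(self, policies: Iterable[Policy]) -> None:
        self._policies = list(policies)

    def review(self, action: Action) -> List[str]:
        """Return the reasons for blocking; an empty list means allowed."""
        reasons = []
        for policy in self._policies:
            reason = policy(action)
            if reason is not None:
                reasons.append(reason)
        return reasons


if __name__ == "__main__":
    layer = GovernanceLayer([content_moderation, consent_required({"alice"})])
    print(layer.review(Action("alice", "post_content", "hello world")))
    print(layer.review(Action("bob", "post_content", "buy SPAM now")))
```

Treating policies as pluggable functions keeps regulatory and ethical rules inside the architecture rather than outside it, which is the structural change argued for above.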
Table 12 presents key insights across the three-layer, four-layer, and five-layer metaverse architectures. These observations highlight recurring structural patterns, functional priorities, and design limitations that have emerged from the evolution of architectural models.
4.3. Societal and Ethical Implications
While this study primarily focuses on conceptual clarity and architectural coherence, it is equally important to address the broader societal and ethical considerations that arise from metaverse frameworks. Key areas of concern include, but are not limited to, privacy and data ownership, equitable access, and the socio-psychological consequences stemming from highly immersive environments.
Metaverse systems frequently require the collection of comprehensive user data that includes biological, behavioural, and emotional information [112]. This raises substantial concerns about user consent, data exploitation, and prolonged surveillance [113]. The reviewed architectures, especially those lacking specialised governance layers, provide insufficient structural support for safeguarding data or overseeing the ethical use of user information. In the absence of integrated privacy-preserving safeguards, such systems jeopardise transparent data practices and compromise user autonomy [57]. While existing legal frameworks, like the GDPR, are applicable to metaverse environments, there is currently no specific, harmonised international regulation designed to address the unique challenges they present [114, 115].
Access to metaverse technology is often limited by economic costs, digital literacy, and physical or cognitive obstacles [114]. None of the three examined models clearly incorporates accessibility or inclusivity as fundamental design principles. The lack of such measures can result in the exclusion of disadvantaged people and exacerbate existing digital disparities. A metaverse aligned with ethical standards must incorporate universal design principles to guarantee equal involvement among varied user populations.
The highly immersive nature of metaverse environments has the potential to reshape individual perceptions of identity, community, and reality. Sustained engagement within these digital spaces can have ramifications for mental health, social dynamics, and the expression of cultural values [116]. While the five-layer architecture includes a human society layer that touches on these issues, its capacity to incorporate ethical protections or strategies for safeguarding psychological well-being and cultural diversity remains limited.
Existing metaverse architectures predominantly adopt a technology-focused approach, often neglecting societal resilience and ethical design considerations. Future architectural models should therefore evolve to integrate governance mechanisms, inclusivity principles, and psychosocial safeguards as core tenets. Embedding these elements at a structural level is essential for the development of scalable, accountable, and ethically sustainable metaverse ecosystems.