In recent years there has been a widespread penetration of location-based services (LBS), also known as geo-services, into people’s daily activities, triggered by the increasing affordability of smart devices (tablets, mobile phones) for the average end-user/consumer [1
]. A typical example of using geo-services is navigating to a place using freely available maps while simultaneously associating the current location with other complementary information. At the same time, significant developments in geoinformatics technologies reflect the progress that has occurred in hardware and software technologies. Nowadays, it is possible to map a region with high topographical accuracy and create high-resolution digital terrain/surface models (DTM/DSM) by employing unmanned aerial vehicles (UAV) [2
]. Likewise, new technologies allow the creation of three-dimensional models that simulate inanimate or moving/animated spatial objects (e.g., buildings or virtual humans respectively) [5
]. The above, combined with high quality three-dimensional (3D) modeling visualization and animation capabilities offered by computer graphics technologies, enable the development of custom virtual geospatial worlds. A custom virtual geospatial world simulates real-world spatial objects in a number of overlaid geo-referenced thematic layers [6
]. These layers may include raster surface textures, DTM/DSM models and 3D spatial entity models moving in pre-specified or dynamically defined motion paths. The implementation of such virtual worlds can be based on open technologies and standards and on functionalities provided by free software libraries. Moreover, they can be demonstrated on smart device platforms and/or standard interfaces of the World Wide Web. The ever increasing data transfer speed in communications and wireless networks [7
], the widespread use of geospatial web services [8
] and the evolution in the augmented reality (AR) technologies [9
] are boosting the provision of custom virtual geospatial worlds for a mobile end-user and offer novel opportunities for the deployment of numerous smart applications (apps).
Building on the above, the presented work aims to introduce “Mergin’ Mode”, a system comprising an authoring tool and an app, able to support the overlaying of (a) highly detailed virtual terrain environments and three-dimensional models representing animate or inanimate objects, placed or moving over these environments, and (b) the real world as captured by the camera of a smart device. For example, during the physical presence of a user in an archaeological site, it will be possible, through storytelling techniques, to “revive” on the screen of the mobile phone historical events represented by a custom virtual geospatial world. These events, along with dynamic reconstructions, evolve based on the users’ actual position (spatial reference) and the route they follow on the site. Consequently, the digital material must also possess a spatial reference and be provided under LBS. Possible application scenarios include information provision regarding both the history and the historical events that took place in monuments and archaeological sites. These scenarios can serve both educational and recreational purposes for the local communities around the monuments and archaeological sites, and the development of cultural tourism in those areas.
Technically, the system is based on the merging of the real with the virtual in mixed reality (MR), assisted by geoinformatics technologies, to be applied to monuments with the goal of showcasing them. The final output is a set of web services that will enable the visualization of an archaeological site, a monument or a set of monuments in its current form, as captured by the camera of a smart device, in conjunction with virtual objects that can represent historical events and narrate stories to the end-users/visitors of the site. In addition, the visitor app may be made freely available on the marketplaces and, in combination with the virtual objects offered in special repositories, may enable virtual tours of the monuments remotely.
In a hypothetical scenario, the visitor of the monument opens the mobile camera and aims at a point of interest. Through a free application and a set of data made available by the wireless infrastructure of the archaeological site or the Internet, the cultural tourist can visualize and acquire, according to their wishes, in-depth information regarding the archaeological site, the monuments or historical events pertinent to them, enhancing, that way, their cultural experience. At the same time, the Global Positioning System (GPS) receiver of the smart device will accurately approximate the visitor’s position within the 3D virtual geospatial world. Alternatively, the visitor may be able to make use of the virtual content remotely, without having to visit the site, and receive information about the cultural product of the area of interest (Figure 1).
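As a rough illustration of this positioning step, the snippet below sketches how a GPS fix might be mapped into the local metric frame of a virtual geospatial world. It assumes a small site where a flat-earth (equirectangular) approximation is acceptable; the function names and coordinates are hypothetical, not part of the actual system.

```javascript
// Approximate conversion of a WGS84 GPS fix to local east/north offsets (metres)
// relative to a site origin. Valid only for small areas (a few km across).
const EARTH_RADIUS_M = 6378137; // WGS84 equatorial radius

function gpsToLocal(origin, fix) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(fix.lat - origin.lat);
  const dLon = toRad(fix.lon - origin.lon);
  const meanLat = toRad((fix.lat + origin.lat) / 2);
  return {
    east: EARTH_RADIUS_M * dLon * Math.cos(meanLat), // maps to the world's x axis
    north: EARTH_RADIUS_M * dLat,                    // maps to the world's ground-plane axis
  };
}

// Example: a visitor roughly 100 m north-east of a hypothetical site datum.
const origin = { lat: 40.6401, lon: 23.4500 };
const fix = { lat: 40.6410, lon: 23.4512 };
const pos = gpsToLocal(origin, fix);
```

The resulting offsets can then position the visitor's viewpoint inside the geo-referenced 3D scene.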
Having described all the above capabilities provided by the system at a laboratory-experimental level, the real-world conditions at an operational level will now be presented. There are two distinct factors to be considered, both of which highlight the potential contribution of the presented work:
The authorities responsible for promoting specific cultural-touristic resources may exploit the portion of the system dealing with the development of the digital material. The “Mergin’ Mode” authoring tool utilizes geographic information system (GIS)-based functionalities to assist the development of custom virtual geospatial worlds presenting historical representations of the monuments. Beyond the typical thematic layers that may include the site terrain and vector graphics with areas, lines and points, additional themes may be added to compose sophisticated 3D scenes [6
]. Such themes may include 3D models of natural spatial objects (e.g., trees or plants), of cultural objects of historical importance, either movable (e.g., amphorae) or immovable (e.g., temples), or of “living” ones (e.g., people and animals). Other thematic layers may specify the routes of motion for moving objects, or the points/areas of placement for the static ones. Although the overlaying of numerous thematic layers to form a photorealistic 3D scene is a relatively old technique, dating back to the era of the first multimedia projects, such as Flash [10
], the geospatial referencing of all the involved objects nevertheless requires a GIS-based approach. After all, the spatial reference is the key property of an object that specifies its behavior in response to the end-user’s location. Moreover, although virtual reality and 3D computer graphics technologies have evolved over the last decades, their co-existence with LBS is an issue that invites further research and development.
The visitors/end-users perceive the digital material provided by the authorities of a monument as rendered over their camera view, along with the monument in its current condition, in an MR app. Considering the ubiquity of LBS in a vast number of smartphone apps, “Mergin’ Mode” invests in this evolving capability of establishing an on-demand, direct connection between the user of cultural digital material and the provider, without the need for any specialized equipment other than a smart device. Although this issue has already been raised [11
], the developments so far justify additional research efforts and allow significant improvements. One could imagine the whole venture as being similar to that of Google Maps: the user may download the maps for an area of interest and use them offline for routing purposes whenever the device is located in that area. In our case, the material concerns cultural heritage resources and the representations of a monument along with historical events, which are triggered when the user is georeferenced in the area of the monument. Obviously, the material may also be provided synchronously, at the time of the visitors’ presence at the site, through the authority’s communication infrastructure.
The next section presents a review on the conjunction of MR with LBS methods and tools. Most importantly, it attempts to identify critical features of contemporary MR authoring tools and highlights “Mergin’ Mode” contribution. The third section provides information about all the technical details, the technologies and the standards employed for the development and demonstration of “Mergin’ Mode” software prototype. Section 4
provides an extensive demonstration of the system and Section 5
highlights the results and possible future research directions.
2. Similar Works
2.1. Mixed Reality and Location-Based Services: Recent Developments
Many applications combining AR and GIS have been developed during the last decades in various areas: environmental monitoring [14
] and changes [15
], navigation [16
], architecture [17
], pipeline prospect [18
], tourist information system [19
], landscape visualization [20
], etc. VR and AR are receiving increasing attention in cultural tourism and virtual museums [21
]. In fact, the size of the information required to be served via LBS, in order to form a MR-based scene on a mobile smart device owner, could not be supported before 4G’s introduction, given the bandwidth and Internet speeds in the Global System for Mobile Communications (GSM) networks of that era. Therefore, the spatial reference of the involved objects was not a primary specification in the related developments. Some indicative recent developments employing LBS, thereby involving geoinformatics technologies, are mentioned below:
A survey that includes (a) the technical requirements of MR systems in indoor and outdoor settings and (b) the purposes and the enabling technologies adopted by MR applications in cultural heritage is provided in Bekele et al. [31
] (p. 26 and p. 28).
Debandi et al. [24
] present research co-funded by the H2020 European project 5GCity that resulted in an MR smart guide providing information at city scale about historical buildings, thereby supporting outdoor cultural tourism. The user can select the object (monument/building/artwork) for which augmented contents should be displayed (video, text, audio) and can interact with these contents through a set of defined gestures. Moreover, if the object of interest is detected and tracked by the MR application, 3D contents can also be overlaid and aligned with the real world.
Nobrega et al. [32
] describe a methodology for the fast prototyping of a multimedia mobile application dedicated to urban tourism storytelling. The application can be a game that takes advantage of several location-based technologies, freely available geo-referenced media, and augmented reality for immersive gameplay. The goal is to create serious games for tourism that follow a main narrative but whose story can automatically adapt itself to the current location of the player, assimilate possible detours and allow posterior out-of-location playback. Adaptable stories can use dynamic information from map sources, such as points of interest (POI). An application designed for the city of Porto, namely, Unlocking Porto, is presented, which employs the above-mentioned methodology. This location-based game, with a central yet adaptable story, engages the player with the main sights along an augmented reality path while playing small games.
Balandina et al. [11
] summarize their research in the area of the Internet of Things (IoT) for the development of services for tourists. More specifically, they share ideas for innovative e-Tourism services and present the Geo2Tag LBS platform, which allows the easy and fast development of such services. The proposed platform provides open application programming interfaces (APIs) for local developers to create extension services on top of the available content and allows the automatic binding of new content and its extension with open data from various sources, thereby helping to advertise the regions concerned. They present the Open Museum platform, which employs mixed/augmented reality, and a couple of its implementation instances, namely, the Open Karelia and New Moscow systems.
Alkhafaji et al. [13
] introduce a list of guidelines for designing mobile location-based learning services with respect to cultural heritage sites. This list was drawn up based on the results of a user study in the field, carried out with adult end users to evaluate a prototype mobile application that delivered location-based information about cultural heritage sites through mobile phones and smart eyeglasses simultaneously. Moreover, augmented reality and LBS are utilized in this specific app. The paper presents an empirical study that examines aspects of the usability, usefulness and acceptance of the smart learning environment, “SmartCity”, designed to deliver instant, location-based information with respect to cultural heritage sites.
2.2. Identifying State-of-the-Art Software
Prior to identifying the related state-of-the-art software, this section addresses the scientific–technological areas involved in applications pertinent to the presented work. These areas are characterized by strong synergies between pure computer-graphics and animation-motion technologies and AR and MR technologies. In addition, AR software applications cooperate closely with geospatial software in order to inherit LBS and GIS capabilities. Therefore, (a) animation-motion, (b) GIS functionalities and (c) VR–AR–MR comprise the three major software areas of interest relevant to the presented work.
Another difficult task was to classify the findings of this review according to their general software type, e.g., API, framework or library; such a classification, however, could cause misunderstandings and illogical comparisons, and many of these terms have similar meanings. Having gathered as much information as possible, a decision was taken to split the final table into three subdivisions based on the distinct contribution of the findings, as follows:
Game engines: Their role in 2D and 3D graphics rendering, physics simulation, interactive animation and motion effects is decisive enough to place them at the head of the related software table. Beyond recording support for the previously defined technological areas, the table also records their capability to act as an authoring tool and to operate in a browser.
Libraries–platforms–frameworks: These were created to cover a broad range of functionalities, including basic geospatial ones and graphics animations, and to cooperate with other software components to form complete solutions. This category records the same features as the above one.
AR tools: This subdivision contains software exclusively focused on AR, which obviously cooperates with software from the above categories to form complete solutions. Some AR-specific, individual features were selected, such as simultaneous localization and mapping (SLAM), geo-location, 2D and 3D image recognition and online-cloud recognition.
Before presenting the findings of the review, a brief presentation of some of them from each category is offered below.
Unity is a real-time 3D development platform that lets artists, designers and developers create immersive and interactive experiences in games, films and entertainment, architecture or any other industry. As of 2018, Unity had been used to create approximately half of the mobile games on the market and 60 percent of augmented reality and virtual reality content [33
]. Unreal Engine is an open, advanced real-time 3D creation tool that is continuously evolving to serve as a state-of-the-art game engine, giving creators the freedom and control to deliver cutting-edge content, interactive experiences and immersive virtual worlds [34
]. These two engines seem to be the leaders in the field of game engines, and a very recent article attempts to compare them [35].
CityEngine is an advanced 3D modeling software for creating expansive, interactive and immersive urban environments that may be based on real-world GIS data or may showcase a fictional city of the past, present or future. CityEngine fully covers all critical geospatial aspects, such as georeference, geolocation and overlaying; however, it does not integrate motion effects on spatial objects [36
]. In the same category, vGIS Utilities is a cloud-based app that displays GIS and CAD data using mixed and augmented reality. It does not require specialized hardware or client-provided servers to operate. It connects to Esri ArcGIS and other data sources to aggregate and convert traditional 2D GIS data into 3D visuals. It primarily targets public utilities, municipalities and service providers [37
]. ARKit makes use of the very recently released [40
] LiDAR scanner and depth-sensing system of the iPad to create realistic AR experiences. Via its API it is possible to capture a 3D representation of the world in real time, enabling object occlusion and real-world physics for virtual objects [41
]. ARCore is Google’s platform for building augmented reality experiences, using APIs across Android and iOS and enabling mobile phones to sense the environment, understand the world and interact with information [42].
All findings of this extensive search [43
] are collected in Table 1
for the reader’s convenience. However, it should be clearly noted that the final table does not aim to act as a product comparison catalogue. Although the authors have strived to gather every type of relevant material, this table should be considered a non-exhaustive collection of existing state-of-the-art software solutions in the broad area of the presented work. In any case, the final result is subject to change and needs continual updating, because many of the presented findings may soon be deprecated, merged with others, or no longer supported or updated.
2.3. Specifying “Mergin’ Mode”
There are several solutions for developing MR environments and visualizations that exploit powerful 3D simulation engines and offer immersive experiences (e.g., Unity, OpenSceneGraph). Moreover, the geospatial community extends the capabilities of GIS software, in order to provide sophisticated 3D geospatial visualizations (e.g., CityEngine, vGIS).
The increasing demand for LBS involving MR technologies and advanced animation and motion effects justifies the need for developing apps that combine features from all of the above. Table 2
presents in detail the specifications that are deemed necessary for a project, in order to satisfy this need. The table also summarizes the specifications of “Mergin’ Mode”.
At the operational level, the system architecture, as illustrated below (Figure 3
), includes the distinct users/actors and the data, services and equipment technologies involved:
The repository of virtual geospatial world models, together with data and metadata that document the cultural-tourism resources represented through these models;
Web services and servers that make cultural content available;
Global positioning, video capture and internet connection technologies.
Two distinct software components are identified: (a) the “Mergin’ Mode” manager component, concerning the authority responsible for preparing the cultural content that will be served or deposited in the open repositories, and (b) the “Mergin’ Mode” end-user component (app), concerning the visitor of the site, who will perceive the digital content (virtual world) merged with the real world in an MR environment.
4.1. “Mergin’ Mode” Authoring Tool
To develop a complete set of cultural data for the demonstration of a monument, a custom virtual geospatial world has to be prepared. The following paragraphs describe the minimum components required to develop a virtual geospatial world and those utilized for the purposes of the present demonstration.
4.1.1. 3D Models of the Monument Area in its Current Condition
The 3D model of the area of the monument in its current condition, as captured by high-capacity cameras and processed with photogrammetric techniques, is required to enable the MR functionality, as described in Section 3.2.5.
For the purposes of the present demonstration, an archaeological site located in Apollonia, northern Greece [70
] has been selected. Figure 4
a depicts an aerial photo of the Ottoman bath and Figure 4
b the photogrammetric process performed.
The shots were taken at different heights and distances from the monument. In total, 229 shots with an average overlap of 85%+ were taken. The unmanned aerial vehicle used for the shooting was Parrot’s Anafi, with a 21-Mpixel camera, while the orbits and shots were performed with Pix4D Capture by Geosense [71].
4.1.2. 3D Model of the Monument Area in its Past Condition
3D models of the monument and the archeological findings associated with it may be reconstructed in the form they had in the past by utilizing specialized 3D CAD software. However, for the reconstruction to be successful, the managing authority of the monument needs to possess significant material, analogue or digitized: maps, archival documents, photographs, etc. Figure 5
illustrates some views of the 3D reconstruction of the Ottoman bath.
4.1.3. Other 3D Models
Other 3D models, stable or moving, animate or inanimate, with or without motion effects, may populate the above-mentioned models with the aim to “revive” historical events over the archeological site and enrich the visitor’s experience. For the purposes of the presented demonstration, free tree models were selected (Figure 6
), and an ancient man was constructed.
4.1.4. Creating the Custom Virtual Geospatial World of the Monument
The creation of the custom virtual geospatial world of the monument practically means putting all the previously described 3D models together, along with motion effects where appropriate, to produce a 3D scene of the monument. The “Mergin’ Mode” authoring tool offers easy-to-use tools for importing 3D point clouds, DTM/DSM and 3D textured meshes of the area; specifying or importing vector layers for placing other 3D models of animate or inanimate objects; and specifying motion paths for the latter where needed. Figure 7
provides an overview of the authoring tool user interface.
An indicative workflow for creating a custom virtual geospatial world consists of the following activities (Figure 8):
Importing the DSM/DTM of the area.
Importing the surface texture.
Importing the monument.
Importing, rotating and scaling other 3D models.
Placing models at selected locations.
Implementing motion effects.
Combining all to make a “living geospatial world.”
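Purely for illustration, the layer stack produced by such a workflow could be described along the lines of the following sketch; the field names, file names and coordinates are hypothetical and do not reflect the authoring tool's actual data format.

```javascript
// Illustrative description of a custom virtual geospatial world as a stack of
// geo-referenced thematic layers. All names and fields are hypothetical.
const world = {
  crs: "EPSG:4326", // common spatial reference shared by all layers
  layers: [
    { type: "dsm", source: "site_dsm.tif" },       // terrain surface (workflow step 1)
    { type: "texture", source: "orthophoto.jpg" }, // raster drape (step 2)
    { type: "model", source: "monument.glb", static: true,
      position: { lat: 40.6401, lon: 23.4500 } },  // the monument itself (step 3)
    { type: "model", source: "ancient_man.glb", static: false,
      path: [ { lat: 40.6402, lon: 23.4501 },
              { lat: 40.6405, lon: 23.4503 } ] },  // a moving object with a motion path
  ],
};

// A moving object is any model layer that carries a motion path.
const movingObjects = world.layers.filter((l) => l.type === "model" && !l.static);
```

Keeping every layer in one shared coordinate reference system is what lets static placements and motion paths be combined into a single "living geospatial world."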
Navigation in the monument area and demonstration of the custom virtual geospatial world can be checked at the “Mergin’ Mode” website available online: https://mergin-mode.prieston.tech
(accessed on 8 May 2020).
4.2. The “Mergin’ Mode” End-User Component (App)
As already described (Section 3.2.1
) the objects of a 3D scene of a custom virtual geospatial world can be exported and served via geospatial web services during navigation on site, or may be downloaded offsite for offline use. As the user moves and navigates inside the “influence area” of the site and observes it, and as the position and viewing angle of their smart device change, virtual events and storytelling are appropriately enabled and visualized.
The end-user component is a typical app that handles the satellite navigation system, the gyroscope and the camera of a smart device, merging the real with the virtual to present the content in an MR mode.
Figure 9 illustrates the implementation of a mixed object. The Ottoman bath previously (Section 4.1.1
) captured and photogrammetrically processed is transformed into a virtual object, presented as a textured mesh in the bottom right picture. The other virtual object (a tree) is partially covered by the Ottoman bath. A transparent mask is applied on top of the Ottoman bath (upper right picture).
The 3D model of the Ottoman bath, totally covered by a transparent mask, allows the real one, captured by the visitor’s camera, to be displayed. Both objects, the 3D model and the real one, compose the mixed object, which in turn interacts with the virtual tree and partially covers it (left picture).
Figure 10 provides instances of the end user’s interaction with mixed and virtual objects via the smart device’s touch screen. This is implemented by utilizing the Raycaster function of Three.js (upper left picture). Tapping on the screen, in conjunction with the camera position, creates a raycast which intersects a mixed (lower left picture) or a virtual (upper right picture) object. The user receives informative content and reconstructions (lower right picture) and may also dynamically alter a virtual object’s behavior, e.g., its position and motion.
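To illustrate the idea, the following library-free sketch shows the kind of ray-object intersection test that underlies such tap-based picking: a ray cast from the camera through the tapped screen point is tested against each object's bounding sphere. Three.js's Raycaster performs a more precise, mesh-level version of this against the actual scene graph; all names and values here are illustrative.

```javascript
// Does a ray (origin + unit direction) hit an object's bounding sphere?
function intersectsSphere(rayOrigin, rayDir, center, radius) {
  // Vector from the ray origin to the sphere centre.
  const oc = {
    x: center.x - rayOrigin.x,
    y: center.y - rayOrigin.y,
    z: center.z - rayOrigin.z,
  };
  // Projection of oc onto the (unit) ray direction.
  const t = oc.x * rayDir.x + oc.y * rayDir.y + oc.z * rayDir.z;
  if (t < 0) return false; // object is behind the camera
  // Squared distance from the sphere centre to the ray.
  const d2 = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - t * t;
  return d2 <= radius * radius;
}

// A tap straight ahead hits an object roughly 10 m in front of the camera.
const hit = intersectsSphere(
  { x: 0, y: 0, z: 0 },     // camera position
  { x: 0, y: 0, z: -1 },    // normalized ray direction derived from the tapped pixel
  { x: 0.5, y: 0, z: -10 }, // object's bounding-sphere centre
  1.0                       // bounding-sphere radius in metres
);
```

In an app, the first object whose test succeeds (nearest along the ray) would receive the tap and trigger its informative content.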
Finally, Figure 11
shows the reconstruction of the Ottoman bath in a mergin’ mode: the real Ottoman bath captured by the visitor’s camera is merged with the reconstructed 3D model of the virtual bath.
“Mergin’ Mode” aims at enabling the web of cultural data. Ideally, this can be achieved through the development of a wide network of cultural heritage promoters acting as cultural content providers/emitters. In an ideal case, every cultural authority will publish custom virtual geospatial worlds containing 3D scenes of the archaeological sites with representations and animations. In practical terms, this means that a user located in an area, wishing to visit a destination or having scheduled a route, will receive web services, including information related to available cultural heritage resources, based on his/her current location. The transmission of cultural content is activated by utilizing various techniques, with geofencing [72
] and geotargeting [73
] being the most appropriate ones when the receivers/mobile users are located in the area of the cultural site.
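A minimal sketch of such a geofencing check follows, assuming a circular fence around the site and the standard haversine great-circle distance; the coordinates and radius are hypothetical.

```javascript
// Great-circle distance between two WGS84 points, in metres (haversine formula).
function haversineMetres(a, b) {
  const R = 6371000; // mean Earth radius in metres
  const toRad = (d) => (d * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Content is activated only when the visitor's fix falls inside the fence.
function insideFence(visitor, site) {
  return haversineMetres(visitor, site.center) <= site.radiusMetres;
}

// Hypothetical circular fence around a cultural site.
const site = { center: { lat: 40.6401, lon: 23.4500 }, radiusMetres: 300 };
const nearby = insideFence({ lat: 40.6410, lon: 23.4512 }, site);
const faraway = insideFence({ lat: 40.7400, lon: 23.4500 }, site);
```

An emitter would run this check against incoming position updates (or the app would run it locally against downloaded fence definitions) to decide when to start serving the site's virtual world.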
The above rationale may well be applied to an urban area with numerous sites of cultural interest and therefore numerous potential emitters of cultural content. An end user of the “Mergin’ Mode” app may potentially receive on the screen of his/her smart device historical events and representations of the whole area. The content will unfold in accordance with his/her location within the area, and the scenes will come one after the other based on the order of the visit. The content is distributed to multiple cultural authorities, each one serving the content under its responsibility. The content will be downloaded and executed on the mobile user’s app while passing through the site. The content may also be filtered based on the user profile [74].
Concerning the software and hardware selections made for the system’s development:
Some of the key decisions during system design involved the equipment specifications for the smart app of the visitors of a cultural heritage resource. No equipment is required beyond an average smart device. “Mergin’ Mode” focuses on the capability of the smart device to capture the site with a moderate-capacity camera and to receive, via WiFi, the 3D models of geospatial virtual worlds, rather than to locate the exact position of the end user. Special equipment is only needed for photogrammetric surveying, the development of the DTM/DSM and the 3D modeling of cultural heritage sites. However, these activities are typically carried out by third-party contractors and depend on the needs, the financial capacity and the maturity of the managing authority responsible for the cultural heritage resources.
Finally, an interesting use case might include the development of a “past world finder” application, enabled by the Directory OpenLS service interface [75
]. It might utilize this service to find custom worlds that fit, among others, flexible and/or specified temporal criteria corresponding to the diversity in archeology and history, in contrast with common date formats (e.g., custom worlds near the user before 100 B.C., or worlds of ancient times).
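Such a temporal query could, purely as a sketch, look like the following, with periods encoded as signed years (negative values for B.C.) so that open-ended criteria like "before 100 B.C." are easy to express; the catalogue records and the filtering function are hypothetical.

```javascript
// Hypothetical catalogue of published custom virtual geospatial worlds, each
// with a temporal coverage in signed years (negative = B.C.).
const worlds = [
  { name: "Classical agora", from: -500, to: -300 },
  { name: "Ottoman bath", from: 1400, to: 1900 },
  { name: "Hellenistic harbour", from: -330, to: -30 },
];

// Return the worlds whose period overlaps the requested interval.
function findWorlds(catalogue, from, to) {
  return catalogue.filter((w) => w.from <= to && w.to >= from);
}

// "Worlds before 100 B.C.": an open-ended lower bound.
const ancient = findWorlds(worlds, -Infinity, -100);
```

A real "past world finder" would combine such a temporal filter with the spatial filtering of the Directory OpenLS service (worlds near the user), but the overlap test above captures the essential flexibility that common date formats lack.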