A Heterogeneous Distributed Virtual Geographic Environment — Potential Application in Spatiotemporal Behavior Experiments

Due to their strong immersion and real-time interactivity, helmet-mounted virtual reality (VR) devices are becoming increasingly popular. Based on these devices, an immersive virtual geographic environment (VGE) provides a promising method for research into crowd behavior in an emergency. However, the current cheaper helmet-mounted VR devices are not popular enough, and will continue to coexist with personal computer (PC)-based systems for a long time. Therefore, a heterogeneous distributed virtual geographic environment (HDVGE) could be a feasible solution to the heterogeneous problems caused by various types of clients, and support the implementation of spatiotemporal crowd behavior experiments with large numbers of concurrent participants. In this study, we developed an HDVGE framework, and put forward a set of design principles to define the similarities between the real world and the VGE. We discussed the HDVGE architecture, and proposed an abstract interaction layer, a protocol-based interaction algorithm, and an adjusted dead reckoning algorithm to solve the heterogeneous distributed problems. We then implemented an HDVGE prototype system focusing on subway fire evacuation experiments. Two types of clients are considered in the system: PC, and all-in-one VR. Finally, we evaluated the performances of the prototype system and the key algorithms. The results showed that in a low-latency local area network (LAN) environment, the prototype system can smoothly support 90 concurrent users consisting of PC and all-in-one VR clients. HDVGE provides a feasible solution for studying not only spatiotemporal crowd behaviors in normal conditions, but also evacuation behaviors in emergency conditions such as fires and earthquakes. HDVGE could also serve as a new means of obtaining observational data about individual and group behavior in support of human geography research.


Background
Geovisualization started with two-dimensional (2D) mapping, and later developed to include three-dimensional (3D) interactive rendering. Starting from the mid-1990s, with the advent of the virtual reality modeling language [1], the applications of the virtual reality geographic information system (VRGIS) in various fields have been rapidly developing [2]. VRGIS offers an immersive GIS environment, and could be considered an advanced form of geovisualization [2]. Since VRGIS needs to process massive amounts of GIS data, and render VR graphics at the same time, VRGIS applications at that time could only run on high-end workstations, and sometimes even required supercomputers. A virtual geographic environment (VGE) is a VRGIS-based platform for multidimensional visualization, dynamic process simulation, and geocollaboration [3,4]. The second wave of VR has brought better graphics hardware and cheaper helmet-mounted displays (HMDs), greatly improving the availability and affordability of virtual reality (VR) technology. This will also promote an upgrade of VRGIS, and facilitate the development of new theories and methods of geovisualization, geoanalysis, and geocollaboration.
VR-based virtual experiments have been widely used in spatiotemporal behavioral research. In the field of cognitive behavior research, virtual behavioral experiments have been used to study spatial cognition, path selection, obstacle avoidance, and other factors [5–9]. In the game industry, in order to improve game design and enhance the gaming experience, data mining and visualization have been used to analyze users' spatiotemporal behaviors in massively multiplayer online role-playing games (MMORPGs) [10–12]. In education and training, virtual reality simulators are used to train users' skills and enhance learning effectiveness relating to flight [13], driving [14], fire escape [15], and earthquake evacuation [16]. In the field of urban design and planning, the impact of the urban environment upon pedestrian decision-making behavior [17], pre-occupancy assessment [18], and guidance layout [19] also need the support of observational data about users' spatiotemporal behaviors, which could be easily acquired in the virtual environment.
In recent years, crowd behavior research methods have fallen mainly into four groups. First, computer vision technology has been used to extract pedestrian motion trajectories from surveillance video and other multimedia data. With trajectories, researchers can identify and analyze pedestrian behavioral patterns and the characteristics of spatial-temporal motion [20,21]. Second, the classic social force model [22,23] and field model [24] have been used to model, simulate, and analyze pedestrian behavior in specific situations [25]. Third, controlled real-world crowd experiments [26] have been designed with particular research goals in mind, in order to obtain observations that faithfully represent pedestrian movement patterns in the real world. Fourth, virtual reality experiments provide a new methodology for crowd behavior research. The data collected in a virtual environment can be used not only to validate and calibrate existing models [27], but also for further data mining.
Real-world evacuation experiments are difficult to implement for two reasons. First, it is difficult to resolve the safety issues in evacuation experiments, which would involve the participation of real people. Second, it is difficult to capture and quantitatively describe the spatiotemporal behavior observed in emergency scenarios. Virtual geographical experiments (VGExs) are becoming a promising research method. Immersed in a helmet-mounted VGE, the user not only has a strong sense of presence and immersion, but can flexibly control an avatar to evacuate in an emergency scene. This new form of experiment not only avoids the aforementioned safety problems, but can also faithfully capture the behavioral characteristics of real people. This would greatly facilitate behavior analysis and rule discovery. However, most of the current emergency drill systems have been designed only for single users [15,28]. There are few multi-user collaborative experimental platforms to support the study of crowd behavior.

Related Works
Helmet-mounted VR devices can provide participants with a strong sense of immersion and real-time interactivity, which can enhance the effectiveness of existing research. Moussaïd et al. [29] constructed a desktop personal computer (PC)-based multi-person collaborative virtual environment. They carried out crowd movement experiments in both real and virtual scenes. They also proved the feasibility of using a shared 3D virtual environment to carry out crowd experiments by involving real people. A cave automatic virtual environment (CAVE) is an immersive virtual reality environment [...] the capability of this prototype system to support heterogeneous distributed virtual experiments for spatiotemporal crowd behavior experiments. The proposed system can serve as a data collection tool for further behavior analysis. In Section 5, we discuss the key issues in the experiments. In the end, we summarize the conclusions, and highlight future work.

Conceptual Framework
The human-environment relationship refers to the relationship between the existence and development of human society, or human activities, and the geographical environment [36]. The geographical environment here is considered to be the entire geographic environment, encompassing natural and human elements. They are intertwined in accordance with certain rules, and are closely integrated. The literature [37] suggests that VGEx can be categorized into virtual natural geographic experiments and virtual human geographic experiments. Virtual natural and human elements constitute the "environment", while individuals, groups, and society in the VGE constitute the "human" in a virtual experiment. Due to network and other technological limitations, collaborative VGEx, even based on distributed technology, cannot normally support the number of concurrent users of a real human society. However, virtual crowd experiments at the individual or group level could be feasible. According to the human-environment relationship theory [37], we categorize the elements of a VGEx into three types: human, entity, and environment. These elements and their relationships are shown in Figure 1.
ISPRS Int. J. Geo-Inf. 2018, 7, x FOR PEER REVIEW

• Virtual Human: HDVGE regards the human as the core, and emphasizes the subjectivity of human beings. The virtual human in the HDVGE includes avatars controlled by real users, and agents driven by computer programs. They can interact with other elements as well as with each other, such as interactions between avatars, or between avatars and agents. Multiple virtual humans form a group through social relations, roles, and tasks. Massive groups in collective activities form a crowd.

• Environment: The environment is the background of the VGEx, and is the static part of the VGE. We mainly use 3D models to simulate the real geographical environment, including terrain, vegetation, architecture, etc., which constitute the physical part of the VGE. Similar to the real environment, it can be perceived and recognized. At the same time, different environments will constrain a virtual human's behaviors.

• Entity: Entity refers to the dynamic variable entities in VGE. They have the ability to interact with humans. They can not only be perceived by humans, they can also provide feedback to the humans. For example, small obstacles in the evacuation process, such as desks and chairs, can affect the route choice of virtual humans; meanwhile, their state variables can also be changed by the humans. An essential issue in HDVGE is how the consistency of entity state variables in heterogeneous clients can be maintained.
In addition, HDVGE includes an important non-physical element: the event. An event can not only express natural and human geographic processes, it can also express their interactions, such as the transfer of crowds in the course of mountain floods, and the evacuation of crowds during subway fires. Events are represented by the updating of attributes of entities and environments. They can be perceived by humans, and they can also drive people to respond. From the experiment organizer's point of view, virtual instruments are their tools for observing and documenting the spatiotemporal states of various elements in virtual experiments.

HDVGE Design Principles
In the literature [38], the similarities in human behavior between the virtual and real worlds have been studied, and a mapping principle has been proposed. As the research purposes and objects differ from one virtual world to another, a research frame based on four aspects has been put forward: group size, traditional controls and independent variables, contextual and social architecture factors, and directionality. Similarly, the virtual-real geographic similarity principles were also established from four aspects [37]: geographic space-time, geographic attribute, geographic attributes group, and geographic spatial cognition. The higher the similarity between the virtual and real, the closer the user behavior in the virtual experiment is to the real world. From the above principles, we propose the following HDVGE design principles.

• Similarity in geographic space-time: The time and space in VGE should be similar to those in the real geographic environment (RGE). That is, the spatial scale and time scale in VGE should be identical to the real ones. This similarity provides principles for modeling the 3D virtual environment and process simulation. It requires strict reference to the size and proportion of real space when modeling VGE. Additionally, the time scale of the VGE cannot be changed.

• Similarity in spatial attributes: The spatial attributes and distributions of entities and processes in the VGE should be similar to those in the RGE. VGEx includes the simulation of processes in natural geography and human geography. This principle stipulates that the modeling and presentation of objects and geographic processes should be similar to reality.

• Similarity in group composition: Observation of pedestrians in public places [39] shows that at least 70% of pedestrians in a given population are not traveling alone, but walk in groups. In a VGEx for spatiotemporal behavior research, group composition and member attributes must be similar to reality. Due to the limitations of the 3D modeling, VGEx cannot provide every user with an elaborate avatar. We use avatar models that are easy to discern to represent the members of the group, while those who are outside the group use an avatar with a different appearance. This similarity provides the basis for group modeling, observation, recording, and analysis.

• Similarity in perception: The subject of a VGEx perceives the environment, entities, other subjects' spatiotemporal positions, attributes, and group relations from a first-person perspective. The results of this perception process should be similar to the perception results of an RGE. For example, during a fire, the subjects' perceptions of the evacuating crowd, and their own companions in the VGEx, should be similar to those in an actual fire environment. This similarity could stimulate the subject to behave similarly to reality. Therefore, this similarity provides the framework and principles for virtual scene design, process simulation, and interactions between multiple subjects.
In addition, HDVGE designs are also constrained by factors such as VR device performance, usability, and user experience [40]. The design of an HDVGE should consider these aspects in an integrated manner. The ultimate design would reflect a compromise between the various factors mentioned above.

HDVGE Architecture
According to the framework and design principles described in Section 2, which consider user experience, system performance, and research objectives, we propose a detailed architecture for an HDVGE, as shown in Figure 2.

• Collaborative Server
A collaborative server is primarily responsible for providing network services to the HDVGE. The VGEx for spatiotemporal behavior research often requires multiple participants distributed in different geographic locations. With this in mind, we adopted an authoritative server-based architecture, instead of a peer-to-peer one. The server supports deployment in heterogeneous network environments, in order to meet the experimental needs in either a LAN or wide area network (WAN). The authoritative server maintains the states of all of the objects in the virtual scene, and is responsible for computing and updating them. Each operation on a client requires sending a synchronization request to the server. The server periodically performs object status verification, computation, and updating, and then sends the latest statuses back to the target client. Because the performance of heterogeneous clients varies enormously, this architecture shifts computation stress from the client side to the server side, enabling clients to focus on the high-fidelity rendering of HDVGE. In addition, the architecture is easy to scale, given the possibility of large numbers of concurrent users.
The collaborative server can provide a variety of services. The transmission control protocol (TCP) service is mainly used for frequently updated user data, such as the location, orientation, and action of the avatar. The hyper text transfer protocol (HTTP) service is mainly used for transactional network requests, such as user authentication and background management. Voice communication between users in a group is achieved through voice services. The database is mainly used to store structured data during the experiment, while the storage cluster is mainly used to store unstructured data, and data that needs to be serialized. The logging module periodically records the status and behavior of all of the users in the experiment. Due to the high real-time requirements and interactivity, when there are a large number of concurrent users, the load balancing module is responsible for scheduling computing resources and storage, and distributing the tasks according to the actual situation.
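The request, verify, update, and broadcast cycle described for the authoritative server can be sketched as follows. This is a minimal in-memory illustration, not the prototype's implementation: the class names, the `max_step` plausibility check, and the snapshot format are all assumptions introduced here, and no real networking is involved.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarState:
    x: float
    y: float
    heading: float = 0.0


@dataclass
class AuthoritativeServer:
    """In-memory stand-in for the collaborative server (hypothetical)."""
    max_step: float = 2.0  # assumed limit on movement accepted per tick
    states: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def join(self, client_id, state):
        self.states[client_id] = state

    def request_update(self, client_id, new_state):
        # Clients never write shared state directly; they only queue a request.
        self.pending.append((client_id, new_state))

    def tick(self):
        """Verify queued requests, apply the valid ones, return a snapshot."""
        for client_id, new in self.pending:
            old = self.states.get(client_id)
            if old is None:
                continue  # request from an unknown client is ignored
            dist = ((new.x - old.x) ** 2 + (new.y - old.y) ** 2) ** 0.5
            if dist <= self.max_step:
                self.states[client_id] = new
        self.pending.clear()
        # Snapshot that would be sent back to the clients.
        return {cid: (s.x, s.y, s.heading) for cid, s in self.states.items()}
```

In this sketch a client that requests an implausibly large jump simply keeps its previous server-side state, which is one simple way an authoritative server can keep heterogeneous clients consistent.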

• Heterogeneous Clients
Although all-in-one VR offers a better sense of immersion, its availability and performance remain limited. As large numbers of participants are required in spatiotemporal crowd behavior experiments, the system has been designed to be able to interact with heterogeneous clients so that PC users can be included. All-in-one VR uses high-precision sensors and gamepads as input devices, and HMD as the output device, while a PC uses the traditional mouse and keyboard as input devices, and the monitor as the output device. To accommodate interactions between heterogeneous clients, we designed a device-oriented graphic user interface (DGUI). On a PC, it appears as a screen-space user interface, while an all-in-one VR has a view-centered user interface that follows the user's head movement. The DGUI is mainly used to display user-related information. The interaction algorithm for heterogeneous clients will be described in detail in Section 3.2.
As an HDVGE is a common workspace shared by multiple users, the 3D modeling of the environment must be sufficiently photorealistic. In light of the limited computing and rendering capabilities of the heterogeneous clients, 3D models and scene rendering must also be optimized to meet the requirements of user experience and system availability. A heterogeneous distributed virtual environment is a trade-off between high fidelity and availability. As an important reference point for users to perceive the virtual environment, avatars' skin, bones, and animations also need to meet the above requirements. We distinguish between different groups of avatars using easily identifiable colors of clothing. The avatars' skeletal animations meet the non-verbal communication needs between users in the HDVGE.
We add a logging module to the heterogeneous client, which is responsible for recording the local user's locations, orientations, actions, and other status information. This module is different from the server-side logging module, which records the status data for all of the users. Although the log data have some degree of redundancy, this redundancy increases the reliability of data logging.
The network synchronization module is used mainly to communicate with the server. Data transmission is based on TCP. Heterogeneous distributed clients perform collaborative tasks through the network. Real-time responsiveness and interactivity are important factors that affect the user experience. The architecture of the HDVGE uses the ideas of authoritative server and dumb client. All of the clients send their own status changes to the server, and the collaborative server then forwards them to each client as requested. This architecture can reduce the client's computing pressure and avoid cheating. However, the disadvantage is that since all of the data sent and received must first go through the collaborative server, the overall performance is greatly affected by the speed of the network. Therefore, the client prediction algorithm is very important. The algorithm will be described in detail in Section 3.2.

Abstract Interaction Layer for Heterogeneous Clients
Interactive devices and methods vary greatly between HDVGE clients. The interactive devices of PC clients include the keyboard, mouse, and monitor. PC users use the mouse to control the viewpoint rotation, and use the keyboard to control the viewpoint movement, trigger skeletal animation, and perform other actions. The interactive devices of all-in-one VR clients mainly include the HMD, gamepad, and stereo screen. The HMD is equipped with a high-precision nine-axis sensor, which is a combination of three sensors: a three-axis accelerometer, a three-axis gyroscope, and a three-axis electronic compass. Among them, the HMD mainly uses the three-axis gyroscope to measure the attitude parameters of the helmet, and then reconstruct the user's 3D motion. That is, the user controls the rotation of the viewpoint using the HMD, controls the viewpoint movement using the gamepad, and triggers the skeleton animation using buttons. Although the devices and methods vary considerably, the ultimate goals are the same. Therefore, in order to be compatible with different interactions between heterogeneous clients, we designed an abstract interaction layer (AIL), so that the different interactive methods can achieve the same results. Figure 3 shows a typical interaction process. The left and right sides of the figure represent the interaction process of the all-in-one VR and the PC client, respectively.
The AIL is a collection of predefined actions. It is responsible for converting interactions from heterogeneous clients into standard actions. First, we define the standard actions that can be recognized by the system. Second, we establish a mapping relationship between the various types of operations from heterogeneous clients and AIL standard actions. This mapping relationship bridges the differences between heterogeneous clients, and guarantees that different operations from heterogeneous clients can produce the same effect. Third, according to the specific interaction action, each client computes and updates the rendering scene in its own heterogeneous computing platform. Finally, the rendering results are sent to the client's display device.
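The mapping from device-specific operations to standard actions can be illustrated with a small lookup table. The sketch below is only an assumed shape for such a layer: the action names, the raw event names, and the `to_standard_action` helper are hypothetical and are not taken from the prototype.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical set of standard actions recognized by the system."""
    ROTATE_VIEW = "rotate_view"
    MOVE = "move"
    PLAY_ANIMATION = "play_animation"


# (client type, raw device event) -> standard action; all names are assumed.
AIL_MAPPING = {
    ("pc", "mouse_move"): Action.ROTATE_VIEW,
    ("pc", "key_move"): Action.MOVE,
    ("pc", "key_animation"): Action.PLAY_ANIMATION,
    ("vr", "hmd_rotation"): Action.ROTATE_VIEW,
    ("vr", "gamepad_stick"): Action.MOVE,
    ("vr", "gamepad_button"): Action.PLAY_ANIMATION,
}


def to_standard_action(client_type, raw_event, payload):
    """Convert a device-specific event into a device-independent action."""
    try:
        action = AIL_MAPPING[(client_type, raw_event)]
    except KeyError:
        raise ValueError(f"no AIL mapping for {client_type}/{raw_event}") from None
    return {"action": action.value, "payload": payload}
```

With this table, a mouse movement on a PC and a head rotation on an all-in-one VR device that carry the same payload produce identical standard actions, which is the property the AIL is meant to guarantee.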

Protocol-Based Interactions between Heterogeneous Clients
The standard actions defined in AIL also provide a common language for interactions between the heterogeneous clients. For heterogeneous clients to communicate and interact with each other, we propose protocol-based interactions to implement the collaboration between the heterogeneous clients. The protocol is a data structure that the system specifies in advance for data exchange between the clients. It can be understood and applied by each client in its own form, thereby masking the differences between clients.
A typical data transmission process based on a custom protocol is shown in Figure 4. First, a user of an all-in-one VR changes their status locally. Then, based on the type of heterogeneous client, this interaction is mapped to standard actions by the AIL. Next, the system uses the custom protocol to encode the standard action. To improve network transmission efficiency, the encoded result is converted to binary form before being transmitted to the server. After receiving the status update from client 1, the server calculates and processes the status information, and then forwards it to other clients in the current scene. Let's take PC client 2 as an example. After receiving the binary data, client 2 first converts it to text data and decodes it. Then, it continues to restore the interaction of client 1 according to the custom protocol, and updates the local client 1 status. Finally, based on the latest status of client 1, client 2 performs the calculation, rendering, and output display, and responds according to its needs and feedback. All client-side interactions are similar to this.
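The encode, convert to binary, transmit, convert back to text, and decode sequence can be sketched as a round trip. The paper does not give the layout of its custom protocol, so the JSON text form, the field names, and the 4-byte length prefix used here are assumptions chosen only to make the steps concrete.

```python
import json
import struct


def encode_message(client_id, action, payload):
    """Encode a standard action as text, then convert it to binary."""
    text = json.dumps(
        {"client": client_id, "action": action, "payload": payload},
        separators=(",", ":"),
        sort_keys=True,
    )
    body = text.encode("utf-8")
    # A 4-byte big-endian length prefix lets a receiver frame messages
    # on a TCP byte stream (an assumed framing, not the paper's).
    return struct.pack(">I", len(body)) + body


def decode_message(data):
    """Convert binary data back to text and decode the standard action."""
    (length,) = struct.unpack(">I", data[:4])
    body = data[4:4 + length]
    if len(body) != length:
        raise ValueError("incomplete message")
    return json.loads(body.decode("utf-8"))
```

Because both client types decode to the same dictionary, a PC client can replay an action that originated on an all-in-one VR client without knowing which device produced it.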

Adjusted Dead Reckoning Algorithm for Client Prediction
The network environment in which the distributed client is located to a large extent determines the system's real-time experience. Network latency is a key issue affecting the overall performance of the system. In an HDVGE, not only does the status of the virtual avatars in the client need to be synchronized, but many other scene elements also require consistent maintenance through the server, such as interactions between avatars and entities, user entry and exit events, and the instantiation and deletion of networked entities. Virtual scene maintenance can ensure the consistency of scene, entities, avatars, and other elements in each client to avoid user perception differences caused by distributed clients.
A typical client state synchronization process of an HDVGE is shown in Figure 5. We assume that the network latency for each client is consistent throughout the process. The initial position of client 1 is (10, 10), and it moves one unit along the x-axis. Client 1 sends a new status to the server while moving the local avatar. The data reaches the server after time t1. Then, the server receives the new status sent by client 1, and starts to broadcast it to the other clients. The broadcast data reaches client 1 after time t2, and reaches client 2 after time t3. At this point, client 2 can see the new status of client 1. In the process, the network delay of client 1 is (t1 + t2), while the new status of client 1 reaches client 2 after (t1 + t3). That is, the status of client 1 as seen by client 2 is actually client 1's status from (t1 + t3) earlier. An out-of-sync status between distributed clients can cause serious problems. Assume that in the HDVGE for a crowd evacuation experiment, users need to control the avatars to escape quickly from the scene of the fire. Client 1 and client 2 are in one group, and need to escape together. Client 2 sees the status of client 1 from (t1 + t3) earlier, which appears slightly behind client 2. Thus, client 2 stops and waits for client 1. However, client 1 has in fact already moved ahead. At this point, client 1 sees client 2 falling behind. Therefore, client 1, in turn, stops and waits for client 2. The out-of-sync status between distributed clients thus eventually makes both client 1 and client 2 stop moving. We need to predict the state at the next moment based on the client's current state, so that the avatars of different clients appear to be synchronized. We also need to minimize deviations between the true and the predicted values.
There are many algorithms that predict a moving object's future states based on the latest state. The most widely used are dead reckoning (DR) and the Kalman filter (KF). DR is an algorithm to predict the motion parameters of a moving object. It predicts the state of an object based on the latest position, velocity, and acceleration, and is widely used in the fields of aviation and navigation [41]. Curtiss Murphy [42] uses projective velocity blending, which mixes the newly acquired velocity with the current velocity, and predicts the position combined with the time variable. The KF is an algorithm that estimates the state of a system from measured data. It is commonly used in guidance, navigation, and control systems. In computer vision applications, the KF is used for object tracking to predict an object's future location, account for noise in an object's detected location, and help associate multiple objects with their corresponding tracks [43]. Comparing the two, the DR algorithm is more widely used, and has been applied in networked games [44,45]. Therefore, we adjusted the DR algorithm to provide client-side predictions, and compared its performance with that of the KF.
The algorithm proposed by Curtiss Murphy [42] uses a fixed update rate to implement the DR algorithm. In HDVGE, the client uploads the data to the server only when its status data changes.
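A minimal sketch of dead reckoning with velocity blending under irregular, change-driven updates is given below. It is not the authors' exact ADR algorithm, whose formulas are not reproduced in this text. The names `RemoteAvatar`, `on_update`, and `predict` are hypothetical, the velocity is estimated from the actual time elapsed between two updates rather than from a fixed update rate, and the blending factor of 0.3 is borrowed from the evaluation section.

```python
from dataclasses import dataclass

BLEND = 0.3  # weight given to the newly measured velocity (assumed from the evaluation)


@dataclass
class RemoteAvatar:
    pos: tuple   # position carried by the last received update
    vel: tuple   # current blended velocity estimate
    t: float     # timestamp of the last received update, in seconds


def on_update(avatar, new_pos, t_new, blend=BLEND):
    """Handle an update that may arrive after an arbitrary interval."""
    dt = t_new - avatar.t
    if dt <= 0:
        return avatar  # ignore stale or duplicate packets
    # Velocity implied by the two most recent updates, using the real elapsed time.
    measured = tuple((n - p) / dt for n, p in zip(new_pos, avatar.pos))
    # Blend: keep most of the current velocity, mix in part of the new measurement.
    avatar.vel = tuple((1 - blend) * v + blend * m
                       for v, m in zip(avatar.vel, measured))
    avatar.pos, avatar.t = tuple(new_pos), t_new
    return avatar


def predict(avatar, t_now):
    """Extrapolate linearly from the last update to the current render time."""
    dt = max(0.0, t_now - avatar.t)
    return tuple(p + v * dt for p, v in zip(avatar.pos, avatar.vel))


# Client 1 starts at (10, 10) and is next reported at (11, 10) after 0.5 s.
avatar = RemoteAvatar(pos=(10.0, 10.0), vel=(0.0, 0.0), t=0.0)
on_update(avatar, (11.0, 10.0), 0.5)
guess = predict(avatar, 1.0)  # where to draw client 1 half a second later
```

With a small blending factor the predicted motion changes gradually, which smooths the displayed trajectory at the cost of reacting more slowly to sudden changes in speed or direction.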
[...] A, B, C, and D, and located at the two ends of the platform. There are round pillars in the middle of the platform. A fire breaks out on one side of the subway; thus, two exits on one end are blocked. Considering the weak performance of the all-in-one VR device, the prototype system does not use real-time lighting. Instead, the system uses some area light sources and bakes the lighting into light maps. In order to improve the fidelity of the fire scene, the system uses particle systems to simulate heavy black smoke. At the same time, the system establishes weak lighting to create a low-visibility scene. Accompanied by a sharp fire alarm, the system creates a sense of urgency for the user both visually and audibly. The pillars in the middle of the platform have a clear exit-point marking, which is self-illuminated to ensure clarity.
The system uses low-precision 3D models as avatars. In order to reduce the amount of data that needs to be processed when rendering a large number of avatars in the VR system, we have simplified the mesh of the high-precision avatar model while keeping the texture, skinning, and skeleton information. The number of triangles in each of the low-precision models is between 600 and 700, but their appearance and action features are comparable to those of the high-precision models. To make it easy to distinguish an avatar's group information, we designed the coat texture of avatars of the same group to have the same color. We also designed three skeletal animations for each avatar, where "run" is used to represent the user's escape animation, "idle" is used to indicate that the user is not moving, and "greet" is used for non-verbal communication between the group members. We have implemented two different modes of interaction, all-in-one VR and PC, both of which use the first-person perspective. We use the device-oriented graphical user interface (GUI) to display the local user group, flag, correct exit, evacuation countdown, system prompts, and other news. Both types of clients record the position, orientation, movement, and other status data of the avatar at 0.3-s intervals.
Figure 6 is a prototype system diagram. Since the LAN environment is controlled and makes it easy to simulate complex conditions such as network latency, the prototype system server is deployed in the LAN. At the same time, we implement HTTP services to handle user authentication, configuration parameters, and so on. We define in advance the data structures of the requests and responses exchanged between server and client, which are used to transfer data such as status and messages between the local and remote users. The server records the status, actions, and events of all of the users in the virtual scene, according to the data sent by the users.

Performance Evaluation of Key Algorithms
Since the AIL and the protocol-based interaction algorithm for heterogeneous clients cannot be evaluated numerically, we validate them through the implementation of the prototype system: the fact that the system works demonstrates their effectiveness. Therefore, only the adjusted dead reckoning (ADR) algorithm is evaluated here.

• Adjusted dead reckoning algorithm
To test the accuracy of the algorithm, we conducted a small-scale crowd evacuation experiment, in which participants' trajectory data were collected in a virtual environment. Five participants were invited to take part in the evacuation experiment, with three repetitions using the PC clients in the LAN. All of the users entered the subway scene at exactly the same time, and were informed of the target exit in advance. The users were instructed to navigate from the platform center to the target exit, and needed to climb staircases and pass through gates on the way. The trajectories recorded by the prototype system are time series data, with each record containing the following attributes: UserID, Timestamp, Position X, Position Y, Position Z, and Action. The time interval between samples is 0.3 s.
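The trajectory record described above could be represented as in the following sketch. The attribute list comes from the text; everything else is assumed for illustration, including CSV as the storage format, the column names, the sample values, and the use of animation names as `Action` values.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class TrajectoryRecord:
    user_id: str
    timestamp: float  # seconds since the start of the run (assumed unit)
    x: float
    y: float
    z: float
    action: str       # assumed to hold values such as "run", "idle", or "greet"


def parse_records(csv_text):
    """Parse CSV text with one trajectory sample per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        TrajectoryRecord(
            row["UserID"],
            float(row["Timestamp"]),
            float(row["PositionX"]),
            float(row["PositionY"]),
            float(row["PositionZ"]),
            row["Action"],
        )
        for row in reader
    ]


# Two made-up samples, 0.3 s apart, for a single user.
sample = (
    "UserID,Timestamp,PositionX,PositionY,PositionZ,Action\n"
    "u1,0.0,10.0,0.0,10.0,idle\n"
    "u1,0.3,10.5,0.0,10.0,run\n"
)
records = parse_records(sample)
```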
Taking the trajectories of user activities collected in the experiment as an example, we implement the KF with a constant acceleration model, and the ADR algorithm, for position prediction. In order to study the running time and prediction accuracy of the algorithms under different update frequencies, we used a uniform distribution around fixed intervals to simulate the update frequency with some randomness. We use the total time consumed by the prediction algorithm to evaluate its time cost, and use the root mean squared error (RMSE) to measure the deviation of the predicted values from the true values.
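The two evaluation ingredients named above, RMSE and randomized update intervals, can be sketched as follows. This is an illustrative reading rather than the authors' test harness: the function names and the `base_interval` and `jitter` parameters are assumptions, and the error for each sample is taken to be the Euclidean distance between the predicted and true positions.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean squared Euclidean error between two lists of positions."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        sum((p - a) ** 2 for p, a in zip(pred, true))
        for pred, true in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


def jittered_update_times(duration, base_interval, jitter, seed=0):
    """Timestamps whose spacing is drawn uniformly from base_interval +/- jitter."""
    if not 0 <= jitter < base_interval:
        raise ValueError("jitter must be non-negative and smaller than base_interval")
    rng = random.Random(seed)  # seeded so that a test run can be repeated
    times, t = [], 0.0
    while t < duration:
        times.append(t)
        t += rng.uniform(base_interval - jitter, base_interval + jitter)
    return times


error = rmse([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)], [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
times = jittered_update_times(duration=10.0, base_interval=0.3, jitter=0.1)
```

Each predictor would then be fed only the samples nearest to `times`, asked to predict the positions at the remaining 0.3-s samples, and scored with `rmse` against the recorded trajectory.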
The test results are shown in Figure 7. In this test, we implement the ADR algorithm with a velocity blending factor of 0.3. This means that the predicted value receives higher priority than the latest updated position data. In terms of accuracy (Figure 7a), the prediction error of the ADR algorithm is smaller than that of the KF algorithm by an average of 0.961 m, or 31.76%. In terms of running time, a single run of ADR is almost negligible, while the KF algorithm requires an average of 0.23 ms. This is because the KF needs to update the state transition model and covariance model in each time step, and performs a series of matrix multiplications. Therefore, we recommend that the ADR algorithm be considered in 3D rendering programs that require high real-time performance. Figure 7b shows the prediction results of the ADR algorithm and the KF algorithm. The XZ plane is the user's activity plane, and the positive Y axis represents the height value. The error in the trajectory predicted by the KF algorithm increased after the avatar moved vertically. The trajectory predicted by the ADR algorithm is less accurate when the avatar's acceleration changes, but the error is quite small elsewhere. This is more in line with the actual situation.

System Overall Performance Test
To verify the usability of the prototype system, we conducted an overall system performance test. Here, an important question we aimed to address is: with heterogeneous clients of general configuration, how many concurrent users can the system support in distributed virtual experiments? We assume that the hardware and networks of all of the clients are the same. The performance metrics of the prototype system are mainly influenced by the number of concurrent users and the lag of packets. Therefore, we take them as the two factors of a factorial experiment. The number of concurrent users has five levels, namely, 10, 30, 50, 70, and 90. In order to simulate different network environments, we used software to add delays to the packets sent and received. The packet lag has four levels, namely, 0, 10, 20, and 30 milliseconds; that is, each replicate of the experiment contains all 20 treatments, and each treatment contains five replicates.
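The factorial design described above can be enumerated in a few lines. The factor levels and the number of replicates come from the text; the variable names are chosen for this sketch only.

```python
from itertools import product

USER_LEVELS = [10, 30, 50, 70, 90]  # number of concurrent users
LAG_LEVELS_MS = [0, 10, 20, 30]     # packet lag added by software, in milliseconds
REPLICATES = 5

# Every combination of the two factors is one treatment.
treatments = list(product(USER_LEVELS, LAG_LEVELS_MS))

# Each treatment is repeated REPLICATES times.
runs = [
    (users, lag_ms, replicate)
    for users, lag_ms in treatments
    for replicate in range(1, REPLICATES + 1)
]

print(len(treatments), len(runs))  # prints: 20 100
```

Five user levels times four lag levels give the 20 treatments mentioned above, and five replicates per treatment give 100 test runs in total.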
The prototype system performance indicators include resource consumption, rendering pressure, and network latency. Two participants were invited to take part in the performance test, where one participant used a PC client, while the other used an all-in-one VR client. To simulate a large number of concurrent users, we developed a user agent that can communicate with the server in real time, update the avatar's location randomly, and upload and download the avatars' location and status. One end of the subway evacuation route was filled with thick smoke. When the test started, the user agent first generated a specified number of simulated users. They were evenly distributed in the scene and moved randomly. The two participants, who used a PC and a VR client, respectively, were informed of the correct exit in advance. They controlled their respective avatars to navigate through the crowd, and finally reached the other end of the subway platform, which was filled with smoke (Figure 8). The server and both types of clients recorded the resource consumption while the program was running. The clients additionally recorded the frame rate and the overall network delay data. We took the average of each indicator over each test run as the test result.

Data Analysis
As hardware and computing capacity vary greatly between servers, PCs, and all-in-one VR, the evaluation indicators are also different, and we will discuss them separately.

• Server side
The resource consumption of the prototype system, with different numbers of concurrent users on the server side, is shown in Figure 9a-c. The system resource consumption increases with the number of concurrent users. With 90 concurrent users, this process takes up to 10% of the CPU. Memory usage increases significantly with the number of concurrent users, with the maximum being 320 MB. Network sent traffic (up to 3500 KB per second) is several times higher than network received traffic (up to 500 KB per second). This is because after the server receives a user update, it sends the update to all of the other users. When the packet lag is 0, both the received and the sent network traffic reach their highest levels. One possible reason is that added latency causes data packets to stall in the network, so that they do not reach the server's processing flow on time. Some packets are discarded due to timeout, and are never processed. This results in a reduction in the total network traffic. In general, the system's CPU usage is not high. This process does not occupy much of the server's resources, and more concurrent users can be supported.

• PC side
Figure 10 shows the resource consumption of the PC client under different numbers of concurrent users and packet lags. Figure 10a shows that CPU usage does not increase significantly as concurrent users or packet lags increase. Figure 10b shows that memory usage is mainly affected by the number of concurrent users. Since the client only needs to send the status data of the local user, the network sent traffic is stable. On the other hand, the client needs to receive the status data of all other remote users, so the received traffic in Figure 10c grows as the number of concurrent users increases. However, an increase in packet lags causes network congestion, and some data is discarded. As the number of concurrent users increases, the FPS rate in Figure 10d declines noticeably, but it remains at a very high level. It can be seen from Figure 10e that in the absence of packet lags, the increase in the number of concurrent users has no effect on the network delay on the PC side. However, once packet lag is introduced, the impact of both factors on the overall network delay is approximately logarithmic. Overall, the prototype system is stable on this medium-configured computer.

• VR side
Since the all-in-one PicoVR system runs Android, its performance is measured in a slightly different way. In the following indicators, CPU Time refers to the average of the total time consumed in the most recent 30 frames. The larger the value, the greater the overall pressure on the device. Memory refers to the used heap size. The frame rate is capped at 60 FPS by the device's software development kit (SDK).

We can see from Figure 11a that the number of concurrent users and packet lags have no significant impact on CPU time consumption. The used heap size in Figure 11b increases with concurrent users, and has no obvious relationship with packet lag. The network traffic in Figure 11c is similar to that of the PC client. The FPS rate in Figure 11d shows a decreasing trend with the number of concurrent users, but has no obvious relationship with packet lag. This shows that the packet lag does not affect the FPS. That is, packet lag can lead to poor interactivity, but it does not affect the real-time user experience. Figure 11e shows that the impact of these two factors on the overall network delay in the VR client is also logarithmic.
In summary, the overall performance of the server, PC client, and all-in-one VR is stable and not demanding on system resources. The number of concurrent users does not have a significant effect on the overall performance. The main bottleneck for scaling the system is the performance of the all-in-one VR. Packet lag has a great impact on the overall network delay, and makes system performance decline rapidly. Conditions such as network congestion and packet loss have a negative effect on the system performance and user experience. In an actual experiment, high network latency should always be avoided.

Discussion
Organizing large numbers of people over a network to conduct virtual experiments is a challenging task. The LAN environment is controlled and has low network delays, generally less than 20 ms. Our HDVGE prototype system supports the participation of 90 or more concurrent users in collaborative virtual spatiotemporal behavior experiments. The main limitation is the performance of the all-in-one VR. Under the conditions of 90 concurrent users and no packet lags, the PC client can maintain a rendering performance of approximately 300 FPS, while the all-in-one VR can only run at approximately 20 FPS, which barely meets the requirements for a real-time user experience and interactivity. With the continuous development of hardware, all-in-one VR will become more powerful and better meet the experimental requirements. At present, the heterogeneous distributed architecture is probably the most effective option for conducting virtual experiments with high numbers of concurrent users.
The Internet is a complicated network environment subject to high latency due to the large number of users distributed around the world. Network delay affects real-time performance and interactivity more than the number of concurrent users does. It should be noted that the relationship between overall network delay and packet lag is not simply linear. This suggests that an Internet experimental environment may introduce more complex factors, which would make it more difficult to meet the real-time and interactivity requirements of HDVGE. When conducting virtual experiments on an HDVGE, the network environment should be selected according to the actual experimental needs.
When conducting virtual spatiotemporal crowd behavior experiments in emergency scenarios, a very important question to consider is: how can tension be created for the participants? As mentioned in the literature [29], there are several main ways. First, create more realistic emergency elements, such as dim lights and thick smoke; second, set a time limit using a countdown to urge participants to escape; third, develop experimental policies, such as the shorter the time needed for a successful evacuation, the better the payoff. In practice, these methods all play a role in the experiment, but the immersion and presence brought by the HMD VR device can provide a better user experience. For non-immersive devices, the 3D virtual environment on the computer screen is independent of the participant's cognitive space. In a helmet-based VGE, by contrast, the virtual environment space and cognitive space are closely coupled, so that the user's cognition of the virtual world is consistent with that of the real world.
Rendering quality is also an important factor to consider in system performance balancing. At present, a PC client normally has several times more graphics processing power than a VR client. To ensure that the prototype system runs smoothly on the all-in-one VR, we applied a variety of rendering optimization methods, including scene model simplification, character model simplification, and baked lighting. That means that visual quality was sacrificed in exchange for a stable frame rate on the VR system. If the quality of rendering is too high, the usability of the system will be reduced. The current heterogeneous distributed virtual environment is a trade-off between high fidelity and availability.
Technically speaking, there are many alternative VR devices that can be used as heterogeneous clients in the proposed framework. For example, cardboard VR headsets have gained much popularity due to their good immersion experience and low cost. However, their limitations are also obvious. On the one hand, cardboard VR headsets depend heavily on the mobile phone's performance for both computing and rendering. At present, the performance of mobile phones varies greatly across brands. It is therefore a great challenge for users to experiment with cardboard VR headsets. On the other hand, cardboard VR headsets provide only limited interactivity. The only type of interaction that they support is reacting to the user's head movements detected by the sensors of the phone. For more complex interactions, such as scene roaming and interactions between users, they need to work in collaboration with other devices. These challenges need to be addressed in order to effectively incorporate cardboard VR headsets into HDVGE.
Additionally, the validity of the virtual behavior experiments depends to some extent on the sense of presence produced by the immersive VR environment. The sense of presence relies not only on visual and auditory stimuli, but also on tactile, olfactory, and haptic stimuli. Therefore, in order to create a stronger sense of presence, or even a sense of full immersion, the VR system should ideally be able to synchronously produce multi-channel perceptions with regard to the five senses. A stronger sense of presence could potentially lead to a higher similarity in user behavior between the virtual and real world, and therefore enhanced experimental validity.
Human-computer interaction can be achieved through different user interfaces, such as mouse, keyboard, gamepad, and headset interfaces. Existing research [46] has shown that different ways of interaction could lead to different human behavior patterns in virtual environments. The same issue could potentially arise in HDVGE with the use of heterogeneous devices, particularly when an HMD is used. A major concern with an HMD is that the user cannot see the mouse and keyboard when wearing the headset. This may constitute an important source of error in HDVGE-based experiments. Future work is needed to quantify the impact of interaction mode on human behavior, with an emphasis on HMDs.

Conclusions and Future Works
In recent years, low-cost HMD VR devices have become increasingly popular, but they are not yet widely available. Hence, they will coexist with PC-based systems for a long time. The HDVGE represents a feasible solution to the heterogeneity problems caused by various types of clients, and supports the implementation of virtual spatiotemporal crowd behavior experiments with large numbers of concurrent participants. In this paper, we presented HDVGE as a practical solution and demonstrated its technical feasibility.
First, we proposed an HDVGE framework for spatiotemporal crowd behavior experiments, and analyzed the design principles of the HDVGE platform based on this framework. We then designed the architecture and the key technologies of the experiment platform. Finally, using a subway fire as an example, we implemented an HDVGE prototype system for crowd evacuation. By testing and analyzing the key algorithms and the overall performance, we demonstrated the effectiveness of the proposed system.
The results show that in a low-latency LAN environment, the system could support 90 concurrent users, connected as heterogeneous distributed clients, in collaborative virtual experiments. The system's performance bottleneck was determined by the all-in-one VR client. Packet lag had a great impact on the overall network delay and caused a rapid decline in system performance, leading to further issues such as network congestion and packet loss. High-latency network environments should therefore be avoided in actual experiments.
We have shown that the HDVGE platform can effectively support heterogeneous clients and multi-user collaboration. We expect to see applications not only in large-scale spatiotemporal behavior research under normal conditions, but also in evacuation drills under emergency conditions such as fires or earthquakes. These types of experiments cannot normally be conducted in VR environments without the support of multi-user collaboration. The HDVGE could also serve as a new means of obtaining observational data on individual and group behaviors. Future work will include the following considerations.
(1) Crowd behavior varies greatly in different scenes; therefore, analyzing the behavioral data of individuals and groups in different scenarios may lead to different conclusions.
(2) We will compare the results with data from real-scene experiments and analyze the similarities in behavior between the virtual and real scenes.
The ongoing study of such additional factors will contribute to the advancement of HDVGEs in ways that more closely mirror RGEs.
ISPRS Int. J. Geo-Inf. 2018, 7, x FOR PEER REVIEW
Figure 3 shows a typical interaction process. The left and right sides of the figure represent the interaction process of the all-in-one VR and the PC client, respectively.

Figure 3. Flow chart of the abstract interaction layer for heterogeneous clients.

from client 1, the server calculates and processes the status information, and then forwards it to other clients in the current scene.
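The forwarding step described above can be illustrated with a minimal in-memory sketch. It is not the prototype's implementation: the names `StatusMessage`, `RelayServer`, `join`, and `handle` are hypothetical, network transport is replaced by plain callbacks, and the "processing" step is reduced to stamping a server-side sequence number, which is an assumption made only so that the example has something to compute.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class StatusMessage:
    """Hypothetical status update sent by one client."""
    client_id: int
    scene_id: str
    position: Tuple[float, float, float]
    action: str
    server_seq: int = 0  # filled in by the server (assumed processing step)


Deliver = Callable[[StatusMessage], None]


class RelayServer:
    """Forwards each processed status message to the other clients in the same scene."""

    def __init__(self) -> None:
        # scene_id -> {client_id -> delivery callback}
        self._scenes: Dict[str, Dict[int, Deliver]] = {}
        self._seq = 0

    def join(self, scene_id: str, client_id: int, deliver: Deliver) -> None:
        """Register a client (PC or all-in-one VR) in a scene."""
        self._scenes.setdefault(scene_id, {})[client_id] = deliver

    def handle(self, msg: StatusMessage) -> List[int]:
        """Process a message and forward it; returns the recipient client IDs."""
        self._seq += 1
        processed = replace(msg, server_seq=self._seq)
        recipients: List[int] = []
        for client_id, deliver in self._scenes.get(msg.scene_id, {}).items():
            if client_id != msg.client_id:  # do not echo back to the sender
                deliver(processed)
                recipients.append(client_id)
        return recipients
```

Because the server only sees messages and callbacks, the same code path serves both client types; any device-specific handling would sit in the client-side interaction layer rather than in the relay.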

Figure 4. Flow chart of protocol-based interactions for heterogeneous clients.


Figure 5. State synchronization process between clients.

Figure 6 is a prototype system diagram.

Figure 6. Prototype system overview: (a) subway scene with heavy smoke and dim lighting; (b) first-person view with GUI; (c) three skeletal animations of an avatar; (d) all-in-one virtual reality (VR) client; (e) personal computer (PC) client.

exit, and would need to climb up staircases and pass through gates in between. The trajectories recorded by the prototype system are time series data, with each record containing the following attributes: UserID, Timestamp, Position X, Position Y, Position Z, and Action. The time interval between samples is 0.3 s.
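As an illustration of how such trajectory records might be represented and summarized, the following sketch defines a record type with the six attributes listed above. The class and function names, the units (seconds and scene units), and the example action labels are assumptions for illustration only; the prototype's actual storage format is not specified here.

```python
import math
from dataclasses import dataclass
from typing import List

SAMPLE_INTERVAL_S = 0.3  # sampling interval stated in the text


@dataclass(frozen=True)
class TrajectoryRecord:
    """One sample of a participant's trajectory (field names are illustrative)."""
    user_id: str
    timestamp: float  # assumed to be seconds since the start of the experiment
    x: float
    y: float
    z: float
    action: str       # hypothetical label, e.g. "walk" or "run"


def path_length(records: List[TrajectoryRecord]) -> float:
    """Sum of straight-line distances between consecutive samples, ordered by time."""
    ordered = sorted(records, key=lambda r: r.timestamp)
    return sum(
        math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
        for a, b in zip(ordered, ordered[1:])
    )


def mean_speed(records: List[TrajectoryRecord]) -> float:
    """Path length divided by elapsed time; 0.0 if fewer than two distinct timestamps."""
    ordered = sorted(records, key=lambda r: r.timestamp)
    if len(ordered) < 2:
        return 0.0
    duration = ordered[-1].timestamp - ordered[0].timestamp
    return path_length(ordered) / duration if duration > 0 else 0.0
```

Summaries such as path length and mean speed per user are one simple way to compare evacuation behavior across participants; the choice of metric here is only an example.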

Figure 7. Comparison between adjusted dead reckoning (ADR) and the Kalman filter (KF): (a) accuracy of the two algorithms and (b) real prediction results.

Resource consumption is measured by CPU usage, memory usage, and network throughput. The indicator of rendering pressure is the client's frame rate in frames per second (FPS). Network latency is measured by the overall latency recorded by the client. The hardware configurations used in the test are as follows. (1) PC configuration: Intel(R) Core(TM) i5 750, NVIDIA GeForce GTX 650, and 8 GB RAM. (2) The all-in-one VR is described in Section 3.1. (3) Server configuration: Intel(R) Core(TM) i7 6700, NVIDIA GeForce GTX 1060, and 8 GB RAM.

Figure 8. Overview of the performance testing process.


Figure 9. Server resource consumption with different numbers of concurrent users and packet lags: (a) CPU percentage; (b) memory; (c) network (PL stands for packet lag. S refers to sent traffic, while R refers to received traffic. Abbreviations in other figures have the same meanings).


Figure 10. PC resource consumption for different numbers of concurrent users and packet lags: (a) CPU percentage; (b) memory; (c) network; (d) frames per second (FPS); (e) network delay.


Figure 11. All-in-one VR resource consumption for different numbers of concurrent users and packet lags: (a) CPU percentage; (b) memory; (c) network; (d) frames per second (FPS); (e) network delay.
