A Virtual Reality Soldier Simulator with Body Area Networks for Team Training

Soldier-based simulators have been attracting increased attention recently, with the aim of making complex military tactics more effective, such that soldiers are able to respond rapidly and logically to battlespace situations and the commander’s decisions in the battlefield. Moreover, body area networks (BANs) can be applied to collect the training data in order to provide greater access to soldiers’ physical actions or postures as they occur in real routine training. Therefore, due to the limited physical space of training facilities, an efficient soldier-based training strategy is proposed that integrates a virtual reality (VR) simulation system with a BAN, which can capture body movements such as walking, running, shooting, and crouching in a virtual environment. The performance evaluation shows that the proposed VR simulation system is able to provide complete and substantial information throughout the training process, including detection, estimation, and monitoring capabilities.


Introduction
In recent years, since virtual reality (VR) training simulators allow soldiers to be trained with no risk of exposure to real situations, the development of a cost-effective virtual training environment is critical for training infantry squads [1][2][3]. This is because if soldiers do not develop and sustain tactical proficiency, they will not be able to react in a quickly evolving battlefield. To create a VR military simulator that integrates immersion, interaction, and imagination, the issue of how to use VR factors (e.g., VR engine, software and database, input/output devices, and users and tasks) is an important one. The VR simulator is able to integrate the terrain of any real place into the training model, and the virtual simulation trains soldiers to engage targets while working as a team [4,5].
With a head-mounted display (HMD), the key feature used in VR technology, soldiers are immersed in a complex task environment that cannot be replicated in any training areas. Visual telescopes are positioned in front of the eyes, and the movement of the head is tracked by micro electro mechanical system (MEMS) inertial sensors. Gearing up with HMDs and an omnidirectional treadmill (ODT), soldiers can perform locomotive motions without risk of injury [6][7][8]. Note that the above systems do not offer any posture or gesture interactions between the soldiers and the virtual environment.
To address this problem, the authors of [9][10][11][12][13][14][15][16][17] proposed an action recognition method based on multiple cameras. To tackle the occlusion problem, the authors mostly used Microsoft Kinect to capture color and depth data from the human body, which requires multiple devices arranged in a specific pattern for the recognizing different human actions. Therefore, the above methods are not feasible in a real-time virtual environment for military training.
To overcome the limitations of the real-time training system, the dismounted soldier training system (DSTS) was developed [18][19][20][21]. In the DSTS, soldiers wear special training suits and HMDs, and stand on a rubber pad to create a virtual environment. However, the inertial sensors of training suits can only recognize simple human actions (e.g., standing and crouching). When soldiers would like to walk or run in the virtual environment, they have to control small joysticks and function keys on simulated rifles, which are not immersive for locomotion. Thus, in a training situation, soldiers will not be able to react quickly when suddenly encountering enemy fire. As a result, though the DSTS provides a multi-soldier training environment, it has weaknesses with respect to human action recognition. Therefore, in order to develop a training simulator that is capable of accurately capturing soldiers' postures and movements and is less expensive and more effective for team training, body area networks (BANs) represent an appropriate bridge for connecting physical and virtual environments [22][23][24][25][26], which contain inertial sensors like accelerometers, e-compasses and gyroscopes to capture human movement.
Xsens Co. have been developing inertial BANs to track human movement for several years. Although the inertial BAN, called the MVN system, is able to capture human body motions wirelessly without using optical cameras, the 60 Hz system update rate is too slow to record fast-moving human motion. The latest Xsens MVN suit solved this problem, reaching a system update rate of 240 Hz. However, the MVN suit is relatively expensive in consumer markets. In addition, it is difficult to support more than four soldiers in a room with the MVN system, since the wireless sensor nodes may have unstable connections and high latency. On the other hand, since the Xsens MVN system is not an open-source system, the crucial algorithm of the inertial sensors does not adaptively allow the acquisition by and integration with the proposed real-time program. Moreover, the MVN system is only a motion-capture BAN and is not a fully functional training simulator with inertial BANs [26].
For the above-mentioned reasons, this paper proposes a multi-player training simulator that immerses infantry soldiers in the battlefield. The paper presents a technological solution for training infantry soldiers. Fully immersive simulation training provides an effective way of imitating real situations that are generally dangerous, avoiding putting trainees at risk. To achieve this, the paper proposes an immersive system that can immediately identify human actions. Training effectiveness in this simulator is highly remarkable in several respects:

1.
Cost-Effective Design: We designed and implemented our own sensor nodes, such that the fundamental operations of the inertial sensors could be adaptively adjusted for acquisition and integration. Therefore, it has a competitive advantage in terms of system cost.

2.
System Capacity: Based on the proposed simulator, a six-man squad is able to conduct military exercises that are very similar to real missions. Soldiers hold mission rehearsals in the virtual environment such that leaders can conduct tactical operations, communicate with their team members, and coordinate with the chain of command.

3.
Error Analysis: This work provides an analysis of the quaternion error and further explores the sensing measurement errors. Based on the quaternion-driven rotation, the measurement relation with the quaternion error between the earth frame and the sensor frame can be fully described.

4.
System Delay Time: The update rate of inertial sensors is about 160 Hz (i.e., the refresh time of inertial sensors has a time delay of about 6 ms). The simulator is capable of training six men at a 60 Hz system update rate (i.e., the refresh time of the entire system needs about 16 ms), which is acceptable for human awareness with delay (≤40 ms).

5.
System Feedback: Instructors can provide feedback after mission rehearsals using the visual after action review (AAR) function in the simulator, which provides different views of portions of the action and a complete digital playback of the scenario, allowing a squad to review details of the action. Furthermore, instructors can analyze data from digital records and make improvements to overcome the shortcomings in the action ( Figure 1). Accordingly, in the immersive virtual environment, soldiers and leaders improve themselves with respect to situational awareness, decision-making, and coordination skills.

System Description
The proposed system, which is deployed in five-square-meter area (Figure 2), is a robust training solution that can support up to six soldiers. The virtual reality environment is implemented in C++ and DirectX 11. In addition to the virtual system, each individual soldier stands in the center of an

System Description
The proposed system, which is deployed in five-square-meter area (Figure 2), is a robust training solution that can support up to six soldiers. The virtual reality environment is implemented in C++ and DirectX 11. In addition to the virtual system, each individual soldier stands in the center of an

System Description
The proposed system, which is deployed in five-square-meter area (Figure 2), is a robust training solution that can support up to six soldiers. The virtual reality environment is implemented in C++ and DirectX 11. In addition to the virtual system, each individual soldier stands in the center of an ODT. The ODTs are customized hardware designed for the soldier training simulator, which is equipped with a high-performance workstation that generates the visual graphics for the HMD and provides voice communication functions based on VoIP technology. The dimensions of the ODT are 47" x47 "x78", 187 lbs. The computing workstation on the ODT is equipped with a 3.1 GHz Intel Core i7 processor, 8 GB RAM, and an NVidia GTX 980M graphics card. The HMD is an Oculus Rift DK2, which has two displays with 960 × 1080 resolution per eye. The VoIP software is implemented based on Asterisk open-source PBX. Soldiers are not only outfitted with HMDs, but also equipped with multiple inertial sensors, which consist of wearable sensor nodes ( Figure 3) deployed over the full body ( Figure 4). As depicted in Figure 3, a sensor node consists of an ARM-based microcontroller and nine axial inertial measurement units (IMUs), which are equipped with ARM Cotex-M4 microprocessors and ST-MEMS chips. Note that these tiny wireless sensor nodes can work as a group to form a wireless BAN, which uses an ultra-low power radio frequency band of 2.4G. Soldiers are not only outfitted with HMDs, but also equipped with multiple inertial sensors, which consist of wearable sensor nodes ( Figure 3) deployed over the full body ( Figure 4). As depicted in Figure 3, a sensor node consists of an ARM-based microcontroller and nine axial inertial measurement units (IMUs), which are equipped with ARM Cotex-M4 microprocessors and ST-MEMS chips. Note that these tiny wireless sensor nodes can work as a group to form a wireless BAN, which uses an ultra-low power radio frequency band of 2.4G.

Sensor Modeling
This subsection the sensors are modeled in order to formulate the orientation estimation problem. The measurement models for the gyroscope, accelerometer, and magnetometer are briefly discussed in the following subsections [27].  ODT. The ODTs are customized hardware designed for the soldier training simulator, which is equipped with a high-performance workstation that generates the visual graphics for the HMD and provides voice communication functions based on VoIP technology. The dimensions of the ODT are 47" x47 "x78", 187 lbs. The computing workstation on the ODT is equipped with a 3.1 GHz Intel Core i7 processor, 8 GB RAM, and an NVidia GTX 980M graphics card. The HMD is an Oculus Rift DK2, which has two displays with 960 x 1080 resolution per eye. The VoIP software is implemented based on Asterisk open-source PBX. Soldiers are not only outfitted with HMDs, but also equipped with multiple inertial sensors, which consist of wearable sensor nodes ( Figure 3) deployed over the full body ( Figure 4). As depicted in Figure 3, a sensor node consists of an ARM-based microcontroller and nine axial inertial measurement units (IMUs), which are equipped with ARM Cotex-M4 microprocessors and ST-MEMS chips. Note that these tiny wireless sensor nodes can work as a group to form a wireless BAN, which uses an ultra-low power radio frequency band of 2.4G.

Sensor Modeling
This subsection the sensors are modeled in order to formulate the orientation estimation problem. The measurement models for the gyroscope, accelerometer, and magnetometer are briefly discussed in the following subsections [27].

Sensor Modeling
This subsection the sensors are modeled in order to formulate the orientation estimation problem. The measurement models for the gyroscope, accelerometer, and magnetometer are briefly discussed in the following subsections [27].
(1) Gyroscope: The measured gyroscope signal s ω m t can be represented in the sensor frame s at time t using where s ω t is the true angular velocity, b ω, t is the bias caused by low-frequency offset fluctuations, and the measurement error e ω, t is assumed to be zero-mean white noise. As shown in Figure 5, the raw data of the measured gyroscope signals in the x, y and z directions includes true angular velocity, bias and error. (2) Accelerometer: Similar to the gyroscope signal, the measured accelerometer signal s a m t can be represented using s a m t ≈ s a b,t + b a, t + e a, t where s a b, t is the linear acceleration of the body after gravity compensation, b a, t is the bias caused by low-frequency offset fluctuations, and the measurement error e a, t is assumed to be zero-mean white noise. As shown in Figure 6, the raw data of the measured accelerometer signals in the x, y and z directions includes true linear acceleration, offset and error. (3) Magnetometer: For the magnetometer, the measured signal s m m t is often modeled as the sum of the earth's magnetic field s m t , magnetic offset b m, t , and noise e m, t , which yields As shown in Figure 7, the raw data of the measured magnetometer signals in the x, y and z directions includes the earth's magnetic field, offset and error.

Sensor Calibration
Since the accuracy of the orientation estimate depends heavily on the measurements, the null point, and the scale factor of each axis of the sensor, a calibration procedure should be performed before each practical use. To this end, the sensor calibration is described as follows:

•
Step 1: Given a fixed gesture, we measure the sensing data (i.e., the raw data) and calculate the measurement offsets.

•
Step 2: Remove the offset and normalize the modified raw data to the maximum resolution of the sensor's analog-to-digital converter. In this work, the calibrated results (CR) of the sensors are described by Equations (4)-(6). Table 1 shows the means of the calibrated results in the x, y and z directions of sensor nodes, which include the accelerometer, the magnetometer, and the gyroscope. Note that the inertial sensor signals (e.g., s a m t , s m m t , and s ω m t ) are measured without movement, and that a small bias or offset (e.g., b a, t or b m, t or b ω, t ) in the measured signals can therefore be observed. In general, the additive offset that needs to be corrected for before calculating the estimated orientation is very small. For instance, the additive offset would be positive or negative in the signal output (e.g., the average b ω, t = −0.08 degree/sec and s ω m t = 0 degree/sec) without movement. After bias error compensation for each sensor, the outputs of the sensors represented in (1)-(3) can be rewritten as s ω m t = s ω t + e ω, t , s a m t ≈ s a b,t + e a, t , and s m m t ≈ s m t + e m, t , and they are then applied as inputs for the simulation system. In particular, the proposed algorithm can be good at handling zero-mean white noise. After the completion of modeling and calibration of the sensor measurements from the newly developed MARG, all measurements are then represented in the form of quaternions as inputs of the proposed algorithm.
As shown in Figure 8, a typical run of the attitude angles of a sensor node placed on the upper arm when standing in a T-pose is recorded. In the first step, we stand with the arms naturally at the sides. Then, we stretch the arms horizontally with thumbs forward, and then we move back to the first step. This experiment shows that the sensor node is capable of tracking the motion curve directly.

Information Processing
As mentioned above, human action recognition is crucial to developing an immersive training simulator. The wireless BAN is able to instantaneously track the action of the skeleton of a soldier. The microcontroller of the sensor node acquires the raw data from the accelerometer, gyroscope and magnetometer through SPI and I2C interfaces. The measurements of the accelerometer, gyroscope and magnetometer are contaminated by errors of scale factor and bias, resulting in a large estimation error during sensor fusion. Thus, a procedure for normalization and error correction is needed for the sensor nodes to ensure the consistency of the nine-axis data ( Figure 9). After that, the calibrated

Information Processing
As mentioned above, human action recognition is crucial to developing an immersive training simulator. The wireless BAN is able to instantaneously track the action of the skeleton of a soldier. The microcontroller of the sensor node acquires the raw data from the accelerometer, gyroscope and magnetometer through SPI and I2C interfaces. The measurements of the accelerometer, gyroscope and magnetometer are contaminated by errors of scale factor and bias, resulting in a large estimation error during sensor fusion. Thus, a procedure for normalization and error correction is needed for the sensor nodes to ensure the consistency of the nine-axis data ( Figure 9). After that, the calibrated accelerometer and magnetometer data can be used to calculate the orientation and rotation. Estimation based on the accelerometer does not include the heading angle, which is perpendicular to the direction of gravity. Thus, the magnetometer is used to measure the heading angle. the Euler angles are defined as follows: Roll − Φ : rotation about the X-axis Pitch − θ : rotation about the Y-axis Heading − ψ : rotation about the Z-axis.
Let X acc , Y acc , Z acc be the accelerometer components. We have: Let X mag , Y mag , Z mag be the magnetometer components. We have: Heading Accordingly, the complementary filter output (using gyroscope) is Let Xmag, Ymag, Zmag be the magnetometer components. We have: Heading ψ = tan (11) Accordingly, the complementary filter output (using gyroscope) is To mitigate the accumulated drift from directly integrating the linear angular velocity of the gyroscope, the accelerometer and magnetometer are used as aiding sensors to provide the vertical and horizontal references for the Earth. Moreover, a complementary filter [28,29] is able to combine the measurement information of an accelerometer, a gyroscope and a magnetometer, offering big advantages in terms of both long-term and short-term observation. For instance, an accelerometer does not drift over the course of a long-term observation, and a gyroscope is not susceptible to small forces during a short-term observation. The W value of the complementary filter in Equation (12) is the ratio in which inertial sensor data is fused, which can then be used to compensate for gyroscope drift by using the accelerometer and magnetometer measurements. We always set the W value to 0.9, or probably even higher than 0.9. Therefore, the complementary filter provides an accurate orientation estimation.

Communication and Node Authentication Procedures
To tackle the problem of interference and to reduce the bit error rate (BER) of the wireless data transmission between the sensor nodes and the sink node, communication mechanisms can be applied to build a robust wireless communication system [30,31]. The communication technique employed for communication between the sensor nodes and the sink node is frequency hopping spread spectrum (FHSS). The standard FHSS modulation technique is able to avoid interference between different sensor nodes in the world-wide ISM frequency band, because there are six BANs in total, all of which are performing orientation updates in a small room. Moreover, the sensor nodes' transmission of data to the sink node is based on packet communication.
In the beginning, sensor nodes send packets, which include orientation and rotation data. When the sink node receives a packet from a sensor node, an acknowledgement packet will be transmitted To mitigate the accumulated drift from directly integrating the linear angular velocity of the gyroscope, the accelerometer and magnetometer are used as aiding sensors to provide the vertical and horizontal references for the Earth. Moreover, a complementary filter [28,29] is able to combine the measurement information of an accelerometer, a gyroscope and a magnetometer, offering big advantages in terms of both long-term and short-term observation. For instance, an accelerometer does not drift over the course of a long-term observation, and a gyroscope is not susceptible to small forces during a short-term observation. The W value of the complementary filter in Equation (12) is the ratio in which inertial sensor data is fused, which can then be used to compensate for gyroscope drift by using the accelerometer and magnetometer measurements. We always set the W value to 0.9, or probably even higher than 0.9. Therefore, the complementary filter provides an accurate orientation estimation.

Communication and Node Authentication Procedures
To tackle the problem of interference and to reduce the bit error rate (BER) of the wireless data transmission between the sensor nodes and the sink node, communication mechanisms can be applied to build a robust wireless communication system [30,31]. The communication technique employed for communication between the sensor nodes and the sink node is frequency hopping spread spectrum (FHSS). The standard FHSS modulation technique is able to avoid interference between different sensor nodes in the world-wide ISM frequency band, because there are six BANs in total, all of which are performing orientation updates in a small room. Moreover, the sensor nodes' transmission of data to the sink node is based on packet communication.
In the beginning, sensor nodes send packets, which include orientation and rotation data. When the sink node receives a packet from a sensor node, an acknowledgement packet will be transmitted immediately ( Figure 10). If the acknowledgement fails to arrive, the sensor node will retransmit the packet, unless the number of retries exceeds the retransmission limit. When the sink node receives packets in operation mode, it will check the payload of the packets. As shown in Figure 11, if the payload is equal to the sensor node's unique identification information, the sink node will accept the packet and scan the next sensor node's channel during the next time slot. The ten wireless sensor node channels operate at a frequency range of 2400-2525 GHz. immediately ( Figure 10). If the acknowledgement fails to arrive, the sensor node will retransmit the packet, unless the number of retries exceeds the retransmission limit. When the sink node receives packets in operation mode, it will check the payload of the packets. As shown in Figure 11, if the payload is equal to the sensor node's unique identification information, the sink node will accept the packet and scan the next sensor node's channel during the next time slot. The ten wireless sensor node channels operate at a frequency range of 2,400-2,525 GHz.    . The diagram shows that sensor nodes communicate with sink node in two domains.

System Intialization
The sink node is deployed on the back of the body and collects the streaming data from multiple wireless sensor nodes. Afterwards, the streaming data from the sensor nodes is transmitted to the workstation via Ethernet cables, modeling the skeleton of the human body along with the structure of the firearm (Figure 12). Table 2 describes the data structure of a packet, which is sent by a sensor node to the sink node in every ODT. Table 3 shows the integrated data that describes the unique Figure 11. The diagram shows that sensor nodes communicate with sink node in two domains.

System Intialization
The sink node is deployed on the back of the body and collects the streaming data from multiple wireless sensor nodes. Afterwards, the streaming data from the sensor nodes is transmitted to the workstation via Ethernet cables, modeling the skeleton of the human body along with the structure of the firearm (Figure 12). Table 2 describes the data structure of a packet, which is sent by a sensor node to the sink node in every ODT. Table 3 shows the integrated data that describes the unique features of each soldier in the virtual environment. The description provides the appearance characteristics, locomotion, and behaviors of each soldier [32,33].      For sensor nodes, the data stream reported to the sink node is interpreted as the movement-based data structures of a skeleton. Each wearable sensor node of the skeleton has a unique identification information during the initialization phase. As a result, the sink node can distinguish which sensor nodes are online. Accordingly, when turning the power on, the HMDs on the soldiers are automatically connected to the workstations, and the voice communication group setting is immediately ready. Please note that the T-pose calibration is performed in sensor node n of a skeleton for initializing the root frame, which is given by where q T,n is the reading of sensor node n in the modified T-pose and q 1,n is the new body frame from the initial root frame q 0,n . Although sensor nodes are always worn on certain positions of a human body, the positions of sensor nodes may drift due to the movements occurring during the training process. Hence, the T-pose calibration procedure can be applied to estimate the orientation of the sensors. After that, the system is prepared to log in for simulation tasks. The system initialization flow diagram is shown in Figure 13.

Quaternion Representation
This section outlines a quaternion representation of the orientation of the sensor arrays. Quaternions provide a convenient mathematical notation for representing the orientation and rotation of 3D objects because quaternion representation is more numerically stable and efficient compared to rotation matrix and Euler angel representation. According to [34][35][36], a quaternion can be thought of as a vector with four components, as a composite of a scalar and ordinary vector. The quaternion units , , are called the vector part of the quaternion, while is the scalar part. The quaternion can frequently be written as an ordered set of four real quantities, = , , , .
Denote as the orientation of the earth frame ue with respect to the sensor frame us. The conjugate of the quaternion can be used to represent an orientation by swapping the relative frame, and the sign * denotes the conjugate. Therefore, the conjugate of can be denoted as * = = , − , − , − .
Moreover, the quaternion product ⨂ can be used to describe compounded orientations, and their definition is based on the Hamilton rule in [37]. For example, the compounded orientation can be defined by

Quaternion Representation
This section outlines a quaternion representation of the orientation of the sensor arrays. Quaternions provide a convenient mathematical notation for representing the orientation and rotation of 3D objects because quaternion representation is more numerically stable and efficient compared to rotation matrix and Euler angel representation. According to [34][35][36], a quaternion can be thought of as a vector with four components, q = q 0 + q x i + q y j + q z k as a composite of a scalar and ordinary vector. The quaternion units q x , q y , q z are called the vector part of the quaternion, while q 0 is the scalar part. The quaternion can frequently be written as an ordered set of four real quantities, q = q 0 , q x , q y , q z .
Denote e s q as the orientation of the earth frame u e with respect to the sensor frame u s . The conjugate of the quaternion can be used to represent an orientation by swapping the relative frame, and the sign * denotes the conjugate. Therefore, the conjugate of e s q can be denoted as e s q * = s e q = q 0 , −q x , −q y , −q z .
Moreover, the quaternion product ⊗ can be used to describe compounded orientations, and their definition is based on the Hamilton rule in [37]. For example, the compounded orientation s h q can be defined by where e h q denotes the orientation of the earth frame u e with respect to the frame u h . A human body model consists of a set of body segments connected by joints. For upper limbs and lower limbs, kinematic chains are modeled that branch out around the torso. The kinematic chain describes the relationship between rigid body movements and the motions of joints. A forward kinematics technique, which was introduced for the purposes of robotic control, is used to configure each pair of adjacent segments. In the system, the aim of building human kinematic chains is to determine the transformation matrix of a human body from the first to the last segments and to find the independent configuration for each joint and the relationship with the root frame.
Thus, the rotation matrix q n n−1 used for orientation from sensor node n-1 to sensor node n is given by q n n−1 = q n−1 ⊗ q n . Figure 14 shows the simplified segment biomechanical model of the human body. The kinematics of segments on which no inertial sensors are attached (e.g., hand, feet, toes) are considered to be rigid connections between neighboring segments. The transformation matrix is defined as where T n n−1 is the translation matrix from sensor frame to body frame. According to [37,38], therefore, the transformation matrix of a human body from the first segment to the n-th segment is A human body model consists of a set of body segments connected by joints. For upper limbs and lower limbs, kinematic chains are modeled that branch out around the torso. The kinematic chain describes the relationship between rigid body movements and the motions of joints. A forward kinematics technique, which was introduced for the purposes of robotic control, is used to configure each pair of adjacent segments. In the system, the aim of building human kinematic chains is to determine the transformation matrix of a human body from the first to the last segments and to find the independent configuration for each joint and the relationship with the root frame.
Thus, the rotation matrix used for orientation from sensor node n-1 to sensor node n is given by Figure 14 shows the simplified segment biomechanical model of the human body. The kinematics of segments on which no inertial sensors are attached (e.g., hand, feet, toes) are considered to be rigid connections between neighboring segments. The transformation matrix is defined as where is the translation matrix from sensor frame to body frame. According to [37,38], therefore, the transformation matrix of a human body from the first segment to the n-th segment is

Performance Analysis
The analysis focuses on the quaternion error and further explores the sensing measurement errors. Based on the quaternion-driven rotation, the measurement relation with the quaternion error between the earth frame and sensor frame can be further described.

Performance Analysis
The analysis focuses on the quaternion error and further explores the sensing measurement errors. Based on the quaternion-driven rotation, the measurement relation with the quaternion error between the earth frame and sensor frame can be further described.

Rotation Matrix
According to [36], given a unit quaternion q = q r + q x i + q y j + q z k, the quaternion-driven rotation can be further described by the rotation matrix R, which yields R =    1 − 2 q y 2 + q z 2 2q x q y − 2q r q z 2q x q z + 2q r q y 2q x q y + 2q r q z 1 − 2 q x 2 + q z 2 2q y q z − 2q r q x 2q x q z − 2q r q y 2q y q z + 2q r q x 1 − 2 q x 2 + q y 2   . Letq be an estimate of the true attitude quaternion q. The small rotation from the estimated attitude,q, to the true attitude is defined as q err . The error quaternion is small but non-zero, due to errors in the various sensors. The relationship is expressed in terms of quaternion multiplication as follows: Assuming that the error quaternion, q err , is to represent a small rotation, it can be approximated as follows: Noting that the error quaternion q err is a perturbation of the rotation matrix, and the vector components of → q err are small, the perturbation of the rotation matrix R in Equation (21) can be written as: Equation (22) relatingq and q can be written as R(q) is the estimate of the rotation matrix or the equivalent ofq. Now, considering the sensor frame u s and the earth frame u e , we have Thus, the measurement relation for the quaternion error is obtained: Accordingly, given the error quaternion q err and the sensor frame u s , the perturbation of the earth frame u e can be described. The quantitative analysis of the error quaternion is detailed in Section 6.1.

Error Analysis
The analysis in Section 4.1 focuses on the quaternion error. Here we further explore the sensing measurement errors, which consist of the elements of the error quaternion.
Roll − Φ : rotation about the X-axis Pitch − θ : rotation about the Y-axis Heading − ψ : rotation about the Z-axis Now we associate a quaternion with Euler angles, which yieldŝ Denote the pitch angle measurement as θ + ∆θ, where θ is the true pitch angle information and ∆θ is the measurement error. To simplify the error analysis, assume the rotation errors are neglected in roll angle and heading angle measurements. Let sin(Φ/2) = A and sin(ψ/2) = B. Accordingly, considering the measurement error in the pitch angle, the quaternion can be rewritten as Assuming that the measurement error in the pitch angle is small, we obtain sin θ+∆θ 2 = sin θ 2 cos ∆θ 2 + cos θ 2 sin ∆θ According to Equation (29), the quaternion with measurement error in the pitch angle can be further approximated by Note that, given a measurement error in the pitch angle, and the roll, pitch, and heading angle measurements, the error quaternion q err can be approximately derived by Equation (32). Therefore, the measurement relation with the quaternion error between the earth frame and the sensor frame can be further described using Equation (27).

System Operating Procedures
To evaluate the effectiveness and capability of the virtual reality simulator for team training, we designed a between-subjects study. In the experiments, the impacts of three key factors on system performance are considered: training experience, the group size of the participants, and information exchange between the group members. The experimental results are detailed as follows.

Participants
The experiment involved 6 participants. No participants had ever played the system before. Half of the participants were volunteers who had done military training in live situations, while the other half had never done live training. The age of the participants ranged from 27 to 35.

Virtual Environment
The immersive environment was designed for a rescue mission in an enemy-held building ( Figure 15). In addition, three hostages were being guarded on the top floor of a three-storied building which was controlled by 15 enemies. To ensure that the virtual environment was consistent with an actual training facility, we simulated a real military training site, including the urban street, the enemy-held building, and so on. All enemies controlled automatically by the system were capable of shooting, evading attacks, and team striking. When participants were immersed in the virtual environment, they could interact with other participants not only through gesture tracking, but also through VoIP communication technology. ammo. Under these experimental conditions, the participants' death rate (hit rate) is recorded for data analysis by the experimenters. Moreover, if all participants are killed before they complete the mission, the rescue task is terminated, and the time will not be recorded for the experiment. One sensor node attached to the right thigh was not calibrated well, and a sensing error was derived in the pitch direction.

Experimental Results
In order to assess the system performance, four sets of experiments were performed to explore the impact of quaternion error and the training experience on mission execution and management.

Error Analysis
In the first set of simulations, we explored the characteristic of the error quaternion . With reference to the analysis in Section 4.2, the rotation errors are assumed to be negligible in the roll angle and heading angle measurements, and the measurement error in the pitch angle is considered to be Δθ. With angle information (e.g., the heading angle 60 , the roll angle 30 , the pitch angle 15 ) and Δθ, Figure 17 presents the behavior of the error quaternion when varying the measurement As mentioned above, sensing measurement errors greatly affect the sensor nodes, which are attached to a human body. The integration of the inertial sensors, including sensor signals and drift errors, is performed on the basis of the kinematics of the human body. Therefore, sensing errors will be accumulated in quaternion form. As shown in Figure 16, the sensing measurement errors are calibrated when the T-pose calibration is performed. In the first step, we normalize the accelerometer, gyroscope and magnetometer in all of the sensor nodes and compensate for bias errors. In the following step, a complementary filter is used to mitigate the accumulated errors based on the advantages of long-term observation and short-term observation respectively. In the final step, T-pose calibration is performed to align the orientation of the sensor nodes with respect to the body segments, after which the sensor node is able to capture body movement accurately in the virtual environment. ammo. Under these experimental conditions, the participants' death rate (hit rate) is recorded for data analysis by the experimenters. Moreover, if all participants are killed before they complete the mission, the rescue task is terminated, and the time will not be recorded for the experiment. One sensor node attached to the right thigh was not calibrated well, and a sensing error was derived in the pitch direction.

Experimental Results
In order to assess the system performance, four sets of experiments were performed to explore the impact of quaternion error and the training experience on mission execution and management.

Error Analysis
In the first set of simulations, we explored the characteristic of the error quaternion . With reference to the analysis in Section 4.2, the rotation errors are assumed to be negligible in the roll angle and heading angle measurements, and the measurement error in the pitch angle is considered

Procedure
The purpose of the experiment is to evaluate the system performance. Participants follow the same path through the virtual environment in the experiment. The time-trial mission starts when the participants begin from the storm gate. Moreover, the time taken by the participants to kill all of the enemies who guard the three hostages on the top floor will be recorded. All participants control simulated M4 rifles, and the enemies control virtual AK47 rifles. All of the weapons have unlimited ammo. Under these experimental conditions, the participants' death rate (hit rate) is recorded for data analysis by the experimenters. Moreover, if all participants are killed before they complete the mission, the rescue task is terminated, and the time will not be recorded for the experiment.

Experimental Results
In order to assess the system performance, four sets of experiments were performed to explore the impact of quaternion error and the training experience on mission execution and management.

Error Analysis
In the first set of simulations, we explored the characteristic of the error quaternion q err . With reference to the analysis in Section 4.2, the rotation errors are assumed to be negligible in the roll angle and heading angle measurements, and the measurement error in the pitch angle is considered to be ∆θ. With angle information (e.g., the heading angle 60 • , the roll angle 30 • , the pitch angle 15 • ) and ∆θ, Figure 17 presents the behavior of the error quaternion q err when varying the measurement error of the pitch angle. Note that, given ∆θ ranging from 0 • to 0.2 • , the vector parts of q err (i.e., ) are approximately linear with respect to the ∆θ, which can provide a sensible way of describing the error behaviors of rotation X, rotation Y, and rotation Z. According to Equation (27), given the quaternion error q err and the sensor frame u s , the perturbation of the earth frame u e can be described. As shown in Figure 17, when the measurement error in the pitch angle is small, the small vector components of q → err lead to a small perturbation of the earth frame u e . In contrast, as the measurement error in the pitch angle increases, the perturbation in the Y-axis increases, which results in a larger error component in the Y-axis (e.g., with ∆θ = 0.1, q   Figure 18 shows a snapshot of the proposed system. All three individual participants, who had done the same military training at the actual training site, successfully completed the rescue mission, Figure 17. The scalar part and the vector part of the quaternion error for a small rotation of measurement error of pitch angle. Figure 18 shows a snapshot of the proposed system. All three individual participants, who had done the same military training at the actual training site, successfully completed the rescue mission, with times of 22 16 , 25 40 , 28 51 , respectively (mean = 25 36 , standard deviation = 3 18 ). However, of the three participants who had never done the same training and started the mission individually, only one participant completed the rescue mission, with a time of 36 59 , and the other two participants were killed by enemies before completing the rescue task. The three two-man teams who had been trained in the live situation completed the rescue mission with times of 11 28 , 18 19 , 16 5 , respectively (mean = 15 17 , standard deviation = 3 30 ). However, the three two-man teams who had never done the same training also completed the rescue mission, with times of 25 19 , 28 33 , 26 12 , respectively (mean = 26 41 , standard deviation = 1 40 ). Finally, the three-man team that had live training experience completed rescue mission with a time of 7 49 . On the other hand, the three-man team that had never done the same training completed the rescue mission with a time of 13 47 . The results of experienced and unexperienced participants' mean times in the experiment are shown in Figure 19. We also evaluated another situation, in which two subjects in the three-man groups completed the mission without the VoIP communication function. The mean time in this experiment increased by 1 26 (Figure 20), which implies that communication and information processing can improve the performance for rescue missions.

Simulated Training Performance
Finally, we evaluated a six-man team of all participants in the rescue task, because standard deployment for a real live mission is a six-man entry team. The mission time decreased by 2 28 with respect to the three-man experiment with experienced participants. The experimental results for mean times with different numbers of participants are shown in Figure 21. In addition, death rate (hit rate) revealed another difference between single and multiple participants. From the results, the mean of death rate (hit rate) was 1.5 shots/per mission when a single participant interacted with the system. However, the mean of death rate decreased to 0.38 shots/per mission when multiple participants interacted with the system. the three-man team that had never done the same training completed the rescue mission with a time of 13'47". The results of experienced and unexperienced participants' mean times in the experiment are shown in Figure 19. We also evaluated another situation, in which two subjects in the three-man groups completed the mission without the VoIP communication function. The mean time in this experiment increased by 1'26" (Figure 20), which implies that communication and information processing can improve the performance for rescue missions. Finally, we evaluated a six-man team of all participants in the rescue task, because standard deployment for a real live mission is a six-man entry team. The mission time decreased by 2'28" with respect to the three-man experiment with experienced participants. The experimental results for mean times with different numbers of participants are shown in Figure 21. In addition, death rate (hit rate) revealed another difference between single and multiple participants. From the results, the mean of death rate (hit rate) was 1.5 shots/per mission when a single participant interacted with the system. However, the mean of death rate decreased to 0.38 shots/per mission when multiple participants interacted with the system.     one-man two-man three-man six-man     one-man two-man three-man six-man

Discussion
When the participants executed the rescue mission, the activities involved in the experiment included detecting enemies, engaging enemies, moving inside the building and rescuing the hostages. The results reveal significant differences in several respects, including experience, quantity, and communication, and show that compared with the inexperienced participants, all experienced participants who had done the same training in a live situation took less time to complete the rescue mission. The wireless BANs of the participants are able to work accurately in the virtual environment for experienced participants. Tactical skills (e.g., moving through built-up areas, reconnoiter area, reacting to contact, assaulting, and clearing a room) absolutely require team work, demanding that wireless BANs interact with each other perfectly in terms of connection and accuracy. Without proper BANs, participants may feel mismatched interaction with their virtual avatars, and may feel uncomfortable or sick in the virtual environment.
The experimental results show that a larger sized group of participants took less time to complete the rescue mission than a smaller sized group of participants. Moreover, a group of multiple participants had a lower death rate compared with that of a single participant. This is due to the fact that, as the group size of participants increases, team movement is more efficient and fire power is greater in the virtual environment, which is similar to a real world mission. Furthermore, when the VoIP communication function was disabled, whether participants were experienced or not, the rescue mission time in the experiment consequently increased. As we know, in the real rescue mission, team coordination is important in the battlefield. In the system, all participants are able to interact with each other through hand signal tracking and voice communication. As a result, multiple-user training may become a key feature of the system.

Conclusions
In this paper, we have addressed problems arising when building an infantry training simulator for multiple players. The immersive approach we proposed is an appropriate solution that is able to train soldiers in several VR scenarios. The proposed simulator is capable of training six men at a system update rate of 60 Hz (i.e., the refresh time of the entire system takes about 16 ms), which is acceptable for human awareness with delay (≤ 40 ms). Compared with the expensive Xsens MVN system, the proposed simulator has a competitive advantage in terms of system cost. For future work, we intend to develop improved algorithms to deal with accumulated sensing errors and environment noise on wireless BANs. Consequently, the system can develop finer gestures for military squad actions and enrich the scenario simulation for different usages in military training. The system is expected to be applied in different kind of fields and situations.