Applying Touchscreen Based Navigation Techniques to Mobile Virtual Reality with Open Clip-On Lenses

Recently, a new breed of mobile virtual reality (dubbed "EasyVR" in this work) has appeared, in which non-isolating magnifying lenses are conveniently clipped onto the smartphone while still offering a level of immersion reasonably comparable to that of an isolated headset. Furthermore, such a form factor allows the fingers to touch the screen and select objects quite accurately, despite the finger(s) appearing unfocused through the lenses. Many navigation techniques exist both for casual smartphone 3D applications using the touchscreen and for immersive VR environments using various controllers/sensors. However, no research has focused on the proper navigation interaction technique for a platform like EasyVR, which necessitates the use of the touchscreen while the display device is held to the head and viewed through the magnifying lenses. To design and propose the most fitting navigation method(s) for EasyVR, we mixed and matched the conventional touchscreen-based and headset-oriented navigation methods to come up with six viable navigation techniques (more specifically, for selecting the travel direction and invoking the movement itself), including the use of head rotation, on-screen keypads/buttons, one-touch teleport, drag-to-target, and finger gestures. These methods were experimentally compared for their basic usability and level of immersion when navigating 3D space with six degrees of freedom. The results provide a valuable guideline for designing/choosing the proper navigation method under the different navigational needs of a given VR application.


Introduction
Recent mobile virtual reality (M-VR) requires the ever-ubiquitous smartphone to be inserted into a lenses-only headset [1,2]. Compared to PC-based VR (PC-VR), where the headset is separate from, and connected by cable to, a powerful PC [3], the self-contained M-VR platform is lighter, more convenient and relatively inexpensive (but less performant). Nevertheless, typical M-VR platforms like the Google Cardboard or Samsung Gear VR are still too bulky to carry around for everyday usage, and the act of sliding the smartphone in remains a significant nuisance. A new alternative type of M-VR has emerged, dubbed "EasyVR" [4], which uses cheap, simple clip-on lenses on the smartphone [5]. Clipping the less bulky and foldable lenses onto the phone is clearly simpler and more convenient than sliding the smartphone into the headset (see Figure 1a,b). There are even more convenient designs in which the lenses are integrated into the phone's protective cover/casing and can be popped out/in by a light hand-jerk [6] (see Figure 1c).
Figure 1. EasyVR with the lenses clipped onto the smartphone (a); detachable, foldable and easy-to-carry (b); and integrated into the smartphone protective casing that can easily be popped out and folded in (c).
The main objective of this paper, thus, is to design and propose the most fitting interaction technique for EasyVR with respect to user performance, usability and the level of immersion. In VR, object selection/manipulation and navigation are regarded as the two most important primitive interactive tasks, and their usability and effects on presence and immersion have been studied extensively [7]. As for object selection in EasyVR, our previous work has already shown that users can select virtual objects through the touchscreen quite accurately, despite the "seemingly" significant occlusion by the finger, with the help of the proprioceptive sense and the effects of binocular rivalry [8].
Therefore, this work pertains specifically to the navigation technique in EasyVR. Our basic approach is to combine the conventional touchscreen-based and headset-based 3D navigation techniques. Note that, for the latter, we exclude techniques that rely on separate sensors or controllers, as our design goal for EasyVR is to maintain its self-containment. That led us to mix and match the acts of discrete touch, continuous drag (e.g., for the convenience and directness of touchscreen usage) and head/gaze rotation (e.g., for promoting the proprioceptive and immersive sense of the VR space) for the two main subtasks of 3D navigation, namely, choosing the direction/target (aiming) and enacting the movement itself [7,9]. We have identified six viable combination methods and experimentally compared their basic usability (including sickness), user performance and level of immersion when navigating 3D space with six degrees of freedom.
M-VR ideally requires not only good usability (as a mobile platform to be used conveniently everywhere and anytime), but also sufficient immersion and presence (as a VR platform). Striking the right balance or orientation, given the main objective of the VR content to be viewed and navigated, is important in choosing the appropriate navigation technique, as such a trade-off is expected across these six methods. In this vein, we believe our study can provide a valuable interaction design guideline and help achieve the ultimate goal of a "consumer-usable" mobile VR platform.
In EasyVR, despite the user's peripheral view into the outside world being open and visible, it has been suggested and partly shown that the immersive experience is comparable to that of closed (i.e., completely isolated from the real world) VR headsets and, expectedly, significantly higher than that of "smartphone" VR (completely open, without any use of lenses) [4]. Moreover, unlike in conventional M-VR, the touchscreen remains accessible, making the already familiar smartphone-oriented interaction possible. This makes EasyVR, along with the many sensors already integrated into the smartphone, very much self-contained and able to do away with separate controllers for interaction. In fact, usability (in all aspects, namely device form factor and set-up, interaction methods, sickness and convenience) has been the most formidable obstacle to VR becoming a mainstream consumer medium, despite media hype and prospects. We believe that EasyVR has great potential to alleviate many of these usability problems and thus realize the dream of mass consumer-level VR.
The rest of the paper is organized as follows. First, we review conventional 3D navigation methods as realized on hand-held touchscreen devices (like smartphones) and on traditional headset-oriented VR platforms. Based on our review, we then present the six mixed-and-matched navigation designs in detail and explain the rationales behind them. Section 4 describes the details of the comparative experiment of the six proposed navigational interfaces on EasyVR and its results. Finally, we discuss and summarize our findings and conclude the paper.

Related Work
Traveling or navigating in virtual environments with different motion control techniques has been studied extensively [7,9–18]. There is no one-size-fits-all solution/interface for the 3D navigation task, because different VR set-ups employ different sensors and displays, and there may be other factors to consider, such as the type of content, the purpose, and harmonization with other interactive tasks. Perhaps the currently most popular navigation interface for VR is the controller-based one. For example, the HTC Vive VR platform uses a hand-held controller to select the direction or point to the target (by the internal sensor) and move toward it by a variable or fixed distance (using the button) [17]. The movement may be instant (teleport) or animated (continuous move). Compared to the animated move, instant teleport may reduce the degree of vection and sickness [7], but it can induce disorientation. For this reason, teleport is often limited to short-distance targets within the view. The target designation may be visualized and accomplished in different styles, such as the arc [11] or pointing [10].
Another popular method is to use the gaze or head direction to select the direction or target, for example, as sensed by the gyro in the headset [12]. The movement may be invoked separately through controller button(s) or be based on lock-on (dwell) time. In the latter case, a separate controller is not needed, and as such it is the popular choice for M-VR (like the Google Cardboard [1] and Samsung Gear VR [2]). Moreover, the proprioceptive sense from using the neck to steer the direction can help the user understand the space and feel more immersed [8], but it can also cause more significant fatigue and stress than using the hand direction.
The previous studies on 3D navigation have given us rough guidelines, such as the merits of providing maps and landmarks [9], the avoidance of sudden and complete teleports and the ensuing disorientation [7], the utility of walk-in-place [16] or treadmill [15] locomotion, and the use of proper metaphors (e.g., driving [19], bicycle [20], superman [21]) for naturalness and usability (including user performance). We believe that these general guidelines will continue to apply to EasyVR, while specific navigation techniques need to be devised considering its two main characteristics: direct accessibility to the screen target by touch (but seen through the magnifying lenses) and the intended avoidance of using a controller (not only to remain self-contained, but also because the hands are used to hold the device; even if it were held with one hand, using a controller with the other hand would be as awkward as in a conventional M-VR use case).
Therefore, the most relevant 3D navigation interface designs we should consider are the touch based ones employed in the non-VR 3D applications (such as 3D games) on the hand-held devices. The "Point of Interest Logarithmic Flight" [11] technique operates similarly to the VR teleport [17], the only difference being that the target designation is accomplished by direct touch. Another popular interface, often employed for 3D games, is the virtual joystick and go-button that appear on the screen sides/corners for direction control and invoking movement, respectively [22].
Finally, Moerman et al. have introduced a technique called the Drag'n Go [13] where the user controls the progression of the navigation by sliding/dragging the cursor position (with just one finger) along the line between its initial touch position and the bottom border of the screen (acting like a screen space projected movement vector toward the target). While this technique is limited only to moving on the ground (2D), it was experimentally shown to be one of the most efficient. Most touch-based interfaces are limited by the 2D planar screen and thus difficult to support true 3D movements intuitively-extra modes may be necessary.
Recently, head/gaze based direction control has been introduced for non-VR 3D applications (such as 3D games) on hand-held devices, but it has not caught on much, as smartphone users find it tiresome, less familiar, embarrassing to use in public and contrary to their purpose (just playing the game without any concern for an immersive experience) [23]. In the case of EasyVR, we expect and assume that users are actually interested in gaining an immersive experience, and the qualities of presence and immersion will be important criteria for evaluating the effectiveness of the proposed navigation techniques.

Touch Based 3D Navigation Methods for EasyVR
Based on a careful review of existing techniques in smartphone navigation and headset-oriented VR navigation (not using a controller), we mixed and matched the acts of discrete touch, continuous drag and head/gaze rotation for the two main subtasks of 3D navigation, namely, (1) rotating to choose and aim for the direction/target and (2) enacting the movement itself (sometimes including control over the amount of movement). Hereafter, we use the term "head rotation" to include gaze-based rotation, assuming that the gaze direction is the same as the head direction. Furthermore, note that the head/gaze direction is considered the same as the "z" direction (i.e., normal to the main surface of the smartphone) of the device, which the user holds to look through the clip-on lenses in that same direction. We emphasize that the navigation techniques were designed and studied to support travel in all six degrees of freedom (i.e., not limited to just ground travel as in the work of Drag'n Go [13]). We have identified six main methods, as summarized in Table 1. A detailed description of each interface follows.

Table 1. The six navigation interface designs for EasyVR categorized by the methods used for the two subtasks of (1) direction control (rotation) and (2) invoking/controlling the movement.
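The mix-and-match design space behind Table 1 can be sketched in code. The following Python snippet is purely illustrative (the short labels match the ones used in this paper); the viability rule encodes the observation that discrete-touch movement needs the head to aim, since the touch itself enacts the move:

```python
# Sketch of the mix-and-match design space behind Table 1 (illustrative only).
DIRECTION = ["HD", "JD"]             # head-directed / joystick-directed
MOVEMENT = ["TG", "TT", "JG", "DG"]  # touch-to-go / touch-to-teleport /
                                     # joystick-to-go / drag'n-go

def viable(d, m):
    # Discrete-touch movement (TG/TT) needs the head to aim, since the touch
    # itself enacts the move; joystick/drag movement pairs with both.
    return d == "HD" or m in ("JG", "DG")

techniques = [f"{d} + {m}" for d in DIRECTION for m in MOVEMENT if viable(d, m)]
print(techniques)
# -> ['HD + TG', 'HD + TT', 'HD + JG', 'HD + DG', 'JD + JG', 'JD + DG']
```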

Head-Directed + Touch-to-Go (HD + TG)
With the "HD + TG" interface, the users move and rotate their heads, while grabbing the smartphone with one (or two) hand to point to a direction and simply touch anywhere on the screen to move toward in that direction by a fixed amount. A cross-hair is shown to indicate the aimed direction and an arbitrarily set amount of movement is used. This interface is mostly equivalent to the gazed based travel method employed in the current M-VR except that the movement is enacted by the touch rather than the lock-on time. While the time based method could have been considered in this study, we excluded it because there were several previous works that indicated [8,12] that the time-based method was unnatural, less usable, less preferred and simply an unavoidable choice (when a separate controller is unavailable)-thus expectedly less desirable than this method in all regards (see Figure 2). The head-directed, touch-to-go interface (HD + TG). The red cylinders represent the virtual tunnel segments through which the user needs to navigate in the experimental task. The gaze is estimated by the orientation of the hand-held smartphone and sets the movement direction. A single touch on the screen enacts the virtual movement by a predefined fixed distance.

Head-Directed + Touch-to-Teleport (HD + TT)
With the "HD + TT" interface, the users move and rotates their heads, while grabbing the smartphone with one (or two) hand to point to a direction and touch a target in the view to teleport to a new position which is toward the point of interest by a fixed amount. The movement is instant and not animated. This interface is mostly equivalent to the "Teleport" method [7] except that the target/direction designation is accomplished by the head direction and touch, and the teleport enacted by the same touch. The amount of the teleport movement is similarly set arbitrarily for now (see Figure 3). Figure 3. The head-directed, touch-to-go interface (HD + TG). The red cylinders represent the virtual tunnel segments through which the user needs to navigate in the experimental task. The gaze is estimated by the orientation of the hand-held smartphone and sets the movement direction. A single touch on the target on the screen (indicated with the red circle) enacts the virtual teleport to a new position toward the selected target by a predefined fixed distance. lenses Figure 2. The head-directed, touch-to-go interface (HD + TG). The red cylinders represent the virtual tunnel segments through which the user needs to navigate in the experimental task. The gaze is estimated by the orientation of the hand-held smartphone and sets the movement direction. A single touch on the screen enacts the virtual movement by a predefined fixed distance.


Head-Directed + Joystick-to-Go (HD + JG) and Joystick-Directed + Joystick-to-Go (JD + JG)
The third and fourth interfaces both use the virtual joystick, located in the left part of the screen, to move forward (joystick up) and backward (joystick down) along the direction of the view (z direction). Dragging the joystick left and right (x direction) translates the user sideways. This resembles the typical and much familiar navigation interface of smartphone games. HD + JG uses the head/gaze movement (rotation) to point in the direction of travel (z direction), while JD + JG uses another virtual joystick on the right to control the direction, for example, to pan left and right, and to look up and down. The joysticks on the screen are touch-activated and appear at the touch location rather than at fixed locations (see Figure 4).

Figure 4. Head-directed + joystick-to-go (HD + JG) (a) and joystick-directed + joystick-to-go (JD + JG) (b). The red cylinders represent the virtual tunnel segments through which the user needs to navigate in the experimental task. The virtual joystick on the left controls the movement by drag. As for direction control, HD + JG relies on the user's head orientation (estimated by the phone direction) and JD + JG on another joystick on the right.
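The joystick mapping can be sketched as below; the knob radius and maximum speed are illustrative constants of our own, not values from the implementation:

```python
MAX_RADIUS = 80.0  # px: drag offsets are clamped to the knob radius (illustrative)
MAX_SPEED = 2.0    # units/s at full deflection (illustrative)

def joystick_velocity(anchor, touch):
    """Map the drag from the joystick's anchor (the initial touch point,
    since the joystick appears wherever the finger lands) to a
    (sideways, forward) velocity.  Screen y grows downward, so dragging
    up (negative dy) means forward."""
    dx, dy = touch[0] - anchor[0], touch[1] - anchor[1]
    nx = max(-1.0, min(1.0, dx / MAX_RADIUS))
    ny = max(-1.0, min(1.0, -dy / MAX_RADIUS))
    return (nx * MAX_SPEED, ny * MAX_SPEED)

# Touch lands at (100, 400); dragging up 40 px yields half forward speed.
print(joystick_velocity((100.0, 400.0), (100.0, 360.0)))  # -> (0.0, 1.0)
```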

Head-Directed + Drag'n Go (HD + DG) and Joystick-Directed + Drag'n Go (JD + DG)
The fifth and sixth interfaces both use the Drag'n Go style interface [13] to move forward (drag down) and backward (drag up), toward or away from a point of interest. The point of interest is designated by a touch on the view screen, and the amount of vertical drag is mapped proportionally to how much forward or backward movement is made in the virtual environment. HD + DG uses head rotation for the user's view control, while JD + DG uses the touch-activated joystick on the right-hand side (see Figure 5).
(a) (b) Figure 5. Head-directed + Drag'n Go (HD + DG) (a) and joystick-directed + Drag'n Go (JD + DG) (b). The red cylinders represent the virtual tunnel segments through which the user needs to navigate in the experimental task. For both interfaces the Drag'n Go method is used to control the user's movement toward the point of interest proportionally to the amount of the drag. For HD + DG, the user's head orientation sets the movement direction and for JD + DG, the virtual joystick on the right.
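The Drag'n Go progression control described above amounts to a clamped linear interpolation between the position at the initial touch and the touched point of interest; a sketch (screen coordinates assume y grows downward):

```python
def drag_n_go(start, poi, touch_y, current_y, screen_h):
    """Position for the current drag state of a Drag'n Go style control.

    start    -- user position at the initial touch (the touch also picks poi)
    touch_y  -- screen y of the initial touch; current_y -- y while dragging
    Dragging down (toward the bottom border) advances toward the point of
    interest; dragging back up retreats.  The fraction of the distance
    covered equals the fraction of the drag span used, clamped to [0, 1].
    """
    span = screen_h - touch_y        # pixels between the touch and bottom edge
    if span <= 0:
        return start
    frac = max(0.0, min(1.0, (current_y - touch_y) / span))
    return tuple(s + frac * (p - s) for s, p in zip(start, poi))

# Using half of the available drag span moves halfway to the point of interest.
print(drag_n_go((0.0, 0.0, 0.0), (0.0, 0.0, 8.0), 200, 500, 800))  # -> (0.0, 0.0, 4.0)
```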


Experiment Design
The main purpose of the experiment was to explore the design space of the EasyVR navigation interface among the six proposed methods. While there could certainly be other possible designs, we believe that these six methods reasonably represent the most notable ones through systematic mixing and matching. Their basic usability (including user performance) and effects on the level of immersion were assessed comparatively. The experiment was designed as a one-factor, six-level (1 × 6) within-subject repeated measure, where the sole factor was the navigation interface type as described in Section 3: (1) HD + TG, (2) HD + TT, (3) HD + JG, (4) JD + JG, (5) HD + DG and (6) JD + DG.
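The balanced Latin-square ordering of the six conditions can be generated with the standard zig-zag (Williams design) construction for an even number of conditions; a sketch:

```python
def balanced_latin_square(n):
    """Williams design for even n: every condition appears once per serial
    position in each block of n participants, and every condition follows
    every other condition equally often (first-order counterbalancing)."""
    assert n % 2 == 0, "this simple construction requires an even n"
    first, lo, hi = [0], 1, n - 1
    take_low = True
    while len(first) < n:          # zig-zag: 0, 1, n-1, 2, n-2, ...
        first.append(lo if take_low else hi)
        lo, hi = (lo + 1, hi) if take_low else (lo, hi - 1)
        take_low = not take_low
    return [[(c + i) % n for c in first] for i in range(n)]

CONDITIONS = ["HD+TG", "HD+TT", "HD+JG", "JD+JG", "HD+DG", "JD+DG"]
square = balanced_latin_square(6)
print([CONDITIONS[c] for c in square[0]])
# -> ['HD+TG', 'HD+TT', 'JD+DG', 'HD+JG', 'HD+DG', 'JD+JG']
```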

Experimental Task
The user was asked to navigate through a 3D piped tunnel path with the six interfaces given in a balanced order. The user was initially positioned in the virtual space at the starting point and was instructed to navigate all the way to the finishing point as fast as possible and as accurately as possible, for example, through the center of the tunnel without colliding with the walls. Note that the tunnel was not completely closed but was deliberately made of semi-transparent short pipe segments spaced out at the turning points (see Figure 6), so that the user could get back onto the trail if they veered out of the tunnel. The path was designed sufficiently long (60 segments plus inter-segment turns where each segment would normally require 3-4 incremental movements) and such that all 17 different 45 degree and 90 degree forward turns were included as shown in the figure. Yellow arrows were shown to guide the users to the finish point. The start and finish segments were indicated in different colors for distinction.
All six interfaces were tested using the EasyVR platform, in which the user held the smartphone (Samsung Galaxy S7 (Samsung, Seoul, Korea) [24]) with both hands (while using the thumbs to access and touch the screen for navigational control), and Homido-mini VR magnifying glasses [5] were clipped onto it. The imagery was rendered in stereo mode.


Experimental Procedure
Twenty paid subjects (ten men and ten women between the ages of 20 and 32, mean = 24.7, SD = 3.23) participated in the experiment. After collecting their background information (e.g., age, education), the subjects were briefed about the purpose of the experiment and given instructions for the navigation interfaces and the task. Important background information included whether the user was familiar with and had prior experience in using VR/smartphone games. A short training session was given to allow them to become familiar with the six navigation interfaces and to understand the 3D tunnel path. The training lasted until the subject felt sufficiently comfortable using all six navigation interfaces.
The subjects sat on a swivel chair (so that body turns were fully possible) and held the smartphone with two hands, interacting with the touchscreen using both thumbs to carry out the task (see Figure 7). The experiment presented the six interfaces to each user in a balanced Latin-square order. Each treatment involved finishing the 3D path once while the task completion time and accuracy were measured. The task completion time was measured automatically by a software timer between the moment of the first movement into the first segment and that of reaching the final segment of the tunnel. The accuracy was measured by accumulating the user's distance to the center of the tunnel path at a regular interval and averaging the values.
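The accuracy measure can be sketched as the mean point-to-centerline distance over the regularly sampled positions; representing the tunnel centerline as a polyline through the segment centers is our assumption:

```python
import math

def point_segment_dist(p, a, b):
    """Euclidean distance from point p to line segment a-b (3D)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0.0 else max(
        0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)

def path_accuracy(samples, centerline):
    """Mean distance of regularly sampled user positions to the tunnel
    centerline (lower is better); centerline is a polyline of segment centers."""
    return sum(min(point_segment_dist(p, a, b)
                   for a, b in zip(centerline, centerline[1:]))
               for p in samples) / len(samples)

# Two samples 0.5 and 1.5 units off a straight centerline average to 1.0.
print(path_accuracy([(0.5, 0.0, 2.0), (1.5, 0.0, 7.0)],
                    [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0)]))  # -> 1.0
```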
After experiencing each interface, the subjects filled out surveys assessing the level of immersion, sickness and usability (see Appendix A for the detailed questions and scoring schemes). The IEQ (immersive experience questionnaire) [25] was originally designed to appraise the level of immersion in games in which navigation, our interactional task of focus, often occurs (e.g., racing, shooting and adventure games). Furthermore, it is one of the rare surveys whose credibility has been shown through close and consistent correlation with indirect quantitative measures such as eye movement, pace of interaction and even affective behaviors [25]. As our focus was the task of 3D navigation with six degrees of freedom (DOF), the questionnaire was translated into Korean and modified to fit our purpose. The revised questionnaire comprised 32 questions covering the immersion factors of cognitive involvement, emotional involvement, real-world disassociation, control and challenge.
We omitted a few questions that were less relevant to the experiment, and added/rephrased some questions to link them to the experimental factors at hand. The omissions were minor, and the scoring system was rescaled accordingly, so the reliability and validity of the evaluation remain intact. General feedback was solicited after the whole experiment was finished.
Simulator sickness was measured using a revised version of the SSQ (Simulator Sickness Questionnaire) by Kennedy et al. [26], the most widely used questionnaire in cybersickness research. We also referred to the work of Bouchard et al. [27], who refactored the original SSQ to better fit modern VR systems, and further streamlined, revised and translated it into Korean to fit our experimental purpose. Participants reported the severity of each symptom on a 4-point scale (0-3). The total SSQ score and three subscale scores (nausea, oculomotor and disorientation) were used.
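For reference, SSQ subscale and total scores are weighted sums of the per-symptom ratings. The sketch below uses the weights from the original Kennedy et al. SSQ; the revised questionnaire used in this study adapts the item set and scaling, so treat this purely as a generic illustration of the scoring scheme.

```python
# Subscale weights from the original Kennedy et al. SSQ (symptoms may
# contribute to more than one subscale). The study's revised questionnaire
# differs, so these numbers are illustrative only.
WEIGHTS = {"nausea": 9.54, "oculomotor": 7.58, "disorientation": 13.92}
TOTAL_WEIGHT = 3.74

def ssq_scores(ratings, scale_items):
    """ratings: symptom name -> severity (0-3).
    scale_items: subscale name -> list of symptom names on that subscale.
    Returns (weighted subscale scores, weighted total score)."""
    raw = {s: sum(ratings[i] for i in items) for s, items in scale_items.items()}
    subscales = {s: raw[s] * WEIGHTS[s] for s in raw}
    total = sum(raw.values()) * TOTAL_WEIGHT
    return subscales, total
```

With such a weighting, a questionnaire's maximum attainable total (all symptoms rated 3) yields the kind of ceiling figure (e.g., 812 here) against which observed scores are judged.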
The usability questionnaire was drafted based on notable previous work, such as the NASA TLX [28] and the IBM Computer Usability Satisfaction Questionnaire [29]. The questions covered six categories: ease of use, learnability, efficiency, fun, fatigue and general satisfaction/preference.
Each treatment lasted 8-9 min, including the task itself (5-6 min), answering the survey (2 min) and an in-between break (1.5 min). Thus, with six treatments, each user spent about an hour to complete the whole session.

Quantitative Results
A one-way ANOVA (with Tukey HSD pairwise comparisons) was applied to examine the effect of the lone factor, the interface type. Figure 8 shows the average task completion times for the six interfaces. The analysis revealed a significant main effect (F = 19.938, p < 0.001). HD + TG, HD + TT and HD + JG exhibited relatively higher performance than the other three interfaces; HD + DG and JD + DG performed worst. The Tukey HSD test yielded three subgroups, as indicated in Figure 8 (table and graph), and the Cohen's d-values indicate the effect sizes (large effects with |d| > 0.8 were observed). Participants reported that the Drag'n Go style interface was not suitable for EasyVR, especially when the task required them to interact in short, fast bursts on the small touchscreen. Participants also had a difficult time orienting themselves with the virtual joystick on the right side of the screen when using the JD + DG interface. The detailed statistics are provided in Figure 8.

Figure 9 shows the distance error (accuracy) for the six interfaces, analyzed by a one-way ANOVA (with Tukey HSD pairwise comparisons). The analysis revealed a significant main effect of interface type (F = 6.864, p < 0.001). The Tukey HSD test yielded two subgroups, as indicated in the table and graph, and the Cohen's d-values indicate the effect sizes. The distance error was computed as the accumulated perpendicular distance from the user's position to the center of the tunnel, as the users were instructed to travel through the center while avoiding collisions with the walls as much as possible. The graph mainly shows that HD + DG and JD + DG incurred the largest distance error (despite taking the longest time to finish the course), with a statistically significant difference from the other four interfaces.
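The statistics used throughout this section (one-way ANOVA F-value and Cohen's d for pairwise effect sizes) can be computed as sketched below. This is a generic, self-contained illustration of the tests, not the authors' analysis script (which would typically use a statistics package).

```python
def one_way_anova_f(groups):
    """F-statistic of a one-way ANOVA over a list of sample groups."""
    k = len(groups)                       # number of groups (interfaces)
    n = sum(len(g) for g in groups)       # total observations
    grand = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def cohens_d(a, b):
    """Cohen's d between two samples, using the pooled standard deviation."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    pooled = (((len(a) - 1) * va + (len(b) - 1) * vb) / (len(a) + len(b) - 2)) ** 0.5
    return (ma - mb) / pooled
```

A significant F-value is then followed by a post hoc test (here, Tukey HSD) to find which interface pairs actually differ, with |d| > 0.8 conventionally read as a large effect.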
Figure 9. The average distance error to the center of the tunnel path among the six tested interfaces.

Qualitative Results
Neither the total nor the categorical sickness scores differed significantly across the interfaces (F = 1.450, p = 0.211; see Figure 11). The joystick direction (JD) control showed the worst sickness levels, but without statistically significant differences from the others. The maximum weighted score can be as high as 812, and even the worst interface, JD + JG, scored below 360, indicating that sickness was not a serious problem overall, despite the apparent vection in the experimental navigation task. This could be due to the relatively narrow field of view of the EasyVR set-up. It is interesting to note that JD + JG was rated relatively high in terms of the immersive experience despite incurring a high level of sickness.

Figure 11. Simulator Sickness Questionnaire (SSQ) scores (total) for the six interfaces.

Figure 12 shows the results of the usability questionnaires (six categories) analyzed by the two-way ANOVA. The analysis showed significant main effects of interface type in all categories (ease of use: F = 7.802; learnability: F = 5.295; efficiency: F = 8.905; fun: F = 7.178; fatigue: F = 5.372; satisfaction/preference: F = 6.470; all p < 0.001). Consistent with the previous results, HD + DG and JD + DG showed the poorest usability in general, while HD + JG was rated the most usable, particularly with the least fatigue. The Drag'n Go style interface, which induces the user to "control" the amount of movement very frequently, seems to affect all aspects of the navigation negatively, including user performance, sickness, experience and usability. We omit the detailed statistics for the post hoc pairwise comparisons (the equivalent subgroups with significant differences are indicated in the same color in the graphs).

Experienced vs. Non-Experienced
Out of the twenty participants, ten were categorized as experts: they played mobile games at least weekly and had used VR headsets on several occasions. The other ten were categorized as novices, with almost no such experience. The expert group turned out to be more satisfied with the joystick interface, owing to their previous experience with mobile gaming on their smartphones. Furthermore, the novice group exhibited significantly more simulator sickness (one-way ANOVA: p < 0.001; see Figure 13), leading to more fatigue than in the expert group. In short, the general finding still applies: both expert and novice users preferred the head-directed direction control and the non-Drag'n Go style interfaces (HD + TG, HD + TT and HD + JG). JD + JG was particularly unsuitable for the novice users.

Figure 13. Simulator Sickness Questionnaire scores between the expert and novice user groups.


Discussion and Conclusions
In this work, we have investigated various touch-based interaction methods for navigating virtual space with six degrees of freedom using the "EasyVR" mobile VR platform with open clip-on lenses.
We identified six viable techniques by mixing and matching methods for the two subtasks, namely direction control and movement enactment. The Drag'n Go method [13] has been considered one of the most usable and efficient touchscreen-based navigation methods on mobile platforms. However, our investigation has shown that this does not hold for EasyVR, (1) because the navigation task was extended into 3D with 6 DOF (requiring more control dimensions), whereas Drag'n Go was designed mainly for 2D navigation on the ground, and (2) because of the different form factors, both in screen size (Drag'n Go was originally developed and tested on a 22-inch device) and in how the touchscreen is accessed beneath, and seen through, the lenses.
The head/gaze-directed method for direction control was rated the best in both the quantitative and qualitative measures. The movement-enactment preference was split between the simple touch for novices and the virtual joystick for the more advanced, experienced users. We did not observe any interaction effects between the subtask methods: the direction control (head/gaze-directed) and the movement enactment could mostly be operated independently, much like the separate motor control of the arms (holding the phone/movement direction) and the fingers (pressing to move). Joystick-directed direction/view control incurred a small degree of sickness, although vection-induced sickness would be somewhat unavoidable in any navigation task. In contrast, as already indicated, the head direction felt much more natural, mimicking the way humans travel in the real world.
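The independence of the two subtasks can be seen in a minimal controller sketch for a head-directed, touch-enacted interface: the head (device) orientation supplies the travel direction, and a sustained touch enacts the movement. The function name, argument layout and update scheme are illustrative assumptions, not the authors' implementation.

```python
def step(position, head_dir, touch_held, speed, dt):
    """One update of a hypothetical head-directed, touch-to-go controller.

    position:  current (x, y, z) of the viewpoint.
    head_dir:  (x, y, z) forward vector from the device orientation sensor.
    touch_held: True while the user presses the screen (movement enactment).
    Returns the new position; direction and enactment are fully decoupled.
    """
    if not touch_held:
        return position  # no enactment, regardless of where the head points
    # Normalize the head direction and advance along it.
    norm = sum(c * c for c in head_dir) ** 0.5
    unit = tuple(c / norm for c in head_dir)
    return tuple(p + u * speed * dt for p, u in zip(position, unit))
```

Because the direction comes from one channel (head) and the enactment from another (finger), either can change without disturbing the other, which is consistent with the absence of interaction effects observed above.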
One important lesson is the demonstration that, just because EasyVR leverages the touchscreen for interaction, the regular smartphone interface does not necessarily transfer without problems. Different form factors and operating conditions affect both usability and the level of immersion. With more controls (beyond navigation) added for a particular application, the interaction form factor would change, and with it the usability and user performance.
We acknowledge the limitations of this study, particularly the number of subjects and the simplicity of the experimental task. In the domain of navigation alone, there are many other issues to consider, for example, 2D navigation, non-touch-based methods (such as gestures and walking-in-place), the use of other metaphors and the effects of the open peripheral view in EasyVR. Demographic or system factors other than prior VR experience and expertise could also influence navigation performance and interface preference. This study is just a small start in investigating the usability and user experience of the EasyVR platform, which we believe has good potential for adoption by the masses owing to its tighter integration with the regular smartphone.
Continued experiments and explorations of the design space are surely needed. Other possible future work includes investigating walk-in-place navigation interfaces using only the sensors on the mobile device (for detecting the gait/walk without any external devices) and enriching the navigation interface with vibro-tactile feedback. Another important research direction is to investigate the touch-based interaction paradigm as a whole, for example, how to harmoniously integrate touch-based selection, navigation and other primitive tasks (e.g., manipulation).