Effect of Target Size, Location, and Input Method on Interaction in Immersive Virtual Reality

Abstract: Although new virtual reality (VR) devices and content are actively being released, there are still too few studies to support interface and interaction standards for them. This study investigated whether specific interaction factors influence task performance and the degree of VR sickness during pointing tasks in immersive virtual reality. A smartphone-based VR device was used, and twenty-five targets were placed in a 5 × 5 layout on a VR experimental area that extended to a range similar to the human viewing angle. Task completion time (TCT) was significantly affected by target selection method (p < 0.001) and target size (p < 0.001), whereas the error rate (ER) differed significantly by target selection method (p < 0.001) but not by target size (p = 0.057). Target location significantly affected TCT (p < 0.001) but not ER (p = 0.876). VR sickness was more severe when the target size was smaller. Gaze selection was more efficient when accuracy is demanded, and manual selection was more efficient for quick selection. Moreover, applying the experimental data to Fitts' law showed that movement time was less affected by the device when using the gaze-selection method. Virtual reality provides a three-dimensional visual environment, but a one-dimensional formula can sufficiently predict the movement time. The results of this study are expected to serve as a reference for preparing interface and interaction design standards for virtual reality.


Introduction
Recently, as communications and display technologies have evolved, a basis for commercialization has been established, and virtual reality (VR) has emerged as a next-generation platform along with augmented reality and mixed reality. Research on virtual reality is largely divided into two categories: studies that aim to attenuate motion sickness and studies that analyze task performance with input devices. Currently, many factors obstruct the popularization of VR. One of the main problems is VR sickness [1]. For VR-related markets to continue to develop, an environment in which VR can be experienced for a long time without motion sickness must be prepared. Hardware-based research has primarily focused on reducing the gap between motion and visual information [2,3]. The inconsistency of visual information is the main reason for the occurrence of VR sickness. When a user turns his or her head and senses a change of position through a sensory organ, sickness may occur if the VR motion does not correspond to natural sensations [4][5][6]. The delay behind this mismatch is referred to as motion-to-photon (MTP) latency. The hardware market aims to reduce this latency to less than 20 ms [7][8][9]. In addition, studies have been conducted to reduce MTP latency using machine learning algorithms [10] or IMU sensors [9]. In terms of software, motion sickness has been effectively reduced by adjusting focus [11][12][13], viewing angle [13], and visual effects or guides [14][15][16][17]. These studies suggest that motion sickness levels can be adjusted not only by reducing MTP latency but also through the interaction method and interface.
On the other hand, a smartphone-based virtual reality device uses the smartphone's touchscreen as the display and hardware of the virtual reality, and the role of the headset is to show the screen in three dimensions to the user through the lens attached to the VR headset.
Among these smartphone-based VR devices, some products can use separate controllers, but low-cost smartphone-based VR headsets mainly rely on head tracking. Selection with head tracking can be divided into (1) pressing the capacitive touch button on the virtual reality headset, and (2) placing the cursor on the object and holding it there for a certain time [35]. Smartphone-based VR is less expensive than a dedicated virtual reality headset and does not require a high-end computer, so people who are new to virtual reality can easily access it. Because research to date has mostly targeted PC-based VR devices, research on smartphone-based VR devices is also needed.
Therefore, this study analyzed task performance and the level of VR sickness during input tasks in smartphone-based virtual reality using the two main input methods mentioned above: gaze selection and manual selection. Participants selected targets not only within the viewing angle but also slightly outside it, and task performance was analyzed by defining and quantitatively measuring variables related to task time and accuracy. In addition to the input method, the study compared how the user experience is affected by differences in interface elements such as the size and location of the buttons. The purpose of this study is to investigate the effects of the size and position of the target (i.e., interface elements) and of the input method (i.e., interaction elements) on task performance and VR sickness. Additionally, the movement time data were applied to Fitts' law to analyze whether predictive power differs between the input methods.

Fitts' Law
Fitts' law [36] is widely used in the field of human-computer interaction as a way of predicting the movement time of the target-pointing task. Fitts defined the index of difficulty (ID) of the task using the amplitude (A) and the width (W) of the target as variables. The relationship between the ID and the movement time is expressed by the following simple linear regression model.
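In its usual form, with MT denoting the movement time and a and b denoting empirically fitted regression constants (these symbol names are the conventional ones and are assumed here), the model can be written as:

```latex
MT = a + b \cdot ID, \qquad ID = \log_2\!\left(\frac{2A}{W}\right)
```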
This is a simple but relatively powerful model for estimating movement time: the greater the distance to the target and the smaller the target, the greater the difficulty of the task. Several models with slightly modified ID definitions have been presented. Among them, the model proposed by Welford [37] and the Shannon model proposed by MacKenzie [38] have often been used. These models have been used to analyze target selection tasks in two dimensions, such as selection with a controller [38][39][40], selection with a stylus pen [41], or tapping a touchscreen by hand [42][43][44]. However, the models themselves are formulated for 1-dimensional (1D) tasks. Subsequently, models considering both the width and height of the target were proposed [45][46][47]. Murata and Iwase [48] and Cha and Myung [49] proposed 3D extensions of Fitts' law that include the movement angles as variables. Table 1 summarizes the IDs used in the Fitts' law formula and the various modified formulas mentioned above. In this study, we examine whether Fitts' law holds in a 3D VR environment (Section 5.3).
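As an illustration of the three 1D index-of-difficulty definitions named above, the following minimal sketch computes each of them in their commonly cited forms. The function names, the example amplitude and width, and the use of a shared unit for A and W are assumptions made for illustration; the 2D and 3D extensions are not covered.

```python
import math


def id_fitts(amplitude: float, width: float) -> float:
    """Fitts' original index of difficulty: log2(2A / W)."""
    return math.log2(2.0 * amplitude / width)


def id_welford(amplitude: float, width: float) -> float:
    """Welford's variant: log2(A / W + 0.5)."""
    return math.log2(amplitude / width + 0.5)


def id_shannon(amplitude: float, width: float) -> float:
    """Shannon formulation (MacKenzie): log2(A / W + 1)."""
    return math.log2(amplitude / width + 1.0)


def movement_time(a: float, b: float, index_of_difficulty: float) -> float:
    """Linear Fitts model MT = a + b * ID; a and b come from a regression."""
    return a + b * index_of_difficulty


if __name__ == "__main__":
    # Hypothetical amplitude and width, both in the same unit
    # (for example degrees of visual angle).
    A, W = 30.0, 3.0
    for name, fn in (("Fitts", id_fitts),
                     ("Welford", id_welford),
                     ("Shannon", id_shannon)):
        print(f"{name:8s} ID = {fn(A, W):.3f} bits")
```

All three definitions grow with A/W, so they rank tasks in the same order and differ mainly for small A/W, where the original formula can become negative and the Shannon form cannot.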

Virtual Reality Sickness Questionnaire
The Simulator Sickness Questionnaire (SSQ) proposed by Kennedy et al. [50] was developed to measure the level of motion sickness in a simulator environment. In the SSQ, the degree of motion sickness is rated on a four-point scale for each of 16 symptoms. Each symptom belongs to one or two of three categories (nausea, oculomotor, and disorientation). The level of simulator sickness can be expressed quantitatively by calculating the score of the symptoms included in each category. Kim et al. [51] proposed the Virtual Reality Sickness Questionnaire (VRSQ), which modified the SSQ for the VR environment. Table 2 shows the evaluation items of the VRSQ and the category to which each item belongs. The bottom of the table shows how to calculate the score of each category and the total score. This study used the VRSQ of Table 2 to determine how the interaction method (input) and interface elements (target size and location) affect VR sickness.
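The scoring described above can be sketched as follows. This is a minimal illustration that assumes the commonly cited form of the VRSQ (four oculomotor items, five disorientation items, each rated 0 to 3, component scores scaled to 0 to 100, and the total taken as the mean of the two components). The item names and divisors are assumptions of this sketch; Table 2 remains the authoritative definition.

```python
# Item names are assumed from the commonly cited VRSQ item list.
OCULOMOTOR_ITEMS = (
    "general_discomfort", "fatigue", "eyestrain", "difficulty_focusing",
)
DISORIENTATION_ITEMS = (
    "headache", "fullness_of_head", "blurred_vision",
    "dizziness_eyes_closed", "vertigo",
)
MAX_RATING = 3  # ratings run from 0 (not at all) to 3


def _component_score(ratings: dict, items: tuple) -> float:
    """Sum the ratings of one component and scale the sum to 0..100."""
    total = 0
    for item in items:
        value = ratings[item]
        if not 0 <= value <= MAX_RATING:
            raise ValueError(f"rating for {item!r} must be in 0..{MAX_RATING}")
        total += value
    return 100.0 * total / (MAX_RATING * len(items))


def vrsq_scores(ratings: dict) -> dict:
    """Return the oculomotor, disorientation, and total VRSQ scores."""
    oculomotor = _component_score(ratings, OCULOMOTOR_ITEMS)
    disorientation = _component_score(ratings, DISORIENTATION_ITEMS)
    return {
        "oculomotor": oculomotor,
        "disorientation": disorientation,
        "total": (oculomotor + disorientation) / 2.0,
    }


if __name__ == "__main__":
    # Hypothetical answers from one participant after one condition.
    answers = dict.fromkeys(OCULOMOTOR_ITEMS + DISORIENTATION_ITEMS, 0)
    answers.update(fatigue=1, eyestrain=2, headache=1)
    print(vrsq_scores(answers))
```

Taking the total as the mean of the two components agrees with the scores reported in this paper, where an oculomotor score of 38.54 and a disorientation score of 20.5 give a total of 29.52.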

Participants
A total of 20 subjects were recruited from the local university, with equal numbers of men and women. The subjects were 18 to 25 years old (mean: 22.7; s.d.: 1.9). This age range was chosen because the main users of virtual reality devices are in their 20s and 30s [52,53]. Of the subjects, 12 had experienced HMD-based VR before, and the rest were first-time users. The subjects were all students with no vision problems. Prior to the experiment, the focal length was adjusted for each subject's visual acuity via a dial on top of the HMD so that the image could be seen clearly. VR sickness caused by unclear images was thereby reduced as much as possible. All participants provided informed consent, and the study design was approved by the university's institutional review board (IRB No. 7007971-201801-003-02).

Apparatus and Application
For the VR experiment, a smartphone-based virtual reality head-mounted display (HMD) was used. The smartphone was a Samsung Galaxy S7 (SM-G930K), and the VR headset that holds the smartphone was a Samsung Gear VR (SM-R322) (Figure 1). The display of the smartphone served as the HMD screen, and the VR effect was realized by providing stereoscopic images. The resolution was 2560 × 1440, and the maximum viewing angle through the HMD was 96° on a diagonal basis, which was sufficient for the experiment. The total weight was 470 g (smartphone: 152 g, headset: 318 g), similar to other virtual reality headsets, and each type of experiment took no more than 3 min. Therefore, the effect of the device's weight on motion sickness is expected to be small. The application used to conduct the experiments was implemented with Unity and C#. In VR, the apparent size of an object varies with its distance from the viewer. Therefore, the target size was specified as a viewing angle, a unit that takes into account both the absolute size of the target and the absolute distance from the subject's eye to the target, rather than as an absolute size alone. The viewing angle of the target was determined based on the results of Bababekova et al. [54] (Figure 2). The distance and the size of the target were fixed, so they did not change across participants. Following the experimental design of other studies that analyzed the effect of target characteristics [43,44,55,56], a total of 25 square targets, spaced at regular intervals in a 5 × 5 array, were placed on a white plane with a viewing angle of about 127°19′ on a diagonal basis. Therefore, the experiment was performed within a range similar to the mid-peripheral vision of a human, which is not less than the viewing angle provided by the VR HMD. The size of the target was set at two levels: the viewing angle was about 3°50′ for large targets and about 2°23′ for small targets. Figure 3 shows an example of the experimental area.
There were four experimental conditions, defined by target size and target selection method. The target size had two levels, as described above, and the target selection method also had two levels: gaze selection and manual selection. Both used a cursor fixed at the center of the screen; head movement, captured through head tracking, placed the cursor on the point to be selected. With gaze selection, the target was selected automatically once the cursor had stayed on it for 1000 ms. This is based on a previous study in which the highest task performance was obtained when the gaze timing was 1000 ms [57]. For reference, that study tested waiting times of 1 s, 1.5 s, and 2 s, considering that the average waiting time used by the top 5 applications by number of downloads on the Android Play Store was 1.52 (±0.60) s. With manual selection, the participant placed the cursor on the target and then pressed the touchpad on the right side of the HMD (Figure 4).
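The relationship between a target's physical size, its distance, and its viewing angle can be sketched as follows. The sketch reads the two target sizes as degrees and arcminutes (about 3°50′ and 2°23′) and uses the standard visual-angle formula; the 1.0 m viewing distance is a hypothetical value, since the distance used in the experiment is not stated here.

```python
import math


def dms_to_degrees(degrees: float, arcminutes: float = 0.0) -> float:
    """Convert degrees and arcminutes to decimal degrees."""
    return degrees + arcminutes / 60.0


def visual_angle_deg(size: float, distance: float) -> float:
    """Angle subtended by an object of the given size centered on the
    line of sight at the given distance (same length unit for both)."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def size_for_visual_angle(angle_deg: float, distance: float) -> float:
    """Inverse of visual_angle_deg: size needed to subtend angle_deg."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    distance_m = 1.0  # hypothetical viewing distance, not from the paper
    for label, angle in (("large", dms_to_degrees(3, 50)),
                         ("small", dms_to_degrees(2, 23))):
        side = size_for_visual_angle(angle, distance_m)
        print(f"{label}: {angle:.3f} deg -> side of {side * 100:.2f} cm "
              f"at {distance_m} m")
```

Specifying targets by angle rather than by absolute size means the same script can place a target at any depth in the scene while keeping its apparent size constant.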

Experimental Measurement
This experiment was performed to evaluate how the size, position, and selection methods of the target affected the performance of the task when the user performed the input operation in the virtual environment. Therefore, task completion time (TCT) and the error rate (ER) in all four experiments were measured. Task completion time was defined as the time from when one target was presented to when each participant selected the target. The error rate was defined as the number of incorrect selections made during the task.
Additionally, we measured VR sickness using the VRSQ. Participants rated the level of motion sickness at four levels while taking a break after each experiment (0: Not at all; 1: Slight; 2: Moderate; 3: Very). Therefore, a total of four questionnaires were completed by each participant.
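The two performance measures defined above can be computed from a log of trials along the following lines. The `Trial` record and its field names are hypothetical, the error rate is expressed as a percentage of selections (matching how it is reported in the results), and including error trials in the mean TCT is an assumption of this sketch.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    presented_ms: float  # time at which the target was highlighted
    selected_ms: float   # time at which the participant made a selection
    correct: bool        # True if the highlighted target was selected


def task_completion_time(trial: Trial) -> float:
    """TCT: time from target presentation to selection, in ms."""
    return trial.selected_ms - trial.presented_ms


def mean_tct(trials: list) -> float:
    """Mean TCT over all trials (error trials included by assumption)."""
    return mean(task_completion_time(t) for t in trials)


def error_rate_percent(trials: list) -> float:
    """Share of incorrect selections, as a percentage of all selections."""
    errors = sum(1 for t in trials if not t.correct)
    return 100.0 * errors / len(trials)


if __name__ == "__main__":
    # Hypothetical log of four selections.
    log = [
        Trial(0.0, 2100.0, True),
        Trial(2100.0, 4500.0, True),
        Trial(4500.0, 6400.0, False),
        Trial(6400.0, 9000.0, True),
    ]
    print(f"mean TCT = {mean_tct(log):.0f} ms, "
          f"ER = {error_rate_percent(log):.1f}%")
```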

Procedure
The experiment used a within-subject design. Because the targets were selected repeatedly and the task was demanding, a learning effect could build up toward the end of the experiment. Therefore, to minimize the learning effect, the order of the four experimental conditions was determined using a Latin square design.
In all four conditions of the experiment, one of the 25 green targets was highlighted in blue. The targets presented were designed to appear randomly without any set rules. One task ended when four target selections were completed. When the participant was ready for the next task, he or she began by pressing the HMD's touchpad. In order to reduce the effect by a small number of samples, participants were asked to perform the task several times. Thus, the task was repeated 25 times, so one participant selected a total of 400 targets (4 selections × 25 repetitions × 2 target sizes × 2 selection methods) ( Figure 5). The participants sat in their chairs and performed the experiments in a relaxed posture. At the end of each of the four experimental conditions, the participants completed a VRSQ.
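The counterbalancing and the trial count described above can be sketched as follows. The paper does not state which Latin square was used, so the balanced construction below (each condition appears once in every serial position and follows every other condition exactly once) is an assumption, as are the condition labels and the rule that assigns rows to participants in rotation.

```python
def balanced_latin_square(n: int) -> list:
    """Balanced Latin square for an even number of conditions.

    The first row is 0, 1, n-1, 2, n-2, ...; every later row adds 1
    (mod n) to the row above it.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction needs an even n >= 2")
    first = [0]
    low, high = 1, n - 1
    for k in range(1, n):
        if k % 2 == 1:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
    return [[(x + row) % n for x in first] for row in range(n)]


# Hypothetical labels for the 2 x 2 design (target size x selection method).
CONDITIONS = [
    ("large", "gaze"), ("large", "manual"),
    ("small", "gaze"), ("small", "manual"),
]


def condition_order(participant_index: int) -> list:
    """Order of the four conditions for one participant (0-based index)."""
    square = balanced_latin_square(len(CONDITIONS))
    row = square[participant_index % len(square)]
    return [CONDITIONS[i] for i in row]


# Selections per participant: 4 per task x 25 repetitions x 2 sizes x 2 methods.
SELECTIONS_PER_PARTICIPANT = 4 * 25 * 2 * 2

if __name__ == "__main__":
    for p in range(4):
        print(p, condition_order(p))
    print("selections per participant:", SELECTIONS_PER_PARTICIPANT)
```

With 20 participants and four rows, each order would be used by five participants under this assignment rule.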

Results
Statistical analysis showed no significant difference by gender in any of the dependent variables (TCT: p = 0.450; ER: p = 0.062; VRSQ: p = 0.683; α = 0.05).

Task Completion Time
The mean and standard deviation of the task completion time (TCT) for the two independent variables are shown in Table 3. The average TCT was 2322 ms for the large target and 2518 ms for the small target, indicating that selecting a small target took longer. To determine whether the difference in TCT by target size was statistically significant, a two-way repeated-measures ANOVA (RM-ANOVA) was performed. Target size had a significant effect on TCT (p < 0.001; α = 0.05). The average TCT was 2132 ms for manual selection and 2708 ms for gaze selection, so manual selection was faster. The RM-ANOVA showed that this difference between target selection methods was also significant (p < 0.001). Additionally, the interaction effect between target size and input method was significant (p < 0.001). Simple effect analysis showed that target size had no effect during gaze selection, whereas TCT differed significantly by target size during manual selection (Figure 6).

Error Rate
We examined how the error rate (ER) varied with each experimental condition. The ER was 0.35% for the large target and 0.65% for the small target. A chi-squared test was performed to determine whether this difference was statistically significant. The results showed that target size was not a significant factor in the ER (p = 0.057; α = 0.05). The ER was 0.90% for manual selection and 0.10% for gaze selection, a ninefold difference between the selection methods. A chi-squared test showed that the target selection method was a significant factor for the ER (p < 0.001).
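The chi-squared comparisons above can be illustrated with a 2 × 2 test written with the standard library only. The counts are not given in the text, so they are reconstructed here under stated assumptions: 20 participants × 400 selections = 8000 selections, split evenly so that each level of a factor has 4000, with error counts derived from the reported percentages (14 and 26 for the two sizes, 36 and 4 for the two methods). The test is Pearson's chi-squared without continuity correction, which may differ from the exact procedure the authors used.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-squared test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom the upper
    tail probability is erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2.0))


def error_table(rate_1_pct: float, rate_2_pct: float, n_per_level: int) -> tuple:
    """Build (errors_1, correct_1, errors_2, correct_2) from error rates."""
    e1 = round(rate_1_pct / 100.0 * n_per_level)
    e2 = round(rate_2_pct / 100.0 * n_per_level)
    return e1, n_per_level - e1, e2, n_per_level - e2


N_PER_LEVEL = 4000  # assumed: 8000 selections split evenly over two levels

if __name__ == "__main__":
    size_stat, size_p = chi2_2x2(*error_table(0.35, 0.65, N_PER_LEVEL))
    method_stat, method_p = chi2_2x2(*error_table(0.90, 0.10, N_PER_LEVEL))
    print(f"target size:      chi2 = {size_stat:.2f}, p = {size_p:.3f}")
    print(f"selection method: chi2 = {method_stat:.2f}, p = {method_p:.2e}")
```

Under these assumed counts the size comparison gives a p-value of about 0.057 and the method comparison gives a p-value far below 0.001, in line with the reported results.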

Target Location
To investigate the influence of target position on the TCT, the average TCT was obtained for each target position. A one-way repeated-measures ANOVA was performed using the target location as the independent variable (α = 0.05). The target locations were coded on a nominal scale from 1 to 25 in order from the top left. The results showed that the target position was a significant factor in the TCT in all experimental conditions (p < 0.001). Post hoc analysis using the Student-Newman-Keuls (SNK) test showed statistically significant differences between target locations (Figure 7). We then performed a separate classification to check visually how the TCT differed according to the position of the target. First, the TCTs of the 25 targets were grouped by the post hoc analysis. The targets belonging to the same group as the target with the median TCT were classified into the medium-speed group. Relative to the medium-speed group, the targets with shorter times were classified into the fast-speed group, and those with longer times were classified into the slow-speed group. Table 4 shows the range of TCTs for the three groups by target size and selection method. Figure 8 shows the distribution of TCT for each condition. According to the criteria in Table 4, the slow-speed group is shown in dark gray, the medium-speed group in light gray, and the fast-speed group in white. In all experimental conditions, targets located at the center were classified into the fast-speed group. As the distance from the center increased, the time also became longer. In particular, targets located farthest from the center were classified into the slow-speed group. Unlike for the TCT, the chi-squared test found that the target position was not a statistically significant factor for the ER (p = 0.876; α = 0.05). Figure 9 shows the location of the errors in each condition.
Positions with less than 1/3 of all errors are shown in white, those with less than 2/3 in light gray, and the rest in dark gray. Thus, the darker the color, the more errors occurred at the position. Figure 9a,b show the error distributions for large and small targets. Although the difference in error rate by target size was not significant, more errors occurred with the smaller target. Figure 9c,d show the error distributions of the two selection methods. In the case of manual selection, many areas are painted in dark colors, whereas in gaze selection, all areas are marked in white. As with the results of the analysis in Section 4.3, this confirms that the number of errors differed significantly by selection method but was not affected by the target position.
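To relate the 1 to 25 location coding described above to distance from the center of the 5 × 5 array, a small helper such as the following can be used. It assumes that locations are numbered row by row from the top left (the text only says "in order from the top left"), and the ring measure is an illustrative stand-in for distance from the center, not the SNK-based grouping used in the paper.

```python
GRID = 5  # targets are arranged in a 5 x 5 array


def location_to_cell(location: int) -> tuple:
    """Map a location code 1..25 to a zero-based (row, column) pair,
    assuming row-major numbering that starts at the top left."""
    if not 1 <= location <= GRID * GRID:
        raise ValueError("location must be between 1 and 25")
    return divmod(location - 1, GRID)


def ring_from_center(location: int) -> int:
    """Chebyshev distance from the central cell: 0 for the center,
    1 for the inner ring, 2 for the outer ring."""
    row, col = location_to_cell(location)
    center = GRID // 2
    return max(abs(row - center), abs(col - center))


if __name__ == "__main__":
    for row in range(GRID):
        codes = range(row * GRID + 1, row * GRID + GRID + 1)
        print(" ".join(str(ring_from_center(code)) for code in codes))
```

Averaging the TCT by ring gives a quick check of the reported pattern in which the central target is fastest and the outermost targets are slowest.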

Virtual Reality Sickness
We measured the motion sickness induced by the experiments in this study using the VRSQ. Across all conditions, the oculomotor score was 38.54, the disorientation score was 20.5, and the total score was 29.52. For further analysis, an ANOVA of the VRSQ scores by each independent variable was performed (α = 0.05). The difference by target size was significant (oculomotor: p < 0.01; disorientation: p < 0.05; total: p < 0.01). The oculomotor score was 33.75 points for the large target and 43.33 points for the small target. The disorientation score was 18.33 points and 22.67 points, respectively, and the total score was 26.04 points and 33 points (Figure 10). However, the difference in scores by input method was not statistically significant (oculomotor: p = 1.000; disorientation: p = 0.468; total: p = 0.689) (Figure 11). The oculomotor score was 38.54 points for both methods, and the disorientation score was 21.17 points for gaze selection and 19.83 points for manual selection. Gaze selection thus scored equal to or slightly higher than manual selection, but the difference was not significant.

Application of Fitts' Law
The correlation between the TCT and the index of difficulty can be analyzed using Fitts' law. To calculate the ID, the distance from the center of the current button to the center of the next button was used as the distance (A) from the starting point to the target, and the width of the button was used as the size of the target (W). Table 5 shows the average movement time according to the ID for each target selection method. A regression analysis was performed on these data, and the results are shown in Table 6.

Table 5. Mean task completion time (unit: s) by the index of difficulty in each selection method. [Table body not recoverable; columns: ID and TCT for gaze selection, ID and TCT for manual selection.]

The results showed that R² was 0.71 for the gaze-selection method and 0.90 for the manual-selection method. Therefore, Fitts' law fits better when the target is selected by the manual-selection method.
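The regression step described above (fitting movement time against the index of difficulty and reporting R²) can be sketched with ordinary least squares. The data points below are hypothetical placeholders, not the values from Table 5, and the function name is an assumption.

```python
def fit_fitts_line(ids: list, times: list) -> tuple:
    """Ordinary least squares fit of MT = a + b * ID.

    Returns (a, b, r_squared).
    """
    if len(ids) != len(times) or len(ids) < 2:
        raise ValueError("need at least two paired observations")
    n = len(ids)
    mean_x = sum(ids) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in ids)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ids, times))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(ids, times))
    ss_tot = sum((y - mean_y) ** 2 for y in times)
    return a, b, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical (ID in bits, mean TCT in seconds) pairs for one method.
    ids = [2.0, 2.6, 3.1, 3.5, 3.9, 4.2]
    tct = [1.80, 1.95, 2.15, 2.20, 2.45, 2.50]
    a, b, r2 = fit_fitts_line(ids, tct)
    print(f"MT = {a:.3f} + {b:.3f} * ID, R^2 = {r2:.3f}")
```

Running the fit separately on the gaze-selection and manual-selection rows would give one (a, b, R²) triple per method, which is the form in which the two methods are compared in the text.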

Factors Affecting the Task Performance
The size of the target was a major factor affecting the performance of tasks on other devices, such as mobile and touchpad [43,44,58]. In previous research, it was confirmed that the size of the target is a factor affecting the ER, and most research shows that the larger the target size, the faster the input task can be performed. In this study, we found a significant difference in task completion time between the two levels of target size. On the other hand, it is not a statistically significant factor in the ER, although more errors occurred when a small target was pressed. Therefore, if we extend the range of the target size further, it is expected that we can find the size of the target that can achieve minimum usability in terms of time and error.
The method of selecting the target also showed statistically significant results. With manual selection, the subjects completed tasks on average more than 500 ms faster than with gaze selection (2132 ms vs. 2708 ms). Therefore, the manual-selection method is more advantageous for content that needs fast and continuous selection. However, the results in Figure 8 imply that with gaze selection the target selection task can be performed regardless of the size of the target, whereas with manual selection performance was more sensitive to the target size. It is noteworthy that, although the same operation was performed, the effect of the target size was small in the case of gaze selection.
Both methods had a low error probability of less than 1%, but the target selection method had a statistically significant effect on the errors. The manual-selection method produced nine times as many errors as the gaze-selection method. As mentioned in Section 4.1, the manual-selection method had an advantage in terms of task completion time, but not in terms of accuracy. Factors such as the participants' hand tremor and cursor deviation caused by head movement when pressing the touchpad may have contributed to the errors. Therefore, the gaze-selection method should be used in cases where accurate selection is required. Most errors with the gaze-selection method occurred when the subject selected the previous target again without noticing that the target had changed. This suggests that the gaze timing on the target may not be sufficient in certain situations or that clearer feedback may be needed. These findings can support the design of interfaces for accurate and fast tasks.

Influence of Target Position
The target position also had a significant influence on the task completion time. Figure 8 shows that targets close to the center were selected quickly and that targets at the edges took much longer. These results also seem to have been affected by the viewing angle. In this experiment, the FOV in which the targets were located was 127°, larger than the viewing angle provided by the HMD (96°). Therefore, not all targets were visible at once, whereas targets located near the center were visible most of the time. It is therefore assumed that centrally located targets could be selected sooner than targets located at the edge. This is consistent with the results of other studies comparing the interaction between users and devices. Therefore, targets should be placed near the center when the input task requires quick selection in a virtual environment. For computer work using a mouse, the edge of the screen is a boundary, so pointing at the edge can be performed without finely adjusting the position of the pointer. In VR, however, the edge offers less of this benefit because there is no such boundary and the whole space can be used.
In this study, the probability of choosing the wrong target was not related to the target's position. In a previous study in which users selected buttons on a touchscreen by hand, most errors occurred when the button was located at the lower right, and the fewest errors were found at the upper left [59]. When the cursor is controlled via head tracking, the target location is expected to have less influence on the ER than when using the hand.

Application of Fitts' Law
In this study, participants continuously selected targets while wearing a head-tracking HMD, and we analyzed whether the resulting data conformed to Fitts' law. To determine which ID formulation, and of which dimensionality, best predicts movement time in the VR pointing task, the correlation between movement time and each ID value was analyzed. The targets were squares with equal width and height. Therefore, the extended models proposed by Crossman [45], MacKenzie and Buxton [46], and Hoffman and Sheikh [47] gave the same results as the Shannon formula by MacKenzie. Additionally, the 3D extended formulas proposed by Murata and Iwase [48] and Cha and Myung [49] include specific angles as independent variables. Figure 12 shows the angles used for this study's analysis. Figure 13 shows the results of applying the data of this study to each model. The movement time could be estimated from the index of difficulty with relatively high reliability in all models. Consequently, the pointing task in the HMD also follows Fitts' law. In addition, between the two 3D extended models, the model of Cha and Myung with two angular variables showed better predictive ability than Murata and Iwase's model with one angular variable. Given that VR represents a three-dimensional space, the 3D extended models were expected to predict movement time more accurately. However, the lower-dimensional formulas showed better predictive power. Thus, the movement time of the pointing task in a 3D VR environment was well explained by a 1D formula. VR provides a 3D visual environment but differs from an actual 3D work environment because the user does not physically move to select objects that are far away. An input task in VR can be performed by a 2-dimensional (2D) motion that moves the cursor up, down, left, and right. These characteristics of VR influenced the results.

Measuring VR Sickness through VRSQ
For both VRSQ component scores, VR sickness was more severe when selecting a small target than when selecting a large target (Figure 10). In all experimental conditions, the VRSQ scores in this study were higher than those reported by Kim et al. [51]. The notable differences between the two studies are the magnitude of the viewing angle and the number of targets the subjects had to select in one experimental condition. Lin et al. [60] reported that FOV in VR is a factor affecting motion sickness. Additionally, the number of targets to be selected in this study was 2.5 times that of Kim et al., and the distance between the targets increased because of the wide viewing angle. It thus took longer to carry out the experiment. VR use time has also been found to be a factor affecting VR sickness [4]. Therefore, the time taken to carry out the continuous selection task while wearing the VR HMD is also considered to have contributed to this difference. From another perspective, the task was repeated 25 times in this experiment, more than in the experiment by Kim et al. Therefore, the subjects may have felt more bored while conducting the experiment. Future experiments are expected to determine whether the emotions felt while using virtual reality also affect VR sickness.

Conclusions
In this study, we analyzed the effects of target size, target position, and selection method on task performance when a continuous target selection task was performed with a VR device. The experimental area was set larger than the maximum viewing angle provided by the VR HMD. The target size was divided into two levels. The target selection method consisted of the manual-selection method and the gaze-selection method. The results show that the effect of the target size on the task completion time differed between the two selection methods. Additionally, the error rate was significantly higher with manual selection than with gaze selection, but target size had no significant effect on it. The task completion time differed according to the target location, but the target location did not affect the error rate. Additionally, the VRSQ was used to measure the level of motion sickness during the target pointing task. The wider viewing angle and the longer use time are considered to have influenced motion sickness. Additional experiments are needed to clarify which factors are more influential because there were great differences between the VRSQ scores of this study and those of Kim et al. [51].
Because research on virtual reality has mainly focused on technology for commercialization, there have been relatively few basic usability studies. Virtual reality also interacts with the user through a screen, but it is a platform with characteristics distinct from those of other devices with displays. Therefore, the results of this study are meaningful in showing that separate basic research is needed to reflect the unique characteristics of virtual reality. For example, virtual reality is visually expressed in three dimensions, but manipulation can be carried out in two dimensions by moving the head-tracking-based cursor. These characteristics were confirmed once again by the fact that the data obtained from this study were better suited to the lower-dimensional Fitts' law than to the three-dimensional extended formulas. However, because the target width and height were the same, the results could not be compared with the 2D Fitts' law models that reflect the target area. We expect that the effects of the size and shape of the target on movement in VR pointing tasks can be clarified by experiments that vary the target distance or use targets with different widths and heights instead of squares or circles. Additionally, future studies could compare the results of experiments in which the targets are placed at various distances and heights in order to reflect the characteristics of virtual reality further. By examining these relationships, it may be possible to develop a new time prediction model with a variable that reflects the VR environment. Another limitation of this paper is that the recruited subjects were relatively young, so the results could not be compared with those of the elderly. For example, there are research results showing a statistically significant difference in the level of VR sickness by age group [61][62][63].
Considering that older people also use virtual reality technology for reasons such as treatment or education, it is necessary to analyze the results obtained from various age groups in the future.