The Influence of Display Parameters and Display Devices over Spatial Ability Test Answers in Virtual Reality Environments

This manuscript analyzes the influence of display parameters and display devices on the spatial skills of users in virtual reality environments. For this purpose, the authors developed a virtual reality application that tests the spatial skills of its users. 240 students used an LG desktop display and 61 students used the Gear VR for the tests. Statistical data are generated while the users take the tests, and the following factors are logged by the application and evaluated in this manuscript: virtual camera type, virtual camera field of view, virtual camera rotation, contrast ratio parameters, the existence of shadows, and the device used. The probabilities of correct answers were analyzed with respect to these factors using the logistic regression (logit) analysis method. The influences and interactions of all factors were analyzed. The perspective camera, a lighter contrast ratio, no or large camera rotations, and the use of the Gear VR greatly and positively influenced the probability of correct answers on the tests. Therefore, for the assessment of spatial ability in virtual reality, the use of these parameters and this device represents the optimal user-centric human–computer interaction practice.


Introduction
Spatial ability is an important skill to have in the modern age, as many jobs require a well-developed spatial ability [1]. A well-developed spatial ability allows a person to understand the spatial relations between objects and space. It is possible to improve this skill by solving simple geometric problems. Tests were created with this goal in mind in the last century in paper-based formats. Multiple types of these tests exist, but the authors chose three of them for this research: the Mental Rotation Test (MRT) [2], where the user has to rotate objects in their mind; the Purdue Spatial Visualization Test (PSVT) [2,3], where the user also has to rotate objects in their mind; and the Mental Cutting Test (MCT) [2,4], where the user has to cut as well as rotate objects in their mind.
Since these tests mostly exist on paper, and virtual reality can improve the learning skills [5] and even the spatial skills [6-8] of students, a question arises: what happens when these tests are taken in VR? When taking the tests in VR, new factors must be considered, such as Human-Computer Interaction (HCI) and the display parameters of the virtual environment. While the latter make it possible to see the application with different graphical and display settings, the former differs from application to application. The behavior and interaction of humans towards the computer depend on the tasks, the available devices, and even the design of the application [9]. With the use of HCI principles, applications for different purposes can be designed, such as learning applications [10,11], mobile applications [12], assistive technologies [13], entertainment applications [14], interfaces in VR [15], and even the virtual environments themselves [16]. The latter paper also states that there is no perfect HCI principle for VR, as this greatly depends on the type of the application. However, according to its authors, user-centric development has proved useful in the past [17].
While the authors of this manuscript did not follow a user-centric development process, their aim was to find the preference settings that are the most user-centric for virtual environments. For this, the authors developed an application for the mentioned spatial ability tests and gave the users the ability to change the display parameters on each device. Since the developed application runs in VR, its interaction is greatly different from the paper-based methods used previously. As the tests are not on paper, they can be solved with a keyboard and a mouse on a desktop display and with the touchpad on the Samsung Gear VR.
Though the application measures spatial ability skills, it also logs which display factors are used in the tests with each device. These factors are the virtual camera type, its field of view, its rotation, the contrast ratio between the foreground object and the background, and lastly whether shadows are turned on or off. These factors were examined to determine the most user-centric virtual environment and to see how these new factors in the virtual environment influence the users in achieving better results during the spatial ability tests.
After gathering data from 240 students who used a desktop display and 61 who used the Gear VR, the authors analyzed them. In the first round, the authors evaluated each factor separately. The examination then continued by evaluating the factors in pairs. However, the factors that did not have a significant influence on the results were deliberately left out of further examination. After completing the analysis in pairs, the authors continued by evaluating the factors in triplets and, lastly, in a quartet. The examination stopped at four factors because only four factors had significant influences. With these analyses, the authors determined the effects of each display parameter and device separately. However, these factors depend on and even influence each other; therefore, the interactions between them were also investigated.
When the analyses were complete, the authors concluded that with the Gear VR the users had a higher probability of correct answers, and that the perspective camera type, lighter contrast, and no or large rotations also increased this probability. Not only did these factors affect the user results; according to the analysis, this combination is also the optimal user-centric option in VR for assessing spatial skills.
This manuscript is structured as follows: In the next section, the research questions (RQs) and hypotheses (Hs) are stated. In Section 3, the materials and methods are presented. Section 4 deals with the results, while Section 5 discusses them. In the last section, the conclusions are summarized.

Research Questions and Hypotheses
The goal of the authors with this manuscript is to see whether these display parameters and devices positively or negatively affect the interaction of the human with the computer. During the research, the authors set up seven RQs and Hs. The RQs are the following:
• RQ7: What are the optimal preferences for these factors for achieving the largest probability of correct answers on the tests?
The following are the Hs:
• H1: The camera type used does not affect the probability of correct answers; as opposed to: the perspective type positively influences the probability of correct answers on the tests.
• H2: Changing the camera field of view has no effect on the probability of correct answers; as opposed to: changing the camera field of view to a higher degree can positively influence the probability of correct answers on the tests.
• H3: Camera rotation does not affect the probability of correct answers; as opposed to: changing the camera rotation increases the probability of correct answers on the tests.
• H4: The contrast ratio does not affect the ratio of correct answers; as opposed to: changing the contrast ratio from higher to lower values can positively influence the probability of correct answers on the tests.
• H5: The presence of shadows does not affect the probability of correct answers on the tests; as opposed to: the ratios of correct answers differ in the presence and in the absence of shadows.
• H6: Using a desktop display or the Gear VR, the probabilities of correct answers are equal; as opposed to: using the Gear VR, the probability of correct answers is larger.
• H7: Based on the previous hypotheses, the optimal preferences are the perspective camera type, a higher field of view, some rotation, and a lower contrast ratio, while also using the Gear VR.

The Applied Device
The Unity game engine [18], version 2018.3.14f1, was used for the application development at the University of Pannonia, with the C# programming language. The development phase was carried out during the first half of 2019.

When developing the application, a problem arose: Unity could not obtain the correct contrast values for the background and the object. This was due to different color spaces. Unity uses the sRGB color space, which first had to be converted to the linear RGB space for the relative luminance values to be calculated. First, let us define the sRGB and RGB spaces:

w = (sR, sG, sB), sR, sG, sB ∈ [0, 1] (1)

q = (R, G, B), R, G, B ∈ [0, 1] (2)

After defining this, the authors determined the albedo color of a Unity object using a built-in function. However, there is ambient lighting in a scene in Unity. The albedo color of the object does not contain the ambient lighting; therefore, the color values had to be corrected accordingly. To get the correct color of the object, the transformation

w_corr = w * w_ambientlight * Intensity_ambientlight (3)

was used, where w is the color of the object and w_corr is its corrected value. The next step was the conversion to the RGB color space. A new variable q was defined, which contains the R, G, B values just as w contains the sR, sG and sB values. The conversion was performed by the following equation, applied to each component:

q_i = w_corr,i / 12.92, if w_corr,i ≤ 0.04045; q_i = ((w_corr,i + 0.055) / 1.055)^2.4, otherwise (4)

After obtaining the correct R, G, B values of each object, the relative luminance values (L) can be calculated according to the following equation:

L = 0.2126 R + 0.7152 G + 0.0722 B (5)

When both the relative luminance of the background and that of the object (in the foreground) have been calculated, the contrast ratio can finally be measured by:

Contrast ratio = (L_lighter + 0.05) / (L_darker + 0.05), (6)

where L_lighter and L_darker are the larger and the smaller of the two relative luminance values, respectively.

When this contrast problem was solved, the development of the application was finished. As mentioned in the Introduction section, the authors focused on three types of tests: the MRT, the MCT and the PSVT. Therefore, the application contains these three test types. Three examples of the tests can be seen in Figure 1. Each test features ten rounds of questions about spatial ability.
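The conversion and contrast computation described above can be sketched in a few lines. The snippet below is an illustrative re-implementation in Python rather than the application's C#; it assumes the standard sRGB linearization and the WCAG-style relative-luminance and contrast-ratio formulas, and the function names are illustrative, not taken from the application.

```python
def srgb_to_linear(c):
    """Convert one sRGB channel value in [0, 1] to linear RGB."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(srgb):
    """Relative luminance L from an (sR, sG, sB) triple."""
    r, g, b = (srgb_to_linear(c) for c in srgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """Contrast ratio between two sRGB colors; always >= 1."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

# White foreground on a black background gives the maximal ratio.
print(round(contrast_ratio((1.0, 1.0, 1.0), (0.0, 0.0, 0.0)), 3))  # → 21.0
```

The maximal 21:1 value matches the largest contrast ratio level used in the tests.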
The application runs on two operating systems: for the desktop version, Windows 7 or newer can be used, and for the Samsung Gear VR (SM-R322 [19]) version, Android 7.0 or newer can be used. The authors used a Samsung Galaxy S6 Edge+ smartphone [20] for the Gear VR version of the tests.

Data Collection
Testing and gathering the data with the application were conducted at the University of Pannonia and at the University of Debrecen. This happened in September 2019. At the University of Pannonia, the Gear VR was used, and the tests were carried out by 61 students. At the University of Debrecen, 240 students used an LG 20M37A (19.5") desktop display [21] for the tests. The VR testers consisted of Information Technology (IT) and non-IT students, while the students who used the desktop display were either Mechanical Engineering (ME) or Architectural Engineering (AE) students, mostly in their first year.
The VR tests lasted three weeks, and the students tested in a sequential order, as only one Gear VR device was available at the University of Pannonia. A different number of students tested each day: the smallest number of testers was two and the largest was eight. As more desktop display devices were available at the University of Debrecen, testing there was different: the tests were done in a computer laboratory. Due to the laboratory being small, students were grouped into twenty groups, each group consisting of twenty students. Each test type had to be done three times: first, the students had to do the MRT test type once, then the MCT once, then the PSVT once. After that, they started from the beginning with different display parameters.
The reason the testers had to repeat the test types three times is that each test occasion had different display parameters. However, testing three times was not enough to test all parameters for influences and interactions. Therefore, to cover all parameters, the authors used a randomization technique: each test randomized two or three different parameters. The authors believe that, with testing in large numbers, sufficiently large numbers of results regarding each parameter were obtained.
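As an illustration of this randomization, the following sketch (hypothetical Python, not the application's C# code) picks two or three parameters at random and assigns them random levels; the level sets mirror values reported elsewhere in the manuscript and are assumptions of this sketch, not the application's exact configuration.

```python
import random

# Display parameters and illustrative level sets (an assumption here; the
# application's exact level sets may differ, e.g. for camera rotation).
LEVELS = {
    "camera_type": ["perspective", "orthographic"],
    "field_of_view": [45, 60, 75, 90],
    "camera_rotation": [-45, -15, 0, 45],
    "contrast_ratio": ["1.5:1", "3:1", "7:1", "14:1", "21:1"],
    "shadows": ["on", "off"],
}

def randomize_parameters(settings):
    """Pick two or three distinct parameters and randomize their levels."""
    chosen = random.sample(sorted(LEVELS), k=random.choice([2, 3]))
    for name in chosen:
        settings[name] = random.choice(LEVELS[name])
    return settings

# Start from some defaults and randomize for the next test.
defaults = {name: levels[0] for name, levels in LEVELS.items()}
print(randomize_parameters(dict(defaults)))
```

Over many testers, such per-test randomization yields a large number of observations at each level of each parameter.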
The following information was saved into a .csv file by the application in real time:
• The technical information about the display parameters in each test: the virtual camera type, its field of view, its rotation, the contrast ratio in the scene and whether the shadows are turned on or off.
• The user-related information: their gender, age, primary hand, number of years at a university and their major. This category is not focused on in this manuscript.
• The test type, its completion time and the number of correct and incorrect answers.
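The logging itself is straightforward; a minimal sketch of such real-time CSV appending (illustrative Python with invented column names, not the application's actual schema) could look like this:

```python
import csv
import os

# Illustrative column names covering the three categories above; the
# application's actual .csv schema is not given in the manuscript.
FIELDS = ["camera_type", "field_of_view", "camera_rotation", "contrast_ratio",
          "shadows", "gender", "age", "primary_hand", "years_at_university",
          "major", "test_type", "completion_time_s", "correct", "incorrect"]

def append_result(path, row):
    """Append one finished test to the log, writing the header once."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)  # missing fields are left empty
```

Appending one row per finished test keeps the log consistent even if the application is interrupted mid-session.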

Data Analysis
In this manuscript, the influence of the display parameters was investigated by focusing on the probabilities of the correct answers.
The problems to be solved were grouped into tests of 10 questions each, and every tester carried out 9 of these tests. The display parameters belonging to the questions within one test were fixed. Therefore, the authors have 240 × 9 + 61 × 9 = 2709 relative frequencies for estimating the probability of correct answers given the parameter values. The aim was to clarify the effects of the parameters. As probabilities were the focus, the authors used logistic regression analysis to verify the influence of the parameters [22].
Logistic regression is a well-established statistical method for detecting the effects of factors in themselves, additively, or by taking their interactions into account. The probabilities are transformed by a monotone increasing and invertible transformation onto the interval (−∞, ∞), and linear regression models are fitted to the transformed values. The estimated coefficients of the variables are tested as to whether they can be considered zero (no effect) or whether they differ significantly from zero (there exists an effect). The sign of the estimated value also reveals the direction of the effect, i.e., an increase or decrease in the probabilities. The authors investigated the effects of the variables one by one, in pairs, in triplets and in a quartet, too. The numerical calculations were carried out using the statistical program package R [23]. The results are presented in the next section.
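The transformation referred to here is the logit, log(p/(1 − p)). As a sketch (in Python, whereas the authors worked in R), the following shows the transformation, its inverse, and the standard fact that for a single binary factor the fitted slope equals the difference of the two groups' log-odds:

```python
from math import log, exp

def logit(p):
    """Map a probability in (0, 1) onto the real line (−inf, inf)."""
    return log(p / (1 - p))

def inv_logit(x):
    """Inverse of the logit: map a real number back to a probability."""
    return 1 / (1 + exp(-x))

def binary_factor_coefficient(p_reference, p_treatment):
    """Logistic-regression slope for a single binary factor: the log-odds
    difference. A positive sign means the factor improves the probability."""
    return logit(p_treatment) - logit(p_reference)

print(logit(0.5))                       # → 0.0
print(round(inv_logit(logit(0.7)), 3))  # → 0.7
```

Testing whether such a coefficient differs from zero is exactly the significance test reported in the tables of the next section.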

Results
This section is broken into four subsections. In the first subsection, the effect of each single factor is analyzed in itself. In the second subsection, the interactions of two factors are allowed and analyzed. In the third subsection, the interactions of triplets of factors are investigated, and in the fourth subsection, the interactions of all factors are investigated.

Results of the Analyses of a Single Factor's Effects
In this subsection the authors considered the effects of the display parameters and devices separately. This subsection is broken into six subsubsections, each featuring a different factor.

Analysis of the Camera Type
The first factor to be analyzed was the virtual camera type. Virtual cameras can be one of two types. The first type is the perspective camera, which is similar to the human eye. The second type is the orthographic camera, which uses orthographic projection, representing 3D objects in two dimensions. Due to the random choice of camera type, 1418 tests were done with the perspective camera and 1291 tests with the orthographic camera. The numerical results of the users can be seen in Table A1. The authors applied logistic regression analysis to the probability of correct answers, and the results can be seen in Table 1. On the basis of p-value = 2.57 × 10⁻¹², the difference is significant. The authors conclude that the type of the camera has an influence on the probability of correct answers: the perspective camera type produces better results than the orthographic camera type. The authors numerically computed the average rates of correct answers in the case of the orthographic and perspective cameras, obtaining 0.606 and 0.642, respectively, as can be seen in Table A1.
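As a rough back-of-the-envelope check (not a reproduction of Table 1, which was fitted to the raw per-test frequencies), the sign of the camera-type effect can be recovered from the two average rates just quoted:

```python
from math import log

# Average rates of correct answers from Table A1 (quoted in the text above).
rate_orthographic = 0.606
rate_perspective = 0.642

# Log-odds difference between the two camera types; the positive sign
# agrees with the conclusion that the perspective camera performs better.
beta = (log(rate_perspective / (1 - rate_perspective))
        - log(rate_orthographic / (1 - rate_orthographic)))
print(round(beta, 3))  # positive → perspective camera is better
```

The magnitude of this crude estimate is only indicative; the sign is what matters for the conclusion drawn here.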

Analysis of the Camera Field of View
The second factor to be analyzed was the virtual camera's field of view. The default field of view in Unity is 60°, but this value can easily be changed in the application. The authors were interested in multiple fields of view: 45°, 60°, 75° and 90°. 1049 tests were done with 45°, 120 with 60°, 134 with 75°, and 115 with 90°; 1291 tests were done with an orthographic camera. The numerical rates belonging to the groups mentioned are shown in Table A2, and the results of the logistic regression analysis can be seen in Table 2. The base value was −1, belonging to the orthographic camera type. All coefficients are estimated to be positive. The p-values (in the Pr(>|z|) column) suggest that every probability is significantly greater in the case of the perspective camera's field of view options (45°, 60°, 75°, 90°). Moreover, due to the results presented in Table 1, the authors are aware that the perspective camera produces better probabilities. Therefore, the authors eliminated the data belonging to the value of −1 and investigated whether the different levels of the field of view result in different probabilities in the case of the perspective camera type. The results of the analysis based on this restricted data set are contained in Table 3.
The base level was 45°. The signs of the estimated coefficients show that each further level is better but, except for 90°, the difference is not significant. In the case of 90°, on the basis of p-value = 0.0225, the 90° field of view presents a significantly better probability of correct answers than the others at the level of 0.05. However, the difference is not significant at the level of 0.01. As the amount of data is quite large, the authors accept that the effect of the variable named field of view is not significant in the case of the perspective camera type.

Analysis of the Camera Rotation
The next factor to be analyzed was the camera rotation. The authors wanted to see whether the rotation of the virtual scene influenced the results of the users. 106 tests were performed with a rotation of …; the numerical results for each rotation level can be seen in Table A3. The results of the logistic regression analysis are summarized in Table 4. The base level was −15°. According to the p-values, the −45°, 0° and 45° rotations presented significant increases in the probability of correct answers compared to −15°. The last of these is not significant at the 1% level, but it is close to being significant. To further validate the results, the authors grouped the results into two groups. The first group, named "INC_R", contained those rotations that positively affect the probabilities; these are the rotations of −45°, 0° and 45°. The other group, named "NO_R", contained the rotations that did not have a significant positive effect.
1496 tests fell into the INC_R group and 1213 tests fell into NO_R. The numerical results are presented in Table A4. When analyzing the results of these two groups through logistic regression analysis, the results were as presented in Table 5. The reference point was NO_R. Table 5 indicates that the two groups previously defined in this subsubsection have significantly different probabilities. These results prove that the groups can be distinguished from each other, and the authors will use these groups later for the variable camera rotation.

Analysis of the Contrast Ratio
After analyzing the results of the camera rotation, the influence of the contrast ratio between the foreground object and the background was measured. The authors considered five contrast ratio values: 1.5:1, 3:1, 7:1, 14:1 and 21:1; their test numbers were 1066, 167, 1121, 164 and 191, respectively. See Table A5 for the numerical average rates. Similarly to the previous cases, the regression coefficients were computed by logistic regression, and the test statistics (testing their zero values), together with the appropriate p-values, can be seen in Table 6. Compared to the 1.5:1 contrast ratio, the 7:1 and 14:1 contrast ratios produce significantly worse probabilities, and even the 21:1 contrast ratio is significantly worse at the 0.05 level. Therefore, the authors grouped the contrast ratios into two different groups: INC_C, which contains 1.5:1 and 3:1, and NO_C, which contains 7:1, 14:1 and 21:1. INC_C has 1233 tests and NO_C has 1476, as seen in Table A6. For checking the equality of the probabilities of correct answers by logistic regression analysis, see Table 7. According to p-value = 2.56 × 10⁻⁶, the INC_C contrast ratio group gives a significantly better probability of correct answers than the NO_C group. This means that the bright scenes give better results.

Analysis of the Shadows
The next factor analyzed was the shadows in the scene. This variable has only two levels: when the shadows are turned on and when they are turned off. 1414 tests were done with the former and 1295 with the latter. The numerical data (average rates and dispersions) are presented in Table A7 and the results of the logistic regression analysis are presented in Table 8. The reference point was "Turned off". According to p-value = 0.204, the shadows do not affect the probability of correct answers on the tests.

Analysis of the Device Used
The last factor to be analyzed was the device used. Two devices were used in the tests: an LG 20M37A (19.5") desktop display and the Samsung Gear VR. 2160 tests ran on the desktop display and 549 on the Gear VR, as can be seen from Table A8. The results obtained using the logistic regression method are presented in Table 9. The reference point was the desktop display. According to p-value = 0.00677 and the estimated coefficient of 0.07595, the probability of correct answers is significantly larger in the case of the Gear VR.

Results of Analyses of Effects of Two Factors
In this subsection, the authors analyzed the effects of the display parameters in pairs. The variables that do not affect the probabilities in themselves were excluded; therefore, neither the influence of shadows nor the influence of the camera field of view (see Tables 3 and 8) is examined further. The authors analyzed the effects of the camera type, camera rotation, contrast ratio and device used, investigating the influences of these factors in pairs. This subsection is broken into six different subsubsections, each analyzing a pair of factors using logistic regression analysis and the ANOVA method.

Analysis of the Pair Camera Type and Rotation
The first pair was the camera type and the camera rotation. To make the calculations easier and more precise, the same camera rotation groups were used as previously described in Section 4.1.3. Every level of camera type was paired with every level of the camera rotation group, resulting in four groups. The first is named "Orthographic, NO_R" and contains 611 tests; the others are "Orthographic, INC_R", with 680 tests; "Perspective, NO_R", with 602 tests; and "Perspective, INC_R", with 816 tests. For the numerical average rates and dispersions belonging to the groups, see Table A9; for the results of the logistic regression analysis, see Table 10. According to the p-values, every group is significantly better than "Orthographic, NO_R". There is no significant difference between "Orthographic, INC_R" and "Perspective, NO_R"; this difference was calculated by means of a t-test and has a p-value of 0.7126. The group "Perspective, INC_R" has the best results, and the improvement compared to "Perspective, NO_R" is significant (p-value = 0.0033).
Moreover, the authors investigated an additive model on the basis of the variables camera type and camera rotation by allowing their interaction. The result can be seen in Table 11.
The estimated coefficients indicate that both the camera type and the camera rotation have an influence, and p-value = 0.0459 means that there is an interaction as well. The negative sign of −0.08995 was a surprise to the authors, but it reflects the rate of improvement: the individual improvements cannot simply be summed, and the combined effect is slightly lower than the additive prediction. Table 12 shows the results of their interactions. According to p-value = 0.0459, the authors concluded that the model that takes the interactions into account provides a better probability.
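The additive-versus-interaction comparison used throughout these subsections can be illustrated by the shape of the design matrix: the interaction model simply appends the product of the two binary factor columns (a sketch; in an R workflow the formula `y ~ a * b` generates the same structure):

```python
# Design-matrix rows for two binary (0/1) factors a and b. The additive
# model uses [1, a, b]; the interaction model appends the product a*b,
# whose coefficient is then tested against zero.
def design_row(a, b, interaction=False):
    row = [1, a, b]        # intercept, factor a, factor b
    if interaction:
        row.append(a * b)  # interaction column
    return row

# All four combinations, e.g. a = perspective camera, b = improved rotation:
for a in (0, 1):
    for b in (0, 1):
        print(design_row(a, b, interaction=True))
# [1, 0, 0, 0]
# [1, 0, 1, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 1]
```

The interaction column is nonzero only when both factors are present, which is why its coefficient captures the deviation from the purely additive prediction.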

Analysis of the Pair Camera Type and Contrast Ratio
The second pair to be analyzed was the camera type and the contrast ratio. Similarly to the above, the previously formed contrast ratio groups were used. In "Orthographic, NO_C", 696 tests were performed; in "Orthographic, INC_C", 595 tests; in "Perspective, NO_C", 780 tests; and in "Perspective, INC_C", 638 tests. The numerical results are presented in Table A10 and the results of the logistic regression analysis are presented in Table 13. On the basis of Table 13, "Orthographic, NO_C" was found to be significantly worse than the others. There are no significant differences among the other three groups.
The results of the additive model allowing interactions are summarized in Table 14. As can be seen, both variables have a significant influence on the probability of correct answers, and an interaction also exists (p-value = 8.91 × 10⁻⁵). Table 15 shows the results of their interaction.
Due to p-value = 8.895 × 10⁻⁵, it was concluded that the model that takes interactions into account provides a significantly better probability.

Analysis of the Pair Camera Type and Device Used
The next pair examined was the camera type and the device used. 1065 tests were performed in "Orthographic, desktop display", 226 tests in "Orthographic, Gear VR", 1095 tests in "Perspective, desktop display", and 323 tests in "Perspective, Gear VR". The numerical data and the results of the logistic regression analysis are presented in Tables A11 and 16, respectively. When using a desktop display with an orthographic virtual camera, the worst results are produced. With the Gear VR using an orthographic virtual camera, there is no significant improvement. However, performing the tests on a desktop display or a Gear VR with a perspective virtual camera is significantly better than on the desktop display with an orthographic camera. The difference between a desktop display with an orthographic or a perspective camera and a Gear VR with a perspective camera cannot be distinguished. Table 17 presents the results of the logistic regression analysis of the additive model allowing interactions: as can be seen, beyond the camera type, the device used has no significant influence (p-value = 0.0647), and there is no significant interaction (p-value = 0.6164). Table 18 shows the results of their interactions. According to p-value = 0.6164 in the column Pr(>Chi), the authors concluded that the model that takes the interactions into account does not provide a better probability than the additive model.

Analysis of the Pair Camera Rotation and Contrast Ratio
The next pair to be analyzed was the camera rotation and the contrast ratio. For this, the authors used the groups for both variables as defined in Sections 4.1.3 and 4.1.4. The "NO_R, NO_C" group comprised 1005 tests, the "NO_R, INC_C" group 208 tests, the "INC_R, NO_C" group 471 tests, and the "INC_R, INC_C" group 1025 tests, as can be seen from Table A12. The results of the logistic regression analysis can be seen in Table 19. According to the results, every pair has a better probability than "NO_R, NO_C". The t-test does not indicate a difference between "NO_R, NO_C" and "NO_R, INC_C", but differences can be seen between "NO_R, NO_C" and both "INC_R, NO_C" and "INC_R, INC_C". These last two pairs are not distinguishable from each other. Table 20 shows the results while also taking the interaction of the variables into account. As can be seen, the influence of INC_R is strong; the influence of INC_C is smaller (the estimated coefficient is 0.09323), and there is no significant interaction (p-value = 0.1667). Table 21 shows the results of their interactions. According to p-value = 0.1661, the authors concluded that the model that takes the interactions into account does not provide a better probability than the additive model.

Analysis of the Pair Camera Rotation and Device Used
The next pair to be examined was the camera rotation and device used. For the camera rotation, the authors used the same groups that had been formed previously, NO_R and INC_R. 1062 tests were carried out for the group named "NO_R, desktop display", 151 for the group "NO_R, Gear VR", 1098 for the group "INC_R, desktop display", and 398 for the group "INC_R, Gear VR".
Similarly to earlier comparisons, the numerical values of the average rates in the mentioned groups can be seen in Table A13, and the results of the logistic regression analysis are contained in Table 22.
The reference point was "NO_R, desktop display". As the sign of the estimated coefficient is negative, it can be seen that the probability of correct answers is a bit smaller in the group "NO_R, Gear VR", but the difference is not significant (p-value = 0.385). The other two groups present significantly greater probabilities. There is a significant difference between "NO_R, Gear VR" and "INC_R, Gear VR". If the additive model with interactions is applied, the same phenomenon can be seen, as presented in Table 23: the camera rotation has a significant influence (p-value = 2.97 × 10⁻⁶), but the influence of the device used is not significant in itself (p-value = 0.3847). However, there is a significant interaction at the 0.05 level of significance (p-value = 0.0326). The reader can check the numerical values of the average rates in the involved groups in Table A14. Table 24 shows the results of their interactions. Due to p-value = 0.0329, the authors concluded that the model that takes the interactions into account provides a significantly better probability than the additive model.

Analysis of the Pair Contrast Ratio and the Device Used
The last pair to be examined is the contrast ratio and the device used. For the contrast ratio, the authors used the same groups as before, "NO_C" and "INC_C". After creating the pairings, the "NO_C, desktop display" comprised 1183 tests, "NO_C, Gear VR" comprised 293 tests, "INC_C, desktop display" comprised 977 tests, and "INC_C, Gear VR" comprised 256 tests, as can be seen from Table A14. The results of the logistic regression analysis are presented in Table 25. The point of reference was "NO_C, Desktop display". According to the results, the Gear VR does not present a significant improvement with the "NO_C" contrast. However, with the "INC_C" contrast, the results are significantly better. Moreover, if we compare "INC_C, Desktop display" and "INC_C, Gear VR", although the differences between the average rates are large (see Table A14), this difference is not significant (p-value = 0.06332). This is due to the relatively small number of tests (256 tests).
Although the difference between the average rates of "NO_C, Desktop display" and "INC_C, Desktop display" is significant because the number of samples is higher, the authors suspect that with more Gear VR tests, the former difference would also become significant. Performing the logistic regression analysis for the additive model containing the variables contrast ratio and device used on the available data set, the same phenomenon can be observed (Table 26). Table 26 indicates that the contrast ratio has a significant effect (p-value = 0.000583), but beyond the described influence, the influence of the device is not significant (p-value = 0.396699), and no interaction was detected (p-value = 0.094199). Table 27 shows the results of their interactions. Due to p-value = 0.09403, the authors concluded that the model that takes the interactions into account does not provide a significantly better fit than the additive model.
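What distinguishes the interaction model (Table 27) from the additive model (Table 26) is a single extra column in the design matrix: the product of the two dummy variables. A hypothetical sketch of such a row (the layout is an assumption for illustration, not the authors' encoding):

```python
def design_row(contrast_inc, gear_vr):
    """One design-matrix row for the contrast-by-device logit model:
    intercept, contrast dummy (1 = INC_C), device dummy (1 = Gear VR),
    and their product, which carries the interaction effect."""
    return [1, contrast_inc, gear_vr, contrast_inc * gear_vr]

# The interaction column is nonzero only for the "INC_C, Gear VR" cell
rows = [design_row(c, d) for c in (0, 1) for d in (0, 1)]
```

Dropping the last column from each row recovers the additive model, which is exactly the nesting the deviance comparison exploits.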

Results of Analyses Investigating the Effects of Three Factors
After examining the factors in pairs, the next step was to analyze them as triplets. The first triplet to analyze was the camera type, rotation, and contrast ratio. Similarly to the previous sections, the latter two are grouped.
As in the previous section, the authors performed the analyses in two different ways. The first was as follows: the authors created all possible triplets from the levels of variables and performed the logistic regression analysis for these triplets.
Therefore, similarly to earlier sections, the authors created subsubsections for these triplets, each analyzing a different one. Since four distinct triplets can be formed from the four factors, the number of these subsubsections is four.
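The enumeration of all level combinations described above can be sketched as follows (the labels mirror the grouping used in the paper; the code itself is illustrative, not the authors' implementation):

```python
from itertools import product

# Levels of the three grouped factors, as named in the paper
camera   = ["Orthographic", "Perspective"]
rotation = ["NO_R", "INC_R"]
contrast = ["NO_C", "INC_C"]

# All 2 x 2 x 2 = 8 triplet groups used for the logistic regression
triplets = list(product(camera, rotation, contrast))
```

Each element of `triplets`, e.g. `("Orthographic", "NO_R", "NO_C")`, corresponds to one row of the group tables in the Appendix.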

Analysis of the Triplet Camera Type, Rotation and Contrast Ratio
The first triplet to be examined was the camera type, camera rotation and the contrast ratio. The authors used the same rotation and contrast ratio groups as before. The numerical results of the groups defined by these triplets are presented in Table A15, and the results of the logistic regression analysis are presented in Table 28. The reference point was "Orthographic, NO_R, NO_C". As can be seen, the results of "Orthographic, INC_R, INC_C" and of all Perspective groups are significantly better than those of "Orthographic, NO_R, NO_C". The authors checked and concluded that the "Orthographic, INC_R, INC_C" and the Perspective rows could not be distinguished from one another.
With the ANOVA (analysis of deviance) method, a comparison was made between the logistic regression models with additive property without interactions (I), the models with interactions between two variables (II), and the models allowing interactions among all three variables (III). When comparing I and II, the model that took into account the interaction of camera type and camera rotation, as well as the interaction of camera type and contrast ratio, gave significantly better results than the logistic regression model without interactions (p-value = 0.001258). The interaction of camera rotation and contrast ratio was not significant, as presented in the previous section; therefore, the authors omitted it in this section.
The case is the same when comparing I and III (p-value = 0.002387). However, when comparing II and III, it can be concluded that the model in which the interaction of all three variables is built in does not give better results than model II (p-value = 0.2049). The results of the ANOVA can be seen in Table 29. Therefore, the authors present the results of the logistic regression analysis for the most appropriate model, which is model II. In Table 30, the effects of the factors according to model II can be seen. This means that every variable has an influence, and the interaction of camera type and contrast ratio is significant. To double-check the results, the authors compared the model in which the interactions of the camera type and rotation, as well as the interactions of the camera type and contrast ratio, were built in to the model in which only the interaction of the camera type and contrast ratio was built in. The authors did not obtain a worse result by means of this reduction (p-value = 0.99).
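Coefficients such as those in Table 30 act on the log-odds scale; converting the reference intercept plus an effect back into a probability of a correct answer goes through the inverse logit. A small sketch (the coefficient values below are made up for illustration and are not taken from the paper):

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients: intercept of the reference group plus a
# positive effect for the perspective camera type
b0, b_perspective = -0.2, 0.6
p_reference   = inv_logit(b0)                   # reference-group rate
p_perspective = inv_logit(b0 + b_perspective)   # a higher rate
```

A positive coefficient estimate therefore always means a higher probability of a correct answer relative to the reference group, which is how the sign comparisons in the tables should be read.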

Analysis of the Triplet Camera Type, Rotation and Device Used
The next triplet to be examined was the camera type, rotation, and device used. The camera rotation factor used the same groups as before, defined in Section 4.1.3. Then, 8 groups were formed based on all possible levels of the variables.
As before, the numerical values of the average rates and further statistical characteristics are presented in Table A16. Performing the logistic regression analysis, the authors obtained the results presented in Table 31. As can be seen, "Orthographic, NO_R, Gear VR" shows a slight, but not significant, decrease in the results compared to "Orthographic, NO_R, desktop display". The others are significantly better. The smallest improvement is in "Perspective, NO_R, Gear VR", and the greatest improvements are in "Perspective, INC_R, desktop display" and "Perspective, INC_R, Gear VR". The difference between the two best cases is not significant (p-value = 0.3012).
When comparing the model that does not allow interactions (I), the model that allows the interactions of two variables (II), and the model that allows the interactions of all variables (III), model II is significantly better than model I (p-value = 0.01175), and model III is not significantly better than model I (p-value = 0.0502). Between model II and model III, there is no significant difference (p-value = 0.7445). When analyzing model II by logistic regression, the results are as presented in Table 32. Looking at the results, it can be concluded that the influence of the device itself disappeared (p-value = 0.2649), but its interactions are still relevant (see p-values 0.0322 and 0.0286). This means that the device should be taken into account when analyzing the data.

Analysis of the Triplet Camera Type, Contrast Ratio, and Device Used
The next triplet to be examined was the camera type, contrast ratio, and the device used. As before, the contrast ratio is encoded into groups. The contrast ratio groups were defined previously in Section 4.1.4.
The numerical values of the descriptive statistics for the resulting groups are presented in Table A17, and the results of the logistic regression analysis are presented in Table 33. The reference point was "Orthographic, NO_C, desktop display". The logistic regression analysis indicates that "Orthographic, NO_C, Gear VR" was not significantly better than "Orthographic, NO_C, desktop display". The remaining groups were significantly better, but they were not distinguishable from each other. When comparing the model with the variables camera type, contrast ratio, and device used without interactions (I), the model allowing interactions in pairs (II), and the model allowing interactions among all three variables (III), the results were as follows: II is significantly better than I (p-value = 0.0001132), III is significantly better than I (p-value = 0.000546), and there is no significant difference between II and III (p-value = 0.1792). When using model II, the logistic regression analysis results are presented in Table 34. This means that the influence of every parameter is significant, and the interaction of the camera type and contrast ratio is also detectable (p-value = 0.000113).

Analysis of the Triplet Camera Rotation, Contrast Ratio, and Device Used
The last triplet to be analyzed is the camera rotation, contrast ratio and the device used. The former two are grouped as before. First, the authors formed 2 × 2 × 2 = 8 groups according to the levels of the variables. The statistical characteristics of the resulting groups are presented in Table A18. The results of the logistic regression analysis for these 8 groups are presented in Table 35. The results show that there is no detectable difference between "NO_R, NO_C, desktop display" and "NO_R, NO_C, Gear VR" (p-value = 0.82046). The difference is also not significant in the row of "NO_R, INC_C, Gear VR". This requires clarification, as the average rates of correct answers are 0.602 and 0.561. However, for "NO_R, INC_C, Gear VR", the available data are very limited (23 tests), and the dispersion is high. Further tests are needed to be able to reject the hypothesis of the equality of the average rates. In the other cases, there is a significant improvement. The largest improvement is in the case of "INC_R, INC_C, Gear VR". Moreover, the results of this case are also significantly better than the results of any other case. When comparing the model investigating the effects of the variables without interactions (I), the model allowing the interaction of rotation and device used (II), and the model allowing the interactions of all three variables (III), the following conclusions can be drawn: II is significantly better than I (p-value = 0.02568), and III is significantly better than I (p-value = 0.0001853) and also significantly better than II (p-value = 0.0006448). When using model III, the logistic regression analysis yields the results shown in Table 36. According to this table, the unique influence of the device disappears, but it has an interaction with the contrast ratio, and even a triple interaction can be detected.
Concluding the analyses of the effects of triplets of variables, the following factors should be considered: camera type, rotation, contrast ratio, and the device used. The results show that the most important interactions are those between:
• Camera type and camera rotation
• Camera type and contrast ratio
• Camera rotation and device used
• Camera rotation, contrast ratio, and device used

Results of the Analyses of the Effects of Four Variables
After concluding the analysis of triplets of factors, one more analysis remains: the analysis of all four significant factors. This means that the camera type, camera rotation, contrast ratio and the device used were examined together. If the groups are constructed from the possible quartets of the levels of the variables, 2⁴ = 16 groups are formed. The numerical values of the descriptive statistics belonging to these groups are presented in Table A19. Applying logistic regression analysis, the authors obtained the results presented in Table 37. The reference point was "Orthographic, NO_R, NO_C, desktop display". According to the logistic regression analysis, there is a significant improvement in every Perspective group except for "Perspective, NO_R, INC_C, Gear VR" (and in "Orthographic, NO_R, INC_C, desktop display" and "Orthographic, INC_R, INC_C, desktop display") compared to "Orthographic, NO_R, NO_C, desktop display". The greatest improvements are in "Perspective, INC_R, NO_C, desktop display" and "Perspective, INC_R, INC_C, Gear VR". These two cases represent the best numeric values in Table A19. The average rates belonging to these two groups can be considered to be equal (p-value = 0.7627).
Next, a comparison was carried out between the different additive models: the model that uses 4 variables but does not contain any interactions (I), the model that uses 4 variables and allows the interactions of pairs (II), the model that allows the interactions of three variables (III), and finally the model that allows the interactions of all variables (IV). After comparison on the basis of ANOVA, model II was significantly better than I (p-value = 0.0004441), model III was significantly better than model I (p-value = 2.147 × 10⁻⁶), and model III was also significantly better than II (p-value = 0.0003342). Finally, IV was not significantly better than III (p-value = 0.1701).
According to model III, the logistic regression analysis results are as presented in Table 38. It can be concluded that the influence of the device itself cannot be detected (p-value = 0.987212); however, its interactions can be detected. In the end, after examining each factor on its own and in pairs, triplets and fours, "Perspective, INC_R, INC_C, Gear VR" gave the optimal results.
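The average rates underlying tables such as Table A19 can be computed directly from the logged test records. A sketch (the record layout is a hypothetical stand-in for the application's logs, not the actual format):

```python
from collections import defaultdict

def average_rates(records):
    """Average rate of correct answers per (camera, rotation, contrast,
    device) group from per-test records of the form
    (camera, rotation, contrast, device, correct)."""
    tally = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for *group, correct in records:
        entry = tally[tuple(group)]
        entry[0] += correct
        entry[1] += 1
    return {group: c / n for group, (c, n) in tally.items()}
```

Because correctness is a 0/1 outcome, each group's average rate is exactly the empirical probability that the logistic regression models.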

Discussion
The transition from paper to virtual is always difficult. The virtual environment presents a different type of interaction to the user: interacting with virtual spaces is not the same as interacting with real objects, as these environments are built differently and have graphics unlike reality. The goal of the authors was to make this human-computer interaction easier and, in order to achieve that, to present an optimal solution.
According to the research data, an optimal solution was found, and the research questions were clearly answered. The authors demonstrated the effects of the parameters: H5 was accepted, H1, H4 and H6 were rejected, and H2, H3 and H7 presented mixed cases. These are elaborated in the following subsections.

Rejected Hypotheses-Detected Influences
The first rejected hypothesis to discuss is H1. Originally, the authors suspected that the perspective camera type influenced the probability of correct answers. The null hypothesis was that there is no effect, and the alternative hypothesis was that there exists some effect. According to Table 1, the latter proved to be the case. The perspective camera type positively influenced the probability of correct answers. However, as can be seen from Tables 10, 11, 13, 14, 16, 17, 28, 30-34, 37 and 38, the results slightly changed when multiple factors were taken into account. This is due to VR being a complex, synthetic environment: in virtual reality, no scene exists with only a single factor. Therefore, it is safe to assume that when the users are taken into the virtual space, multiple factors should be considered. That is why the authors analyzed all factors in pairs and triplets. After examining everything, the perspective camera was demonstrated to be superior in all tests, and it was always an important factor. This forms T1: when using a perspective camera, the performance of the users was significantly (p-value = 2.57 × 10⁻¹²) influenced in terms of increasing their probability of answering correctly; in pairs, it exhibited significant (p-value = 0.0459) interactions with the −45°, 0°, 45° camera rotations and significant (p-value = 8.91 × 10⁻⁵) interactions with the 1.5:1 and 3:1 contrast ratios; in triplets, it had no significant interactions; but in fours, it exhibited significant (p-value = 0.000133) interactions with the −45°, 0°, 45° camera rotations, the 1.5:1 and 3:1 contrast ratios, and the Gear VR.
T1 is interesting because the objects on the paper-based tests are drawn using orthographic projection. This raises the question of whether changing the paper-based tests to a perspective projection would also change the testers' probability of answering correctly.
The next rejected hypothesis is H4, which deals with contrast ratios. According to Table 6, smaller contrast ratios (1.5:1 and 3:1) produce a better probability of correct answers than larger ones. This was also confirmed by Table 7, in which the contrast ratios were sorted into two groups. In Tables 13, 14, 19, 20, 25, 26, 28, 30 and 33-38, the contrast ratios were examined in detail. Since this factor has a great influence in virtual environments, the interaction of the contrast ratio groups was also assessed. These facts comprise T4: the contrast ratios of 1.5:1 and 3:1 significantly (p-value = 2.56 × 10⁻⁶) influence the performance of the users by increasing their probability of answering correctly; in pairs, they significantly (p-value = 8.91 × 10⁻⁵) interact with the perspective camera type; in triplets, they significantly (p-value = 0.000237) interact with the −45°, 0°, 45° camera rotations and the Gear VR; and in fours, they significantly (p-value = 0.000133) interact with the perspective camera type, the −45°, 0°, 45° camera rotations and the Gear VR.
After forming T4, let us think back to the paper-based tests. There are no contrast ratios on the paper-based tests: everything is white, and only the edges of the objects are black. The authors wanted to create a virtual environment similar to the paper-based tests with brighter colors. However, the idea of using even brighter contrast ratios was discarded, as the testers who used the Gear VR with the 1.5:1 ratio said that their eyes hurt after a few minutes. It is interesting to note that this contrast ratio provided the best results, even numerically.
The last rejected hypothesis is H6. This was one of the most interesting hypotheses to the authors, as they wanted to compare the desktop display and the Gear VR headset. According to Table 9, the Gear VR had a significant influence (an improvement) over the desktop display, as its HCI level was different. Since the display device used was one of the most important and influential factors, it was also examined in pairs, triplets and fours, and its interactions were also assessed in Tables 16, 17

Mixed Cases
The first mixed-case hypothesis is H2. Recall that the null hypothesis is that there is no effect; therefore, rejecting it means that an effect is present. This case is interesting because the camera has two types: for orthographic cameras, the field of view is undefined, while for perspective cameras, the authors analyzed the 45°, 60°, 75° and 90° fields of view. Since T1 states that the perspective camera type is better than the orthographic type, and Table 2 also proves this with respect to the fields of view, the main comparison was only carried out among the fields of view of the perspective camera, as seen in Table 3. Based on these results, T2 is formed: the field of view of 90° influenced the performance of the users by increasing their probability of answering correctly. This is a significant difference at the 0.05 level, but not at the 0.01 level (p-value = 0.0225).
The second mixed hypothesis was H3. This is a mixed case because the authors hypothesized that some rotation would help the users. This proved to be true for large rotations, and also when no rotation occurred; however, when the rotation was smaller than 45° in a given direction, the results showed the hypothesis to be false. It can be stated that the greatest influence on the results takes place when no rotation, or a large rotation, occurs. For this, see Tables 4 and 5. Since the rotation is influential, it was also analyzed in pairs, triplets and fours, and its interactions were assessed in Tables 10, 12, 19, 20, 22, 23, 28, 30-32 and 35-38. T3 was formed: when rotating the camera by −45°, 0° or 45°, the performance of the users is significantly (p-value = 1.12 × 10⁻¹⁰) increased in terms of their probability of answering correctly; in pairs, it significantly (p-value = 0.0459) interacts with the perspective camera type; in triplets, it significantly (p-value = 0.000237) interacts with the 1.5:1, 3:1 contrast ratios and the Gear VR; and in fours, it has significant (p-value = 0.000133) interactions with the perspective camera type, the 1.5:1, 3:1 contrast ratios and the Gear VR.
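The grouping that T3 relies on can be sketched as a simple predicate. The mapping of angles to the influential group follows the description above (no rotation or a large ±45° rotation helps), but the function, its name, and the use of the INC_R label for that group are illustrative assumptions:

```python
def rotation_group(angle_deg):
    """Assign a camera rotation to the grouped level used in the
    analyses: no rotation or a large (+/-45 degree) rotation falls
    into the influential INC_R group, all other angles into NO_R."""
    return "INC_R" if angle_deg in (-45, 0, 45) else "NO_R"
```

Grouping the rotations this way turns a many-level factor into a two-level one, which keeps the pair, triplet, and quartet group counts manageable.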
The third and final mixed-case hypothesis was H7, which is related to the optimal preferences in virtual environments for achieving the best HCI results. This hypothesis is only a mixed case because originally the authors included the camera's field of view in it. However, due to the similar results of the perspective camera type and the field of view, the latter was discarded from the optimal preferences and only the former was kept. Therefore, T7 is formed: based on the previous theses, the optimal preference for virtual environments to positively influence the correct answers on spatial ability tests through the human-computer interaction is a perspective camera type, a camera rotation of −45°, 0° or 45°, a contrast ratio of 1.5:1 or 3:1, and the Gear VR display device.

Accepted Hypothesis-No Differences Detected
The first and only accepted hypothesis is H5, which deals with the presence of shadows in the virtual environment. On the paper-based tests, there are no shadows, thus the authors investigated whether their presence changed the probability of correct answers. According to Table 8, the shadows did not significantly influence the probability of correct answers. Therefore, the shadows were omitted from the multiple factor analyses. On this basis, T5 is formed: the shadows of the object do not significantly (p-value = 0.204) influence the performance of the users.

Conclusions
Designing virtual environments is not an easy task, even if the virtual environment is a virtual version of something in reality. The aim of the authors was to find the factors which positively influence users in VR.
The authors analyzed the virtual camera types, fields of view, and rotations, the contrast ratio between the foreground object and the background, the existence of shadows, and the display device used. 240 students carried out the tests using a desktop display and 61 using the Gear VR, each performing each test three times. The results were analyzed using the logistic regression analysis method.
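For a single binary predictor (e.g. the device used), the maximum-likelihood logit fit summarized above can be sketched with plain gradient ascent. This is an illustrative toy, not the software the authors used for the analyses:

```python
import math

def fit_logit(xs, ys, lr=0.5, steps=2000):
    """Fit p(correct) = 1 / (1 + exp(-(b0 + b1*x))) for a binary
    predictor x by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p            # gradient w.r.t. the intercept
            g1 += (y - p) * x      # gradient w.r.t. the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1
```

With one binary predictor, the fitted probabilities recover the observed group rates exactly, which is why the single-factor tables mirror the average rates of correct answers.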
The measurements and results show that these display factors and devices can influence the interaction between humans and the computer. While these factors and devices all have a unique influence, it has to be kept in mind that no virtual environment exists comprising only one of these factors. Therefore, these factors will always be in effect with multiple others. On this basis, these factors were analyzed in pairs, in triplets and in fours. Some factors lost their unique influences, but interactions emerged.
However, these interactions change depending on the number of factors examined. Many factors retain their interactions, but most of these lose their significance as more factors are considered. Thus, virtual environments should be carefully designed.
In conclusion, a carefully designed virtual environment can positively influence the users in their tasks: the results show that the perspective camera type, a camera rotation of −45°, 0° or 45°, a contrast ratio of 1.5:1 or 3:1, and the Gear VR display device proved to be the optimal factors in virtual environments. When the user is in the virtual space with these factors and display devices, their probability of correct interaction, and even their results, increase.

Acknowledgments:
The authors would like to thank Mónika Szeles and Lóránt Horváth for their help in developing the application and creating the 3D models for the MCT test mode, respectively. The authors would also like to thank everyone who helped by testing the application.

Conflicts of Interest:
The authors declare no conflict of interest.