The Effect of Training in Virtual Reality on the Precision of Hand Movements

Abstract: The main point of this work was to use virtual reality to discover its benefits for training, specifically for the precision of hand movements in specific settings, and then to evaluate its effects both within virtual reality and in the transfer of the results to the real world. A virtual reality simulation was created using the Unity3D game engine, and real-world experimental material was also prepared. A total of 16 participants took part in the training, which lasted for approximately one month. Once the data had been gathered from both the virtual reality and real-world tests, we carried out an in-depth statistical analysis. The results suggest positive outcomes in most aspects of virtual reality training productivity, but only a partial transfer of the training benefits to the real-world scenario. The possible reasons for this are described in the work, and suggestions are given for duplicating the study with different variables to try to achieve different results.


Introduction
Virtual reality (VR) technology is used in different areas of life for various purposes. VR is the most immersive type of reality technology: it creates an artificial environment for the user to inhabit. This immersive, computer-generated environment blocks out sensory input from the outside world and uses visual and auditory cues to make the virtual world seem quite real. It is immersive because the simulated environment manages to trick the subconscious mind into treating the illusion as real.
The main point of this work is to perform an experiment in VR and in the real world (RW) and find out whether precision, accuracy, and time change and, if so, by how much and in which direction. Accuracy and precision are alike only in that they both refer to the quality of measurement; otherwise, they are very different indicators. Accuracy is the degree of closeness to the true value, whereas precision is the degree to which an instrument or process will repeat the same value. In other words, accuracy is the degree of veracity, while precision is the degree of reproducibility [1].
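The distinction between accuracy and precision can be illustrated with a minimal Python sketch. It assumes that accuracy is summarised by the bias (mean minus true value) and precision by the population standard deviation; these are common choices, not necessarily the ones used in this study. The function names `bias` and `spread` and all sample values are hypothetical.

```python
import statistics

def bias(measurements, true_value):
    """Accuracy indicator: signed offset of the mean from the true value."""
    return statistics.fmean(measurements) - true_value

def spread(measurements):
    """Precision indicator: population standard deviation of the repeats."""
    return statistics.pstdev(measurements)

# Hypothetical repeated measurements of a quantity whose true value is 10.0.
true_value = 10.0
tight_but_offset = [10.9, 11.0, 11.1]       # precise, but not accurate
centred_but_scattered = [9.0, 10.0, 11.0]   # accurate on average, but not precise

for name, values in [("tight_but_offset", tight_but_offset),
                     ("centred_but_scattered", centred_but_scattered)]:
    print(f"{name}: bias={bias(values, true_value):+.2f}, spread={spread(values):.2f}")
```

The first series repeats almost the same value but misses the true value, while the second is centred on the true value but scatters widely.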
This work focuses on just one way of measuring the precision of hand movements, using one specific type of simulation; the training simulations for each job should have their own variants and styles. Some examples of professions that require high hand-to-eye coordination and precision are given below. VR training in these cases could drastically improve the skills required to perform the tasks, as it would allow unlimited training in any circumstances.
VR technology has become very popular and is used in many different areas, such as entertainment, the military, healthcare, education, and engineering. VR is a simulated environment that is created with computer technology and presented to the user in such a way that the user starts to feel as if they are in a real environment. A simulation is a model of the RW wherein the user has the ability to interact with the environment [2].
There is a wide variety of research being conducted in the field of VR on cyber sickness: cyber sickness in different types of moving images [23], comparison of one-screen and three-screen displays [24,25], the relationship between age and sickness level [26], and a gender effect on cyber sickness [27,28]. Research on virtual environments (VEs) has provided converging evidence that being placed in a VE with a head-mounted display (HMD) can lead to motion sickness (MS) [29]. Previous studies suggest that a twenty-minute exposure to virtual reality environments (VREs) can increase cyber sickness (CS) symptoms in over 60% of participants [30]. It has been noted that some users might exhibit symptoms of CS both during and after experiencing VR [31,32]. CS is different from MS in that the users are stationary, but they have a compelling sense of motion as the visual imagery changes [33,34].
Currently, there is a debate concerning VR technology and VR cyber sickness. Some argue that the problem of VR sickness is inherent in the technology itself and that, as long as there is a mismatch between what is visually perceived and what is physically sensed, fixing the VR display properties will not address the root cause [35].

Software and Hardware Equipment
The VR simulation was created using the Unity3D game engine. The tests were run on a VR-ready Acer Predator notebook with an Intel Core i7-6700HQ @ 2.60 GHz, 16 GB of RAM, and an Nvidia GeForce GTX 1060 (6 GB). The VR headset was an HTC Vive Pro, which has a resolution of 1440 × 1600 pixels per eye (2880 × 1600 pixels combined), a 90 Hz refresh rate, and a 110-degree field of view. These parameters allowed participants to have as comfortable a VR experience as possible.

Design
The environment in VR was a simple room with no extra objects in it, in order not to distract participants with unnecessary visual noise (see Figure 1). Apart from the room itself, the only 3D models in the scene were a 1.05 m high table, a reference object to draw on (line, circle, or sine wave), and a model of a glue gun or caulking gun, depending on whether the scene was the one-handed or the two-handed version. The line, circle, and sine wave were semi-transparent green, whereas the drawing was rendered in opaque yellow with Unity3D's line renderer.
The laboratory with assigned VR tracking space can be seen in Figure 2.

3D Models
The models of the glue gun (see Figure 3) and caulking gun (see Figure 4) were first downloaded from the web and slightly adjusted in Blender software for our experiment.

Shape Models
The line, circle, and sine wave models were created in Blender using curve objects and were made very thin in order to create a strict movement pattern that participants had to follow (see Figures 5-7). The thinner the objects, the smaller the scatter in the precision measurements.


Procedure
One session lasted around 20-25 min, depending on the participant's abilities. Each participant was instructed to draw with the glue gun (one hand) and the caulking gun (both hands), following the line on the table as precisely and as quickly as possible. Which of the two aspects (precision or speed) to favour was completely up to them; they just needed to perform the task as well as they could.
Participants were handed HTC Vive Pro controllers and placed in a specific location in the real environment so that the tracking stations could follow their hand movements throughout the whole experiment without jitter or freezes. Despite these conditions, jitter still occurred occasionally, and any drawing during which the tracking was jittery had to be erased and redone. Using specialized gloves or trackers would be the ideal setup for such tasks, as it provides maximum precision, but in this case two Vive tracking stations were used to track the hand movements in VR. The lighting and the positioning of the users were taken into account: both tracking stations had a clear view of the participant, and the room was sufficiently lit. The drawing started on button press and finished on button release.
Once the VR scene started, participants were 'spawned' directly in front of the table, as everything was positioned (both in VR and in the RW) to make the experience as comfortable as possible for participants. All the objects required for the experiment (there were no extra objects in the environment) were located directly in front of the user, at the closest possible distance. Regarding the potential problem of stereopsis (difficulty focusing on farther objects), everything was therefore arranged so that users could easily focus on the task at hand and the objects in front of them. The drawing was done using the Unity3D game engine's line renderer component, which draws points at the position of the edge of the glue gun or caulking gun when the VR controller moves while the button is pressed. The minimum distance between drawn points was set so small that the points created the illusion of a continuous yellow line. If the distance were set to a higher value, such as 1 cm, the points would be drawn 1 cm apart while the participant held the button and drew in VR, and with such a large spacing there would be no continuous line, only separate points in 3D space. Additionally, the more points there were, the more accurate the data that could be obtained. In this experiment, the distance was less than a millimetre. The smaller the distance between the point centres, the more points would be drawn.
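The effect of the point spacing can be sketched as follows. This is a language-agnostic illustration written in Python, not the Unity3D code used in the study: the function `sample_points`, the straight test stroke, and the two thresholds are assumptions chosen only to show how a minimum-distance rule controls the number of recorded points.

```python
import math

def sample_points(positions, min_distance):
    """Keep a position only if it lies at least min_distance from the last kept one."""
    kept = []
    for p in positions:
        if not kept or math.dist(p, kept[-1]) >= min_distance:
            kept.append(p)
    return kept

# Hypothetical controller path: a straight 100 mm stroke reported every 0.25 mm.
# All coordinates are in millimetres.
stroke = [(i * 0.25, 0.0, 0.0) for i in range(401)]

dense = sample_points(stroke, min_distance=0.5)    # sub-millimetre spacing
sparse = sample_points(stroke, min_distance=10.0)  # 1 cm spacing

print(len(dense), "points at 0.5 mm spacing;", len(sparse), "points at 1 cm spacing")
```

With the small threshold the recorded points are close enough to read as a continuous line, while the 1 cm threshold leaves only a handful of isolated points and correspondingly less data per stroke.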
Time and distance were calculated automatically by our own Unity3D VR simulation. There is no need to consider the subjective time compression factor [36], as only objectively measured data were recorded.
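How such a distance measure might be computed is sketched below in Python. The source does not describe the simulation's actual calculation, so this is only one plausible approach under stated assumptions: the reference shape is approximated by a polyline, each drawn point is scored by its distance to the nearest segment, and the time is the interval between button press and release. The names `point_to_segment`, `mean_distance_to_polyline`, and `stroke_duration`, and all sample values, are hypothetical.

```python
import math

def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (any dimension)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Projection parameter, clamped so the closest point stays on the segment.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)

def mean_distance_to_polyline(points, polyline):
    """Average distance from each drawn point to the nearest polyline segment."""
    segments = list(zip(polyline, polyline[1:]))
    total = sum(min(point_to_segment(p, a, b) for a, b in segments) for p in points)
    return total / len(points)

def stroke_duration(press_time, release_time):
    """Drawing time in seconds, from button press to button release."""
    return release_time - press_time

# Hypothetical data in millimetres: a 100 mm reference line and a stroke drawn 2 mm off it.
reference = [(0.0, 0.0), (100.0, 0.0)]
drawn = [(0.0, 2.0), (50.0, 2.0), (100.0, 2.0)]

print("mean distance (mm):", mean_distance_to_polyline(drawn, reference))
print("time (s):", stroke_duration(12.0, 17.5))
```

A circle or sine wave can be handled by the same functions if it is first sampled densely into a polyline.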

Results
Figures 8-13 show the results of the VR experiment: the trends over the 7 sessions, computed from all the users' combined data, in the average distance (in millimetres) between the drawn points and the static 3D reference objects (line, circle, sine wave) and in the average time (in seconds) taken to perform each task. A smaller distance between the points and the objects is better, and the same holds for the time required to perform the tasks.
Figure 11. Both hands circle avg. time to distance trend.
From the graphs above, it can be seen that the average time and distance in both the one-hand and both-hands VR experiments improved from the first day through to the last. It can even be concluded that, in VR, no more than 4 sessions are required, as the trend of improvement slowed down after the 4th session, indicating that most of the benefit had been obtained by that point.

Equipment
The tools modelled in VR were also used for the experiment on paper, but this time they were a real glue gun and a real caulking gun (see Figures 14 and 15). The sizes of the real tools were very similar to those of the VR models, and the drawings on the paper were similar to the 3D models of the lines in VR.
A glue gun normally uses a plastic glue stick that melts once the gun is plugged into an electric socket; in this experiment, a pencil lead was fitted to the tip of the glue gun instead. This was a much easier solution than using melted plastic and allowed us to get a continuous line on the paper.
The caulking gun held a new, unopened silicone tube. Similarly, instead of using silicone for drawing, a small pencil was inserted in the nozzle. This was an easier solution and produced better drawing results.

Paper/RW Material
For both sessions of the paper experiment, 5 A4 sheets of paper were used for each participant:
1. One sheet of paper with 2 lines on one side for one-hand drawing (glue gun), and 2 lines on the back of the sheet for both-hands drawing (caulking gun).
2. One sheet of paper with 2 circles on both sides for one-hand drawing (glue gun).
3. One sheet of paper with 2 circles on both sides for both-hands drawing (caulking gun).
4. One sheet of paper with 2 sine waves on both sides for one-hand drawing (glue gun).
5. One sheet of paper with 2 sine waves on both sides for both-hands drawing (caulking gun).
Figures 16-18 show photographs of the paper sheets with red lines drawn on them, together with the drawings made by participants with the glue gun and the caulking gun (using pencil). The first number (e.g., 5.7 or 7.3) indicates the time in seconds taken to draw the line. The second number (1 or 2) indicates the one-hand (glue gun) or two-hand (caulking gun) test.

Procedure
Participants were positioned near the table where all the paper materials and the glue gun or caulking gun were prepared. Once a participant was given a tool, they proceeded to draw on the paper. All participants were instructed to trace the prepared red drawings as quickly and precisely as possible. Whether to favour precision or speed was completely up to them; the task just needed to be done to the best of their abilities. During the test, the experimenter held the paper so that it did not move and measured the time with a stopwatch. After each drawing, the time taken and the test type (1 or 2) were written down beside it. The paper was then either changed or turned over, and the process continued until all the drawings were finished. Overall, the whole process took around 5 min, depending on whether the participant decided to go for speed or accuracy.

Calculations
Although the paper experiment measured the same factors as the VR experiment, processing the paper drawings in the same way as the VR data would have been very difficult, if not impossible. Therefore, a different route was chosen. To measure the accuracy of the paper drawings, a transparent plastic sheet (cover) was prepared for each type of shape, printed with multiple reference points along the shape, which were used to check whether the drawing passed through them. These covers were then placed onto the participants' drawings, and the points were counted. Only the points that touched the line drawn by the participant were counted. If there was white space between the drawn shape (grey) and the red figure, the point was counted as a miss. The three covers, one for each drawing, are shown in Figures 19-21. The line cover had 52 points, the circle 112 points, and the sine wave 81 points.
Using these transparent sheets (see Figures 19-21), it was possible to count accurately the number of reference points that lay on the lines drawn by the participants. This required all the points to be tracked manually and entered into the data table. The VR experiment, in which the data were gathered automatically in a JSON file, required less manual work. Because the counting was done manually, the number of points (density) evaluated for the RW paper tests was limited. All the calculations were done by a single person in an objective manner.
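The point counting described above can be illustrated with a short Python sketch. The totals per cover (52, 112, and 81) come from the text; the function name `hit_fraction`, the normalisation to a fraction, and the example counts are assumptions introduced here for illustration only. The study itself reports raw point counts.

```python
# Number of reference points printed on each transparent cover (from the text).
COVER_POINTS = {"line": 52, "circle": 112, "sine": 81}


def hit_fraction(shape: str, points_hit: int) -> float:
    """Return the share of reference points touched by the drawn line.

    The fraction is an illustrative normalisation (an assumption of this
    sketch); the study reports the raw number of points.
    """
    total = COVER_POINTS[shape]
    if not 0 <= points_hit <= total:
        raise ValueError(f"points_hit must be between 0 and {total}")
    return points_hit / total


# Hypothetical counts for one participant (not data from the study).
example_counts = {"line": 39, "circle": 84, "sine": 54}
fractions = {shape: hit_fraction(shape, hits)
             for shape, hits in example_counts.items()}
```

Dividing by the number of points on the cover would make scores for different shapes comparable, since the three covers carry different numbers of points. For a first-versus-last comparison within one shape, the raw counts are sufficient.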

Results
Precision in the RW tests was calculated differently from in VR: the more reference points lay on the drawn shape, the higher the score. Figures 22-26 show the results. Time was measured in seconds, as in VR, whereas precision was measured as the number of points positioned correctly on the predefined shapes on the paper (line, circle, sine wave).
The RW experiments, which were carried out only twice, on the first and last days of the experiment, indicate improvement only in the time variable, whereas drawing accuracy worsened compared with the first attempt. One interesting change to such an experiment would be to add one more RW test between the seven VR tests to see whether the results would be the same. Although multiple factors are involved, possible reasons for these results include forgetting the RW task, the type of test itself, and lack of motivation.

In-Depth Statistical Evaluation
MATLAB R2017a software was used for the statistical evaluation. The main goal was to decide if there was a difference between the first and last attempt (VR and paper), in order to see if the participants improved after the process of training in VR.

Methodology of Statistical Evaluation
Statistical hypothesis testing was used for the statistical processing of the experimental data. A statistical hypothesis is a statement that relates to an unknown property or parameters of the probability distribution of a random variable. In statistical testing, we always have two types of hypotheses:

Hypothesis 1 (H1).
The null hypothesis H0 is the hypothesis whose validity we verify using a test.

Hypothesis 2 (H2).
The alternative hypothesis HA is the hypothesis that denies the validity of the null hypothesis.
To test the null hypothesis, we must choose a significance level α, which is the probability of rejecting a true null hypothesis (type I error). This probability is required to be small, so we follow the usual convention and set α = 0.05. The decision whether or not to reject the null hypothesis is made using the p-value of the test: we reject the null hypothesis if the p-value is less than the significance level. More about hypothesis testing can be found in [37].
Statistical tests are used to verify the validity of the null hypothesis. There are parametric and non-parametric tests. Parametric tests require the distribution of the data to be known and usually assume a normal distribution. Non-parametric tests do not require this assumption. Three tests were used in this work: the Jarque-Bera test, the paired-sample t-test, and the Wilcoxon signed rank test.

Jarque-Bera test
First, the data obtained were tested for normality using the Jarque-Bera test [38] in order to decide whether parametric or non-parametric statistical tests should be used.
Parametric tests can only be used for data with a normal distribution; non-parametric tests are used for data with other distributions [37].
The Jarque-Bera test tests the null hypothesis that the measured data follow a normal distribution against the alternative hypothesis that they do not.
If the p-value is greater than the significance level α = 0.05, we do not reject the null hypothesis of data normality and can use a parametric test (paired-sample t-test). Otherwise, we reject the null hypothesis of data normality and have to use a non-parametric test (Wilcoxon signed rank test).
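As an illustration of the normality check, the following Python sketch computes the Jarque-Bera statistic from the sample skewness and kurtosis and converts it to a p-value with the asymptotic chi-square distribution with two degrees of freedom. It is a minimal sketch and not the MATLAB routine used in this work: the function name `jarque_bera` is hypothetical, and the asymptotic p-value is only a rough approximation for small samples, so statistical packages that apply small-sample corrections may report different values.

```python
import math


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(sample)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(sample) / n
    # Central moments with the 1/n estimator used in the JB definition.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    if m2 == 0:
        raise ValueError("sample has zero variance")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-square distribution with 2 degrees of freedom, the
    # survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def is_normal_enough(sample, alpha=0.05):
    """True when the normality hypothesis is not rejected at level alpha."""
    _, p_value = jarque_bera(sample)
    return p_value >= alpha
```

With this helper, a data set for which `is_normal_enough` returns `True` would be passed to the paired-sample t-test, and any other data set to the Wilcoxon signed rank test.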

Paired-sample t-test
The paired-sample t-test [39] is parametric and can only be used for data that come from a normal distribution.
We test the null hypothesis that the distribution of the observed variable (time/distance/accuracy) is identical in both groups (first and last attempt) and has the same mean (expected value) against the alternative hypothesis that the distribution of the observed variable is not identical in both groups and has a different mean.
We reject the null hypothesis if the p-value of the test is less than the significance level α = 0.05, which means that there is a statistically significant difference between the first and last attempt. If we do not reject the null hypothesis, there is no statistically significant difference.
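The paired-sample t-test can be sketched in plain Python as follows. The statistic is the mean of the paired differences divided by its standard error, and the two-sided p-value is taken from Student's t distribution with n - 1 degrees of freedom. In this sketch the distribution is integrated numerically with Simpson's rule, which is a choice made here to keep the example free of external libraries; the function names and the example numbers are hypothetical and do not come from the study.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the probability mass in [-|t|, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_mass = total * h / 3  # integral of the density from 0 to |t|
    return max(0.0, 1.0 - 2.0 * half_mass)


def paired_t_test(first, last):
    """Return (t statistic, two-sided p-value) for two paired samples."""
    if len(first) != len(last) or len(first) < 2:
        raise ValueError("samples must have equal length of at least 2")
    diffs = [a - b for a, b in zip(first, last)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all paired differences are identical")
    t_stat = mean_d / math.sqrt(var_d / n)
    return t_stat, two_sided_p(t_stat, n - 1)


# Hypothetical completion times in seconds (not data from the study).
first_attempt = [11, 12, 13, 14, 15]
last_attempt = [10, 10, 10, 10, 10]
t_value, p_value = paired_t_test(first_attempt, last_attempt)
significant = p_value < 0.05
```

Because the differences are computed as first minus last, a positive t statistic corresponds to a decrease of the observed variable between the two attempts.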

Wilcoxon signed rank test
If the data do not come from a normal distribution, we use the non-parametric Wilcoxon signed rank test for two paired samples [40], which is the non-parametric equivalent of the paired-sample t-test. We test the null hypothesis that the data in both groups (first and last attempt) are samples from continuous distributions with equal medians against the alternative that they are not.
The method of evaluation and conclusions are the same as for the paired-sample t-test.
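For illustration, the Wilcoxon signed rank test can be sketched in plain Python. The sketch ranks the absolute paired differences (tied values share an average rank, zero differences are dropped), sums the ranks of the positive differences, and obtains an exact two-sided p-value by enumerating the null distribution of that sum. The function name and the example numbers are hypothetical, and the exact enumeration is a choice made for this sketch; statistical software may instead use a normal approximation and therefore report slightly different p-values.

```python
def wilcoxon_signed_rank(first, last):
    """Return (W+, exact two-sided p-value) for two paired samples."""
    if len(first) != len(last):
        raise ValueError("samples must have equal length")
    diffs = [a - b for a, b in zip(first, last) if a != b]  # drop zeros
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Average ranks of |d|, stored doubled so that tied ranks stay integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        doubled_avg = (i + 1) + (j + 1)  # 2 * mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks2[order[k]] = doubled_avg
        i = j + 1
    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    # Under H0 each rank joins the positive sum with probability 1/2,
    # so the null distribution is built by adding one rank at a time.
    counts = {0: 1}
    for r in ranks2:
        updated = dict(counts)  # cases where this rank is not added
        for s, c in counts.items():
            updated[s + r] = updated.get(s + r, 0) + c
        counts = updated
    total = 2 ** n
    lower = sum(c for s, c in counts.items() if s <= w_plus2) / total
    upper = sum(c for s, c in counts.items() if s >= w_plus2) / total
    p_value = min(1.0, 2 * min(lower, upper))
    return w_plus2 / 2, p_value


# Hypothetical distances for five participants (not data from the study).
w_plus, p_value = wilcoxon_signed_rank([11, 10, 13, 14, 15],
                                       [10, 12, 10, 10, 10])
```

The null hypothesis would be rejected when the returned p-value is below α = 0.05, in the same way as for the paired-sample t-test.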

VR Experiment
Here these two hypotheses were verified:

Hypothesis 3 (H3).
Distance-reducing trend (increasing drawing accuracy) for VR testing is to be expected.

Hypothesis 4 (H4).
Time-reducing trend (increasing drawing speed) for VR testing is to be expected.
We tested whether there was a statistically significant difference between the first and seventh (last) attempt. First, the normality test was carried out. Tables 1 and 2 show the p-values of the test for both observed variables (distance and time) and for all six versions of the experiment. As can be seen, the p-value was in some cases lower than the significance level α = 0.05, which means that not all the data sets come from a normal distribution. Therefore, the non-parametric Wilcoxon signed rank test was used to test whether there was a statistically significant difference between the first and last attempt of the VR experiment for all the variants (one hand/both hands, line/circle/sine wave).
Table 3 shows the results for the distance variable. The p-value is lower than the significance level α = 0.05 for all variants. This means that we reject the null hypothesis: there is a statistically significant difference between the first and last attempt. We know from Figures 8-13 that the average values of the distance variable decreased. The test confirmed this change at the significance level α = 0.05, which means that the drawing accuracy increased and Hypothesis H3 is confirmed.
Table 4 shows the results for the time variable. Again, the p-value was lower than the significance level α = 0.05 in all cases, so we reject the null hypothesis: there is a statistically significant difference between the first and last attempt. As we know from Figures 8-13, the time in the last attempt was shorter than in the first. This confirms Hypothesis H4 about the expected time-reducing trend (increasing drawing speed) for VR testing.

Real World Experiment
In the real world experiment, these two hypotheses were verified:

Hypothesis 5 (H5).
Accuracy-increasing trend for RW tests (on paper) is to be expected.

Hypothesis 6 (H6).
Time-reducing trend (increasing drawing speed) for RW tests (on paper) is to be expected.
As in the VR experiment, the difference between the first and last (second) attempts was tested. First, the normality test was carried out; the results for all versions of the experiment can be seen in Tables 5 and 6. For the accuracy variable (Table 5), the p-value was in some cases lower than the significance level α = 0.05. This means that not all of the data sets come from a normal distribution, so the non-parametric Wilcoxon signed rank test was used.
For the time variable (Table 6), all p-values were greater than the significance level α = 0.05; we therefore do not reject the null hypothesis of data normality and can use the paired-sample t-test.
Table 7 shows the results for the accuracy variable. The p-value was lower than the significance level α = 0.05 in the case of the one-hand line and circle and the both-hands line. This means that, in these cases, there was a statistically significant difference between the first and second attempt. As we know from Figures 22-24, this difference was a reduction in accuracy, which is the opposite of what we assumed. In the other versions of the experiment, we did not reject the null hypothesis, so there was no statistically significant difference between the first and second attempt. This means that accuracy did not decrease significantly in these cases, but neither did it improve. Hypothesis H5 was therefore not confirmed.
Table 8 shows the results (p-values) of the paired-sample t-test for the time variable. Based on these results, there is a significant difference between attempts 1 and 2 at the significance level α = 0.05 for the one-hand circle and sine wave and for the both-hands line, circle, and sine wave. There is no significant difference between the first and second attempt for the one-hand line at the significance level α = 0.05 (but there is at the level 0.1). As we know from Figures 23-27, the time variable declined in all cases. This means that we have confirmed Hypothesis H6 in all versions of the experiment, with the exception of the one-hand line, at the significance level α = 0.05. However, even in this case, Figure 22 shows that some improvement took place, so we can confirm Hypothesis H6 for this version as well, but at the significance level of 0.1 (a higher level of uncertainty).

Discussion
The main point of this work was to discover the effects/benefits of using virtual reality for training. The training was focused on improvements to the precision of hand movements using a specific type of VR simulation that would allow users/participants to perform similar movement tasks over a prolonged period of time (approximately one month). As VR is said to be a very effective technology in this sphere, we created a VR simulation using the Unity3D game engine for this experiment. This engine proved itself to be very powerful and easy to use, as it had already been used for other types of simulations prior to this.
The simulation for this work took several months to prepare and conduct. In addition to VR tests, RW experiments also took place. VR tests took place in seven sessions, whereas there were only two RW tests, on the start and end dates of the experiment. This was done to analyse the effects of training not only in VR itself but also the transfer of skills to the RW.
Once the experiment was done, the data were collected. VR data were gathered inside the Unity3D game engine during the test process itself. The precision data (in JSON format) were then exported to Excel, where they were processed in pre-prepared macro templates. RW data on the drawing movements (line, circle, sine wave) were processed at the end of the experiment. Lastly, the RW data were exported to Excel templates, and the precision results were calculated and analysed using specific macros.
The results suggest very positive outcomes for VR training in particular. In each of the seven sessions, the participants performed four attempts of each type of simulation (six overall: one hand/two hands, line/circle/sine wave), and the results improved markedly from the start to the end of the training. It was pointed out that four training sessions might be sufficient for this type of training, as the improvements after the fourth attempt, although present, were much smaller.
When it comes to the RW tests, the results suggest that the transfer of skill from VR to RW was only partial: only the time value improved from the first attempt to the second. A previous study reported similar results, wherein training in VR produced positive results in the virtual scenario, but little or no benefit was found when the training was transferred to the RW scenario [41]. Multiple reasons were suggested for this, such as participant forgetfulness, lack of interest, and tiredness. It is recommended that such an experiment be performed with a much bigger pool of participants and in-depth questionnaires, and possibly with more frequent RW tests between the VR tests to remind the participants what they are training for. The type of simulation could also have contributed to these outcomes. To be sure about the factors that could cause these non-beneficial effects, more research and experimentation should be performed on top of this work.
As this experiment and training look very similar to welding training simulations, it is interesting to point out a couple of studies that delved into training participants in this field and evaluated the transfer of that training to practical cases.
A study [42] suggests that VR welding simulations are good not only for novice welders but also for experienced ones. The skills trained in VR welding are then used in practice in RW cases. Although the results show some difference in quality between novice and experienced welders, the data also suggest that even experienced welders can struggle with higher-difficulty welds in VR, which in turn allows them to train and improve at new levels. This suggests that there is a positive correlation and transfer of skills from VR training to the real world, depending on the simulation type/difficulty.
On the other hand, research [43] on different types of welding difficulty tasks shows that, the higher the difficulty of the welding task in VR, the less effective VR training becomes, thus requiring supplementary RW training. But for easy and medium types of training difficulty, VR is considered to be a great tool, which suggests a positive outcome in the transfer of skills.
It is important to note that weight can play some role in the fatigue and performance of the participants during the tests. In our experiment, all the participants were young and healthy. They used VR controllers and simple common tools that did not exceed a weight of 1 kg (the caulking gun, held in both hands), and the experiment lasted a maximum of 15 min, with breaks if needed, so weight should not have been a significant hurdle in performing the given task.
Although the two controllers together weighed almost the same as the caulking gun (the caulking gun was slightly heavier), and one controller weighed almost the same as the glue gun, we saw no significant point in conducting the experiment in VR and RW with identical weights. The experiment was similar in VR and RW in some ways, but some things (weight, controller vs. RW tool, etc.) could not be matched identically; measuring the effect of the weight difference on the result was therefore not one of our goals. One possibility would be to use a tracker mounted on the tool instead of the controller, which could give better results, as the grip feeling and weight in the VR and RW tests would be almost identical. Such tests would be expected to give different results.
This experiment was just a preliminary test of this type of training in VR. The analysis and conclusions may look rough at this stage; to refine them, more such tests should be performed in different variations. The results could then be compared, and more precise and expanded versions of the outcomes could be discussed. In this case, seven tests were performed in VR, whereas there were only two in the RW. Initially, performing the RW test on the start and end dates of the VR tests seemed sufficient, and the results were expected to be more positive. In the end, the results of the RW tests were not so satisfactory, which points to several possible explanations, such as the type of experiment or the lack of repetitions. The only way to assess these explanations precisely is to repeat the same and different experiments multiple times, with different variables.
This work suggested using this type of training only in one way. Using information from this work and performing similar research could eventually result in different outcomes. Those outcomes then would be used to prove some other important points about using VR for training and its benefits. As mentioned, the difficulty of the simulation and supplementary RW training could play a big role in the transfer of skills. Future work should be targeted at delving into the factors described above more deeply, in order to prove or disprove the hypothesis.

Conclusions
Four hypotheses, set out in Section 6, were tested using statistical hypothesis testing.
VR experiment

Hypothesis 3 (H3).
Distance-reducing trend (increasing drawing accuracy) for VR testing is to be expected.

Hypothesis 4 (H4).
Time-reducing trend (increasing drawing speed) for VR testing is to be expected.
Results: Both hypotheses were confirmed. There were statistically significant changes (in terms of improvement) for both the monitored variables (distance, time).
Real world experiment

Hypothesis 5 (H5).
Accuracy-increasing trend for RW tests (on paper) is to be expected.

Hypothesis 6 (H6).
Time-reducing trend (increasing drawing speed) for RW tests (on paper) is to be expected.
Results: Only Hypothesis H6 (time variable) was confirmed. For the accuracy variable, there was either no change or a change for the worse.

Institutional Review Board Statement:
The study was approved by the internal university Ethics Committee.