Multimodal Technologies and Interaction
  • Article
  • Open Access

10 April 2024

A Two-Level Highlighting Technique Based on Gaze Direction to Improve Target Pointing and Selection on a Big Touch Screen

1 Computer Science Department, IMT Atlantique, Technopôle Brest-Iroise, CS 83818, 29238 Brest, CEDEX 3, France
2 Lab-STICC, CNRS UMR 6285, 29238 Brest, France
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.

Abstract

In this paper, we present an approach to improve pointing and target selection on tactile human–machine interfaces. This approach defines a two-level highlighting technique (TLH) based on the direction of gaze for target selection on a touch screen. The technique uses the orientation of the user’s head to approximate the direction of his gaze and uses this information to preselect the potential targets. An experimental system with a multimodal interface has been prototyped to assess the impact of the TLH on target selection on a touch screen and compare its performance with that of traditional methods (mouse and touch). We conducted an experiment to assess the effectiveness of our proposition in terms of the selection error rate and task completion time. We also collected subjective ratings of ease of use, suitability for selection, confidence brought by the TLH, and the contribution of the TLH to improving target selection. Statistical results show that the proposed TLH significantly reduces the selection error rate and the time to complete tasks.

1. Introduction

Today, we are surrounded by modern touch screens on devices of all kinds. However, although touch screens are becoming the central stage of interaction between users and systems, operating them with our fingers creates problems of finger occlusion and imprecision. The situation worsens with the trend towards miniaturization of devices and enrichment of content: pointing at or selecting small objects on the screen becomes difficult as their number increases [1,2,3]. Since pointing and selecting targets are fundamental and the most frequently executed tasks in graphical user interfaces, improvements in the performance of these tasks can have a significant impact on their usability and on the overall usability of the software [4]. Therefore, improving the quality of pointing is one major challenge in the area of Human–Machine Interfaces (HMI) [5,6,7,8,9,10,11,12,13], especially for touch screens, where additional problems, such as occlusion, arise during the tactile selection of objects. Here too, the numerous works proposed by researchers [1,2,3] testify to the importance of target selection on touch screens.
Our objective is to develop means of monitoring maritime situations (such as the activity of a swarm of marine drones) displayed on big touch screens, on which operators must select representations of moving objects. We consider a screen to be big when the user must make large gestures, or even move, to reach any point of the screen. To make this selection easy for the operators, the systems to be set up must offer possibilities for better human–machine interaction, e.g., by following their gestures and gaze in real time.
The main contribution of this paper is the enhancement of the touch modality for selecting targets by highlighting the target, i.e., magnifying it, based on the operator’s gaze. We show that this enhancement performs as well as the equivalent enhancement applied to the classical use of the mouse.
In Section 2, we review how multimodal interfaces can improve human–machine interactions for pointing at and selecting targets on screens. Section 3 presents the technique we propose to enhance target selection and the approach to its implementation, then describes the experimental system designed and presents its prototype. Section 4 describes the experiment carried out with users to evaluate the performance of our proposal, as well as the results obtained from this experiment. Section 5 discusses the relevance of the proposed technique, and Section 6 concludes and gives future directions for improving our work.

3. Enhancing Interaction by Anticipating Pointing and Target Selection

This section describes the main principle of the system we propose to enhance selection and pointing, the different interaction modalities used for our proposal, and the system implemented to experiment and evaluate this proposal.

3.1. Anticipating Pointing and Target Selection Using the Orientation of the Head

It is convenient for a touch screen to use an anchor point to represent the location of a touchdown. However, it is a contact area, rather than a point, that forms when the finger touches the screen. The shape of the contact area can also change under various circumstances, as users can use the touch screen with different fingers [24]. The use of touch screens also raises the occlusion problem [1]. In addition, people of different sexes, ages, or ethnic groups have different finger diameters and sizes, which lead to different target tolerance levels [25]. This justifies work on taking the size of the finger into account in the design of interaction techniques for touch screens. Although the inclusion of finger size in the design of interaction techniques may still have untapped potential for touch screens, and although several studies have addressed this issue, the problem of occlusion remains and continues to be explored.
Given this state of the art and the recognized quality of the mouse for interaction, three questions arise. First, would it be worthwhile to define a touch-based interaction method using a technique that reduces or even eliminates the occlusion problem, and that performs as well as the mouse equipped with the same technique? One way would be to add a highlight effect to the touch interaction mode. Second, is it really impossible to do as well with touch as with a mouse that benefits from a highlight effect (for example, magnification) on the objects hovered over by the cursor? Third, considering that touch is used without a cursor indicating on the screen the place that will be touched (as is the case with the mouse), by what means should a highlight effect be added for touch? If we would like to see the objects grow when we hover over them, as in the case of the mouse, we would have to find a way to hover over them with touch before possibly touching them.
This is why we propose an approach to improve tactile interaction based on anticipating pointing and target selection. The main idea is to use the gaze direction, estimated through the head orientation, to determine where the user’s attention is and then highlight the objects thus targeted; we rely on the head orientation because the gaze obtained through an eye tracker is not always stable and usable [26]. Our work also aims to evaluate the proposed technique using the combination of the mouse and the same highlight effect as a baseline. The goal is to see whether adding the highlight to touch does as well as adding the highlight to the mouse. This assessment is made on two objective aspects, the rate of errors made and the completion time, and on subjective aspects, such as user preferences.

3.2. Interaction Modalities

To carry out this work, four selection methods are offered to users:
  • Mouse: it is used to select objects with a simple click of the left button.
  • Touch: a touch screen allows users to select objects by touching them.
  • Mouse and Highlight: the selection is made by a simple click on the left mouse button. However, this time, when the mouse cursor rolls over objects, they are magnified to mark their targeting by the user. This notion of highlighting and its operating principle are described in Section 3.3.
  • Touch and Highlight: the selection is made by touching the objects. However, this time, the direction of the user’s gaze, given by the orientation of the user’s head, is used to magnify the objects towards which the gaze is directed. This is similar to the approach of PowerLens [27], which magnifies the interaction areas on wall-sized displays to reduce the precision required to select data. For this first work, the direction of the gaze is reduced to the position on the screen pointed at by the central point located between the user’s two eyes. This is retrieved by tracking the position and orientation of the user’s head relative to the screen; a sketch of this computation is given just after this list.
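As a rough illustration of how such an on-screen gaze point can be derived, the C# sketch below intersects a ray starting at the tracked head pose (approximating the point between the eyes) and pointing along the head’s forward direction with the plane of the screen, whose frame is assumed to come from a prior calibration. All names (GazePointEstimator, screenOrigin, the normalized-coordinate convention, the default sizes) are assumptions made for illustration; the authors’ actual computation is not given in the paper.

```csharp
using UnityEngine;

// Hypothetical sketch: estimating the on-screen gaze point from the tracked head pose.
// "headPose" is assumed to come from the tracker worn on the head; the screen is modeled
// as a plane defined during calibration by its bottom-left corner and its right/up axes.
public class GazePointEstimator : MonoBehaviour
{
    public Transform headPose;        // position/orientation of the head tracker (assumption)
    public Vector3 screenOrigin;      // bottom-left corner of the big screen, in world space
    public Vector3 screenRight;       // unit vector along the screen width
    public Vector3 screenUp;          // unit vector along the screen height
    public float screenWidth = 1.2f;  // metres (assumed calibration values)
    public float screenHeight = 0.7f;

    // Returns the gaze point in normalized screen coordinates (0..1),
    // or null if the head is not oriented towards the screen.
    public Vector2? EstimateGazePoint()
    {
        Vector3 normal = Vector3.Cross(screenRight, screenUp);
        Ray gazeRay = new Ray(headPose.position, headPose.forward);

        float denom = Vector3.Dot(gazeRay.direction, normal);
        if (Mathf.Abs(denom) < 1e-5f) return null;          // ray parallel to the screen plane

        float t = Vector3.Dot(screenOrigin - gazeRay.origin, normal) / denom;
        if (t < 0f) return null;                            // screen is behind the user

        Vector3 hit = gazeRay.GetPoint(t) - screenOrigin;
        float u = Vector3.Dot(hit, screenRight) / screenWidth;
        float v = Vector3.Dot(hit, screenUp) / screenHeight;
        return new Vector2(u, v);                           // (0,0) bottom-left, (1,1) top-right
    }
}
```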

3.3. How Highlighting Works and Its Interest

As proposed in Ghost-Hunting [10], we materialize the highlight by the magnification of the targeted objects. Two levels of highlighting are defined:
  • A first level highlights the objects located within a defined perimeter around the mouse or around the point on the screen given by the head orientation (gaze). These objects are magnified to a size s1 > s0, s0 being the initial size of the objects.
  • A second level highlights only one of the previous objects (those located in the perimeter), namely the one on which the mouse or the gaze position is directed. This pointed-at object is magnified more than the others in the highlighting area: it has a size s2 such that s2 > s1. The objective is to facilitate its selection by the user.
For a first prototype with circles as targets, we arbitrarily chose to double the radius of the target for the first level of magnification, which multiplies its surface by 4, and to triple it for the second level, which multiplies its surface by 9. These values were chosen arbitrarily and could, of course, be changed; an additional study would be needed to determine the best magnification values. Figure 1 describes how this highlight works.
Figure 1. Highlight operation mode.
Taking the special case of the red object, when the latter enters the highlighted area, it increases in size (we materialized it by the blue object). Then, when it comes into contact with the mouse (green circle), its size increases again.
The two-level highlighting works as follows:
  • First, it preselects a set of potential objects to be treated. This quickly lets users know whether their mouse or gaze position is in the correct area (the one where the target object is located). This limits the errors the user can make, reduces the distance between these objects and the position of the mouse, and thus saves time. Based on the diagram in Figure 1, the targeted object would have had to cross the distance d1 before coming into contact with the mouse if it had not undergone the highlight effect (red dotted circle). With the highlight effect (circle in solid blue line), however, it only travels the distance d2 to come into contact with the mouse. There is, therefore, a possible gain in distance w = d1 − d2 and, consequently, in the time taken to select the object, since the speed of movement of the object is constant.
  • Then, it preselects the single potential object to be treated, distinguishing it from the others in the area. This increases the selection area and, therefore, makes the task easier for the user (reduced risk of selection error). A code sketch of this two-level update is given after this list.
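The following C# sketch illustrates this two-level update under assumptions not stated in the paper: targets are treated as circles of initial radius r0, the level-1 perimeter is a fixed radius around the gaze (or cursor) point, and the level-2 target is the one lying directly under that point. Class and field names (TwoLevelHighlighter, highlightPerimeter, the default values) are illustrative, not the authors’ implementation.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Minimal sketch of the two-level highlight: targets inside the perimeter are doubled
// in radius (x4 area, level 1); the target under the gaze point is tripled (x9 area, level 2).
public class TwoLevelHighlighter : MonoBehaviour
{
    public List<Transform> targets;          // moving objects on the screen
    public float initialRadius = 0.05f;      // r0, in world units (assumption)
    public float highlightPerimeter = 0.3f;  // radius of the level-1 area (assumption)

    public void UpdateHighlight(Vector3 gazePointOnScreen)
    {
        Transform closest = null;
        float closestDist = float.MaxValue;

        foreach (Transform t in targets)
        {
            float d = Vector3.Distance(t.position, gazePointOnScreen);
            if (d < closestDist) { closestDist = d; closest = t; }

            // Level 1: every target inside the perimeter gets radius 2*r0.
            float radius = (d <= highlightPerimeter) ? 2f * initialRadius : initialRadius;
            t.localScale = Vector3.one * radius * 2f;   // diameter used as uniform scale
        }

        // Level 2: the target directly under the gaze point gets radius 3*r0.
        if (closest != null && closestDist <= initialRadius * 2f)
        {
            closest.localScale = Vector3.one * (3f * initialRadius) * 2f;
        }
    }
}
```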
Having described the system and how it functions, we present its physical structure in the following section.

3.4. Description of the Experimental System

The experimental system is made up of two parts: one that ensures the interaction and one that takes care of the processing. The interaction part defines a user activity area in which the user must stand for everything to work. This area is delimited by cameras that locate the user’s position and track the movements of his head via a tracker target worn on the head. There are also a touch screen and a mouse. The processing part consists of a computer on which an application produced with Unity 3D is deployed. This part is responsible for creating, displaying, and managing the movement of objects on the screen, intercepting and interpreting user gestures (click, touch, head orientation), managing the rendering of information according to the actions taken, etc. Figure 2 shows the block diagram of the experimental system, giving a general view of the different physical components that make up the system and their interactions and interrelations.
Figure 2. Physical architecture of the system.
In the diagram in Figure 2, the system is separated into 2 parts, as mentioned above:
  • the human–machine interface, consisting of a mouse for selection by click, a touch screen for selection by touch, and an HTC Vive virtual reality system for tracking the user’s position, itself composed of 2 Lighthouse base stations, 1 headset, 2 controllers, and 1 live tracker target. The Lighthouse stations enable the 6 DoF tracking of the headset, controllers, and live tracker target. Here, the headset and the controllers are not used during the experiment itself; a controller is only used for the calibration performed beforehand, in order to detect the exact size and position of the big screen by selecting each of its 4 corners. Then, during the experimentation, we use only the live tracker target to track the head and know its position and orientation relative to the big screen. “Lighthouse is a laser-based inside-out positional tracking system developed by Valve for SteamVR and HTC Vive. It accurately tracks the position and orientation of the user’s head-mounted display and controllers in real time. Lighthouse enables the users to move anywhere and re-orient themselves in any position within the range of the SteamVR Base Stations” (https://xinreality.com/wiki/Lighthouse, URL accessed on 8 April 2024). Any other 6 DoF 3D tracking system could have been used for our experiment. A sketch of the corner-based calibration is given after this list.
  • the processing: it is provided by a Unity 3D application.
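A minimal sketch of such a corner-based calibration is shown below, assuming the four corner positions are captured in tracking space with a controller; the struct name, corner order, and derived fields are illustrative assumptions rather than the authors’ implementation.

```csharp
using UnityEngine;

// Hypothetical sketch of the corner-based screen calibration: the 4 corners of the big
// screen are captured in tracking space (e.g. by touching them with a Vive controller
// before the experiment) and used to build the screen frame needed to project the head
// ray onto the screen.
public struct ScreenFrame
{
    public Vector3 origin, right, up;   // origin = bottom-left corner; right/up are unit axes
    public float width, height;         // physical size of the screen, in metres

    public static ScreenFrame FromCorners(Vector3 bottomLeft, Vector3 bottomRight,
                                          Vector3 topLeft, Vector3 topRight)
    {
        ScreenFrame f;
        f.origin = bottomLeft;
        Vector3 widthVec = bottomRight - bottomLeft;
        Vector3 heightVec = topLeft - bottomLeft;
        f.width = widthVec.magnitude;
        f.height = heightVec.magnitude;
        f.right = widthVec.normalized;
        f.up = heightVec.normalized;

        // The fourth corner is redundant; it can serve as a sanity check on the capture.
        float error = Vector3.Distance(topRight, bottomLeft + widthVec + heightVec);
        Debug.Assert(error < 0.02f, "Screen corners are not consistent");
        return f;
    }
}
```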

3.5. Experimental Design and Prototype

To experiment with and evaluate the proposed concepts, we used Unity 3D to design a simulator of drones in motion, using the C# programming language for scripting. Drones are represented by 3D objects that move randomly on a flat, delimited surface. At regular time intervals, the system randomly chooses one of the moving objects as the target object. The user is instructed to select this target object each time.
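The following C# sketch illustrates, under assumed names and parameters, how such a periodic target election could be scripted in Unity; it is not the authors’ code, and the way a correct selection clears the current target is an assumption.

```csharp
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

// Sketch of the target-election loop: every "electionInterval" seconds one drone is
// picked at random, marked yellow, and stays selectable for "selectionWindow" seconds.
public class TargetElector : MonoBehaviour
{
    public List<Renderer> drones;
    public float electionInterval = 2f;   // i in the experiment settings
    public float selectionWindow = 2f;    // s in the experiment settings
    public Color baseColor = Color.blue, targetColor = Color.yellow;

    public Renderer CurrentTarget { get; private set; }

    IEnumerator Start()
    {
        while (true)
        {
            yield return new WaitForSeconds(electionInterval);
            CurrentTarget = drones[Random.Range(0, drones.Count)];
            CurrentTarget.material.color = targetColor;        // mark the elected drone

            yield return new WaitForSeconds(selectionWindow);
            if (CurrentTarget != null)                          // still marked: not selected in time
                CurrentTarget.material.color = baseColor;
            CurrentTarget = null;                               // assumed to be cleared earlier on success
        }
    }
}
```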
Figure 3 shows the main system interface. We see a set of moving drones, among which one is elected (yellow object) for user selection.
Figure 3. The main application interface. The drones are visualized as blue cylinders seen from the top; here, one is elected (yellow object) for user selection.
We have reviewed the various elements implemented to test our proposal. An experiment was carried out to assess the relevance/impact of the proposal thus made. The following section is devoted to it.

4. Experimentation

We conducted an experiment to evaluate, on the one hand, the performance of the effect of the highlight on the touch and, on the other hand, the performance of the combination (touch, highlight) compared to that of the combination (mouse, highlight). This section deals with this experiment. It is structured in two sub-sections: the first one describes the method used to conduct it, and the second one presents and analyzes the results obtained.

4.1. Method

The chosen method consisted of setting up the experimentation space, defining the task to be accomplished and the procedure, defining the evaluation metrics, and selecting the participants.

4.1.1. Context and Task to Be Accomplished

Adding a highlight effect to selection with the mouse makes it easier to select objects. The objective of this experiment is to see whether, as is the case with the mouse, adding the highlight to touch improves the selection of objects by touch, and whether this improvement is similar to what the highlight brings to the mouse. Two measurement criteria are used for this evaluation: the time taken to complete a task and the error rate during the completion of the task. The scenario defined for this experiment is an observation and supervision activity of drones on a mission. We simulate such a system, in which objects representing drones move randomly on a flat surface. At the beginning, all the objects are the same color. At each time step i, defined before the start of the mission, one of the objects is chosen randomly by the system and marked with a color, yellow in this case. We call this object the elected or target object. The user must select the chosen object. This object remains available for selection for a period s (selection time).
Three scenarios arise during the selection:
  • the user selects the target object during this time step s: the object changes its color to green for a duration g (good selection time), then returns to the initial color (color at the start of the mission); this is a correct selection.
  • the user cannot select the target object during the time step s: the object changes its color to red for a period b (time of bad selection), then returns to the initial color; a selection error is recorded.
  • the user selects an unelected object (not the target object): the selected object changes its color to red for a period b, then returns to the initial color; a selection error is also counted.
The total number of objects on the stage, the number of objects to be selected, the selection methods (mouse, touch, etc.), the time between two selections (i), the duration of a selection (s), and the durations for which the colors of good (g) or bad (b) selection are maintained are parameters defined before the beginning of the experiment. A code sketch of this selection-outcome logic is given below.
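A minimal C# sketch of this selection-outcome logic is given below, using the colour feedback durations g and b; the method and field names are assumptions, and the timeout case (second scenario) is assumed to be handled by the election loop flashing the missed target red and incrementing the same error counter.

```csharp
using UnityEngine;

// Sketch of the selection-outcome handling for the scenarios listed above.
public class SelectionJudge : MonoBehaviour
{
    public Color baseColor = Color.blue, goodColor = Color.green, badColor = Color.red;
    public float goodFeedback = 1f;   // g
    public float badFeedback = 1f;    // b
    public int errorCount, successCount;

    public void OnObjectSelected(Renderer selected, Renderer currentTarget)
    {
        if (currentTarget != null && selected == currentTarget)
        {
            successCount++;                                   // correct selection
            StartCoroutine(Flash(selected, goodColor, goodFeedback));
        }
        else
        {
            errorCount++;                                     // wrong object or no active target
            StartCoroutine(Flash(selected, badColor, badFeedback));
        }
    }

    System.Collections.IEnumerator Flash(Renderer r, Color c, float duration)
    {
        r.material.color = c;                                 // colour feedback for g or b seconds
        yield return new WaitForSeconds(duration);
        r.material.color = baseColor;                         // back to the initial colour
    }
}
```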

4.1.2. Population

A total of 28 subjects, including 4 women and 24 men, aged from 18 to 40 (18–25: 6, 25–35: 16, 35–40: 6), took part in this experiment. These participants came from 5 different profiles, distributed as follows: 7 undergraduate students, 9 doctoral students, 7 research engineers, 4 post-doctoral researchers, and 1 teacher. They were recruited via an email sent to the mailing list of our engineering school. Participants did not receive any reward for this work.

4.1.3. Experiment Settings

Several parameters have been defined: the different methods to be used, the number of objects present on the screen during a passage, the total number of objects that will be selected for a passage, the time available to select an object, the time between a selection and the choice of the next target to be selected, etc. For this experiment, we therefore had:
  • 4 methods of selecting objects: mouse, mouse + highlight, touch, touch + highlight (by the orientation of the gaze);
  • 3 different densities (total number of objects on the screen) of objects: 10, 30, and 50 objects;
  • 10 objects to be selected per mission (passage);
  • i: time between two elections of the target object = 2 s;
  • s: hold time of the target object in the “selectable” state by the user = 2 s;
  • g and b: respective durations of the green color after a good selection and of the red color after a selection failure = 1 s.
With these elements defined, a mission is limited in time. An example mission is defined as follows: its modality (m) = touch, the density of objects on the screen (d) = 30, the number of objects to select (f) = 10, i = 2 s, s = 2 s, g = b = 1 s. A mission is successful if, before the end of its duration, the user has selected all f objects; in this example, f is equal to 10.
These durations of 1 s and 2 s were chosen arbitrarily, and a small pilot study with a few colleagues from our research team and from Thales confirmed that they put participants in a high-load but still manageable situation.

4.1.4. Procedure

We established a participant time schedule and a charter that explained the project and the experiment, as well as the rights of the participants. Before starting, each participant was briefed on the purpose of the experiment and the operation of the system, followed by a training phase in the environment. During the training, each participant made 4 passages, one per modality. On each pass, 10 objects moved on the screen, among which 5 had to be selected. This session allowed the participants to familiarize themselves with the environment and to understand the purpose of the experiment. During the evaluation session, each participant used the 4 modalities (M: Mouse, MH: Mouse and Highlight, T: Touch, TH: Touch and Highlight), and for each of them made 3 passages with 3 different densities of objects (10, 30, and 50 objects), for a total of 12 passages per participant. Each time, the participant had to select a total of 10 objects. Questionnaires were also submitted to participants: 1 questionnaire after each modality and 1 final questionnaire on the experiment. For the whole experiment, we thus had 28 × 4 × 3 = 336 passages. The order of use of the interaction modalities changed from one participant to another. We numbered the modalities 1: M, 2: MH, 3: T, and 4: TH, then built a Latin square with these 4 numbers. The first line of this matrix gives the passage order M–MH–T–TH.
Based on this principle, we determined 4 passage orders, as shown in Table 1. Across these four orders, each modality appears once at each of the four positions. This set of orders was repeated every four participants; we did 7 repetitions for the 28 participants. Unfortunately, this only ensured the balanced use of the Touch (with or without Highlight) and Mouse (with and without Highlight) modalities, not of the highlighting itself, and therefore did not totally remove the learning effect of the highlight from the results of the experiment. However, as the orders of the conditions “with highlight” and “without highlight” are the same for mouse and touch, this does not affect our main objective, which is to compare the enhancement of Touch and Mouse using highlight.
Table 1. Illustration of the variation of the orders of passage.
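The paper gives only the first row of the square (M–MH–T–TH) and the property that each modality appears once at each position across the four orders of Table 1. A simple cyclic construction consistent with that description is sketched below in C#; the class and method names are illustrative.

```csharp
// Sketch of a cyclic 4x4 Latin square for counterbalancing the modality order
// (1: M, 2: MH, 3: T, 4: TH). Row r gives the passage order of participant r mod 4;
// each modality appears exactly once at each position across the four rows.
public static class PassageOrders
{
    public static int[][] BuildLatinSquare(int n = 4)
    {
        var square = new int[n][];
        for (int row = 0; row < n; row++)
        {
            square[row] = new int[n];
            for (int col = 0; col < n; col++)
                square[row][col] = ((row + col) % n) + 1;  // cyclic shift: 1-2-3-4, 2-3-4-1, ...
        }
        return square;
    }
}
```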
At each passage, the mission data were saved in a file for later analysis. Among other data, we record the participant’s identifier, the modality used, the density of objects in the environment, the duration of the mission, the state of the mission (success or failure), the total number of selections made, the number of wrong selections, etc.
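A minimal sketch of such per-passage logging is given below; the CSV layout, file handling, and field order are assumptions made for illustration, not the authors’ actual format.

```csharp
using System.Globalization;
using System.IO;

// Hypothetical sketch of per-passage logging: one CSV line per mission with the fields
// listed in the text.
public static class MissionLogger
{
    public static void Append(string path, string participantId, string modality,
                              int density, float durationSeconds, bool success,
                              int totalSelections, int wrongSelections)
    {
        string line = string.Join(",",
            participantId, modality,
            density.ToString(CultureInfo.InvariantCulture),
            durationSeconds.ToString("F2", CultureInfo.InvariantCulture),
            success ? "success" : "failure",
            totalSelections.ToString(CultureInfo.InvariantCulture),
            wrongSelections.ToString(CultureInfo.InvariantCulture));
        File.AppendAllText(path, line + "\n");   // append one record per completed mission
    }
}
```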

4.1.5. Materials and Apparatus

The hardware platform used in this research consisted of:
  • a computer running a Windows 7 Enterprise Edition 64-bit operating system, with an Intel Xeon 3.5 GHz processor, 64 GB of RAM, and an Nvidia Titan XP graphics card;
  • a Sony touch screen (16:9 format);
  • a mouse;
  • a virtual reality system consisting of 1 HTC Vive headset, 2 base stations, 2 controllers, and 1 HTC Vive tracker target.
The touch screen was placed vertically on a table so that participants could use touch while standing for the touch modalities. A table and chair were placed in front of the touch screen when the user had to use the mouse modalities. Figure 4 shows a participant during a mission, looking at the target object (yellow color) to select. The direction of his gaze is given by the blue circle on the screen.
Figure 4. Participant following the object to be selected.
In Figure 5, the user has already selected the object, which has turned green. Objects 1, 2, and 3 are outside the highlighted area defined around the position of the gaze; their size has not changed. Object 4, which is the target already selected (green color) and is located in the highlighting area, has a size greater than that of objects 1, 2, and 3 (first level of highlighting). However, the size of object 4 remains smaller than that of object 5, which now intersects the gaze direction represented by element 6 (second level of highlighting).
Figure 5. Target object already selected.

4.1.6. Assessment Metrics

To assess the four interaction modalities, and in particular, to compare the contribution of highlighting with the mouse and the touchscreen, we defined 2 objective criteria (variables of interest): the time taken to complete a mission and the rate of selection errors made during a mission. The various types of selection errors are defined above in Section 4.1.1. We also offered subjective questionnaires to participants.
Completion Time
Mission completion time is a measure of the effectiveness of a modality: we consider that performing a task quickly with a modality is a sign of a certain ease of use and means that the modality is well suited to the task. For each mission, the application records the time each participant takes to complete it. At the end of a mission, the environment changes color from light gray to dark gray to indicate the end to the participant. The software then ceases to determine target objects, so no further selection is possible.
Selection Error Rate
The rate of errors that occur when selecting objects is also a measure of the effectiveness of the means used for this selection: the fewer errors made, the less difficulty there is in using this means for the task, and therefore the more comfortable the user is. For each mission, the software records each selection or attempted selection made by the participant, distinguishing the good from the bad. More precisely, in the case of the mouse, each click made by the participant during a mission is recorded; at the end of the mission, the recording stops. We define the selection error rate as the number of failed selection attempts divided by the total number of selection attempts:
errorRate = numberOfFailedSelections / numberOfSelections = numberOfFailedSelections / (numberOfFailedSelections + numberOfSucceededSelections)
Subjective Questionnaires
Two types of questionnaires were offered to each participant: a questionnaire at the end of each modality to evaluate this modality and an end-of-experimentation questionnaire to compare the modalities and give a general opinion on the experimentation. At the end of the use of a modality, the participant was asked to fill in a questionnaire with a subjective evaluation according to certain criteria, among which are familiarity with the modality, ease of use, self-confidence during use, and tiredness. At the end of the experiment, the participant had to give his opinion on his preferred modality, the least preferred modality, the effect of the highlight on the traditional modalities, etc. Demographic data were also recorded, detailing age, gender, function, and experience with 3D environments.
At the end of the experiments, the data collected were analyzed. The following section presents some results and findings that emerge.

4.2. Results

Using the data collected from the experiment, we carried out a statistical analysis to assess the impact of the selection modality and of the density of objects on the scene on our variables of interest, namely the duration of the mission and the rate of mistakes made. Our goal is to compare two modalities (“mouse + highlight” vs. “touch + highlight”) in order to see whether the highlight added to the touch allows users to do as well as with a mouse equipped with the same highlight effect. We conducted ANOVAs, with 2 variables of interest (error rate and completion time) and 2 factors (interaction modality and density of objects) that can influence them. For all our analyses, the chosen Type I error risk is α = 0.05, i.e., a confidence level of 95%.

4.2.1. Error Rate

Error Rate by Modality
Figure 6 shows the error rate committed according to the interaction modality used.
Figure 6. Error rate by modality.
The ANOVA carried out shows that the effect of the interaction modality on the rate of errors made during the selection is significant (DF = 3, F = 21.65, p = 7.8 × 10−13). The graph in Figure 6 shows that adding highlighting to the mouse and touch reduces the rate of errors made using these modalities, respectively, for the selection of moving objects. The error rate goes from 27.80% to 10.06% for the mouse and from 38.02% to 24.12% for the touch screen.
The result obtained above gives the general impact of all the conditions on the error rate. To verify that this impact is not the effect of a single modality that dominates all the others, we conducted a Tukey HSD (Tukey multiple comparisons of means) between modalities taken 2 by 2. The results are summarized in Table 2.
Table 2. Detail of the effect of the modalities taken 2 by 2 on the error rate.
As shown in Table 2, by comparing the effects of the interaction methods taken 2 by 2 on the rate of errors made, the impact remains significant. This means that the reductions in the error rate obtained by adding a highlighting effect on the mouse and touch modes, which are observed in the graph in Figure 6, are significant.
Error Rate by Modality and Density
The graph in Figure 7 shows the rate of errors made as a function of the interaction modality and the density of the objects.
Figure 7. Error rate by modality and by density.
Two observations are made: (i) the error rate increases with density; (ii) although the selection error rate increases with the number of objects present on the scene, adding highlighting to the interaction mode makes it possible to considerably reduce this error rate and therefore improve the quality of selection.
ANOVAs were carried out to judge the relevance of the impact of the modality on the selection error rate for each density of objects. All three show that the effect of the interaction modality on the rate of errors made during selection is significant (p < 0.05). Here again, we conducted Tukey HSD tests between modalities taken 2 by 2. Table 3 and Table 4 summarize the results obtained from these analyses.
Table 3. Impact of the modalities on the error rate depending on the density.
Table 4. Detail of the impact of the modalities taken 2 by 2 on the error rate depending on the density.
In Table 3, all p-values are smaller than 0.05. This shows that there is a significant impact of the 2 factors, interaction modality and density of objects, on the error rate committed. A comparison of the different modalities gave the results in Table 4.
For a better understanding, we name the lines by the pairs (density, comparison case). For example, (10, M/MH) denotes the results obtained for the comparison between the modalities M and MH at density 10. Table 4 gives several pieces of information: (i) the line (50, T/M) gives a p-value of 0.58 > 0.05; therefore, at 50 objects on the scene, the difference between the error rate made with the mouse alone and with touch alone is not significant; (ii) all the other differences observed in the graphs in Figure 7 are significant. In particular:
  • the lines (10, M/MH), (30, M/MH), (50, M/MH) all have p-values less than 0.05. So, regardless of the density of objects in the environment, adding a highlight effect on the mouse significantly decreases the error rate of object selections.
  • the lines (10, T/TH), (30, T/TH), (50, T/TH) all have p-values less than 0.05. So, regardless of the density of objects in the environment, the addition of a highlight effect on the touch significantly decreases the error rate of object selections.

4.2.2. Mission Completion Time

We studied the influence of the modality and density factors on the time to complete the mission.
Mission Completion Time by Modality
The graph in Figure 8 shows the variation in the average completion time depending on the modality used.
Figure 8. Average mission duration by modality.
We can see that adding the highlight to a modality reduces the time taken to complete the mission. We obtain a reduction of 3.60%, from 57.44 s to 55.37 s, for the mouse and a reduction of 3.20%, from 62.62 s to 60.63 s, for the touch. The ANOVA carried out shows that the effect of the interaction modality on the mission completion time is significant (DF = 3, F = 95.84, p = 1 × 10−44), so the impact of the interaction modality on the completion time observed in the graph in Figure 8 is significant. The result obtained above shows the general impact of all the modalities on completion time. Here again, we conducted a Tukey HSD between modalities taken 2 by 2. The results are summarized in Table 5.
Table 5. Effect of the modalities taken 2 by 2 on the mission duration.
All the p-values obtained are smaller than 0.05, so we can say that the modality of interaction significantly influences the completion time of a mission, especially when adding highlight to the mouse and touch modes.
Mission Completion Time by Modality and by Density
The graph in Figure 9 shows the average completion times as a function of the modalities and densities.
Figure 9. Average completion time depending on modalities and densities.
We note that although the duration of the mission increases with the density of the objects, adding the highlight to a modality still makes it possible to decrease this duration despite the increase in the number of objects on the scene.
Two further observations are made: (i) the time to complete a mission increases with density; (ii) although this time increases with the number of objects present in the environment, adding highlighting to the interaction modalities reduces the completion time. ANOVAs were carried out to assess the relevance of this impact of the modality according to the density of objects. Table 6 summarizes the results obtained from these analyses.
Table 6. Impact of modalities on the completion time depending on density.
In Table 6, all p-values are smaller than 0.05. This shows that there is a significant impact of the 2 factors, interaction modality and density of objects, on the mission completion time. The results obtained by comparing the impacts of the modalities taken 2 by 2 for each density are summarized in Table 7.
Table 7. Impact of the modalities taken 2 by 2 on the completion time depending on the density.
In Table 7, all the comparison cases carried out give a p-value smaller than 0.05. There is, therefore, a significant interaction between the modality and density factors over the completion time, regardless of the density and modality used.

4.2.3. Subjective Questionnaire

As indicated in the Subjective Questionnaires paragraph of Section 4.1.6, the participants answered a subjective questionnaire at the end of the experiment. The questions it contained made it possible to collect the participants’ opinions (ratings) on various criteria.
The participants first had to give a score between 1 and 5 on the following criteria:
  • Ease of use: Would you like to use this modality for the selection of moving objects?
  • Suitability for selection: is the modality suitable for the selection of moving objects?
  • Complexity: is the selection procedure unnecessarily complicated?
  • Size of the objects: is the size proposed for the objects appropriate for their selection?
  • Density and difficulty: does the increase in the number of objects in the environment make selection more difficult?
  • Self-confidence: what level of confidence did you have when using the modality?
Then, they ended the session by giving their opinion with a score between 1 and 10 on the contribution of the highlight (did the addition of the highlight to the modalities improve/facilitate the selection?). Friedman’s ANOVAs were performed on the questionnaire responses; the observed means and the p-values are presented in Table 8.
Table 8. Mean scores and p-values of the subjective data of the experiment.
According to these data, the participants first found that the highlighted modalities were very easy to use, that they were more suitable for the selection of objects than the traditional modalities, and that they felt more self-confident when using the highlight. They also found that the size of the objects seemed correct for selection and that the density of objects in the environment made selection more difficult.
Second, participants found that adding highlighting to a modality greatly improved (average > 8.5 out of 10) the selection of objects. The p-value for this criterion is 0.5 > 0.05; therefore, the difference observed in the participants’ responses to this question is not significant. In other words, participants felt that the comfort brought by adding highlighting to the mouse and by adding it to the touch for the activity of selecting moving objects is substantially the same in the two cases. Finally, when asked which method they preferred, 16 participants, i.e., 57.14%, chose the mouse with highlight, against 12, i.e., 42.86%, for the touch with highlight.

5. Discussion

In this experiment, adding highlighting to the traditional mouse and touch modalities brings a clear improvement to the quality of pointing and target selection. The error rate made by the participants during the experiment decreased considerably with the addition of highlighting to the touch screen: it fell from 38.02% to 24.12%, as shown in the diagram in Figure 6, a decrease of 36.56%. This improvement in the quality of the selection brought by the highlighting is significant for both the mouse and the touchscreen; indeed, the ANOVA on this criterion gave statistically significant results. By observing the distribution of the error rate over the different densities of objects that were tested, we observe decreases of 34.64% (from 11.98 to 7.83), 40.73% (from 12.94 to 7.67), and 34.12% (from 13.10 to 8.63), respectively, for densities of 10, 30, and 50 objects, as shown in the graph in Figure 7. So, we can say that in our approach, the quality of the selection seems to be resilient to the increase in the density of objects on the screen. However, for the error rate criterion, the improvement brought by our combination (touch, highlight) remains lower than that brought by the combination (mouse, highlight), which is 63.81% (a decrease from 27.80% to 10.06%). This can be explained by the two main remarks made in the subjective questionnaire by almost all the participants: the touch screen did not seem very responsive, and they found its size very large. According to the participants, when using the touch screen, its size did not allow a full view of the screen, as was the case when using the mouse (the table was distant from the screen, giving a wider angle of view). This undoubtedly explains the very high error rate with the touchscreen.
The average task completion time was also improved by adding highlighting to the touch. We determined the time saved by adding highlighting to a modality, according to the density of the objects. Table 9 gives details of the differences observed. The data, expressed in seconds, read as follows: for the cell located in the 2nd row and 2nd column, adding the highlight to the mouse saves 1.73 s for a density of 10 objects.
Table 9. Comparison of improvement in completion time.
As we can see, we went from 57.44 s to 55.37 s (−2.07 s) for the mouse and from 62.62 s to 60.63 s (−1.99 s) for the touch screen. For the completion time criterion, therefore, using the direction of the gaze to add a highlight effect makes it possible to do as well as adding the same highlight effect to the mouse.
Statistical analysis of the data collected through the subjective questionnaire shows that the participants felt more comfortable using a modality with highlighting than one without. Indeed, they found that adding highlighting to a modality made it easier to use and more suitable for selecting objects. For them, the highlight improved the usability of the mouse by 28.28%, with an average score increasing from 3.57/5 to 4.57/5, and that of the touch by 38.39%, with an average score increasing from 3.07/5 to 4.25/5. It also made the methods more suitable for the selection of objects, with an improvement of 23.08% for the mouse and 27.73% for the touchscreen. We see that the improvement rate for these two criteria is higher with the touch screen than with the mouse. Participants also gained confidence in the performance of their task with the addition of highlighting: the average confidence level went from 4/5 to 4.29/5, an improvement of 7.15%, for the mouse, and from 3.36/5 to 4.14/5, an improvement of 23.41%, for touch. In other words, the gain in confidence brought by highlighting is roughly three times larger for the touchscreen than for the mouse. Participants found that highlighting as implemented brought them substantially the same level of comfort with the touch as with the mouse: the scores assigned, namely 8.93/10 and 8.68/10, respectively, for the mouse and the touchscreen, are not significantly different according to the ANOVA, which gave a p-value of 0.57 > 0.05 for this criterion. However, participants preferred the highlight added to the mouse (57.14%) over the highlight added to the touch (42.86%). According to the free comments made by the participants at the end of the experiment, those who preferred mouse + highlight justified this choice, unfavorable to the touch screen, by the quality of the touch screen and its size (the size of the interaction interface). This is not so surprising, because users are more used to mouse interactions [2], so it is almost normal that interfaces using the mouse as the mode of interaction are easier to use than those using other methods.

6. Conclusions and Future Work

In this paper, we have proposed an approach to improve tactile interaction on big screens based on a two-level highlight, which allows for the anticipation of the pointing and target selection. The approximation of gaze direction obtained by the head orientation is used to determine where the attention of the user is focused on his screen. A perimeter is defined around this position, and all the objects found there are highlighted. If one of the objects thus preselected intercepts the gaze direction, it is differentiated from the others by a second level of highlighting. In this way, we anticipate the user’s action. As detailed in Section 5, the results obtained are satisfactory. The comparative analysis carried out shows that the proposed approach improves tactile interaction in the same way as it improves interaction with the mouse.
Currently, the direction of the gaze is given by the orientation of the head. Using the actual position of the user’s gaze on the screen could allow greater precision and, therefore, better results in our approach to improving the performance of pointing and target selection. However, it is quite difficult to keep one’s gaze fixed on one spot without it moving: the human eye is constantly in motion, and even when we decide to fixate a point, it is difficult to stay there. As a result, an object or ray representing the position of the gaze may be unstable on the screen and, therefore, may lack precision in its movements. It would be interesting to compare the performance of a solution using the gaze position with that of one using the gaze direction in order to decide. Thus, we intend to use an eye tracker to obtain the position of the gaze and then compare the two approaches. In addition, depending on the results obtained from this first work, eye movements could be envisaged to facilitate pointing tasks and the selection of objects. Finally, to improve our system, we could also develop a contactless interaction based on user gestures.

Author Contributions

V.M.M. contributed to the software, the validation, the investigation, the data curation, and the writing of the original draft. T.D. contributed to the conceptualization, the methodology, the validation, the formal analysis, the resources, the writing (review & editing), the supervision, and the funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been funded by a grant from the Brittany region, France, and this work was also supported by French government funding managed by the National Research Agency under the Investments for the Future program (PIA) grant ANR-21-ESRE-0030 (CONTINUUM). We also gratefully acknowledge the support of the NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Data Availability Statement

The code of the prototype used for the experiments and the dataset from the experiment can be found at this address: https://gitlab.imt-atlantique.fr/sos-dm/drone-simulator-head-tracking, URL accessed on 8 April 2024.

Acknowledgments

We would like to thank the participants in the experiment, without whom this study would not have been carried out, and the beta testers who assessed the quality of the environment and interactions before the launch of the experiment.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kwon, S.; Kim, C.; Kim, S.; Han, S.H. Two-mode target selection: Considering target layouts in small touch screen devices. Int. J. Ind. Ergon. 2010, 40, 733–745. [Google Scholar] [CrossRef]
  2. Lee, Y.L. Comparison of the conventional point-based and a proposed finger probe-based touch screen interaction techniques in a target selection task. Int. J. Ind. Ergon. 2010, 40, 655–662. [Google Scholar] [CrossRef]
  3. Sears, A.; Shneiderman, B. High precision touchscreens: Design strategies and comparisons with a mouse. Int. J. Man-Mach. Stud. 1991, 34, 593–613. [Google Scholar] [CrossRef]
  4. Grossman, T.; Balakrishnan, R. The bubble cursor: Enhancing target acquisition by dynamic resizing of the cursor’s activation area. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Portland, OR, USA, 2–7 April 2005; pp. 281–290. [Google Scholar]
  5. Baudisch, P.; Cutrell, E.; Czerwinski, M.; Robbins, D.C.; Tandler, P.; Bederson, B.B.; Zierlinger, A. Drag-and-pop and drag-and-pick: Techniques for accessing remote screen content on touch-and pen-operated systems. In Proceedings of the IFIP TC13 International Conference on Human-Computer Interaction, Zurich, Switzerland, 1–5 September 2003; Volume 3, pp. 57–64. [Google Scholar]
  6. Blanch, R.; Guiard, Y.; Beaudouin-Lafon, M. Semantic pointing: Improving target acquisition with control-display ratio adaptation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Vienna, Austria, 24–29 April 2004; pp. 519–526. [Google Scholar]
  7. Cockburn, A.; Firth, A. Improving the acquisition of small targets. In People and Computers XVII—Designing for Society; Springer: London, UK, 2004; pp. 181–196. [Google Scholar]
  8. Guiard, Y.; Blanch, R.; Beaudouin-Lafon, M. Object pointing: A complement to bitmap pointing in GUIs. In Proceedings of the Graphics Interface 2004, Canadian Human-Computer Communications Society, London, ON, Canada, 17–19 May 2004; pp. 9–16. [Google Scholar]
  9. Kabbash, P.; Buxton, W.A. The “prince” technique: Fitts’ law and selection using area cursors. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Denver, CO, USA, 7–11 May 1995; pp. 273–279. [Google Scholar]
  10. Kuwabara, C.; Yamamoto, K.; Kuramoto, I.; Tsujino, Y.; Minakuchi, M. Ghost-hunting: A cursor-based pointing technique with picture guide indication of the shortest path. In Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces Companion, Santa Monica, CA, USA, 19–22 March 2013; pp. 85–86. [Google Scholar]
  11. Worden, A.; Walker, N.; Bharat, K.; Hudson, S. Making computers easier for older adults to use: Area cursors and sticky icons. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, Atlanta, GA, USA, 22–27 March 1997; pp. 266–271. [Google Scholar]
  12. Zhai, S.; Buxton, W.; Milgram, P. The “Silk Cursor”: Investigating transparency for 3D target acquisition. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Boston, MA, USA, 24–28 April 1994; pp. 459–464. [Google Scholar]
  13. Zhai, S.; Conversy, S.; Beaudouin-Lafon, M.; Guiard, Y. Human on-line response to target expansion. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Ft. Lauderdale, FL, USA, 5–10 April 2003; pp. 177–184. [Google Scholar]
  14. Hürst, W.; Van Wezel, C. Multimodal interaction concepts for mobile augmented reality applications. In Proceedings of the International Conference on Multimedia Modeling, Taipei, Taiwan, 5–7 January 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 157–167. [Google Scholar]
  15. Pastoor, S.; Liu, J.; Renault, S. An experimental multimedia system allowing 3-D visualization and eye-controlled interaction without user-worn devices. IEEE Trans. Multimed. 1999, 1, 41–52. [Google Scholar] [CrossRef]
  16. O’Grady, R.; Cohen, C.J.; Beach, G.; Moody, G. NaviGaze: Enabling access to digital media for the profoundly disabled. In Proceedings of the 33rd Applied Imagery Pattern Recognition Workshop (AIPR’04), Washington, DC, USA, 13–15 October 2004; IEEE: Piscataway, NJ, USA, 2004; pp. 211–216. [Google Scholar]
  17. Land, M.; Mennie, N.; Rusted, J. The roles of vision and eye movements in the control of activities of daily living. Perception 1999, 28, 1311–1328. [Google Scholar] [CrossRef]
  18. Yin, J.; Fu, C.; Zhang, X.; Liu, T. Precise Target Selection Techniques in Handheld Augmented Reality Interfaces. IEEE Access 2019, 7, 17663–17674. [Google Scholar] [CrossRef]
  19. Baloup, M.; Pietrzak, T.; Casiez, G. RayCursor: A 3D pointing facilitation technique based on raycasting. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–12. [Google Scholar]
  20. Peng, Y.H.; Lin, M.T.; Chen, Y.; Chen, T.; Ku, P.S.; Taele, P.; Lim, C.G.; Chen, M.Y. PersonalTouch: Improving touchscreen usability by personalizing accessibility settings based on individual user’s touchscreen interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–11. [Google Scholar]
  21. Ahmad, S. A usable real-time 3D hand tracker. In Proceedings of the 1994 28th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 31 October–2 November 1994; IEEE: Piscataway, NJ, USA, 1994; pp. 1257–1261. [Google Scholar]
  22. Kolsch, M.; Bane, R.; Hollerer, T.; Turk, M. Multimodal interaction with a wearable augmented reality system. IEEE Comput. Graph. Appl. 2006, 26, 62–71. [Google Scholar] [CrossRef]
  23. Moustakas, K.; Strintzis, M.G.; Tzovaras, D.; Carbini, S.; Bernier, O.; Viallet, J.E.; Raidt, S.; Mancas, M.; Dimiccoli, M.; Yagci, E.; et al. Masterpiece: Physical interaction and 3D content-based search in VR applications. IEEE Multimed. 2006, 13, 92–100. [Google Scholar] [CrossRef]
  24. Pascoe, J.; Ryan, N.; Morse, D. Using while moving: HCI issues in fieldwork environments. ACM Trans. Comput.-Hum. Interact. (TOCHI) 2000, 7, 417–437. [Google Scholar] [CrossRef]
  25. Hoffmann, E.R.; Sheikh, I.H. Finger width corrections in Fitts’ law: Implications for speed-accuracy research. J. Mot. Behav. 1991, 23, 259–262. [Google Scholar] [CrossRef]
  26. Wang, Q.; Sun, M.; Liu, H.; Pan, Y.; Wang, L.; Ge, L. The applicability of eye-controlled highlighting to the field of visual searching. Aust. J. Psychol. 2018, 70, 294–301. [Google Scholar] [CrossRef]
  27. Rooney, C.; Ruddle, R. Improving Window Manipulation and Content Interaction on High-Resolution, Wall-Sized Displays. Int. J. Hum.-Comput. Interact. 2012, 28, 423–432. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
