Article

Let Me Help You: Improving the Novice Experience of High-Performance Keyboard Layouts with Visual Clues

Department for Smart and Interconnected Living (SAIL), University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(16), 9391; https://doi.org/10.3390/app13169391
Submission received: 20 July 2023 / Revised: 8 August 2023 / Accepted: 14 August 2023 / Published: 18 August 2023
(This article belongs to the Special Issue Novel Approaches for Human Activity Recognition)

Abstract

Since the advent of smartphones, work tasks have progressively shifted towards mobile phones. The main hurdle of this shift is the substantially slower input speed of smartphones compared with physical keyboards. This input speed could be improved by keyboard layouts optimized for use with smartphones. As these keyboard layouts are not commonly used, switching to them results in a poor novice experience with an initially decreased text input speed. Previous investigations confirmed that this poor novice experience can be attenuated by visual clues accentuating the most probable keys. This raises the question of which visual impression leads to the best performance improvement. This article therefore evaluates the performance of six visual clues with different visual impressions in a user study conducted with 28 participants. The results showed that the visual clue with a pop-up animation and the visual clue with an increased font size performed best, with typing performance improvements of 4% and 3%, respectively. Nonetheless, the questionnaire and personal-preference parts of the user study showed that a 4% performance gain is not enough to crown one individual visual clue as the best choice for every user. This leads to the conclusion that the ability to choose according to personal preference is more important than focusing on one specific best clue.

1. Introduction

1.1. Motivation

Nowadays, smartphones play a pivotal role in everyday life. With touchscreen input and various sensors, the smartphone enables humans to interact with computer systems in entirely new ways. Many new interaction techniques surfaced with the smartphone, from simple gestures like swiping to logging into a smartphone with face detection instead of typing a password. Despite all these exciting new interaction possibilities, text input, the most used form of interaction with a smartphone, has stayed mostly the same as on personal computers and notebooks. But why is it important to reevaluate this interaction, and what are common problems with so-called soft keyboards?
A big problem with soft keyboards, also called virtual keyboards, is the lower text entry speed, measured in words per minute (WPM), compared with an actual physical keyboard. The average user of a soft keyboard inputs text at about 36 words per minute, which is about 15 words per minute less than users of a physical keyboard [1]. This means that tasks carried out on a smartphone can take up to 30% longer than on a personal computer or notebook. A soft keyboard also faces several additional challenges, for example its smaller size, the lack of haptic feedback, and occlusion.
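The WPM metric used above can be computed with the conventional text-entry formula, where one "word" is defined as five characters. A minimal sketch (the helper name is ours):

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Conventional text-entry metric: a 'word' is five characters,
    and the first character starts the timer, so it is not counted."""
    return (len(transcribed) - 1) / seconds * 60.0 / 5.0

# 46 characters entered in 15 s correspond to 36 WPM:
print(words_per_minute("x" * 46, 15.0))  # 36.0
```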
Nonetheless, users execute more and more tasks on their mobile devices, and it is therefore important to increase the usability and speed of textual input on a smartphone [2]. Despite the mentioned disadvantages, smartphones will continue to be used due to their unprecedented accessibility and handiness, for quick tasks and increasingly for longer ones. This handiness and portability should not decrease the efficiency of the user; there should not be a trade-off between efficiency and portability.
The soft keyboard is the main bottleneck of text input. While companies like Google try to improve the usability of soft keyboards with touch prediction and language models for auto-correction and auto-completion [3], the standard layout remains Qwerty. Most mobile phones ship with a default Qwerty keyboard layout, which is rarely changed by the user; according to Palin et al. [1], 87% of all smartphone users type with a Qwerty layout. The Qwerty layout is not even optimal for physical keyboards: unlike, for example, the Dvorak layout, it is not optimized with respect to letter frequencies and the words of a corpus [4]. It is even less optimal for soft keyboards on mobile phones [4,5]. There are several optimized keyboard layouts, like Opti or Metropolis, which increase typing speed by decreasing the traveled distance between keys according to Fitts' law [6]. Nonetheless, as already mentioned, most users type with the preconfigured Qwerty layout, which mirrors the non-adoption of high-performance layouts for physical keyboards. The reason for this non-adoption in both cases might be the decreased typing performance resulting from the unfamiliarity of the layout, which increases visual search times and therefore decreases typing speed [6,7]. This so-called novice experience is the entry barrier that keeps users from adopting high-performance layouts: despite the gain after learning the new keyboard, users are not willing to endure a temporarily decreased typing performance and increased effort. This paper tries to diminish this barrier by supporting the visual search for the next key on a new keyboard layout, thereby increasing typing speed until the user is familiar with the layout. This is achieved by comparing different visual clues that improve the visual search by highlighting the next most probable characters on the keyboard.

1.2. Problem Statement and Methodology

Several studies [8,9,10,11] showed an impact of visual clues on input performance using a soft keyboard, but none examined which visual clue performs best. Therefore, this article investigates which visual clue performs best in the domain of novice experience by comparing them in a user study. Additionally, it examines how the prediction performance correlates with the typing performance of a novice user. This leads to the following main research question: “Which visual clue leads to the best typing performance in a novice experience setting, and how dependent is this performance on the prediction performance?”.
To answer these questions, new visual clues were designed after research in the domains of soft keyboards, usability, and human perception. These clues were then implemented in a mobile app prototype. To calculate the character probabilities, a language model was implemented and used in the prototype. This prototype was then used to carry out a user study. Quantitative measures like input speed and error rate are combined with assessment measures like the NASA Task Load Index to determine the best visual clue for improving the novice experience with high-performance keyboard layouts.

1.3. Outline

The rest of the paper is structured as follows:
Section 2 summarizes related works and discusses relevant user studies and findings.
Section 3 presents our approach. We have implemented an Android prototype including a language model with an Opti-layout keyboard for evaluation. Furthermore, the visual clues of interest in our article are described in detail.
Section 4 describes the user study design, the participants, the apparatus, and the general procedure.
Section 5 summarizes the results: (i) input performance in the form of typing speed in corrected words per minute (wpm), (ii) perceived workload of the participants, where we have used the NASA-TLX, and (iii) personal preferences of the participants.
Section 6 discusses and interprets the different clue performances.
Section 7 closes the paper with a conclusion and an outlook on future work.

2. Related Work

In the domain of visual clues, there are four essential works, which are presented below [8,9,10,11].
Magnien et al. [8] focus on the same problem with the same solution proposal as this paper, trying to improve the novice experience of alternative keyboard layouts. Similar to the approach of this paper, they wanted to decrease the scanning time in the novice situation via visual clues. Generally, their goal was to encourage users to use optimized keyboards, which is exactly the goal of our work. The main difference to the approach of this paper is that they just tested whether visual clues improve the input performance, while this approach tries to find out which clue results in the best performance.
They indicate the next probable characters by making the font bold on the affected buttons. Three modes were used to check the performance: the first without visual clues, the second with perfect visual clues, and the third with a 10% error rate. To keep the novice experience constant during the experiment, they mixed the letter assignment of the buttons after every input [8]. Twelve participants had to input 50 words per mode on a Pocket PC. Concerning input speed, they observed an improvement of 37.7% when all predictions were right; when 90% of all predictions were right, the performance increased by 27.5%. They therefore state that visual clues can help in a novice experience setting, but the performance depends on the prediction performance. Their work can be seen as the foundation of this paper, as it establishes that visual clues help in the novice experience situation. We expect to see a difference in input performance depending on the visual clue, assuming that the bold font clue is not the most suitable one. The bold font clue of their experiment also served as a basis for one of the clues compared in this article.
Al Faraq et al. [9] use a different kind of visual clue: they enlarge the keys according to their probability. To calculate the probability of the keys, they use single-letter and digram frequency counts. They expected an improved performance based on two principles: first, improved character selection, an assumption based on Fitts' law; second, reduced visual scanning times, especially in a novice experience context. The experiment was performed on a Sony VAIO UMPC with a stylus, and three keyboard modes were tested: first, no clue was shown; second, the size of the letter with the highest probability was increased; and, third, four keys were adapted. Adapting the size of four keys performed best, with a 25.14% input speed improvement in comparison with no clue at all [9]. The performance increase is about 12 percentage points less than the one documented by Magnien et al. [8]. This can be attributed to the fact that the BigKey implementation of Al Faraq et al. [9] used the Qwerty layout and therefore did not really operate in the domain of novice experience, because some participants might have had Qwerty experience. This marks a difference from the approach of this paper, which focuses on the novice experience with high-performance keyboard layouts. In our approach, a similar visual clue is used, as the size of buttons is also increased according to their probability; the difference is that more overlap is allowed and a higher scaling factor is used.
A part of the proposal of Rodrigues et al. [11] also aimed to improve input speed and was also concerned with visual clues. Their problem domain initially differs from this paper's: they tested the impact of their measures on elderly users. Elderly people can also be seen as being in the domain of novice experience because of their limited experience with keyboard layouts. Similarly to the works above, Rodrigues et al. [11] expected better input performance with visual clues due to the users' unfamiliarity with the Qwerty layout. Additionally, they supposed that a highlighted but not pressed character might serve as additional feedback, which may make users rethink their choice more often. In their approach, the four most probable letters were highlighted. The color effect was continuous, implying a correlation between the probability and the brightness of a character. The approach was tested on 20 participants, of whom twelve stated that they had little or no experience with the keyboard layout; therefore, this can also be seen as an experiment in the domain of novice experience. Rather surprisingly, considering the results of Magnien et al. [8] and Al Faraq et al. [9], the visual clues did not lead to an improvement in input performance or error rate. On the contrary, the input performance became even worse in comparison with the normal Qwerty input rates. They assumed that the participants were distracted by the visual clues and that some participants ignored them. From this result, one may conclude that the novice experience of older and younger users might not be comparable.
Gkoumas et al. [10] explored the impact of key resizing and key highlighting with the Qwerty layout. They addressed the problem of touch accuracy by increasing the size of the most probable characters and the problem of visual search by key highlighting. The most probable keys were highlighted and their size was increased uniformly. Highlighting was performed via semi-transparent white overlays, and the character probabilities were taken from the word completion. While they expected an increased input performance and a decreased error rate, the input performance actually decreased and the error rate stayed the same. Nonetheless, the users reported a positive experience with the keyboard adaptations. The results were similar to the observations of Rodrigues et al. [11]. The reason for this might be that the visual adaptation disturbs the user, which might mean that, in the context of novice experience, the clues need to be removed after some learning time. This could be a follow-up study.
While visual clues appear to disturb experienced users, users with no experience tend to perform better with them. Generally, user feedback on keyboard adaptations is rather positive. All the above experiments focus on a single visual clue; this is the main discrepancy between the other approaches and this paper. The goal of this paper is to find the best visual clue in the domain of novice experience.

3. Approach

To test the hypothesis, an Android prototype was developed. This prototype holds the language model, keyboard with visual clues, and data recording functionality.

3.1. Digression: Language Model

The visual clues display the different probabilities of the characters [12]. To calculate these probabilities, a language model was implemented. The language model is a combination of n-gram models [13,14] and a long short-term memory (LSTM) model [15]. As this work focuses on the usability of the visual clues, the implementation is not discussed in detail. To keep the focus on the usability of the visual clues, prediction errors are minimized and the language model is overfitted to the testing sentences. The language model takes 18.49 ms on average to predict the probabilities of the next character, which fits the constraint of an average prediction time below 20 ms in the context of user experience [16]. The prediction performance is calculated across 35 sentences, comprising 322 words and 1623 characters; these 35 sentences are also used in the user study. The prediction performance is 55% across all characters. The prediction index, i.e., the ranking of the expected character in the probability list, is 1.05 on average. This means that with known sentences the model performs well, as expected. Concerning the first places, the expected character is the highest-predicted character 95.5% of the time. This makes the paper comparable with the experiment carried out by Magnien et al. [8], which tested prediction performances of 100% and 90%, but with multiple noise options. In this work, no artificial noise is used, and the errors should feel more natural to the user because they come from a language model that only proposes real words, rather than from random errors as in their approach. We observed that the model has problems with short words and the beginnings of words, because it may not be able to abstract the context in these situations; with more input characters of a word, the performance improves. These problems are negligible; overall, the language model performs sufficiently well.
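The two reported statistics, top-rank accuracy and the mean prediction index, can be derived from the per-character rankings alone. A minimal sketch (the function name and sample data are ours, purely illustrative, not the study's measurements):

```python
def prediction_stats(rankings: list) -> tuple:
    """rankings[i] is the 1-based position of the character the user
    actually typed in the model's sorted probability list."""
    top1 = sum(1 for r in rankings if r == 1) / len(rankings)
    mean_index = sum(rankings) / len(rankings)
    return top1, mean_index

# Illustrative data: 19 of 20 characters ranked first, one ranked second.
print(prediction_stats([1] * 19 + [2]))  # (0.95, 1.05)
```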
Generally, across all sentences, the first characters of a word are predicted correctly with lower probability, so the clues become better towards the end of a word. The language model of this prototype is not as strong as other approaches; using sentences from the training data improves its performance substantially compared with live systems. This was done so that the focus of the results lies more on the visual clues themselves: the high prediction performance should make the visual clues and their ability to reduce visual search time the decisive factor in the results.

3.2. Keyboard

As already mentioned, there are several keyboard layouts optimized for mobile usage. The Opti II [17,18] layout, shown in Figure 1a, was chosen for this experiment because of its optimization towards Fitts' law [19], high performance, easy implementation, and the possibility of using it with one or two hands, compared with other layouts like Fitaly [20], Metropolis [21], or Kalq [5]. Figure 1 shows the actual implementation of the keyboard in the prototype. The Opti layout is adapted to hold a deletion key. The keyboard has a width of 1080 pixels and a height of 900 pixels. A single button is a square with rounded edges and a side length of 180 pixels. On the user study phone, a OnePlus 5t, this results in a width of 68.42 mm, a height of 57.02 mm, and a single button size of 11.4 mm. The keyboard is bigger than conventional soft keyboards; this measure should diminish the effect of the fat finger problem in the results. As the comparison of the layouts is conducted inside the app and no results from other experiments are used for comparison, the bigger size is not problematic with respect to the results and findings. The implementation offers an offset callout functionality, which prevents occlusion and gives visual feedback about the touch. Additionally, the keyboard forgives small mistouches to counter the fat finger problem without having to use a spatial model.

3.3. Visual Clues

This paper centers on finding the best visual clue for performance improvement when handling new keyboard layouts. Therefore, from an implementation standpoint, this is the most important part of the prototype. We have chosen six different visual clues for implementation. This choice might not cover the full spectrum of possible visual impressions; nevertheless, we have focused on these six visual clues because they are sufficiently different. All clues are based on the Opti layout. The visual clues try to achieve maximum pop-out effects, which should reduce the increased visual search time of unfamiliar keyboard layouts. The visual clues are split into two groups: the prototype holds three static clues and three dynamic clues. Static clues do not have any animation effects; the affected property of the clue is distributed across all buttons according to their probability. So, if the affected property is the color opacity, the color opacity is changed for every button according to its probability. This is carried out once after every input, and the property stays static after that change. Dynamic clues have an animated property: they express the probability by changing a property like color over time and additionally by changing the frequency of the animation. This means that the period of the animation is shorter for buttons with a higher probability. The period (T) of a single animation clue, given the probability (P), is calculated with Equation (1).
T = 2100 ms − P · 2000 ms
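As a small sketch, the period of a dynamic clue can be computed from the probability. The linear form below is one plausible reading of Equation (1), consistent with the text's statement that more probable keys animate with a shorter period (the function name and exact endpoints are our assumption):

```python
def animation_period_ms(p: float) -> float:
    """Animation period of a dynamic clue: keys with a higher
    probability animate with a shorter period, i.e., faster."""
    assert 0.0 <= p <= 1.0
    return 2100.0 - p * 2000.0

print(animation_period_ms(0.0))  # 2100.0 ms (least probable key)
print(animation_period_ms(1.0))  # 100.0 ms (most probable key)
```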
The following subsections introduce the different visual clues. To make the clues comparable, the two images per visual clue, as in Figure 2a,b, always show the same probability distributions. Figures denoted with (a) always represent an iteration where the most probable key is the “o” key with a probability of 36.82%, followed by the key “t” with a probability of 11.17% and the key “i” with a probability of 9.25%. Figures denoted with (b) always represent an iteration where the most probable key is “r” with 47.52%, followed by the key “n” with 18.93% and the key “f” with 8.89%.

3.3.1. Size Clue (SC)

The size clue is a static clue that adapts the size of the buttons according to the probability. A normal button has a side length of 180 pixels. A button with 0% probability also has a size of 180 pixels. A button with 100% probability has a size of 324 pixels. Therefore the size of a button is calculated using Equation (2).
size = 180 px · (1 + 0.8 · P)
Additionally, a more probable key is rendered in front of a less probable key. Figure 2 shows the two probability distributions: the most probable keys are bigger and in the foreground. The buttons grow evenly in all directions. The keyboard area does not move, so buttons at the bottom, left, and right edges are cut off at the border of the keyboard area.

3.3.2. Color Clue (CC)

The color clue is also a static clue. It presents the probability of the next character via the color property of the button; more precisely, the probability correlates with the alpha property of the color. The alpha property indicates how visible a certain color is, i.e., its opacity. Equation (3) shows how the alpha value of a single button is calculated. The maximum value of the alpha property is 255, which means the color is fully opaque.
alpha = 255 · P
In contrast to the size clue, the color clue just uses the full alpha range and distributes it according to the probability. Red was chosen to display the probability. Figure 3 shows the implementation of the color clue of the Android prototype.

3.3.3. Font Clue (FC)

Similar to Magnien et al. [8], the prototype of this paper also tests changing the font size as a visual clue. In the prototype, the font size is adapted continuously according to the probability, which differs from the implementation by Magnien et al. [8]. The font clue is also a static clue, so there is not any sort of animation. The probability of a button is indicated via the font-size property. The font size is calculated as shown in Formula (4).
fontSize = 22 + 48 · P
The reason for these numbers is that the minimum font size of 22 is still readable. The maximum font size of 70 is small enough to fit inside a button without major misalignment. Therefore, the resulting range of 48 is evenly distributed across the whole probability range. Figure 4 shows the font clue implementation of the prototype.
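All three static clues map the probability linearly onto a single visual property, following Equations (2)–(4). A compact sketch of these mappings (function names are ours):

```python
def button_size_px(p: float) -> float:
    # Size clue: 180 px at P = 0, up to 324 px at P = 1 (Equation (2)).
    return 180 * (1 + 0.8 * p)

def overlay_alpha(p: float) -> float:
    # Color clue: alpha of the red overlay, from 0 up to 255 (Equation (3)).
    return 255 * p

def font_size(p: float) -> float:
    # Font clue: from a still-readable 22 up to 70 (Equation (4)).
    return 22 + 48 * p
```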

3.3.4. Size Animation Clue (SAC)

The size animation clue is a dynamic clue. It works similarly to the normal size clue and presents the probability via the size property of the button. This size property is animated. This means that the size of the button changes between the minimum size of a button, 180 pixels, and the maximum size of the button, depending on the probability. Equation (5) shows how the size of a button is calculated. It shows that the size is dependent on the probability and the triangle function of the animation.
size = 180 px · (1 + 0.8 · P · △(t))
Furthermore, the period of the animation is inversely proportional to the probability. This means that the higher the probability, the lower the period of the animation. Buttons with a higher probability move faster than buttons with a lower probability. Figure 5 shows the size animation clue implementation of the prototype. Figure 6 shows some intermediate steps of one half of an animation sequence. The size increases continuously until it reaches the maximum value. Then, the size decreases again in a similar fashion.
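The dynamic clues modulate the static property with a periodic triangle function whose value lies in [0, 1] and whose period shrinks with the probability. A sketch under that assumption (the triangle-wave helper and names are ours, not the prototype's Android code):

```python
def triangle(t_ms: float, period_ms: float) -> float:
    """Triangle wave in [0, 1]: rises for half a period, then falls back."""
    phase = (t_ms % period_ms) / period_ms
    return 2 * phase if phase < 0.5 else 2 * (1 - phase)

def animated_size_px(p: float, t_ms: float, period_ms: float) -> float:
    # Size animation clue: oscillates between 180 px and 180 * (1 + 0.8 * P) px.
    return 180 * (1 + 0.8 * p * triangle(t_ms, period_ms))

print(animated_size_px(1.0, 0.0, 1000.0))    # 180.0 (start of the period)
print(animated_size_px(1.0, 500.0, 1000.0))  # 324.0 (peak of the wave)
```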

3.3.5. Color Animation Clue (CAC)

The color animation clue is also a dynamic clue. It is based on the normal color clue and transforms the probability to the color property of the button. Similarly, it animates the alpha value of a red color layer of the button. The alpha value animation starts with zero, rises up to the maximum value, which is dependent on the probability of the character, and then goes back down to zero. This is one period of the animation. Equation (6) describes how the alpha value is computed for a discrete value of time. The triangle function describes the animation in Android.
alpha = 255 · P · △(t)
Figure 7 shows the color animation clue implementation of the prototype and Figure 8 shows single subfigures of the intermediate steps of the animation. The animation starts with alpha being zero and ends with the maximum value depending on the probability. This is the first half of the period. The figure does not show the second half when the alpha is decreasing again.

3.3.6. Wiggle Animation Clue (WAC)

The last dynamic clue is the wiggle animation clue and, differently from the size animation clue and the color animation clue, there is no static clue similar to the wiggle animation clue, as it was discarded after testing. The wiggle animation clue adapts the maximum rotation angle of a rotation animation according to the probability. The overall maximum rotation angle is 45 degrees. Equation (7) shows the formula for calculating the rotation angle. A period of the rotation angle starts at −45 degrees, rises up to 45 degrees, and then returns to −45 degrees. Therefore, the triangle function of the animation is different and adjusted to have a negative part.
rotation = 45° · P · (△(t) · 2 − 1)
Figure 9 shows the implementation of the wiggle animation clue. Figure 10 shows single substeps of the animation in the subfigures. The animation starts from zero and goes up to the maximum rotation. This is just a quarter of the whole animation, as the animation goes back to zero, then goes to the negative maximum rotation, and then finishes the period by going back to zero again.
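The wiggle clue differs from the other animated clues in that its triangle wave is remapped to a signed range. Under the reading of Equation (7) sketched here (our interpretation of the description above), the wave in [0, 1] is stretched to [−1, 1] before scaling by 45 degrees and the probability:

```python
def rotation_deg(p: float, tri: float) -> float:
    """Wiggle animation clue: tri is the triangle wave value in [0, 1];
    the result swings between -45 * P and +45 * P degrees."""
    return 45 * p * (tri * 2 - 1)

print(rotation_deg(1.0, 0.0))  # -45.0 (start of the period)
print(rotation_deg(1.0, 0.5))  # 0.0 (upright)
print(rotation_deg(1.0, 1.0))  # 45.0 (maximum rotation)
```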

4. Method

To evaluate the performance of the different visual clues, a user study was carried out. This user study should produce representative results concerning the input performance and, more generally, the usability of the different clues. The user study consists of an experimental task, where the participants have to use the different visual clues, a NASA Task Load Index (NASA-TLX) questionnaire for every clue, and a questionnaire concerning a personal ranking of the different visual clues. All three measurements are taken into account to determine the usability of the individual visual clues.

4.1. User Study Design

The experiment is a 7 × 1 factorial design. A within-subjects design was chosen to reduce the number of participants significantly. Every participant has to use every clue (the independent variable of the experiment) to input four sentences, each with a character count between 35 and 50, per clue. The sentences are taken from the training data set of the language model. The four sentences are not the same across the seven experiment sessions, so 28 different sentences are used in the experiment. To counterbalance learning effects, a Latin square experiment design is used.
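The counterbalancing can be sketched with a balanced (Williams-style) Latin square. For an odd number of conditions such as seven, each generated row plus its reverse is needed for first-order balance, which is why the participant count must be a multiple of 14. A sketch under that standard construction (not the study's actual code):

```python
def balanced_latin_square(n: int) -> list:
    """Williams design: each row is a condition order for one participant.
    For odd n, the reversed rows are appended, giving 2n orders in total."""
    rows = []
    for i in range(n):
        # Column offsets follow the pattern 0, +1, -1, +2, -2, ...
        row = [(i + (j // 2 + 1 if j % 2 else -(j // 2))) % n for j in range(n)]
        rows.append(row)
    if n % 2:
        rows += [list(reversed(r)) for r in rows]
    return rows

orders = balanced_latin_square(7)
print(len(orders))  # 14 distinct condition orders, hence 28 participants
```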

4.2. Participants

The user study was carried out with 28 participants, because the Latin square demands a number divisible by 14. Of the 28 participants, 11 (39%) were female and 17 (61%) were male. The average age was 33.86 (SD: 15.85); seven (25%) of the participants were older than 50. The participants assessed their experience and familiarity with computers and mobile phones on a scale between 0 and 10. The average experience concerning computers was 6.75 (SD: 2.37), with one participant scoring zero; concerning mobile phones, the average score was 6.79 (SD: 2.717), with two participants scoring zero. Seventeen (61%) participants held their phone in both hands while typing with both thumbs, two (7%) held the phone in both hands and typed with one thumb, three (11%) had the phone lying on the table and tapped with both fingers, and six (21%) had the phone lying on the table and tapped with one finger. Regarding the number of participants in our study, we followed related work, where, e.g., 12 or 20 participants were involved. Nevertheless, we are aware that more participants would be beneficial in order to generalize the findings and avoid potential biases. Having thoroughly selected our participants, to the best of our knowledge we do not see any bias in our data and results.

4.3. Apparatus

The experiment was carried out with the user study app running on a OnePlus 5t. The smartphone ran a Qualcomm® Snapdragon™ 835 octa-core with up to 2.45 GHz and 8 GB of RAM. The screen had a resolution of 2160 × 1080 pixels and dimensions of 68.42 mm × 136.85 mm, which resulted in a diagonal screen size of 153 mm [22]. The keyboard had a width of 68.42 mm and a height of 57.02 mm, and the side of a single button measured 11.4 mm.

4.4. Procedure

The experiment was carried out in a quiet place for all participants. The participant sat at a table while carrying out the experiment. It was not the same place for everyone, but always a location familiar to the participant. Due to the COVID-19 pandemic, the instructor wore a mask, while the participants were free to choose how to handle it. Every participant was told why the Opti layout is better for typing on smartphones with an English corpus and what the research topic concerning the best visual clue was. Before the individual sessions of the experiment started, participants were told that their input performance would be measured and that they could expect four different sentences for every clue. Then, the participant was shown the Opti layout and was able to interact with it freely for one minute. Additionally, it was explained how to request the next sentence in the user study app and that timekeeping started with the input of the first character. Before every clue, the participant was told which clue to expect and how that clue displays the probability. Then, the participant started the individual experiment for that clue and afterwards filled out the NASA Task Load Index questionnaire for it. This was repeated until all independent variables were tested. Finally, the participant ranked all clues from most to least favorite.

5. Results and Evaluation

5.1. Input Performance

The user study revealed that the size animation clue performed best, as shown in Table 1 and Figure 11. In terms of typing speed in corrected words per minute (WPM), participants using the size animation clue were 1.07% faster than with the second-best clue, the font clue, and 4.16% faster than with the third-best clue, the size clue. Overall, the input-performance ranking was the size animation clue, followed by the font clue, size clue, color clue, color animation clue, and wiggle animation clue. Figure 11a visualizes the input-speed results in words per minute. The differences between the individual visual clues turned out to be smaller than expected. The results also confirm the work of Magnien et al. [8]: the overall typing-speed improvement with visual clues compared to no visual clues was around 30%, which is in line with their observations. The size animation clue led to a faster typing speed than the size clue, while the color animation clue led to a lower typing speed than the color clue. This suggests that an animation feature does not automatically improve visual search times. Concerning errors, the results were similar to the typing speed, as shown in Figure 11b.
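For readers unfamiliar with these metrics, the following sketch shows how corrected words per minute and an uncorrected error rate are commonly computed (one word is defined as five characters). This is an illustrative implementation of the standard definitions, not the study's own code; the sentence and timing values are hypothetical.

```python
# Illustrative sketch of common text-entry metrics (words per minute and
# error rate). Follows the usual convention that 1 word = 5 characters.
# All values below are hypothetical, not the study's raw data.

def words_per_minute(transcribed: str, seconds: float) -> float:
    """Corrected WPM: (|T| - 1) characters over elapsed time, 5 chars = 1 word."""
    return (len(transcribed) - 1) / seconds * 60.0 / 5.0

def error_rate(uncorrected_errors: int, transcribed_len: int) -> float:
    """Uncorrected errors as a percentage of transcribed characters."""
    return 100.0 * uncorrected_errors / transcribed_len

if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog"  # 43 characters
    print(round(words_per_minute(sentence, seconds=39.0), 2))  # 12.92
    print(round(error_rate(1, len(sentence)), 2))              # 2.33
```

With these toy numbers, the resulting ~13 WPM is in the same range as the averages reported in Table 1.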
To assess the significance of the typing-speed differences, a paired t-test was carried out in a matrix design, comparing every visual clue with every other one. The null hypothesis was that there is no difference between the sample means. The results are shown in Table 2. In most cases, the null hypothesis cannot be rejected: the differences between the size clue, color clue, font clue, size animation clue, and color animation clue are not significant and may have occurred by chance. What the results do show is that the wiggle animation clue yields a lower typing speed than the other clues; here, the null hypothesis can be rejected in comparison with every other clue except the color clue, so these differences are unlikely to be due to chance. The same holds when comparing no clue to any other clue, where the differences are also significant. Concerning input performance, the size animation clue and the font clue performed best in the user study, but their lead over the next-best clues is rather small, at 4% and 3%, respectively, over the third-best clue. The size animation clue also produced the fewest errors. However, since the t-test does not reject the null hypothesis for these comparisons, these differences may still be due to chance. Overall, the differences are smaller than expected.
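The pairwise comparison behind Table 2 can be sketched as follows. This is not the study's analysis code: it computes the paired t statistic for every pair of clues from per-participant WPM samples, which are invented here for illustration, and compares its magnitude against the two-sided critical value for the toy sample size.

```python
# Sketch of the pairwise comparison matrix from Table 2: a paired t-test for
# every pair of clues over the same participants. The per-participant WPM
# values below are illustrative, not the study's raw data.
import math

def paired_t(a, b):
    """Return the paired t statistic for two equally long samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

clues = {
    "size":      [12.1, 13.0, 12.4, 12.9, 12.6, 12.5],
    "size_anim": [12.9, 13.4, 12.8, 13.3, 13.2, 13.0],
    "wiggle":    [11.2, 11.9, 11.1, 11.8, 11.4, 11.4],
}

# Two-sided critical value t(0.975, df = 5) ~= 2.571 for these toy samples;
# in the study itself, df would be n_participants - 1.
CRITICAL = 2.571
for a in clues:
    for b in clues:
        if a < b:
            t = paired_t(clues[a], clues[b])
            verdict = "significant" if abs(t) > CRITICAL else "not significant"
            print(f"{a} vs {b}: t = {t:+.2f} ({verdict})")
```

A library routine such as `scipy.stats.ttest_rel` would additionally return the exact p-values shown in Table 2; the pure-Python version above only reports the t statistic against a fixed critical value.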

5.2. Perceived Workload of the Participants

The NASA Task Load Index showed that static visual clues are generally perceived more positively than dynamic ones, and all clues were perceived better than no clue at all. Table 3 shows the individual results for all clues and metrics. The font clue was perceived best overall, followed by the color clue and the size clue; it was rated best in the metrics of mental demand, physical demand, effort, and frustration. The participants felt that they achieved their best performance with the color clue, followed by the size animation clue.
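The "Mean" row of Table 3 is the unweighted average of the six subscale ratings (a raw TLX score). As a minimal sketch, the following reproduces the font clue's mean from its subscale values:

```python
# Raw (unweighted) NASA-TLX score: the mean of the six subscale ratings,
# as in the "Mean" row of Table 3. The values are the font clue's column.
font_clue = {
    "mental demand":   7.59,
    "physical demand": 4.33,
    "temporal demand": 7.48,
    "performance":     7.40,
    "effort":          7.63,
    "frustration":     4.63,
}
raw_tlx = sum(font_clue.values()) / len(font_clue)
print(round(raw_tlx, 2))  # 6.51, matching the table
```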

5.3. Personal Preferences

The static clues were on average ranked better than the dynamic clues, as shown in Table 4. The participants preferred the color clue the most, followed by the size clue and the font clue. The size animation clue was the best-ranked animated clue, followed by the color animation clue and the wiggle animation clue. Interestingly, despite having the best input speed, the size animation clue is nearly a whole rank behind the color clue in terms of personal preference. Conversely, the color clue and the size clue were preferred by the participants but did not stand out in typing performance. In general, the animated clues were less appealing to the users than the static clues, which may indicate that the animations overwhelmed some participants. The size animation clue thus performed very well in the experiment but poorly in the questionnaires, displaying a trade-off between performance and usability. The wiggle animation clue, the worst performer in the experiment, also received the worst average ranking.
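The mean rankings of Table 4 are obtained by averaging each clue's rank position over all participants. A minimal sketch with three invented rankings (the study had 28):

```python
# Sketch of how the mean ranking in Table 4 is obtained: each participant
# orders the six clues from favorite (rank 1) to least favorite (rank 6),
# and the ranks are averaged per clue. The three rankings are illustrative.
rankings = [
    ["CC", "SC", "FC", "SAC", "CAC", "WAC"],
    ["SC", "CC", "SAC", "FC", "CAC", "WAC"],
    ["CC", "FC", "SC", "SAC", "WAC", "CAC"],
]
mean_rank = {
    clue: sum(r.index(clue) + 1 for r in rankings) / len(rankings)
    for clue in rankings[0]
}
for clue, rank in sorted(mean_rank.items(), key=lambda kv: kv[1]):
    print(f"{clue}: {rank:.2f}")
```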

5.4. Answering the Research Question

Given all these results, naming the best visual clue is rather hard. While the size animation clue performed best in terms of input performance, the static clues showed better usability and were preferred by the participants. Additionally, the typing-speed gain of 4% over the static clues is rather small. Viewed purely from a performance perspective, the best clue is the size animation clue, ahead of the font clue. Combining all results, however, shows that this is not the answer this paper was looking for: these may be the best-performing clues in terms of typing speed, but user experience is also a performance metric and similarly important. Therefore, the answer to the question “Which visual clue leads to the best typing performance in a novice experience setting?” is the clue the user prefers in this setting, as long as it is not the wiggle animation clue. The differences are too small to prescribe a single visual clue for all people, so letting users pick the clue they like most is the best choice.

6. Discussion

6.1. Interpreting the Different Clue Performances

The size animation clue performed best. The reason might be that the growing movement of the button works well in peripheral vision, leading to faster discrimination of the feature; this movement is also well known from various pop-up ads. We attribute the performance of the font clue to the fact that the character was more readable and that the feature catching the attention is the same feature used for deciding whether it is really the right button. With the color clue, for example, the participant first looks for the color and then reads the character, whereas with the font clue, the participant looks for the bigger character and may already read it at the same time. The feedback from the participants and the questionnaire suggest that the color animation clue stressed the participants too much, resulting in poorer performance, which may be attributed to the psychological effects of the color red.

6.2. Do Visual Clues Hinder the Learning of the Keyboard Layout?

While observing the user study, it became apparent that some users rely heavily on the visual clues, which became clear when they faced the Opti layout without them. It would therefore be interesting to see whether this reliance hinders learning the layout; if so, the user would never become an expert typist, reducing the positive effect of the Opti layout. Figure 12 shows the learning progression in the experiment of this paper, plotting the typing-speed averages across the different sessions. Due to the study design, a single point for a single session of a single clue (or no clue) corresponds to only four participants, so the figure can only indicate possible trends. Still, it illustrates the initial problem: the blue line, indicating no clues, does not rise steadily as would be expected with a learning effect. Session five, in particular, is worse, even though the participants had already interacted with the new layout four times. With the visual clues, this is different: the changes between sessions are much smaller and do not decrease as sharply. The dark gray baseline, the average across all clues, increases steadily, as expected with a learning effect. This could indicate that some participants do not really learn the keyboard layout, which would make the use of visual clues counterproductive. It may also mean that the prediction model should not be perfect: with a more erroneous prediction, users might learn the layout more efficiently and rely less on the visual clues. The prediction performance therefore has to lie within a small bandwidth.

7. Conclusions and Outlook

7.1. Conclusions

The results of the experiment itself were clear: the size animation clue and the font clue were the best clues in the domains of typing speed and error rate, being 4% and 3% faster, respectively, than the other clues, and only the wiggle animation clue performed significantly worse. Nonetheless, the overall result is that a 4% or 3% typing-speed improvement might not outweigh individual usability preferences. There was no clue unanimously preferred by all participants; only the wiggle animation clue was again ranked last. Therefore, the conclusion is that a solution offering the possibility to choose the individually preferred clue is important. There is no single visual clue that helps all users achieve the best performance: individual differences in the perception of the visual clues are immense, making personalization important. Nonetheless, this paper confirms the results of Magnien et al. [8] that visual clues significantly improve performance in a novice experience situation. In a further effort, it is important to check whether visual clues hinder the learning of the keyboard layout, as there are hints that this might be the case, which would render the visual clues useless. The findings of this paper are not restricted to the novice experience with new soft-keyboard layouts or to soft keyboards in general; they apply to the whole domain of user experience, helping to decide which feature to use when guiding a user with on-screen clues. This makes the results broadly applicable, and other researchers may profit from this work. Of course, the number of clues and participants was rather limited, so it would be interesting to research this domain further, focusing not on soft keyboards but on general human–computer interaction with clues.

7.2. Outlook

As it was rather hard to determine the right color for the color clue and the color animation clue, it would be interesting to conduct a study investigating which color performs best when the goal is to make a user press something; this should be combined with animations to see which colors overwhelm the least while performing well. Given the good performance of the font clue, it would also be interesting to see whether its performance comes only from improved readability; enlarging all characters and repeating the experiment could test this. Concerning the use of visual clues in this domain, a longitudinal study on how visual clues affect the learning of new layouts is important. As already mentioned, visual clues may hinder the learning process, which would make them unnecessary in the real world. Such a longitudinal study could also determine when users should deactivate the visual clues, investigating the long-term effects; knowing this point matters because, as the user study shows, wrong visual clues slow down expert typists. Of course, this work can always be expanded with new visual clues, under the premise of using the same prediction component and the same corpora with the same input sentences. This would make further clues comparable, with the caveat of a different population, so a single clue from this experiment should be incorporated into any other approach.

7.3. Closing Remarks

This paper gave valuable insights into how people react to certain visual impressions, showing that individual preferences may be more important than any generic overall best clue. It is also clear that most people will stay with the standard keyboard layout without visual clues, while alternative keyboard layouts will mostly remain a topic for tech-savvy people. Nonetheless, the paper showed once more that visual clues can help people by lowering the entry barrier to certain technologies. Enabling people to achieve better performance is what much research should aim for.

Author Contributions

Conceptualization, D.G. and M.K.; methodology, D.G.; software, D.G.; validation, D.G., M.K. and E.S.; formal analysis, D.G.; investigation, D.G.; resources, D.G.; data curation, D.G.; writing—original draft preparation, D.G.; writing—review and editing, M.K. and E.S.; visualization, D.G.; supervision, M.K.; project administration, M.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The experiment and user study data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Palin, K.; Feit, A.M.; Kim, S.; Kristensson, P.O.; Oulasvirta, A. How do people type on mobile devices? Observations from a study with 37,000 volunteers. In Proceedings of the 21st International Conference on Human-Computer Interaction with Mobile Devices and Services, Taipei, Taiwan, 1–4 October 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 1–12. [Google Scholar]
  2. Li, Y.; You, F.; Ji, M.; You, X. The Influence of Smartphone Text Input Method, Posture, and Environment on User Experience. Int. J. Hum.-Comput. Interact. 2020, 36, 1110–1121. [Google Scholar] [CrossRef]
  3. Beaufays, F.; Riley, M. The Machine Intelligence Behind Gboard, Google Research Blog Article. Available online: https://ai.googleblog.com/2017/05/the-machine-intelligence-behind-gboard.html (accessed on 1 May 2017).
  4. Walker, C.P. Evolving a more optimal keyboard. In Course Project: Introduction to Evolutionary Computation; Missouri University of Science & Technology: Rolla, MO, USA, 2003. [Google Scholar]
  5. Oulasvirta, A.; Reichel, A.; Li, W.; Zhang, Y.; Bachynskyi, M.; Vertanen, K.; Kristensson, P.O. Improving two-thumb text entry on touchscreen devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Paris, France, 27 April–2 May 2013; pp. 2765–2774. [Google Scholar]
  6. Sears, A.; Jacko, J.A.; Chu, J.; Moro, F. The role of visual search in the design of effective soft keyboards. Behav. Inf. Technol. 2001, 20, 159–166. [Google Scholar] [CrossRef]
  7. MacKenzie, I.S.; Zhang, S.X. An empirical investigation of the novice experience with soft keyboards. Behav. Inf. Technol. 2001, 20, 411–418. [Google Scholar] [CrossRef]
  8. Magnien, L.; Bouraoui, J.L.; Vigouroux, N. Mobile text input with soft keyboards: Optimization by means of visual clues. In Proceedings of the International Conference on Mobile Human-Computer Interaction, Glasgow, UK, 13–16 September 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 337–341. [Google Scholar]
  9. Al Faraj, K.; Mojahid, M.; Vigouroux, N. Bigkey: A virtual keyboard for mobile devices. In Proceedings of the International Conference on Human-Computer Interaction, San Diego, CA, USA, 19–24 July 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 3–10. [Google Scholar]
  10. Gkoumas, A.; Komninos, A.; Garofalakis, J. Usability of visibly adaptive smartphone keyboard layouts. In Proceedings of the 20th Pan-Hellenic Conference on Informatics, Patras, Greece, 10–12 November 2016; pp. 1–6. [Google Scholar]
  11. Rodrigues, É.; Carreira, M.; Gonçalves, D. Improving text-entry experience for older adults on tablets. In Proceedings of the International Conference on Universal Access in Human-Computer Interaction, Heraklion, Greece, 22–27 June 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 167–178. [Google Scholar]
  12. Hard, A.; Rao, K.; Mathews, R.; Ramaswamy, S.; Beaufays, F.; Augenstein, S.; Eichner, H.; Kiddon, C.; Ramage, D. Federated learning for mobile keyboard prediction. arXiv 2018, arXiv:1811.03604. [Google Scholar]
  13. Chen, M.; Suresh, A.T.; Mathews, R.; Wong, A.; Allauzen, C.; Beaufays, F.; Riley, M. Federated learning of n-gram language models. arXiv 2019, arXiv:1910.03432. [Google Scholar]
  14. Mani, S.; Gothe, S.V.; Ghosh, S.; Mishra, A.K.; Kulshreshtha, P.; Bhargavi, M.; Kumaran, M. Real-time optimized n-gram for mobile devices. In Proceedings of the 2019 IEEE 13th International Conference on Semantic Computing (ICSC), Newport Beach, CA, USA, 30 January–1 February 2019; pp. 87–92. [Google Scholar]
  15. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  16. Ouyang, T.; Rybach, D.; Beaufays, F.; Riley, M. Mobile keyboard input decoding with finite-state transducers. arXiv 2017, arXiv:1704.03987. [Google Scholar]
  17. MacKenzie, I.S.; Zhang, S.X. The design and evaluation of a high-performance soft keyboard. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Pittsburgh, PA, USA, 15–20 May 1999; pp. 25–31. [Google Scholar]
  18. Zhang, S.X. A High Performance Soft Keyboard for Mobile Systems; University of Guelph: Guelph, ON, Canada, 1999. [Google Scholar]
  19. Fitts, P.M. The information capacity of the human motor system in controlling the amplitude of movement. J. Exp. Psychol. 1954, 47, 381. [Google Scholar] [CrossRef] [PubMed]
  20. Textware Solutions. The Fitaly One-Finger Keyboard. 1998. Available online: https://textware.com/fitaly/fitaly.htm (accessed on 1 May 2023).
  21. Zhai, S.; Hunter, M.; Smith, B.A. The metropolis keyboard-an exploration of quantitative techniques for virtual keyboard design. In Proceedings of the 13th Annual ACM Symposium on User Interface Software and Technology, San Diego, CA, USA, 5–8 November 2000; pp. 119–128. [Google Scholar]
  22. OnePlus 5T-Technical Specification. Available online: https://www.oneplus.com/de/support/spec/oneplus-5t (accessed on 1 May 2023).
Figure 1. Opti keyboard layout and implementation in the prototype.
Figure 2. Two examples (a,b) of the size clue of the prototype.
Figure 3. Two examples (a,b) of the color clue of the prototype.
Figure 4. Two examples (a,b) of the font clue of the prototype.
Figure 5. Two examples (a,b) of the size animation clue of the prototype.
Figure 6. Animation sequence of the size animation clue.
Figure 7. Two examples (a,b) of the color animation clue of the prototype.
Figure 8. Animation sequence of the color animation clue.
Figure 9. Two examples (a,b) of the wiggle animation clue of the prototype.
Figure 10. Animation sequence of the wiggle animation clue.
Figure 11. Results of the experiment. (a) Words per minute depending on the visual clues. (b) Average error rate depending on visual clues.
Figure 12. Learning progression.
Table 1. User study results.
Visual Clue        Errors   Uncorrected Errors   CpS     WpM      ErrRate [%]
No Clue            83       38                   0.833   9.650    1.528
Size Clue          63       25                   1.073   12.587   1.164
Color Clue         60       22                   1.070   12.575   1.111
Font Clue          48       10                   1.091   12.971   0.887
Size Animation     42       9                    1.102   13.110   0.784
Color Animation    67       27                   1.043   12.204   1.245
Wiggle Animation   58       17                   0.971   11.470   1.071
Table 2. T-test concerning typing speed.
       Opti           SC            CC            FC            SAC            CAC           WAC
Opti   1              6.46 × 10⁻⁸   4.08 × 10⁻⁸   3.75 × 10⁻⁹   1.94 × 10⁻¹⁰   2.75 × 10⁻⁶   4.58 × 10⁻⁸
SC     6.45 × 10⁻⁸    1             0.90          0.22          0.17           0.95          0.07
CC     4.08 × 10⁻⁸    0.90          1             0.03          0.10           0.85          0.03
FC     3.75 × 10⁻⁹    0.22          0.03          1             0.88           0.14          0.00
SAC    1.94 × 10⁻¹⁰   0.17          0.10          0.88          1              0.10          0.00
CAC    2.75 × 10⁻⁶    0.96          0.85          0.14          0.10           1             0.04
WAC    4.58 × 10⁻⁸    0.07          0.03          0.00          0.00           0.04          1
Table 3. NASA Task Load Index results.
                  Opti    SC     CC     FC     SAC    CAC    WAC
Mental Demand     12.96   8.63   8.26   7.59   8.33   9.07   9.44
Physical Demand   7.11    5.07   5.52   4.33   5.11   5.30   5.59
Temporal Demand   9.63    7.56   7.52   7.48   8.37   8.70   7.37
Performance       9.78    7.85   6.85   7.40   7.22   8.52   8.44
Effort            11.37   8.11   8.00   7.63   8.11   8.19   8.96
Frustration       7.07    5.63   4.52   4.63   5.19   5.96   6.30
Mean              9.65    7.14   6.78   6.51   7.06   7.62   7.69
Table 4. Mean ranking results.
          SC     CC     FC     SAC    CAC    WAC
Ranking   2.85   2.56   3.26   3.41   4.07   4.85