In recent years, authentication has become increasingly important. Authentication secures systems so that only legitimate users can access them. It can be divided into three categories: token-based, biometric-based and knowledge-based [1]. Token-based authentication relies on what users possess (e.g., an ID card), biometric-based authentication relies on users’ attributes (e.g., a thumbprint), and knowledge-based authentication relies on what users know (e.g., an alphanumeric password) [1].
Alphanumeric passwords are the primary form of user authentication [13]. This form is easy to implement and has long been widely used [14]. A secure password must be random yet easy to remember [1]. However, a secure password made up of a random string (e.g., at least eight characters long, mixing upper- and lower-case letters and special characters) is difficult for users to memorise. Graphical passwords were therefore introduced as an alternative to help users memorise their passwords better [15].
Graphical passwords are an authentication method in computer security, which is a discipline of computer science. They leverage human memory, since the human brain has a significant capacity to recognise and recall visual images [3]. The belief is that with a graphical password, a user can register a random and secure password and still have no difficulty remembering it [3].
Fundamentally, graphical passwords can be divided into three forms: recall, cued-recall and recognition-based systems [3]. In recall systems, users reproduce a previously drawn password object (e.g., a picture, icon, image or shape). In cued-recall systems, users are presented with images and are required to click on previously registered points. In recognition-based systems, users log in by recognising a set of registered objects and identifying the pass-objects among the decoy objects displayed [1].
In this study, we focus only on recognition-based systems because they are less complex and have been implemented in many security systems, such as online banking systems [2]. The following is a review of selected related work on recognition-based systems.
2. Related Work
WYSWYE (“Where You See is What You Enter”) was proposed by Khot et al. [5] (see Figure 1). The system has two main procedures: registration and authentication. During registration, a user registers four images from the 28 images shown. During authentication, a random image grid and an empty grid are generated and placed side by side on the login screen. The random image grid, or challenge grid, consists of password images and decoy images. The empty grid, or response grid, is used to acquire input from the user. The user locates the required positions in the challenge grid and then enters those positions on the response grid.
According to [5], WYSWYE can prevent shoulder-surfing attacks because attackers who peep over the user's shoulder or monitor with hidden cameras or screen-scraper programs can only see the random positions clicked in the challenge set. However, this method has a weakness: each box in the response grid is associated with four boxes in the challenge grid. For example, in Figure 1d, box No. 1 in the response grid is associated with boxes A, B, F and G in the challenge grid; box No. 2 with B, C, G and H; box No. 3 with C, D, H and I; box No. 4 with D, E, I and J; box No. 5 with F, G, K and L; box No. 6 with G, H, L and M; box No. 7 with H, I, M and N; box No. 8 with I, J, N and O; box No. 9 with K, L, P and Q; box No. 10 with L, M, Q and R; box No. 11 with M, N, R and S; box No. 12 with N, O, S and T; box No. 13 with P, Q, U and V; box No. 14 with Q, R, V and W; box No. 15 with R, S, W and X; and box No. 16 with S, T, X and Y. Therefore, attackers can observe the clicked images and filter out the decoy images in each challenge set. After multiple observations, the attackers might be able to work out the registered images. In other words, this scheme is still vulnerable to shoulder-surfing attacks, as attackers can log in as legitimate users by filtering out the decoy images after multiple observations [16].
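The association between response boxes and challenge boxes listed above follows a regular pattern, which the following minimal sketch reproduces. It assumes a 4 × 4 response grid numbered 1 to 16 and a 5 × 5 challenge grid labelled A to Y, both row by row; these sizes are inferred from the labels in the example, and the function name `challenge_boxes` is hypothetical.

```python
from string import ascii_uppercase

CHALLENGE_SIZE = 5  # assumed: challenge grid is 5 x 5, labelled A..Y row by row
RESPONSE_SIZE = 4   # assumed: response grid is 4 x 4, numbered 1..16 row by row


def challenge_boxes(response_box: int) -> list[str]:
    """Return the four challenge-grid labels associated with one response box."""
    if not 1 <= response_box <= RESPONSE_SIZE ** 2:
        raise ValueError("response box must be between 1 and 16")
    # Position of the response box inside the 4 x 4 grid (0-based).
    row, col = divmod(response_box - 1, RESPONSE_SIZE)
    labels = []
    # Each response box covers a 2 x 2 window of the challenge grid.
    for d_row in (0, 1):
        for d_col in (0, 1):
            index = (row + d_row) * CHALLENGE_SIZE + (col + d_col)
            labels.append(ascii_uppercase[index])
    return labels


if __name__ == "__main__":
    for box in range(1, RESPONSE_SIZE ** 2 + 1):
        print(box, challenge_boxes(box))
```

Because every response box narrows the clicked position to only four challenge boxes, an observer who sees the response click learns a small candidate set for each registered image.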
In 2014, Ho et al. proposed a method that allows both registered and decoy images to be used as input to the challenge set [17] (see Figure 2). During registration, the user registers several images and must remember their sequence. During authentication, a pass-image is obtained from a starting image, a cued image and the proposed algorithm. Initially, the first and second registered images serve as the starting image and the cued image, respectively. The user then determines whether the cued image lies on the imaginary half-line. If it does not, the offset is fixed at one, so the image immediately after the starting image along the half-line is the pass-image. If the cued image is on the half-line but is not the last image on it, the maximum offset is applied, so the last image along the half-line is the pass-image. If the cued image is the last image on the half-line, the maximum offset is reduced by one, so the image before the last image along the half-line is the pass-image. To determine each subsequent pass-image, the same method is applied with the current pass-image as the starting image and the next registered image as the cued image. This process is repeated until the final pass-image is obtained. To log in, the user clicks on the final pass-image.
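The three offset rules can be summarised in a short sketch. How the imaginary half-line is constructed from the grid is not described here, so the sketch assumes the half-line is already given as an ordered list of the images that follow the starting image; the function name `next_pass_image` and this list representation are hypothetical.

```python
def next_pass_image(half_line: list[str], cued: str) -> str:
    """Apply the offset rules for one step.

    half_line: images along the imaginary half-line, in order, beginning with
               the image immediately after the starting image (assumed input).
    cued:      the current cued image.
    """
    if not half_line:
        raise ValueError("the half-line must contain at least one image")
    if cued not in half_line:
        # Cued image is off the half-line: offset fixed at one.
        return half_line[0]
    if cued != half_line[-1]:
        # Cued image is on the half-line but not last: maximum offset.
        return half_line[-1]
    if len(half_line) < 2:
        # The text does not say what happens when the cued image is the
        # only image on the half-line, so this sketch refuses to guess.
        raise ValueError("rule undefined for a single-image half-line")
    # Cued image is the last image: maximum offset reduced by one.
    return half_line[-2]


if __name__ == "__main__":
    line = ["img_a", "img_b", "img_c"]
    print(next_pass_image(line, "img_x"))  # cued image not on the half-line
    print(next_pass_image(line, "img_b"))  # cued image on the line, not last
    print(next_pass_image(line, "img_c"))  # cued image is the last image
```

Chaining the steps would feed each returned pass-image back in as the next starting image, which requires the grid geometry that this sketch leaves out.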
According to [17], this method can prevent direct observation attacks. However, when multiple sessions are video-recorded, the system is vulnerable to reverse engineering attacks [18]. These attacks exploit the fact that the registered images used in a challenge set are constant. An attacker first rules out images that could not be the last cued image, and then obtains the remaining registered images by finding the last starting image or ruling out more images. Attackers can therefore find the registered images and log in as legitimate users.
In 2016, Gokhale & Waghmare proposed a graphical password method [19] (see Figure 3). During registration, a user registers several images from a list of 25 images. At least six images must be registered, and the number of registered images must be even. The user must remember the sequence of the registered images. To make this easier, a panel displays the selected images, but they disappear after 5 seconds. Next, the user chooses a question from the question pool; each question has a number associated with it. After selecting a question, the user registers a location as the answer to it. To make the location easier to memorise, the user can upload a background image from local storage or use one of the 25 images provided by the system. The user must register three locations, each associated with a question. During authentication, the user obtains several pass-images from the registered images. To locate the first pass-image, the first registered image determines the row and the second registered image determines the column; the image at their intersection is the first pass-image. This process is repeated for every pair of registered images. The user is then presented with the three registered questions in random order and answers each one by clicking on the location associated with it during registration.
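The row and column intersection rule can be illustrated with a minimal sketch. It assumes the 25 images are shown in a 5 × 5 grid and represents each image by an integer identifier; the grid layout and the names `locate` and `pass_images` are hypothetical and not taken from [19].

```python
Grid = list[list[int]]


def locate(grid: Grid, image: int) -> tuple[int, int]:
    """Return the (row, column) of an image in the displayed grid."""
    for row_index, row in enumerate(grid):
        if image in row:
            return row_index, row.index(image)
    raise ValueError(f"image {image} is not in the grid")


def pass_images(grid: Grid, registered: list[int]) -> list[int]:
    """Derive one pass-image per consecutive pair of registered images."""
    if len(registered) < 6 or len(registered) % 2 != 0:
        raise ValueError("need an even number of at least six registered images")
    result = []
    for i in range(0, len(registered), 2):
        row, _ = locate(grid, registered[i])      # first image gives the row
        _, col = locate(grid, registered[i + 1])  # second image gives the column
        result.append(grid[row][col])             # intersection is the pass-image
    return result


if __name__ == "__main__":
    # Hypothetical layout: image r * 5 + c sits at row r, column c.
    grid = [[r * 5 + c for c in range(5)] for r in range(5)]
    print(pass_images(grid, [0, 7, 13, 24, 21, 3]))
```

With the layout in the example, the pairs (0, 7), (13, 24) and (21, 3) yield the images at positions (0, 2), (2, 4) and (4, 3).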
According to [19], this scheme is easy to use and can prevent shoulder-surfing attacks. However, since the locations are fixed, attackers can easily shoulder-surf the clicked locations [16]. Attackers can also filter out the registered images after multiple observations. This scheme is therefore still vulnerable to shoulder-surfing attacks.
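One way to picture the filtering across multiple observations is as a set intersection. The sketch below is an illustration under stated assumptions rather than the attack analysed in [16]: it assumes the image grid is rearranged between sessions, that the attacker records the grid and the clicked pass-image position in each session, and it considers a single pair of registered images. The function name `narrow_candidates` is hypothetical.

```python
Grid = list[list[int]]
Observation = tuple[Grid, tuple[int, int]]  # (grid shown, clicked (row, col))


def narrow_candidates(sessions: list[Observation]) -> tuple[set[int], set[int]]:
    """Intersect row and column contents over all observed sessions.

    Returns the remaining candidates for the image that set the row and
    for the image that set the column.
    """
    if not sessions:
        raise ValueError("at least one observed session is required")
    row_candidates: set[int] | None = None
    col_candidates: set[int] | None = None
    for grid, (row, col) in sessions:
        in_row = set(grid[row])               # row image must be in the clicked row
        in_col = {line[col] for line in grid}  # column image must be in the clicked column
        row_candidates = in_row if row_candidates is None else row_candidates & in_row
        col_candidates = in_col if col_candidates is None else col_candidates & in_col
    return row_candidates, col_candidates


if __name__ == "__main__":
    # Two hypothetical 3 x 3 sessions; the registered pair is (1, 5).
    session_1 = ([[1, 2, 3], [4, 5, 6], [7, 8, 9]], (0, 1))
    session_2 = ([[5, 9, 1], [6, 2, 4], [3, 8, 7]], (0, 0))
    print(narrow_candidates([session_1, session_2]))
```

In this toy example two sessions already reduce each candidate set to a single image; a real grid would generally need more observations.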
In 2017, Por et al. proposed a method that uses digraph substitution rules [1] (see Figure 4). During registration, the user registers two images and then chooses whether the first or the second pass-image will be used to log in. During authentication, the user applies the digraph substitution rules to select a pass-image and log in.
According to [1], this scheme can prevent shoulder-surfing attacks. However, if attackers know the underlying algorithm, they can easily trace the images clicked and obtain information about the registered images over multiple shoulder-surfing sessions [18].
3D graphical user authentication (GUA) was proposed in [20] (see Figure 5). During registration, the user registers five images from 150 images. These images are distributed over six polygons, each containing a 5 × 5 grid. During authentication, the user identifies and clicks the registered images by rotating the polygon.
According to [20], this system is easy to use and can prevent shoulder-surfing attacks. However, from our perspective, this system is vulnerable to shoulder-surfing attacks because the images clicked by the user are the registered images themselves. Attackers can therefore shoulder-surf the clicked images and use them to log in.
In 2018, Sun et al. proposed PassMatrix, which uses an image discretisation algorithm [21] (see Figure 6). During registration, a user selects several images. Each selected image is divided into puzzles using the image discretisation algorithm, and the user registers one puzzle as the pass-image for each selected image. During authentication, a login indicator comprising a letter and a number is generated. The puzzles of the first selected image are then shown, with each puzzle associated with a letter on the horizontal bar and a number on the vertical bar. For each pre-selected puzzle, the user shifts the horizontal bar until the indicator letter is aligned with the puzzle's column, and shifts the vertical bar until the indicator number is aligned with the puzzle's row. This process is repeated for all of the selected images.
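The bar alignment step can be sketched as follows. The sketch assumes each bar wraps around when shifted and holds one symbol per column or row; the bar sizes and the names `shift_needed` and `rotate_right` are hypothetical and serve only to illustrate the alignment.

```python
def shift_needed(bar: list[str], symbol: str, target: int) -> int:
    """Number of circular right-shifts that move `symbol` to index `target`."""
    return (target - bar.index(symbol)) % len(bar)


def rotate_right(bar: list[str], steps: int) -> list[str]:
    """Return a copy of `bar` shifted right by `steps`, wrapping around."""
    steps %= len(bar)
    return bar[-steps:] + bar[:-steps] if steps else list(bar)


if __name__ == "__main__":
    horizontal = list("ABCDE")             # letters above the columns
    vertical = ["1", "2", "3", "4", "5"]   # numbers beside the rows
    indicator = ("D", "2")                 # login indicator: letter and number
    pass_col, pass_row = 1, 4              # position of the registered puzzle

    h_steps = shift_needed(horizontal, indicator[0], pass_col)
    v_steps = shift_needed(vertical, indicator[1], pass_row)
    shifted_h = rotate_right(horizontal, h_steps)
    shifted_v = rotate_right(vertical, v_steps)

    # After shifting, the indicator symbols line up with the pass-puzzle.
    print(shifted_h[pass_col], shifted_v[pass_row])
```

An observer sees only the final bar positions, not the indicator, which is why the scheme relies on the indicator staying hidden.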
According to [21], this system can prevent shoulder-surfing attacks. However, we believe that it remains vulnerable because the selected images and the puzzles are fixed: after multiple observations, attackers can shoulder-surf the pre-selected puzzle in each of the selected images and use it to log in.
Our review of the literature shows that there is still room for improvement in preventing shoulder-surfing attacks, so it is important to explore further methods to overcome this drawback. This research was therefore carried out to counter shoulder-surfing attacks, especially those that rely on video recording and multiple observations.
In this study, we propose a method that uses the registered locations (something that only the users know) and five image directions inspired by the cardinal directions (something that the users can see) to determine a pass-location (new knowledge).
We conducted a search using the Thomson Reuters, Scopus and Google Scholar databases. To our knowledge, user studies are the only method used to evaluate the feasibility of a method in reducing or preventing shoulder-surfing attacks [1]. Shoulder-surfing occurs when attackers capture important data or activities, such as a login password, via direct observation or video recording. This behaviour cannot be formalised. Moreover, the related works with which we compare our method (WYSWYE [5], Ho et al. [17], Por et al. [1], 3DGUA [20] and Sun et al. [21]) use user studies to evaluate their methods. Thus, we use a user study to evaluate the feasibility of our proposed method in preventing shoulder-surfing attacks.
The user study was carefully designed to imitate actual scenarios of direct observation, multiple observations and video-recorded shoulder-surfing attacks. The participants were given unlimited trials to perform shoulder-surfing attacks. They could even ask the demonstrator to demonstrate the authentication process and record it with their mobile phones for further analysis. The shoulder-surfing test results indicated that none of the participants was able to log in, although they knew the underlying algorithm and were given sufficient time to perform a shoulder-surfing attack. Hence, we conclude that our proposed method can resist direct observation, multiple observations and video-recorded shoulder-surfing attacks, regardless of the attacker's gender and competency level.
There are two factors that enable our proposed method to withstand shoulder-surfing attacks. Firstly, the registered locations and the images used in our proposed method are both meaningful. By combining these two types of meaningful information, our proposed method produces useful new knowledge, which is then used to determine the pass-location in each challenge set. Nevertheless, this new knowledge makes no sense to attackers who obtain it through shoulder-surfing.
Secondly, the images used in our proposed method have a high chance of offsetting each other. Offset in this context refers to “no movement”. No movement can only happen if the registered location shows a solid sphere image, or if the registered locations are made up of left arrow and right arrow images, or of up arrow and down arrow images. The idea of offset increases the password space of our proposed method when an attacker tries to guess the registered locations used. For example, in Figure 10 the pass-location is at the solid sphere image. To obtain such a location, the user must have registered a location showing the solid sphere image (case i), or registered locations showing both left and right arrows (case ii), or both up and down arrows (case iii), or registered locations made up of two or more repetitions of a single one of cases i, ii or iii (case iv), or registered locations made up of any combination of cases i, ii, iii and iv (case v). This means that the number of registered locations that produce a “no movement” result can be anywhere between 1 and N, where N is a positive integer. Our proposed method therefore enlarges the password space, which makes it more difficult for attackers to guess how many registered locations a user is using.
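The “no movement” cases can be checked with a small sketch. This section does not give the exact rule for combining image directions into a pass-location, so the sketch assumes each direction contributes a unit step and that the steps are summed; the names `MOVES`, `net_movement` and `is_no_movement` are hypothetical.

```python
# Assumed unit steps (dx, dy) for the five image directions.
MOVES = {
    "sphere": (0, 0),   # solid sphere: stay in place
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, -1),
    "down": (0, 1),
}


def net_movement(directions: list[str]) -> tuple[int, int]:
    """Sum the unit steps of the directions shown at the registered locations."""
    dx = sum(MOVES[d][0] for d in directions)
    dy = sum(MOVES[d][1] for d in directions)
    return dx, dy


def is_no_movement(directions: list[str]) -> bool:
    """True when the directions offset each other completely."""
    return net_movement(directions) == (0, 0)


if __name__ == "__main__":
    cases = {
        "case i": ["sphere"],
        "case ii": ["left", "right"],
        "case iii": ["up", "down"],
        "case iv": ["left", "right", "left", "right"],
        "case v": ["sphere", "left", "right", "up", "down"],
        "counter-example": ["left", "up"],
    }
    for name, directions in cases.items():
        print(name, len(directions), is_no_movement(directions))
```

The example shows that one, two, four or five registered locations can all yield the same “no movement” result, so an observed pass-location does not reveal how many locations were registered.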