Sensors and Artificial Intelligence Methods and Algorithms for Human–Computer Intelligent Interaction: A Systematic Mapping Study

To equip computers with human communication skills and to enable natural interaction between the computer and a human, intelligent solutions are required based on artificial intelligence (AI) methods, algorithms, and sensor technology. This study aimed at identifying and analyzing the state-of-the-art AI methods and algorithms and sensors technology in existing human–computer intelligent interaction (HCII) research to explore trends in HCII research, categorize existing evidence, and identify potential directions for future research. We conduct a systematic mapping study of the HCII body of research. Four hundred fifty-four studies published in various journals and conferences between 2010 and 2021 were identified and analyzed. Studies in the HCII and IUI fields have primarily been focused on intelligent recognition of emotion, gestures, and facial expressions using sensors technology, such as the camera, EEG, Kinect, wearable sensors, eye tracker, gyroscope, and others. Researchers most often apply deep-learning and instance-based AI methods and algorithms. The support sector machine (SVM) is the most widely used algorithm for various kinds of recognition, primarily an emotion, facial expression, and gesture. The convolutional neural network (CNN) is the often-used deep-learning algorithm for emotion recognition, facial recognition, and gesture recognition solutions.


Introduction
Human-computer interaction (HCI) is a research area related to people's design and use of technology and computers [1]. With the rapid development of modern information and communication technologies the role and purpose of computers has significantly changed. Today, computers have become indispensable for everyday learning, job, social, and life activities. Regardless of the type of device (e.g., a desktop computer, a laptop, a smartphone, etc.), the computer programs need to provide user interfaces (UIs) with an efficient HCI, enabling efficient completion of various tasks, such as typing a document, driving a car, controlling a robot arm, searching information online, listening to music, and others. By the massive infusion of technologies in today's society and their ubiquitous usage in our daily-life activities, the HCI has become one of the most exciting research topics in recent years [2]. The appeal for useful HCI technologies has led numerous researchers to develop innovative and intelligent methods to make HCI attractive to users [3].
In recent years, we have been witnessing a growing need for intelligent user interfaces (IUIs) that can meet user criteria where the user's expectation towards the technology increases day-by-day [4]. Next-generation computing has made it possible to develop anticipatory human-centered UIs built for humans on naturally occurring multimodal human communication, which is able to understand and emulate human communicative intentions as expressed through behavioral cues, such as effective and social signals [5]. The Figure 1. A general architecture of a multimodal HCII system (adapted from [6]).
A natural human-human interaction consists of a mix of verbal signals (e.g., speech, intonation, etc.) and non-verbal signals (e.g., gestures, facial expression, eye motions, body language, etc.). Nonverbal information can be used for predicting and understanding a user's inner (cognitive and affective) state of the mind [31]. In order to provide gen- Figure 1. A general architecture of a multimodal HCII system (adapted from [6]).
Sensors 2022, 22, 20 4 of 40 In the last decade, there has been a rapid increase in existing literature in the HCII field. HCII aims to provide natural ways for humans to use computers and technology in all aspects of future peoples' life and is relevant in applications, such as smart homes, smart offices, virtual reality, education, and call centers [27,28]. For an effective HCII, computers must have the communication skills of humans [29] to be able to interact with the users naturally [30] by enabling interactions that are able to mimic human-human interactions [27]. HCII solutions must implement at least some kind of intelligence in perception from and/or in response to the users [24].
A natural human-human interaction consists of a mix of verbal signals (e.g., speech, intonation, etc.) and non-verbal signals (e.g., gestures, facial expression, eye motions, body language, etc.). Nonverbal information can be used for predicting and understanding a user's inner (cognitive and affective) state of the mind [31]. In order to provide genuinely intuitive communication, computers need to have their sense of verbal and non-verbal signals in order to understand the message and the context of the message [32]. A robust, efficient, and effective HCII system must therefore be able to activate different channels (e.g., auditory channels which carry speech and vocal intonation, a visual channel that carries facial expression and gestures, etc.) and modalities (e.g., sense of sight, hearing, etc.) that enable effective detection, recognition, interpretation, and analysis of various human physiological and behavioral characteristics during the interaction [33,34].
The research and developments in both hardware and software have enabled the use of speech, gestures, body posture, different tracking technology, tactile/force feedback devices, eye-gaze, and biosensors to develop new generations of HCI systems and application [33]. HCI systems that use only one type of input/output or modality enabling the HCI are called unimodal systems. Multimodal HCI systems, on the other hand, use many input or output modalities to communicate with the user, exhibiting some form of intelligent behavior in a particular domain [24,33]. Based on the way of information transfer from user to the computer, UIs can be divided into [35]: (1) contact-based interfaces (e.g., keyboard and mouse based interfaces, touch screen interfaces, etc.), (2) speech-based interfaces (e.g., spontaneous speech, continuous speech, acoustic nonspeech sounds, etc.), (3) gesture-based interfaces (e.g., finger pointing, spatial hand gestures, sign language, head gestures, user behavior, etc.), (4) facial expression-based interfaces (e.g., facial expressions, including those reflecting emotions, articulation of lips, gaze direction, eye winking, etc.), (5) textual and hand-writing interfaces (e.g., handwritten continuous text, typed text, etc.), (6) tactile and myo interfaces (e.g., sensor gloves and body-worn sensors, EMG sensors, etc.), and (7) neural computer interfaces (e.g., EEG signal, evoked potential, etc.).
Based on the nature of the modalities used, the HCIs can be divided into (1) visualbased HCI, which use various visual information about human's response while interacting with the machine, (2) audio-based HCI that use information acquired by different audio signals, and (3) sensor-based HCI that combine a variety of areas with at least one physical sensor used between user and machine to provide the interaction [24]. The visual-based HCI research deals with the development of solutions for an efficient understanding of various humans' responses from visual signals, including facial expression recognition, gesture recognition, gaze detection, and other areas. The audio-based HCI research includes speech recognition, speaker recognition, audio-based emotion recognition, and others. In sensor-based HCI, the solutions are being built using various sensors, which can be very primitive (e.g., a pen, a mouse, etc.) or sophisticated (e.g., motion-tracking sensors, EMG sensors, EEG devices, etc.).
Application fields of HCII are heterogeneous, and the creation of intelligent user interfaces aim to [36]: (1) change the way information is displayed based on users' habits in a particular operating environment, (2) improve human-computer interaction by processing natural language, (3) enabling HCI for users with limitations to interact with technological devices (e.g., improvement of accessibility of interfaces for blind users, using different sensors for acquiring data about user movements and translation of movements into commands sent to a wheelchair, interfaces for cognitive impaired users, etc.). More natural and efficient intelligent interaction paradigms, such as gesture interaction, voice interaction, and face recognition, are widely being implemented in new HCI applications (e.g., smart home solutions, autonomous cars, etc.) [37]. In contrast to the conventional mechanisms of passive manipulation, HCII integrates versatile tools, such as perceptual recognition, AI, affective computing, and emotion cognition, to enhance the ways humans interact with computers [38,39].
Novel IUIs are not necessarily being built for replacing traditional interfaces that use input devices, such as a mouse and a keyboard, but are, rather, complementing when needed or appropriate. Solutions that enable users using speech and hand gestures to control the computer are useful especially in virtual environments, because, for example, in a 3D environment a keyboard and a mouse as an input device are not much useful. Speech recognition solutions that can recognize speech through visual signal (e.g., reading from lips) can complement speech recognition from audio signals in noisy environments where the recognition from audio signal cannot perform well. Intelligent HCI systems can also be used for enabling efficient human-to-human interaction when this is not possible due to the limitations of end-users. For example, an intelligent HCI system that combines AI with wearable devices (e.g., data gloves) can solve communication problems between a hard-of-hearing and a non-disabled person [40]. Mobile IUI solutions today make use of the plethora of advanced sensors available in smartphones, such as camera, microphone, keyboard, touchscreen, depth sensors, accelerometer, gyroscope, geolocation sensor, barometer, compass, ambient light sensor, proximity sensor, etc., which allow the combination of inputs and enrichment of HCII interactions [21] As discussed above, the essential functions of HCII are based on a clear signal of emotional state to infer a person's emotional state [30]. Emotions are complex processes comprised of numerous components, including feelings, body changes, cognitive reactions, behavior, and thoughts [41]. Emotion is a psycho-physiological process triggered by the conscious and unconscious perception of a situation or an object and is often associated with mood, temperament, personality, disposition, and motivation [42]. Intelligent systems providing HCII must, for example, through emotion recognition, be able to perceive the user's emotions, produce the ability of empathy, and respond appropriately [30,43]. By understanding emotions in natural interactions, HCII systems can make smarter decisions and provide better interactive experiences [28]. Emotional interaction makes the humancomputer interaction more intelligent-it makes the interaction natural, cordial, vivid, emotional [43]. Because automatic emotion recognition has many applications also in HCII, it has attracted the recent hype of AI-empowered HCI research [44]. Another essential and challenging task related to emotion recognition in HCII is speech emotion recognition [45], which has become the heart of most HCI applications in the modern world [46]. For many years, eye-tracking technology has been used for usability testing and implementation of various solutions for controlling the user interface. Eye-tracking-based UIs are, for example, various assistive technology solutions for people with severe disabilities (e.g., [47]) that cannot use arms and standard input devices. However, eye-based cues (e.g., eye gaze) are another field of increasing interest to the research community for automatic emotion classification and affect prediction [48].
Facial expression is a powerful, natural, and direct way humans communicate and understand each other's affective states and intentions [29]. Facial expression is considered a significant gesture of social interaction and one of the most significant nonverbal behaviors, through which HCI systems may recognize human emotions' internal or affective state [49]. The clues for understanding facial expressions lie not in global facial appearance but also in informative local dynamics among different but confusing expressions [38]. Automatic facial expression recognition plays a vital role in HCII, as it can help non-intrusively apprehend a person's psychopathology [50]. Motivated by this significant characteristic of instantly conveying nonverbal communication, facial expression recognition plays an intrinsic role in developing the HCII and social computing fields [51] and is becoming a necessary condition for HCII [50]. With many applications in day-to-day developments and other areas, such as interactive video, virtual reality, videoconferencing, user profiling, games, intelligent automobile systems, entertainment industries, etc., facial expression has an essential role in HCII [52].
Human gesture recognition has also become a pillar of today's HCII, as it typically provides more comfortable and ubiquitous interaction [2]. Human gestures include static postures (e.g., hand posture, head pose, and body posture) and dynamic gestures (e.g., hand gestures, head gestures like shaking and nodding, facial action like raising the eyebrows, and body gestures) [53]. Hand gestures, for example, have been widely acknowledged as a promising HCI method [54]. Information about head gestures obtained from head motion is valuable in various applications, such as autonomous driving solutions or assistive tools for disabled users [55].

Sensors Technology for HCII
As stated in the previous section, the user input can be based on standard input devices, recognition-based technologies, or sensor-based technologies. Recognition-based technologies can be implemented using invasive methods and non-invasive methods. Invasive recognition-based technologies use sensors attached to a person (e.g., accelerometer sensor attached to the chest, waist, or different body parts). In contrast, non-invasive recognition-based technologies use non-attached sensors, e.g., vision-based sensors, such as a camera, thermal infrared sensor, depth sensor, smart vision sensor, etc. [66]. Sensor-based HCI technologies are built using various sensors, which can be very primitive (e.g., a pen, a mouse, etc.) or very sophisticated (e.g., motion tracking sensors, EMG sensors, EEG sensors, etc.).
HCI devices with sensor capabilities can be divided into [67]: standard input/output devices (e.g., mouse, keyboard, touch screen, etc.), wearables (e.g., smartwatch, smartphone, band, glove, smart glasses, etc.), non-wearables (e.g., camera, microphone, environmental sensors, etc.). Wearable devices with different kinds of built-in sensors (mechanical, physiological, bioimpedance, and biochemical) can provide data about the physical and mental state of the user [68]. Wearable sensors, for example, are increasingly being used for measuring, in particular, biological signals, such as heart rate or skin conductance [69].
Sensors can be divided into unimodal sensors, providing data about one single signal (e.g., accelerometer), and multimodal sensors (e.g., Body Area Sensor Network [70], Kinect, RespiBAN [71], Empatica E4 [71], etc.). An accelerometer sensor is a device that captures vibrations and orientation of systems that move or rotate. The accelerometer sensor has been used to study activity recognition and physical exercise trackings, such as aerobic exercises [72], gesture recognition [53], human activity recognition [73], and others. The Kinect sensor consists of an RGB camera, a depth sensor, an infrared sensor, and a microphone array. The depth sensor measures the three-dimensional positions of objects in its space [72]. Sensory used for activity recognition are typically classified as ambient and wearable sensors, where ambient sensors are attached to objects in the environment with which users interact [74].
With the development of AI, new types of sensors and interactive devices emerged, enabling new ways for interaction, such as biometrics-based interaction that includes face recognition, fingerprint recognition, attitude recognition, and so on [37]. It is sometimes argued that facial expression and tone of voice are also biological signals [69]. Multimodal HCII systems usually enable technologies for processing active input mode with recognition-based and sensor-based input technologies and technologies for processing passive input using data from sensors (e.g., biosensors, ambient sensors, etc.) [5].
Audio and visual-based input modalities implemented using sensors were developed as well, such as speech recognition based on the audio signal acquired with a microphone (e.g., [63,90,91]), facial expression recognition based on processing visual data from the camera [92], human body posture recognition using data from the conventional gray level or color camera, thermal infrared sensor, depth sensor, smart vision sensor [66], usermovement recognition using Kinect [72], gesture recognition based on data from depth sensor [93] and USB camera on a helmet [94], emotion recognition with a laptop camera [91], and so on. The Kinect can also be used to implement a solution for contact-free stress recognition, where the Kinect can provide respiration signals under different breathing patterns [95]. Eyetracker was used for implementing various HCII solutions, such as cognitive-load assessment during the interaction [96], contactless measurement of heart rate variability from pupillary fluctuations [97], assistant virtual keyboard [98], adaptive UIs [96], autism spectrum disorder prediction [99], etc.
In HCII literature, several solutions for stress recognition were proposed by processing data from various sensors. The physiological response reflects the sympathetic nervous system that can be measured by an ECG sensor, a respiration band sensor, and electrodermal activity (EDA) sensor [79]. In [100] the authors proposed a solution for stress recognition based on keystroke dynamics. Several stress-recognition solutions have combined multi sensors, e.g., visual images from a laptop camera and speech from a laptop microphone [91], a wrist sensor (accelerometer and skin conductance sensor) [101], multimodal wearable sensor (EEG, camera, GPS) [78], a RespiBAN (chest-worn) and Empatica E4 (wrist-worn) sensor (ECG, EDA, EMG, Respiration and Temperature) [71], webcam, Kinect, EDA, and GPS sensor [102], BIOPAC MP150 (ECG from wrist, EMG from corrugator muscle, GSR from fingertips) and video from camera for offline analysis [89], etc.
To enhance the quality of the communication and maximize the user's well being during his or her interaction with the computer, the machine must understand the user's state and automatically respond intelligently. For this, various data about the user's behavior and state related to the interaction must be collected and analyzed. For example, understanding users' emotions can be achieved through various measures, such as subjective self-reports, face tracking, voice analysis, gaze-tracking, and the analysis of autonomic and central neurophysiological measurements [103]. Humans' emotional reactions during HCI can trigger physiological changes that can be recognized using various modalities such as facial expressions, facial blood flow, speech, behavior (gesture/posture), and physiological signals. In existing HCII research the behavioral modeling and recognition uses various physiological signals, including the electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG), galvanic skin response (GSR), blood volume pressure (BVP), heart rate (HR) or heart rate variability (HRV), temperature (T), and respiration rate (RR) [41]. The physiological signals respond to the human body's central nervous system and automatic nervous system, which are voluntary reactions and are more objective [104].
Emotions, for example, can trigger some minor changes in facial blood flow with an impact on skin temperature [42] and speech [105]. EEG-based emotion recognition, for example, has become crucial in enabling the HCII [44] and has been globally accepted in many applications, such as intelligent thinking, decision-making, social communication, feeling detection, affective computing, etc. [106]. Facial expression recognition can involve using sensors, such as cameras, eye-tracker, ECG, EMG, and EEG [107]. The emotion recognition process often includes sensors for detecting physiological signals, which are not visible in the human eye and immediately reflect the emotional changes [44]. Some of the current approaches to emotion recognition based on EEG mostly rely on various handcrafted features extracted over relatively long-time windows of EEG during the participants' exposure to appropriate affective stimuli [103].
One of the main challenges in HCII is the measurement of physiological signals (or biosignals), where the collection process uses invasive sensors that need to be in contact with the human body while recording. However, the ongoing research has enabled the use of non-invasive sensors as well. For example, innovative sensors, such as eye-trackers, enable the development of IUIs able to extract valuable and usable patterns of the users' habits and ways of interaction [64]. The non-invasive sEMG signal can be used to analyze the active state of the muscles and neural activities and performs well in artificial control, clinical diagnosis, motion detection, and neurological rehabilitation [54]. Hand gesture recognition can be implemented by using a vision camera or wearable sensors [108]. Wearable sensors and gesture recognition techniques have been used to develop wearable motion sensors for the hearing-and speech-impaired and wearable gesture-based gadgets for interaction with mobile devices [109]. Recently, depth-based gesture recognition has received much intention in HCII as well [110]. With the rapid development of IoT technologies, many intelligent sensing applications have emerged, which realize contactless sensing [111].

Artificial Intelligence (AI) Methods and Algorithms for HCII
Artificial Intelligence (AI) is one of the most crucial components in the development of HCII and has already significantly impacted how users use and perceive contemporary IUI. The introduction of affective factors to HCIs resulted in developing an interdisciplinary research field, often called affective computing, which attempts to develop human-aware AI that can perceive, understand, and regulate emotions [112]. Once computers understand humans' emotions, AI will rise to a new level [113]. In the HCSII research field, there is an increasing focus on developing emotional AI in HCI since emotion recognition using AI is a fundamental prerequisite to improve HCI [106].
Machine-learning (ML) algorithms and methods can be categorized according to the learning style or similarity in form or function. When categorizing based on the learning style, ML approaches can be divided to following three categories [114,115] Unsupervised learning (UL) algorithms that include clustering, hierarchical ML, unsupervised Gaussian mixture (UGM), hidden Markov model-HMM, K-means, fuzzy c-means, neural networks (NN), etc., 3.
Reinforcement learning (RL) algorithms that include model-based RL, model-free RL, and RL-based adaptive controllers.
In existing HCII-related research, various methods and algorithms have been proposed to accomplish the task of facial expression classification, including SVM, k-NN, NN, rule-based classifiers, and BN [51]. For facial emotion recognition, for example, CNN was recognized as an effective method that can perform feature extraction and classification simultaneously and automatically discover multiple levels of representations in data [50]. Audio-video emotion recognition, for example, is now researched and developed with deep neural network modeling tools [117]. In the speech emotion recognition field, there is considerable attention in digital signal processing research. Researchers have developed different methods and algorithms for analyzing the emotional condition of an individual user with the focus on emotion classification by salient acoustic features of speech. Most researchers in speech emotion recognition have applied handcrafted features and machine learning techniques in recognizing speech emotion [46]. In existing speech emotion recognition research, classical ML classifiers were used, such as the Markov model (MM), Gaussian mixed model (GMM), and SVM [118]. Existing research has demonstrated that DLA effectively extract robust and salient features in the dataset [46]. After the ANN breakthrough, especially CNN, the neural approach has become the main one for creating intelligent computer vision systems [119]. CNN is currently the most widely used deep-learning model for image recognition [63].
This study is interested in AI methods and techniques for HCII solutions available in the existing literature. We are interested in both HCII solutions validated using data from sensor technology and HCII solutions validated using data from publicly available databases.

Related Studies
In the existing literature in HCI and HCII, several studies of systematic literature reviews (SLR) and systematic mapping studies (SMS) have been conducted in the last ten years. This fact indicates that the large body of work encouraged researchers to create a joint knowledge base in this field. Although we can find some parallels between the existing SLR and SMS studies and our study, as the existing studies also dealt with a specific topic related to HCI and HCII, certain differences make our study the first such study in HCII.
We see the first difference when reviewing the keywords based on which existing SLR and SMS studies have systematically acquired and analyzed the existing literature. The most common keywords were HCI (in 7 studies), followed by AI (in 3 studies), IoT (in 3 studies), and EMG (in 3 studies). Others were IUI (in 2 studies), robustness (in 2 studies), accuracy (in 2 studies), smart home (in 2 studies), and deep learning (in 2 studies), followed by over 40 different keywords, particular to different domains, indicating that HCI is an important research field and is integrated into various domains. Furthermore, none of the existing studies aimed to analyze HCII and IUI literature in general and provide a standard overview of sensor technology and machine-learning methods and algorithms used for other HCII developments.
In [9], authors investigated how to approach the design of human-centric IT systems and what they represent. A study published ten years later [10], has still been questioning what is actually deemed intelligent, and surprisingly still examining the question that should have been answered a decade ago. In one of the most extensive reviews of literature related to the HCI field [120], authors analyzed 3243 articles, examining publication growth, geographical distribution, citation analysis research productivity, and keywords, and identified the following five clusters: (1) UI for user centric design, (2) HCI, (3) interaction design, (4) intelligent interaction recognition research, and (5) e-health and health information. The main conclusion in [120] was that the research in this field has little consistency, the researchers start addressing newer technologies, and there is no accumulated knowledge. A similar attempt at visualizing popular clusters for a specific decade was also observed in [121].
Regardless of how potentially unorganized research in HCI might be, the benefits of its use are evident in several other research studies. The benefits of HCI solutions are the main focus of various SLR and SMS studies (e.g., [11,12,122]). For example, the authors in [12] investigated wearable devices, arguing that they have brought the highest level of convenience and assistance to people than ever before. The findings are supported by the study results conducted in [11], where authors addressed the benefits of wearable devices for the aging population with chronic diseases, potentially reducing the social and economic burdens. The importance of HCI in healthcare was furthermore analyzed in [13,122], where the research was focused on people with disabilities and related health problems. The role of AI technology for activity recognition, data processing, decision making, image recognition, prediction making, and voice recognition in smart home interactive solutions was also analyzed [17]. Some existing SLR studies are also focused on the general use of HCI solutions, providing support in healthcare [122], smart living [17], or understanding human emotions from speech [18].
The second focus of published SLR and SMS studies are sensors, signals, and the intelligent use of different devices. Authors in [12] investigated types of wearable devices for general users. At the same time, [13] addressed the benefits of ambient-assisted living and IUI for people with special needs, concluding how important it is to design userfriendly interfaces to provide an excellent HCI mechanism that fits the needs of all users. A similar effort was made in [123], only this time the general user population is included, indicating that several "general solutions" do not always have user-friendly interfaces. The level of intelligence is low, and they need to be improved.
The importance of well-designed and well-interpreted HCII can be presented as the third focus of studies. For example, in [37], authors recognized new HCI scenarios, such as smart homes and driverless cars. In [124], augmented reality (AR) and the third generation of AI technology are investigated. However, both studies lack a systematic review (a lower number of literature units indicate limited research space in these specific topics). The risk of misinterpretation of signals and the connecting risks are addressed in [15] (EMG) and in [16] (EEG), both claiming there are several issues in this area. The review of [15] focuses on deep learning in EMG decoding, while [16] is set to find different good practices within existing research.
Benefits of IoT and HCI collaboration are addressed within the application of IoT systems, emphasizing the influence of human factors while using HCI [125]. The authors in [125] address the advantages of information visualization, cognition, and human trust in intelligent systems. In contrast, the authors in [126] present a unified framework for deriving and analyzing adaptive and scalable network design and resource allocation schemes for IoT.
In the last decade, there was an emphasis on developing solutions for mobile devices. SLRs related to the HCII field on mobile devices are focused on mobile emotion recognition methods, primarily but not exclusively addressing smartphone devices. The authors in [21] deliver a systematic overview of publications from the past ten years addressing smartphone emotion recognition, providing a detailed presentation of 75 studies. Meyer et al. [20] also analyzed the existing research field of mobile emotion measurement and recognition. By conducting a literature review, they were focused on optical emotion recognition or face recognition, acoustic emotion recognition or speech recognition, behavior-based emotion recognition or gesture recognition, and vital-data-based emotion recognition or biofeedback recognition. Research, conducted by Tzafilkou et al. [19] addressed the use of non-intrusive mobile sensing methodologies for emotion recognition in smartphone devices, narrowing the timescale and number of papers even more: 30 articles during the past six years. Similar to the findings of our study, the authors identified a peak of papers published in 2016 and 2017. Based on the results that revealed main research trends and gaps in the field, the authors discussed research challenges and considerations of practical implications for the design of emotion-aware systems within the context of distance education.

Materials and Methods
We conducted a systematic mapping study (SMS) to identify state-of-the-art research in the HCII area and to observe trends in research and the developments in sensor technology and AI methods and algorithms for HCII. We followed the guidelines prepared by Pettersen et al. [127] (see Figure 2). . Systematic mapping study process adapted from [127].
The research activities in this study were divided into five phases (see Figure 3). In the first phase, the research questions presented in Section 3.1 were defined. Next, the scope of the study was reviewed with the preliminary research in the selected databases, presented in Table 2. At this point, we conducted the review of related work and created the first draft of the classification scheme based on our research questions and propositions from related work. The search was conducted following the predefined inclusion and exclusion criteria presented in Table 3. After screening the papers based on abstract in the first step and whole content screening in the next step, the relevant papers were selected, and the draft classification scheme was revised and extended. The final version of the scheme is presented in Table 5. In phase three, data from the selected papers were extracted in accordance with the finalized classification scheme. In phase four, systematic maps presented in Section 4 were created, and the results were analyzed. Predefined steps for SMS were extended in our study, as we defined the draft classification scheme after reviewing the research scope. The research was concluded in phase 5 with writing activities. The research activities in this study were divided into five phases (see Figure 3). In the first phase, the research questions presented in Section 3.1 were defined. Next, the scope of the study was reviewed with the preliminary research in the selected databases, presented in Table 2. At this point, we conducted the review of related work and created the first draft of the classification scheme based on our research questions and propositions from related work. The search was conducted following the predefined inclusion and exclusion criteria presented in Table 3. After screening the papers based on abstract in the first step and whole content screening in the next step, the relevant papers were selected, and the draft classification scheme was revised and extended. The final version of the scheme is presented in Table 5. In phase three, data from the selected papers were extracted in accordance with the finalized classification scheme. In phase four, systematic maps presented in Section 4 were created, and the results were analyzed. Predefined steps for SMS were extended in our study, as we defined the draft classification scheme after reviewing the research scope. The research was concluded in phase 5 with writing activities.
selected, and the draft classification scheme was revised and extended. The final version of the scheme is presented in Table 5. In phase three, data from the selected papers were extracted in accordance with the finalized classification scheme. In phase four, systematic maps presented in Section 4 were created, and the results were analyzed. Predefined steps for SMS were extended in our study, as we defined the draft classification scheme after reviewing the research scope. The research was concluded in phase 5 with writing activities.

Definition of Research Questions
The primary goal of this study is to identify, analyze, and synthesize existing work in the HCII field. The main objective of this study is to (1) systematically review relevant scientific articles to conduct a SMS of the HCII area and (2) to present trends and demographic analysis of the HCII research field. Based on the research goal, we formulated three main research questions-RQ1, RQ2, and RQ3. As the main questions are too general, a set of sub-research questions for all were proposed to provide more complete answers to the formulated research questions, as presented in Table 1.

RQ1
What have been trends and demographics of the literature within the field of HCII? For the first main research question, the following set of research sub-questions were formulated: What is the annual number of publications in HCII field? (Publication count by year).

RQ1.2
Which studies in the HCII are most cited? (Top-cited studies).

RQ1.3
Which countries are contributing the most to the HCII field, based on the affiliations of the researchers? (Active countries).

RQ1.4
Which venues (i.e., journals, conferences) are the main targets of articles in the HCII field, measured by the number of published articles? (Top venues).

RQ2
What has been research space of the literature within the field of HCII in the last decade? For the second main research question, following sub-questions were formulated:

RQ2.1
What type of research is conducted in the HCII field? (Research type, e.g., quantitative, qualitative, and mixed).

RQ2.2
What type of research methods have been conducted in the HCII studies? (research method type, e.g., [127,128]: validation research, evaluation research, solution proposal, philosophical papers, opinion papers, and experience papers).

RQ 2.3
What research methodology is used for validating or evaluating the proposed HCII solution? (Research methodology, e.g., experiment, case study, etc.).

RQ 2.4
What are the common data collection methods in the HCII studies? (Data collection method, e.g., measurement with sensors, questionnaires, observing users, image processing, etc.).

RQ2.6
What phase of the HCII development and evaluation are presented in existing studies (analysis, design, implementation, and testing) (HCII development phase).

RQ3
What sensors technology and intelligent methods and algorithms have been used in the development and evaluation of solutions for HCII? For the third main research question, following sub-questions were formulated: RQ3.1 What is the main aim of the intelligent recognition? (Recognition-of, e.g., emotion, gesture, etc.).

RQ3.2
What is the main data source for the evaluation of the proposed solution of the HCII? (Data source, e.g., audio signal, audiovisual information, sensor, etc.).

Conducting Search and Screening
After identifying the research questions, we defined the appropriate keywords for finding all published articles with topics from HCII and IUI. As we wanted to provide a comprehensive overview of the research area, broad keywords were used. The elementary search query string used for finding published articles was the following: "intelligent interaction" OR "intelligent user interface".
For finding the relevant literature, we used the following publicly established available digital libraries: ACM, IEEE, ScienceDirect, Scopus, and Web of Science. The first search, conducted using the digital libraries, yielded 5642 articles (see Table 2) that were used as inputs into the next selection process step. Several inclusion and exclusion criteria were applied (see the criteria specification in Table 3).  Table 3. Inclusion and exclusion criteria.

I1 Field
Include studies addressing intelligent interaction or intelligent user interfaces. I2 Language The article must be written in English.

I3
Availability The article must be accessible electronically.

I4
Literature type Include articles published in peer-reviewed journals, conference proceedings, or a book (e.g., lecture notes). E1 Year Exclude literature, published before the year 2010. E2 Duplicates Exclude any duplicated studies found in multiple databases.

E3
Research area Exclude non-computer science or non-human-computer interaction literature.

E4
Methodology type Exclude articles that report results of a systematic literature review or systematic mapping study. E5 Language Exclude articles not written in English.

E6
Field Exclude studies outside of the scope of HCII or II.

E7
Exclude articles less than four pages long that do not provide enough information about the study conducted.
The process of selecting the relevant literature was carried out in several steps (see Table 4 and Figure 4). In the second step, we have limited the set of studies by applying the exclusion criteria E1 to include only studies published in 2010 or later, which resulted in a set of 3335 studies. The titles and abstracts were read to select studies that address intelligent interaction or intelligent user interfaces in step three. In the third step, we also excluded studies that were not published in English and were not accessible electronically. The third step provided 657 articles published in a journal, conference proceeding, or a book as a book section. In the next step, removing 35 duplicate entries resulted in 622 articles used for the last step of the selection process. In the fifth step, the exclusion criteria E3-E7 were applied to exclude studies unrelated to computer science and HCI research areas. In addition, literate reviews or systemic mapping studies were excluded. Short papers with less than four pages and articles not written in English were not selected for the final analysis. Although the language criteria were already applied in step three, some articles we found had titles and abstracts written English; however, the rest of the article was not written in English. The last screening step resulted in 454 articles that were used for the systematic mapping study. areas. In addition, literate reviews or systemic mapping studies were excluded. Short papers with less than four pages and articles not written in English were not selected for the final analysis. Although the language criteria were already applied in step three, some articles we found had titles and abstracts written English; however, the rest of the article was not written in English. The last screening step resulted in 454 articles that were used for the systematic mapping study.

Classification and Data Extraction
To ensure that all studies would be analyzed consistently, rules about coding data about study characteristics and the results were specified. A predefined classification scheme is summarized in the Table 5. Firstly, we noted the article type (EC1) as a journal article, conference paper, or book selection. Proceedings papers were classified as conference papers. In terms of research type (EC2), a study was classified as quantitative when authors used quantitative methods for data analysis, qualitative if qualitative methods were used for data analysis, and mixed for studies where quantitative and qualitative data analysis methods were used. Research method type (EC3) was observed as suggested by [128] as validation research, evaluation research, a solution proposal, a philosophical paper, an opinions paper, or an experience paper. Validation research-a certain novel HCII method/technique/algorithm/tool, which has not yet been implemented in practice and was validated using a method-like case study, and experiments in lab, simulation, prototyping, etc. Evaluation research-a certain HCII method/technique/algorithm/tool, which was implemented and evaluated in practice. It shows how the method/technique/algorithm/tool was implemented in practice and what the benefits and drawbacks of the implementation are, evaluated using techniques, such as industrial case study, controlled experiments with practitioners, action research, etc. Solution proposal-a HCII method/technique/algorithm/tool is proposed, which can be either novel or a significant extension of an existing one. The potential benefits and the applicability of the solution is shown by a small example or a good line of argumentation. However, the proposed solution has not yet been implemented. Philosophical papers-for better understanding of existing things, a new-structured view is provided, by constructing a taxonomy or conceptual framework. Opinion papers-a personal opinion is provided of somebody, about whether a certain HCII method/technique/algorithm/tool is good or bad, or how things should be done. The opinions usually do not rely on related work and/or research methodologies. Experience papers-papers explain the author's personal experience on what and how HCII method/technique/algorithm/tool has been done in practice. The research strategy was noted with EC4 as a case study, experiment, survey, grounded theory, user study, field study, mixed study, exploratory study, or literature review. As the IUI is a multidisciplinary filed, we noted the predominant research standpoint with EC6 as accessible UI, adaptive UI, artificial intelligence, brain computer interface (BCI), human-computer interaction (HCI), human-machine interaction (HMI), intelligent interaction (II), or intelligent UI. The research standpoint values were gathered from related literature [7] and extended during the screening. The list of values for data source (EC10), sensor type (EC11), and AI methods used (EC12) were defined literature (one source was for example [13]) and extended during the screening process.

Results
This section presents the results obtained from the analysis of the 454 primary studies. The objective of this study was to identify and systematically map the research in the field of HCII and IUI.

Trends and Demographics of the Literature within the Field of HCII
As illustrated in Figure 5, the number of studies has been increasing in the last decade. The number of primary studies has slightly decreased in 2014 and 2016. Otherwise, a clear trend of increased interest in observed fields can be reported. Along with the increasing number of primary studies in recent years, we also noticed the change in the article types; a positive trend in the number of journal papers compared to other primary research types can be observed since 2017. The data for 2021 is incomplete as the data search was concluded in October 2021. As shown in Figure 6, a majority (72%) of the HCII and IUI articles have been published in conference proceedings or as book sections.
ade. The number of primary studies has slightly decreased in 2014 and 2016. Otherwise, a clear trend of increased interest in observed fields can be reported. Along with the increasing number of primary studies in recent years, we also noticed the change in the article types; a positive trend in the number of journal papers compared to other primary research types can be observed since 2017. The data for 2021 is incomplete as the data search was concluded in October 2021. As shown in Figure 6, a majority (72%) of the HCII and IUI articles have been published in conference proceedings or as book sections.  We ranked the journals and conferences by the number of primary articles published to get a broader view of top venues for the HCII and IUI literature. Results presented in Table 6 indicate that HCII and IUI research articles could be found in a broad spectrum of journals. Most primary studies have been published in IEEE Access (21), followed by Multimedia Tools and Applications (9), Expert Systems with Applications (6), IEEE Transactions on Affective Computing (5), and Procedia Computer Science (5). The conference with the highest number of proceedings papers on HCII and IUI topics is the International Conference on Affective Computing and Intelligent Interaction (ACII), where 16% of the observed proceeding papers were published. A further 24 proceedings papers were published in Humaine Association Conference on Affective Computing and Intelligent Interaction, and As illustrated in Figure 5, the number of studies has been increasing in the last decade. The number of primary studies has slightly decreased in 2014 and 2016. Otherwise, a clear trend of increased interest in observed fields can be reported. Along with the increasing number of primary studies in recent years, we also noticed the change in the article types; a positive trend in the number of journal papers compared to other primary research types can be observed since 2017. The data for 2021 is incomplete as the data search was concluded in October 2021. As shown in Figure 6, a majority (72%) of the HCII and IUI articles have been published in conference proceedings or as book sections.  We ranked the journals and conferences by the number of primary articles published to get a broader view of top venues for the HCII and IUI literature. Results presented in Table 6 indicate that HCII and IUI research articles could be found in a broad spectrum of journals. Most primary studies have been published in IEEE Access (21), followed by Multimedia Tools and Applications (9), Expert Systems with Applications (6), IEEE Transactions on Affective Computing (5), and Procedia Computer Science (5). The conference with the highest number of proceedings papers on HCII and IUI topics is the International Conference on Affective Computing and Intelligent Interaction (ACII), where 16% of the observed proceeding papers were published. A further 24 proceedings papers were published in Humaine Association Conference on Affective Computing and Intelligent Interaction, and We ranked the journals and conferences by the number of primary articles published to get a broader view of top venues for the HCII and IUI literature. Results presented in Table 6 indicate that HCII and IUI research articles could be found in a broad spectrum of journals. Most primary studies have been published in IEEE Access (21), followed by Multimedia Tools and Applications (9), Expert Systems with Applications (6), IEEE Transactions on Affective Computing (5), and Procedia Computer Science (5). The conference with the highest number of proceedings papers on HCII and IUI topics is the International Conference on Affective Computing and Intelligent Interaction (ACII), where 16% of the observed proceeding papers were published. A further 24 proceedings papers were published in Humaine Association Conference on Affective Computing and Intelligent Interaction, and further 15 were published in the Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia). Table 7 lists the top-cited primary studies in HCII and IUI identified and analyzed in this systematic mapping study. Our sample's most cited primary research was the journal article "Analysis of EEG Signals" and "Facial Expressions for Continuous Emotion Detection" [129], with 219 citations. Further analysis demonstrated that the five most cited articles focused on emotion or stress recognition from different data sources (EEG signals, wearable sensors, audio, or smartphones). In the last decade, the mean number of citations for observed studies in the HCII and IUI research fields was 8.7. On average, observed conference proceedings papers in our sample were cited 6.1 times, book chapters were cited 11.5 times, while observed journal papers on average have 15.1 citations. Out of the ten top-cited studies, most (N = 7) were journal papers, while the other three were conference proceedings. Deep-learning analysis of mobile physiological, environmental and location sensor data for emotion detection [131] 2019 Information Fusion 102 Analyzing the impact of a specific study in the existing literature, only from the point of view of the number of all citations, may not be the best, especially if we want to compare studies that have been published ten years apart. Articles published several years ago have probably had a greater chance of being recognized and cited by other authors. Moreover, for studies that have been conducted and published recently, there may not have been enough time for them to be already recognized and cited in the existing literature by other authors. Therefore, to provide an additional view on the most influential research in HCII, we ranked the studies according to the average number of paper citations per year (see Table 8). Based on the average number of citations per year, this time, the study EmotionMeter: A Multimodal Framework for Recognizing Human Emotions [112], performed best with 62,67 citations on average per year. The second best ranked was the study Analysis of EEG Signals and Facial Expressions for Continuous Emotion Detection [129], with 36.5 citations per year. The study ranked third was Deep learning analysis of mobile physiological, environmental, and location sensor data for emotion detection [131], with 34 citations per year on average.  To get a broader view of the contributors in the research field, who scientifically promote HCII and IUI topics, we have observed the country of the author's affiliation. As shown in Table 9, most of the authors active in HCII and IUI field are primarily located in China (28%), followed by the USA (13%), India (10%), United Kingdom (7%), Germany (5%), Japan (4%) and South Korea (4%). We observe that most of the field research is conducted by researchers from Asia (leading with China, India, Japan, and South Korea), Northern America, Europe (leading with UK, Germany, Italy and followed by Portugal, Spain, France, and Belgium), and Australia.

The Research Space of the Literature within the Field of HCII in the Last Decade
To address the RQ2.1, we categorized the primary literature by the type of research as quantitative, qualitative, or mixed. Qualitative research is broadly accepted in the HCI and IUI field, as the research goal can be focused on understanding the subjectivity, not measuring and manipulating the objective data. Nevertheless, as shown in Figure 7, most of the analyzed research (72% of studies) from HCII and IUI fields is quantitative. Qualitative and mixed research is used less often, in 16% and 11% of analyzed papers, respectively. The trend of prevailing quantitative research type is consistent over the years, with most of the primary research being quantitative in all of the observed years in the last decade. However, we noticed a slight increase in studies using mixed research types in recent years. measuring and manipulating the objective data. Nevertheless, as shown in Figure 7, most of the analyzed research (72% of studies) from HCII and IUI fields is quantitative. Qualitative and mixed research is used less often, in 16% and 11% of analyzed papers, respectively. The trend of prevailing quantitative research type is consistent over the years, with most of the primary research being quantitative in all of the observed years in the last decade. However, we noticed a slight increase in studies using mixed research types in recent years. Research type classification per publication year is visualized in Figure 8. As illustrated, the vast majority (91.5%) of the observed primary studies were categorized as validation research. Further 5.7% of included studies were categorized as evaluation research proposals, and 2.8% were solution proposals. The results indicate a strong trend of researchers in the HCII and IUI field publishing finalized ideas and solutions.
To further analyze how the research is conducted in the HCII and IUI studies, we noted the methodology used to validate or evaluate the proposed solutions in the primary studies. Results displayed in Figure 9 show that the most used method in observed studies is an experiment (observed in 387 articles), followed by a case study (57 studies). Only a few user studies (N = 2), mixed studies (N = 1), and exploratory studies (N = 2) were found in our sample, all of which have been conducted in the last two years. In the last decade, five survey studies have been performed, evenly spread through the observed years.  Research type classification per publication year is visualized in Figure 8. As illustrated, the vast majority (91.5%) of the observed primary studies were categorized as validation research. Further 5.7% of included studies were categorized as evaluation research proposals, and 2.8% were solution proposals. The results indicate a strong trend of researchers in the HCII and IUI field publishing finalized ideas and solutions. Research type classification per publication year is visualized in Figure 8. As illustrated, the vast majority (91.5%) of the observed primary studies were categorized as validation research. Further 5.7% of included studies were categorized as evaluation research proposals, and 2.8% were solution proposals. The results indicate a strong trend of researchers in the HCII and IUI field publishing finalized ideas and solutions.
To further analyze how the research is conducted in the HCII and IUI studies, we noted the methodology used to validate or evaluate the proposed solutions in the primary studies. Results displayed in Figure 9 show that the most used method in observed studies is an experiment (observed in 387 articles), followed by a case study (57 studies). Only a few user studies (N = 2), mixed studies (N = 1), and exploratory studies (N = 2) were found in our sample, all of which have been conducted in the last two years. In the last decade, five survey studies have been performed, evenly spread through the observed years.  To further analyze how the research is conducted in the HCII and IUI studies, we noted the methodology used to validate or evaluate the proposed solutions in the primary studies. Results displayed in Figure 9 show that the most used method in observed studies is an experiment (observed in 387 articles), followed by a case study (57 studies). Only a few user studies (N = 2), mixed studies (N = 1), and exploratory studies (N = 2) were found in our sample, all of which have been conducted in the last two years. In the last decade, five survey studies have been performed, evenly spread through the observed years. To achieve a clear signal of the user's emotional state, HCII and IUI solutions use different methods for collecting the data. An overview of data collection methods used in observed research is visualized in Figure 10. The most popular methods for data collection in the last decade have been measurement with sensors (167 studies) and database data analysis (155 studies). The prevalent use of these two methods can be noted in most of the observed years, except for 2014 and 2016, when slightly more studies used user experiments and prototype development for data collection. User experiments or observing To achieve a clear signal of the user's emotional state, HCII and IUI solutions use different methods for collecting the data. An overview of data collection methods used in observed research is visualized in Figure 10. The most popular methods for data collection in the last decade have been measurement with sensors (167 studies) and database data analysis (155 studies). The prevalent use of these two methods can be noted in most of the observed years, except for 2014 and 2016, when slightly more studies used user experiments and prototype development for data collection. User experiments or observing studies and prototype development have been used in 72 and 47 observed studies, respectively, while other methods have overall been used more seldom; simulation has been used for data collection in six studies, questionnaire, or interview in five studies, and text processing in two studies. To achieve a clear signal of the user's emotional state, HCII and IUI solutions use different methods for collecting the data. An overview of data collection methods used in observed research is visualized in Figure 10. The most popular methods for data collection in the last decade have been measurement with sensors (167 studies) and database data analysis (155 studies). The prevalent use of these two methods can be noted in most of the observed years, except for 2014 and 2016, when slightly more studies used user experiments and prototype development for data collection. User experiments or observing studies and prototype development have been used in 72 and 47 observed studies, respectively, while other methods have overall been used more seldom; simulation has been used for data collection in six studies, questionnaire, or interview in five studies, and text processing in two studies. The observed research point is multi-disciplinary, interchanging the ideas from different areas. As visualized in Figure 11, most of the primary research articles (263 articles) were conducted from intelligent interaction, with the trend continuing through the last decade. A further 75 studies were written from a human-computer interaction (HCI) standpoint and a further 36 from an intelligent UI standpoint. HCII and IUI research field was less frequently explored from the standpoint of adaptive UI (28 studies), human-machine interaction (HMI) (28 studies), brain-computer interface (BCI), and accessible UI (6 studies). Nevertheless, we can observe a slight increase of research articles written from the standpoint of human-machine interaction (HMI) and intelligent UI since 2018. The observed research point is multi-disciplinary, interchanging the ideas from different areas. As visualized in Figure 11, most of the primary research articles (263 articles) were conducted from intelligent interaction, with the trend continuing through the last decade. A further 75 studies were written from a human-computer interaction (HCI) standpoint and a further 36 from an intelligent UI standpoint. HCII and IUI research field was less frequently explored from the standpoint of adaptive UI (28 studies), human-machine interaction (HMI) (28 studies), brain-computer interface (BCI), and accessible UI (6 studies). Nevertheless, we can observe a slight increase of research articles written from the standpoint of human-machine interaction (HMI) and intelligent UI since 2018.        Figure 13 shows the correlation between the main aim of intelligent recognition and the standpoint of the research. As we have already discussed in previous sections, most of the analyzed research has been conducted from an intelligent interaction standpoint. As shown in Figure 13, this research is mostly focused on emotion recognition (96 primary studies), gesture recognition (44 studies), and facial expression recognition (32 studies). A similar trend in recognition focus can be observed in papers from other standpoints as well. Overall, analyzed studies from the HCII and IUI fields have primarily focused on intelligent recognition of emotion (129 studies), gestures (87 studies), and facial expressions (45 studies). A slightly different trend can be visible in the research from an intelligent UI and adaptive UI standpoint, where we can observe a focus on behavior recognition along with the previously mentioned recognition trends.  Figure 13 shows the correlation between the main aim of intelligent recognition and the standpoint of the research. As we have already discussed in previous sections, most of the analyzed research has been conducted from an intelligent interaction standpoint. As shown in Figure 13, this research is mostly focused on emotion recognition (96 primary studies), gesture recognition (44 studies), and facial expression recognition (32 studies). A similar trend in recognition focus can be observed in papers from other standpoints as well. Overall, analyzed studies from the HCII and IUI fields have primarily focused on intelligent recognition of emotion (129 studies), gestures (87 studies), and facial expressions (45 studies). A slightly different trend can be visible in the research from an intelligent UI and adaptive UI standpoint, where we can observe a focus on behavior recognition along with the previously mentioned recognition trends.

Sensors Technology and Intelligent Methods in the Development and Evaluation of HCII Solutions
Visualization of analyzed studies based on the primary data source used for evaluating the proposed solutions of the HCII and the main aim of the intelligent recognition is presented in Figure 14. The use of sensor data (165 studies), images (56 studies), and use of multiple sources (51 studies) are overall the most used data sources. However, we can observe that studies focusing on intelligent speech recognition most commonly use voice and speech (five studies), audio (four studies), or database as data sources. Studies focused on eye movement and gaze recognition most often use eye gaze as the data source, while studies focused on emotion recognition mostly use video sources. The difference in used data sources can also be observed in studies focused on the recognition of depression. They most often use audiovisual information, as the data source and in papers focused on behavior recognition, which most commonly use behavior as the data source. As visualized in Figure 14, the research interest in the HCII solutions is highly focused on emotion recognition based on multi-source or sensor data and gesture recognition based on the same data sources. Although some combination of data sources and aim of intelligent recognition is nonsensical, we still observe some research opportunities, especially in using different data sources to recognize human affection, attention, depression, and sign language. Visualization of analyzed studies based on the primary data source used for evaluating the proposed solutions of the HCII and the main aim of the intelligent recognition is presented in Figure 14. The use of sensor data (165 studies), images (56 studies), and use of multiple sources (51 studies) are overall the most used data sources. However, we can observe that studies focusing on intelligent speech recognition most commonly use voice and speech (five studies), audio (four studies), or database as data sources. Studies focused on eye movement and gaze recognition most often use eye gaze as the data source, while studies focused on emotion recognition mostly use video sources. The papers focused on behavior recognition, which most commonly use behavior as the data source. As visualized in Figure 14, the research interest in the HCII solutions is highly focused on emotion recognition based on multi-source or sensor data and gesture recognition based on the same data sources. Although some combination of data sources and aim of intelligent recognition is nonsensical, we still observe some research opportunities, especially in using different data sources to recognize human affection, attention, depression, and sign language.     (21 studies). Obtained data was in most cases used for the recognition of emotion (54 studies) and gesture recognition (54 studies). We can observe some research possibilities in using biometric/body sensors as data sources. They were only used in five studies (2% of all observed studies used sensors to obtain data). Seldom use of touchframe/touch screen sensors (six studies) for data gathering was also surprising due to their affordability and overall widespread use. Some combinations of observed sensors and recognition solutions in Figure 15 are illogical (e.g., an accelerometer was not used for eye blinks and movement recognition) and do not indicate research gaps in the observed field. possibilities in using biometric/body sensors as data sources. They were only used in five studies (2% of all observed studies used sensors to obtain data). Seldom use of touchframe/touch screen sensors (six studies) for data gathering was also surprising due to their affordability and overall widespread use. Some combinations of observed sensors and recognition solutions in Figure 15 are illogical (e.g., an accelerometer was not used for eye blinks and movement recognition) and do not indicate research gaps in the observed field. Note that some studies used multiple different methods of artificial intelligence. Therefore, the sum of used methods is larger than the number of included studies (N = 556). It is visible that the most used AI methods are deeplearning algorithms (178 studies) and instance-based algorithms (123 studies). Further observation of methods used concerning research standpoint shows that deep-learning algorithms are most commonly used without regard of the research standpoint. An exception was observed for adaptive UI, where researchers most commonly used the probabilistic model (35% studies from the adaptive UI standpoint), and for intelligent UI, where To address RQ3.4, AI methods used in the analyzed research were visualized in Figure 16 regarding the research standpoint. Note that some studies used multiple different methods of artificial intelligence. Therefore, the sum of used methods is larger than the number of included studies (N = 556). It is visible that the most used AI methods are deep-learning algorithms (178 studies) and instance-based algorithms (123 studies). Further observation of methods used concerning research standpoint shows that deeplearning algorithms are most commonly used without regard of the research standpoint. An exception was observed for adaptive UI, where researchers most commonly used the probabilistic model (35% studies from the adaptive UI standpoint), and for intelligent UI, where ensemble algorithm was used in 50% of studies. Widespread use of deep learning could be attributed to its effective extraction of robust and salient features from various datasets.  To further investigate the use of AI in the HCII and IUI field, an analysis of the used artificial intelligence method concerning the HCI recognition's aim is visualized in Figure  17. As expected, deep-learning algorithms were widely used in most categories of recognition. However, instance-based algorithms were most widely used with the aim of human/body motion recognition, human activity, gesture, depression, and behavior recognition. The use of artificial neural networks should also be mentioned (included in 47 studies), as it was successfully used in gesture (eight studies) and emotion recognition (13 studies). Further, 41 studies have used probabilistic models for the same aim of recognition.
The visualization of used AI method concerning used data type in Figure 18 reveals some interesting practical implications for using different AI methods for various data types. As previously observed, the use of sensor data are most common in the HCII and IUI solutions (used in 174 studies), with the most common AI methods for its processing being instance-based algorithms (used in 50 studies) and deep-learning algorithms (40 studies). Deep-learning algorithms are widely used to process audiovisual information (13 studies) and database data (19 studies). Instance-based algorithms are also commonly used to process audio (seven studies) and image data (13 studies). In contrast to this trend, behavior data were in most cases processed with probabilistic models (six studies) and ensemble algorithm (five studies), while the eye gaze data were most widely analyzed with the help of other AI methods. Note that all the AI methods used in the studies are presented in this visualization. Therefore, the total sum of instances (N = 151) in Figure 18 is higher than the total number of analyzed primary studies, in which this metric could be observed (N = 454). Most of the primary studies (315 studies) used a single AI method, To further investigate the use of AI in the HCII and IUI field, an analysis of the used artificial intelligence method concerning the HCI recognition's aim is visualized in Figure 17. As expected, deep-learning algorithms were widely used in most categories of recognition. However, instance-based algorithms were most widely used with the aim of human/body motion recognition, human activity, gesture, depression, and behavior recognition. The use of artificial neural networks should also be mentioned (included in 47 studies), as it was successfully used in gesture (eight studies) and emotion recognition (13 studies). Further, 41 studies have used probabilistic models for the same aim of recognition.   The visualization of used AI method concerning used data type in Figure 18 reveals some interesting practical implications for using different AI methods for various data types. As previously observed, the use of sensor data are most common in the HCII and IUI solutions (used in 174 studies), with the most common AI methods for its processing being instance-based algorithms (used in 50 studies) and deep-learning algorithms (40 studies). Deep-learning algorithms are widely used to process audiovisual information (13 studies) and database data (19 studies). Instance-based algorithms are also commonly used to process audio (seven studies) and image data (13 studies). In contrast to this trend, behavior data were in most cases processed with probabilistic models (six studies) and ensemble algorithm (five studies), while the eye gaze data were most widely analyzed with the help of other AI methods. Note that all the AI methods used in the studies are presented in this visualization. Therefore, the total sum of instances (N = 151) in Figure 18 is higher than the total number of analyzed primary studies, in which this metric could be observed (N = 454). Most of the primary studies (315 studies) used a single AI method, further 97 of studies used two methods of artificial intelligence in their research, 26 studies used three different AI methods, 13 studies used four and a further four of the observed primary studies used six different AI methods. Further analysis of AI method types used in HCII and IUI research concerning the sensors used for data collection is presented in Figure 19. Instance-based algorithms were used in the majority of the studies (N = 73). The method was most widely used in the research using data obtained from EEG sensors (24 studies), EMG sensors (10 studies), ambient sensors (six studies), wearable sensors (seven studies) and Leap motion (three sensors). Deep-learning algorithms were also widely used (54 papers). The method was dominantly used in studies with data acquired using cameras (18 papers), EEG sensor (15 studies), multisensors (nine studies), and biometric/body sensors (four studies). In contrast, an artificial neural network was most commonly used in the studies, using cameras (four studies), EEG (four studies), Kinect (four studies), ambient sensors (three studies), and Leap motion (three studies). Further analysis of AI method types used in HCII and IUI research concerning the sensors used for data collection is presented in Figure 19. Instance-based algorithms were used in the majority of the studies (N = 73). The method was most widely used in the research using data obtained from EEG sensors (24 studies), EMG sensors (10 studies), ambient sensors (six studies), wearable sensors (seven studies) and Leap motion (three sensors). Deep-learning algorithms were also widely used (54 papers). The method was dominantly used in studies with data acquired using cameras (18 papers), EEG sensor (15 studies), multisensors (nine studies), and biometric/body sensors (four studies). In contrast, an artificial neural network was most commonly used in the studies, using cameras (four studies), EEG (four studies), Kinect (four studies), ambient sensors (three studies), and Leap motion (three studies). To provide an in-depth overview of the HCII field, a systematic map including the aim of HCI recognition and used AI methods and/or algorithms is presented in Figure 20. Visualization includes only studies where both the aim of HCI recognition and AI methods and/or algorithms could be classified (N = 421). SVM is the most widely used algorithm for various kinds of recognition (used in 87 studies), primarily emotion recognition (36 studies), facial expression recognition (14 studies), and gesture recognition (13 studies). We can also report the widespread use of CNN, which was used in 62 of the observed primary studies, mainly for emotion recognition (19 studies), facial recognition (13 studies), and gesture recognition (10 studies). As mentioned before, emotion recognition was the aim of almost a quarter of observed studies (141 studies), followed by studies working on gesture recognition (65 studies) and facial expression recognition (59 studies). Researchers have used a wide range of AI methods and algorithms for emotion recognition. Of all the observed methods and algorithms, only classification algorithms and dynamic time warping have not been used in the primary research with the aim of emotion recognition. However, the challenges of emotion recognition were most often approached with SVM, CNN, LSTM, and deep neural network (DNN). Studies focused on gesture and facial expression recognition have covered most of the observed AI approaches except for ANN, ensemble classification, FL, GMM, GP, HMM, LDA, and ML. The most used method in studies of facial expression recognition was CNN and SVM. Analyzed HCII research with the aim of gesture recognition has not covered BN, classification algorithm, CNN, deep ML, GP, LoR, and RF. In terms of the mapping results presented in Figure 19, it can be observed that most research, divided by the aim of HCI recognition, cuts across different AI methods and algorithms. For example, HCI research focused on recognizing stress was less researched (seven studies) though analyzed with the classification algorithm, NN, k-NN, ML, RF, and SVM. To provide an in-depth overview of the HCII field, a systematic map including the aim of HCI recognition and used AI methods and/or algorithms is presented in Figure 20. Visualization includes only studies where both the aim of HCI recognition and AI methods and/or algorithms could be classified (N = 421). SVM is the most widely used algorithm for various kinds of recognition (used in 87 studies), primarily emotion recognition (36 studies), facial expression recognition (14 studies), and gesture recognition (13 studies). We can also report the widespread use of CNN, which was used in 62 of the observed primary studies, mainly for emotion recognition (19 studies), facial recognition (13 studies), and gesture recognition (10 studies). As mentioned before, emotion recognition was the aim of almost a quarter of observed studies (141 studies), followed by studies working on gesture recognition (65 studies) and facial expression recognition (59 studies). Researchers have used a wide range of AI methods and algorithms for emotion recognition. Of all the observed methods and algorithms, only classification algorithms and dynamic time warping have not been used in the primary research with the aim of emotion recognition. However, the challenges of emotion recognition were most often approached with SVM, CNN, LSTM, and deep neural network (DNN). Studies focused on gesture and facial expression recognition have covered most of the observed AI approaches except for ANN, ensemble classification, FL, GMM, GP, HMM, LDA, and ML. The most used method in studies of facial expression recognition was CNN and SVM. Analyzed HCII research with the aim of gesture recognition has not covered BN, classification algorithm, CNN, deep ML, GP, LoR, and RF. In terms of the mapping results presented in Figure 19, it can be observed that most research, divided by the aim of HCI recognition, cuts across different AI methods and algorithms. For example, HCI research focused on recognizing stress was less researched (seven studies) though analyzed with the classification algorithm, NN, k-NN, ML, RF, and SVM.

Discussion
While the existing body of research proves there is already a vast plethora of HCIrelated research, research in the field of HCII and IUI has continued to grow even more in the last decade. However, there is confusion present among research classification and often misinterpretation of what is an intelligent user interface, what intelligent HCI is, and how to classify them. Therefore, in this paper, we present the results from a systematic mapping study in HCII and intelligent IUI, including a total of 454 articles, aiming to classify current research trends in the field. This paper summarized insights into the state of the art with the aims of (1) understanding the research standpoints present in this multidisciplinary field, (2) highlighting open issues that need to be filled by future research, and (3) identifying the use of AI methods and algorithms for categorized HCII solutions, aiming to achieve recognition from different data sources.

Discussion
While the existing body of research proves there is already a vast plethora of HCIrelated research, research in the field of HCII and IUI has continued to grow even more in the last decade. However, there is confusion present among research classification and often misinterpretation of what is an intelligent user interface, what intelligent HCI is, and how to classify them. Therefore, in this paper, we present the results from a systematic mapping study in HCII and intelligent IUI, including a total of 454 articles, aiming to classify current research trends in the field. This paper summarized insights into the state of the art with the aims of (1) understanding the research standpoints present in this multidisciplinary field, (2) highlighting open issues that need to be filled by future research, and (3) identifying the use of AI methods and algorithms for categorized HCII solutions, aiming to achieve recognition from different data sources.
Our study highlights how researchers currently aim to develop innovative contributions in terms of type and phase of research, the aim of the recognition, used data sources and used AI methods and algorithms. Our study identified existing research from the HCII and IUI fields and investigated trends related to sensors technology, AI methods, or algorithms. Based on the analysis of the existing HCII in IUI-related research, this study provided theoretical and practical implications, an overview of the most used methods and algorithms in different HCII solutions, and under-researched topics.

Theoretical and Practical Implications
Based on the data from 454 research papers published between 2010 and 2021, several theoretical and practical implications were identified, affecting the course of our future research, and possibly the study focus of other researchers. Numerous research papers claim that intelligent solutions and sensors are becoming progressively crucial for users in their everyday lives, facilitating a higher level of living in the domain of health, entertainment, economics, and general smart living. Literature suggests that AI and IUIs will be part of our daily lives. Our needs will heavily depend on them, creating a necessity to equip the user interfaces with human communication skills and enable natural interaction between the computer and a human to the highest possible extent. Therefore, future efforts should be invested in improving emotion, gesture, and facial recognition of intelligent interfaces, and in HCI in general.
Theoretical implications interesting for researchers are therefore recognized in research gaps illustrated in this study. Some topics, such as accessible UI and sign language recognition, have received little attention and should be considered for future research possibilities. Topics on the brain-computer interface are comparatively still sparsely covered, though they have been gaining popularity in the last decade, with the number of associated studies rising each year. With the detailed overview of the research standpoints, we have illustrated their overlap in selected topics, which could hopefully spark collaboration initiatives from active researchers with various research standpoints that this interdisciplinary field joins. This mapping study confirms that the HCII and IUI field is an emerging field, as the latest few years have been the most productive in the last decade. Our overview also points out the methodological gap in qualitative research studies, which are otherwise broadly accepted in HCI. There appears to be a lack of focus (partly covered by little experiments) on how the end-users accept the suggested intelligent solutions outside of the controlled environment. Lack of validation proposals can also be observed in the low total number of conducted user studies (n = 2). As great efforts have been made to the technical advances of the intelligent solutions, our study also serves to remind the lack of philosophical studies focused on the ethical limitations and aspects future HCII solutions ought to consider.
Practical implications of this research result in several recommendations and support the theoretical implications. The most cited papers focused on emotion and stress recognition, creating the central field of study, largely present in other related studies. In addition, intelligent interaction is frequently connected with emotion recognition, followed by gesture recognition. Emotion and gesture recognition is the most comprehensive data source in sensors, mostly retrieved by EEG and Kinect, creating the foreground of HCI research.
Since quantitative research is more prevalent than qualitative, a lack of research in this field creates a window of opportunity for researchers, especially looking for advancements for users with special needs, focusing on progress inaccessible and adaptive UI. The sensors used in the research mainly were applied to identify gestures and emotion, followed by using existing images to identify and understand facial expressions, again creating the most exciting field of research and directing the course of HCII research in the future.
In HCI publishing, there is a trend of research presentation in conferences and less in journals, indicating several general presentations of HCI solutions, lacking the in-depth (and time-consuming) analysis typical for journal papers. The decline in the number of research after 2017 is arguably not consistent with fast progress achieved in HCI development, creating an opportunity to conduct additional research in the field of the newest HCI application. Practitioners can benefit from our overview of the used AI methods and algorithms, already associated with various aims of HCI recognition, which are presented in Section 4.3. The findings on the connection of the associated sensor types and HCI recognition and associated sensor types and AI methods are also aimed to provide practical benefit in the development process of future HCII solutions, as they offer an overview of tried and tested combinations.

Limitations
Investigation of the sensor's technology and AI methods and algorithms for HCII comes with a set of limitations. Although several digital libraries were used and three researchers were involved in the paper analysis, it is still possible that not all articles were identified and consequently included in the group of selected papers after inclusion and exclusion criteria. Furthermore, the used keywords in research papers varied greatly and consequently covered various multi-disciplinary research. Due to the un-uniform set of used keywords, the probability that articles addressing HCI or HCII topics, but not including the keywords used for the literature search, was not included in our analysis.
The time scope of research was set after the year 2010. Although not much relevant research was conducted before the set period (based on our research), there could possibly exist fundamental papers with significant insight into the beginnings of HCII, but these were not included in our research.
Articles, which were in the reviewing or publishing process when writing this manuscript, were not yet accessible in the used digital libraries and were therefore not included in the research. There is a possibility that after the conclusion of this paper, new important articles will emerge in 2021.
A high number of identified potential papers was manually evaluated as appropriate or not appropriate for further use. The time restriction of a few months created the likelihood that not all relevant articles identified through the database searches (N = 5642) met the criteria set in the full screening process and resulting in 454 papers. Few important papers are possible to be mistakenly omitted from the final paper group.
This study focuses on intelligent HCII and IUI and cannot be generalized to other domains, such as the general HCI. The exclusion of acronyms of the selected terms in the search query (HCII and IUI) could also be recognized as a limitation. This study represents an overview of the observed field and does not aim to contribute to the debate of what solutions are deemed intelligent in the HCII and IUI fields.

Threats to Validity
It is possible that not all relevant primary studies were covered in this study. Our work was guided by inclusion and exclusion criteria within the scope of the reported search string; therefore, it is possible that some articles were automatically excluded in the process. Some relevant studies were published after our literature search phase and are not included in our review. There is a possibility that the research gaps with low publication coverage we have identified have been covered in other sources, and our work is not a relevant summary of the whole population of relevant studies. However, we have examined similar studies and have explored all relevant databases to lessen these threats to gather a good and unbiased population representation. We believe the sample size is adequate and assures reasonable validity of the results.
Data extraction and classification of primary studies have been challenging due to the interdisciplinary characteristics of the field. Some level of subjectivity is implied in classification; therefore, it is possible that if the study is repeated, minor variations could be expected. For this reason, we have attached the classification of studies for selected variables in the Supplementary Materials. To minimize the threats to validity, primary studies were classified by one researcher. This might inherently introduce some bias, but we felt that the resulting classification would be more consistent this way. A predefined classification scheme was used to lower the risks to internal validity related to the data classification phase, and the authors decided to discuss the problematic papers.

Conclusions
This study contributes to the body of research in the multi-disciplinary field of intelligent methods development in HCI and HCII, CS, and AI fields. Our work presents a systematic mapping study of 454 papers, retrieved from several digital libraries and application of the inclusion and exclusion criteria, in order to fulfill the objectives of the study: (1) what have been the trends and demographics of the literature within the field of HCII, (2) what has been the research space of the literature within the field of HCII, and (3) what sensors technology and intelligent methods have been used in the development and evaluation of solutions for HCII.
Our analysis revealed the following results. HCI and HCII solutions are becoming more and more popular and bring several conveniences in people's daily lives. There is an extensively large body of research, with the peak reached in 2017.
The observed research point is multi-disciplinary, interchanging the ideas from different areas. Most of the analyzed research is quantitative. The most used method in observed studies is an experiment. The most popular methods for data collection are measurement with sensors and database data analysis. The most widely used sensor types in the observed primary research were a camera, EEG sensor, and Kinect, followed by wearable sensors, eye trackers, and others. The most used AI methods are deep-learning algorithms (widely used in various types of recognition) and instance-based algorithms, commonly used with the aim of human/body motion recognition, human activity, gesture, depression, and behavior recognition. The use of artificial neural networks was also identified, successfully used in gesture and emotion recognition.
In the future, we envision the following opportunities: There is a lack of philosophical papers dealing with the ethical aspect of intelligent recognition from various data sources, which is currently technically solved. The additional future work opportunity is increasing the inclusion of research in accessibility and using several different data sources to enable effective and efficient HCI for users with disabilities. We observed a lack in use-case studies and qualitative research, which for the HCII field present opportunity for future research, especially for studies evaluating the usability of proposed solutions and the user's experience.
Author Contributions: This paper is a collaborative work by all the authors. B.Š. proposed the idea, designed the review process, searched the literature, reviewed the literature, extracted data, conducted the data analysis, and wrote the manuscript. S.B. reviewed the literature, conducted data analysis, and revised the first draft. M.P. performed the literature review, extracted data, prepared the data for analysis, and revised the first draft. All authors proofread the paper. All authors have read and agreed to the published version of the manuscript.