Special Issue "Human Computer Interaction for Intelligent Systems"

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 31 July 2021.

Special Issue Editors

Dr. Matus Pleva
Guest Editor
Department of Electronics and Multimedia Communications, Technical University of Košice, Košice, Slovakia
Interests: speech processing; human–robot interaction; automatic broadcast news processing; multimodal applications; information security and biometrics
Prof. Dr. Yuan-Fu Liao
Guest Editor
Department of Electronic Engineering, National Taipei University of Technology, Taipei, Taiwan
Interests: speech signal processing; audio signal processing; natural language processing; machine learning
Prof. Dr. Patrick Bours
Guest Editor
Norwegian University of Science and Technology, Gjøvik, Oppland, Norway
Interests: keystroke dynamics; behavioral biometrics; gait recognition; information security; continuous authentication; soft biometrics

Special Issue Information

Dear Colleagues,

The further development of human–computer interaction applications is still in great demand as users expect more natural interactions. For example, speech communication in many languages is expected as a basic feature for intelligent systems like robotic systems, autonomous vehicles, or virtual assistants. For this Special Issue, we invite submissions from researchers addressing the unique opportunities and challenges associated with human–computer interaction with intelligent systems. We encourage authors to submit reports describing systems built for different languages and multilingual systems. We also invite submissions from researchers studying the linguistic, emotional, prosodic, and dialogue aspects of speech communication. We welcome submissions describing other input and output modalities, including multimodal systems, fusion/fission algorithms, and deep learning methods. We encourage authors to report in detail their state-of-the-art results and the data used to build such systems, to support development in those areas. The security of advanced communication channels is also important. Modern biometric technologies, including physical and behavioral analysis, may be proposed and evaluated for different interface modalities and applications. The rapidly growing domain of virtual reality applications is of interest both as an application domain in which new interfaces and methods of interaction are needed and as a potential testbed for evaluating speech and other interface modalities.

This Special Issue aims to cover recent advances in aspects of Human–Computer Interaction for Intelligent Systems, including theory, tools, applications, testbeds, human factors studies, and field deployments. Reviews and surveys of the state of the art in HCI for Intelligent Systems are also welcomed.

Topics of interest to this Special Issue include:

human–robot interaction;
interaction in virtual/augmented reality;
multilingual speech processing;
multimodal HCI;
deep learning in HCI/IS;
EEG in HCI;
biometrics in HCI;
human factors of HCI;
speech recognition;
speech synthesis;
natural language processing;
linguistics;
anticipation in speech;
emotion and mood analysis;
prosody and phonetics;
microphone arrays;
accessible computing.

However, please do not feel limited by these topics; we will consider submissions in any area of HCI for Intelligent Systems.

Dr. Matus Pleva
Prof. Dr. Yuan-Fu Liao
Prof. Dr. Patrick Bours
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (11 papers)

Research

Article
Posting Recommendations in Healthcare Q&A Forums
Electronics 2021, 10(3), 278; https://doi.org/10.3390/electronics10030278 - 25 Jan 2021
Viewed by 420
Abstract
Online Q&A forums, unlike search engines, allow posting of various types of queries, thus attracting users to seek information and solve problems in specific domains. However, as insufficient knowledge leads to incomprehensible queries, unsuitable responses are common. We develop posting recommendation systems (RSs) to support users in composing reasonable posts and receiving effective answers. The posting RSs were evaluated in a user study with 27 participants and three tasks to examine whether users engaged more in the question generation process. Two medical experts were recruited to verify whether professionals can understand and answer posts supported by RSs. The results show that the proposed mechanism enables askers to produce posts with better understandability, which leads experts to devote more attention to answering their questions.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Sentiment Level Evaluation of 3D Handicraft Products Application for Smartphones Usage
Electronics 2021, 10(2), 199; https://doi.org/10.3390/electronics10020199 - 16 Jan 2021
Viewed by 590
Abstract
Three-dimensional (3D) technology has attracted users’ attention because it creates objects that can interact with a given product in a system. Nowadays, Thailand’s government encourages sustainability projects through advertising, trade shows and information systems for small rural entrepreneurship. However, the government’s systems do not include virtual products with a 3D display. The objective of this study was four-fold: (1) develop a prototype of a 3D handicraft product application for smartphones; (2) create an online questionnaire to collect user usage assessment data in terms of five sentiment levels (strongly negative, negative, neutral, positive and strongly positive) in response to the usage of the proposed 3D application; (3) evaluate users’ sentiment level in 3D handicraft product application usage; and (4) investigate whether the proposed 3D handicraft product application attracts users’ attention to handicraft products after use. The results indicate that 78.87% of participants expressed positive or strongly positive sentiment toward accepting the 3D handicraft product application, and that 79.61% of participants expressed positive or strongly positive sentiment in the assessment of attention paid to the handicraft products. The participants’ evaluation results in this study prove that our proposed 3D handicraft product application affected users by attracting their attention towards handicraft products.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Research of HRV as a Measure of Mental Workload in Human and Dual-Arm Robot Interaction
Electronics 2020, 9(12), 2174; https://doi.org/10.3390/electronics9122174 - 17 Dec 2020
Cited by 1 | Viewed by 503
Abstract
Robots work in unstructured environments in place of humans, expanding the scope of human work. The interactions between humans and robots are indirect, taking place through operating terminals. The operator’s mental workload increases with the lack of direct perception of the real scene. Thus, mental workload assessment is important, as it could help avoid serious accidents caused by mental overload. In this paper, the operating object is a dual-arm robot. The classification of the operator’s mental workload is studied using the heart rate variability (HRV) signal. First, two kinds of electrocardiogram (ECG) signals are collected from six subjects who performed tasks or maintained a relaxed state. Then, HRV data are obtained from the ECG signals and 20 kinds of HRV features are extracted. Last, six different classifiers are used for mental workload classification. Using each subject’s HRV signal to train the model, the subject’s mental workload is classified. An average classification accuracy of 98.77% is obtained using the K-Nearest Neighbor (KNN) method. By using the HRV signal of five subjects for training and that of one subject for testing with the Gentle Boost (GB) method, the highest average classification accuracy (80.56%) is obtained. This study has implications for the analysis of HRV signals characteristic of mental workload in different subjects, which could improve operators’ well-being and safety in the human-robot interaction process.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Emotion Analysis in Human–Robot Interaction
Electronics 2020, 9(11), 1761; https://doi.org/10.3390/electronics9111761 - 23 Oct 2020
Cited by 1 | Viewed by 815
Abstract
This paper connects two large research areas, namely sentiment analysis and human–robot interaction. Emotion analysis, as a subfield of sentiment analysis, explores text data and, based on the characteristics of the text and generally known emotional models, evaluates what emotion is presented in it. The analysis of emotions in human–robot interaction aims to evaluate the emotional state of the human being and, on this basis, to decide how the robot should adapt its behavior to the human being. There are several approaches and algorithms to detect emotions in text data. We decided to apply a method combining a dictionary approach with machine learning algorithms. As a result of the ambiguity and subjectivity of labeling emotions, it was possible to assign more than one emotion to a sentence; thus, we were dealing with a multi-label problem. Based on the overview of the problem, we performed experiments with the Naive Bayes, Support Vector Machine and Neural Network classifiers. Results obtained from classification were subsequently used in human–robot experiments. Despite the lower accuracy of emotion classification, we proved the importance of expressing emotion gestures based on the words we speak.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Self-Attentive Multi-Layer Aggregation with Feature Recalibration and Deep Length Normalization for Text-Independent Speaker Verification System
Electronics 2020, 9(10), 1706; https://doi.org/10.3390/electronics9101706 - 17 Oct 2020
Viewed by 535
Abstract
One of the most important parts of a text-independent speaker verification system is speaker embedding generation. Previous studies demonstrated that shortcut connection-based multi-layer aggregation improves the representational power of a speaker embedding system. However, the number of model parameters is relatively large, and unspecified variations increase in the multi-layer aggregation. Therefore, in this study, we propose a self-attentive multi-layer aggregation with feature recalibration and deep length normalization for a text-independent speaker verification system. To reduce the number of model parameters, we set the ResNet with the scaled channel width and layer depth as a baseline. To control the variability in the training, we apply a self-attention mechanism to perform multi-layer aggregation with dropout regularizations and batch normalizations. Subsequently, we apply a feature recalibration layer to the aggregated feature using fully-connected layers and nonlinear activation functions. Further, deep length normalization is used on a recalibrated feature in the training process. Experimental results using the VoxCeleb1 evaluation dataset showed that the performance of the proposed methods was comparable to that of state-of-the-art models (equal error rate of 4.95% and 2.86%, using the VoxCeleb1 and VoxCeleb2 training datasets, respectively).
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Lex-Pos Feature-Based Grammar Error Detection System for the English Language
Electronics 2020, 9(10), 1686; https://doi.org/10.3390/electronics9101686 - 14 Oct 2020
Cited by 1 | Viewed by 798
Abstract
This work focuses on designing a grammar detection system that understands both structural and contextual information of sentences for validating whether English sentences are grammatically correct. Most existing systems model a grammar detector by translating the sentences into sequences of either words appearing in the sentences or syntactic tags holding the grammar knowledge of the sentences. In this paper, we show that both these sequencing approaches have limitations. The former model is overly specific, whereas the latter model is overgeneralized, which in turn affects the performance of the grammar classifier. Therefore, the paper proposes a new sequencing approach that contains both linguistic and syntactic information about a sentence. We call this sequence a Lex-Pos sequence. The main objective of the paper is to demonstrate that the proposed Lex-Pos sequence has the potential to imbibe the specific nature of the linguistic words (i.e., lexicals) and generic structural characteristics of a sentence via Part-Of-Speech (POS) tags, and so can lead to a significant improvement in detecting grammar errors. Furthermore, the paper proposes a new vector representation technique, Word Embedding One-Hot Encoding (WEOE), to transform this Lex-Pos sequence into mathematical values. The paper also introduces a new error induction technique to artificially generate POS tag-specific incorrect sentences for training. The classifier is trained using two corpora of incorrect sentences, one with general errors and another with POS tag-specific errors. A Long Short-Term Memory (LSTM) neural network architecture has been employed to build the grammar classifier. The study conducts nine experiments to validate the strength of the Lex-Pos sequences. The Lex-Pos-based models are observed to be superior in two ways: (1) they give more accurate predictions; and (2) they are more stable, as smaller accuracy drops have been recorded from training to testing. To further prove the potential of the proposed Lex-Pos-based model, we compare it with some well-known existing studies.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Lexicon-based Sentiment Analysis Using the Particle Swarm Optimization
Electronics 2020, 9(8), 1317; https://doi.org/10.3390/electronics9081317 - 15 Aug 2020
Cited by 2 | Viewed by 931
Abstract
This work belongs to the field of sentiment analysis; in particular, to opinion and emotion classification using a lexicon-based approach. It solves several problems related to increasing the effectiveness of opinion classification. The first problem is related to lexicon labelling. Human labelling in the field of emotions is often too subjective and ambiguous, and so the possibility of replacement by automatic labelling is examined. This paper offers experimental results using a nature-inspired algorithm, particle swarm optimization, for labelling. This optimization method repeatedly labels all words in a lexicon and evaluates the effectiveness of opinion classification using the lexicon until the optimal labels for words in the lexicon are found. The second problem is that the opinion classification of texts which do not contain words from the lexicon cannot be successfully done using the lexicon-based approach. Therefore, an auxiliary approach, based on a machine learning method, is integrated into the method. This hybrid approach is able to classify more than 99% of texts and achieves better results than the original lexicon-based approach. The final hybrid model can be used for emotion analysis in human–robot interactions.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Using Augmented Reality and Internet of Things for Control and Monitoring of Mechatronic Devices
Electronics 2020, 9(8), 1272; https://doi.org/10.3390/electronics9081272 - 07 Aug 2020
Cited by 3 | Viewed by 1054
Abstract
At present, computer networks are no longer used to connect just personal computers. Smaller devices can connect to them even at the level of individual sensors and actuators. This trend is due to the development of modern microcontrollers and single-board computers which can be easily connected to the global Internet. The result is a new paradigm, the Internet of Things (IoT), as an integral part of Industry 4.0; without it, the vision of the fourth industrial revolution would not be possible. In the field of digital factories, it is a natural successor of machine-to-machine (M2M) communication. Presently, mechatronic systems in IoT networks are controlled and monitored via industrial HMI (human-machine interface) panels, console, web or mobile applications. These conventional control and monitoring methods may be fully satisfactory for smaller rooms: since the list of devices fits on one screen, we can monitor the status and control these devices almost immediately. However, in the case of several rooms or buildings, which is the case of digital factories, ordinary ways of interacting with mechatronic systems become cumbersome. In such cases, it is possible to apply advanced digital technologies such as extended (computer-generated) reality. Using these technologies, digital (computer-generated) objects can be inserted into the real world. The aim of this article is to describe the design and implementation of a new method for control and monitoring of mechatronic systems connected to the IoT network using a selected segment of extended reality to create an innovative form of HMI.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Article
Pediatric Speech Audiometry Web Application for Hearing Detection in the Home Environment
Electronics 2020, 9(6), 994; https://doi.org/10.3390/electronics9060994 - 13 Jun 2020
Cited by 2 | Viewed by 1513
Abstract
This paper describes the development of a speech audiometry application for pediatric patients in the Slovak language and the experience gained during testing with healthy children, hearing-impaired children, and elderly persons. The initial motivation behind the presented work was to reduce the stress and fear of children who must undergo postoperative audiometry, but over time, we changed our direction to a simple game-like mobile application for the detection of possible hearing problems of children in the home environment. Conditioned play audiometry principles were adopted to create a speech audiometry application, where children help the virtual robot Thomas assign words to pictures; this can be described as a speech recognition test. Several game scenarios, together with the setting condition issues, were created, tested, and discussed. First experiences show a positive influence on the children’s mood and motivation.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Review

Review
A Review on Speech Emotion Recognition Using Deep Learning and Attention Mechanism
Electronics 2021, 10(10), 1163; https://doi.org/10.3390/electronics10101163 - 13 May 2021
Viewed by 360
Abstract
Emotions are an integral part of human interactions and are significant factors in determining user satisfaction or customer opinion. Speech emotion recognition (SER) modules also play an important role in the development of human–computer interaction (HCI) applications. A tremendous number of SER systems have been developed over the last decades. Attention-based deep neural networks (DNNs) have been shown to be suitable tools for mining information that is unevenly distributed over time in multimedia content. The attention mechanism has recently been incorporated in DNN architectures to also emphasise emotionally salient information. This paper provides a review of the recent development in SER and also examines the impact of various attention mechanisms on SER performance. An overall comparison of the system accuracies is performed on the widely used IEMOCAP benchmark database.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)

Review
Survey of Automatic Spelling Correction
Electronics 2020, 9(10), 1670; https://doi.org/10.3390/electronics9101670 - 13 Oct 2020
Cited by 1 | Viewed by 1238
Abstract
Automatic spelling correction has been receiving sustained research attention. Although each article contains a brief introduction to the topic, there is a lack of work that would summarize the theoretical framework and provide an overview of the approaches developed so far. Our survey selected papers about spelling correction indexed in Scopus and Web of Science from 1991 to 2019. The surveyed automatic spelling correction systems fall into three groups: the first uses a set of rules designed in advance, the second uses an additional model of context, and the third can adapt its model to the given problem. The summary tables show the application area, language, string metrics, and context model for each system. The survey describes selected approaches in a common theoretical framework based on Shannon’s noisy channel. A separate section describes evaluation methods and benchmarks.
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)
