Topic Editors

Prof. Dr. Friedhelm Schwenker
Institute of Neural Information Processing, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
Prof. Dr. Mariofanna Milanova
Department of Computer Science, University of Arkansas at Little Rock, Little Rock, AR 72204, USA

Data Analytics and Machine Learning in Artificial Emotional Intelligence

Abstract submission deadline
closed (28 February 2022)
Manuscript submission deadline
closed (30 April 2022)
Viewed by
41756

Topic Information

Dear Colleagues,

Computers using Artificial Emotional Intelligence (AEI) technologies are able to determine the user’s emotional state and react appropriately. Such systems allow naturalistic communication between humans and machines. Nowadays, AEI technologies are based on methods from data analytics and machine learning, in particular learning in deep artificial neural networks from multimodal data, including videos of facial expressions, pose and gesture, voice, and bio-physiological data (such as eye movement, ECG, respiration, EEG, fMRI, and EMG). The aim of this interdisciplinary topic is to discuss the effectiveness of machine learning and data analytics methods in this special field of multimodal data processing.

We welcome manuscripts to this interdisciplinary topic from the following fields (but not limited to them):

  • Facial expression analysis
  • Classification of emotional gestures and poses
  • Speech-based emotion recognition
  • Biophysiological sensors in emotion recognition
  • Neural patterns of emotions in EEG and fMRI
  • Multimodal analysis and classification of emotions
  • Sentiment classification of texts
  • Sensor systems for emotion classification
  • Emotions in human-computer interaction for healthcare and well-being
  • Emotion-based human-computer interaction

Dr. Friedhelm Schwenker
Prof. Dr. Mariofanna Milanova
Topic Editors

Keywords

  • affective computing
  • affective databases
  • affective interaction
  • affective medicine
  • data analytics
  • deep neural networks
  • emotion recognition
  • feature extraction
  • human computer interaction
  • intelligent systems
  • machine learning
  • multimodal data fusion
  • sensor systems
  • sentiment analysis
  • signal processing

Participating Journals

Journal Name                             | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC
Applied System Innovation                | -             | 3.2       | 2018          | 13.1 Days               | 1400 CHF
Sensors                                  | 3.847         | 6.4       | 2001          | 15 Days                 | 2400 CHF
Algorithms                               | -             | 3.3       | 2008          | 17.6 Days               | 1600 CHF
AI                                       | -             | -         | 2020          | 25.1 Days               | 1200 CHF
Journal of Sensor and Actuator Networks  | -             | 6.9       | 2012          | 18.4 Days               | 1600 CHF

Preprints.org is a platform dedicated to making early versions of research outputs permanently available and citable. MDPI journals allow posting on preprint servers such as Preprints.org prior to publication. For more details about preprints, please visit https://www.preprints.org.

Published Papers (28 papers)

Article
Improving the Reader’s Attention and Focus through an AI-Driven Interactive and User-Aware Virtual Assistant for Handheld Devices
Appl. Syst. Innov. 2022, 5(5), 92; https://doi.org/10.3390/asi5050092 - 22 Sep 2022
Viewed by 825
Abstract
This paper describes the design and development of an AI-driven, interactive and user-aware virtual assistant aimed at helping users to focus their attention on reading or attending to other long-lasting visual tasks. The proposed approach uses computer vision and artificial intelligence to analyze the orientation of the head and the gaze of the user’s eyes to estimate the level of attention during the task, as well as administer effective and balanced stimuli to correct significant deviations. The stimuli are provided by a graphical character (i.e., the virtual assistant), which is able to emulate face expressions, generate spoken messages and produce deictic visual cues to better involve the user and establish an effective, natural and enjoyable experience. The described virtual assistant is based on a modular architecture that can be scaled to support a wide range of applications, from virtual and blended collaborative spaces to mobile devices. In particular, this paper focuses on an application designed to integrate seamlessly into tablets and e-book readers to provide its services in mobility and exactly when and where needed. Full article

Article
Use of Machine Learning for Early Detection of Knee Osteoarthritis and Quantifying Effectiveness of Treatment Using Force Platform
J. Sens. Actuator Netw. 2022, 11(3), 48; https://doi.org/10.3390/jsan11030048 - 23 Aug 2022
Viewed by 1078
Abstract
Knee osteoarthritis is one of the most prevalent chronic diseases. It leads to pain, stiffness, decreased participation in activities of daily living and problems with balance recognition. Force platforms have been one of the tools used to analyse balance in patients. However, identification in early stages and assessing the severity of osteoarthritis using parameters derived from a force plate are yet unexplored to the best of our knowledge. Combining artificial intelligence with medical knowledge can provide a faster and more accurate diagnosis. The aim of our study is to present a novel algorithm to classify the occurrence and severity of knee osteoarthritis based on the parameters derived from a force plate. Forty-four sway movements graphs were measured. The different machine learning algorithms, such as K-Nearest Neighbours, Logistic Regression, Gaussian Naive Bayes, Support Vector Machine, Decision Tree Classifier and Random Forest Classifier, were implemented on the dataset. The proposed method achieves 91% accuracy in detecting sway variation and would help the rehabilitation specialist to objectively identify the patient’s condition in the initial stage and educate the patient about disease progression. Full article

Article
ViTFER: Facial Emotion Recognition with Vision Transformers
Appl. Syst. Innov. 2022, 5(4), 80; https://doi.org/10.3390/asi5040080 - 15 Aug 2022
Cited by 1 | Viewed by 1519
Abstract
In several fields nowadays, automated emotion recognition has been shown to be a highly powerful tool. Mapping different facial expressions to their respective emotional states is the main objective of facial emotion recognition (FER). In this study, facial expression recognition (FER) was classified using the ResNet-18 model and transformers. This study examines the performance of the Vision Transformer in this task and contrasts our model with cutting-edge models on hybrid datasets. The pipeline and associated procedures for face detection, cropping, and feature extraction using the most recent deep learning model, fine-tuned transformer, are described in this study. The experimental findings demonstrate that our proposed emotion recognition system is capable of being successfully used in practical settings. Full article

Article
SASMOTE: A Self-Attention Oversampling Method for Imbalanced CSI Fingerprints in Indoor Positioning Systems
Sensors 2022, 22(15), 5677; https://doi.org/10.3390/s22155677 - 29 Jul 2022
Cited by 1 | Viewed by 617
Abstract
WiFi localization based on channel state information (CSI) fingerprints has become the mainstream method for indoor positioning due to the widespread deployment of WiFi networks, in which fingerprint database building is critical. However, issues, such as insufficient samples or missing data in the collection fingerprint database, result in unbalanced training data for the localization system during the construction of the CSI fingerprint database. To address the above issue, we propose a deep learning-based oversampling method, called Self-Attention Synthetic Minority Oversampling Technique (SASMOTE), for complementing the fingerprint database to improve localization accuracy. Specifically, a novel self-attention encoder-decoder is firstly designed to compress the original data dimensionality and extract rich features. The synthetic minority oversampling technique (SMOTE) is adopted to oversample minority class data to achieve data balance. In addition, we also construct the corresponding CSI fingerprinting dataset to train the model. Finally, extensive experiments are performed on different data to verify the performance of the proposed method. The results show that our SASMOTE method can effectively solve the data imbalance problem. Meanwhile, the improved location model, 1D-MobileNet, is tested on the balanced fingerprint database to further verify the excellent performance of our proposed methods. Full article
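The oversampling step named in this abstract can be illustrated with a minimal sketch of plain SMOTE interpolation. It does not reproduce the paper's self-attention encoder-decoder or its CSI data; the function name `smote`, the parameters `k` and `seed`, and the toy points are assumptions made for illustration only.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats), at least two of them.
    k: number of nearest minority neighbours considered (hypothetical default).
    """
    rng = random.Random(seed)

    def sqdist(p, q):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(p, q))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sqdist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # random position on the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square.
corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(corners, n_new=10)
```

Each synthetic point lies on a line segment between two existing minority samples, so the new data stay inside the region the minority class already occupies.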

Article
A Preliminary Study of Robust Speech Feature Extraction Based on Maximizing the Probability of States in Deep Acoustic Models
Appl. Syst. Innov. 2022, 5(4), 71; https://doi.org/10.3390/asi5040071 - 26 Jul 2022
Cited by 1 | Viewed by 905
Abstract
This study proposes a novel robust speech feature extraction technique to improve speech recognition performance in noisy environments. This novel method exploits the information provided by the original acoustic model in the automatic speech recognition (ASR) system to learn a deep neural network that converts the original speech features. This deep neural network is trained to maximize the posterior accuracy of the state sequences of acoustic models with respect to the speech feature sequences. Compared with the robustness methods that retrain or adapt acoustic models, the new method has the advantages of less computation load and faster training. In the experiments conducted on the medium-vocabulary TIMIT database and task, the presented method provides lower word error rates than the unprocessed baseline and speech-enhancement-based techniques. These results indicate that the presented method is promising and worth further developing. Full article

Article
Research on Mask-Wearing Detection Algorithm Based on Improved YOLOv5
Sensors 2022, 22(13), 4933; https://doi.org/10.3390/s22134933 - 29 Jun 2022
Cited by 4 | Viewed by 2062
Abstract
COVID-19 is highly contagious, and proper wearing of a mask can hinder the spread of the virus. However, complex factors in natural scenes, including occlusion, dense, and small-scale targets, frequently lead to target misdetection and missed detection. To address these issues, this paper proposes a YOLOv5-based mask-wearing detection algorithm, YOLOv5-CBD. Firstly, the Coordinate Attention mechanism is introduced into the feature fusion process to stress critical features and decrease the impact of redundant features after feature fusion. Then, the original feature pyramid network module in the feature fusion module was replaced with a weighted bidirectional feature pyramid network to achieve efficient bidirectional cross-scale connectivity and weighted feature fusion. Finally, we combined Distance Intersection over Union with Non-Maximum Suppression to improve the missed detection of overlapping targets. Experiments show that the average detection accuracy of the YOLOv5-CBD model is 96.7%—an improvement of 2.1% compared to the baseline model (YOLOv5). Full article
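As a rough illustration of the last step mentioned in this abstract, the sketch below combines the standard Distance-IoU definition (IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box) with greedy non-maximum suppression. It is not the YOLOv5-CBD implementation; the box format `(x1, y1, x2, y2)`, the function names, and the threshold of 0.5 are assumptions.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU from the overlap rectangle.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centres.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box exceeds threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


# Two heavily overlapping detections and one distant detection (toy values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores)
```

Because the centre-distance penalty lowers the score of boxes whose centres are far apart, two overlapping boxes that belong to different nearby targets are less likely to suppress each other than under plain IoU.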

Article
Emotion Recognition for Partial Faces Using a Feature Vector Technique
Sensors 2022, 22(12), 4633; https://doi.org/10.3390/s22124633 - 19 Jun 2022
Cited by 1 | Viewed by 1002
Abstract
Wearing a facial mask is indispensable in the COVID-19 pandemic; however, it has tremendous effects on the performance of existing facial emotion recognition approaches. In this paper, we propose a feature vector technique comprising three main steps to recognize emotions from images of masked faces. First, a synthetic mask is used to cover the facial input image. With only the upper part of the image showing, including only the eyes, eyebrows, a portion of the bridge of the nose, and the forehead, the boundary and regional representation technique is applied. Second, a feature extraction technique based on our proposed rapid landmark detection method employing the infinity shape is utilized to flexibly extract a set of feature vectors that can effectively indicate the characteristics of the partially occluded masked face. Finally, those features, including the location of the detected landmarks and the Histograms of the Oriented Gradients, are brought into the classification process by adopting CNN and LSTM; the experimental results are then evaluated using images from the CK+ and RAF-DB data sets. As a result, our proposed method outperforms existing cutting-edge approaches, achieving 99.30% and 95.58% accuracy on CK+ and RAF-DB, respectively. Full article

Article
A Multi-Stage Visible and Infrared Image Fusion Network Based on Attention Mechanism
Sensors 2022, 22(10), 3651; https://doi.org/10.3390/s22103651 - 11 May 2022
Viewed by 818
Abstract
Pixel-level image fusion is an effective way to fully exploit the rich texture information of visible images and the salient target characteristics of infrared images. With the development of deep learning technology in recent years, image fusion algorithms based on deep learning have also achieved great success. However, owing to the lack of sufficient and reliable paired data and the absence of an ideal fusion result to serve as supervision, it is difficult to design a precise network training mode. Moreover, the manual fusion strategy has difficulty ensuring the full use of information, which easily causes redundancy and omission. To solve the above problems, this paper proposes a multi-stage visible and infrared image fusion network based on an attention mechanism (MSFAM). Our method stabilizes the training process through multi-stage training and enhances features through the learning attention fusion block. To improve the network effect, we further design a Semantic Constraint module and Push–Pull loss function for the fusion task. Compared with several recently used methods, the qualitative comparison shows that our model produces more visually pleasing and natural fusion results with stronger applicability. For quantitative experiments, MSFAM achieves the best results in three of the six frequently used metrics in fusion tasks, while other methods only obtain good scores on a single metric or a few metrics. In addition, a commonly used high-level semantic task, i.e., object detection, is used to demonstrate its greater benefits for downstream tasks compared with single-light images and fusion results by existing methods. All these experiments support the superiority and effectiveness of our algorithm. Full article

Article
Deep Neural Networks Based on Span Association Prediction for Emotion-Cause Pair Extraction
Sensors 2022, 22(10), 3637; https://doi.org/10.3390/s22103637 - 10 May 2022
Viewed by 836
Abstract
The emotion-cause pair extraction task is a fine-grained task in text sentiment analysis, which aims to extract all emotions and their underlying causes in a document. Recent studies have addressed the emotion-cause pair extraction task in a step-by-step manner, i.e., the two subtasks of emotion extraction and cause extraction are completed first, followed by the pairing task of emotion-cause pairs. However, this approach fails to deal well with the potential relationship between the two subtasks and the extraction task of emotion-cause pairs. At the same time, the grammatical information contained in the document itself is ignored. To address the above issues, we propose a deep neural network based on span association prediction for the task of emotion-cause pair extraction, exploiting general grammatical conventions to span-encode sentences. We use the span association pairing method to obtain candidate emotion-cause pairs, and establish a multi-dimensional information interaction mechanism to screen candidate emotion-cause pairs. Experimental results on a quasi-baseline corpus show that our model can accurately extract potential emotion-cause pairs and outperforms existing baselines. Full article

Article
Automatic Detection Method of Dairy Cow Feeding Behaviour Based on YOLO Improved Model and Edge Computing
Sensors 2022, 22(9), 3271; https://doi.org/10.3390/s22093271 - 24 Apr 2022
Cited by 3 | Viewed by 1361
Abstract
The feeding behaviour of cows is an essential sign of their health in dairy farming. Precise and quick assessment of cow feeding behaviour is critical for gauging cow health status. This research presents a method for monitoring dairy cow feeding behaviour utilizing edge computing and deep learning algorithms based on the characteristics of dairy cow feeding behaviour. Images of cow feeding behaviour were captured and processed in real time using an edge computing device. A DenseResNet-You Only Look Once (DRN-YOLO) deep learning method was presented to address the low accuracy of existing cow feeding behaviour detection algorithms and their sensitivity to the open farm environment. The feature extraction of the model was enhanced by replacing the CSPDarknet backbone network of the YOLOv4 algorithm with the self-designed DRNet backbone network, using multiple feature scales and the Spatial Pyramid Pooling (SPP) structure to enrich the scale semantic feature interactions, finally achieving the recognition of cow feeding behaviour in the farm feeding environment. The experimental results showed that DRN-YOLO improved the accuracy, recall, and mAP by 1.70%, 1.82%, and 0.97%, respectively, compared to YOLOv4. The research results can effectively solve the problems of low recognition accuracy and insufficient feature extraction in the analysis of dairy cow feeding behaviour by traditional methods in complex breeding environments, and at the same time provide an important reference for the realization of intelligent animal husbandry and precision breeding. Full article

Article
Recognition of Upper Limb Action Intention Based on IMU
Sensors 2022, 22(5), 1954; https://doi.org/10.3390/s22051954 - 02 Mar 2022
Cited by 4 | Viewed by 1754
Abstract
Using motion information of the upper limb to control the prosthetic hand has become a hotspot of current research. The operation of the prosthetic hand must also be coordinated with the user’s intention. Therefore, identifying action intention of the upper limb based on motion information of the upper limb is key to controlling the prosthetic hand. Since a wearable inertial sensor bears the advantages of small size, low cost, and little external environment interference, we employ an inertial sensor to collect angle and angular velocity data during movement of the upper limb. Aiming at the action classification for putting on socks, putting on shoes and tying shoelaces, this paper proposes a recognition model based on the Dynamic Time Warping (DTW) algorithm of the motion unit. Based on whether the upper limb is moving, the complete motion data are divided into several motion units. Considering the delay associated with controlling the prosthetic hand, this paper only performs feature extraction on the first motion unit and the second motion unit, and recognizes action on different classifiers. The experimental results reveal that the DTW algorithm based on motion unit bears a higher recognition rate and lower running time. The recognition rate reaches as high as 99.46%, and the average running time measures 8.027 ms. In order to enable the prosthetic hand to understand the grasping intention of the upper limb, this paper proposes a Generalized Regression Neural Network (GRNN) model based on 10-fold cross-validation. The motion state of the upper limb is subdivided, and the static state is used as the sign of controlling the prosthetic hand. This paper applies a 10-fold cross-validation method to train the neural network model to find the optimal smoothing parameter. In addition, the recognition performance of different neural networks is compared. 
The experimental results show that the GRNN model based on 10-fold cross-validation exhibits a high accuracy rate, capable of reaching 98.28%. Finally, the two algorithms proposed in this paper are implemented in an experiment of using the prosthetic hand to reproduce an action, and the feasibility and practicability of the algorithm are verified by experiment. Full article
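The Dynamic Time Warping distance that underlies the first recognition model in this abstract can be sketched as follows. This is the textbook DTW recurrence on one-dimensional sequences with an absolute-difference cost, followed by a nearest-template classifier; it is not the paper's motion-unit pipeline, and the template labels and values are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same sample
                cost[i][j - 1],      # b advances, a stays on the same sample
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW distance to sample."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical angle traces standing in for recorded motion units.
templates = {
    "raise": [0.0, 1.0, 2.0, 3.0],
    "lower": [3.0, 2.0, 1.0, 0.0],
}
label = classify([0.0, 1.0, 1.0, 2.0, 3.0], templates)
```

DTW tolerates differences in speed: the sample above repeats one value but still aligns with the "raise" template at zero cost, which is why it suits movements performed at varying pace.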

Article
Learning to Reason on Tree Structures for Knowledge-Based Visual Question Answering
Sensors 2022, 22(4), 1575; https://doi.org/10.3390/s22041575 - 17 Feb 2022
Viewed by 912
Abstract
Collaborative reasoning for knowledge-based visual question answering is challenging but vital and efficient in understanding the features of the images and questions. Previous methods jointly fuse all kinds of features by an attention mechanism or use handcrafted rules to generate a layout for performing compositional reasoning; they lack the process of visual reasoning and introduce a large number of parameters for predicting the correct answer. To conduct visual reasoning on all kinds of image–question pairs and address these problems, we propose in this paper a novel reasoning model of a question-guided tree structure with a knowledge base (QGTSKB). Our model consists of four neural module networks: the attention model that locates attended regions based on the image features and question embeddings by an attention mechanism, the gated reasoning model that forgets and updates the fused features, the fusion reasoning model that mines high-level semantics of the attended visual features and knowledge base, and the knowledge-based fact model that makes up for the lack of visual and textual information with external knowledge. Therefore, our model performs visual analysis and reasoning based on tree structures, the knowledge base and the four neural module networks. Experimental results show that our model achieves superior performance over existing methods on the VQA v2.0 and CLEVR datasets, and visual reasoning experiments demonstrate the interpretability of the model. Full article

Article
HRGAN: A Generative Adversarial Network Producing Higher-Resolution Images than Training Sets
Sensors 2022, 22(4), 1435; https://doi.org/10.3390/s22041435 - 13 Feb 2022
Cited by 3 | Viewed by 1182
Abstract
The generative adversarial network (GAN) has demonstrated superb performance in generating synthetic images in recent studies. However, in the conventional framework of GAN, the maximum resolution of generated images is limited to the resolution of real images that are used as the training set. In this paper, in order to address this limitation, we propose a novel GAN framework using a pre-trained network called evaluator. The proposed model, higher resolution GAN (HRGAN), employs additional up-sampling convolutional layers to generate higher resolution. Then, using the evaluator, an additional target for the training of the generator is introduced to calibrate the generated images to have realistic features. In experiments with the CIFAR-10 and CIFAR-100 datasets, HRGAN successfully generates images of 64 × 64 and 128 × 128 resolutions, while the training sets consist of images of 32 × 32 resolution. In addition, HRGAN outperforms other existing models in terms of the Inception score, one of the conventional methods to evaluate GANs. For instance, in the experiment with CIFAR-10, a HRGAN generating 128 × 128 resolution demonstrates an Inception score of 12.32, outperforming an existing model by 28.6%. Thus, the proposed HRGAN demonstrates the possibility of generating higher resolution than training images. Full article

Communication
Identification of Chemical Vapor Mixture Assisted by Artificially Extended Database for Environmental Monitoring
Sensors 2022, 22(3), 1169; https://doi.org/10.3390/s22031169 - 03 Feb 2022
Cited by 2 | Viewed by 982
Abstract
A fully integrated sensor array assisted by pattern recognition algorithm has been a primary candidate for the assessment of complex vapor mixtures based on their chemical fingerprints. Diverse prototypes of electronic nose systems consisting of a multisensory device and a post processing engine have been developed. However, their precision and validity in recognizing chemical vapors are often limited by the collected database and applied classifiers. Here, we present a novel way of preparing the database and distinguishing chemical vapor mixtures with small data acquisition for chemical vapors and their mixtures of interest. The database for individual vapor analytes is expanded and the one for their mixtures is prepared in the first-order approximation. Recognition of individual target vapors of NO2, HCHO, and NH3 and their mixtures was evaluated by applying the support vector machine (SVM) classifier in different conditions of temperature and humidity. The suggested method demonstrated the recognition accuracy of 95.24%. The suggested method can pave a way to analyze gas mixtures in a variety of industrial and safety applications. Full article

Article
New Business Models on Artificial Intelligence—The Case of the Optimization of a Blast Furnace in the Steel Industry by a Machine Learning Solution
Appl. Syst. Innov. 2022, 5(1), 6; https://doi.org/10.3390/asi5010006 - 29 Dec 2021
Cited by 2 | Viewed by 2419
Abstract
This article took the case of the adoption of a Machine Learning (ML) solution in a steel manufacturing process through a platform provided by a Canadian startup, Canvass Analytics. The content of the paper includes a study around the state of the art of AI/ML adoption in steel manufacturing industries to optimize processes. The work aimed to highlight the opportunities that bring new business models based on AI/ML to improve processes in traditional industries. Methodologically, bibliographic research in the Scopus database was performed to establish the conceptual framework and the state of the art in the steel industry, then the case was presented and analyzed, to finally evaluate the impact of the new business model on the operation of the steel mill. The results of the case highlighted the way the innovative business model, based on a No-Code/Low-Code solution, achieved results in less time than conventional approaches of analytics solutions, and the way it is possible to democratize artificial intelligence and machine learning in traditional industrial environments. This work was focused on opportunities that arise around new business models linked to AI. In addition, the study looked into the framework of the adoption of AI/ML in a traditional industrial environment toward a smart manufacturing approach. The contribution of this article was the proposal of an innovative methodology to put AI/ML in the hands of process operators. It aimed to show how it was possible to achieve better results in a less complex and time-consuming adoption process. The work also highlighted the need for an important quantity of data from the process to approach this kind of solution. Full article

Article
Metaknowledge Enhanced Open Domain Question Answering with Wiki Documents
Sensors 2021, 21(24), 8439; https://doi.org/10.3390/s21248439 - 17 Dec 2021
Viewed by 1521
Abstract
Commonly used large-scale knowledge bases face challenges in open domain question answering tasks, caused by the loose knowledge association and weak structural logic of triplet-based knowledge. To find a way out of this dilemma, this work proposes a novel metaknowledge enhanced approach for open domain question answering. We design an automatic approach to extract metaknowledge and build a metaknowledge network from Wiki documents. To represent the directional weighted graph with hierarchical and semantic features, we present an original graph encoder, GE4MK, to model the metaknowledge network. Then, a metaknowledge enhanced graph reasoning model, MEGr-Net, is proposed for question answering, which, in comparison with R-GCN and GAT, aggregates both relational and neighboring interactions. Experiments demonstrate the improvement of metaknowledge over mainstream triplet-based knowledge. We also found that the choice of graph reasoning model and pre-trained language model influences the metaknowledge enhanced question answering approaches.

Article
Mine MIMO Depth Receiver: An Intelligent Receiving Model Based on Densely Connected Convolutional Networks
Sensors 2021, 21(24), 8326; https://doi.org/10.3390/s21248326 - 13 Dec 2021
Viewed by 1271
Abstract
Multiple-input multiple-output (MIMO) systems suffer from a high bit error rate (BER) in the mining environment. In this paper, the mine MIMO depth receiver model is proposed. The model uses densely connected convolutional networks for feature extraction and constructs multiple binary classifiers to recover the original information. Compared with conventional MIMO receivers, the model has no error accumulation caused by processes such as decoding and demodulation. The experimental results show that the model performs better than conventional decoding methods under different modulation codes and different numbers of transmitting terminals. Furthermore, we demonstrate that the model can still decode effectively and recover the original information when some data are lost at the receiver.

Article
Real-Time Jellyfish Classification and Detection Based on Improved YOLOv3 Algorithm
Sensors 2021, 21(23), 8160; https://doi.org/10.3390/s21238160 - 06 Dec 2021
Cited by 4 | Viewed by 1816
Abstract
In recent years, jellyfish outbreaks have frequently occurred in offshore areas worldwide, posing a significant threat to the marine fishery, tourism, coastal industry, and personal safety. Effective monitoring of jellyfish is vital to solving these problems. However, optical detection of jellyfish is still at an early stage. Therefore, this paper studies a jellyfish detection method based on convolutional neural network theory and digital image processing technology. Because the quality of underwater images directly affects the detection results, this paper first studies the underwater image preprocessing algorithm. The results show that image quality improves after applying three algorithms, namely prior defogging, adaptive histogram equalization, and multi-scale retinal enhancement, which makes the images more suitable for detection. We establish a data set containing seven species of jellyfish and fish, with a total of 2141 images. The YOLOv3 algorithm is used to detect jellyfish, and its feature extraction network Darknet53 is optimized so that detection runs in real time. In addition, we introduce label smoothing and cosine annealing learning rate methods during the training process. The experimental results show that the improved algorithm increases the detection accuracy for jellyfish while maintaining the detection speed. This paper lays a foundation for the construction of an underwater jellyfish optical imaging real-time monitoring system.
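To illustrate the two training techniques named in this abstract, the following is a minimal sketch of a cosine annealing learning-rate schedule and uniform label smoothing. It is not the authors' implementation: the function names, the single-cycle schedule without restarts, and the default values (`lr_max`, `lr_min`, `epsilon`) are assumptions chosen for illustration.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Learning rate at `step` for one cosine cycle (no warm restarts).

    lr_max and lr_min are illustrative defaults, not values from the paper.
    """
    # Clamp the step so the rate stays at lr_min after the cycle ends.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def smooth_labels(one_hot, epsilon=0.1):
    """Uniform label smoothing: blend a one-hot target with a uniform distribution."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_annealing_lr(step, total_steps=100))
    print(smooth_labels([0.0, 1.0, 0.0, 0.0]))
```

The schedule starts at `lr_max` and decays smoothly to `lr_min`, while the smoothed target still sums to one but no longer assigns full confidence to a single class.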

Article
Nondestructive Testing and Visualization of Catechin Content in Black Tea Fermentation Using Hyperspectral Imaging
Sensors 2021, 21(23), 8051; https://doi.org/10.3390/s21238051 - 02 Dec 2021
Cited by 2 | Viewed by 1210
Abstract
Catechin is a major reactive substance involved in black tea fermentation. It has a determinant effect on the final quality and taste of made teas. In this study, we applied hyperspectral technology with the chemometrics method and used different pretreatment and variable filtering algorithms to reduce noise interference. After reduction of the spectral data dimensions by principal component analysis (PCA), an optimal prediction model for catechin content was constructed, followed by visual analysis of catechin content when fermenting leaves for different periods of time. The results showed that zero mean normalization (Z-score), multiplicative scatter correction (MSC), and standard normal variate (SNV) can effectively improve model accuracy, while the shuffled frog leaping algorithm (SFLA), the variable combination population analysis genetic algorithm (VCPA-GA), and variable combination population analysis iteratively retaining informative variables (VCPA-IRIV) can significantly reduce spectral data and enhance the calculation speed of the model. We found that nonlinear models performed better than linear ones. The prediction accuracy of the extreme learning machine (ELM), based on optimal variables, for the total amount of catechins and for epicatechin gallate (ECG) reached 0.989 and 0.994, respectively, and the prediction accuracy of the support vector regression (SVR) models for the content of EGC, C, EC, and EGCG reached 0.972, 0.993, 0.990, and 0.994, respectively. The optimal model offers accurate prediction, and visual analysis can determine the distribution of the catechin content when fermenting leaves for different fermentation periods. The findings provide significant reference material for intelligent digital assessment of black tea during processing.
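Two of the spectral pretreatments named in this abstract can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' pipeline: each spectrum is a list of reflectance values, SNV standardizes every spectrum by its own mean and sample standard deviation, and Z-score standardizes every wavelength column across samples. The function names and the choice of the sample (n-1) standard deviation are assumptions.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale ONE spectrum by its own statistics."""
    mu = mean(spectrum)
    sd = stdev(spectrum) or 1.0  # a flat spectrum maps to zeros instead of dividing by 0
    return [(v - mu) / sd for v in spectrum]


def zscore_columns(spectra):
    """Z-score: standardize each wavelength (column) across all samples (rows)."""
    columns = list(zip(*spectra))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) or 1.0 for c in columns]  # constant columns map to zeros
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in spectra]


if __name__ == "__main__":
    spectra = [[0.10, 0.20, 0.30], [0.20, 0.40, 0.60]]  # two toy spectra, three bands
    print([snv(s) for s in spectra])
    print(zscore_columns(spectra))
```

SNV works row by row and removes per-sample offset and scale effects, whereas the Z-score works column by column, so the two pretreatments are not interchangeable.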

Article
Stock Price Movement Prediction Using Sentiment Analysis and CandleStick Chart Representation
Sensors 2021, 21(23), 7957; https://doi.org/10.3390/s21237957 - 29 Nov 2021
Cited by 7 | Viewed by 3236
Abstract
Determining the price movement of stocks is a challenging problem because of factors such as industry performance, economic variables, investor sentiment, company news, company performance, and social media sentiment. People can predict the price movement of stocks by applying machine learning algorithms to information contained in historical data, stock candlestick-chart data, and social-media data. However, it is hard to predict stock movement based on a single classifier. In this study, we proposed a multichannel collaborative network that incorporates candlestick-chart and social-media data for stock trend predictions. We first extracted the social media sentiment features using the Natural Language Toolkit and sentiment analysis data from Twitter. We then transformed the stock’s historical time series data into a candlestick chart to elucidate patterns in the stock’s movement. Finally, we integrated the stock’s sentiment features and its candlestick chart to predict the stock price movement over 4-, 6-, 8-, and 10-day time periods. Our collaborative network consisted of two branches: the first branch contained a one-dimensional convolutional neural network (CNN) performing sentiment classification, and the second branch included a two-dimensional (2D) CNN performing image classification based on 2D candlestick chart data. We evaluated our model on five high-demand stocks (Apple, Tesla, IBM, Amazon, and Google) and determined that our collaborative network achieved promising results and compared favorably against single-network models using either sentiment data or candlestick charts alone. The proposed method obtained its most favorable performance, 75.38% accuracy, for Apple stock. We also found that stock price prediction performed more favorably over longer time periods than over shorter ones.
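The transformation of a price time series into candlesticks mentioned in this abstract rests on open-high-low-close (OHLC) aggregation. The sketch below shows only that aggregation step on a plain list of prices; it does not render chart images as the paper's 2D CNN branch would require. The function name, the non-overlapping window, and the dictionary layout are assumptions made for illustration.

```python
def to_candles(prices, window):
    """Summarise a price series as OHLC candles over non-overlapping windows.

    A trailing window with fewer than `window` prices is dropped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    candles = []
    for start in range(0, len(prices) - window + 1, window):
        chunk = prices[start:start + window]
        candles.append({
            "open": chunk[0],    # first price in the window
            "high": max(chunk),
            "low": min(chunk),
            "close": chunk[-1],  # last price in the window
            "bullish": chunk[-1] >= chunk[0],
        })
    return candles


if __name__ == "__main__":
    for candle in to_candles([1, 3, 2, 5, 4, 6, 7], window=3):
        print(candle)
```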

Article
Research on Classification Model of Panax notoginseng Taproots Based on Machine Vision Feature Fusion
Sensors 2021, 21(23), 7945; https://doi.org/10.3390/s21237945 - 28 Nov 2021
Cited by 5 | Viewed by 1093
Abstract
The existing classification methods for Panax notoginseng taproots suffer from low accuracy, low efficiency, and poor stability. In this study, a classification model based on image feature fusion is established for Panax notoginseng taproots. The images of Panax notoginseng taproots collected in the experiment are preprocessed by Gaussian filtering, binarization, and morphological methods. Then, a total of 40 features are extracted, including size and shape features, HSV and RGB color features, and texture features. Using BP neural network, extreme learning machine (ELM), and support vector machine (SVM) models, the importance of color, texture, and fusion features for the classification of the taproots of Panax notoginseng is verified. Among the three models, the SVM model performs the best, achieving an accuracy of 92.037% on the prediction set. Next, iteratively retaining informative variables (IRIV), the variable iterative space shrinkage approach (VISSA), and stepwise regression analysis (SRA) are used to reduce the dimension of all the features. Finally, a traditional machine learning SVM model based on feature selection and a deep learning model based on semantic segmentation are established. With a model size of only 125 kb and a training time of 3.4 s, the IRIV-SVM model achieves an accuracy of 95.370% on the test set, so IRIV-SVM is selected as the taproot classification model for Panax notoginseng. After being optimized by the gray wolf optimizer, the IRIV-GWO-SVM model achieves the highest classification accuracy of 98.704% on the test set. The results of this study provide a basis for developing online classification methods for Panax notoginseng of different grades in actual production.

Article
User Identity Protection in Automatic Emotion Recognition through Disguised Speech
AI 2021, 2(4), 636-649; https://doi.org/10.3390/ai2040038 - 25 Nov 2021
Viewed by 1893
Abstract
Ambient Assisted Living (AAL) technologies are being developed which could assist elderly people to live healthy and active lives. These technologies have been used to monitor people’s daily exercises, consumption of calories and sleep patterns, and to provide coaching interventions to foster positive behaviour. Speech and audio processing can be used to complement such AAL technologies to inform interventions for healthy ageing by analyzing speech data captured in the user’s home. However, collection of data in home settings presents challenges. One of the most pressing challenges concerns how to manage privacy and data protection. To address this issue, we proposed a low cost system for recording disguised speech signals which can protect user identity by using pitch shifting. The disguised speech so recorded can then be used for training machine learning models for affective behaviour monitoring. Affective behaviour could provide an indicator of the onset of mental health issues such as depression and cognitive impairment, and help develop clinical tools for automatically detecting and monitoring disease progression. In this article, acoustic features extracted from the non-disguised and disguised speech are evaluated in an affect recognition task using six different machine learning classification methods. The results of transfer learning from non-disguised to disguised speech are also demonstrated. We have identified sets of acoustic features which are not affected by the pitch shifting algorithm and also evaluated them in affect recognition. We found that, while the non-disguised speech signal gives the best Unweighted Average Recall (UAR) of 80.01%, the disguised speech signal only causes a slight degradation of performance, reaching 76.29%. The transfer learning from non-disguised to disguised speech results in a reduction of UAR (65.13%). However, feature selection improves the UAR (68.32%). This approach forms part of a large project which includes health and wellbeing monitoring and coaching.
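Because the results in this abstract are reported as Unweighted Average Recall (UAR), a short sketch of the metric may help: UAR is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The function name and the toy labels below are assumptions for illustration and are unrelated to the paper's data.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class contributes equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, y in enumerate(y_true) if y == cls]
        hits = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Imbalanced toy example: a classifier that always predicts "neutral".
    y_true = ["neutral"] * 4 + ["sad"]
    y_pred = ["neutral"] * 5
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print(accuracy)                                   # 0.8
    print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

On imbalanced emotion data, plain accuracy rewards predicting the majority class, whereas UAR exposes that the minority class is never recognized.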

Article
Fault Diagnosis of Permanent Magnet DC Motors Based on Multi-Segment Feature Extraction
Sensors 2021, 21(22), 7505; https://doi.org/10.3390/s21227505 - 11 Nov 2021
Cited by 3 | Viewed by 955
Abstract
For permanent magnet DC motors (PMDCMs), the amplitude of the current signals gradually decreases after the motor starts. Using the signal features of the current in only a single segment is not conducive to fault diagnosis for PMDCMs. In this work, multi-segment feature extraction is presented to improve fault diagnosis of PMDCMs. Additionally, a support vector machine (SVM), a classification and regression tree (CART), and the k-nearest neighbor algorithm (k-NN) are utilized for the construction of fault diagnosis models. The time domain features extracted from several successive segments of current signals make up a feature vector, which is adopted for fault diagnosis of PMDCMs. Experimental results show that multi-segment features have a better diagnostic effect than single-segment features; the average accuracy of fault diagnosis improves by 19.88%. This paper lays the foundation for fault diagnosis of PMDCMs through multi-segment feature extraction and provides a novel method for feature extraction.
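The multi-segment idea described in this abstract, time-domain features from several successive segments concatenated into one feature vector, can be sketched as follows. The specific features (mean, RMS, peak), the equal-length segmentation, and the function names are assumptions for illustration; the abstract does not state which features or segment boundaries the authors used.

```python
import math


def segment_features(segment):
    """Three common time-domain features of one segment: mean, RMS, peak."""
    n = len(segment)
    mean = sum(segment) / n
    rms = math.sqrt(sum(v * v for v in segment) / n)
    peak = max(abs(v) for v in segment)
    return [mean, rms, peak]


def multi_segment_features(signal, n_segments):
    """Split a signal into equal successive segments and concatenate their features.

    Samples left over after the last full segment are ignored.
    """
    seg_len = len(signal) // n_segments
    if seg_len == 0:
        raise ValueError("signal is too short for the requested number of segments")
    features = []
    for k in range(n_segments):
        features.extend(segment_features(signal[k * seg_len:(k + 1) * seg_len]))
    return features


if __name__ == "__main__":
    # Toy start-up current whose amplitude decays over time.
    current = [4.0, 4.0, 2.0, 2.0, 1.0, 1.0]
    print(multi_segment_features(current, n_segments=3))
```

Because the current amplitude decays after start-up, the per-segment features differ from one another, which is information a single-segment feature vector cannot carry.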

Article
Adversarial Learning with Bidirectional Attention for Visual Question Answering
Sensors 2021, 21(21), 7164; https://doi.org/10.3390/s21217164 - 28 Oct 2021
Cited by 1 | Viewed by 942
Abstract
In this paper, we provide external image features and use the internal attention mechanism to solve the VQA problem given a dataset of textual questions and related images. Most previous models for VQA use a pair of images and questions as input. In addition, these models adopt a question-oriented attention mechanism to extract the features of the entire image and then perform feature fusion. However, the shortcoming of these models is that they cannot effectively eliminate the irrelevant features of the image. In addition, the question-oriented attention mechanism does not mine image features sufficiently, which introduces redundant image features. In this paper, we propose a VQA model based on adversarial learning and bidirectional attention. We exploit external image features that are not related to the question to form an adversarial mechanism to boost the accuracy of the model. Target detection is performed on the image, which serves as the image-oriented attention mechanism. The bidirectional attention mechanism helps focus the model's attention and eliminate interference. Experimental results are evaluated on benchmark datasets, and our model performs better than other models based on attention methods. In addition, the qualitative results show the attention maps on the images and how they lead to correct answers.

Article
An Empathy Evaluation System Using Spectrogram Image Features of Audio
Sensors 2021, 21(21), 7111; https://doi.org/10.3390/s21217111 - 26 Oct 2021
Cited by 1 | Viewed by 1312
Abstract
Watching videos online has become part of a relaxed lifestyle. The music in videos influences human emotions, perception, and imagination, and can make people feel relaxed, sad, and so on. Therefore, it is particularly important for people who make advertising videos to understand the relationship between the physical elements of music and empathy characteristics. The purpose of this paper is to analyze the music features in an advertising video and extract the music features that make people empathize. This paper combines two methods, the power spectrum of MFCC and image RGB analysis, to find the audio feature vector. In spectral analysis, the eigenvectors obtained in the analysis process range from blue (low range) to green (medium range) to red (high range). A random forest classifier is used to classify the resulting data, and the trained model is used to monitor the development of an advertisement empathy system in real time. The optimal model achieves a training accuracy of 99.173% and a test accuracy of 86.171%, a result supported by comparing the three models of audio feature value analysis. The contribution of this study can be summarized as follows: (1) low-frequency, high-amplitude audio in the video is more likely to resonate than high-frequency, high-amplitude audio; (2) observing the characteristics of the machine learning classifier shows that frequency and audio amplitude are important attributes for describing waveforms; (3) a new audio extraction method is proposed to induce human empathy. That is, the feature value extracted from the spectrogram image features of audio has the greatest ability to arouse human empathy.

Article
Personality Classification of Social Users Based on Feature Fusion
Sensors 2021, 21(20), 6758; https://doi.org/10.3390/s21206758 - 12 Oct 2021
Cited by 3 | Viewed by 1317
Abstract
Based on the openness and accessibility of user data, personality recognition is widely used in personalized recommendation, intelligent medicine, natural language processing, and so on. Existing approaches usually adopt a single deep learning mechanism to extract personality information from user data, which leads to semantic loss to some extent. In addition, researchers encode scattered user posts in a sequential or hierarchical manner, ignoring the connection between posts and the unequal value of different posts to classification tasks. We propose a hierarchical hybrid model based on a self-attention mechanism, namely HMAttn-ECBiL, to fully excavate deep semantic information horizontally and vertically. Multiple modules composed of a convolutional neural network and bi-directional long short-term memory encode different types of personality representations in a hierarchical and partitioned manner; they attend to the contribution of different words in posts and of different posts to personality information, and capture the dependencies between scattered posts. Moreover, the addition of a word embedding module effectively compensates for the original semantics filtered out by a deep neural network. We verified the hybrid model on the MyPersonality dataset. The experimental results showed that the classification performance of the hybrid model exceeds that of the other model architectures and baseline models, with an average accuracy of 72.01%.

Article
InterNet+: A Light Network for Hand Pose Estimation
Sensors 2021, 21(20), 6747; https://doi.org/10.3390/s21206747 - 11 Oct 2021
Cited by 1 | Viewed by 1286
Abstract
Hand pose estimation from RGB images has always been a difficult task, owing to the incompleteness of the depth information. Moon et al. improved the accuracy of hand pose estimation by using a new network, InterNet, through their unique design. However, the network still has room for improvement. Based on the architecture of MobileNet v3 and MoGA, we redesigned a feature extractor that introduces recent achievements in the field of computer vision, such as the ACON activation function and a new attention mechanism module. By using these modules effectively, our network architecture can better extract global features from an RGB image of the hand, leading to a greater performance improvement compared to InterNet and other similar networks.
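For readers unfamiliar with the ACON activation function mentioned in this abstract, the sketch below evaluates the ACON-C form from the ACON literature on a scalar input. The abstract does not say which ACON variant InterNet+ uses or how its parameters are learned, so the choice of ACON-C, the fixed parameter values, and the function names are assumptions for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function for moderate |z| (no overflow protection)."""
    return 1.0 / (1.0 + math.exp(-z))


def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x.

    In a network p1, p2 and beta would be learnable; here they are plain floats.
    With p1=1, p2=0 this reduces to Swish; with beta=0 it becomes linear.
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, acon_c(x), acon_c(x, beta=0.0), acon_c(x, beta=50.0))
```

The `beta` parameter controls how strongly the unit "activates": a large value approaches max(p1·x, p2·x), while zero turns the unit into a linear map.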

Article
LSTM-DDPG for Trading with Variable Positions
Sensors 2021, 21(19), 6571; https://doi.org/10.3390/s21196571 - 30 Sep 2021
Cited by 2 | Viewed by 2088
Abstract
In recent years, machine learning for trading has been widely studied. In trading decisions, both the direction and the size of a position should be determined based on market conditions. However, there is no research so far that considers variable position sizes in models developed for trading purposes. In this paper, we propose a deep reinforcement learning model named LSTM-DDPG to make trading decisions with variable positions. Specifically, we consider the trading process as a Partially Observable Markov Decision Process, in which the long short-term memory (LSTM) network is used to extract market state features and the deep deterministic policy gradient (DDPG) framework is used to make trading decisions concerning the direction and variable size of position. We test the LSTM-DDPG model on IF300 (index futures of China stock market) data and the results show that LSTM-DDPG with variable positions performs better in terms of return and risk than models with fixed or few-level positions. In addition, the investment potential of the model can be better tapped by a reward function based on the differential Sharpe ratio than by a profit reward function.
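The differential Sharpe ratio reward mentioned in this abstract can be illustrated with the online formulation of Moody and Saffell, which tracks exponential moving averages of returns and squared returns. The sketch below is an assumption about the general form of such a reward, not the paper's exact definition: the adaptation rate `eta`, the zero initialization, and the zero reward returned while the variance estimate is still degenerate are illustrative choices.

```python
def differential_sharpe_rewards(returns, eta=0.01):
    """Per-step differential Sharpe ratio for a sequence of returns.

    a and b are exponential moving averages of the return and the squared return;
    each reward uses their values from BEFORE the current return is absorbed.
    """
    a = 0.0  # moving average of returns
    b = 0.0  # moving average of squared returns
    rewards = []
    for r in returns:
        delta_a = r - a
        delta_b = r * r - b
        variance = b - a * a
        if variance > 1e-12:
            d = (b * delta_a - 0.5 * a * delta_b) / variance ** 1.5
        else:
            d = 0.0  # variance estimate not yet usable
        rewards.append(d)
        a += eta * delta_a
        b += eta * delta_b
    return rewards


if __name__ == "__main__":
    print(differential_sharpe_rewards([0.01, -0.02, 0.03, 0.00, 0.01]))
```

Unlike a raw profit reward, this signal penalizes returns that raise volatility without raising the average return, which is one way to fold risk into the reward a trading agent receives.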