Search Results (27)

Search Parameters:
Authors = Marcus Liwicki; ORCID = 0000-0003-4029-6574

19 pages, 821 KiB  
Article
AI Concepts for System of Systems Dynamic Interoperability
by Jacob Nilsson, Saleha Javed, Kim Albertsson, Jerker Delsing, Marcus Liwicki and Fredrik Sandin
Sensors 2024, 24(9), 2921; https://doi.org/10.3390/s24092921 - 3 May 2024
Cited by 8 | Viewed by 2750
Abstract
Interoperability is a central problem in digitization and System of Systems (SoS) engineering, which concerns the capacity of systems to exchange information and cooperate. The task to dynamically establish interoperability between heterogeneous cyber-physical systems (CPSs) at run-time is a challenging problem. Different aspects of the interoperability problem have been studied in fields such as SoS, neural translation, and agent-based systems, but there are no unifying solutions beyond domain-specific standardization efforts. The problem is complicated by the uncertain and variable relations between physical processes and human-centric symbols, which result from, e.g., latent physical degrees of freedom, maintenance, re-configurations, and software updates. Therefore, we surveyed the literature for concepts and methods needed to automatically establish SoSs with purposeful CPS communication, focusing on machine learning and connecting approaches that are not integrated in the present literature. Here, we summarize recent developments relevant to the dynamic interoperability problem, such as representation learning for ontology alignment and inference on heterogeneous linked data; neural networks for transcoding of text and code; concept learning-based reasoning; and emergent communication. We find that there has been a recent interest in deep learning approaches to establishing communication under different assumptions about the environment, language, and nature of the communicating entities. Furthermore, we present examples of architectures and discuss open problems associated with artificial intelligence (AI)-enabled solutions in relation to SoS interoperability requirements. Although these developments open new avenues for research, there are still no examples that bridge the concepts necessary to establish dynamic interoperability in complex SoSs, and realistic testbeds are needed. Full article
(This article belongs to the Section Sensor Networks)

24 pages, 5068 KiB  
Article
Deep Ontology Alignment Using a Natural Language Processing Approach for Automatic M2M Translation in IIoT
by Saleha Javed, Muhammad Usman, Fredrik Sandin, Marcus Liwicki and Hamam Mokayed
Sensors 2023, 23(20), 8427; https://doi.org/10.3390/s23208427 - 12 Oct 2023
Cited by 5 | Viewed by 3272
Abstract
The technical capabilities of modern Industry 4.0 and Industry 5.0 are vast and growing rapidly. The present-day Industrial Internet of Things (IIoT) combines manifold underlying technologies that require real-time interconnection and communication among heterogeneous devices. Smart cities are established with sophisticated designs and control of seamless machine-to-machine (M2M) communication, to optimize resources, costs, performance, and energy distributions. All the sensory devices within a building interact to maintain a sustainable climate for residents and intuitively optimize the energy distribution to improve energy production. However, this encompasses quite a few challenges for devices that lack a compatible and interoperable design. Conventional solutions are restricted to limited domains or rely on engineers designing and deploying translators for each pair of ontologies, which is a costly process in terms of engineering effort and computational resources. A further issue is that a new device with a different ontology must be integrated into an existing IoT network. We propose a self-learning model that can determine the taxonomy of devices given their ontological meta-data and structural information. The model finds matches between two distinct ontologies using a natural language processing (NLP) approach to learn linguistic contexts. Then, by visualizing the ontological network as a knowledge graph, it is possible to learn the structure of the meta-data and understand the device's message formulation. Finally, the model can align entities of ontological graphs that are similar in context and structure. Furthermore, the model performs dynamic M2M translation without requiring extra engineering or hardware resources. Full article
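The core operation described in this abstract, aligning entities of two ontologies by the similarity of their learned representations, can be sketched minimally. The embeddings and the 0.8 threshold below are toy assumptions for illustration, not the paper's model:

```python
import numpy as np

def cosine_matrix(a, b):
    """Pairwise cosine similarity between the row vectors of a and b."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_n @ b_n.T

def align_entities(src_emb, tgt_emb, threshold=0.8):
    """Greedy alignment: each source entity maps to its most similar
    target entity, kept only if the similarity exceeds the threshold."""
    sim = cosine_matrix(src_emb, tgt_emb)
    matches = {}
    for i, row in enumerate(sim):
        j = int(np.argmax(row))
        if row[j] >= threshold:
            matches[i] = j
    return matches

# Toy embeddings: source entity 0 (say, "temperature_sensor") should
# align with target entity 1 (say, "temp_probe"); entity 1 has no match.
src = np.array([[1.0, 0.1, 0.0],
                [0.0, 0.0, 1.0]])
tgt = np.array([[0.0, 1.0, 0.0],
                [0.9, 0.2, 0.0]])
print(align_entities(src, tgt))  # {0: 1}
```

In practice the embeddings would come from an NLP encoder over entity names and meta-data, and graph structure would constrain the matching.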

17 pages, 2837 KiB  
Article
T5 for Hate Speech, Augmented Data, and Ensemble
by Tosin Adewumi, Sana Sabah Sabry, Nosheen Abid, Foteini Liwicki and Marcus Liwicki
Sci 2023, 5(4), 37; https://doi.org/10.3390/sci5040037 - 22 Sep 2023
Cited by 8 | Viewed by 3155
Abstract
We conduct relatively extensive investigations of automatic hate speech (HS) detection using different state-of-the-art (SoTA) baselines across 11 subtasks spanning six different datasets. Our motivation is to determine which of the recent SoTA models is best for automatic hate speech detection and what advantages methods such as data augmentation and ensembling may have on the best model, if any. We carry out six cross-task investigations. We achieve new SoTA results on two subtasks: macro F1 scores of 91.73% and 53.21% for subtasks A and B of the HASOC 2020 dataset, surpassing the previous SoTA scores of 51.52% and 26.52%, respectively. We achieve near-SoTA results on two others: macro F1 scores of 81.66% for subtask A of OLID 2019 and 82.54% for subtask A of HASOC 2021, in comparison to SoTA results of 82.9% and 83.05%, respectively. We perform error analysis and use two eXplainable Artificial Intelligence (XAI) algorithms, Integrated Gradients (IG) and SHapley Additive exPlanations (SHAP), to reveal by example how two of the models (a Bi-directional Long Short-Term Memory network (Bi-LSTM) and the Text-to-Text Transfer Transformer (T5)) make their predictions. Other contributions of this work are (1) the introduction of a simple, novel mechanism for correcting Out-of-Class (OoC) predictions in T5, (2) a detailed description of the data augmentation methods, and (3) the revelation of the poor data annotations in the HASOC 2021 dataset using several examples and XAI (buttressing the need for better quality control). We publicly release our model checkpoints and code to foster transparency. Full article
(This article belongs to the Special Issue Computational Linguistics and Artificial Intelligence)
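The macro F1 scores reported in this abstract average per-class F1, so minority classes count equally. A minimal sketch with invented counts (not the HASOC data):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_f1(per_class_counts):
    """per_class_counts: list of (tp, fp, fn) tuples, one per class.
    Macro F1 averages per-class F1, weighting every class equally
    regardless of its support."""
    scores = []
    for tp, fp, fn in per_class_counts:
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        scores.append(f1(p, r))
    return sum(scores) / len(scores)

# Two-class toy example (e.g. hateful vs. not hateful):
counts = [(90, 10, 10), (30, 10, 30)]
print(macro_f1(counts))  # 0.75
```

Because each class contributes equally, a model that ignores a rare hateful class is penalized more by macro F1 than by accuracy.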

15 pages, 2917 KiB  
Article
Machine Learning Composite-Nanoparticle-Enriched Lubricant Oil Development for Improved Frictional Performance—An Experiment
by Ali Usman, Saad Arif, Ahmed Hassan Raja, Reijo Kouhia, Andreas Almqvist and Marcus Liwicki
Lubricants 2023, 11(6), 254; https://doi.org/10.3390/lubricants11060254 - 9 Jun 2023
Cited by 5 | Viewed by 2827
Abstract
Improving the frictional response of a functional surface interface has been a significant research concern. During the last couple of decades, lubricant oils have been enriched with several additives to obtain formulations that can meet the requirements of different lubricating regimes, from boundary to full-film hydrodynamic lubrication. The possibility of improving the tribological performance of lubricating oils using various types of nanoparticles has been investigated. In this study, we propose a data-driven approach that utilizes machine learning (ML) techniques to optimize the composition of a hybrid oil by adding ceramic and carbon-based nanoparticles in varying concentrations to the base oil. Supervised-learning-based regression methods, including support vector machines, random forest trees, and artificial neural network (ANN) models, are developed to capture the inherent non-linear behavior of the nano-lubricants. The ANN hyperparameters were fine-tuned with Bayesian optimization. The regression performance is evaluated with multiple assessment metrics, such as the root mean square error (RMSE), mean squared error (MSE), mean absolute error (MAE), and coefficient of determination (R²). The ANN showed the best prediction performance among all ML models, with an RMSE of 2.22 × 10⁻³, an MSE of 4.92 × 10⁻⁶, an MAE of 2.1 × 10⁻³, and an R² of 0.99. The computational models' performance curves for the different nanoparticles, and how the composition affects the interface, were investigated. The results show that the composition of the optimized hybrid oil was highly dependent on the lubrication regime and that the coefficient of friction was significantly reduced when optimal concentrations of ceramic and carbon-based nanoparticles were added to the base oil. The proposed research work has potential applications in designing hybrid nano-lubricants to achieve optimized tribological performance in changing lubrication regimes. Full article
(This article belongs to the Special Issue Recent Advances in Machine Learning in Tribology)
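The four regression metrics this abstract reports (RMSE, MSE, MAE, R²) can be computed from scratch; the values below are toy friction readings for illustration, not the study's data:

```python
import math

def regression_metrics(y_true, y_pred):
    """RMSE, MSE, MAE, and R^2 for a regression model's predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n          # mean squared error
    rmse = math.sqrt(mse)                          # its square root
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot                  # 1 - SS_res / SS_tot
    return {"RMSE": rmse, "MSE": mse, "MAE": mae, "R2": r2}

# Hypothetical coefficient-of-friction measurements vs. predictions:
y_true = [0.10, 0.08, 0.12, 0.09]
y_pred = [0.11, 0.08, 0.11, 0.10]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

RMSE and MAE stay in the units of the target, MSE penalizes large errors quadratically, and R² close to 1 means the model explains most of the variance.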

18 pages, 7774 KiB  
Article
Attention-Guided Disentangled Feature Aggregation for Video Object Detection
by Shishir Muralidhara, Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Sensors 2022, 22(21), 8583; https://doi.org/10.3390/s22218583 - 7 Nov 2022
Cited by 5 | Viewed by 4392
Abstract
Object detection is a computer vision task that involves localisation and classification of objects in an image. Video data implicitly introduces several challenges, such as blur, occlusion and defocus, making video object detection more challenging in comparison to still image object detection, which is performed on individual and independent images. This paper tackles these challenges by proposing an attention-heavy framework for video object detection that aggregates the disentangled features extracted from individual frames. The proposed framework is a two-stage object detector based on the Faster R-CNN architecture. The disentanglement head integrates scale, spatial and task-aware attention and applies it to the features extracted by the backbone network across all the frames. Subsequently, the aggregation head incorporates temporal attention and improves detection in the target frame by aggregating the features of the support frames. These include the features extracted from the disentanglement network along with the temporal features. We evaluate the proposed framework using the ImageNet VID dataset and achieve a mean Average Precision (mAP) of 49.8 and 52.5 using the backbones of ResNet-50 and ResNet-101, respectively. The improvement in performance over the individual baseline methods validates the efficacy of the proposed approach. Full article
(This article belongs to the Section Sensing and Imaging)
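The aggregation head's idea, weighting support-frame features by their attention to the target frame, can be sketched in a simplified scalar dot-product form. This is an illustrative toy, far simpler than the paper's disentangled multi-attention architecture:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def aggregate_target_features(target_feat, support_feats):
    """Score each support-frame feature by its dot product with the
    target feature, normalize the scores with softmax, and add the
    weighted sum of support features back to the target feature."""
    support = np.stack(support_feats)   # (T, D)
    scores = support @ target_feat      # (T,) similarity to target
    weights = softmax(scores)
    aggregated = weights @ support      # (D,) attention-weighted sum
    return target_feat + aggregated

# Toy 2-D features: the first support frame resembles the target,
# so it receives the larger attention weight.
target = np.array([1.0, 0.0])
supports = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
out = aggregate_target_features(target, supports)
print(out)
```

The intuition is that sharp support frames similar to the (possibly blurred or occluded) target frame contribute more to its enhanced representation.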

23 pages, 4524 KiB  
Article
Rethinking Learnable Proposals for Graphical Object Detection in Scanned Document Images
by Sankalp Sinha, Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Appl. Sci. 2022, 12(20), 10578; https://doi.org/10.3390/app122010578 - 20 Oct 2022
Cited by 9 | Viewed by 2261
Abstract
In the age of deep learning, researchers have looked at domain adaptation under the pre-training and fine-tuning paradigm to leverage the gains made in the natural image domain. These backbones and subsequent networks are designed for object detection in the natural image domain, and they do not consider some of the critical characteristics of document images: document images are sparse in contextual information, and the graphical page objects are logically clustered. This paper investigates the effectiveness of deep and robust backbones in the document image domain and explores the idea of learnable object proposals through Sparse R-CNN. We show that simple domain adaptation of top-performing object detectors to the document image domain does not lead to better results. Furthermore, we show empirically that detectors based on dense object priors, like Faster R-CNN, Mask R-CNN, and Cascade Mask R-CNN, are perhaps not best suited for graphical page object detection. Detectors that reduce the number of object candidates while making them learnable are a step towards a better approach. We formulate and evaluate the Sparse R-CNN (SR-CNN) model on the IIIT-AR-13K, PubLayNet, and DocBank datasets and hope to inspire a rethinking of object proposals in the domain of graphical page object detection. Full article

14 pages, 374 KiB  
Article
Vector Representations of Idioms in Conversational Systems
by Tosin Adewumi, Foteini Liwicki and Marcus Liwicki
Sci 2022, 4(4), 37; https://doi.org/10.3390/sci4040037 - 29 Sep 2022
Cited by 6 | Viewed by 3042
Abstract
In this study, we demonstrate that an open-domain conversational system trained on idioms or figurative language generates more fitting responses to prompts containing idioms. Idioms are a part of everyday speech in many languages and across many cultures, but they pose a great challenge for many natural language processing (NLP) systems that involve tasks such as information retrieval (IR), machine translation (MT), and conversational artificial intelligence (AI). We utilized the Potential Idiomatic Expression (PIE)-English idiom corpus for the two tasks that we investigated: classification and conversation generation. We achieved a state-of-the-art (SoTA) result of a 98% macro F1 score on the classification task by using the SoTA T5 model. We experimented with three instances of the SoTA dialogue model—the Dialogue Generative Pre-trained Transformer (DialoGPT)—for conversation generation. Their performances were evaluated by using the automatic metric, perplexity, and a human evaluation. The results showed that the model trained on the idiom corpus generated more fitting responses to prompts containing idioms 71.9% of the time in comparison with a similar model that was not trained on the idiom corpus. We have contributed the model checkpoint/demo/code to the HuggingFace hub for public access. Full article
(This article belongs to the Section Computer Sciences, Mathematics and AI)
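Perplexity, the automatic metric used in this evaluation, is the exponential of the mean negative log-likelihood per token. A minimal sketch with made-up token probabilities (not DialoGPT outputs):

```python
import math

def perplexity(token_log_probs):
    """Perplexity of a generated response given per-token natural-log
    probabilities from a language model: exp of the average negative
    log-likelihood per token. Lower is better."""
    n = len(token_log_probs)
    nll = -sum(token_log_probs) / n
    return math.exp(nll)

# Toy log-probabilities for a 4-token response:
logps = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.25)]
print(perplexity(logps))  # 2.828..., i.e. 2**1.5
```

A perplexity of k means the model was, on average, as uncertain as if it were choosing uniformly among k tokens at each step.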

18 pages, 1760 KiB  
Article
Mask-Aware Semi-Supervised Object Detection in Floor Plans
by Tahira Shehzadi, Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Appl. Sci. 2022, 12(19), 9398; https://doi.org/10.3390/app12199398 - 20 Sep 2022
Cited by 13 | Viewed by 4491
Abstract
Research on object detection using semi-supervised methods has been growing in the past few years. We examine the intersection of these two areas for floor-plan objects to promote the research objective of detecting objects more accurately with less labeled data. Floor-plan objects include different furniture items with multiple types of the same class, and this high inter-class similarity impacts the performance of prior methods. In this paper, we present a Mask R-CNN-based semi-supervised approach that provides pixel-to-pixel alignment to generate individual annotation masks for each class to mine the inter-class similarity. The semi-supervised approach uses a student–teacher network in which the student network pulls information from the teacher network. The teacher network uses unlabeled data to form pseudo-boxes, and the student network trains on both the unlabeled data with the pseudo-boxes and the labeled data with ground-truth annotations. It thus learns representations of furniture items from labeled and unlabeled data combined. On the Mask R-CNN detector with a ResNet-101 backbone network, the proposed approach achieves a mAP of 98.8%, 99.7%, and 99.8% with only 1%, 5%, and 10% labeled data, respectively. Our experiments affirm the efficiency of the proposed approach, as it outperforms previous semi-supervised approaches using only 1% of the labels. Full article
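In student–teacher pipelines like this, the teacher's detections are typically filtered by a confidence threshold before the student treats them as pseudo ground truth. A minimal sketch; the threshold, box format, and class names below are hypothetical, not the paper's settings:

```python
def filter_pseudo_boxes(teacher_preds, score_threshold=0.9):
    """Keep only high-confidence teacher detections as pseudo ground
    truth for the student; low-confidence boxes are discarded so that
    label noise does not dominate training on unlabeled images."""
    return [(box, label) for box, label, score in teacher_preds
            if score >= score_threshold]

# Hypothetical teacher detections on one unlabeled floor plan:
# ((x1, y1, x2, y2), class, confidence)
teacher_preds = [
    ((10, 10, 50, 50), "chair", 0.97),
    ((60, 20, 90, 80), "table", 0.55),   # dropped: below threshold
    ((15, 70, 40, 95), "sofa",  0.92),
]
pseudo_labels = filter_pseudo_boxes(teacher_preds)
print(len(pseudo_labels))  # 2
```

The student then computes its usual detection loss against these pseudo-boxes on unlabeled images, alongside the supervised loss on labeled ones.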

18 pages, 4762 KiB  
Review
A Comprehensive Survey of Depth Completion Approaches
by Muhammad Ahmed Ullah Khan, Danish Nazir, Alain Pagani, Hamam Mokayed, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Sensors 2022, 22(18), 6969; https://doi.org/10.3390/s22186969 - 14 Sep 2022
Cited by 15 | Viewed by 5988
Abstract
Depth maps produced by LiDAR-based approaches are sparse. Even high-end LiDAR sensors produce highly sparse depth maps, which are also noisy around object boundaries. Depth completion is the task of generating a dense depth map from a sparse depth map. While earlier approaches focused on completing this sparsity directly from the sparse depth maps, modern techniques use RGB images as a guidance tool, and many others rely on affinity matrices, to resolve this problem. Based on these approaches, we have divided the literature into two major categories: unguided methods and image-guided methods. The latter is further subdivided into multi-branch and spatial propagation networks. The multi-branch networks further have a sub-category named image-guided filtering. In this paper, we present, for the first time, a comprehensive survey of depth completion methods. We present a novel taxonomy of depth completion approaches, review in detail different state-of-the-art techniques within each category for depth completion of LiDAR data, and provide quantitative results for the approaches on the KITTI and NYUv2 depth completion benchmark datasets. Full article
(This article belongs to the Section Sensors and Robotics)

23 pages, 3511 KiB  
Review
Three-Dimensional Reconstruction from a Single RGB Image Using Deep Learning: A Review
by Muhammad Saif Ullah Khan, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
J. Imaging 2022, 8(9), 225; https://doi.org/10.3390/jimaging8090225 - 23 Aug 2022
Cited by 14 | Viewed by 6911
Abstract
Performing 3D reconstruction from a single 2D input is a challenging problem that is trending in the literature. Until recently, it was an ill-posed optimization problem, but with the advent of learning-based methods, the performance of 3D reconstruction has significantly improved. Infinitely many different 3D objects can be projected onto the same 2D plane, which makes the reconstruction task very difficult. It is even more difficult for objects with complex deformations or no textures. This paper serves as a review of recent literature on 3D reconstruction from a single view, with a focus on deep learning methods from 2018 to 2021. Due to the lack of standard datasets or 3D shape representation methods, it is hard to compare all reviewed methods directly. However, this paper reviews different approaches for reconstructing 3D shapes as depth maps, surface normals, point clouds, and meshes, along with the various loss functions and metrics used to train and evaluate these methods. Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images)

18 pages, 10506 KiB  
Article
Investigating Attention Mechanism for Page Object Detection in Document Images
by Shivam Naik, Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Appl. Sci. 2022, 12(15), 7486; https://doi.org/10.3390/app12157486 - 26 Jul 2022
Cited by 11 | Viewed by 4522
Abstract
Page object detection in scanned document images is a complex task due to varying document layouts and diverse page objects. In the past, traditional methods such as Optical Character Recognition (OCR)-based techniques have been employed to extract textual information. However, these methods fail to comprehend complex page objects such as tables and figures. This paper addresses the localization and classification of graphical objects that visually summarize vital information in documents. Furthermore, this work examines the benefit of incorporating attention mechanisms into different object detection networks to perform page object detection on scanned document images. The models are built with Detectron2, a PyTorch-based framework. The proposed pipelines can be optimized end-to-end and are exhaustively evaluated on publicly available datasets such as DocBank, PubLayNet, and IIIT-AR-13K. The achieved results reflect the effectiveness of incorporating the attention mechanism for page object detection in documents. Full article

17 pages, 935 KiB  
Article
State-of-the-Art in Open-Domain Conversational AI: A Survey
by Tosin Adewumi, Foteini Liwicki and Marcus Liwicki
Information 2022, 13(6), 298; https://doi.org/10.3390/info13060298 - 10 Jun 2022
Cited by 16 | Viewed by 5377
Abstract
We survey state-of-the-art (SoTA) open-domain conversational AI models with the objective of presenting the prevailing challenges that still exist, to spur future research. In addition, we provide statistics on the gender of conversational AI systems in order to guide the ethics discussion surrounding the issue. Open-domain conversational AI models are known to have several challenges, including bland, repetitive responses and performance degradation when prompted with figurative language, among others. First, we provide some background by discussing some topics of interest in conversational AI. We then describe the method applied to the two investigations that make up this study. The first investigation involves a search for recent SoTA open-domain conversational AI models, while the second involves a survey of 100 conversational AI systems to assess their gender. The results show that progress has been made with recent SoTA conversational AI, but there are still persistent challenges that need to be solved, and the female gender is more common than the male for conversational AI systems. One main takeaway is that hybrid models of conversational AI offer more advantages than any single architecture. The key contributions of this survey are (1) the identification of prevailing challenges in SoTA open-domain conversational AI, (2) the rarely held discussion on open-domain conversational AI for low-resource languages, and (3) the discussion of the ethics surrounding the gender of conversational AI. Full article
(This article belongs to the Special Issue Natural Language Processing for Conversational AI)

21 pages, 4473 KiB  
Article
Toward Semi-Supervised Graphical Object Detection in Document Images
by Goutham Kallempudi, Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Future Internet 2022, 14(6), 176; https://doi.org/10.3390/fi14060176 - 8 Jun 2022
Cited by 5 | Viewed by 3579
Abstract
Graphical page object detection classifies and localizes objects such as tables and figures in a document. As deep learning techniques for object detection become increasingly successful, many supervised deep neural-network-based methods have been introduced to recognize graphical objects in documents. However, these models necessitate a substantial amount of labeled data for the training process. This paper presents an end-to-end semi-supervised framework for graphical object detection in scanned document images to address this limitation. Our method is based on the recently proposed Soft Teacher mechanism and examines the effects of small percentages of labeled data on the classification and localization of graphical objects. On both the PubLayNet and the IIIT-AR-13K datasets, the proposed approach outperforms the supervised models by a significant margin at all labeling ratios (1%, 5%, and 10%). Furthermore, the 10% PubLayNet Soft Teacher model improves the average precision of Table, Figure, and List by +5.4, +1.2, and +3.2 points, respectively, with a total mAP similar to the Faster R-CNN baseline. Moreover, our model trained on 10% of the IIIT-AR-13K labeled data beats the previous fully supervised method by +4.5 points. Full article
(This article belongs to the Section Big Data and Augmented Intelligence)

19 pages, 11387 KiB  
Article
Exploiting Concepts of Instance Segmentation to Boost Detection in Challenging Environments
by Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
Sensors 2022, 22(10), 3703; https://doi.org/10.3390/s22103703 - 12 May 2022
Cited by 3 | Viewed by 3236
Abstract
In recent years, due to advancements in machine learning, object detection has become a mainstream task in the computer vision domain. The first phase of object detection is to find the regions where objects can exist. With the improvements in deep learning, traditional approaches such as sliding windows and manual feature selection have been replaced with deep learning techniques. However, like algorithms for any other task, object detection algorithms struggle in low light, challenging weather, and crowded scenes. Such an environment is termed a challenging environment. This paper exploits pixel-level information to improve detection under challenging situations. To this end, we exploit the recently proposed hybrid task cascade network, which works collaboratively with detection and segmentation heads at different cascade levels. We evaluate the proposed methods on three complex datasets, ExDark, CURE-TSD, and RESIDE, and achieve a mAP of 0.71, 0.52, and 0.43, respectively. Our experimental results assert the efficacy of the proposed approach. Full article
(This article belongs to the Section Environmental Sensing)
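The mAP figures reported above rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes. A minimal sketch of the IoU computation (toy boxes, not the paper's data):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2) corner coordinates."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # overlap area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in a 5x5 corner: IoU = 25/175 = 1/7
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

A prediction usually counts as a true positive when its IoU with a ground-truth box exceeds a threshold (commonly 0.5), and mAP averages precision over recall levels and classes under that matching rule.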

19 pages, 911 KiB  
Article
Rethinking the Methods and Algorithms for Inner Speech Decoding and Making Them Reproducible
by Foteini Simistira Liwicki, Vibha Gupta, Rajkumar Saini, Kanjar De and Marcus Liwicki
NeuroSci 2022, 3(2), 226-244; https://doi.org/10.3390/neurosci3020017 - 19 Apr 2022
Cited by 12 | Viewed by 4959
Abstract
This study focuses on the automatic decoding of inner speech using noninvasive methods, such as electroencephalography (EEG). While inner speech has been a research topic in philosophy and psychology for half a century, recent attempts have been made to decode nonvoiced spoken words by using various brain–computer interfaces. The main shortcomings of existing work are reproducibility and the availability of data and code. In this work, we investigate various methods (Convolutional Neural Networks (CNNs), Gated Recurrent Units (GRUs), and Long Short-Term Memory networks (LSTMs)) for the detection task of five vowels and six words on a publicly available EEG dataset. The main contributions of this work are (1) a comparison of subject-dependent vs. subject-independent approaches, (2) an analysis of the effect of different preprocessing steps (Independent Component Analysis (ICA), down-sampling, and filtering), and (3) word classification (where we achieve state-of-the-art performance on a publicly available dataset). Overall, we achieve performance accuracies of 35.20% and 29.21% when classifying five vowels and six words, respectively, on a publicly available dataset, using our tuned iSpeech-CNN architecture. All of our code and processed data are publicly available to ensure reproducibility. As such, this work contributes to a deeper understanding and reproducibility of experiments in the area of inner speech detection. Full article
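Down-sampling, one of the preprocessing steps examined in this study, can be sketched as naive decimation. The sampling rate and signal below are toy assumptions; a real EEG pipeline would apply an anti-aliasing low-pass filter before decimating:

```python
import numpy as np

def downsample(signal, factor):
    """Naive decimation: keep every `factor`-th sample. Reduces the
    effective sampling rate by `factor`."""
    return signal[::factor]

fs = 1024                            # assumed original sampling rate, Hz
t = np.arange(0, 1, 1 / fs)          # one second of samples
eeg = np.sin(2 * np.pi * 10 * t)     # toy 10 Hz "EEG" channel
down = downsample(eeg, 4)            # effective rate: 256 Hz
print(len(eeg), len(down))           # 1024 256
```

Lowering the sampling rate shrinks the input dimensionality the classifier must handle, at the cost of discarding high-frequency content.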