
111 Results Found

  • Article
  • Open Access
63 Citations
8,434 Views
19 Pages

SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams

  • Madina Abdrakhmanova,
  • Askat Kuzdeuov,
  • Sheikh Jarju,
  • Yerbolat Khassanov,
  • Michael Lewis and
  • Huseyin Atakan Varol

16 May 2021

We present SpeakingFaces as a publicly-available large-scale multimodal dataset developed to support machine learning research in contexts that utilize a combination of thermal, visual, and audio data streams; examples include human–computer interact...

  • Article
  • Open Access
5 Citations
8,210 Views
13 Pages

Annotated-VocalSet: A Singing Voice Dataset

  • Behnam Faghih and
  • Joseph Timoney

15 September 2022

There are insufficient datasets of singing files that are adequately annotated. One of the available datasets that includes a variety of vocal techniques (n = 17) and several singers (m = 20) with several WAV files (p = 3560) is the VocalSet dataset....

  • Article
  • Open Access
47 Citations
7,652 Views
27 Pages

16 April 2021

Stereotype is a type of social bias massively present in texts that computational models use. There are stereotypes that present special difficulties because they do not rely on personal attributes. This is the case of stereotypes about immigrants, a...

  • Feature Paper
  • Article
  • Open Access
15 Citations
5,667 Views
13 Pages

Mental Illness Stigma and Associated Factors among Arabic-Speaking Religious and Community Leaders

  • Klimentina Krstanoska-Blazeska,
  • Russell Thomson and
  • Shameran Slewa-Younan

Evidence suggests that Arabic-speaking refugees in Australia seek help from informal sources, including religious and community leaders, when experiencing mental health issues. Despite their significant influence, there is scarce research exploring a...

  • Article
  • Open Access
4 Citations
6,523 Views
17 Pages

28 February 2023

Existing persona-based dialogue generation models focus on the semantic consistency between personas and responses. However, various influential factors can cause persona inconsistency, such as the speaking style in the context. Existing models perfo...

  • Article
  • Open Access
1 Citation
1,363 Views
11 Pages

Utilization of the Spanish Bisyllable Word Recognition Test to Assess Cochlear Implant Performance Trajectory

  • Meredith A. Holcomb,
  • Erin Williams,
  • Sandra Prentiss,
  • Chrisanda M. Sanchez,
  • Molly R. Smeal,
  • Tina Stern,
  • Amanda K. Tolen,
  • Sandra Velandia and
  • Jennifer Coto

24 January 2025

Objectives: The aims of this study were to compare pre- and post-operative word recognition scores (WRSs) for the adult Spanish-speaking population and to describe their cochlear implant (CI) performance trajectory. Methods: A retrospective chart rev...

  • Article
  • Open Access
1 Citation
2,541 Views
18 Pages

Updated Swiss Growth References 2025: No Height Differences, but BMI Variations Associated with Migration

  • Urs Eiholzer,
  • Anika Stephan,
  • Ilja Dubinski,
  • Christiane Fritz and
  • Cees Noordam

21 August 2025

Background/Objectives: The 2019 Swiss growth references for height, weight, and BMI were based on a large dataset from the German-speaking part of Switzerland (Cohort 2019). The current study aimed to ensure national representativeness by proportiona...

  • Article
  • Open Access
26 Citations
5,888 Views
15 Pages

Improving the Accuracy of Automatic Facial Expression Recognition in Speaking Subjects with Deep Learning

  • Sathya Bursic,
  • Giuseppe Boccignone,
  • Alfio Ferrara,
  • Alessandro D’Amelio and
  • Raffaella Lanzarotti

9 June 2020

When automatic facial expression recognition is applied to video sequences of speaking subjects, the recognition accuracy has been noted to be lower than with video sequences of still subjects. This effect known as the speaking effect arises during s...

  • Article
  • Open Access
19 Citations
5,020 Views
16 Pages

DisCaaS: Micro Behavior Analysis on Discussion by Camera as a Sensor

  • Ko Watanabe,
  • Yusuke Soneda,
  • Yuki Matsuda,
  • Yugo Nakamura,
  • Yutaka Arakawa,
  • Andreas Dengel and
  • Shoya Ishimaru

25 August 2021

The emergence of various types of commercial cameras (compact, high resolution, high angle of view, high speed, and high dynamic range, etc.) has contributed significantly to the understanding of human activities. By taking advantage of the character...

  • Article
  • Open Access
25 Citations
4,723 Views
13 Pages

Detecting Hateful and Offensive Speech in Arabic Social Media Using Transfer Learning

  • Zakaria Boulouard,
  • Mariya Ouaissa,
  • Mariyam Ouaissa,
  • Moez Krichen,
  • Mutiq Almutiq and
  • Karim Gasmi

14 December 2022

The democratization of access to internet and social media has given an opportunity for every individual to openly express his or her ideas and feelings. Unfortunately, this has also created room for extremist, racist, misogynist, and offensive opini...

  • Article
  • Open Access
17 Citations
5,935 Views
16 Pages

Object detection is an important computer vision technique that has increasingly attracted the attention of researchers in recent years. The literature to date in the field has introduced a range of object detection models. However, these models have...

  • Data Descriptor
  • Open Access
9 Citations
6,163 Views
8 Pages

Visual Lip Reading Dataset in Turkish

  • Ali Berkol,
  • Talya Tümer-Sivri,
  • Nergis Pervan-Akman,
  • Melike Çolak and
  • Hamit Erdem

5 January 2023

The promised dataset was obtained from daily Turkish words and phrases pronounced by various people in videos posted on YouTube. The purpose of compiling the dataset was to provide a method for the detection of the spoken word by recognizing patterns...

  • Article
  • Open Access
38 Citations
11,516 Views
19 Pages

The success of YouTube has attracted a lot of users, which results in an increase of the number of comments present on YouTube channels. By analyzing those comments we could provide insight to the YouTubers that would help them to deliver better qual...

  • Article
  • Open Access
13 Citations
2,472 Views
15 Pages

24 August 2022

Sign language has played a crucial role in the lives of impaired people having hearing and speaking disabilities. They can send messages via hand gesture movement. Arabic Sign Language (ASL) recognition is a very difficult task because of its high co...

  • Article
  • Open Access
3 Citations
4,967 Views
15 Pages

The Real-Time Image Sequences-Based Stress Assessment Vision System for Mental Health

  • Mavlonbek Khomidov,
  • Deokwoo Lee,
  • Chang-Hyun Kim and
  • Jong-Ha Lee

Early detection and prevention of stress is crucial because stress affects our vital signs like heart rate, blood pressure, skin temperature, respiratory rate, and heart rate variability. There are different ways to determine stress using different d...

  • Article
  • Open Access
1 Citation
2,130 Views
16 Pages

Effects of Sinusoidal Model on Non-Parallel Voice Conversion with Adversarial Learning

  • Mohammed Salah Al-Radhi,
  • Tamás Gábor Csapó and
  • Géza Németh

15 August 2021

Voice conversion (VC) transforms the speaking style of a source speaker to the speaking style of a target speaker by keeping linguistic information unchanged. Traditional VC techniques rely on parallel recordings of multiple speakers uttering the sam...

  • Article
  • Open Access
60 Citations
8,186 Views
17 Pages

Recommending Advanced Deep Learning Models for Efficient Insect Pest Detection

  • Wei Li,
  • Tengfei Zhu,
  • Xiaoyu Li,
  • Jianzhang Dong and
  • Jun Liu

Insect pest management is one of the main ways to improve the crop yield and quality in agriculture and it can accurately and timely detect insect pests, which is of great significance to agricultural production. In the past, most insect pest detecti...

  • Data Descriptor
  • Open Access
2,766 Views
18 Pages

BELMASK—An Audiovisual Dataset of Adversely Produced Speech for Auditory Cognition Research

  • Cleopatra Christina Moshona,
  • Frederic Rudawski,
  • André Fiebig and
  • Ennes Sarradj

24 July 2024

In this article, we introduce the Berlin Dataset of Lombard and Masked Speech (BELMASK), a phonetically controlled audiovisual dataset of speech produced in adverse speaking conditions, and describe the development of the related speech task. The dat...

  • Article
  • Open Access
1,972 Views
23 Pages

Background/Objectives: This study investigates the application of machine learning (ML) techniques in diagnosing speech sound disorders (SSDs) in Saudi Arabic-speaking children, with a specific focus on phonological biomarkers, particularly Infrequen...

  • Article
  • Open Access
3 Citations
1,940 Views
17 Pages

Recently, multimodal approaches that combine various modalities have been attracting attention to recognizing emotions more accurately. Although multimodal fusion delivers strong performance, it is computationally intensive and difficult to handle in...

  • Data Descriptor
  • Open Access
1 Citation
2,068 Views
11 Pages

Electroencephalogram Dataset of Visually Imagined Arabic Alphabet for Brain–Computer Interface Design and Evaluation

  • Rami Alazrai,
  • Khalid Naqi,
  • Alaa Elkouni,
  • Amr Hamza,
  • Farah Hammam,
  • Sahar Qaadan,
  • Mohammad I. Daoud,
  • Mostafa Z. Ali and
  • Hasan Al-Nashash

22 May 2025

Visual imagery (VI) is a mental process in which an individual generates and sustains a mental image of an object without physically seeing it. Recent advancements in assistive technology have enabled the utilization of VI mental tasks as a control p...

  • Article
  • Open Access
2 Citations
868 Views
17 Pages

18 September 2025

Active Speaker Localization (ASL) involves identifying both who is speaking and where they are speaking from within audiovisual content. This capability is crucial in constrained and acoustically challenging environments, such as aircraft cabins duri...

  • Article
  • Open Access
138 Citations
14,060 Views
24 Pages

31 July 2020

Remote sensing targets have different dimensions, and they have the characteristics of dense distribution and a complex background. This makes remote sensing target detection difficult. With the aim at detecting remote sensing targets at different sc...

  • Article
  • Open Access
1 Citation
2,565 Views
21 Pages

Background: Pre-Trained Language Models hold significant promise for revolutionizing mental health care by delivering accessible and culturally sensitive resources. Despite this potential, their efficacy in mental health applications, particularly in...

  • Article
  • Open Access
4 Citations
3,729 Views
16 Pages

Learning the Relative Dynamic Features for Word-Level Lipreading

  • Hao Li,
  • Nurbiya Yadikar,
  • Yali Zhu,
  • Mutallip Mamut and
  • Kurban Ubul

13 May 2022

Lipreading is a technique for analyzing sequences of lip movements and then recognizing the speech content of a speaker. Limited by the structure of our vocal organs, the number of pronunciations we could make is finite, leading to problems with homo...

  • Article
  • Open Access
21 Citations
5,573 Views
18 Pages

28 February 2022

This paper employs a unique sensor fusion (SF) approach to detect a COVID-19 suspect and the enhanced MobileNetV2 model is used for face mask detection on an Internet-of-Things (IoT) platform. The SF algorithm avoids incorrect predictions of the susp...

  • Article
  • Open Access
542 Views
27 Pages

Accurate classification of cognitive levels in instructional dialogues is essential for personalized education and intelligent teaching systems. However, most existing methods predominantly rely on static textual features and a shallow semantic analy...

  • Article
  • Open Access
4 Citations
2,522 Views
11 Pages

17 October 2022

Current medical science has not yet found a cure for dementia. The most important measures to combat dementia are to detect the tendency toward cognitive decline as early as possible and to intervene at an early stage. For this reason, screening for...

  • Proceeding Paper
  • Open Access
1 Citation
1,591 Views
6 Pages

21 September 2023

The information that needs to be communicated to the partner depends heavily on how emotions are expressed in human communication. There are many different ways that people can communicate their feelings. Body language, facial expressions, eye contac...

  • Article
  • Open Access
9 Citations
13,311 Views
12 Pages

Atypical Genotypes for Canine Agouti Signaling Protein Suggest Novel Chromosomal Rearrangement

  • Dayna L. Dreger,
  • Heidi Anderson,
  • Jonas Donner,
  • Jessica A. Clark,
  • Arlene Dykstra,
  • Angela M. Hughes and
  • Kari J. Ekenstedt

3 July 2020

Canine coat color is a readily observed phenotype of great interest to dog enthusiasts; it is also an excellent avenue to explore the mechanisms of genetics and inheritance. As such, multiple commercial testing laboratories include basic color allele...

  • Article
  • Open Access
15 Citations
8,084 Views
25 Pages

16 August 2023

Chatbots are programs with the ability to understand and respond to natural language in a way that is both informative and engaging. This study explored the current trends of using transformers and transfer learning techniques on Arabic chatbots. The...

  • Article
  • Open Access
4 Citations
2,845 Views
22 Pages

26 July 2023

In recent years, emotional dialogue generation garnered widespread attention and made significant progress in the English-speaking domain. However, research on emotional dialogue generation in Chinese still faces two critical issues: firstly, the lac...

  • Article
  • Open Access
28 Citations
20,138 Views
24 Pages

Identity Leadership, Employee Burnout and the Mediating Role of Team Identification: Evidence from the Global Identity Leadership Development Project

  • Rolf van Dick,
  • Berrit L. Cordes,
  • Jérémy E. Lemoine,
  • Niklas K. Steffens,
  • S. Alexander Haslam,
  • Serap Arslan Akfirat,
  • Christine Joy A. Ballada,
  • Tahir Bazarov,
  • John Jamir Benzon R. Aruta and
  • Rudolf Kerschreiter
  • + 42 authors

Do leaders who build a sense of shared social identity in their teams thereby protect them from the adverse effects of workplace stress? This is a question that the present paper explores by testing the hypothesis that identity leadership contributes...

  • Article
  • Open Access
4 Citations
3,770 Views
16 Pages

Simplicial-Map Neural Networks Robust to Adversarial Examples

  • Eduardo Paluzo-Hidalgo,
  • Rocio Gonzalez-Diaz,
  • Miguel A. Gutiérrez-Naranjo and
  • Jónathan Heras

15 January 2021

Broadly speaking, an adversarial example against a classification model occurs when a small perturbation on an input data point produces a change on the output label assigned by the model. Such adversarial examples represent a weakness for the safety...

  • Article
  • Open Access
16 Citations
5,086 Views
18 Pages

Multi-Path and Group-Loss-Based Network for Speech Emotion Recognition in Multi-Domain Datasets

  • Kyoung Ju Noh,
  • Chi Yoon Jeong,
  • Jiyoun Lim,
  • Seungeun Chung,
  • Gague Kim,
  • Jeong Mook Lim and
  • Hyuntae Jeong

24 February 2021

Speech emotion recognition (SER) is a natural method of recognizing individual emotions in everyday life. To distribute SER models to real-world applications, some key challenges must be overcome, such as the lack of datasets tagged with emotion labe...

  • Article
  • Open Access
1 Citation
3,667 Views
12 Pages

5 May 2022

People with speech impediments and hearing impairments, whether congenital or acquired, often encounter difficulty in speaking. Therefore, to acquire conversational communication abilities, it is necessary to practice lipreading and imitation so that...

  • Article
  • Open Access
1,337 Views
14 Pages

Extracting Information from Unstructured Medical Reports Written in Minority Languages: A Case Study of Finnish

  • Elisa Myllylä,
  • Pekka Siirtola,
  • Antti Isosalo,
  • Jarmo Reponen,
  • Satu Tamminen and
  • Outi Laatikainen

1 July 2025

In the era of digital healthcare, electronic health records generate vast amounts of data, much of which is unstructured, and therefore, not in a usable format for conventional machine learning and artificial intelligence applications. This study inv...

  • Article
  • Open Access
10 Citations
3,652 Views
18 Pages

29 October 2021

The recent increase in user interaction with social media has completely changed the way customers communicate their opinions, questions, and concerns to brands. For this reason, many companies have established on the top of their agendas the necessi...

  • Article
  • Open Access
1 Citation
729 Views
13 Pages

Toward the Alleviation of the H0 Tension in Myrzakulov f(R,T) Gravity

  • Mashael A. Aljohani,
  • Emad E. Mahmoud,
  • Koblandy Yerzhanov and
  • Almira Sergazina

29 July 2025

In this work, we provide a promising way to alleviate the Hubble tension within the framework of Myrzakulov f(R,T) gravity. The latter incorporates both curvature and torsion under a non-special connection. We consider the f(R,T) = R + αR² class, w...

  • Article
  • Open Access
179 Citations
36,636 Views
12 Pages

Deepsign: Sign Language Detection and Recognition Using Deep Learning

  • Deep Kothadiya,
  • Chintan Bhatt,
  • Krenil Sapariya,
  • Kevin Patel,
  • Ana-Belén Gil-González and
  • Juan M. Corchado

The predominant means of communication is speech; however, there are persons whose speaking or hearing abilities are impaired. Communication presents a significant barrier for persons with such disabilities. The use of deep learning methods can help...

  • Article
  • Open Access
20 Citations
4,151 Views
17 Pages

Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech

  • Nikola Simić,
  • Siniša Suzić,
  • Tijana Nosek,
  • Mia Vujović,
  • Zoran Perić,
  • Milan Savić and
  • Vlado Delić

16 March 2022

Speaker recognition is an important classification task, which can be solved using several approaches. Although building a speaker recognition model on a closed set of speakers under neutral speaking conditions is a well-researched task and there are...

  • Article
  • Open Access
1,112 Views
10 Pages

25 July 2025

People frequently choose not to be in an intimate relationship, but the reasons behind this choice vary. In the current study, we analyzed a dataset pooled from previous studies, consisting of 3226 Greek-speaking participants, 357 of whom were volunt...

  • Article
  • Open Access
42 Citations
6,850 Views
21 Pages

13 July 2020

A system for the automatic classification of cardiac sounds can be of great help for doctors in the diagnosis of cardiac diseases. Generally speaking, the main stages of such systems are (i) the pre-processing of the heart sound signal, (ii) the segm...

  • Proceeding Paper
  • Open Access

7 November 2025

Voice acoustics have been extensively investigated as potential non-invasive markers for Autism Spectrum Disorder (ASD). Although many studies report high accuracies, they typically rely on highly controlled clinical protocols that reduce linguistic...

  • Article
  • Open Access
3 Citations
3,092 Views
31 Pages

25 December 2024

This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike...

  • Article
  • Open Access
5 Citations
2,470 Views
15 Pages

A Web-Based Model to Predict a Neurological Disorder Using ANN

  • Abdulwahab Ali Almazroi,
  • Hitham Alamin,
  • Radhakrishnan Sujatha and
  • Noor Zaman Jhanjhi

Dementia is a condition in which cognitive ability deteriorates beyond what can be anticipated with natural ageing. Characteristically it is recurring and deteriorates gradually with time affecting a person’s ability to remember, think logicall...

  • Article
  • Open Access
1 Citation
4,056 Views
20 Pages

16 March 2025

Voice activity detection (VAD) is the process of automatically determining whether a person is speaking and identifying the timing of their speech in an audiovisual data. Traditionally, this task has been tackled by processing either audio signals or...

  • Article
  • Open Access
58 Citations
9,470 Views
28 Pages

Depression Detection Based on Hybrid Deep Learning SSCL Framework Using Self-Attention Mechanism: An Application to Social Networking Data

  • Aleena Nadeem,
  • Muhammad Naveed,
  • Muhammad Islam Satti,
  • Hammad Afzal,
  • Tanveer Ahmad and
  • Ki-Il Kim

13 December 2022

In today’s world, mental health diseases have become highly prevalent, and depression is one of the mental health problems that has become widespread. According to WHO reports, depression is the second-leading cause of the global burden of dise...

  • Article
  • Open Access
5 Citations
3,300 Views
29 Pages

Continuous Arabic Sign Language Recognition Models

  • Nahlah Algethami,
  • Raghad Farhud,
  • Manal Alghamdi,
  • Huda Almutairi,
  • Maha Sorani and
  • Noura Aleisa

5 May 2025

A significant communication gap persists between the deaf and hearing communities, often leaving deaf individuals isolated and marginalised. This challenge is especially pronounced for Arabic-speaking individuals, given the lack of publicly available...

  • Review
  • Open Access
61 Citations
11,830 Views
20 Pages

Abstractive vs. Extractive Summarization: An Experimental Review

  • Nikolaos Giarelis,
  • Charalampos Mastrokostas and
  • Nikos Karacapilidis

28 June 2023

Text summarization is a subtask of natural language processing referring to the automatic creation of a concise and fluent summary that captures the main ideas and topics from one or multiple documents. Earlier literature surveys focus on extractive...
