391 Results Found

  • Article
  • Open Access
135 Views
26 Pages

Analysis of Fundamental Frequency Changes in Astronaut Speech in Microgravity and in Terrestrial Conditions

  • Natalia Repyuk,
  • Anton Konev,
  • Vladimir Faerman,
  • Dmitry Rulev and
  • Grigory Yashchenko

This study investigates the influence of microgravity on the fundamental frequency (F0) of astronauts’ speech. A speech corpus was compiled, including recordings in microgravity and on Earth, matched by speaker and content. The signal processin...

  • Article
  • Open Access
18 Citations
5,869 Views
17 Pages

The purpose of speech enhancement is to improve the quality of speech signals degraded by noise, reverberation, or other artifacts that can affect the intelligibility, automatic recognition, or other attributes involved in speech technologies and tel...

  • Article
  • Open Access
4 Citations
4,255 Views
20 Pages

An Open CAPT System for Prosody Practice: Practical Steps towards Multilingual Setup

  • John Blake,
  • Natalia Bogach,
  • Akemi Kusakari,
  • Iurii Lezhenin,
  • Veronica Khaustova,
  • Son Luu Xuan,
  • Van Nhi Nguyen,
  • Nam Ba Pham,
  • Roman Svechnikov and
  • Evgeny Pyshkin
  • + 2 authors

12 January 2024

This paper discusses the challenges posed in creating a Computer-Assisted Pronunciation Training (CAPT) environment for multiple languages. By selecting one language from each of three different language families, we show that a single environment ma...

  • Article
  • Open Access
22 Citations
7,660 Views
17 Pages

5 September 2020

The quality and intelligibility of the speech are usually impaired by the interference of background noise when using internet voice calls. To solve this problem in the context of wearable smart devices, this paper introduces a dual-microphone, bone-...

  • Article
  • Open Access
15 Citations
6,142 Views
39 Pages

11 July 2019

In multi-modal emotion aware frameworks, it is essential to estimate the emotional features then fuse them to different degrees. This basically follows either a feature-level or decision-level strategy. In all likelihood, while features from several...

  • Article
  • Open Access
3 Citations
2,259 Views
15 Pages

Frame-Based Phone Classification Using EMG Signals

  • Inge Salomons,
  • Eder del Blanco,
  • Eva Navas,
  • Inma Hernáez and
  • Xabier de Zuazo

30 June 2023

This paper evaluates the impact of inter-speaker and inter-session variability on the development of a silent speech interface (SSI) based on electromyographic (EMG) signals from the facial muscles. The final goal of the SSI is to provide a communica...

  • Article
  • Open Access
134 Citations
19,768 Views
22 Pages

Wearable Biomedical Measurement Systems for Assessment of Mental Stress of Combatants in Real Time

  • Fernando Seoane,
  • Inmaculada Mohino-Herranz,
  • Javier Ferreira,
  • Lorena Alvarez,
  • Ruben Buendia,
  • David Ayllón,
  • Cosme Llerena and
  • Roberto Gil-Pita

22 April 2014

The Spanish Ministry of Defense, through its Future Combatant program, has sought to develop technology aids with the aim of extending combatants’ operational capabilities. Within this framework the ATREC project funded by the “Coincidente” program a...

  • Article
  • Open Access
36 Citations
8,276 Views
23 Pages

A Hybrid U-Lossian Deep Learning Network for Screening and Evaluating Parkinson’s Disease

  • Rytis Maskeliūnas,
  • Robertas Damaševičius,
  • Audrius Kulikajevas,
  • Evaldas Padervinskis,
  • Kipras Pribuišis and
  • Virgilijus Uloza

15 November 2022

Speech impairment analysis and processing technologies have evolved substantially in recent years, and the use of voice as a biomarker has gained popularity. We have developed an approach for clinical speech signal processing to demonstrate the promi...

  • Article
  • Open Access
756 Views
10 Pages

Development of a Speech-in-Noise Test in European Portuguese Based on QuickSIN: A Pilot Study

  • Margarida Serrano,
  • Jéssica Simões,
  • Joana Vicente,
  • Maria Ferreira,
  • Ana Murta and
  • João Tiago Ferrão

Background and Objectives: Speech-in-noise testing is essential for evaluating functional hearing abilities in clinical practice. Although the Quick Speech-in-Noise test (QuickSIN) is widely used, no equivalent tool existed for European Portuguese. T...

  • Article
  • Open Access
5 Citations
4,196 Views
17 Pages

Wireless Mouth Motion Recognition System Based on EEG-EMG Sensors for Severe Speech Impairments

  • Kee S. Moon,
  • John S. Kang,
  • Sung Q. Lee,
  • Jeff Thompson and
  • Nicholas Satterlee

25 June 2024

This study aims to demonstrate the feasibility of using a new wireless electroencephalography (EEG)–electromyography (EMG) wearable approach to generate characteristic EEG-EMG mixed patterns with mouth movements in order to detect distinct move...

  • Article
  • Open Access
370 Citations
21,432 Views
15 Pages

28 December 2019

Speech is the most significant mode of communication among human beings and a potential method for human-computer interaction (HCI) by using a microphone sensor. Quantifiable emotion recognition using these sensors from speech signals is an emerging...

  • Review
  • Open Access
148 Citations
33,521 Views
24 Pages

22 April 2019

High-tech augmentative and alternative communication (AAC) methods are on a constant rise; however, the interaction between the user and the assistive technology is still challenged for an optimal user experience centered around the desired activity....

  • Article
  • Open Access
399 Views
22 Pages

21 February 2026

Real-time acoustic signal enhancement in non-stationary noise remains challenging, especially for sensing systems that must be causal, low latency, and interpretable. This paper proposes a unified Bayesian–Kalman estimator (UBKE) that analytica...

  • Article
  • Open Access
28 Citations
6,081 Views
10 Pages

Development and improvement of a mathematical model for a large-scale analysis based on the Daubechies discrete wavelet transform will be implemented in an algebraic system possessing a property of ring and field suitable for speech signals processin...

  • Article
  • Open Access
1 Citation
4,992 Views
12 Pages

10 March 2021

This research studies the perceptual evaluation of speech signals using an inexpensive recording device. Different types of noise-reduction and electronic enhancement filters viz. Hamming window, high-pass filter (HPF), Wiener-filter and no-speech ac...

  • Article
  • Open Access
22 Citations
8,161 Views
12 Pages

Relationship of Cepstral Peak Prominence-Smoothed and Long-Term Average Spectrum with Auditory–Perceptual Analysis

  • Angélica Emygdio da Silva Antonetti,
  • Larissa Thais Donalonso Siqueira,
  • Maria Paula de Almeida Gobbo,
  • Alcione Ghedini Brasolotto and
  • Kelly Cristina Alves Silverio

1 December 2020

Cepstral peak prominence-smoothed (CPPs) and long-term average spectrum (LTAS) are robust measures that represent the glottal source and source-filter interactions, respectively. Until now, little has been known about how physiological events impact...

  • Article
  • Open Access
35 Citations
5,534 Views
16 Pages

A Speech Command Control-Based Recognition System for Dysarthric Patients Based on Deep Learning Technology

  • Yu-Yi Lin,
  • Wei-Zhong Zheng,
  • Wei Chung Chu,
  • Ji-Yan Han,
  • Ying-Hsiu Hung,
  • Guan-Min Ho,
  • Chia-Yuan Chang and
  • Ying-Hui Lai

10 March 2021

Voice control is an important way of controlling mobile devices; however, using it remains a challenge for dysarthric patients. Currently, there are many approaches, such as automatic speech recognition (ASR) systems, being used to help dysarthric pa...

  • Article
  • Open Access
3 Citations
1,482 Views
13 Pages

Factors to Describe the Outcome Characteristics of a CI Recipient

  • Matthias Hey,
  • Kevyn Kogel,
  • Jan Dambon,
  • Alexander Mewes,
  • Tim Jürgens and
  • Thomas Hocke

29 July 2024

Background: In cochlear implant (CI) treatment, there is a large variability in outcome. The aim of our study was to identify the independent audiometric measures that are most directly relevant for describing this variability in outcome characterist...

  • Article
  • Open Access
8 Citations
3,679 Views
18 Pages

Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions

  • Piotr Odya,
  • Jozef Kotus,
  • Adam Kurowski and
  • Bozena Kostek

21 September 2021

The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particu...

  • Article
  • Open Access
5 Citations
3,341 Views
20 Pages

17 February 2024

Audio inpainting plays an important role in addressing incomplete, damaged, or missing audio signals, contributing to improved quality of service and overall user experience in multimedia communications over the Internet and mobile networks. This pap...

  • Communication
  • Open Access
628 Views
11 Pages

Background/Objectives: Audiometric methods for hearing-impaired patients are constantly evolving as new therapeutic interventions and improved clinical standards are established. This study aimed to explore the relationship between patient-reported o...

  • Review
  • Open Access
8 Citations
6,552 Views
22 Pages

Machine-Learning Methods for Speech and Handwriting Detection Using Neural Signals: A Review

  • Ovishake Sen,
  • Anna M. Sheehan,
  • Pranay R. Raman,
  • Kabir S. Khara,
  • Adam Khalifa and
  • Baibhab Chatterjee

14 June 2023

Brain–Computer Interfaces (BCIs) have become increasingly popular in recent years due to their potential applications in diverse fields, ranging from the medical sector (people with motor and/or communication disabilities), cognitive training,...

  • Article
  • Open Access
26 Citations
5,936 Views
19 Pages

21 June 2022

Speech is a complex mechanism allowing us to communicate our needs, desires and thoughts. In some cases of neural dysfunctions, this ability is highly affected, which makes everyday life activities that require communication a challenge. This paper s...

  • Article
  • Open Access
81 Citations
10,012 Views
18 Pages

Dementia Detection from Speech Using Machine Learning and Deep Learning Architectures

  • M. Rupesh Kumar,
  • Susmitha Vekkot,
  • S. Lalitha,
  • Deepa Gupta,
  • Varasiddhi Jayasuryaa Govindraj,
  • Kamran Shaukat,
  • Yousef Ajami Alotaibi and
  • Mohammed Zakariah

29 November 2022

Dementia affects the patient’s memory and leads to language impairment. Research has demonstrated that speech and language deterioration is often a clear indication of dementia and plays a crucial role in the recognition process. Even though ea...

  • Article
  • Open Access
1,588 Views
26 Pages

Feature Generation with Genetic Algorithms for Imagined Speech Electroencephalogram Signal Classification

  • Edgar Lara-Arellano,
  • Andras Takacs,
  • Saul Tovar-Arriaga and
  • Juvenal Rodríguez-Reséndiz

10 April 2025

This work presents a method for classifying EEG (Electroencephalogram) signals generated when a person concentrates on specific words, defined as “Imagined Speech”. Imagined speech is essential to enhance problem-solving, memory, and lang...

  • Article
  • Open Access
716 Views
16 Pages

19 November 2025

Conventional waveform-based speech enhancement models prioritize temporal modeling, often neglecting the irreversible spectral information loss triggered by standard downsampling. Consequently, this study introduces a novel frequency-aware framework....

  • Review
  • Open Access
3,048 Views
29 Pages

Parkinson’s disease (PD) is a progressive neurodegenerative disorder characterized by motor and non-motor symptoms, among which vocal impairment is one of the earliest and most prevalent. In recent years, voice analysis supported by machine lea...

  • Article
  • Open Access
53 Views
26 Pages

C-EMDNet: A Nonlinear Morphological Deep Framework for Robust Speech Enhancement

  • Kais Khaldi,
  • Sahar Almenwer,
  • Afrah Alanazi,
  • Inam Alanazi and
  • Anis Mohamed

18 March 2026

This study introduces C-EMDNet, a nonlinear speech denoising approach that combines the adaptive decomposition capabilities of Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and a deep convolutional architecture operatin...

  • Article
  • Open Access
1 Citation
5,188 Views
20 Pages

A Wearable Silent Text Input System Using EMG and Piezoelectric Sensors

  • John S. Kang,
  • Kee S. Moon,
  • Sung Q. Lee,
  • Nicholas Satterlee and
  • Xiaowei Zuo

21 April 2025

This paper introduces a wearable silent text input system designed to capture text input through silent speech, without generating audible sound. The system integrates Electromyography (EMG) and piezoelectric lead zirconate titanate (PZT) sensors in...

  • Systematic Review
  • Open Access
9 Citations
6,337 Views
20 Pages

Factors in the Effective Use of Hearing Aids among Subjects with Age-Related Hearing Loss: A Systematic Review

  • Perrine Morvan,
  • Johanna Buisson-Savin,
  • Catherine Boiteux,
  • Eric Bailly-Masson,
  • Mareike Buhl and
  • Hung Thai-Van

10 July 2024

Objectives: Investigate factors contributing to the effective management of age-related hearing loss (ARHL) rehabilitation. Methods: A systematic review was conducted following PRISMA guidelines. The protocol was registered in PROSPERO (CRD4202237481...

  • Article
  • Open Access
14 Citations
8,962 Views
30 Pages

Speech Signal Analysis in Patients with Parkinson’s Disease, Taking into Account Phonation, Articulation, and Prosody of Speech

  • Ewelina Majda-Zdancewicz,
  • Anna Potulska-Chromik,
  • Monika Nojszewska and
  • Anna Kostera-Pruszczyk

28 November 2024

This study involved performing tests to detect Parkinson’s disease (PD) based on voice changes, including speech phonation, articulation, and prosody, in patients with PD using different types of speech signal. For this purpose, during the firs...

  • Article
  • Open Access
8 Citations
3,508 Views
17 Pages

Assessing Cognitive Workload Using Cardiovascular Measures and Voice

  • Eydis H. Magnusdottir,
  • Kamilla R. Johannsdottir,
  • Arnab Majumdar and
  • Jon Gudnason

13 September 2022

Monitoring cognitive workload has the potential to improve both the performance and fidelity of human decision making. However, previous efforts towards discriminating further than binary levels (e.g., low/high or neutral/high) in cognitive workload...

  • Article
  • Open Access
22 Citations
4,846 Views
17 Pages

Several researchers have contemplated deep learning-based post-filters to increase the quality of statistical parametric speech synthesis, which perform a mapping of the synthetic speech to the natural speech, considering the different parameters sep...

  • Article
  • Open Access
55 Citations
7,008 Views
18 Pages

6 July 2020

Call center operators communicate with callers in different emotional states (anger, anxiety, fear, stress, joy, etc.). Sometimes a number of calls coming in a short period of time have to be answered and processed. In the moments when all call cente...

  • Article
  • Open Access
8 Citations
2,927 Views
13 Pages

Prediction of Public Trust in Politicians Using a Multimodal Fusion Approach

  • Muhammad Shehram Shah Syed,
  • Elena Pirogova and
  • Margaret Lech

This paper explores the automatic prediction of public trust in politicians through the use of speech, text, and visual modalities. It evaluates the effectiveness of each modality individually, and it investigates fusion approaches for integrating in...

  • Review
  • Open Access
48 Citations
10,290 Views
21 Pages

Deep Multimodal Emotion Recognition on Human Speech: A Review

  • Panagiotis Koromilas and
  • Theodoros Giannakopoulos

28 August 2021

This work reviews the state of the art in multimodal speech emotion recognition methodologies, focusing on audio, text and visual information. We provide a new, descriptive categorization of methods, based on the way they handle the inter-modality an...

  • Article
  • Open Access
3 Citations
4,231 Views
11 Pages

Research on Speech Synthesis Based on Mixture Alignment Mechanism

  • Yan Deng,
  • Ning Wu,
  • Chengjun Qiu,
  • Yan Chen and
  • Xueshan Gao

20 August 2023

In recent years, deep learning-based speech synthesis has attracted a lot of attention from the machine learning and speech communities. In this paper, we propose Mixture-TTS, a non-autoregressive speech synthesis model based on mixture alignment mec...

  • Article
  • Open Access
1 Citation
2,205 Views
17 Pages

Deep Learning System for Speech Command Recognition

  • Dejan Vujičić,
  • Đorđe Damnjanović,
  • Dušan Marković and
  • Zoran Stamenković

24 September 2025

We present a deep learning model for the recognition of speech commands in the English language. The dataset is based on the Google Speech Commands Dataset by Warden P., version 0.01, and it consists of ten distinct commands (“left”, “...

  • Article
  • Open Access
50 Citations
5,730 Views
12 Pages

Machine Learning Approach to Dysphonia Detection

  • Zuzana Dankovičová,
  • Dávid Sovák,
  • Peter Drotár and
  • Liberios Vokorokos

15 October 2018

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech...

  • Article
  • Open Access
46 Citations
8,192 Views
21 Pages

Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm

  • Akmalbek Bobomirzaevich Abdusalomov,
  • Furkat Safarov,
  • Mekhriddin Rakhimov,
  • Boburkhon Turaev and
  • Taeg Keun Whangbo

24 October 2022

Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker’s features in the speech signal, and recognize the speaker thereafter. In general, the speech recognition process involves thre...

  • Article
  • Open Access
4 Citations
2,304 Views
14 Pages

16 June 2023

With the evolution in technology, communication based on the voice has gained importance in applications such as online conferencing, online meetings, voice-over internet protocol (VoIP), etc. Limiting factors such as environmental noise, encoding an...

  • Article
  • Open Access
31 Citations
9,172 Views
19 Pages

29 October 2016

People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the mo...

  • Review
  • Open Access
17 Citations
6,880 Views
21 Pages

19 February 2023

Superficially, read and spontaneous speech—the two main kinds of training data for automatic speech recognition—appear as complementary, but are equal: pairs of texts and acoustic signals. Yet, spontaneous speech is typically harder for r...

  • Review
  • Open Access
461 Views
39 Pages

7 February 2026

Speech enhancement aims to improve speech quality and intelligibility in noisy environments and is important in applications such as hearing aids, mobile communications and automatic speech recognition (ASR). This paper shows a structured review of s...

  • Review
  • Open Access
7 Citations
10,936 Views
19 Pages

14 February 2025

Recorded audio often contains speech from multiple people in conversation. It is useful to label such signals with speaker turns, noting when each speaker is talking and identifying each speaker. This paper discusses how to process speech signals to...

  • Article
  • Open Access
8 Citations
3,110 Views
14 Pages

3 December 2018

This paper presents a method for speech steganography using two levels of security: The first one related to the scrambling process, the second one related to the hiding process. The scrambling block uses a technique based on the ability of adaptatio...

  • Article
  • Open Access
5 Citations
3,420 Views
29 Pages

Multiresolution Speech Enhancement Based on Proposed Circular Nested Microphone Array in Combination with Sub-Band Affine Projection Algorithm

  • Ali Dehghan Firoozabadi,
  • Pablo Irarrazaval,
  • Pablo Adasme,
  • David Zabala-Blanco,
  • Hugo Durney,
  • Miguel Sanhueza,
  • Pablo Palacios-Játiva and
  • Cesar Azurdia-Meza

6 June 2020

Speech enhancement is one of the most important fields in audio and speech signal processing. The speech enhancement methods are divided into the single and multi-channel algorithms. The multi-channel methods increase the speech enhancement performan...

  • Article
  • Open Access
16 Citations
5,590 Views
19 Pages

Speech Enhancement for Secure Communication Using Coupled Spectral Subtraction and Wiener Filter

  • Hilman Pardede,
  • Kalamullah Ramli,
  • Yohan Suryanto,
  • Nur Hayati and
  • Alfan Presekal

The encryption process for secure voice communication may degrade the speech quality when it is applied to the speech signals before encoding them through a conventional communication system such as GSM or radio trunking. This is because the encrypti...

  • Article
  • Open Access
6 Citations
4,245 Views
26 Pages

22 August 2022

There are many speech and audio processing applications and their number is growing. They may cover a wide range of tasks, each having different requirements on the processed speech or audio signals and, therefore, indirectly, on the audio sensors as...

  • Review
  • Open Access
5 Citations
3,183 Views
12 Pages

1 February 2021

Speech is an acoustically variable signal, and one of the sources of this variation is the presence of multiple speakers. Empirical evidence has suggested that adult listeners possess remarkably sensitive (and systematic) abilities to process speech...
