Article

135 Views

26 Pages

Analysis of Fundamental Frequency Changes in Astronaut Speech in Microgravity and in Terrestrial Conditions

Natalia Repyuk,
Anton Konev,
Vladimir Faerman,
Dmitry Rulev and
Grigory Yashchenko

Acoustics 2026, 8(1), 18; https://doi.org/10.3390/acoustics8010018

-

13 March 2026

This study investigates the influence of microgravity on the fundamental frequency (F0) of astronauts’ speech. A speech corpus was compiled, including recordings in microgravity and on Earth, matched by speaker and content. The signal processin...

Full Article

Article

18 Citations

5,872 Views

17 Pages

An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning

Michelle Gutiérrez-Muñoz and
Marvin Coto-Jiménez

Computation 2022, 10(6), 102; https://doi.org/10.3390/computation10060102

-

20 June 2022

The purpose of speech enhancement is to improve the quality of speech signals degraded by noise, reverberation, or other artifacts that can affect the intelligibility, automatic recognition, or other attributes involved in speech technologies and tel...

Full Article

Article

4 Citations

4,260 Views

20 Pages

An Open CAPT System for Prosody Practice: Practical Steps towards Multilingual Setup

John Blake,
Natalia Bogach,
Akemi Kusakari,
Iurii Lezhenin,
Veronica Khaustova,
Son Luu Xuan,
Van Nhi Nguyen,
Nam Ba Pham,
Roman Svechnikov and
Evgeny Pyshkin
+ 2 authors

Languages 2024, 9(1), 27; https://doi.org/10.3390/languages9010027

-

12 January 2024

This paper discusses the challenges posed in creating a Computer-Assisted Pronunciation Training (CAPT) environment for multiple languages. By selecting one language from each of three different language families, we show that a single environment ma...

Full Article

Article

22 Citations

7,676 Views

17 Pages

A Real-Time Dual-Microphone Speech Enhancement Algorithm Assisted by Bone Conduction Sensor

Yi Zhou,
Yufan Chen,
Yongbao Ma and
Hongqing Liu

Sensors 2020, 20(18), 5050; https://doi.org/10.3390/s20185050

-

5 September 2020

The quality and intelligibility of the speech are usually impaired by the interference of background noise when using internet voice calls. To solve this problem in the context of wearable smart devices, this paper introduces a dual-microphone, bone-...

Full Article

Article

15 Citations

6,150 Views

39 Pages

Multi-Modal Emotion Aware System Based on Fusion of Speech and Brain Information

Rania M. Ghoniem,
Abeer D. Algarni and
Khaled Shaalan

Information 2019, 10(7), 239; https://doi.org/10.3390/info10070239

-

11 July 2019

In multi-modal emotion aware frameworks, it is essential to estimate the emotional features then fuse them to different degrees. This basically follows either a feature-level or decision-level strategy. In all likelihood, while features from several...

Full Article

Article

3 Citations

2,262 Views

15 Pages

Frame-Based Phone Classification Using EMG Signals

Inge Salomons,
Eder del Blanco,
Eva Navas,
Inma Hernáez and
Xabier de Zuazo

Appl. Sci. 2023, 13(13), 7746; https://doi.org/10.3390/app13137746

-

30 June 2023

This paper evaluates the impact of inter-speaker and inter-session variability on the development of a silent speech interface (SSI) based on electromyographic (EMG) signals from the facial muscles. The final goal of the SSI is to provide a communica...

Full Article

Article

134 Citations

19,768 Views

22 Pages

Wearable Biomedical Measurement Systems for Assessment of Mental Stress of Combatants in Real Time

Fernando Seoane,
Inmaculada Mohino-Herranz,
Javier Ferreira,
Lorena Alvarez,
Ruben Buendia,
David Ayllón,
Cosme Llerena and
Roberto Gil-Pita

Sensors 2014, 14(4), 7120-7141; https://doi.org/10.3390/s140407120

-

22 April 2014

The Spanish Ministry of Defense, through its Future Combatant program, has sought to develop technology aids with the aim of extending combatants’ operational capabilities. Within this framework the ATREC project funded by the “Coincidente” program a...

Full Article

Article

36 Citations

8,281 Views

23 Pages

A Hybrid U-Lossian Deep Learning Network for Screening and Evaluating Parkinson’s Disease

Rytis Maskeliūnas,
Robertas Damaševičius,
Audrius Kulikajevas,
Evaldas Padervinskis,
Kipras Pribuišis and
Virgilijus Uloza

Appl. Sci. 2022, 12(22), 11601; https://doi.org/10.3390/app122211601

-

15 November 2022

Speech impairment analysis and processing technologies have evolved substantially in recent years, and the use of voice as a biomarker has gained popularity. We have developed an approach for clinical speech signal processing to demonstrate the promi...

Full Article

Article

760 Views

10 Pages

Development of a Speech-in-Noise Test in European Portuguese Based on QuickSIN: A Pilot Study

Margarida Serrano,
Jéssica Simões,
Joana Vicente,
Maria Ferreira,
Ana Murta and
João Tiago Ferrão

J. Otorhinolaryngol. Hear. Balance Med. 2025, 6(2), 22; https://doi.org/10.3390/ohbm6020022

-

26 November 2025

Background and Objectives: Speech-in-noise testing is essential for evaluating functional hearing abilities in clinical practice. Although the Quick Speech-in-Noise test (QuickSIN) is widely used, no equivalent tool existed for European Portuguese. T...

Full Article

Article

5 Citations

4,202 Views

17 Pages

Wireless Mouth Motion Recognition System Based on EEG-EMG Sensors for Severe Speech Impairments

Kee S. Moon,
John S. Kang,
Sung Q. Lee,
Jeff Thompson and
Nicholas Satterlee

Sensors 2024, 24(13), 4125; https://doi.org/10.3390/s24134125

-

25 June 2024

This study aims to demonstrate the feasibility of using a new wireless electroencephalography (EEG)–electromyography (EMG) wearable approach to generate characteristic EEG-EMG mixed patterns with mouth movements in order to detect distinct move...

Full Article

Article

372 Citations

21,437 Views

15 Pages

A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition

Mustaqeem and
Soonil Kwon

Sensors 2020, 20(1), 183; https://doi.org/10.3390/s20010183

-

28 December 2019

Speech is the most significant mode of communication among human beings and a potential method for human-computer interaction (HCI) by using a microphone sensor. Quantifiable emotion recognition using these sensors from speech signals is an emerging...

Full Article

Review

151 Citations

33,555 Views

24 Pages

Augmentative and Alternative Communication (AAC) Advances: A Review of Configurations for Individuals with a Speech Disability

Yasmin Elsahar,
Sijung Hu,
Kaddour Bouazza-Marouf,
David Kerr and
Annysa Mansor

Sensors 2019, 19(8), 1911; https://doi.org/10.3390/s19081911

-

22 April 2019

High-tech augmentative and alternative communication (AAC) methods are on a constant rise; however, the interaction between the user and the assistive technology is still challenged for an optimal user experience centered around the desired activity....

Full Article

Article

399 Views

22 Pages

Real-Time Signal Processing for Distributed Acoustic Sensing and Acoustic Sensing Systems Under Non-Stationary Noise

Samuel Yaw Mensah,
Tao Zhang,
Xin Zhao and
Nahid Al Mahmud

Sensors 2026, 26(4), 1372; https://doi.org/10.3390/s26041372

-

21 February 2026

Real-time acoustic signal enhancement in non-stationary noise remains challenging, especially for sensing systems that must be causal, low latency, and interpretable. This paper proposes a unified Bayesian–Kalman estimator (UBKE) that analytica...

Full Article

Article

28 Citations

6,081 Views

10 Pages

An Algorithm of Daubechies Wavelet Transform in the Final Field When Processing Speech Signals

Dmitry Popov,
Artem Gapochkin and
Alexey Nekrasov

Electronics 2018, 7(7), 120; https://doi.org/10.3390/electronics7070120

-

18 July 2018

Development and improvement of a mathematical model for a large-scale analysis based on the Daubechies discrete wavelet transform will be implemented in an algebraic system possessing a property of ring and field suitable for speech signals processin...

Full Article

Article

1 Citation

4,995 Views

12 Pages

Perceptual Evaluation of Speech Quality for Inexpensive Recording Equipment

Anas Hashmi

Acoustics 2021, 3(1), 200-211; https://doi.org/10.3390/acoustics3010014

-

10 March 2021

This research studies the perceptual evaluation of speech signals using an inexpensive recording device. Different types of noise-reduction and electronic enhancement filters viz. Hamming window, high-pass filter (HPF), Wiener-filter and no-speech ac...

Full Article

Article

22 Citations

8,166 Views

12 Pages

Relationship of Cepstral Peak Prominence-Smoothed and Long-Term Average Spectrum with Auditory–Perceptual Analysis

Angélica Emygdio da Silva Antonetti,
Larissa Thais Donalonso Siqueira,
Maria Paula de Almeida Gobbo,
Alcione Ghedini Brasolotto and
Kelly Cristina Alves Silverio

Appl. Sci. 2020, 10(23), 8598; https://doi.org/10.3390/app10238598

-

1 December 2020

Cepstral peak prominence-smoothed (CPPs) and long-term average spectrum (LTAS) are robust measures that represent the glottal source and source-filter interactions, respectively. Until now, little has been known about how physiological events impact...

Full Article

Communication

631 Views

11 Pages

Precision Audiometry and Ecological Validity: Exploring the Link Between Patient-Reported Outcome Measures and Speech Testing in CI Users

Matthias Hey and
Thomas Hocke

Audiol. Res. 2025, 15(5), 142; https://doi.org/10.3390/audiolres15050142

-

21 October 2025

Background/Objectives: Audiometric methods for hearing-impaired patients are constantly evolving as new therapeutic interventions and improved clinical standards are established. This study aimed to explore the relationship between patient-reported o...

Full Article

Article

8 Citations

3,685 Views

18 Pages

Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions

Piotr Odya,
Jozef Kotus,
Adam Kurowski and
Bozena Kostek

Sensors 2021, 21(18), 6320; https://doi.org/10.3390/s21186320

-

21 September 2021

The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particu...

Full Article

Article

6 Citations

3,348 Views

20 Pages

Speech Inpainting Based on Multi-Layer Long Short-Term Memory Networks

Haohan Shi,
Xiyu Shi and
Safak Dogan

Future Internet 2024, 16(2), 63; https://doi.org/10.3390/fi16020063

-

17 February 2024

Audio inpainting plays an important role in addressing incomplete, damaged, or missing audio signals, contributing to improved quality of service and overall user experience in multimedia communications over the Internet and mobile networks. This pap...

Full Article

Article

3 Citations

1,482 Views

13 Pages

Factors to Describe the Outcome Characteristics of a CI Recipient

Matthias Hey,
Kevyn Kogel,
Jan Dambon,
Alexander Mewes,
Tim Jürgens and
Thomas Hocke

J. Clin. Med. 2024, 13(15), 4436; https://doi.org/10.3390/jcm13154436

-

29 July 2024

Background: In cochlear implant (CI) treatment, there is a large variability in outcome. The aim of our study was to identify the independent audiometric measures that are most directly relevant for describing this variability in outcome characterist...

Full Article

Article

36 Citations

5,541 Views

16 Pages

A Speech Command Control-Based Recognition System for Dysarthric Patients Based on Deep Learning Technology

Yu-Yi Lin,
Wei-Zhong Zheng,
Wei Chung Chu,
Ji-Yan Han,
Ying-Hsiu Hung,
Guan-Min Ho,
Chia-Yuan Chang and
Ying-Hui Lai

Appl. Sci. 2021, 11(6), 2477; https://doi.org/10.3390/app11062477

-

10 March 2021

Voice control is an important way of controlling mobile devices; however, using it remains a challenge for dysarthric patients. Currently, there are many approaches, such as automatic speech recognition (ASR) systems, being used to help dysarthric pa...

Full Article

Review

8 Citations

6,557 Views

22 Pages

Machine-Learning Methods for Speech and Handwriting Detection Using Neural Signals: A Review

Ovishake Sen,
Anna M. Sheehan,
Pranay R. Raman,
Kabir S. Khara,
Adam Khalifa and
Baibhab Chatterjee

Sensors 2023, 23(12), 5575; https://doi.org/10.3390/s23125575

-

14 June 2023

Brain–Computer Interfaces (BCIs) have become increasingly popular in recent years due to their potential applications in diverse fields, ranging from the medical sector (people with motor and/or communication disabilities), cognitive training,...

Full Article

Article

8 Citations

2,928 Views

13 Pages

Prediction of Public Trust in Politicians Using a Multimodal Fusion Approach

Muhammad Shehram Shah Syed,
Elena Pirogova and
Margaret Lech

Electronics 2021, 10(11), 1259; https://doi.org/10.3390/electronics10111259

-

25 May 2021

This paper explores the automatic prediction of public trust in politicians through the use of speech, text, and visual modalities. It evaluates the effectiveness of each modality individually, and it investigates fusion approaches for integrating in...

Full Article

Article

8 Citations

3,513 Views

17 Pages

Assessing Cognitive Workload Using Cardiovascular Measures and Voice

Eydis H. Magnusdottir,
Kamilla R. Johannsdottir,
Arnab Majumdar and
Jon Gudnason

Sensors 2022, 22(18), 6894; https://doi.org/10.3390/s22186894

-

13 September 2022

Monitoring cognitive workload has the potential to improve both the performance and fidelity of human decision making. However, previous efforts towards discriminating further than binary levels (e.g., low/high or neutral/high) in cognitive workload...

Full Article

Article

1 Citation

5,192 Views

20 Pages

A Wearable Silent Text Input System Using EMG and Piezoelectric Sensors

John S. Kang,
Kee S. Moon,
Sung Q. Lee,
Nicholas Satterlee and
Xiaowei Zuo

Sensors 2025, 25(8), 2624; https://doi.org/10.3390/s25082624

-

21 April 2025

This paper introduces a wearable silent text input system designed to capture text input through silent speech, without generating audible sound. The system integrates Electromyography (EMG) and piezoelectric lead zirconate titanate (PZT) sensors in...

Full Article

Article

716 Views

16 Pages

Frequency-Aware Multi-Rate Resampling with Multi-Band Deep Supervision for Modular Speech Denoising

Seon Man Kim

Electronics 2025, 14(22), 4523; https://doi.org/10.3390/electronics14224523

-

19 November 2025

Conventional waveform-based speech enhancement models prioritize temporal modeling, often neglecting the irreversible spectral information loss triggered by standard downsampling. Consequently, this study introduces a novel frequency-aware framework....

Full Article

Review

3,062 Views

29 Pages

Voice-Based Detection of Parkinson’s Disease Using Machine and Deep Learning Approaches: A Systematic Review

Hadi Sedigh Malekroodi,
Byeong-il Lee and
Myunggi Yi

Bioengineering 2025, 12(11), 1279; https://doi.org/10.3390/bioengineering12111279

-

20 November 2025

Parkinson’s disease (PD) is a progressive neurodegenerative disorder characterized by motor and non-motor symptoms, among which vocal impairment is one of the earliest and most prevalent. In recent years, voice analysis supported by machine lea...

Full Article

Article

1 Citation

2,219 Views

17 Pages

Deep Learning System for Speech Command Recognition

Dejan Vujičić,
Đorđe Damnjanović,
Dušan Marković and
Zoran Stamenković

Electronics 2025, 14(19), 3793; https://doi.org/10.3390/electronics14193793

-

24 September 2025

We present a deep learning model for the recognition of speech commands in the English language. The dataset is based on the Google Speech Commands Dataset by Warden P., version 0.01, and it consists of ten distinct commands (“left”, ...

Full Article

Article

3 Citations

4,234 Views

11 Pages

Research on Speech Synthesis Based on Mixture Alignment Mechanism

Yan Deng,
Ning Wu,
Chengjun Qiu,
Yan Chen and
Xueshan Gao

Sensors 2023, 23(16), 7283; https://doi.org/10.3390/s23167283

-

20 August 2023

In recent years, deep learning-based speech synthesis has attracted a lot of attention from the machine learning and speech communities. In this paper, we propose Mixture-TTS, a non-autoregressive speech synthesis model based on mixture alignment mec...

Full Article

Article

55 Citations

7,011 Views

18 Pages

Call Redistribution for a Call Center Based on Speech Emotion Recognition

Milana Bojanić,
Vlado Delić and
Alexey Karpov

Appl. Sci. 2020, 10(13), 4653; https://doi.org/10.3390/app10134653

-

6 July 2020

Call center operators communicate with callers in different emotional states (anger, anxiety, fear, stress, joy, etc.). Sometimes a number of calls coming in a short period of time have to be answered and processed. In the moments when all call cente...

Full Article

Article

22 Citations

4,851 Views

17 Pages

Improving Post-Filtering of Artificial Speech Using Pre-Trained LSTM Neural Networks

Marvin Coto-Jiménez

Biomimetics 2019, 4(2), 39; https://doi.org/10.3390/biomimetics4020039

-

28 May 2019

Several researchers have contemplated deep learning-based post-filters to increase the quality of statistical parametric speech synthesis, which perform a mapping of the synthetic speech to the natural speech, considering the different parameters sep...

Full Article

Article

50 Citations

5,730 Views

12 Pages

Machine Learning Approach to Dysphonia Detection

Zuzana Dankovičová,
Dávid Sovák,
Peter Drotár and
Liberios Vokorokos

Appl. Sci. 2018, 8(10), 1927; https://doi.org/10.3390/app8101927

-

15 October 2018

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech...

Full Article

Review

50 Citations

10,295 Views

21 Pages

Deep Multimodal Emotion Recognition on Human Speech: A Review

Panagiotis Koromilas and
Theodoros Giannakopoulos

Appl. Sci. 2021, 11(17), 7962; https://doi.org/10.3390/app11177962

-

28 August 2021

This work reviews the state of the art in multimodal speech emotion recognition methodologies, focusing on audio, text and visual information. We provide a new, descriptive categorization of methods, based on the way they handle the inter-modality an...

Full Article

Article

27 Citations

5,939 Views

19 Pages

CNN Architectures and Feature Extraction Methods for EEG Imaginary Speech Recognition

Ana-Luiza Rusnac and
Ovidiu Grigore

Sensors 2022, 22(13), 4679; https://doi.org/10.3390/s22134679

-

21 June 2022

Speech is a complex mechanism allowing us to communicate our needs, desires and thoughts. In some cases of neural dysfunctions, this ability is highly affected, which makes everyday life activities that require communication a challenge. This paper s...

Full Article

Article

53 Views

26 Pages

C-EMDNet: A Nonlinear Morphological Deep Framework for Robust Speech Enhancement

Kais Khaldi,
Sahar Almenwer,
Afrah Alanazi,
Inam Alanazi and
Anis Mohamed

Sensors 2026, 26(6), 1917; https://doi.org/10.3390/s26061917

-

18 March 2026

This study introduces C-EMDNet, a nonlinear speech denoising approach that combines the adaptive decomposition capabilities of Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and a deep convolutional architecture operatin...

Full Article

Article

1,592 Views

26 Pages

Feature Generation with Genetic Algorithms for Imagined Speech Electroencephalogram Signal Classification

Edgar Lara-Arellano,
Andras Takacs,
Saul Tovar-Arriaga and
Juvenal Rodríguez-Reséndiz

Eng 2025, 6(4), 75; https://doi.org/10.3390/eng6040075

-

10 April 2025

This work presents a method for classifying EEG (Electroencephalogram) signals generated when a person concentrates on specific words, defined as “Imagined Speech”. Imagined speech is essential to enhance problem-solving, memory, and lang...

Full Article

Systematic Review

9 Citations

6,347 Views

20 Pages

Factors in the Effective Use of Hearing Aids among Subjects with Age-Related Hearing Loss: A Systematic Review

Perrine Morvan,
Johanna Buisson-Savin,
Catherine Boiteux,
Eric Bailly-Masson,
Mareike Buhl and
Hung Thai-Van

J. Clin. Med. 2024, 13(14), 4027; https://doi.org/10.3390/jcm13144027

-

10 July 2024

Objectives: Investigate factors contributing to the effective management of age-related hearing loss (ARHL) rehabilitation. Methods: A systematic review was conducted following PRISMA guidelines. The protocol was registered in PROSPERO (CRD4202237481...

Full Article

Article

83 Citations

10,016 Views

18 Pages

Dementia Detection from Speech Using Machine Learning and Deep Learning Architectures

M. Rupesh Kumar,
Susmitha Vekkot,
S. Lalitha,
Deepa Gupta,
Varasiddhi Jayasuryaa Govindraj,
Kamran Shaukat,
Yousef Ajami Alotaibi and
Mohammed Zakariah

Sensors 2022, 22(23), 9311; https://doi.org/10.3390/s22239311

-

29 November 2022

Dementia affects the patient’s memory and leads to language impairment. Research has demonstrated that speech and language deterioration is often a clear indication of dementia and plays a crucial role in the recognition process. Even though ea...

Full Article

Article

15 Citations

8,966 Views

30 Pages

Speech Signal Analysis in Patients with Parkinson’s Disease, Taking into Account Phonation, Articulation, and Prosody of Speech

Ewelina Majda-Zdancewicz,
Anna Potulska-Chromik,
Monika Nojszewska and
Anna Kostera-Pruszczyk

Appl. Sci. 2024, 14(23), 11085; https://doi.org/10.3390/app142311085

-

28 November 2024

This study involved performing tests to detect Parkinson’s disease (PD) based on voice changes, including speech phonation, articulation, and prosody, in patients with PD using different types of speech signal. For this purpose, during the firs...

Full Article

Article

49 Citations

8,194 Views

21 Pages

Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm

Akmalbek Bobomirzaevich Abdusalomov,
Furkat Safarov,
Mekhriddin Rakhimov,
Boburkhon Turaev and
Taeg Keun Whangbo

Sensors 2022, 22(21), 8122; https://doi.org/10.3390/s22218122

-

24 October 2022

Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker’s features in the speech signal, and recognize the speaker thereafter. In general, the speech recognition process involves thre...

Full Article

Article

4 Citations

2,306 Views

14 Pages

NISQE: Non-Intrusive Speech Quality Evaluator Based on Natural Statistics of Mean Subtracted Contrast Normalized Coefficients of Spectrogram

Shakeel Zafar,
Imran Fareed Nizami,
Mobeen Ur Rehman,
Muhammad Majid and
Jihyoung Ryu

Sensors 2023, 23(12), 5652; https://doi.org/10.3390/s23125652

-

16 June 2023

With the evolution in technology, communication based on the voice has gained importance in applications such as online conferencing, online meetings, voice-over internet protocol (VoIP), etc. Limiting factors such as environmental noise, encoding an...

Full Article

Article

32 Citations

9,172 Views

19 Pages

Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar

Young Hoon Shin and
Jiwon Seo

Sensors 2016, 16(11), 1812; https://doi.org/10.3390/s16111812

-

29 October 2016

People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the mo...

Full Article

Review

17 Citations

6,890 Views

21 Pages

Reconsidering Read and Spontaneous Speech: Causal Perspectives on the Generation of Training Data for Automatic Speech Recognition

Philipp Gabler,
Bernhard C. Geiger,
Barbara Schuppler and
Roman Kern

Information 2023, 14(2), 137; https://doi.org/10.3390/info14020137

-

19 February 2023

Superficially, read and spontaneous speech—the two main kinds of training data for automatic speech recognition—appear as complementary, but are equal: pairs of texts and acoustic signals. Yet, spontaneous speech is typically harder for r...

Full Article

Review

461 Views

39 Pages

An In-Depth Review of Speech Enhancement Algorithms: Classifications, Underlying Principles, Challenges, and Emerging Trends

Nisreen Talib Abdulhusein and
Basheera M. Mahmmod

Algorithms 2026, 19(2), 134; https://doi.org/10.3390/a19020134

-

7 February 2026

Speech enhancement aims to improve speech quality and intelligibility in noisy environments and is important in applications such as hearing aids, mobile communications and automatic speech recognition (ASR). This paper shows a structured review of s...

Full Article

Review

7 Citations

10,965 Views

19 Pages

Speaker Diarization: A Review of Objectives and Methods

Douglas O’Shaughnessy

Appl. Sci. 2025, 15(4), 2002; https://doi.org/10.3390/app15042002

-

14 February 2025

Recorded audio often contains speech from multiple people in conversation. It is useful to label such signals with speaker turns, noting when each speaker is talking and identifying each speaker. This paper discusses how to process speech signals to...

Full Article

Article

8 Citations

3,110 Views

14 Pages

Secure Speech Content Based on Scrambling and Adaptive Hiding

Dora M. Ballesteros and
Diego Renza

Symmetry 2018, 10(12), 694; https://doi.org/10.3390/sym10120694

-

3 December 2018

This paper presents a method for speech steganography using two levels of security: The first one related to the scrambling process, the second one related to the hiding process. The scrambling block uses a technique based on the ability of adaptatio...

Full Article

Article

5 Citations

3,423 Views

29 Pages

Multiresolution Speech Enhancement Based on Proposed Circular Nested Microphone Array in Combination with Sub-Band Affine Projection Algorithm

Ali Dehghan Firoozabadi,
Pablo Irarrazaval,
Pablo Adasme,
David Zabala-Blanco,
Hugo Durney,
Miguel Sanhueza,
Pablo Palacios-Játiva and
Cesar Azurdia-Meza

Appl. Sci. 2020, 10(11), 3955; https://doi.org/10.3390/app10113955

-

6 June 2020

Speech enhancement is one of the most important fields in audio and speech signal processing. The speech enhancement methods are divided into the single and multi-channel algorithms. The multi-channel methods increase the speech enhancement performan...

Full Article

Article

16 Citations

5,595 Views

19 Pages

Speech Enhancement for Secure Communication Using Coupled Spectral Subtraction and Wiener Filter

Hilman Pardede,
Kalamullah Ramli,
Yohan Suryanto,
Nur Hayati and
Alfan Presekal

Electronics 2019, 8(8), 897; https://doi.org/10.3390/electronics8080897

-

14 August 2019

The encryption process for secure voice communication may degrade the speech quality when it is applied to the speech signals before encoding them through a conventional communication system such as GSM or radio trunking. This is because the encrypti...

Full Article

Article

6 Citations

4,250 Views

26 Pages

Frequency, Time, Representation and Modeling Aspects for Major Speech and Audio Processing Applications

Juraj Kacur,
Boris Puterka,
Jarmila Pavlovicova and
Milos Oravec

Sensors 2022, 22(16), 6304; https://doi.org/10.3390/s22166304

-

22 August 2022

There are many speech and audio processing applications and their number is growing. They may cover a wide range of tasks, each having different requirements on the processed speech or audio signals and, therefore, indirectly, on the audio sensors as...

Full Article

Review

5 Citations

3,184 Views

12 Pages

A New Proposal for Phoneme Acquisition: Computing Speaker-Specific Distribution

Mihye Choi and
Mohinish Shukla

Brain Sci. 2021, 11(2), 177; https://doi.org/10.3390/brainsci11020177

-

1 February 2021

Speech is an acoustically variable signal, and one of the sources of this variation is the presence of multiple speakers. Empirical evidence has suggested that adult listeners possess remarkably sensitive (and systematic) abilities to process speech...

Full Article

391 Results Found