312 Results Found

  • Article
  • Open Access
3 Citations
2,734 Views
15 Pages

14 February 2023

In IP audio systems, audio quality is degraded by environmental noise, poor network quality, and encoding–decoding algorithms. Therefore, there is a need for a continuous automatic quality evaluation of the transmitted audio. Speech quality mon...

  • Article
  • Open Access
3,042 Views
13 Pages

Evaluation of Speech Quality Through Recognition and Classification of Phonemes

  • Svetlana Pekarskikh,
  • Evgeny Kostyuchenko and
  • Lidiya Balatskaya

25 November 2019

This paper discusses an approach for assessing the quality of speech while undergoing speech rehabilitation. One of the main reasons for speech quality decrease during the surgical treatment of vocal tract diseases is the loss of the vocal tract's pa...

  • Article
  • Open Access
4 Citations
3,584 Views
18 Pages

22 December 2020

The paper focuses on the description of a system for the automatic evaluation of synthetic speech quality based on the Gaussian mixture model (GMM) classifier. The speech material originating from a real speaker is compared with synthesized material...

  • Article
  • Open Access
9 Citations
4,934 Views
12 Pages

Development and Evaluation of Speech Synthesis System Based on Deep Learning Models

  • Alakbar Valizada,
  • Sevil Jafarova,
  • Emin Sultanov and
  • Samir Rustamov

7 May 2021

This study concentrates on the investigation, development, and evaluation of Text-to-Speech Synthesis systems based on Deep Learning models for the Azerbaijani Language. We have selected and compared state-of-the-art models, Tacotron and Deep Convolut...

  • Article
  • Open Access
6 Citations
9,416 Views
19 Pages

10 July 2023

Voice cloning, an emerging field in the speech-processing area, aims to generate synthetic utterances that closely resemble the voices of specific individuals. In this study, we investigated the impact of various techniques on improving the quality o...

  • Article
  • Open Access
1 Citation
4,985 Views
12 Pages

10 March 2021

This research studies the perceptual evaluation of speech signals using an inexpensive recording device. Different types of noise-reduction and electronic enhancement filters viz. Hamming window, high-pass filter (HPF), Wiener-filter and no-speech ac...

  • Article
  • Open Access
4 Citations
2,287 Views
14 Pages

16 June 2023

With the evolution in technology, communication based on the voice has gained importance in applications such as online conferencing, online meetings, voice-over internet protocol (VoIP), etc. Limiting factors such as environmental noise, encoding an...

  • Article
  • Open Access
8 Citations
3,405 Views
19 Pages

25 August 2022

Indoor acoustic quality is one of the critical indicators for occupants’ health, comfort, and productivity in contemporary office environments. Post-occupancy evaluation (POE) is usually employed to examine in situ acoustic measurements to ensu...

  • Article
  • Open Access
2 Citations
2,770 Views
14 Pages

Management Discourse Analysis of High- and Low-Efficacy Schools: A Comparative Study of Factors Influencing School Performance

  • Jesús García-Jiménez,
  • Inés Lucas-Oliva,
  • Javier Rodríguez-Santero and
  • Juan-Jesús Torres-Gordillo

Offering an efficient, egalitarian, and quality education is an agreed-upon goal in society that aims to guarantee upwards social mobility. For this reason, the objectives of this article are to determine how Andalusian primary schools with high and...

  • Article
  • Open Access
32 Citations
9,412 Views
10 Pages

Speech Enhancement for Hearing Aids with Deep Learning on Environmental Noises

  • Gyuseok Park,
  • Woohyeong Cho,
  • Kyu-Sung Kim and
  • Sangmin Lee

2 September 2020

Hearing aids are small electronic devices designed to improve hearing for persons with impaired hearing, using sophisticated audio signal processing algorithms and technologies. In general, the speech enhancement algorithms in hearing aids remove the...

  • Article
  • Open Access
1 Citation
975 Views
17 Pages

DeCGAN: Speech Enhancement Algorithm for Air Traffic Control

  • Haijun Liang,
  • Yimin He,
  • Hanwen Chang and
  • Jianguo Kong

24 April 2025

Air traffic control (ATC) communication is susceptible to speech noise interference, which undermines the quality of civil aviation speech. To resolve this problem, we propose a speech enhancement model, termed DeCGAN, based on the DeConformer genera...

  • Article
  • Open Access
1,979 Views
39 Pages

The Development and Experimental Evaluation of a Multilingual Speech Corpus for Low-Resource Turkic Languages

  • Aidana Karibayeva,
  • Vladislav Karyukin,
  • Ualsher Tukeyev,
  • Balzhan Abduali,
  • Dina Amirova,
  • Diana Rakhimova,
  • Rashid Aliyev and
  • Assem Shormakova

5 December 2025

The development of parallel audio corpora for Turkic languages, such as Kazakh, Uzbek, and Tatar, remains a significant challenge in the development of multilingual speech synthesis, recognition systems, and machine translation. These languages are l...

  • Review
  • Open Access
387 Views
39 Pages

7 February 2026

Speech enhancement aims to improve speech quality and intelligibility in noisy environments and is important in applications such as hearing aids, mobile communications and automatic speech recognition (ASR). This paper shows a structured review of s...

  • Article
  • Open Access
2,725 Views
18 Pages

The quality of air traffic control speech is crucial. However, internal and external noise can impact air traffic control speech quality. Clear speech instructions and feedback help optimize flight processes and responses to emergencies. The traditio...

  • Article
  • Open Access
3 Citations
3,261 Views
16 Pages

21 July 2021

Deep neural networks have been applied for speech enhancements efficiently. However, for large variations of speech patterns and noisy environments, an individual neural network with a fixed number of hidden layers causes strong interference, which c...

  • Article
  • Open Access
5 Citations
4,794 Views
12 Pages

3 December 2023

Speech synthesis is a technology that converts text into speech waveforms. With the development of deep learning, neural network-based speech synthesis technology is being researched in various fields, and the quality of synthesized speech has signif...

  • Article
  • Open Access
3 Citations
2,825 Views
21 Pages

11 July 2022

Speech coding is an essential technology for digital cellular communications, voice over IP, and video conferencing systems. For more than 25 years, the main approach to speech coding for these applications has been block-based analysis-by-synthesis...

  • Article
  • Open Access
5 Citations
4,186 Views
19 Pages

21 October 2020

In hearing aid devices, speech enhancement techniques are a critical component to enable users with hearing loss to attain improved speech quality under noisy conditions. Recently, the deep denoising autoencoder (DDAE) was adopted successfully for re...

  • Article
  • Open Access
1 Citation
1,577 Views
17 Pages

A Helium Speech Correction Method Based on Generative Adversarial Networks

  • Hongjun Li,
  • Yuxiang Chen,
  • Hongwei Ji and
  • Shibing Zhang

The distortion of helium speech caused by helium-oxygen gas mixtures significantly impacts the safety and communication efficiency of saturation divers. Although existing correction methods have shown some effectiveness in improving the intelli...

  • Article
  • Open Access
466 Views
14 Pages

AirSpeech: Lightweight Speech Synthesis Framework for Home Intelligent Space Service Robots

  • Xiugong Qin,
  • Fenghu Pan,
  • Jing Gao,
  • Shilong Huang,
  • Yichen Sun and
  • Xiao Zhong

Text-to-Speech (TTS) methods typically employ a sequential approach with an Acoustic Model (AM) and a vocoder, using a Mel spectrogram as an intermediate representation. However, in home environments, TTS systems often struggle with issues such as in...

  • Article
  • Open Access
12 Citations
22,166 Views
19 Pages

Text-to-Speech (TTS) systems have made strides but creating natural-sounding human voices remains challenging. Existing methods rely on noncomprehensive models with only one-layer nonlinear transformations, which are less effective for processing com...

  • Communication
  • Open Access
9 Citations
3,630 Views
13 Pages

13 June 2022

In this study, we propose a method to reduce noise from speech obtained from a general microphone using the information of a throat microphone. A throat microphone records a sound by detecting the vibration of the skin surface near the throat directl...

  • Article
  • Open Access
11 Citations
9,848 Views
14 Pages

17 May 2011

In this paper, a packet loss concealment (PLC) algorithm for CELP-type speech coders is proposed in order to improve the quality of decoded speech under burst packet loss conditions in a wireless sensor network. Conventional receiver-based PLC algori...

  • Article
  • Open Access
9 Citations
2,835 Views
18 Pages

Subjective Experience of Speech Depending on the Acoustic Treatment in an Ordinary Room

  • Emma Arvidsson,
  • Erling Nilsson,
  • Delphine Bard-Hagberg and
  • Ola J. I. Karlsson

In environments such as classrooms and offices, complex tasks are performed. A satisfactory acoustic environment is critical for the performance of such tasks. To ensure a good acoustic environment, the right acoustic treatment must be used. The rela...

  • Editorial
  • Open Access
1,322 Views
7 Pages

Special Issue on IberSPEECH 2022: Speech and Language Technologies for Iberian Languages

  • José L. Pérez-Córdoba,
  • Francesc Alías-Pujol and
  • Zoraida Callejas

24 May 2024

This Special Issue presents the latest advances in research and novel applications of speech and language technologies based on the works presented at the sixth edition of the IberSPEECH conference held in Granada in 2022, paying special attention to...

  • Article
  • Open Access
2 Citations
1,823 Views
21 Pages

This paper presents a new speech-enhancement approach based on an enhanced empirical wavelet transform, considering the time and scale adaptation of thresholds for individual component signals obtained from the used transform. The time adaptation is...

  • Article
  • Open Access
1 Citation
3,334 Views
15 Pages

29 November 2024

Speech enhancement technology seeks to improve the quality and intelligibility of speech signals degraded by noise, particularly in telephone communications. Recent advancements have focused on leveraging deep neural networks (DNN), especially U-Net...

  • Article
  • Open Access
5 Citations
2,198 Views
16 Pages

29 September 2023

In the current field of air traffic control speech, there is a lack of effective objective speech quality evaluation methods. This paper proposes a new network framework based on ResNet–BiLSTM to address this issue. Firstly, the mel-spectrogram...

  • Article
  • Open Access
2,415 Views
15 Pages

17 April 2024

This paper addresses a joint training approach applied to a pipeline comprising speech enhancement (SE) and automatic speech recognition (ASR) models, where an acoustic tokenizer is included in the pipeline to leverage the linguistic information from...

  • Editorial
  • Open Access
2,492 Views
8 Pages

4 January 2020

The main goal of this Special Issue is to present the latest advances in research and novel applications of speech and language technologies based on the works presented at the IberSPEECH edition held in Barcelona in 2018, paying special attention to...

  • Article
  • Open Access
4,607 Views
19 Pages

3 July 2024

Recent advancements in text-to-speech (TTS) models have aimed to streamline the two-stage process into a single-stage training approach. However, many single-stage models still lag behind in audio quality, particularly when handling Kurdish text and...

  • Article
  • Open Access
4 Citations
3,116 Views
20 Pages

14 January 2022

Moderate performance in terms of intelligibility and naturalness can be obtained using previously established silent speech interface (SSI) methods. Nevertheless, a common problem associated with SSI has involved deficiencies in estimating the spectr...

  • Article
  • Open Access
5 Citations
3,644 Views
14 Pages

Oral Impacts of Aligners versus Fixed Self-Ligating Lingual Orthodontic Appliances

  • Gerassimos G. Angelopoulos,
  • Panagiotis Kanarelis,
  • Georgia Vagdouti,
  • Ageliki Zavlanou and
  • Iosif Sifakakis

27 October 2021

The aim of this prospective study was to compare a fixed lingual orthodontic appliance with a commonly used aligner system, focusing on oral impacts and speech disturbances, during the first 3 months of orthodontic treatment. Two groups of adults wer...

  • Article
  • Open Access
5 Citations
3,300 Views
20 Pages

17 February 2024

Audio inpainting plays an important role in addressing incomplete, damaged, or missing audio signals, contributing to improved quality of service and overall user experience in multimedia communications over the Internet and mobile networks. This pap...

  • Article
  • Open Access
1 Citation
4,412 Views
29 Pages

Speech Recognition and Synthesis Models and Platforms for the Kazakh Language

  • Aidana Karibayeva,
  • Vladislav Karyukin,
  • Balzhan Abduali and
  • Dina Amirova

10 October 2025

With the rapid development of artificial intelligence and machine learning technologies, automatic speech recognition (ASR) and text-to-speech (TTS) have become key components of the digital transformation of society. The Kazakh language, as a repres...

  • Article
  • Open Access
1 Citation
5,116 Views
19 Pages

25 August 2022

Acoustic echo in full-duplex telecommunication systems is a common problem that may cause desired-speech quality degradation during double-talk periods. This problem is especially challenging in low signal-to-echo ratio (SER) scenarios, such as hands...

  • Article
  • Open Access
13 Citations
4,437 Views
13 Pages

Efficacy of Bilateral Cochlear Implantation in Pediatric and Adult Patients with Profound Sensorineural Hearing Loss: A Retrospective Analysis in a Developing European Country

  • Claudia Raluca Balasa Virzob,
  • Marioara Poenaru,
  • Raluca Morar,
  • Ioana Delia Horhat,
  • Nicolae Constantin Balica,
  • Reshmanth Prathipati,
  • Radu Dumitru Moleriu,
  • Ana-Olivia Toma,
  • Iulius Juganaru and
  • Stela Iurciuc
  • + 5 authors

18 April 2023

This retrospective study aimed to evaluate the outcomes of bilateral cochlear implantation in patients with severe-to-profound sensorineural hearing loss at the Timisoara Municipal Emergency Clinical Hospital ENT Clinic. The study involved 77 partici...

  • Article
  • Open Access
4 Citations
3,763 Views
25 Pages

Deep Learning-Based Speech Enhancement for Robust Sound Classification in Security Systems

  • Samuel Yaw Mensah,
  • Tao Zhang,
  • Nahid AI Mahmud and
  • Yanzhang Geng

Deep learning has emerged as a powerful technique for speech enhancement, particularly in security systems where audio signals are often degraded by non-stationary noise. Traditional signal processing methods struggle in such conditions, making it di...

  • Article
  • Open Access
2 Citations
8,587 Views
16 Pages

Automatic Speech Recognition of Vietnamese for a New Large-Scale Corpus

  • Linh Thi Thuc Tran,
  • Han-Gyu Kim,
  • Hoang Minh La and
  • Su Van Pham

Vietnamese is an under-resourced language. The requirement for a large-scale and high-quality Vietnamese speech corpus increases on demand. We introduce a new large-scale Vietnamese speech corpus with 100.5 h collected from various audio sources in t...

  • Article
  • Open Access
5,094 Views
27 Pages

14 May 2025

Voice conversion (VC) is an advanced technology that enables the transformation of raw speech into high-quality audio resembling the target speaker’s voice while preserving the original linguistic content and prosodic patterns. In this study, w...

  • Feature Paper
  • Article
  • Open Access
8 Citations
2,894 Views
14 Pages

26 June 2021

Pathological speech such as Oesophageal Speech (OS) is difficult to understand due to the presence of undesired artefacts and lack of normal healthy speech characteristics. Modern speech technologies and machine learning enable us to transform pathol...

  • Article
  • Open Access
2 Citations
3,379 Views
16 Pages

22 December 2022

Online multi-microphone speech enhancement aims to extract target speech from multiple noisy inputs by exploiting the spatial information as well as the spectro-temporal characteristics with low latency. Acoustic parameters such as the acoustic trans...

  • Article
  • Open Access
8 Citations
3,839 Views
20 Pages

ODIN112–AI-Assisted Emergency Services in Romania

  • Dan Ungureanu,
  • Stefan-Adrian Toma,
  • Ion-Dorinel Filip,
  • Bogdan-Costel Mocanu,
  • Iulian Aciobăniței,
  • Bogdan Marghescu,
  • Titus Balan,
  • Mihai Dascalu,
  • Ion Bica and
  • Florin Pop

3 January 2023

The evolution of Natural Language Processing technologies transformed them into viable choices for various accessibility features and for facilitating interactions between humans and computers. A subset of them consists of speech processing systems,...

  • Case Report
  • Open Access
1 Citation
1,235 Views
4 Pages

Quality of Life and Speech Perception in Two Late Deafened Adults with Cochlear Implants

  • Marwa F. Abdrabbou,
  • Denise A. Tucker,
  • Mary V. Compton and
  • Lyn Mankoff

The aim was to demonstrate the need for a quality of life assessment in biopsychosocial aural rehabilitation (AR) practices with late deafened adults (LDAs) with cochlear implants (CIs). We present a case report of a medical records review of two LDA...

  • Article
  • Open Access
1 Citation
6,033 Views
12 Pages

16 January 2025

Background: Hypokinetic dysarthria is a speech disorder observed in almost 90% of PD patients that can appear at any stage of the disease, usually worsening as the disease progresses. Today, speech therapy intervention in PD is seen as a possible the...

  • Article
  • Open Access
2 Citations
3,925 Views
23 Pages

16 January 2018

In this work, a multiple speech source separation method using inter-channel correlation and relaxed sparsity is proposed. A B-format microphone with four spatially located channels is adopted due to the size of the microphone array to preserve the s...

  • Article
  • Open Access
2 Citations
3,233 Views
21 Pages

A Dual Stream Generative Adversarial Network with Phase Awareness for Speech Enhancement

  • Xintao Liang,
  • Yuhang Li,
  • Xiaomin Li,
  • Yue Zhang and
  • Youdong Ding

4 April 2023

Implementing single-channel speech enhancement under unknown noise conditions is a challenging problem. Most existing time-frequency domain methods are based on the amplitude spectrogram, and these methods often ignore the phase mismatch between nois...

  • Article
  • Open Access
348 Views
22 Pages

3 January 2026

In this study, we propose a novel automated model for speech quality estimation that objectively evaluates perceptual dysphonia severity and breathiness in audio samples, demonstrating strong correlation with expert ratings. The proposed model integr...

  • Article
  • Open Access
2,633 Views
18 Pages

8 May 2025

The score-based diffusion model has made significant progress in the field of computer vision, surpassing the performance of generative models, such as variational autoencoders, and has been extended to applications such as speech enhancement and rec...
