Sign in to use this feature.

Years

Between: -

Subjects

remove_circle_outline

Journals

Article Types

Countries / Regions

Search Results (2)

Search Parameters:
Keywords = audio inpainting

Order results
Result details
Results per page
Select all
Export citation of selected articles as:
20 pages, 19399 KB  
Article
Speech Inpainting Based on Multi-Layer Long Short-Term Memory Networks
by Haohan Shi, Xiyu Shi and Safak Dogan
Future Internet 2024, 16(2), 63; https://doi.org/10.3390/fi16020063 - 17 Feb 2024
Cited by 3 | Viewed by 2623
Abstract
Audio inpainting plays an important role in addressing incomplete, damaged, or missing audio signals, contributing to improved quality of service and overall user experience in multimedia communications over the Internet and mobile networks. This paper presents an innovative solution for speech inpainting using [...] Read more.
Audio inpainting plays an important role in addressing incomplete, damaged, or missing audio signals, contributing to improved quality of service and overall user experience in multimedia communications over the Internet and mobile networks. This paper presents an innovative solution for speech inpainting using Long Short-Term Memory (LSTM) networks, i.e., a restoring task where the missing parts of speech signals are recovered from the previous information in the time domain. The lost or corrupted speech signals are also referred to as gaps. We regard the speech inpainting task as a time-series prediction problem in this research work. To address this problem, we designed multi-layer LSTM networks and trained them on different speech datasets. Our study aims to investigate the inpainting performance of the proposed models on different datasets and with varying LSTM layers and explore the effect of multi-layer LSTM networks on the prediction of speech samples in terms of perceived audio quality. The inpainted speech quality is evaluated through the Mean Opinion Score (MOS) and a frequency analysis of the spectrogram. Our proposed multi-layer LSTM models are able to restore up to 1 s of gaps with high perceptual audio quality using the features captured from the time domain only. Specifically, for gap lengths under 500 ms, the MOS can reach up to 3~4, and for gap lengths ranging between 500 ms and 1 s, the MOS can reach up to 2~3. In the time domain, the proposed models can proficiently restore the envelope and trend of lost speech signals. In the frequency domain, the proposed models can restore spectrogram blocks with higher similarity to the original signals at frequencies less than 2.0 kHz and comparatively lower similarity at frequencies in the range of 2.0 kHz~8.0 kHz. Full article
(This article belongs to the Special Issue Deep Learning and Natural Language Processing II)
Show Figures

Figure 1

8 pages, 219 KB  
Editorial
Deep Learning Applications with Practical Measured Results in Electronics Industries
by Mong-Fong Horng, Hsu-Yang Kung, Chi-Hua Chen and Feng-Jang Hwang
Electronics 2020, 9(3), 501; https://doi.org/10.3390/electronics9030501 - 19 Mar 2020
Cited by 8 | Viewed by 3938
Abstract
This editorial introduces the Special Issue, entitled “Deep Learning Applications with Practical Measured Results in Electronics Industries”, of Electronics. Topics covered in this issue include four main parts: (I) environmental information analyses and predictions, (II) unmanned aerial vehicle (UAV) and object tracking [...] Read more.
This editorial introduces the Special Issue, entitled “Deep Learning Applications with Practical Measured Results in Electronics Industries”, of Electronics. Topics covered in this issue include four main parts: (I) environmental information analyses and predictions, (II) unmanned aerial vehicle (UAV) and object tracking applications, (III) measurement and denoising techniques, and (IV) recommendation systems and education systems. Four papers on environmental information analyses and predictions are as follows: (1) “A Data-Driven Short-Term Forecasting Model for Offshore Wind Speed Prediction Based on Computational Intelligence” by Panapakidis et al.; (2) “Multivariate Temporal Convolutional Network: A Deep Neural Networks Approach for Multivariate Time Series Forecasting” by Wan et al.; (3) “Modeling and Analysis of Adaptive Temperature Compensation for Humidity Sensors” by Xu et al.; (4) “An Image Compression Method for Video Surveillance System in Underground Mines Based on Residual Networks and Discrete Wavelet Transform” by Zhang et al. Three papers on UAV and object tracking applications are as follows: (1) “Trajectory Planning Algorithm of UAV Based on System Positioning Accuracy Constraints” by Zhou et al.; (2) “OTL-Classifier: Towards Imaging Processing for Future Unmanned Overhead Transmission Line Maintenance” by Zhang et al.; (3) “Model Update Strategies about Object Tracking: A State of the Art Review” by Wang et al. Five papers on measurement and denoising techniques are as follows: (1) “Characterization and Correction of the Geometric Errors in Using Confocal Microscope for Extended Topography Measurement. Part I: Models, Algorithms Development and Validation” by Wang et al.; (2) “Characterization and Correction of the Geometric Errors Using a Confocal Microscope for Extended Topography Measurement, Part II: Experimental Study and Uncertainty Evaluation” by Wang et al.; (3) “Deep Transfer HSI Classification Method Based on Information Measure and Optimal Neighborhood Noise Reduction” by Lin et al.; (4) “Quality Assessment of Tire Shearography Images via Ensemble Hybrid Faster Region-Based ConvNets” by Chang et al.; (5) “High-Resolution Image Inpainting Based on Multi-Scale Neural Network” by Sun et al. Two papers on recommendation systems and education systems are as follows: (1) “Deep Learning-Enhanced Framework for Performance Evaluation of a Recommending Interface with Varied Recommendation Position and Intensity Based on Eye-Tracking Equipment Data Processing” by Sulikowski et al. and (2) “Generative Adversarial Network Based Neural Audio Caption Model for Oral Evaluation” by Zhang et al. Full article
Back to TopTop