Open Access Review
Appl. Sci. 2016, 6(5), 143; doi:10.3390/app6050143

A Review of Physical and Perceptual Feature Extraction Techniques for Speech, Music and Environmental Sounds

GTM - Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Quatre Camins, 30, 08022 Barcelona, Spain
* Author to whom correspondence should be addressed.
Academic Editor: Vesa Välimäki
Received: 15 March 2016 / Revised: 22 April 2016 / Accepted: 28 April 2016 / Published: 12 May 2016
(This article belongs to the Special Issue Audio Signal Processing)

Abstract

Endowing machines with sensing capabilities similar to those of humans is a long-standing goal in engineering and computer science. Considerable effort has been devoted to enabling computers to acquire, process, analyze and understand their environment in a human-like way. For the sense of hearing, the ability of computers to sense their acoustic environment as humans do is known as machine hearing. To achieve this ambitious aim, the representation of the audio signal is of paramount importance. In this paper, we present an up-to-date review of the most relevant audio feature extraction techniques developed to analyze the most common audio signals: speech, music and environmental sounds. Besides revisiting classic approaches for completeness, we include the latest advances in the field based on new domains of analysis, together with novel bio-inspired proposals. These approaches are described following a taxonomy that organizes them according to their physical or perceptual basis and then subdivides them by domain of computation (time, frequency, wavelet, image-based, cepstral, or other domains). The description of the approaches is accompanied by recent examples of their application to machine hearing problems.
Keywords: audio feature extraction; machine hearing; audio analysis; music; speech; environmental sound

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Alías, F.; Socoró, J.C.; Sevillano, X. A Review of Physical and Perceptual Feature Extraction Techniques for Speech, Music and Environmental Sounds. Appl. Sci. 2016, 6, 143.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Appl. Sci. EISSN 2076-3417. Published by MDPI AG, Basel, Switzerland.