Sensors 2014, 14(6), 9522-9545; doi:10.3390/s140609522

Audio-Visual Perception System for a Humanoid Robotic Head

R. Viciana-Abad *, R. Marfil, J.M. Perez-Lorenzo, J.P. Bandera, A. Romero-Garces and P. Reche-Lopez
1 Multimedia and Multimodal Processing Group, Polytechnic School of Linares, University of Jaén, Alfonso X El Sabio 28, 23700 Linares, Spain
2 Dpto. Tecnología Electrónica, University of Málaga, Campus de Teatinos, 29071 Málaga, Spain
* Author to whom correspondence should be addressed.
Received: 28 December 2013 / Revised: 7 May 2014 / Accepted: 20 May 2014 / Published: 28 May 2014
(This article belongs to the Special Issue State-of-the-Art Sensors Technology in Spain 2013)


One of the main issues in social robotics is endowing robots with the ability to direct their attention to the people with whom they interact. Several approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed setups, such as video-conference rooms, and may therefore run into difficulties when constrained to the sensors with which a robot can be equipped. In addition, within the scope of interactive autonomous robots, the benefits of audio-visual attention mechanisms over audio-only or vision-only approaches have rarely been evaluated in real scenarios: most tests have been conducted in controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information through Bayesian inference for interactive robotics, this paper presents a system that localizes a person by processing visual and audio data. The performance of this system is evaluated and compared against unimodal systems, taking their technical limitations into account. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interaction framework.
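To illustrate the kind of Bayesian audio-visual fusion the abstract refers to, the sketch below combines a coarse audio direction-of-arrival estimate with a more precise visual (face-detection) estimate over a discrete azimuth grid. All names, grid choices and noise parameters here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def gaussian_likelihood(grid, measurement, sigma):
    """Unnormalized Gaussian likelihood of each grid angle given a measurement."""
    return np.exp(-0.5 * ((grid - measurement) / sigma) ** 2)

def fuse(prior, audio_deg, visual_deg, grid,
         sigma_audio=15.0, sigma_visual=5.0):
    """Posterior over azimuth: prior x audio likelihood x visual likelihood,
    assuming the two cues are conditionally independent given the position."""
    posterior = (prior
                 * gaussian_likelihood(grid, audio_deg, sigma_audio)
                 * gaussian_likelihood(grid, visual_deg, sigma_visual))
    return posterior / posterior.sum()  # normalize to a proper distribution

grid = np.arange(-90.0, 91.0, 1.0)            # azimuth angles in degrees
prior = np.full(grid.shape, 1.0 / grid.size)  # uniform prior over azimuth

# Audio DOA suggests ~20 deg (noisy), face detector suggests ~24 deg (precise):
post = fuse(prior, audio_deg=20.0, visual_deg=24.0, grid=grid)
estimate = grid[np.argmax(post)]
# The fused MAP estimate lies between the two cues, pulled toward the
# lower-variance visual measurement.
print(estimate)
```

In a tracking loop, the posterior from one step would serve as (a smoothed version of) the prior for the next, so the robot's attention hypothesis persists when one modality temporarily drops out.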
Keywords: multimodal perception; bio-inspired attention mechanism; human-robot interaction
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).

MDPI and ACS Style

Viciana-Abad, R.; Marfil, R.; Perez-Lorenzo, J.M.; Bandera, J.P.; Romero-Garces, A.; Reche-Lopez, P. Audio-Visual Perception System for a Humanoid Robotic Head. Sensors 2014, 14, 9522-9545.

Sensors EISSN 1424-8220, Published by MDPI AG, Basel, Switzerland