In this paper we describe a novel algorithm, inspired by the mirror neuron discovery, to support automatic learning oriented to advanced man-machine interfaces. The algorithm introduces several points of innovation, based on complex metrics of similarity that involve different characteristics of the entire learning process. In more detail, the proposed approach deals with an humanoid robot algorithm suited for automatic vocalization acquisition from a human tutor. The learned vocalization can be used to multi-modal reproduction of speech, as the articulatory and acoustic parameters that compose the vocalization database can be used to synthesize unrestricted speech utterances and reproduce the articulatory and facial movements of the humanoid talking face automatically synchronized. The algorithm uses fuzzy articulatory rules, which describe transitions between phonemes derived from the International Phonetic Alphabet (IPA), to allow simpler adaptation to different languages, and genetic optimization of the membership degrees. Large experimental evaluation and analysis of the proposed algorithm on synthetic and real data sets confirms the benefits of our proposal. Indeed, experimental results show that the vocalization acquired respects the basic phonetic rules of Italian languages and that subjective results show the effectiveness of multi-modal speech production with automatic synchronization between facial movements and speech emissions. The algorithm has been applied to a virtual speaking face but it may also be used in mechanical vocalization systems as well.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited