Search Results (21)

Search Parameters:
Keywords = head-related transfer function (HRTF)

24 pages, 3755 KiB  
Article
Pilot Data for a New Headphone-Based Assessment of Absolute Localization in the Assessment of Auditory Processing Disorder (APD)
by Jack Hargreaves, Julia Sarant, Bryn Douglas and Harvey Dillon
Audiol. Res. 2025, 15(1), 12; https://doi.org/10.3390/audiolres15010012 - 27 Jan 2025
Viewed by 1260
Abstract
Background/Objectives: Localization deficit is often said to be a symptom of Auditory Processing Disorder (APD). However, no clinically viable assessment of localization ability has been developed to date. The current study presents pilot data for a new assessment of absolute auditory localization using headphones. Methods: Speech phrases encoded with non-individualized head-related transfer functions (HRTF) using real-time digital processing were presented to two cohorts of participants with normal hearing. Variations in the simulated environment (anechoic and reverberant) and signal to noise ratio (SNR) were made to assess each of these factors’ influences on localization performance. Experiment 1 assessed 30 young adults aged 21–33 years old and Experiment 2 assessed 28 young adults aged 21–29 years old. All participants had hearing thresholds better than 20 dB HL. Results: Participants performed the localization task with a moderate degree of accuracy (Experiment 1: Mean RMS error = 25.9°; Experiment 2: Mean RMS error 27.2°). Front–back errors (FBEs) were evident, contributing to an average RMS error that was notably elevated when compared to similar free-field tasks. There was no statistically significant influence from the simulated environment or SNR on performance. Conclusions: An exploration of test viability in the pediatric and APD-positive populations is warranted alongside further correction for FBEs; however, the potential for future clinical implementation of this measure of absolute auditory localization is encouraging. Full article

46 pages, 2469 KiB  
Review
A Review on Head-Related Transfer Function Generation for Spatial Audio
by Valeria Bruschi, Loris Grossi, Nefeli A. Dourou, Andrea Quattrini, Alberto Vancheri, Tiziano Leidi and Stefania Cecchi
Appl. Sci. 2024, 14(23), 11242; https://doi.org/10.3390/app142311242 - 2 Dec 2024
Viewed by 5731
Abstract
A head-related transfer function (HRTF) is a mathematical model that describes the acoustic path between a sound source and a listener’s ear. Through binaural synthesis techniques, HRTFs play a crucial role in creating immersive audio experiences over headphones or loudspeakers. HRTF measurements can be conducted either with standardised mannequins or with in-ear microphones on real subjects. However, various challenges arise from, for example, individual differences in head shape, pinnae geometry, and torso dimensions, as well as from the extensive number of measurements required for optimal audio immersion. To address these issues, numerous methods have been developed to generate new HRTFs from existing data or through computer simulations. This review paper provides an overview of the current approaches and technologies for generating, adapting, and optimising HRTFs, with a focus on physical modelling, anthropometric techniques, machine learning methods, interpolation strategies, and their practical applications. Full article
(This article belongs to the Special Issue Spatial Audio and Sound Design)

21 pages, 4976 KiB  
Article
The Effect of Training on Localizing HoloLens-Generated 3D Sound Sources
by Wonyeol Ryu, Sukhan Lee and Eunil Park
Sensors 2024, 24(11), 3442; https://doi.org/10.3390/s24113442 - 27 May 2024
Cited by 1 | Viewed by 1459
Abstract
Sound localization is a crucial aspect of human auditory perception. VR (virtual reality) technologies provide immersive audio platforms that allow human listeners to experience natural sounds based on their ability to localize sound. However, the sound simulations generated by these platforms, which are based on a generic head-related transfer function (HRTF), often lack accuracy in individual sound perception and localization because of significant individual differences in this function. In this study, we investigated the disparities between the locations of sound sources as perceived by users and the locations generated by the platform. Our goal was to determine whether users can be trained to adapt to the platform-generated sound sources. We used the Microsoft HoloLens 2 virtual platform and collected data from 12 subjects across six separate training sessions over 2 weeks. We employed three modes of training to assess their effects on sound localization, in particular to study the impact of visual and sound error guidance, combined with kinesthetic/postural guidance, on the effectiveness of the training. We analyzed the collected data in terms of the training effect between pre- and post-sessions, as well as the retention effect between two separate sessions, using subject-wise paired statistics. Our findings indicate that the training effect between pre- and post-sessions was statistically significant, in particular when kinesthetic/postural guidance was combined with visual and sound guidance. Conversely, visual error guidance alone was largely ineffective. Regarding the retention effect between two separate sessions, we found no statistically significant retention for any of the three error-guidance modes over the 2-week training period. These findings can contribute to the improvement of VR technologies by ensuring they are designed to optimize human sound localization abilities. Full article
(This article belongs to the Special Issue Feature Papers in Intelligent Sensors 2024)

18 pages, 11603 KiB  
Article
Comparative Analysis of HRTFs Measurement Using In-Ear Microphones
by Valeria Bruschi, Alessandro Terenzi, Nefeli A. Dourou, Susanna Spinsante and Stefania Cecchi
Sensors 2023, 23(13), 6016; https://doi.org/10.3390/s23136016 - 29 Jun 2023
Viewed by 2367
Abstract
The head-related transfer functions (HRTFs) describe the acoustic path transfer functions between sound sources in the free-field and the listener’s ear canal. They enable the evaluation of the sound perception of a human being and the creation of immersive virtual acoustic environments that can be reproduced over headphones or loudspeakers. HRTFs are strongly individual and they can be measured by in-ear microphones worn by real subjects. However, standardized HRTFs can also be measured using artificial head simulators which standardize the body dimensions. In this paper, a comparative analysis of HRTF measurement using in-ear microphones is presented. The results obtained with in-ear microphones are compared with the HRTFs measured with a standard head and torso simulator, investigating different positions of the microphones and of the sound source and employing two different types of microphones. Finally, the HRTFs of five real subjects are measured and compared with the ones measured by the microphones in the ear of a standard mannequin. Full article
(This article belongs to the Collection Advanced Techniques for Acquisition and Sensing)

12 pages, 1782 KiB  
Article
The Accuracy of Dynamic Sound Source Localization and Recognition Ability of Individual Head-Related Transfer Functions in Binaural Audio Systems with Head Tracking
by Vedran Planinec, Jonas Reijniers, Marko Horvat, Herbert Peremans and Kristian Jambrošić
Appl. Sci. 2023, 13(9), 5254; https://doi.org/10.3390/app13095254 - 23 Apr 2023
Cited by 3 | Viewed by 2990
Abstract
The use of audio systems that employ binaural synthesis with head tracking has become increasingly popular, particularly in virtual reality gaming systems. The binaural synthesis process uses the Head-Related Transfer Functions (HRTF) as an input required to assign the directions of arrival to sounds coming from virtual sound sources in the created virtual environments. Generic HRTFs are often used for this purpose to accommodate all potential listeners. The hypothesis of the research is that the use of individual HRTF in binaural synthesis instead of generic HRTF leads to improved accuracy and quality of virtual sound source localization, thus enhancing the user experience. A novel methodology is proposed that involves the use of dynamic virtual sound sources. In the experiments, the test participants were asked to determine the direction of a dynamic virtual sound source in both the horizontal and vertical planes using both generic and individual HRTFs. The gathered data are statistically analyzed, and the accuracy of localization is assessed with respect to the type of HRTF used. The individual HRTFs of the test participants are measured using a novel and efficient method that is accessible to a broad range of users. Full article
(This article belongs to the Section Acoustics and Vibrations)

14 pages, 3864 KiB  
Article
Prediction of Head Related Transfer Functions Using Machine Learning Approaches
by Roberto Fernandez Martinez, Pello Jimbert, Eric Michael Sumner, Morris Riedel and Runar Unnthorsson
Acoustics 2023, 5(1), 254-267; https://doi.org/10.3390/acoustics5010015 - 1 Mar 2023
Cited by 2 | Viewed by 4478
Abstract
The generation of a virtual, personal, auditory space to obtain a high-quality sound experience when using headphones is of great significance. Normally, this experience is improved using personalized head-related transfer functions (HRTFs), which depend to a large degree on personal anthropometric information about the pinnae. Most studies focus their personal auditory optimization analysis on amplitude versus frequency in HRTFs, mainly searching for significant elevation cues in frequency maps. Knowing the HRTFs of each individual is therefore of considerable help in improving sound quality. The following work proposes a methodology to model HRTFs according to the individual structure of the pinnae using multilayer perceptron and linear regression techniques. Several models are generated that predict the HRTF amplitude at each frequency from the personal anthropometric data on the pinnae, the azimuth angle, and the elevation of the sound source. Experiments show that the prediction of new personal HRTFs yields low errors, so the model can be applied with high confidence to new heads with different pinna characteristics, improving on the results obtained with the standard KEMAR pinna, which is usually used when individual data are lacking. Full article
(This article belongs to the Collection Featured Position and Review Papers in Acoustics Science)

17 pages, 3421 KiB  
Article
HRTFs Measurement Based on Periodic Sequences Robust towards Nonlinearities in Automotive Audio
by Stefania Cecchi, Valeria Bruschi, Stefano Nobili, Alessandro Terenzi and Alberto Carini
Sensors 2023, 23(3), 1692; https://doi.org/10.3390/s23031692 - 3 Feb 2023
Cited by 1 | Viewed by 1865
Abstract
The head related transfer functions (HRTFs) represent the acoustic path transfer functions between sound sources in 3D space and the listener’s ear. They are used to create immersive audio scenarios or to subjectively evaluate sound systems according to a human-centric point of view. Cars are nowadays the most popular audio listening environment and the use of HRTFs in automotive audio has recently attracted the attention of researchers. In this context, the paper proposes a measurement method for HRTFs based on perfect or orthogonal periodic sequences. The proposed measurement method ensures robustness towards the nonlinearities that may affect the measurement system. The experimental results considering both an emulated scenario and real measurements in a controlled environment illustrate the effectiveness of the approach and compare the proposed method with other popular approaches. Full article

13 pages, 1579 KiB  
Article
Ear Centering for Accurate Synthesis of Near-Field Head-Related Transfer Functions
by Ayrton Urviola, Shuichi Sakamoto and César D. Salvador
Appl. Sci. 2022, 12(16), 8290; https://doi.org/10.3390/app12168290 - 19 Aug 2022
Cited by 2 | Viewed by 2611
Abstract
The head-related transfer function (HRTF) is a major tool in spatial sound technology. The HRTF for a point source is defined as the ratio between the sound pressure at the ear position and the free-field sound pressure at a reference position. The reference is typically placed at the center of the listener’s head. When using the spherical Fourier transform (SFT) and distance-varying filters (DVF) to synthesize HRTFs for point sources very close to the head, the spherical symmetry of the model around the head center does not allow for distinguishing between the ear position and the head center. Ear centering is a technique that overcomes this source of inaccuracy by translating the reference position. Hitherto, plane-wave (PW) translation operators have yielded effective ear centering when synthesizing far-field HRTFs. We propose spherical-wave (SW) translation operators for the ear centering required in the accurate synthesis of near-field HRTFs. We contrasted the performance of PW and SW ear centering. The synthesis errors decreased consistently when applying SW ear centering, and the enhancement was observed up to the maximum frequency determined by the spherical grid. Full article
(This article belongs to the Special Issue Immersive 3D Audio: From Architecture to Automotive)
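The ratio definition in the abstract above can be written out explicitly. This is only a restatement of the standard free-field HRTF definition; the symbol names are chosen here for illustration:

```latex
% Standard free-field HRTF definition (symbols chosen for illustration):
%   p_ear : sound pressure at the listener's ear position
%   p_ref : free-field sound pressure at the reference (head centre)
%   f     : frequency;  (\theta, \phi) : source direction;  r : source distance
H(f, \theta, \phi, r) = \frac{p_{\mathrm{ear}}(f, \theta, \phi, r)}{p_{\mathrm{ref}}(f)}
```

The ear-centering techniques compared in the paper amount to moving the reference position of the denominator from the head centre toward the ear.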

15 pages, 3061 KiB  
Article
Magnitude Modeling of Personalized HRTF Based on Ear Images and Anthropometric Measurements
by Manlin Zhao, Zhichao Sheng and Yong Fang
Appl. Sci. 2022, 12(16), 8155; https://doi.org/10.3390/app12168155 - 15 Aug 2022
Cited by 7 | Viewed by 3206
Abstract
In this paper, we propose a global personalized head-related transfer function (HRTF) method based on anthropometric measurements and ear images. The model consists of two sub-networks. The first is the VGG-Ear Model, which extracts features from the ear images. The second sub-network uses anthropometric measurements, ear features, and frequency information to predict the spherical harmonic (SH) coefficients. Finally, the personalized HRTF is obtained through inverse spherical harmonic transform (SHT) reconstruction. With a single training run, the HRTFs in all directions can be obtained, which greatly reduces the parameters and training cost of the model. To objectively evaluate the proposed method, we calculate the spectral distance (SD) between the predicted HRTF and the actual HRTF. The results show that the proposed method achieves an SD of 5.31 dB, better than the 7.61 dB of the average HRTF. In particular, the SD value increases by only 0.09 dB compared to directly using the pinna measurements. Full article
(This article belongs to the Section Acoustics and Vibrations)
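The spectral distance used for evaluation above is, in its common form, the RMS log-magnitude difference between two transfer functions in dB. A minimal sketch of that common definition follows; the paper may use a variant (e.g. a restricted frequency range or an average over directions), and the function name and toy spectra here are illustrative:

```python
import numpy as np

def spectral_distance(h_pred, h_true, eps=1e-12):
    """RMS log-magnitude difference between two HRTF spectra, in dB.

    Common 'spectral distortion' definition; eps guards against log(0).
    """
    diff_db = 20 * np.log10((np.abs(h_pred) + eps) / (np.abs(h_true) + eps))
    return float(np.sqrt(np.mean(diff_db ** 2)))

# Toy check: a spectrum boosted uniformly by 6 dB sits at SD = 6 dB.
h_true = np.ones(128)
h_pred = np.full(128, 10 ** (6 / 20))
print(f"SD = {spectral_distance(h_pred, h_true):.2f} dB")
```

Under this definition, the reported 5.31 dB vs. 7.61 dB gap is a direct average-log-magnitude improvement over using a population-mean HRTF.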

23 pages, 20456 KiB  
Article
Spatial Audio Scene Characterization (SASC): Automatic Localization of Front-, Back-, Up-, and Down-Positioned Music Ensembles in Binaural Recordings
by Sławomir K. Zieliński, Paweł Antoniuk and Hyunkook Lee
Appl. Sci. 2022, 12(3), 1569; https://doi.org/10.3390/app12031569 - 1 Feb 2022
Cited by 3 | Viewed by 2343
Abstract
The automatic localization of audio sources distributed symmetrically with respect to coronal or transverse planes using binaural signals still poses a challenging task, due to the front–back and up–down confusion effects. This paper demonstrates that the convolutional neural network (CNN) can be used to automatically localize music ensembles panned to the front, back, up, or down positions. The network was developed using the repository of the binaural excerpts obtained by the convolution of multi-track music recordings with the selected sets of head-related transfer functions (HRTFs). They were generated in such a way that a music ensemble (of circular shape in terms of its boundaries) was positioned in one of the following four locations with respect to the listener: front, back, up, and down. According to the obtained results, CNN identified the location of the ensembles with the average accuracy levels of 90.7% and 71.4% when tested under the HRTF-dependent and HRTF-independent conditions, respectively. For HRTF-dependent tests, the accuracy decreased monotonically with the increase in the ensemble size. A modified image occlusion sensitivity technique revealed selected frequency bands as being particularly important in terms of the localization process. These frequency bands are largely in accordance with the psychoacoustical literature. Full article

17 pages, 11946 KiB  
Article
Towards Child-Appropriate Virtual Acoustic Environments: A Database of High-Resolution HRTF Measurements and 3D-Scans of Children
by Hark Simon Braren and Janina Fels
Int. J. Environ. Res. Public Health 2022, 19(1), 324; https://doi.org/10.3390/ijerph19010324 - 29 Dec 2021
Cited by 9 | Viewed by 4161
Abstract
Head-related transfer functions (HRTFs) play a significant role in modern acoustic experiment designs for the auralization of 3-dimensional virtual acoustic environments. This technique enables us to create close-to-real-life situations, including room-acoustic effects, background noise, and multiple sources, in a controlled laboratory environment. While adult HRTF databases are widely available to the research community, datasets of children are not. To fill this gap, children aged 5–10 years were recruited among 1st and 2nd year primary school children in Aachen, Germany. Their HRTFs were measured in the hemi-anechoic chamber with a 5-degree × 5-degree resolution. Special care was taken to reduce artifacts from motion during the measurements by means of fast measurement routines. To complement the HRTF measurements with the anthropometric data needed for individualization methods, a high-resolution 3D-scan of the head and upper torso of each participant was recorded. The HRTF measurement took around 3 min. The children’s head movement during that time was larger than that of adult participants in comparable experiments but was generally kept within 5 degrees of rotary and 1 cm of translatory motion; adult participants only exhibit this range of motion in longer-duration measurements. A comparison of the HRTF measurements with the KEMAR artificial head shows that it is not representative of an average child HRTF. Differences can be seen both in the spectrum and in the interaural time delay (ITD), with ITD differences of 70 μs on average and a maximum difference of 138 μs. For both the spectrum and the ITD, the KEMAR more closely resembles the 95th percentile of the range of the children’s data. This warrants a closer look at using child-specific HRTFs in the binaural presentation of virtual acoustic environments in the future. Full article
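The ITD differences reported above can be related to head size with Woodworth's classic spherical-head approximation. This is a generic textbook formula, not the database's measurement method, and the head radii below are illustrative assumptions:

```python
import math

def woodworth_itd(azimuth_rad, head_radius_m=0.0875, c=343.0):
    """Woodworth's spherical-head ITD approximation, in seconds.

    Illustrative only: the database above measures ITDs from HRTF data;
    this formula just shows how head radius scales the ITD.
    head_radius_m defaults to a typical adult value.
    """
    return (head_radius_m / c) * (azimuth_rad + math.sin(azimuth_rad))

# Smaller (child-sized) head radius -> shorter ITD, the same trend as the
# child-vs-KEMAR differences reported above (radii here are assumptions).
adult = woodworth_itd(math.pi / 2)                       # source at 90 degrees
child = woodworth_itd(math.pi / 2, head_radius_m=0.07)
print(f"adult - child ITD: {(adult - child) * 1e6:.0f} us")
```

A difference on the order of 100 μs for a roughly 1.5 cm radius change shows why an adult-sized mannequin can misrepresent children's interaural cues.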

21 pages, 752 KiB  
Article
Dynamic Binaural Rendering: The Advantage of Virtual Artificial Heads over Conventional Ones for Localization with Speech Signals
by Mina Fallahi, Martin Hansen, Simon Doclo, Steven van de Par, Dirk Püschel and Matthias Blau
Appl. Sci. 2021, 11(15), 6793; https://doi.org/10.3390/app11156793 - 23 Jul 2021
Cited by 3 | Viewed by 2860
Abstract
As an alternative to conventional artificial heads, a virtual artificial head (VAH), i.e., a microphone array-based filter-and-sum beamformer, can be used to create binaural renderings of spatial sound fields. In contrast to conventional artificial heads, a VAH enables one to individualize the binaural renderings and to incorporate head tracking. This can be achieved by applying complex-valued spectral weights—calculated using individual head related transfer functions (HRTFs) for each listener and for different head orientations—to the microphone signals of the VAH. In this study, these spectral weights were applied to measured room impulse responses in an anechoic room to synthesize individual binaural room impulse responses (BRIRs). In the first part of the paper, the results of localizing virtual sources generated with individually synthesized BRIRs and measured BRIRs using a conventional artificial head, for different head orientations, were assessed in comparison with real sources. Convincing localization performances could be achieved for virtual sources generated with both individually synthesized and measured non-individual BRIRs with respect to azimuth and externalization. In the second part of the paper, the results of localizing virtual sources were compared in two listening tests, with and without head tracking. The positive effect of head tracking on the virtual source localization performance confirmed a major advantage of the VAH over conventional artificial heads. Full article
(This article belongs to the Special Issue Psychoacoustics for Extended Reality (XR))

16 pages, 478 KiB  
Article
Head-Related Transfer Functions for Dynamic Listeners in Virtual Reality
by Olli S. Rummukainen, Thomas Robotham and Emanuël A. P. Habets
Appl. Sci. 2021, 11(14), 6646; https://doi.org/10.3390/app11146646 - 20 Jul 2021
Cited by 9 | Viewed by 4832
Abstract
In dynamic virtual reality, visual cues and motor actions aid auditory perception. With multimodal integration and auditory adaptation effects, generic head-related transfer functions (HRTFs) may yield no significant disadvantage to individual HRTFs regarding accurate auditory perception. This study compares two individual HRTF sets against a generic HRTF set by way of objective analysis and two subjective experiments. First, auditory-model-based predictions examine the objective deviations in localization cues between the sets. Next, the HRTFs are compared in a static subjective (N=8) localization experiment. Finally, the localization accuracy, timbre, and overall quality of the HRTF sets are evaluated subjectively (N=12) in a six-degrees-of-freedom audio-visual virtual environment. The results show statistically significant objective deviations between the sets, but no perceived localization or overall quality differences in the dynamic virtual reality. Full article
(This article belongs to the Special Issue Psychoacoustics for Extended Reality (XR))

19 pages, 6428 KiB  
Article
Bone Conduction Auditory Navigation Device for Blind People
by Takumi Asakura
Appl. Sci. 2021, 11(8), 3356; https://doi.org/10.3390/app11083356 - 8 Apr 2021
Cited by 8 | Viewed by 3936
Abstract
A navigation system using a binaural bone-conducted sound is proposed. This system has three features to accurately navigate the user to the destination point. First, the selection of the bone-conduction device and the optimal contact conditions between the device and the human head are discussed. Second, the basic performance of sound localization reproduced by the selected bone-conduction device with binaural sounds is confirmed considering the head-related transfer functions (HRTFs) obtained in the air-borne sound field. Here, a panned sound technique that may emphasize the localization of the sound is also validated. Third, to ensure the safety of the navigating person, which is the most important factor in the navigation of a visually impaired person by voice guidance, an appropriate warning sound reproduced by the bone-conduction device is investigated. Finally, based on the abovementioned conditions, we conduct an auditory navigation experiment using bone-conducted guide announcement. The time required to reach the destination of the navigation route is shorter in the case with voice information including the binaural sound reproduction, as compared to the case with only voice information. Therefore, a navigation system using binaural bone-conducted sound is confirmed to be effective. Full article

24 pages, 2773 KiB  
Article
Low-Order Spherical Harmonic HRTF Restoration Using a Neural Network Approach
by Benjamin Tsui, William A. P. Smith and Gavin Kearney
Appl. Sci. 2020, 10(17), 5764; https://doi.org/10.3390/app10175764 - 20 Aug 2020
Cited by 6 | Viewed by 3793
Abstract
Spherical harmonic (SH) interpolation is a commonly used method to spatially up-sample sparse head related transfer function (HRTF) datasets to denser HRTF datasets. However, depending on the number of sparse HRTF measurements and SH order, this process can introduce distortions into high frequency representations of the HRTFs. This paper investigates whether it is possible to restore some of the distorted high frequency HRTF components using machine learning algorithms. A combination of convolutional auto-encoder (CAE) and denoising auto-encoder (DAE) models is proposed to restore the high frequency distortion in SH-interpolated HRTFs. Results were evaluated using both perceptual spectral difference (PSD) and localisation prediction models, both of which demonstrated significant improvement after the restoration process. Full article
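The spatial up-sampling step the abstract describes, fitting SH coefficients to a sparse set of directions and then re-synthesising on a denser grid, can be sketched as a least-squares fit against a spherical-harmonic basis. A minimal order-1 illustration with a toy single-frequency "HRTF magnitude" in place of measured data (the grid sizes, order, and toy function are assumptions, not the paper's setup):

```python
import numpy as np

def sh_basis_order1(az, col):
    """Real spherical-harmonic basis up to order 1 (unnormalised:
    constant term plus the three direction cosines)."""
    x = np.sin(col) * np.cos(az)
    y = np.sin(col) * np.sin(az)
    z = np.cos(col)
    return np.stack([np.ones_like(az), x, y, z], axis=1)

rng = np.random.default_rng(0)
az = rng.uniform(0, 2 * np.pi, 32)        # sparse measurement azimuths
col = np.arccos(rng.uniform(-1, 1, 32))   # sparse measurement colatitudes

# Toy magnitude at one frequency: lies exactly in the order-1 subspace,
# so the fit recovers it; real HRTF sets need much higher orders.
h = 1.0 + 0.5 * np.cos(col)

Y = sh_basis_order1(az, col)
coeffs, *_ = np.linalg.lstsq(Y, h, rcond=None)   # SH analysis (least squares)

# SH synthesis on a denser grid = the interpolation / up-sampling step.
az_d = np.linspace(0, 2 * np.pi, 360, endpoint=False)
col_d = np.full_like(az_d, np.pi / 3)
h_dense = sh_basis_order1(az_d, col_d) @ coeffs
```

In practice the order must match the spatial bandwidth of the HRTF set; truncating it produces exactly the high-frequency distortion that the paper's auto-encoder models aim to restore.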
