Psychoacoustics for Extended Reality (XR)

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (31 January 2022) | Viewed by 29786

Special Issue Editor

Prof. Dr. Hyunkook Lee
Guest Editor
Applied Psychoacoustics Lab, University of Huddersfield, Huddersfield HD1 3DH, UK
Interests: psychoacoustics; spatial audio; 3D audio; virtual acoustics; extended reality

Special Issue Information

Dear Colleagues,

Extended reality (XR), which embraces the concepts of virtual reality, augmented reality and mixed reality, is a rapidly growing area of research and development. XR technologies are now being adopted in many sectors of industry (music and film entertainment, medical training, military training, architectural simulation, virtual tourism, virtual education, etc.). XR ultimately aims to provide the user with realistic, engaging and interactive virtual experiences with three degrees of freedom (3DOF) or six degrees of freedom (6DOF), and for this it is important to achieve high-quality dynamic rendering of audio as well as visual information. Traditional psychoacoustics research has focused mainly on the investigation of specific auditory cues in controlled listening environments. However, to provide the user with a more plausible multimodal sensory experience in XR, psychoacoustics research needs to evolve and provide more ecologically valid experimental data and theories about how human auditory perception works in various practical XR scenarios. Against this background, this Special Issue aims to introduce recent developments in psychoacoustics-based research focusing on XR and to provide insights into future directions of research and development in this field. The issue aims to collect more than 10 papers and will be published as a book collection.

Research topics of interest include, but are not limited to, the following:

  • Dynamic sound localisation
  • Auditory spatial perception
  • Binaural processing with head-tracking and/or motion-tracking
  • Auditory-visual interaction/multimodal perception
  • Rendering and perception of virtual acoustics
  • Sound recording and mixing techniques
  • Sound synthesis and design
  • Interactive and immersive storytelling
  • Hearing aids
  • Assistive listening
  • Auditory(–visual) simulation and training

Prof. Dr. Hyunkook Lee
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • psychoacoustics
  • extended reality
  • virtual reality
  • augmented reality
  • mixed reality
  • multimodal perception
  • immersion
  • spatial audio
  • virtual acoustics
  • auditory–visual interaction

Published Papers (11 papers)


Research

24 pages, 1918 KiB  
Article
Impact Thresholds of Parameters of Binaural Room Impulse Responses (BRIRs) on Perceptual Reverberation
by Huan Mi, Gavin Kearney and Helena Daffern
Appl. Sci. 2022, 12(6), 2823; https://doi.org/10.3390/app12062823 - 9 Mar 2022
Cited by 5 | Viewed by 2450
Abstract
This paper presents a study on the perceived importance of different acoustic parameters of Binaural Room Impulse Response (BRIR) rendering. A headphone-based listening test was conducted with twenty expert participants. Three BRIRs generated from simulations of three different rooms were convolved with a dry speech signal and used as reference audio samples. Four BRIR parameters, Initial Time Delay Gap (ITDG), Forward Early Reflections (FER), Reverse Early Reflections (RER) and Late Reverberation (LR), were systematically altered and convolved with a speech signal to generate the test conditions. A staircase method was used to obtain the threshold at which each BRIR parameter was perceived as different from the reference audio sample. The average perceived impact threshold of each parameter was then calculated across the twenty participants. Results show that RER removal and ITDG extension have a clear impact on the perceptual reverberation of speech audio. Subjects were less sensitive to FER removal. The effect of LR removal on perceptual reverberation is hard to distinguish. Therefore, RER and ITDG are of particular importance when designing artificial reverberation algorithms, whilst more research is needed to understand the perceptual contribution of LR. Minor changes in FER and LR are less significant.
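
To make the manipulation concrete, here is a minimal Python sketch (not the authors' implementation) of splitting a BRIR into direct sound, early reflections and late reverberation, and rendering a test condition with one segment removed. The 2.5 ms and 80 ms boundaries are illustrative assumptions, not the study's values.

```python
# Sketch: segment a BRIR and render a degraded condition by zeroing a segment.
import numpy as np
from scipy.signal import fftconvolve

def split_brir(brir, fs, direct_ms=2.5, early_ms=80.0):
    """Split one BRIR channel at assumed direct/early and early/late boundaries."""
    onset = np.argmax(np.abs(brir))              # crude direct-sound onset
    d_end = onset + int(direct_ms * 1e-3 * fs)   # end of direct sound
    e_end = onset + int(early_ms * 1e-3 * fs)    # end of early reflections
    return brir[:d_end], brir[d_end:e_end], brir[e_end:]

def render(dry, brir_lr, fs, remove="late"):
    """Convolve dry speech with a 2-channel BRIR, optionally zeroing a segment."""
    out = []
    for ch in brir_lr:                           # left, right
        direct, early, late = split_brir(ch, fs)
        if remove == "early":
            early = np.zeros_like(early)
        elif remove == "late":
            late = np.zeros_like(late)
        out.append(fftconvolve(dry, np.concatenate([direct, early, late])))
    return np.stack(out)                         # shape: (2, N)
```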

10 pages, 2277 KiB  
Communication
Investigating the Potential Use of EEG for the Objective Measurement of Auditory Presence
by Shufeng Zhang, Xuelei Feng and Yong Shen
Appl. Sci. 2022, 12(5), 2647; https://doi.org/10.3390/app12052647 - 4 Mar 2022
Cited by 2 | Viewed by 1264
Abstract
Presence is the sense of being in a virtual environment when physically situated in another place. It is one of the key components of the overall virtual reality (VR) experience, as well as of other immersive audio applications. However, there is no standardized method for measuring presence. In our previous study, we explored the possibility of using electroencephalography (EEG) to measure presence by using questionnaires as a reference. It was found that an increase in the subjective presence level was correlated with an increase in the theta/beta ratio (an index derived from EEG). In the present study, we re-analyzed the original data and found that the peak alpha frequency (PAF), another EEG index, may also have the potential to reflect the change in the subjective presence level. Specifically, an increase in the subjective presence level was found to be correlated with a decrease in PAF. Together with our previous study, these results indicate the potential use of EEG for the objective measurement of presence in the future.
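
For readers unfamiliar with the two indices, the following Python sketch shows how a theta/beta ratio and the peak alpha frequency can be computed from a single EEG channel. The band limits are the conventional ones and the spectral-estimation settings are assumptions, not necessarily those of the study.

```python
# Sketch: two EEG indices from a power spectral density estimate.
import numpy as np
from scipy.signal import welch

def eeg_indices(x, fs):
    f, psd = welch(x, fs=fs, nperseg=int(4 * fs))  # 4-s windows, ~0.25 Hz bins
    def band_power(lo, hi):
        m = (f >= lo) & (f < hi)
        return np.trapz(psd[m], f[m])
    theta_beta = band_power(4, 8) / band_power(13, 30)  # theta/beta ratio
    alpha = (f >= 8) & (f <= 13)
    paf = f[alpha][np.argmax(psd[alpha])]               # peak alpha frequency
    return theta_beta, paf
```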

15 pages, 3897 KiB  
Article
Predicting the Colouration between Binaural Signals
by Thomas McKenzie, Cal Armstrong, Lauren Ward, Damian T. Murphy and Gavin Kearney
Appl. Sci. 2022, 12(5), 2441; https://doi.org/10.3390/app12052441 - 26 Feb 2022
Cited by 9 | Viewed by 2053
Abstract
Although the difference between the fast Fourier transforms of two audio signals is often used as a basic measure of predicting perceived colouration, these signal measures do not provide information on how relevant the results are from a perceptual point of view. This paper presents a perceptually motivated loudness calculation for predicting the colouration between binaural signals which incorporates equal loudness frequency contouring, relative subjective loudness weighting, cochlea frequency modelling, and an iterative normalisation of input signals. The validation compares the presented model to three other colouration calculations in two ways: using test signals designed to evaluate specific elements of the model, and against the results of a listening test on degraded binaural audio signals. Results demonstrate the presented model is appropriate for predicting the colouration between binaural signals.
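
As a point of reference, here is a deliberately simplified Python sketch of a perceptually weighted spectral difference. The published model goes much further (equal-loudness contouring, cochlear modelling, iterative normalisation); this version only averages magnitude differences in ERB-spaced bands, with band count and edges as assumptions.

```python
# Sketch: crude colouration score as a band-averaged spectral difference.
import numpy as np

def erb_centres(fmin=50.0, fmax=16000.0, n=42):
    """Assumed ERB-rate-spaced centre frequencies (Glasberg & Moore scale)."""
    erb = np.linspace(21.4 * np.log10(4.37e-3 * fmin + 1),
                      21.4 * np.log10(4.37e-3 * fmax + 1), n)
    return (10 ** (erb / 21.4) - 1) / 4.37e-3

def colouration(sig_a, sig_b, fs):
    """Mean absolute dB difference between two equal-length signals."""
    f = np.fft.rfftfreq(len(sig_a), 1 / fs)
    mag_a, mag_b = np.abs(np.fft.rfft(sig_a)), np.abs(np.fft.rfft(sig_b))
    diffs = []
    for fc in erb_centres():
        band = (f >= fc / 2 ** 0.25) & (f < fc * 2 ** 0.25)  # ~half-octave band
        if band.any():
            diffs.append(20 * np.log10(mag_a[band].mean() / mag_b[band].mean()))
    return np.mean(np.abs(diffs))                # single score in dB
```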

19 pages, 1186 KiB  
Article
Effect of Environment-Related Cues on Auditory Distance Perception in the Context of Audio-Only Augmented Reality
by Vincent Martin, Isabelle Viaud-Delmon and Olivier Warusfel
Appl. Sci. 2022, 12(1), 348; https://doi.org/10.3390/app12010348 - 30 Dec 2021
Cited by 3 | Viewed by 1708
Abstract
Audio-only augmented reality consists of enhancing a real environment with virtual sound events. A seamless integration of the virtual events within the environment requires processing them with artificial spatialization and reverberation effects that simulate the acoustic properties of the room. However, in augmented reality, the visual and acoustic environment of the listener may not be fully mastered. This study aims to gain some insight into the acoustic cues (intensity and reverberation) that listeners use to form an auditory distance judgment, and to observe whether these strategies can be influenced by the listener's environment. To do so, we present a perceptual evaluation of two distance-rendering models informed by a measured Spatial Room Impulse Response. The rendering methods were chosen to design stimulus categories in which the availability and reproduction quality of acoustic cues differ. The proposed models were evaluated in an online experiment gathering 108 participants, who were asked to provide auditory distance judgments about a stationary source. To evaluate the importance of environmental cues, participants had to describe the environment in which they were running the experiment, specifically the volume of the room and the distance to the wall they were facing. These context cues were shown to have a limited but significant influence on the perceived auditory distance.
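
The two acoustic cues named above can be quantified directly from a room impulse response, as in this short Python sketch: overall energy for the intensity cue, and the direct-to-reverberant ratio (DRR) for the reverberation cue. The 2.5 ms direct-sound window is a common but assumed convention.

```python
# Sketch: intensity and DRR distance cues from one RIR channel.
import numpy as np

def intensity_and_drr(rir, fs, direct_ms=2.5):
    onset = np.argmax(np.abs(rir))                  # direct-sound arrival
    split = onset + int(direct_ms * 1e-3 * fs)      # end of direct window
    direct_energy = np.sum(rir[:split] ** 2)
    reverb_energy = np.sum(rir[split:] ** 2)
    level_db = 10 * np.log10(np.sum(rir ** 2))      # overall intensity cue
    drr_db = 10 * np.log10(direct_energy / reverb_energy)  # reverberation cue
    return level_db, drr_db
```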

21 pages, 3984 KiB  
Article
Prediction and Controlling of Auditory Perception in Augmented Environments. A Loudness-Based Dynamic Mixing Technique
by Nikolaos Moustakas, Andreas Floros, Emmanouel Rovithis and Konstantinos Vogklis
Appl. Sci. 2021, 11(22), 10944; https://doi.org/10.3390/app112210944 - 19 Nov 2021
Viewed by 1419
Abstract
At the core of augmented reality audio (ARA) technology lies the ARA mix, a process responsible for the assignment of a virtual environment to a real one. Legacy ARA mix models have focused on the natural reproduction of the real environment, whereas the virtual environment is simply mixed through fixed gain methods. This study presents a novel approach of a dynamic ARA mix that facilitates a smooth adaptation of the virtual environment to the real one, as well as dynamic control of the virtual audio engine, by taking into account the inherent characteristics of both ARA technology and binaural auditory perception. A prototype feature extraction technique of auditory perception characteristics through a real-time binaural loudness prediction method was used to upgrade the legacy ARA mix model into a dynamic model, which was evaluated through benchmarks and subjective tests and showed encouraging results in terms of functionality and acceptance.
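
A hedged Python sketch of the dynamic-mix idea follows: instead of a fixed gain, the virtual layer's gain tracks a running level estimate of the real environment. The simple RMS estimate and smoothing constant here are placeholders for the paper's binaural loudness prediction, and the parameter names are hypothetical.

```python
# Sketch: dynamic ARA mix with a level-tracked virtual-layer gain.
import numpy as np

class DynamicAraMix:
    def __init__(self, target_offset_db=-3.0, alpha=0.1):
        self.offset = 10 ** (target_offset_db / 20)  # virtual level vs. real
        self.alpha = alpha                           # smoothing coefficient
        self.env_rms = 1e-6                          # running level estimate

    def process(self, real_block, virtual_block):
        rms = np.sqrt(np.mean(real_block ** 2)) + 1e-12
        self.env_rms = (1 - self.alpha) * self.env_rms + self.alpha * rms
        v_rms = np.sqrt(np.mean(virtual_block ** 2)) + 1e-12
        gain = self.env_rms * self.offset / v_rms    # adapt virtual gain
        return real_block + gain * virtual_block     # hear-through-style mix
```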

11 pages, 5481 KiB  
Article
The Influence of Binaural Room Impulse Responses on Externalization in Virtual Reality Scenarios
by Song Li, Roman Schlieper, Aly Tobbala and Jürgen Peissig
Appl. Sci. 2021, 11(21), 10198; https://doi.org/10.3390/app112110198 - 30 Oct 2021
Cited by 2 | Viewed by 1811
Abstract
A headphone-based virtual sound image cannot be perceived as perfectly externalized if the acoustics of the synthesized room do not match those of the real listening environment. This effect has been well explored and is known as the room divergence effect (RDE). The RDE is important for perceived externalization of virtual sounds if listeners are aware of the room-related auditory information provided by the listening environment. In the case of virtual reality (VR) applications, users get a visual impression of the virtual room, but may not be aware of the auditory information of this room. It is unknown whether the acoustic congruence between the synthesized (binaurally rendered) room and the visual-only virtual listening environment is important for externalization. VR-based psychoacoustic experiments were performed, and the results reveal that perceived externalization of virtual sounds depends on listeners' expectations of the acoustics of the visual-only virtual room. The virtual sound images can be perceived as externalized, although there is an acoustic divergence between the binaurally synthesized room and the visual-only virtual listening environment. However, the “correct” room information in binaural sounds may lead to degraded externalization if the acoustic properties of the room do not match listeners' expectations.

21 pages, 752 KiB  
Article
Dynamic Binaural Rendering: The Advantage of Virtual Artificial Heads over Conventional Ones for Localization with Speech Signals
by Mina Fallahi, Martin Hansen, Simon Doclo, Steven van de Par, Dirk Püschel and Matthias Blau
Appl. Sci. 2021, 11(15), 6793; https://doi.org/10.3390/app11156793 - 23 Jul 2021
Cited by 1 | Viewed by 1966
Abstract
As an alternative to conventional artificial heads, a virtual artificial head (VAH), i.e., a microphone-array-based filter-and-sum beamformer, can be used to create binaural renderings of spatial sound fields. In contrast to conventional artificial heads, a VAH enables one to individualize the binaural renderings and to incorporate head tracking. This can be achieved by applying complex-valued spectral weights—calculated using individual head-related transfer functions (HRTFs) for each listener and for different head orientations—to the microphone signals of the VAH. In this study, these spectral weights were applied to room impulse responses measured in an anechoic room to synthesize individual binaural room impulse responses (BRIRs). In the first part of the paper, localization of virtual sources generated with individually synthesized BRIRs and with BRIRs measured using a conventional artificial head was assessed against real sources, for different head orientations. Convincing localization performance with respect to azimuth and externalization was achieved for virtual sources generated with both the individually synthesized and the measured non-individual BRIRs. In the second part of the paper, virtual source localization was compared in two listening tests, with and without head tracking. The positive effect of head tracking on localization performance confirmed a major advantage of the VAH over conventional artificial heads.
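
A minimal Python sketch of the core computation follows: per frequency bin, complex filter-and-sum weights are chosen by least squares so that the beamformer output matches the listener's HRTF over a set of directions. Any regularisation or robustness constraints used in the paper are omitted, and the array geometry is abstracted into an assumed steering matrix.

```python
# Sketch: least-squares VAH spectral weights, one ear.
import numpy as np

def vah_weights(steering, hrtf):
    """
    steering: (n_freq, n_dirs, n_mics) complex transfer functions to each mic
    hrtf:     (n_freq, n_dirs) desired individual HRTF for one ear
    returns:  (n_freq, n_mics) complex filter-and-sum weights
    """
    n_freq, _, n_mics = steering.shape
    w = np.empty((n_freq, n_mics), dtype=complex)
    for k in range(n_freq):
        # solve min_w || steering[k] @ w - hrtf[k] ||^2 at this frequency
        w[k], *_ = np.linalg.lstsq(steering[k], hrtf[k], rcond=None)
    return w
```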

16 pages, 478 KiB  
Article
Head-Related Transfer Functions for Dynamic Listeners in Virtual Reality
by Olli S. Rummukainen, Thomas Robotham and Emanuël A. P. Habets
Appl. Sci. 2021, 11(14), 6646; https://doi.org/10.3390/app11146646 - 20 Jul 2021
Cited by 5 | Viewed by 3655
Abstract
In dynamic virtual reality, visual cues and motor actions aid auditory perception. With multimodal integration and auditory adaptation effects, generic head-related transfer functions (HRTFs) may present no significant disadvantage relative to individual HRTFs regarding accurate auditory perception. This study compares two individual HRTF sets against a generic HRTF set by way of objective analysis and two subjective experiments. First, auditory-model-based predictions examine the objective deviations in localization cues between the sets. Next, the HRTFs are compared in a static subjective localization experiment (N=8). Finally, the localization accuracy, timbre, and overall quality of the HRTF sets are evaluated subjectively (N=12) in a six-degrees-of-freedom audio-visual virtual environment. The results show statistically significant objective deviations between the sets, but no perceived localization or overall quality differences in the dynamic virtual reality.
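
The kind of localization cues such objective comparisons build on can be illustrated with a first-order Python sketch: interaural time difference (ITD) from cross-correlation of an HRIR pair, and a broadband interaural level difference (ILD). The study's auditory-model-based predictions are considerably more sophisticated than this.

```python
# Sketch: crude ITD/ILD estimates from one HRIR pair.
import numpy as np
from scipy.signal import correlate

def itd_ild(hrir_left, hrir_right, fs):
    xcorr = correlate(hrir_left, hrir_right, mode="full")
    lag = np.argmax(np.abs(xcorr)) - (len(hrir_right) - 1)
    itd_us = 1e6 * lag / fs                          # positive: left ear lags
    ild_db = 10 * np.log10(np.sum(hrir_left ** 2) /
                           np.sum(hrir_right ** 2))  # broadband ILD in dB
    return itd_us, ild_db
```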

24 pages, 11913 KiB  
Article
Listener-Position and Orientation Dependency of Auditory Perception in an Enclosed Space: Elicitation of Salient Attributes
by Bogdan Ioan Băcilă and Hyunkook Lee
Appl. Sci. 2021, 11(4), 1570; https://doi.org/10.3390/app11041570 - 9 Feb 2021
Cited by 4 | Viewed by 3275
Abstract
This paper presents a subjective study on the perception of auditory attributes depending on listener position and head orientation in an enclosed space. Two elicitation experiments were carried out using the repertory grid technique—an in-situ and a laboratory experiment—which aimed to identify perceptual attributes among 10 different combinations of listener position and head orientation in a concert hall. It was found that, between the in-situ and laboratory experiments, the listening positions and head orientations were clustered identically. Ten salient perceptual attributes were identified from the data obtained in the laboratory experiment. Whilst these included conventional attributes such as ASW (apparent source width) and LEV (listener envelopment), new attributes such as PRL (perceived reverb loudness), ARW (apparent reverb width) and Reverb Direction were identified, and these are hypothesised to be sub-attributes of LEV. Timbral characteristics such as Reverb Brightness and Echo Brightness were also identified as salient attributes, which are considered to potentially contribute to overall perceived clarity.

20 pages, 5260 KiB  
Article
Creation of Auditory Augmented Reality Using a Position-Dynamic Binaural Synthesis System—Technical Components, Psychoacoustic Needs, and Perceptual Evaluation
by Stephan Werner, Florian Klein, Annika Neidhardt, Ulrike Sloma, Christian Schneiderwind and Karlheinz Brandenburg
Appl. Sci. 2021, 11(3), 1150; https://doi.org/10.3390/app11031150 - 27 Jan 2021
Cited by 12 | Viewed by 4113
Abstract
For spatial audio reproduction in the context of augmented reality, a position-dynamic binaural synthesis system can be used to synthesize the ear signals for a moving listener. The goal is the fusion of the auditory perception of the virtual audio objects with the real listening environment. Such a system has several components, each of which helps to enable a plausible auditory simulation. For each possible position of the listener in the room, a set of binaural room impulse responses (BRIRs) congruent with the expected auditory environment is required to avoid room divergence effects. Adequate and efficient approaches synthesize new BRIRs from very few measurements of the listening room. The required spatial resolution of the BRIR positions can be estimated from spatial auditory perception thresholds. Retrieving and processing the tracking data of the listener's head pose and position, as well as convolving BRIRs with an audio signal, needs to be done in real time. This contribution presents the authors' work on several technical components of such a system in detail, showing how the individual components are shaped by psychoacoustics. Furthermore, the paper discusses the perceptual effects by means of listening tests demonstrating the appropriateness of the approaches.
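
A hedged Python sketch of such a position-dynamic rendering loop: the tracked listener position selects the nearest BRIR from a grid, and output blocks are crossfaded when the BRIR changes. Grid layout, block size, and the linear crossfade are illustrative choices, not the system described in the paper.

```python
# Sketch: nearest-BRIR selection and block rendering with a crossfade.
import numpy as np
from scipy.signal import fftconvolve

def nearest_brir(position, grid_positions, brirs):
    """grid_positions: (n, 2) x/y points; brirs: list of (2, L) BRIR arrays."""
    idx = np.argmin(np.sum((grid_positions - position) ** 2, axis=1))
    return brirs[idx]

def render_block(dry_block, brir_old, brir_new):
    """Crossfade between old and new BRIR renderings to avoid clicks."""
    fade = np.linspace(0.0, 1.0, dry_block.shape[-1])
    out_old = np.stack([fftconvolve(dry_block, ch)[:len(fade)] for ch in brir_old])
    out_new = np.stack([fftconvolve(dry_block, ch)[:len(fade)] for ch in brir_new])
    # NB: convolution tails are discarded for brevity; a real-time system
    # would use partitioned convolution with overlap-add.
    return (1 - fade) * out_old + fade * out_new
```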

Review

17 pages, 2541 KiB  
Review
Psychoacoustic Principle, Methods, and Problems with Perceived Distance Control in Spatial Audio
by Bosun Xie and Guangzheng Yu
Appl. Sci. 2021, 11(23), 11242; https://doi.org/10.3390/app112311242 - 26 Nov 2021
Cited by 3 | Viewed by 2601
Abstract
One purpose of spatial audio is to create perceived virtual sources at various spatial positions, in terms of direction and distance, with respect to the listener. The psychoacoustic principle of spatial auditory perception is essential for creating perceived virtual sources. Currently, the technical means by which various spatial audio techniques recreate virtual sources in different directions are relatively mature. However, perceived distance control in spatial audio remains a challenging task. This article reviews the psychoacoustic principle, methods, and problems with perceived distance control and compares them with the principles and methods of directional localization control in spatial audio, showing that the validity of various methods for perceived distance control depends on the principle and method used for spatial audio. To improve perceived distance control, further research on the detailed psychoacoustic mechanisms of auditory distance perception is required.
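
A toy Python example of the basic principle under review: the direct sound is attenuated with distance (roughly 1/r) while diffuse reverberation stays approximately constant, so both overall level and the direct-to-reverberant ratio vary as distance cues. The reference distance and the fixed reverb level are assumptions for illustration only.

```python
# Sketch: inverse-distance gain on the direct path with a fixed reverb level.
import numpy as np

def render_at_distance(direct, reverb, r, r_ref=1.0):
    """Scale the direct signal by r_ref/r; keep the diffuse reverb fixed."""
    gain = r_ref / max(r, r_ref * 1e-3)          # inverse-distance law, guarded
    return gain * direct + reverb                # DRR falls as r grows
```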
