Spatial Audio and Sound Design

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (20 April 2025) | Viewed by 9491

Special Issue Editors


Guest Editor
Laboratory of Electronic Media, School of Journalism & Mass Communications, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
Interests: sound source localization; audiovisual stream management; audio semantics

Guest Editor
Laboratory of Electronic Media, School of Journalism & Mass Communications, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
Interests: audio semantics; deep learning; multimodal systems

Special Issue Information

Dear Colleagues,

Spatial audio has attracted considerable attention in recent years with the emergence of immersive environments and eXtended Reality. This has opened new possibilities in sound design while fueling rapid developments in the 3D acoustic modeling of spaces and in spatial audio encoding and distribution, and it has raised new challenges in playback, which also involve human perception. New recording techniques are being introduced, extending the established knowledge on microphone arrays. Moreover, sound source localization is evolving, not only for creative purposes and sound design, but also for industrial and other applications, such as underwater acoustics. The simultaneous rise of deep learning unlocks new possibilities for data-driven approaches to managing and extracting information from large volumes of multichannel spatial audio. New publicly available benchmark datasets are expected to help advance this field further in the near future. This Special Issue invites researchers to contribute to these and related topics in spatial audio through original research articles and review papers.

Dr. Nikolaos Vryzas
Dr. Lazaros Vrysis
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • sound source localization
  • sound design and immersive environments
  • data-driven approaches and machine learning for multichannel audio
  • spatial audio and room acoustics
  • perception and subjective evaluation
  • sound design for music, audio, and performing arts
  • spatial audio recording and playback
  • spatial audio encoding and distribution

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)

Research

13 pages, 608 KiB  
Article
The Impact of Noise on Learning in Children and Adolescents: A Meta-Analysis
by Gabriela Fretes and Ramon Palau
Appl. Sci. 2025, 15(8), 4128; https://doi.org/10.3390/app15084128 - 9 Apr 2025
Viewed by 446
Abstract
Given the growing body of research on the impact of noise, synthesizing these findings is crucial to gaining a comprehensive understanding of the influence of noise on student performance. This meta-analysis investigates the effects of environmental and classroom noise on learning, with a focus on cognitive and academic performance in elementary and secondary school students. A systematic review and meta-analysis were conducted on 21 studies comprising 152 effect sizes. Different noise types were analyzed in relation to cognitive functions such as attention, memory, comprehension, and overall academic performance. The Restricted Maximum Likelihood (REML) method was used to estimate the overall effect size, resulting in a value of −0.46 (95% CI: −0.54 to −0.38), indicating a moderate negative impact of noise on performance. The negative effects were particularly significant in children aged 6 to 12. Despite high heterogeneity across the studies, likely due to variations in noise types and study designs, model fit measures confirmed the adequacy of the meta-analytic model. These findings underscore the importance of mitigating noise in educational settings to improve students’ cognitive and academic outcomes.
(This article belongs to the Special Issue Spatial Audio and Sound Design)

17 pages, 10294 KiB  
Article
Virtual Sound Source Construction Based on Direct-to-Reverberant Ratio Control Using Multiple Pairs of Parametric-Array Loudspeakers and Conventional Loudspeakers
by Masato Nakayama, Takuma Ekawa, Toru Takahashi and Takanobu Nishiura
Appl. Sci. 2025, 15(7), 3744; https://doi.org/10.3390/app15073744 - 28 Mar 2025
Viewed by 314
Abstract
We propose a new method for constructing a virtual sound source (VSS) based on the direct-to-reverberant ratio (DRR) of room impulse responses (RIRs), using multiple pairs of parametric-array loudspeakers (PALs) and conventional loudspeakers (hereafter referred to simply as loudspeakers). In this paper, we focus on the differences in the DRRs of the RIRs generated by PALs and loudspeakers. The DRR of an RIR is recognized as a key cue for distance perception. A PAL can achieve super-directivity using an array of ultrasonic transducers. Its RIR exhibits a high DRR, characterized by a large-amplitude direct wave and low-amplitude reverberations. Consequently, a PAL makes the VSS appear to be closer to the listener. In contrast, a loudspeaker causes the VSS to be perceived as farther away because the sound it emits has a low DRR. The proposed method leverages the differences in the DRRs of the RIRs between PALs and loudspeakers. It controls the perceived distance of the VSS by reproducing the desired DRR at the listener’s position through a weighted combination of the RIRs emitted from PALs and loudspeakers into the air. Additionally, the proposed method adjusts the direction of the VSS using vector-based amplitude panning (VBAP). Finally, we have confirmed the effectiveness of the proposed method through evaluation experiments.
(This article belongs to the Special Issue Spatial Audio and Sound Design)

17 pages, 2555 KiB  
Article
Spatial Sound Rendering Using Intensity Impulse Response and Cardioid Masking Function
by Witold Mickiewicz and Mirosław Łazoryszczak
Appl. Sci. 2025, 15(3), 1112; https://doi.org/10.3390/app15031112 - 23 Jan 2025
Viewed by 663
Abstract
This study presents a new technique for creating spatial sounds based on a convolution processor. The main objective of this research was to propose a new method for generating a set of impulse responses that guarantee a realistic spatial experience based on the fusion of amplitude data acquired from an omnidirectional microphone and directional data acquired from an intensity probe. The advantages of the proposed approach are its versatility and easy adaptation to playback in a variety of multi-speaker systems, as well as a reduction in the amount of data, thereby simplifying the measurement procedure required to create any set of channel responses at the post-production stage. This paper describes the concept behind the method, the data acquisition method, and the signal processing algorithm required to generate any number of high-quality channel impulse responses. Experimental results are presented to confirm the suitability of the proposed solution by comparing the results obtained for a traditional surround 5.1 recording system and the proposed approach. This study aims to highlight the potential of intensity impulse responses in the audio recording and virtual reality industries.
(This article belongs to the Special Issue Spatial Audio and Sound Design)

23 pages, 5822 KiB  
Article
Reverb and Noise as Real-World Effects in Speech Recognition Models: A Study and a Proposal of a Feature Set
by Valerio Cesarini and Giovanni Costantini
Appl. Sci. 2024, 14(23), 11446; https://doi.org/10.3390/app142311446 - 9 Dec 2024
Cited by 1 | Viewed by 1122
Abstract
Reverberation and background noise are common and unavoidable real-world phenomena that hinder automatic speaker recognition systems, particularly because these systems are typically trained on noise-free data. Most models rely on fixed audio feature sets. To evaluate the dependency of features on reverberation and noise, this study proposes augmenting the commonly used mel-frequency cepstral coefficients (MFCCs) with relative spectral (RASTA) features. The performance of these features was assessed using noisy data generated by applying reverberation and pink noise to the DEMoS dataset, which includes 56 speakers. Verification models were trained on clean data using MFCCs, RASTA features, or their combination as inputs. They were validated on augmented data with progressively increasing noise and reverberation levels. The results indicate that MFCCs struggle to identify the main speaker, while the RASTA method has difficulty with the opposite class. The hybrid feature set, derived from their combination, demonstrates the best overall performance as a compromise between the two. Although the MFCC method is the standard and performs well on clean training data, it shows a significant tendency to misclassify the main speaker in real-world scenarios, which is a critical limitation for modern user-centric verification applications. The hybrid feature set, therefore, proves effective as a balanced solution, optimizing both sensitivity and specificity.
(This article belongs to the Special Issue Spatial Audio and Sound Design)

16 pages, 1422 KiB  
Article
Limitations and Performance Analysis of Spherical Sector Harmonics for Sound Field Processing
by Hanwen Bi, Shaoheng Xu, Fei Ma, Thushara D. Abhayapala and Prasanga N. Samarasinghe
Appl. Sci. 2024, 14(22), 10633; https://doi.org/10.3390/app142210633 - 18 Nov 2024
Viewed by 811
Abstract
Developing spherical sector harmonics (SSHs) benefits sound field decomposition and analysis over spherical sector regions. Although SSHs demonstrate potential in the field of spatial audio, a comprehensive investigation into their properties and performance is absent. This paper seeks to close this gap by revealing three key limitations of SSHs and exploring their performance in two aspects: sector sound field radial extrapolation and sector sound field decomposition and reconstruction. First, SSHs are not solutions to the Helmholtz equation, which is their main limitation. Second, due to the violation of the Helmholtz equation, SSHs lack the ability to conduct sound field radial extrapolation, especially for interior cases. Third, when using SSHs to decompose and reconstruct a sound field, the shifted associated Legendre polynomials and scaled exponential function in SSHs result in severe distortion around the edge of the sector region. In light of these three limitations, the future implementation of SSHs should focus on processing and analyzing the measurement sector region without any extrapolation process, and the measurement region should be larger than the target sector region.
(This article belongs to the Special Issue Spatial Audio and Sound Design)

16 pages, 3236 KiB  
Article
Comparison of the Sensitivity of Various Fibers in Distributed Acoustic Sensing
by Artem T. Turov, Yuri A. Konstantinov, D. Claude, Vitaliy A. Maximenko, Victor V. Krishtop, Dmitry A. Korobko and Andrei A. Fotiadi
Appl. Sci. 2024, 14(22), 10147; https://doi.org/10.3390/app142210147 - 6 Nov 2024
Viewed by 2003
Abstract
Standard single-mode telecommunication optical fiber is still one of the most popular fibers in distributed acoustic sensing. Understanding the acoustic, mechanical and optical features of the various fibers currently available can lead to better optimization of distributed acoustic sensors, cost reduction and adaptation for specific needs. In this paper, a study of the performance of seven fibers with different coatings and production methods in a distributed acoustic sensor setup is presented. The main results include the amplitude–frequency characteristic for each of the investigated fibers in the range of acoustic frequencies from 100 to 7000 Hz. A single-mode fiber fabricated using the modified chemical vapor deposition technique together with a polyimide coating has shown the best sensitivity to acoustic events in the investigated range of frequencies. These results allow us both to compare the studied specialty fibers with the standard single-mode fiber and to choose the most suitable fiber for a specific application, enhancing the performance of distributed acoustic sensors and improving their adaptation to newly emerging applications.
(This article belongs to the Special Issue Spatial Audio and Sound Design)

Review

46 pages, 2469 KiB  
Review
A Review on Head-Related Transfer Function Generation for Spatial Audio
by Valeria Bruschi, Loris Grossi, Nefeli A. Dourou, Andrea Quattrini, Alberto Vancheri, Tiziano Leidi and Stefania Cecchi
Appl. Sci. 2024, 14(23), 11242; https://doi.org/10.3390/app142311242 - 2 Dec 2024
Viewed by 3261
Abstract
A head-related transfer function (HRTF) is a mathematical model that describes the acoustic path between a sound source and a listener’s ear. HRTFs play a crucial role in creating immersive audio experiences through headphones or loudspeakers using binaural synthesis techniques. HRTF measurements can be conducted either with standardised mannequins or with in-ear microphones on real subjects. However, various challenges arise from, for example, individual differences in head shape, pinnae geometry, and torso dimensions, as well as from the extensive number of measurements required for optimal audio immersion. To address these issues, numerous methods have been developed to generate new HRTFs from existing data or through computer simulations. This review paper provides an overview of the current approaches and technologies for generating, adapting, and optimising HRTFs, with a focus on physical modelling, anthropometric techniques, machine learning methods, interpolation strategies, and their practical applications.
(This article belongs to the Special Issue Spatial Audio and Sound Design)