MDPI - Publisher of Open Access Journals

20 pages, 2639 KB

Open Access Article

Model-Informed Speech Enhancement Using Virtual Room Acoustics and Acoustic Descriptor Optimization

by Samuel Yaw Mensah, Tao Zhang, Xin Zhao and Nahid-Al Mahmud

Sensors 2026, 26(12), 3630; https://doi.org/10.3390/s26123630 - 6 Jun 2026

Viewed by 299

Reverberation and background noise remain persistent obstacles to achieving clear and intelligible speech in enclosed environments. Conventional data-driven or purely empirical dereverberation systems often perform well only under training conditions but lack robustness and physical interpretability when exposed to new acoustic spaces. To address these limitations, this paper proposes a physics-informed speech enhancement algorithm that integrates analytical room acoustics modeling with a descriptor-guided optimization framework. The method employs virtual field simulations based on the Helmholtz equation to estimate three key acoustic descriptors, namely reverberation time (RT60), direct-to-reverberant ratio (DRR), and clarity index (C50), which are then used to adaptively control a model-informed dereverberation filter. This hybrid formulation bridges physical modeling and signal processing, allowing the algorithm to minimize late reverberation energy while maintaining spectral fidelity. Experimental results across multiple simulated and real-room conditions demonstrate measurable improvements over baseline methods, with average gains of +6.4 dB in SNR, +1.2 in PESQ, and +0.13 in STOI, along with reduced RT60 and enhanced clarity. The proposed approach offers both computational efficiency and interpretability, making it suitable for real-time deployment in teleconferencing, hearing-assistive, and smart audio applications. Full article

(This article belongs to the Special Issue Multimodal Signal Processing for Speech Enhancement and Intelligent Sensing)
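
As a rough illustration of two of the descriptors named in this abstract, the sketch below computes C50 and an RT60 estimate from a room impulse response. It is not the authors' method: the function names, the Schroeder backward-integration fit over the -5 dB to -25 dB range, and the synthetic exponentially decaying test response are all assumptions made for this example.

```python
import math


def clarity_index_db(h, fs, split_ms=50.0):
    """Early-to-late energy ratio in dB; split_ms=50 gives C50."""
    split = int(round(fs * split_ms / 1000.0))
    early = sum(x * x for x in h[:split])
    late = sum(x * x for x in h[split:])
    return 10.0 * math.log10(early / late)


def rt60_schroeder(h, fs, start_db=-5.0, end_db=-25.0):
    """RT60 from a straight-line fit to the Schroeder decay curve."""
    # Backward-integrate the squared response to get the energy decay curve.
    edc = []
    acc = 0.0
    for x in reversed(h):
        acc += x * x
        edc.append(acc)
    edc.reverse()
    total = edc[0]
    # Keep the points whose level lies inside the fitting range.
    points = []
    for n, value in enumerate(edc):
        if value <= 0.0:
            break
        level = 10.0 * math.log10(value / total)
        if end_db <= level <= start_db:
            points.append((n / fs, level))
    # Least-squares slope in dB per second, extrapolated to a 60 dB decay.
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    num = sum((t - mean_t) * (l - mean_l) for t, l in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return -60.0 * den / num


def synthetic_rir(rt60, fs, duration):
    """Hypothetical test signal: amplitude decays by 60 dB in rt60 seconds."""
    tau = rt60 / (3.0 * math.log(10.0))  # amplitude time constant
    return [math.exp(-n / (fs * tau)) for n in range(int(fs * duration))]
```

For an ideal exponential decay with RT60 = 0.5 s, the level has dropped by 6 dB at 50 ms, so the expected C50 is about 4.7 dB. A measured response would also need onset detection and noise-floor handling, which this sketch omits.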

13 pages, 1200 KB

Open Access Article

Spatial Release from Masking with Simulated Electric–Acoustic and Cochlear Implant Speech

by Nirmal Srinivasan, Bailey Borkowski, Morgan Barkhouse and Chhayakanta Patro

J. Otorhinolaryngol. Hear. Balance Med. 2026, 7(1), 15; https://doi.org/10.3390/ohbm7010015 - 16 Apr 2026

Viewed by 711

Abstract

Background/Objectives: Spatial release from masking (SRM) refers to the improvement in speech understanding that occurs when a target talker is spatially separated from competing speech. Although normal-hearing (NH) listeners benefit substantially from spatially separating the maskers from the target, cochlear implant (CI) users experience markedly reduced advantages due to degraded spectral and binaural cue transmission. Electric–acoustic stimulation (EAS), which preserves low-frequency acoustic hearing in combination with electric stimulation, may partially restore these cues, but its benefits at small, conversationally relevant spatial separations remain poorly understood. Methods: Speech identification thresholds were measured in NH listeners with a one-up/one-down adaptive procedure and Coordinate Response Measure (CRM) sentences, using natural, simulated EAS, and simulated CI speech across five spatial configurations (0°, ±5°, ±10°, ±15°, ±30°). CI simulation used an eight-channel noise-band vocoder, whereas EAS simulation replaced the two lowest-frequency vocoder channels with low-pass speech (≤500 Hz). All stimuli were spatialized using head-related impulse responses generated from a validated virtual-acoustics model. Results: All stimulus types showed improved thresholds with increasing spatial separation; however, the magnitude of SRM varied systematically. Natural speech produced the lowest thresholds and largest SRM, EAS speech yielded intermediate benefits, and simulated CI speech produced the smallest improvements. Notably, EAS and CI simulations were comparable at small separations, but EAS provided significantly greater SRM at ±15° and ±30°.
Conclusions: These findings demonstrate that even partial low-frequency acoustic preservation enhances SRM at moderate spatial separations, highlighting the importance of EAS configurations for improving spatial hearing in CI-related listening environments. Full article

(This article belongs to the Section Otology and Neurotology)
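
The one-up/one-down adaptive procedure mentioned in the Methods can be sketched as follows. This is a generic staircase, not the study's implementation: the function name, the step size, the stopping rule (a fixed number of reversals), and the use of a target-to-masker level in dB as the tracked variable are assumptions for illustration.

```python
def one_up_one_down(respond, start_level, step, n_reversals):
    """Track the 50%-correct point with a one-up/one-down staircase.

    respond(level) must return True for a correct response.
    Returns the mean level over n_reversals reversals.
    """
    level = start_level
    last_direction = 0
    reversals = []
    while len(reversals) < n_reversals:
        # A correct response makes the task harder (level down),
        # an incorrect response makes it easier (level up).
        direction = -1 if respond(level) else +1
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        level += direction * step
    return sum(reversals) / len(reversals)
```

With a hypothetical deterministic listener who is correct whenever the level is at or above -3 dB, `one_up_one_down(lambda level: level >= -3.0, 10.0, 2.0, 6)` oscillates between -4 and -2 and returns -3.0. A real listener responds probabilistically, so the track converges on the 50%-correct point only on average.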

16 pages, 3089 KB

Open Access Article

A Sound Power Measurement Method for Radiated Noise of the Collaborative Robot with Multi-Joint Arms

by Wenshuo Zhu and Yu Huang

Appl. Sci. 2026, 16(6), 3063; https://doi.org/10.3390/app16063063 - 22 Mar 2026

Viewed by 376

Abstract

The growing demand for noise reduction in multi-joint long-reach robotic arms necessitates the development of precise noise measurement methodologies. However, accurate characterization remains challenging due to the robot’s complex kinematics. Specifically, dynamic joint positions and motion trajectories can lead to acoustic occlusion, while the inherent directivity of sound sources further compromises measurement reliability. To address these issues, this study proposes a component-based hybrid measurement approach. First, the noise generated by a single joint was characterized using a simplified 4-point method, with Green’s function applied to correct for variable propagation distances. Subsequently, the total sound power level of the entire robotic arm was synthesized in a virtual environment by integrating the single-joint acoustic data with the arm’s operational kinematic program. Validation results demonstrate that the proposed method achieves a measurement error of only 0.8 dB relative to the reverberation chamber benchmark—an accuracy superior to that of direct measurements of the full robotic arm cycle (2.6 dB). Furthermore, a comparison with the ISO 3744:2025 9-point standard method reveals that while the proposed 4-point approach yields a slightly larger error (0.8 dB vs. 0.2 dB), it significantly reduces experimental complexity. Consequently, this method offers a sufficiently accurate and operationally efficient solution for practical engineering applications. Full article

(This article belongs to the Special Issue Sound and Vibration: Measurement, Perception, and Control)
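
To make the measurement chain in this abstract more concrete, the sketch below derives a sound power level from a few sound pressure levels on a hemispherical surface. It is a simplified reading, not the paper's procedure: it assumes that the distance correction reduces to free-field 1/r spreading (the magnitude behaviour of the free-space Green's function), it omits the background-noise and environmental corrections of ISO 3744, and all function names are hypothetical.

```python
import math


def energy_average_db(levels_db):
    """Energy (not arithmetic) mean of sound pressure levels in dB."""
    mean_energy = sum(10.0 ** (lp / 10.0) for lp in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_energy)


def correct_to_reference_distance(lp_db, r, r_ref):
    """Free-field 1/r spreading: refer a level measured at r to r_ref."""
    return lp_db + 20.0 * math.log10(r / r_ref)


def sound_power_level_db(levels_db, radius_m):
    """L_W = mean L_p + 10 log10(S / S0) over a hemisphere, S0 = 1 m^2."""
    surface = 2.0 * math.pi * radius_m ** 2
    return energy_average_db(levels_db) + 10.0 * math.log10(surface / 1.0)
```

For example, four microphones each reading 70 dB on a 1 m hemisphere give a sound power level of roughly 78 dB, and a reading of 64 dB taken at 2 m corresponds to about 70 dB at 1 m under the 1/r assumption.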

22 pages, 3040 KB

Open Access Article

Prefabricated Co-Working Spaces’ Window Design: Emotional Salience Scale-Based Optimisation

by Antonio Ciervo, Massimiliano Masullo, Luigi Maffei, Roxana Adina Toma, Maria Dolores Morelli and Michelangelo Scorpio

Buildings 2026, 16(4), 875; https://doi.org/10.3390/buildings16040875 - 22 Feb 2026

Viewed by 615

Abstract

Windows are key elements of the building’s system; they connect workers with the outdoor environment and influence the daylight penetration, sound insulation, and thermal exchanges of façades, but they also moderate workers’ well-being and productivity. This research investigates how the window-to-wall ratio, as well as the position and orientation of mullions, in movable offices affect the combination of workers’ perceptual and emotional responses. A smart co-working prefabricated movable office was modelled in virtual reality to include dynamic visual elements and acoustic stimuli. Experiments were performed in a laboratory under controlled thermal conditions and involved 32 volunteers. The Igroup Presence and Emotional Salience Questionnaires were used to collect subjective responses. The data were analysed using ANOVA and post hoc tests with the Bonferroni correction. Results revealed that window design affects emotional salience. A high window-to-wall ratio with no mullions achieved the highest scores. Increasing the number of mullions, particularly when they obstruct key visual elements, reduced the positive emotional salience rating. Horizontal mullions diminish the spatial perception of the outdoors, interrupting visual continuity and restricting users’ capacity to recognise variations in the views. Finally, the results offer insights and recommendations that can help designers improve window design and people’s well-being and satisfaction. Full article

(This article belongs to the Section Building Energy, Physics, Environment, and Systems)

23 pages, 12136 KB

Open Access Article

DOA Estimation for Underwater Coprime Arrays with Sensor Failure Based on Segmented Array Validation and Multipath Matching Pursuit

by Xiao Chen and Ying Zhang

Algorithms 2026, 19(2), 125; https://doi.org/10.3390/a19020125 - 4 Feb 2026

Cited by 1 | Viewed by 519

Abstract

Coprime arrays enable enhanced degrees of freedom through the construction of virtual array equivalent signals. However, the presence of large “holes” leads to discontinuous co-arrays, which severely hampers direction-of-arrival (DOA) estimation techniques that rely on uniform array structures. This paper explores the practical application of co-array domain signal processing for underwater acoustic coprime arrays. We propose a novel array configuration based on coprime minimum disordered pairs, enabling the formation of continuously connected co-arrays without interpolating. To address the challenge of limited snapshots in underwater environments, DOA estimation can be achieved by utilizing traditional multipath matching pursuit (MMP) algorithms under the proposed continuous co-array implementation scheme. In practical applications, physical array element failures are inevitable, and faulty elements can create holes in the originally continuous co-array. While interpolation techniques can mitigate small gaps, their performance deteriorates significantly in the presence of large holes or uneven data distribution. To overcome these limitations, we introduce a sparse signal recovery (SSR) method using a fragment array data validation technique for sparse DOA estimation with an underwater acoustic coprime array. Based on the designed continuous array expansion scheme, the resulting continuous co-array is used to map the positions of element failures, revealing the gaps in the co-array. A validation model is established for partially continuous sub-arrays within the discontinuous co-array, enabling signal direction estimation based on the fragmented array validation. Both simulation and sea trial results confirm that the proposed approach maximizes the utilization of co-array elements without relying on interpolation or prediction, offering a robust solution for scenarios involving sensor failures. Full article

(This article belongs to the Special Issue Signal Processing, Intelligent Analysis, and Optimization for Communication and Electronic Systems)
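
The "holes" discussed in this abstract can be seen in a small example. The sketch below builds a standard coprime pair (not the paper's proposed configuration based on coprime minimum disordered pairs), forms its difference co-array, and lists the missing lags before and after a sensor failure. The function names and the choice of M = 3, N = 5 are assumptions for illustration.

```python
def coprime_positions(m, n):
    """Sensor positions (in half-wavelength units) of a standard coprime pair."""
    return sorted({n * i for i in range(m)} | {m * j for j in range(n)})


def coarray_holes(positions):
    """Non-negative lags missing from the difference co-array."""
    lags = {abs(a - b) for a in positions for b in positions}
    return [lag for lag in range(max(lags) + 1) if lag not in lags]


array = coprime_positions(3, 5)            # [0, 3, 5, 6, 9, 10, 12]
holes_intact = coarray_holes(array)        # [8, 11]

# Simulate a failed sensor at position 5 and recompute the co-array holes.
working = [p for p in array if p != 5]
holes_failed = coarray_holes(working)      # [5, 8, 11]
```

The failure at position 5 opens a new hole at lag 5, which splits the formerly contiguous run of lags 0 to 7 into two shorter segments. Sub-arrays of this fragmented kind are what the abstract's validation model is described as working on.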

36 pages, 12167 KB

Open Access Article

Perceptual Evaluation of Acoustic Level of Detail in Virtual Acoustic Environments

by Stefan Fichna, Steven van de Par, Bernhard U. Seeber and Stephan D. Ewert

Acoustics 2026, 8(1), 9; https://doi.org/10.3390/acoustics8010009 - 30 Jan 2026

Viewed by 1293

Abstract

Virtual acoustics enables the creation and simulation of realistic and ecologically valid indoor environments vital for hearing research and audiology. For real-time applications, room acoustics simulation requires simplifications. However, the acoustic level of detail (ALOD) necessary to capture all perceptually relevant effects remains unclear. This study examines the impact of varying ALOD in simulations of three real environments: a living room with a coupled kitchen, a pub, and an underground station. ALOD was varied by generating different numbers of image sources for early reflections, or by excluding geometrical room details specific for each environment. Simulations were perceptually evaluated using headphones in comparison to measured, real binaural room impulse responses, or by using loudspeakers. The perceived overall difference, spatial audio quality differences, plausibility, speech intelligibility, and externalization were assessed. A transient pulse, an electric bass, and a speech token were used as stimuli. The results demonstrate that considerable reductions in acoustic level of detail are perceptually acceptable for communication-oriented scenarios. Speech intelligibility was robust across ALOD levels, whereas broadband transient stimuli revealed increased sensitivity to simplifications. High-ALOD simulations yielded plausibility and externalization ratings comparable to real-room recordings under both headphone and loudspeaker reproduction. Full article

23 pages, 31325 KB

Open Access Article

Public Evaluation of Notre-Dame Whispers, a Geolocated Outdoor Audio-Guided Tour of Notre-Dame’s Sonic History

by Julien De Muynke, Stéphanie Peichert and Brian F. G. Katz

Heritage 2026, 9(1), 19; https://doi.org/10.3390/heritage9010019 - 9 Jan 2026

Viewed by 1021

Abstract

This study presents the on-site public evaluation of Notre-Dame Whispers, a geolocated audio-guided tour that explores the sonic history of the Cathédrale Notre-Dame de Paris. The experience combines binaural reproduction, embodied storytelling, and historically informed soundscapes to immerse visitors in the cathedral’s past auditory environments. Drawing on virtually recreated acoustics, it reconstructs key components of Notre-Dame’s sound heritage, including the medieval construction site, early polyphonic chant, and the contemporary urban soundscape. An on-site evaluation was conducted to assess visitor engagement, usability, and the perceived authenticity of the reconstructed soundscapes. A mixed-methods approach integrated questionnaire responses, semi-structured interviews, and anonymized user analytics collected through the mobile application. Results indicate a high level of immersion, with participants particularly valuing the spatialised audio design and narrative depth. However, challenges were identified regarding GPS-based triggering reliability and the difficulty of situational interpretation in complex spatial environments. These findings offer insights into public reception of immersive heritage audio experiences and inform future developments in digital cultural mediation. Full article

(This article belongs to the Special Issue The Past Has Ears: Archaeoacoustics and Acoustic Heritage)

19 pages, 1187 KB

Open Access Article

Dual-Pipeline Machine Learning Framework for Automated Interpretation of Pilot Communications at Non-Towered Airports

by Abdullah All Tanvir, Chenyu Huang, Moe Alahmad, Chuyang Yang and Xin Zhong

Aerospace 2026, 13(1), 32; https://doi.org/10.3390/aerospace13010032 - 28 Dec 2025

Cited by 1 | Viewed by 738

Abstract

Accurate estimation of aircraft operations, such as takeoffs and landings, is critical for airport planning and resource allocation, yet it remains particularly challenging at non-towered airports, where no dedicated surveillance infrastructure exists. Existing solutions, including video analytics, acoustic sensors, and transponder-based systems, are often costly, incomplete, or unreliable in environments with mixed traffic and inconsistent radio usage, highlighting the need for a scalable, infrastructure-free alternative. To address this gap, this study proposes a novel dual-pipeline machine learning framework that classifies pilot radio communications using both textual and spectral features to infer operational intent. A total of 2489 annotated pilot transmissions collected from a U.S. non-towered airport were processed through automatic speech recognition (ASR) and Mel-spectrogram extraction. We benchmarked multiple traditional classifiers and deep learning models, including ensemble methods, long short-term memory (LSTM) networks, and convolutional neural networks (CNNs), across both feature pipelines. Results show that spectral features paired with deep architectures consistently achieved the highest performance, with F1-scores exceeding 91% despite substantial background noise, overlapping transmissions, and speaker variability. These findings indicate that operational intent can be inferred reliably from existing communication audio alone, offering a practical, low-cost path toward scalable aircraft operations monitoring and supporting emerging virtual tower and automated air traffic surveillance applications. Full article

(This article belongs to the Special Issue AI, Machine Learning and Automation for Air Traffic Control (ATC))

27 pages, 2447 KB

Open Access Article

A Subcarrier Silence-Based Anti-Jamming Method for FBMC-IM Underwater Acoustic Communication

by Zheng Wang, Biao Wang, Tao Fang, Biao Liu and Xingyang Nie

Electronics 2025, 14(22), 4497; https://doi.org/10.3390/electronics14224497 - 18 Nov 2025

Viewed by 834

Abstract

Multi-band interference often causes a significant increase in the bit error rate (BER) at the receiver in real underwater acoustic communication environments. To counter this, the paper proposes a subcarrier-silence anti-interference scheme based on filter bank multi-carrier (FBMC) with index modulation (IM). First, an analysis under three different underwater acoustic channels without added interference shows that the underwater acoustic FBMC-IM communication system outperforms traditional FBMC systems in BER performance. Subsequently, targeting the frequency distribution characteristics of multi-band interference, the paper designs an adaptive subcarrier silence mechanism. Through notch detection, interference band information is fed back to the transmitter, and subcarriers within the communication band that overlap with the interference signal spectrum are silenced, while unaffected subcarriers continue to carry communication information, thereby partitioning the band to avoid the interference. Additionally, to further enhance system performance, the paper integrates Virtual Time Reversal Mirror (VTRM) channel equalization, which leverages the time-focusing characteristics of multipath signals to effectively suppress multipath interference and delay spread in the acoustic channel. Simulation and field test results demonstrate that the proposed subcarrier-silence-based FBMC-IM anti-interference scheme significantly improves system reliability under multi-narrowband interference conditions.
In the simulated underwater acoustic channel, the BER is reduced by approximately 65–80% at a signal-to-noise ratio of 0 dB; in the 5 km test channel in the Bohai Sea, the BER is reduced by 70–85% compared to the traditional FBMC system; in the test channel near Dalian with strong multipath spread, the BER is improved by more than one order of magnitude at a signal-to-noise ratio of 30 dB, with a BER reduction exceeding 90% under the configuration of Q = 4, k = 1. These results fully validate the superior anti-interference capability and communication robustness of the proposed scheme in interfering underwater acoustic environments. Full article
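
The subcarrier-silence step described above can be pictured with a small sketch. This shows only the masking logic, under assumptions: each subcarrier is taken to occupy a slot one spacing wide around its centre frequency, the detected interference bands are given as (low, high) pairs in Hz, and the function name and parameters are hypothetical. FBMC filtering, index modulation, and the feedback link are not modelled.

```python
def active_subcarriers(f_start, spacing, count, interference_bands):
    """Return one flag per subcarrier: True = carries data, False = silenced."""
    half = spacing / 2.0
    flags = []
    for k in range(count):
        centre = f_start + k * spacing
        lo, hi = centre - half, centre + half
        # Silence the subcarrier if any interference band overlaps its slot.
        hit = any(b_lo < hi and b_hi > lo for b_lo, b_hi in interference_bands)
        flags.append(not hit)
    return flags
```

For eight subcarriers starting at 10 kHz with 500 Hz spacing and interference at 10.4 to 11.1 kHz and 13.0 to 13.2 kHz, subcarriers 1, 2, and 6 are silenced and the remaining five stay active.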

36 pages, 2468 KB

Open Access Systematic Review

Virtual Reality Application in Evaluating the Soundscape in Urban Environment: A Systematic Review

by Özlem Gök Tokgöz, Margret Sibylle Engel, Cherif Othmani and M. Ercan Altinsoy

Acoustics 2025, 7(4), 68; https://doi.org/10.3390/acoustics7040068 - 17 Oct 2025

Cited by 1 | Viewed by 3450

Abstract

Urban soundscapes are complex due to the interaction of different sound sources and the influence of structures on sound propagation. Moreover, the dynamic nature of sounds over time and space adds to this complexity. Virtual reality (VR) has emerged as a powerful tool to simulate acoustic and visual environments, offering users an immersive sense of presence in controlled settings. This technology facilitates more accurate and predictive assessment of urban environments. It serves as a flexible tool for exploring, analyzing, and interpreting them under repeatable conditions. This study presents a systematic literature review focusing on research that integrates VR technology for the audiovisual reconstruction of urban environments. This topic remains relatively underrepresented in the existing literature. A total of 69 peer-reviewed studies were analyzed in this systematic review. The studies were classified according to research goals, selected urban environments, VR technologies used, technical equipment, and experimental setups. In this study, the relationship between the tools used in urban VR representations is examined, and experimental setups are discussed from both technical and perceptual perspectives. This paper highlights existing challenges and opportunities in using VR to assess soundscapes and offers practical insights for future applications of VR in urban environments. Full article

22 pages, 7050 KB

Open Access Article

Designing for Special Neurological Conditions: Architecture Design Criteria for Anti-Misophonia and Anti-ADHD Spaces for Enhanced User Experience

by Yomna K. Abdallah

Architecture 2025, 5(4), 85; https://doi.org/10.3390/architecture5040085 - 23 Sep 2025

Viewed by 3243

Abstract

ADHD and misophonia are developmental neurological disorders that are currently increasing in prevalence due to excessive acoustic and visual pollution. ADHD, which is characterized by a lack of attention and excessive impulsive hyperactivity, and misophonia, which is hypersensitivity to sounds accompanied by a severe emotional and psychological reaction, are both affected by the user’s spatial environment to a great extent. Spatial design can contribute to increasing or decreasing these unfavorable sensory triggers that affect individuals with ADHD and/or Misophonia. However, the role of architectural spatial design as a therapeutic approach to alleviate the symptoms of Misophonia and ADHD has never been proposed before in the literature, despite its accumulative and chronic effects on the user’s experience in everyday life in terms of well-being and productivity. Therefore, the current work discusses this problem of neglecting the potential effect of architectural spatial design on alleviating Misophonia and ADHD. Thus, the objective of the current work is to propose customized architectural spatial design as a therapeutic approach to alleviate Misophonia and ADHD through adopting the compatible architectural trends of minimal and metaphysical architecture. The methodology of the current work includes a theoretical proposal of this customized architectural spatial design for alleviating these two special neurological conditions. This includes introducing and analyzing these two neurological conditions and their relation to and interaction with architectural spatial design, analyzing minimal and metaphysical architectural trends employed in the proposed therapeutic architectural design, and then proposing augmented and virtual reality as auxiliary add-ons to the architectural spatial design to boost its therapeutic effect. 
Minimal architecture achieves the “no emotion” criteria through reduced forms, patterns, and colors and adopts simple geometry and natural materials to reduce sensory stressors or stimuli, in order to alleviate the loss of attention and distraction prevalent in those with ADHD, as well as allowing the employment of acoustic materials to achieve acoustic comfort and noise blockage for Misophonia relief. Metaphysical architecture leads the hierarchy of sensory experience through the symbolistic, dynamic, and enigmatic composition of forms and colors, which enhance the spatial analysis and cognitive capacities of the inhabitants. Meanwhile, the use of customized virtual and augmented reality environments is an effective add-on to minimal and metaphysical architectural spaces thanks to its proven therapeutic effect in alleviating various neurological disorders and injuries. At this level of intervention, VR/AR can be used as an add-on to minimal-architecture design, to simulate varied scenarios, as minimal design offers a clean canvas for simulating these varied virtual environments. The other option is to build these customized VR/AR scenarios around a specific architectural element as an add-on metaphysical architecture design to lead the sensory experience and enable the user to detach from the physical constraints of the space. AI-generated designs were used as a proof of concept for the proposed customized architectural spatial design following minimal and metaphysical architecture, as well as to provide AR and VR scenarios as add-on architecture to enhance the therapeutic effect of these architectural spaces for Misophonia and ADHD patients. Furthermore, the validity of VR/AR as a therapeutic approach, alongside the customized architectural design, was discussed, and it was concluded that this study proves the need for extended clinical studies on its efficiency in the long run, which will be conducted in the future. Full article

28 pages, 7369 KB

Open Access Article

Comparison of Impulse Response Generation Methods for a Simple Shoebox-Shaped Room

by Lloyd May, Nima Farzaneh, Orchisama Das and Jonathan S. Abel

Acoustics 2025, 7(3), 56; https://doi.org/10.3390/acoustics7030056 - 6 Sep 2025

Cited by 2 | Viewed by 4567

Abstract

Simulated room impulse responses (RIRs) are important tools for studying architectural acoustics. Many methods exist to generate RIRs, each with unique properties that need to be considered when choosing an RIR synthesis technique. Despite the variation in synthesis techniques, there is a dearth of comparisons between these techniques. To address this, a comprehensive comparison of four major categories of RIR synthesis techniques was conducted: wave-based methods (hybrid FEM and modal analysis), geometrical acoustics methods (the image source method and ray tracing), delay-network reverberators (SDNs), and statistical methods (Sabine-NED). To compare these techniques, RIRs were recorded in a simple shoebox-shaped racquetball court, and we compared the synthesized RIRs against these recordings. We conducted both objective analyses, such as energy decay curves, normalized echo density, and frequency-dependent decay times, and a perceptual assessment of synthesized RIRs, which consisted of a listening assessment with 29 participants that utilized a MUSHRA comparison methodology. Our results reveal distinct advantages and limitations across synthesis categories. For example, the Sabine-NED technique was indistinguishable from the recorded IR, but it does not scale well with increasing geometric complexity. These findings provide valuable insights for selecting appropriate synthesis techniques for applications in architectural acoustics, immersive audio rendering, and virtual reality environments. Full article
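
As background to the statistical category mentioned in this abstract, Sabine's classical formula estimates the reverberation time of a room from its volume and total absorption. The sketch below covers only that formula for a shoebox room with a single uniform absorption coefficient. It does not reproduce the paper's Sabine-NED synthesis technique, and the function name and example dimensions are hypothetical.

```python
def sabine_rt60(length, width, height, absorption):
    """Sabine reverberation time (s) for a shoebox room, metric units."""
    volume = length * width * height
    surface = 2.0 * (length * width + length * height + width * height)
    # 0.161 s/m is the usual Sabine constant for air at room temperature.
    return 0.161 * volume / (surface * absorption)
```

For a 5 m by 4 m by 3 m room with an average absorption coefficient of 0.1, this gives a reverberation time of roughly 1.03 s.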

22 pages, 3273 KB

Open Access Article

Virtual Acoustic Environment Rehearsal and Performance in an Unknown Venue

by Charlotte Fernandez, Martin S. Lawless, David Poirier-Quinot and Brian F. G. Katz

Virtual Worlds 2025, 4(3), 35; https://doi.org/10.3390/virtualworlds4030035 - 1 Aug 2025

Cited by 2 | Viewed by 3126

Abstract

Due to the effect of room acoustics on musical interpretation, a musician’s rehearsal may be greatly enhanced by leveraging virtual and augmented reality technology. This paper presents a preliminary study on a rehearsal tool designed for musicians, enabling practice in a virtual acoustic environment with audience-positioned playback. Fourteen participants, both professional and non-professional musicians, were recruited to practice with the rehearsal tool prior to performing in an unfamiliar venue. Throughout the rehearsal, the subjects either played in a virtual environment that matched the acoustics of the performance venue or one that was acoustically different. A control group rehearsed in an acoustically dry room with no virtual acoustic environment. The tool’s effectiveness was evaluated with two 16-item questionnaires that assessed quality, usefulness, satisfaction with the rehearsal, and aspects of the performance. Findings indicate that rehearsing in a virtual acoustic environment that matches the performance venue improves acoustic awareness during the performance and enhances ease and comfort on stage compared to practising in a different environment. These results support the integration of virtual acoustics in rehearsal tools to help musicians better adapt their performance to concert settings. Full article

(This article belongs to the Special Issue Contemporary Developments in Mixed, Augmented, and Virtual Reality: Implications for Teaching and Learning)

24 pages, 18515 KB

Open AccessArticle

Simplified Fly Tower Modeling for Preliminary Acoustic Predictions in Opera Houses

by Fabrizio Cumo, Umberto Derme and Sofia Agostinelli

Appl. Sci. 2025, 15(15), 8393; https://doi.org/10.3390/app15158393 - 29 Jul 2025

Viewed by 1680

Abstract

The acoustic field of an opera house is much more difficult to predict than that of a concert hall because, in the fly tower, the absorption characteristics vary over time according to the layout of the opera being staged. For this reason, the paper aims to find a simplified fly tower model to be used as a fixed reference in preliminary acoustic predictions for opera houses. First, with reference to a case study, the effects of the fly tower depth and absorptive characteristics are investigated to identify the simplified model. As a traditional opera is set on an empty stage, and modern pieces are supported by a virtual projected environment, the influence of the variable stage elements on Reverberation Time (RT), Clarity (C80), and Strength (G) is considered, comparing the traditional Semiramide opera to a modern digital one, according to the Just Noticeable Difference (JND). Results confirm the utility of the suggested fly tower model, which does not require any set definition.

(This article belongs to the Special Issue Acoustics Analysis and Noise Control for Buildings)

12 pages, 1391 KB

Open AccessArticle

Speech Intelligibility in Virtual Avatars: Comparison Between Audio and Audio–Visual-Driven Facial Animation

by Federico Cioffi, Massimiliano Masullo, Aniello Pascale and Luigi Maffei

Acoustics 2025, 7(2), 30; https://doi.org/10.3390/acoustics7020030 - 23 May 2025

Cited by 1 | Viewed by 3438

Abstract

Speech intelligibility (SI) is critical to effective communication across various settings, although it is often compromised by adverse acoustic conditions. In noisy environments, visual cues such as lip movements and facial expressions, when congruent with auditory information, can significantly enhance speech perception and reduce cognitive effort. As virtual environments become more widespread, communicating through virtual avatars is increasingly prevalent, which requires a comprehensive understanding of these dynamics to ensure effective interactions. The present study used Unreal Engine’s MetaHuman technology to compare four methodologies used to create facial animation: MetaHuman Animator (MHA), MetaHuman LiveLink (MHLL), Audio-Driven MetaHuman (ADMH), and Synthetized Audio-Driven MetaHuman (SADMH). Thirty-six word pairs from the Diagnostic Rhyme Test (DRT) were used as input stimuli to create the animations and to compare them in terms of intelligibility. Moreover, to simulate a challenging background noise, the animations were mixed with babble noise at a signal-to-noise ratio of −13 dB(A). Participants assessed a total of 144 facial animations. Results showed the ADMH condition to be the most intelligible among the methodologies used, probably because its generated facial animations offer enhanced clarity and consistency while eliminating distractions such as micro-expressions and natural variations in human articulation.

Search Results (61)
