Search Results (120)

Search Parameters:
Keywords = room reverberation

16 pages, 4224 KB  
Article
Optimizing Museum Acoustics: How Absorption Magnitude and Surface Location of Finishing Materials Influence Acoustic Performance
by Milena Jonas Bem and Jonas Braasch
Acoustics 2025, 7(3), 43; https://doi.org/10.3390/acoustics7030043 - 11 Jul 2025
Viewed by 535
Abstract
The architecture of contemporary museums often emphasizes visual aesthetics, such as large volumes, open-plan layouts, and highly reflective finishes, resulting in acoustic challenges, such as excessive reverberation, poor speech intelligibility, elevated background noise, and reduced privacy. This study quantified the impact of surface-specific absorption treatments on acoustic metrics across eight gallery spaces. Room impulse responses were used to calibrate virtual models, which simulated nine absorption scenarios (low, medium, and high on ceilings, floors, and walls) and evaluated reverberation time (T20), speech transmission index (STI), clarity (C50), distraction distance (rD), Spatial Decay Rate of Speech (D2,S), and Speech Level at 4 m (Lp,A,S,4m). The results indicate that going from concrete to a wooden floor yields the most rapid T20 reductions (up to −1.75 s), ceiling treatments deliver the greatest STI and C50 gains (e.g., STI increases of +0.16), and high-absorption walls maximize privacy metrics (D2,S and Lp,A,S,4m). A linear regression model further predicted the STI from T20, total absorption (Sabins), and room volume, with an 84.9% conditional R², enabling ±0.03 accuracy without specialized testing. These findings provide empirically derived, surface-specific “first-move” guidelines for architects and acousticians, underscoring the necessity of integrating acoustics early in museum design to balance auditory and visual objectives and enhance the visitor experience. Full article

24 pages, 7707 KB  
Article
Improving Building Acoustics with Coir Fiber Composites: Towards Sustainable Construction Systems
by Luis Bravo-Moncayo, Virginia Puyana-Romero, Miguel Chávez and Giuseppe Ciaburro
Sustainability 2025, 17(14), 6306; https://doi.org/10.3390/su17146306 - 9 Jul 2025
Cited by 1 | Viewed by 713
Abstract
Studies underscore the significance of coir fibers as a sustainable building material. Based on these insights, this research aims to evaluate coir fiber composite panels of various thicknesses as eco-friendly sound absorbing alternatives to synthetic construction materials like rockwool and fiberglass, aligning its use with the United Nations Sustainable Development Goals. Acoustic absorption was quantified with an impedance tube, and subsequent simulations compared the performance of coir composite panels with that of conventional materials, which constitutes an underexplored evaluation. Using 10 receiver points, the simulations reproduced the acoustic conditions of a multipurpose auditorium before and after the coir covering of parts of the rear and posterior walls. The results indicate that when coir coverings account for approximately 10% of the auditorium surface, reverberation times at 250, 500, 2000, and 4000 Hz are reduced by roughly 1 s. Furthermore, the outcomes reveal that early reflections occur more rapidly in the coir-enhanced model, while the values of the early decay time parameter decrease across all receiver points. Although the original configuration had poor speech clarity, the modified model achieved optimal values at all the measurement locations. These findings underscore the potential of coir fiber panels in enhancing acoustic performance while fostering sustainable construction practices. Full article
(This article belongs to the Special Issue Sustainable Architecture: Energy Efficiency in Buildings)

6 pages, 797 KB  
Proceeding Paper
Machine Learning Classifiers for Voice Health Assessment Under Simulated Room Acoustics
by Ahmed M. Yousef and Eric J. Hunter
Eng. Proc. 2024, 81(1), 16; https://doi.org/10.3390/engproc2024081016 - 7 May 2025
Viewed by 441
Abstract
Machine learning (ML) robustness for voice disorder detection was evaluated using reverberation-augmented recordings. Common vocal health assessment voice features from steady vowel samples (135 pathological, 49 controls) were used to train/test six ML classifiers. Detection performance was evaluated under low-reverb and simulated medium (med = 0.48 s) and high-reverb times (high = 1.82 s). All models’ performance declined with longer reverberation. Support Vector Machine exhibited slight robustness but faced performance challenges. Random Forest and Gradient Boosting, though strong under low reverb, lacked generalizability in med/high reverb. Training/testing ML on augmented data is essential to enhance their reliability in real-world voice assessments. Full article
(This article belongs to the Proceedings of The 1st International Online Conference on Bioengineering)

17 pages, 4556 KB  
Article
Acoustic Investigations of Two Barrel-Vaulted Halls: Sisto V in Naples and Aula Magna at the University of Parma
by Antonella Bevilacqua, Adriano Farina, Gino Iannace and Jessica Ferrari
Appl. Sci. 2025, 15(9), 5127; https://doi.org/10.3390/app15095127 - 5 May 2025
Viewed by 741
Abstract
The percentage of historical heritage buildings in Italy is substantial. Many of these buildings are abandoned or not adequately restored for public access due to safety concerns. However, some are managed by city councils and made available to local communities. These heritage buildings, valued for their historical significance, are now frequently used for live events, including musical performances by ensembles and small groups. This paper deals with the acoustics of two rooms provided with barrel-vaulted ceilings: Sisto V Hall in Naples and Aula Magna at the University of Parma. These spaces are structurally very similar, differing mainly in length. Acoustic measurements conducted in both halls reveal reverberation times of approximately 4.5 s at mid frequencies, resulting in poor speech clarity. This is primarily due to the presence of reflective surfaces, as the walls and ceilings are plastered, and the floors are tiled. To optimize their acoustic properties for functions such as celebrations, gatherings, and conferences, an acoustic design intervention was proposed. Digital models of the halls were calibrated and used to correct the acoustics by incorporating absorbing panels on the walls and carpeting on the floors of the central walk path. This treatment successfully balanced the reverberation time to approximately 1.3–1.4 s at mid frequencies, making speech more intelligible. Additionally, an amplified audio system was analyzed to enhance sound distribution, ensuring uniform coverage, even in the last rows of seating. Under amplified conditions, sound pressure levels (SPLs) range between 90 dB and 93 dB, with appropriate gain control applied to the column array speakers. Full article
(This article belongs to the Special Issue Architectural Acoustics: From Theory to Application)
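
The reduction reported in the abstract above (from about 4.5 s to about 1.3–1.4 s at mid frequencies) can be put in perspective with Sabine's formula, T60 = 0.161 V / A in metric units. The sketch below is only a back-of-the-envelope illustration: the hall volume is a hypothetical placeholder, not a value from the paper, and Sabine's diffuse-field assumption may not hold well in long barrel-vaulted halls.

```python
SABINE_CONSTANT = 0.161  # s/m, metric form of Sabine's formula


def sabine_rt60(volume_m3, absorption_m2):
    """Reverberation time (s) from room volume and total absorption area."""
    return SABINE_CONSTANT * volume_m3 / absorption_m2


def required_absorption(volume_m3, target_rt60_s):
    """Total absorption area (m^2 Sabine) needed to reach a target T60."""
    return SABINE_CONSTANT * volume_m3 / target_rt60_s


# Hypothetical hall volume, chosen only for illustration.
volume = 3000.0

a_before = required_absorption(volume, 4.5)   # untreated hall, ~4.5 s
a_after = required_absorption(volume, 1.35)   # treated hall, ~1.3-1.4 s

# The ratio does not depend on the assumed volume.
ratio = a_after / a_before
print(f"Absorption must grow by a factor of about {ratio:.2f}")
```

Because the volume cancels in the ratio, the estimate that total absorption has to grow by roughly 4.5 / 1.35 holds for any hall to which Sabine's formula applies.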

27 pages, 3004 KB  
Article
Designing for Neonates’ Wellness: Differences in the Reverberation Time Between an Incubator Located in an Open Unit and in a Private Room of a NICU
by Virginia Puyana-Romero, Daniel Nuñez-Solano, Ricardo Hernández-Molina, Francisco Fernández-Zacarías, Juan Jimenez and Giuseppe Ciaburro
Buildings 2025, 15(9), 1411; https://doi.org/10.3390/buildings15091411 - 22 Apr 2025
Viewed by 457
Abstract
Noise levels in Neonatal Intensive Care Units (NICUs) significantly impact neonatal health, influencing stress levels, sleep cycles, and overall development. One critical factor in managing noise is reverberation time (T), which affects sound persistence and acoustic comfort. This study, conducted at the Universidad de Las Américas in Quito, Ecuador, examines T in two NICU room types—open unit and private room. Measurements were taken in simulated environments to assess acoustic differences between these two designs. Results indicate that T is significantly lower in private rooms compared to open units, suggesting that private rooms provide a more controlled and acoustically favorable environment for neonates. Lower T reduces excessive noise exposure, improving sleep quality and minimizing stress responses in preterm infants. Furthermore, the findings align with Sustainable Development Goals (SDGs), particularly SDG 3 (Good Health and Well-being) and SDG 11 (Sustainable Cities and Communities), by advocating for hospital designs that enhance patient health and promote sustainable infrastructure. These results highlight the importance of integrating acoustically optimized spaces in NICUs to improve neonatal outcomes and contribute to a more sustainable healthcare system. Future research should further explore architectural solutions for noise reduction to refine NICU design standards. Full article
(This article belongs to the Special Issue Acoustics and Well-Being: Towards Healthy Environments)

17 pages, 4416 KB  
Article
Discover the Acoustics of Vanvitelli Architecture in the Royal Palace of Caserta
by Gino Iannace, Ilaria Lombardi, Ernesto Scarano and Amelia Trematerra
Heritage 2025, 8(4), 142; https://doi.org/10.3390/heritage8040142 - 16 Apr 2025
Viewed by 622
Abstract
In this paper, the acoustic characteristics of the most important rooms of the Royal Palace of Caserta are presented. The palace, built in the XVIII century as a residence for the King of Naples, consists of numerous rooms dedicated to court life. The acoustic properties of the rooms have been studied according to ISO 3382. For each room, the average values of reverberation time (T30), clarity (C80), definition (D50), and Speech Transmission Index (STI) are reported. The acoustic issues of the rooms are highlighted as the understanding of acoustics during the period in which the palace was constructed was limited. While the rudiments of Vitruvius’ theories were known, the good acoustics of the rooms resulted primarily from the intuition and experience of the architects who designed them. The building materials—marble and plaster—contribute to the long reverberation times in the rooms. Special attention was given to the elliptical vault where musicians were positioned, the Palatine Chapel, the theatre used for court entertainment, and the Royal Throne Room. The study applies methods and techniques already seen in the literature and already reported in other published papers. Full article
(This article belongs to the Special Issue Acoustical Heritage: Characteristics and Preservation)

17 pages, 2640 KB  
Article
Study on Acoustic Properties of Helmholtz-Type Honeycomb Sandwich Acoustic Metamaterials
by Xiao-Ling Gai, Xian-Hui Li, Xi-Wen Guan, Tuo Xing, Ze-Nong Cai and Wen-Cheng Hu
Materials 2025, 18(7), 1600; https://doi.org/10.3390/ma18071600 - 1 Apr 2025
Cited by 2 | Viewed by 725
Abstract
In order to improve the acoustic performance of honeycomb sandwich structures, a Helmholtz-type honeycomb sandwich acoustic metamaterial (HHSAM) was proposed. The theoretical and finite element models were established by calculating the acoustic impedance of multiple parallel Helmholtz resonators (HR). By comparing the sound absorption of the single and multiple HR, it was found that the simulation results were basically consistent with the theoretical calculations. The sound absorption and insulation performance of the honeycomb panels, the honeycomb perforated panels, and the HHSAM structure were compared through impedance tube experiments. The results showed that, over a wide frequency range, the acoustic performance of the HHSAM structure was superior to that of the other two structures. Under scattered sound field conditions, the reverberation room results showed that the sound absorption of the HHSAM structure was better than that of the honeycomb panel in the frequency range of 100–5000 Hz. The noise reduction coefficient (NRC) of the honeycomb panel was 0.1, indicating almost no sound absorption effect in engineering. The NRC of the HHSAM structure could reach 0.35. In terms of sound insulation, the HHSAM structure was more prominent in the 400–4000 Hz range than the honeycomb panel. In the frequency range of 500–1600 Hz, the transmission loss of the HHSAM was 5 dB higher than that of the honeycomb panel. Full article
(This article belongs to the Special Issue Novel Materials for Sound-Absorbing Applications)

17 pages, 10294 KB  
Article
Virtual Sound Source Construction Based on Direct-to-Reverberant Ratio Control Using Multiple Pairs of Parametric-Array Loudspeakers and Conventional Loudspeakers
by Masato Nakayama, Takuma Ekawa, Toru Takahashi and Takanobu Nishiura
Appl. Sci. 2025, 15(7), 3744; https://doi.org/10.3390/app15073744 - 28 Mar 2025
Viewed by 666
Abstract
We propose a new method for constructing a virtual sound source (VSS) based on the direct-to-reverberant ratio (DRR) of room impulse responses (RIRs), using multiple pairs of parametric-array loudspeakers (PALs) and conventional loudspeakers (hereafter referred to simply as loudspeakers). In this paper, we focus on the differences in the DRRs of the RIRs generated by PALs and loudspeakers. The DRR of an RIR is recognized as a key cue for distance perception. A PAL can achieve super-directivity using an array of ultrasonic transducers. Its RIR exhibits a high DRR, characterized by a large-amplitude direct wave and low-amplitude reverberations. Consequently, a PAL makes the VSS appear to be closer to the listener. In contrast, a loudspeaker causes the VSS to be perceived as farther away because the sound it emits has a low DRR. The proposed method leverages the differences in the DRRs of the RIRs between PALs and loudspeakers. It controls the perceived distance of the VSS by reproducing the desired DRR at the listener’s position through a weighted combination of the RIRs emitted from PALs and loudspeakers into the air. Additionally, the proposed method adjusts the direction of the VSS using vector-based amplitude panning (VBAP). Finally, we have confirmed the effectiveness of the proposed method through evaluation experiments. Full article
(This article belongs to the Special Issue Spatial Audio and Sound Design)
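
The direct-to-reverberant ratio (DRR) that the abstract above relies on can be illustrated with a short sketch. This is not the authors' implementation: the function names, the 2.5 ms direct-sound window, and the toy impulse responses are all assumptions made for illustration. It computes the DRR of a room impulse response as the energy around the main peak divided by the energy of the remaining tail, then shows that a weighted sum of a high-DRR and a low-DRR response lands in between.

```python
import math


def direct_to_reverberant_ratio(rir, fs, direct_window_ms=2.5):
    """Return the DRR in dB of a room impulse response (list of samples).

    The direct part is a short window around the largest peak; everything
    after that window counts as reverberation. The window half-width is an
    illustrative assumption.
    """
    peak = max(range(len(rir)), key=lambda i: abs(rir[i]))
    half = int(round(direct_window_ms * 1e-3 * fs))
    start = max(0, peak - half)
    end = min(len(rir), peak + half + 1)
    direct = sum(x * x for x in rir[start:end])
    reverb = sum(x * x for x in rir[end:])
    if direct == 0.0 or reverb == 0.0:
        raise ValueError("RIR needs both direct and reverberant energy")
    return 10.0 * math.log10(direct / reverb)


def mix_rirs(rir_a, rir_b, weight):
    """Weighted sum of two equal-length RIRs: weight*a + (1 - weight)*b."""
    if len(rir_a) != len(rir_b):
        raise ValueError("RIRs must have the same length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(rir_a, rir_b)]


fs = 8000  # Hz, hypothetical sampling rate

# Toy RIRs: a unit direct impulse followed by a flat tail.
rir_pal = [1.0] + [0.0] * 20 + [0.01] * 100  # weak tail, high DRR
rir_ls = [1.0] + [0.0] * 20 + [0.1] * 100    # strong tail, low DRR

drr_pal = direct_to_reverberant_ratio(rir_pal, fs)
drr_ls = direct_to_reverberant_ratio(rir_ls, fs)
drr_mix = direct_to_reverberant_ratio(mix_rirs(rir_pal, rir_ls, 0.5), fs)
print(drr_ls, drr_mix, drr_pal)
```

In this toy setup the mixing weight acts as a single control that moves the DRR between the two extremes, which is the intuition behind steering perceived distance with a weighted combination of the two loudspeaker types.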

21 pages, 3626 KB  
Article
Exploring Factors Influencing Speech Intelligibility in Airport Terminal Pier-Style Departure Lounges
by Xi Li and Yuezhe Zhao
Buildings 2025, 15(3), 426; https://doi.org/10.3390/buildings15030426 - 29 Jan 2025
Cited by 2 | Viewed by 1015
Abstract
This study investigates speech intelligibility and its influencing factors within pier-style airport lounges and assesses the applicability of the Speech Transmission Index (STI) in these large, elongated spaces. Field impulse response measurements were conducted in two pier-style departure lounges with volumes of 98,099 m³ and 60,414 m³, respectively, complemented by simulated binaural room impulse responses for subjective speech intelligibility testing in Mandarin. The research explores the correlations between various acoustic parameters, namely Early Decay Time (EDT), Reverberation Time (T30), and Definition (D50), and speech intelligibility scores under different Signal-to-Noise Ratios (SNRs). Findings indicate a significant impact of SNR on speech intelligibility, with a coefficient of determination (R²) of 0.849, suggesting substantial variability explained by SNR. As SNR increases to 10 dB(A), speech intelligibility scores improve significantly; however, further gains in clarity diminish beyond this threshold. Additionally, the study reveals a significant relationship between room acoustic parameters, particularly EDT and D50, and speech intelligibility scores, with EDT having a negative impact and D50 a positive impact on speech clarity. The results confirm the suitability of STI in evaluating speech intelligibility in these specific architectural contexts. This study recommends maintaining an SNR of 10 dB(A) and a minimum STI of 0.45 for public address broadcasts in pier-style departure lounges to ensure that announcements are clearly audible to passengers. Full article
(This article belongs to the Section Building Energy, Physics, Environment, and Systems)

14 pages, 2353 KB  
Article
Sensitivity of Acoustic Voice Quality Measures in Simulated Reverberation Conditions
by Ahmed M. Yousef and Eric J. Hunter
Bioengineering 2024, 11(12), 1253; https://doi.org/10.3390/bioengineering11121253 - 11 Dec 2024
Cited by 3 | Viewed by 1210
Abstract
Room reverberation can affect oral/aural communication and is especially critical in computer analysis of voice. High levels of reverberation can distort voice recordings, impacting the accuracy of quantifying voice production quality and vocal health evaluations. This study quantifies the impact of additive simulated reverberation on otherwise clean voice recordings as reflected in voice metrics commonly used for voice quality evaluation. From a larger database of voice recordings collected in a low-noise, low-reverberation environment, voice samples of a sustained [a:] vowel produced at two different speaker intents (comfortable and clear) by five healthy voice college-age female native English speakers were used. Using the reverb effect in Audacity, eight reverberation situations indicating a range of reverberation times (T20 between 0.004 and 1.82 s) were simulated and convolved with the original recordings. All voice samples, both original and reverberation-affected, were analyzed using freely available PRAAT software (version 6.0.13) to calculate five common voice parameters: jitter, shimmer, harmonic-to-noise ratio (HNR), alpha ratio, and smoothed cepstral peak prominence (CPPs). Statistical analyses assessed the sensitivity and variations in voice metrics to a range of simulated room reverberation conditions. Results showed that jitter, HNR, and alpha ratio were stable at simulated reverberation times below T20 of 1 s, with HNR and jitter more stable in the clear vocal style. Shimmer was highly sensitive even at T20 of 0.53 s, which would reflect a common room, while CPPs remained stable across all simulated reverberation conditions. 
Understanding the sensitivity and stability of these voice metrics to a range of room acoustics effects allows for targeted use of certain metrics even in less controlled environments, enabling selective application of stable measures like CPPs and cautious interpretation of shimmer, ensuring more reliable and accurate voice assessments. Full article
(This article belongs to the Special Issue Models and Analysis of Vocal Emissions for Biomedical Applications)

15 pages, 11679 KB  
Article
Convergence Time Measurement Method of Active Noise Cancelling Headphones
by Agata Zatorska, Michał Łuczyński and Wojciech Bartnik
Acoustics 2024, 6(4), 1100-1114; https://doi.org/10.3390/acoustics6040060 - 30 Nov 2024
Viewed by 4751
Abstract
The aim of this paper is to develop and describe an objective method for measuring the performance of headphones with active noise cancellation (ANC). The focus was on measuring both passive and active sound attenuation and determining the convergence time of the ANC system. A new parameter, the reaction speed (expressed in dB/ms), was introduced, allowing an accurate correlation of the active attenuation values with the time needed to achieve them. A series of tests were conducted using three active noise cancelling headphone models of different prices and specifications. The response times were recorded and analyzed. Measurements were performed on two different dummy head models and under two different measurement conditions (reverberation chamber and acoustically adapted room). The results revealed differences between the models, with some headphones consistently providing a better reaction speed. Remarkably, the headphones associated with the lower reaction speed were also the cheapest. This justifies the need for the reaction speed to be a parameter provided by the manufacturer in the datasheet. Full article
(This article belongs to the Special Issue Active Control of Sound and Vibration)

15 pages, 2846 KB  
Article
History and Acoustics of Preaching in Notre-Dame de Paris
by Elliot K. Canfield-Dafilou, Brian F. G. Katz and Beatrice Caseau Chevallier
Heritage 2024, 7(12), 6614-6628; https://doi.org/10.3390/heritage7120306 - 26 Nov 2024
Cited by 1 | Viewed by 1844
Abstract
This article investigates the audibility and intelligibility of preaching in a loud voice inside the Cathedral Notre-Dame de Paris during the Middle Ages, after the construction of the Gothic cathedral, until the late 19th century. Through this time period, the locations where oration took place changed along with religious practices inside the cathedral. Here, we combine a historical approach with room acoustic modelling to evaluate the locations inside the cathedral where one would hear sermons well. In a reverberant cathedral such as Notre-Dame, speech would be most intelligible in areas near the orator. Until the introduction of electronically amplified public address systems, speech would not be intelligible throughout the entire cathedral. Full article
(This article belongs to the Special Issue The Past Has Ears: Archaeoacoustics and Acoustic Heritage)

14 pages, 6674 KB  
Article
Application of Hybrid Absorptive–Diffusive Panels with Variable Acoustic Characteristics Based on Wooden Overlays Designed Using Third-Degree-of-Freedom Bezier Curves
by Bartlomiej Chojnacki, Kamil Schynol and Klara Chojnacka
Materials 2024, 17(22), 5421; https://doi.org/10.3390/ma17225421 - 6 Nov 2024
Viewed by 993
Abstract
This manuscript describes the application of novel hybrid acoustic panels with variable acoustic properties that could be used in the design process. Despite the significant growth in the modern acoustic absorbing and diffusing panel sector in recent years, there is still a need for sustainable and original designs that will fit standard interior design trends. The most significant requirement is satisfying the design needs of variable acoustic venues. The availability of acoustic panels with variable properties is minimal, as most designs are based on textiles in the form of rolling banners; therefore, there is no market diversity. The current paper presents an original solution for a novel perforated wooden panel based on third-degree-of-freedom curves. Due to the possibility of exchanging the front panel, the acoustic surface can be varied and adjusted to the room considering different requirements for the acoustic climate, for example, by modifying the attenuation range from low to mid–high frequencies. The novel panels have unique esthetic properties with functional acoustic features regarding sound diffusion and absorption. In this paper, sound absorption and diffusion measurements will be presented for the different variants of the panels, presenting the option to modify the parameters to adjust the panel’s features to the room’s needs. In situ acoustic measurements in a laboratory were conducted to test the variable acoustic panels’ influence on the room’s acoustic parameters, such as T30 and C80. In summary, the advantages of this kind of design will be discussed, alongside the possible impact on modern construction materials’ utilization in architecture. Full article

24 pages, 3684 KB  
Article
Speech Emotion Recognition Using Transfer Learning: Integration of Advanced Speaker Embeddings and Image Recognition Models
by Maros Jakubec, Eva Lieskovska, Roman Jarina, Michal Spisiak and Peter Kasak
Appl. Sci. 2024, 14(21), 9981; https://doi.org/10.3390/app14219981 - 31 Oct 2024
Cited by 1 | Viewed by 2923
Abstract
Automatic Speech Emotion Recognition (SER) plays a vital role in making human–computer interactions more natural and effective. A significant challenge in SER development is the limited availability of diverse emotional speech datasets, which hinders the application of advanced deep learning models. Transfer learning is a machine learning technique that helps address this issue by utilizing knowledge from pre-trained models to improve performance on a new task in a target domain, even with limited data. This study investigates the use of transfer learning from various pre-trained networks, including speaker embedding models such as d-vector, x-vector, and r-vector, and image classification models like AlexNet, GoogLeNet, SqueezeNet, ResNet-18, and ResNet-50. We also propose enhanced versions of the x-vector and r-vector models incorporating Multi-Head Attention Pooling and Angular Margin Softmax, alongside other architectural improvements. Additionally, reverberation from the Room Impulse Response datasets was added to the speech utterances to diversify and augment the available data. Notably, the enhanced r-vector model achieved classification accuracies of 74.05% Unweighted Accuracy (UA) and 73.68% Weighted Accuracy (WA) on the IEMOCAP dataset, and 80.25% UA and 79.81% WA on the CREMA-D dataset, outperforming the existing state-of-the-art methods. This study shows that using cross-domain transfer learning is beneficial for low-resource emotion recognition. The enhanced models developed in other domains (for non-emotional tasks) can further improve the accuracy of SER. Full article

33 pages, 46059 KB  
Article
Real and Virtual Lecture Rooms: Validation of a Virtual Reality System for the Perceptual Assessment of Room Acoustical Quality
by Angela Guastamacchia, Riccardo Giovanni Rosso, Giuseppina Emma Puglisi, Fabrizio Riente, Louena Shtrepi and Arianna Astolfi
Acoustics 2024, 6(4), 933-965; https://doi.org/10.3390/acoustics6040052 - 30 Oct 2024
Viewed by 2476
Abstract
Enhancing the acoustical quality in learning environments is necessary, especially for hearing aid (HA) users. When in-field evaluations cannot be performed, virtual reality (VR) can be adopted for acoustical quality assessments of existing and new buildings, contributing to the acquisition of subjective impressions in lab settings. To ensure an accurate spatial reproduction of the sound field in VR for HA users, multi-speaker-based systems can be employed to auralize a given environment. However, most systems require a lot of effort due to cost, size, and construction. This work deals with the validation of a VR-system based on a 16-speaker-array synced with a VR headset, arranged to be easily replicated in small non-anechoic spaces and suitable for HA users. Both objective and subjective validations are performed against a real university lecture room of 800 m³ and with 2.3 s of reverberation time at mid-frequencies. Comparisons of binaural and monaural room acoustic parameters are performed between measurements in the real lecture room and its lab reproduction. To validate the audiovisual experience, 32 normal-hearing subjects were administered the Igroup Presence Questionnaire (IPQ) on the overall sense of perceived presence. The outcomes confirm that the system is a promising and feasible tool to predict the perceived acoustical quality of a room. Full article
(This article belongs to the Special Issue Acoustical Comfort in Educational Buildings)