Search Results (1,040)

Search Parameters:
Keywords = sound recording

23 pages, 1037 KB  
Article
Acoustic Side-Channel Vulnerabilities in Keyboard Input Explored Through Convolutional Neural Network Modeling: A Pilot Study
by Michał Rzemieniuk, Artur Niewiarowski and Wojciech Książek
Appl. Sci. 2026, 16(2), 563; https://doi.org/10.3390/app16020563 - 6 Jan 2026
Viewed by 124
Abstract
This paper presents the findings of a pilot study investigating the feasibility of recognizing keyboard keystroke sounds using Convolutional Neural Networks (CNNs) as a means of simulating an acoustic side-channel attack aimed at recovering typed text. A dedicated dataset of keyboard audio recordings was collected and preprocessed using signal-processing techniques, including Fourier-transform-based feature extraction and mel-spectrogram analysis. Data augmentation methods were applied to improve model robustness, and a CNN-based prediction architecture was developed and trained. A series of experiments was performed under multiple conditions, including controlled laboratory settings, scenarios with background noise interference, tests involving a different keyboard model, and evaluations following model quantization. The results indicate that CNN-based models can achieve high keystroke-prediction accuracy, demonstrating that this class of acoustic side-channel attacks is technically viable. Additionally, the study outlines potential mitigation strategies designed to reduce exposure to such threats. Overall, the findings highlight the need for increased awareness of acoustic side-channel vulnerabilities and underscore the importance of further research to more comprehensively understand, evaluate, and prevent attacks of this nature. Full article
(This article belongs to the Special Issue Artificial Neural Network and Deep Learning in Cybersecurity)

15 pages, 4535 KB  
Article
Histomorphometric Analysis of the Endometrium of Jennies (Equus asinus) and Mares (Equus caballus) in Estrus: Anatomical Differences and Possible Reproductive Implications
by Pilar Vallejo-Soto, Jesús Dorado, Rafaela Herrera-García, Carmen Álvarez-Delgado, Jaime Gómez-Laguna, Álvaro de Santiago, María Manrique, Antonio González Ariza, José Manuel León Jurado, Manuel Hidalgo and Isabel Ortiz
Animals 2026, 16(1), 143; https://doi.org/10.3390/ani16010143 - 4 Jan 2026
Viewed by 219
Abstract
Assisted reproductive techniques are often extrapolated from horses to donkeys, despite poorer fertility outcomes in jennies. This issue has been attributed to unknown uterine species-specific differences. This study compared, through histomorphometry, the endometrium of jennies and mares. Endometrial biopsies (N = 12) were taken from reproductively sound jennies (n = 6) and mares (n = 6) in estrus. Histomorphometric analysis evaluated luminal (LE, µm) and glandular epithelium height (GE, µm), glandular lumen diameter (LD, µm), glandular area (GA, µm²), the number of glands (#G), and glandular tissue percentage (GT, %), measured in the stratum compactum (SC) and spongiosum (SS). A total of 30 measurements of glandular size parameters and 10 fields of glandular density parameters per sample were recorded. Results were statistically compared between species (jennies vs. mares), parity status (maiden vs. foaling), and stratum (SC vs. SS). Jennies exhibited higher (p < 0.05) values than mares for LE, LD-SC, GA-SC, and GT-SC. These findings suggest that the histomorphometric features observed in reproductively sound jennies reflect anatomical differences that might partly explain previously observed species differences in post-breeding uterine response. In conclusion, histomorphometry revealed significant endometrial differences between species, with jennies displaying taller luminal epithelium, greater glandular size, and higher glandular tissue percentage in the SC than mares. Full article
(This article belongs to the Section Animal Reproduction)

19 pages, 1346 KB  
Article
AI-Based Respiratory Monitoring-Guided Evaluation of Rottlerin Therapy for PRRS in Grower–Finisher Pig Farms
by Cha Eun Yoon, Dong Hyun Cho, Hye Lim Park, Ju Yeon Song, Sangshin Park, Sang Won Lee, Yun Young Go, In-Soo Choi, Chang-Seon Song, Joong-Bok Lee, Seung-Yong Park and Yeong-Lim Kang
Viruses 2026, 18(1), 72; https://doi.org/10.3390/v18010072 - 4 Jan 2026
Viewed by 258
Abstract
Porcine reproductive and respiratory syndrome virus (PRRSV) remains a major cause of economic loss in the swine industry, and highly pathogenic variants such as NADC34-like PRRSV highlight the need for antiviral strategies that complement vaccination. In this field study, we evaluated the efficacy of AlimenWOW, a rottlerin–lipid formulation, in grower–finisher pigs under commercial conditions using AI-based respiratory monitoring. A total of 2000 pigs were assigned to four groups: AlimenWOW G1 (PRRSV-stable source farm), AlimenWOW G2 (PRRSV-unstable source farm), Control 1 (antibiotic), and Control 2 (antipyretic). Respiratory Health Status (ReHS) and a derived Clinical Cough Index (CCI = 100 − ReHS) were continuously recorded with SoundTalks®, and oral fluid PRRSV load, serology, clinical outcomes, and productivity were assessed over 4 weeks. AlimenWOW G2 showed a marked improvement in ReHS from severely compromised baseline values to levels comparable with healthy status, while both control groups remained low; CCI was significantly lower in AlimenWOW G2 than in controls from day 14 onward (p ≤ 0.0001). AlimenWOW treatment was associated with reduced PRRSV titers in oral fluid, lower mortality and wasting rates, and improved feed conversion with lower feed costs compared with controls. These findings indicate that AlimenWOW, integrated with AI-based acoustic monitoring, can improve respiratory health and mitigate PRRSV-associated clinical and economic losses, supporting its use as a complementary tool in PRRSV control programs. Full article
(This article belongs to the Section Animal Viruses)

22 pages, 9913 KB  
Article
Analysis of BirdNET Configuration and Performance Applied to the Acoustic Monitoring of a Restored Quarry
by Carlos Iglesias-Merchan, Raquel Sanchez-Torres and Raúl Alonso
Environments 2026, 13(1), 31; https://doi.org/10.3390/environments13010031 - 2 Jan 2026
Viewed by 555
Abstract
In the global context of biodiversity loss, increased demand for natural resources, and major efforts to restore ecosystems altered by human activities, the widespread use of passive acoustic monitoring (PAM) and acoustic recording devices allows for the collection of enormous amounts of data for monitoring the health of ecosystems. BirdNET Analyzer is a freely accessible machine learning tool that has had a great impact on the scientific community due to its apparent ease of use for identifying animals by sound. However, the literature shows some gaps regarding the influence of certain BirdNET configuration parameters on the results of its predictions. This study applies PAM and uses BirdNET in a real acoustic monitoring project and analyzes the potential impact of the configuration parameters Overlap and Sensitivity on the results of the bird inventory of a wetland created on the site of a former limestone quarry in Spain. Our results guide other researchers in the optimal combination of configuration parameters at the community level. Higher Sensitivity configuration values provided the optimal solution for minimizing the loss of species in the bird inventory. On the other hand, we identified that Recall is the best indicator to identify all combinations of BirdNET configuration parameters that cause the lowest species loss, in line with the goal of this monitoring program. Full article
(This article belongs to the Special Issue Interdisciplinary Noise Research)

11 pages, 1725 KB  
Article
Tool Wear Detection in Milling Using Convolutional Neural Networks and Audible Sound Signals
by Halil Ibrahim Turan and Ali Mamedov
Machines 2026, 14(1), 59; https://doi.org/10.3390/machines14010059 - 2 Jan 2026
Viewed by 227
Abstract
Timely tool wear detection has been an important target for the metal cutting industry for decades because of its significance for part quality and production cost control. With the shift toward intelligent and sustainable manufacturing, reliable tool-condition monitoring has become even more critical. One of the main challenges in sound-based tool wear monitoring is the presence of noise interference, instability and the highly volatile nature of machining acoustics, which complicates the extraction of meaningful features. In this study, a Convolutional Neural Network (CNN) model is proposed to classify tool wear conditions in milling operations using acoustic signals. Sound recordings were collected from tools at different wear stages under two cutting speeds, and Mel-Frequency Cepstral Coefficients (MFCCs) were extracted to obtain a compact representation of the short-term power spectrum. These MFCC matrices enabled the CNN to learn discriminative spectral patterns associated with wear. To evaluate model stability and reduce the effects of algorithmic randomness, training was repeated three times for each cutting speed. For the 520 rpm dataset, the model achieved an average validation accuracy of 96.85 ± 2.07%, while for the 635 rpm dataset it achieved 93.69 ± 2.07%. The results demonstrate the feasibility of using acoustic signals, despite inherent noise challenges, as a complementary approach for identifying suitable tool replacement intervals in milling. Full article
(This article belongs to the Special Issue Intelligent Tool Wear Monitoring)

25 pages, 6809 KB  
Article
Sound Insulation Prediction and Analysis of Vehicle Floor Systems Based on Squeeze-and-Excitation ResNet Method
by Yan Ma, Jingjing Wang, Dianlong Pan, Wei Zhao, Xiaotao Yang, Xiaona Liu, Jie Yan and Weiping Ding
Electronics 2026, 15(1), 184; https://doi.org/10.3390/electronics15010184 - 30 Dec 2025
Viewed by 252
Abstract
The floor acoustic package is a crucial component of a vehicle’s overall acoustic insulation system, and its performance directly influences the interior sound field distribution and acoustic comfort. Conventional investigations of acoustic package performance primarily rely on experimental testing and computer-aided engineering (CAE) simulations. However, these methods often suffer from limited accuracy control, high computational cost, and low efficiency. In contrast, data-driven modeling approaches have recently demonstrated strong potential in addressing these challenges. In this paper, a Squeeze-and-Excitation Residual Network (SE-ResNet) is proposed to predict and analyze the sound insulation performance of vehicle floor systems based on the original structural and material parameters of acoustic package components. By replacing the conventional CAE process with a data-driven framework, the proposed method enhances prediction accuracy and computational efficiency. With the lowest recorded RMSE of 0.4048 dB across the 200–8000 Hz spectrum, the SE-ResNet model ranks first in overall performance. It substantially outperforms the SE-CNN (0.9207 dB) and also shows a clear advantage over both the SE-LSTM (0.4591 dB) and the ResNet (0.4593 dB). Validation using the acoustic package data of a new vehicle model further confirms the robustness of the proposed approach, yielding an overall RMSE = 0.4089 dB and CORR = 0.9996 on the test dataset. These results collectively demonstrate that the SE-ResNet-based method presents a promising and robust solution for forecasting the sound insulation performance of vehicle floor systems. Moreover, the proposed framework offers methodological and technical support for the data-driven prediction and analysis of other vehicle noise and vibration problems. Full article
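The two figures of merit quoted in this abstract, RMSE (in dB) and CORR, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes CORR denotes the Pearson correlation coefficient, and the function names `rmse` and `pearson_corr` are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences (same unit as the inputs)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared_error / len(predicted))


def pearson_corr(x, y):
    """Pearson correlation coefficient (assumed here to be what the abstract calls CORR)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

Applied to predicted and measured sound insulation curves sampled at the same frequency bands, `rmse` returns an error in dB and `pearson_corr` a unitless value between -1 and 1.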

21 pages, 1302 KB  
Article
Heart Sound Classification with MFCCs and Wavelet Daubechies Analysis Using Machine Learning Algorithms
by Sebastian Guzman-Alfaro, Karen E. Villagrana-Bañuelos, Manuel A. Soto-Murillo, Jorge Isaac Galván-Tejada, Antonio Baltazar-Raigosa, Angel Garcia-Duran, José María Celaya-Padilla and Andrea Acuña-Correa
Diagnostics 2026, 16(1), 83; https://doi.org/10.3390/diagnostics16010083 - 26 Dec 2025
Viewed by 342
Abstract
Background/Objectives: Cardiovascular diseases are the leading cause of mortality worldwide according to the World Health Organization (WHO), highlighting the need for accessible tools for early detection. Automated classification systems based on signal processing and machine learning offer a non-invasive alternative to support clinical diagnosis. Methods: This study implements and evaluates machine learning models for distinguishing normal and abnormal heart sounds using a hybrid feature extraction approach. Recordings labeled as normal, murmur, and extrasystolic were obtained from the PASCAL dataset and subsequently binarized into two classes. Multiple numerical datasets were generated through statistical features derived from Mel-Frequency Cepstral Coefficients (MFCCs) and Daubechies wavelet analysis. Each dataset was standardized and used to train four classifiers: support vector machines, logistic regression, random forests, and decision trees. Results: Model performance was assessed using accuracy, precision, recall, specificity, F1-score, and area under curve. All classifiers achieved notable results; however, the support vector machine model trained with 26 MFCCs and Daubechies-4 wavelet coefficients obtained the best performance. Conclusions: These findings demonstrate that the proposed hybrid MFCC–Wavelet framework provides competitive diagnostic accuracy and represents a lightweight, interpretable, and computationally efficient solution for computer-aided auscultation and early cardiovascular screening. Full article
(This article belongs to the Special Issue Artificial Intelligence and Computational Methods in Cardiology 2025)

24 pages, 10048 KB  
Entry
Immersive Methods and Biometric Tools in Food Science and Consumer Behavior
by Abdul Hannan Zulkarnain and Attila Gere
Encyclopedia 2026, 6(1), 2; https://doi.org/10.3390/encyclopedia6010002 - 22 Dec 2025
Viewed by 263
Definition
Immersive methods and biometric tools provide a rigorous, context-rich way to study how people perceive and choose food. Immersive methods use extended reality, including virtual, augmented, mixed, and augmented virtual environments, to recreate settings such as homes, shops, and restaurants. They increase participants’ sense of presence and the ecological validity (realism of conditions) of experiments, while still tightly controlling sensory and social cues like lighting, sound, and surroundings. Biometric tools record objective signals linked to attention, emotion, and cognitive load via sensors such as eye-tracking, galvanic skin response (GSR), heart rate (and variability), facial electromyography, electroencephalography, and functional near-infrared spectroscopy. Researchers align stimuli presentation, gaze, and physiology on a common temporal reference and link these data to outcomes like liking, choice, or willingness-to-buy. This approach reveals implicit responses that self-reports may miss, clarifies how changes in context shift perception, and improves predictive power. It enables faster, lower-risk product and packaging development, better-informed labeling and retail design, and more targeted nutrition and health communication. Good practices emphasize careful system calibration, adequate statistical power, participant comfort and safety, robust data protection, and transparent analysis. In food science and consumer behavior, combining immersive environments with biometrics yields valid, reproducible evidence about what captures attention, creates value, and drives food choice. Full article
(This article belongs to the Collection Food and Food Culture)

30 pages, 4486 KB  
Article
Passive Localization in GPS-Denied Environments via Acoustic Side Channels: Harnessing Smartphone Microphones to Infer Wireless Signal Strength Using MFCC Features
by Khalid A. Darabkh, Oswa M. Amro and Feras B. Al-Qatanani
J. Sens. Actuator Netw. 2025, 14(6), 119; https://doi.org/10.3390/jsan14060119 - 16 Dec 2025
Viewed by 448
Abstract
The Global Positioning System (GPS) and Received Signal Strength Indicator (RSSI) usage for location provenance often fails in obstructed, noisy, or densely populated urban environments. This study proposes a passive location provenance method that uses the location’s acoustics and the device’s acoustic side channel to address these limitations. With the smartphone’s internal microphone, we can effectively capture the subtle vibrations produced by the capacitors within the voltage-regulating circuit during wireless transmissions. Subsequently, we extract key features from the resulting audio signals. Meanwhile, we record the RSSI values of the WiFi access points received by the smartphone in the exact location of the audio recordings. Our analysis reveals a strong correlation between acoustic features and RSSI values, indicating that passive acoustic emissions can effectively represent the strength of WiFi signals. Hence, the audio recordings can serve as proxies for Radio-Frequency (RF)-based location signals. We propose a location-provenance framework that utilizes sound features alone, particularly the Mel-Frequency Cepstral Coefficients (MFCCs), achieving coarse localization within approximately four kilometers. This method requires no specialized hardware, works in signal-degraded environments, and introduces a previously overlooked privacy concern: that internal device sounds can unintentionally leak spatial information. Our findings highlight a novel passive side-channel with implications for both privacy and security in mobile systems. Full article

33 pages, 1114 KB  
Article
Bangladesh’s Ship Recycling Industry in the Global South: Readiness, Regional Competition, and Reform Imperatives
by Khandakar Akhter Hossain
Sustainability 2025, 17(24), 10998; https://doi.org/10.3390/su172410998 - 9 Dec 2025
Viewed by 547
Abstract
The ship recycling industry in Bangladesh has transformed from informal, beaching-based operations into a globally significant sector, representing over 45% of global recycling tonnage and providing essential raw materials and employment opportunities. This study adopts a mixed-methods design, combining secondary data analysis (2014–2024 gross tonnage records), over 500 stakeholder interviews, and ARIMA-based scenario forecasting up to 2050. The findings indicate that the sector contributes approximately USD 2.1 billion annually to the national economy and supports more than 250,000 direct and indirect jobs. Despite its economic significance, major compliance gaps persist with the Hong Kong International Convention (HKC): only about 52% of yards are certified or in the process of certification. Workplace accident rates remain roughly 30% higher than regional averages, while environmental assessments reveal elevated heavy metal concentrations in soil and water, underscoring weak regulatory enforcement and environmental management. Comparative analysis shows that India has successfully modernized over 120 HKC-compliant yards through targeted policy and financial incentives, whereas Pakistan is rapidly upgrading its Gadani facilities through major investment programs. Forecasting results identify three trajectories: a baseline of ~2.7 million GT annually to 2050, an optimistic expansion to ~5 million GT with green reforms, and a pessimistic decline below 2 million GT if progress stagnates. To ensure sustainable advancement, five strategic policy pillars are proposed, offering an evidence-based roadmap for Bangladesh to achieve safe, environmentally sound, and globally competitive ship recycling. Full article

13 pages, 866 KB  
Article
Designing a Virtual Reality Platform for University Students: An Immersive Approach to Developing Oral Presentation Skills
by Yasna Sandoval, Carlos Rojas, Gabriel Lagos, Bárbara Farías, Soledad Quezada and Luis Gajardo
Educ. Sci. 2025, 15(12), 1655; https://doi.org/10.3390/educsci15121655 - 8 Dec 2025
Viewed by 393
Abstract
The increasing use of Virtual Reality in education has demonstrated its potential to enhance student engagement and skill development. This study investigates the design and implementation of a VR platform aimed at helping university students improve their oral presentation skills, while also evaluating user satisfaction through structured surveys. A total of 40 university students from the Speech and Language Therapy program participated in this study, focusing on their interactions with a custom-built, realistic VR application inspired by the main auditorium of their university. The students faced various distraction scenarios that emulated real-life public speaking challenges. Cybersickness symptoms were continuously monitored throughout the sessions; no participants reported or exhibited symptoms requiring interruption of the VR exposure. The VR environment was constructed using Unity and featured adjustable audience sizes, ambient sound controls, and recording capabilities for presentations. The results demonstrated significant enhancements in oral presentation skills post-VR training. Participants exhibited significant improvements in speaking fluency and clarity of expression, as well as reduced anxiety, during the VR experience. Specifically, their fluency increased significantly, and their clarity ratings also improved substantially. Furthermore, behavioral indicators showed a marked decrease in anxiety levels. Participants reported that the immersive nature of the VR experience enhanced their enjoyment, contributing positively to the overall outcomes. The findings suggest that VR is an effective tool for enhancing oral presentation skills in university students, leading to improved confidence and performance in real-life situations. Full article

13 pages, 64366 KB  
Article
Pilot Passive Acoustic Monitoring in the Strait of Gibraltar: First Evidence of Iberian Orca Calls and 40 Hz Fin Whale Foraging Signals
by Javier Almunia, Sergio García Beitia, Jonas Philipp Lüke, Fernando Rosa and Renaud de Stephanis
J. Mar. Sci. Eng. 2025, 13(12), 2330; https://doi.org/10.3390/jmse13122330 - 8 Dec 2025
Viewed by 586
Abstract
The Strait of Gibraltar is a major biogeographic bottleneck connecting the Atlantic Ocean and the Mediterranean Sea, where migratory cetaceans coexist with an intense maritime traffic. To evaluate the feasibility of broadband passive acoustic monitoring (PAM) for both soundscape characterisation and cetacean detection, a short drifting-buoy experiment was conducted near Barbate, Spain, in May 2025. The system, equipped with a calibrated SoundTrap 400 recorder, continuously sampled the underwater acoustic environment for 2.5 h. Analysis of the recordings revealed vocalisations of Orcinus orca, representing the first preliminary and incomplete description of the Iberian killer whale acoustic repertoire, and numerous transient tonal events with energy peaks between 40 and 50 Hz, consistent with baleen whale sounds previously attributed to foraging fin whales (Balaenoptera physalus). Sperm whale clicks and delphinid whistles were also occasionally detected. The power spectral density analysis further showed a persistent anthropogenic component dominated by vessel noise below 200 Hz and narrow-band echosounder signals at 30 and 50 kHz. These findings confirm the potential of PAM to detect multiple cetacean species and to resolve the complex interplay between biophony and anthropophony in one of the world’s busiest marine corridors. Establishing a permanent PAM observatory in the Strait would enable continuous, non-intrusive monitoring of species presence, behaviour, and habitat use, thereby contributing to conservation efforts for endangered populations such as the Iberian killer whale. Full article
(This article belongs to the Special Issue Recent Advances in Marine Bioacoustics)

16 pages, 1427 KB  
Article
Acoustic Vector Sensor–Based Speaker Diarization Using Sound Intensity Analysis for Two-Speaker Dialogues
by Grzegorz Szwoch, Józef Kotus and Szymon Zaporowski
Appl. Sci. 2025, 15(23), 12780; https://doi.org/10.3390/app152312780 - 3 Dec 2025
Viewed by 844
Abstract
Speaker diarization is a key component of automatic speech recognition (ASR) systems, particularly in interview scenarios where speech segments must be assigned to individual speakers. This study presents a diarization algorithm based on sound intensity analysis using an Acoustic Vector Sensor (AVS). The algorithm determines the azimuth of each speaker, defines directional beams, and detects speaker activity by analyzing intensity distributions within each beam, enabling identification of both single and overlapping speech segments. A dedicated dataset of interview recordings involving five speakers was created for evaluation. Performance was assessed using the Diarization Error Rate (DER) metric and compared with the State-of-the-Art Pyannote.audio system. The proposed AVS-based method achieved a lower DER value (0.112) than Pyannote (0.213) without overlapping speech, and a DER equal to 0.187 with overlapping speech included, demonstrating improved diarization accuracy and better handling of overlapping speech. The algorithm does not require training, operates independently of speaker-specific features, and can be adapted to various acoustic conditions. The results confirm that AVS-based diarization provides a robust and interpretable alternative to neural approaches, particularly suitable for structured two-speaker dialogues such as physician–patient or interviewer–interviewee scenarios. Full article
(This article belongs to the Special Issue Advances in Audio Signal Processing)
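The Diarization Error Rate used in this abstract is conventionally defined as the sum of false-alarm, missed-speech, and speaker-confusion time divided by the total reference speech time. A minimal sketch of that standard definition follows; the function name and the example durations are hypothetical and are not taken from the paper.

```python
def diarization_error_rate(false_alarm, missed, confusion, total_reference):
    """Standard DER: (false alarm + missed speech + speaker confusion) / reference speech time.

    All arguments are durations in the same unit (for example, seconds).
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (false_alarm + missed + confusion) / total_reference


# Hypothetical durations in seconds, chosen only to illustrate the arithmetic.
example_der = diarization_error_rate(
    false_alarm=2.0, missed=3.0, confusion=5.0, total_reference=100.0
)
print(example_der)
```

Because the numerator can exceed the reference duration, DER is not bounded above by 1; lower values indicate better diarization.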

21 pages, 11649 KB  
Article
A Low-Cost Passive Acoustic Toolkit for Underwater Recordings
by Vassilis Galanos, Vasilis Trygonis, Antonios D. Mazaris and Stelios Katsanevakis
Sensors 2025, 25(23), 7306; https://doi.org/10.3390/s25237306 - 1 Dec 2025
Viewed by 912
Abstract
Passive acoustic monitoring is a key tool for studying underwater soundscapes and assessing anthropogenic impacts, yet the high cost of hydrophones limits large-scale deployment and citizen science participation. We present the design, construction, and field evaluation of a low-cost hydrophone unit integrated into an acoustic toolkit. The hydrophone, built from off-the-shelf components at a cost of ~20 €, was paired with a commercially available handheld recorder, resulting in a complete system priced at ~50 €. Four field experiments in Greek coastal waters validated hydrophone performance across a marine-protected area, commercial port, aquaculture site, and coastal reef. Recordings were compared with those from a calibrated scientific hydrophone (SNAP, Loggerhead Instruments). Results showed that the low-cost hydrophones were mechanically robust and consistently detected most anthropogenic sounds also identified by the reference instrument, though their performance was poor at low frequencies (<200 Hz) and susceptible to mid-frequency (3 kHz) resonance issues. Despite these constraints, the toolkit demonstrates potential for large-scale, low-budget passive acoustic monitoring and outreach applications, offering a scalable solution for citizen scientists, educational programs, and research groups with limited resources. Full article
(This article belongs to the Section Environmental Sensing)

23 pages, 18754 KB  
Article
Wavelet-Based Analysis of Soundscape Dynamics in a Riparian Woodland: The Bernate-Ticino River Park
by Roberto Benocci, Giorgia Guagliumi, Andrea Potenza, Valentina Zaffaroni-Caorsi, Hector Eduardo Roman and Giovanni Zambon
Sensors 2025, 25(23), 7248; https://doi.org/10.3390/s25237248 - 27 Nov 2025
Viewed by 459
Abstract
Passive acoustic monitoring (PAM) is a valuable tool for ecological research, but many eco-acoustic indices show inconsistent correlations with biodiversity due to methodological variability and environmental noise. We propose a complementary, physically interpretable approach using energy-derived metrics. We analyzed audio recordings from three sites near a major highway in the Ticino River Park (Milan, Italy) using 1 sec equivalent continuous sound pressure level (Leq1s), peak interval statistics, maximal-overlap discrete-wavelet transform (MODWT), and temporal fractal analysis. This multi-resolution type of approach enabled frequency-specific tracking of acoustic energy and temporal structure. Our results reveal site-specific differences: Site 3, the most distant from the highway, showed higher high-frequency energy and longer temporal persistence, suggesting richer biophonic activity. Site 1, the closest to the highway, displayed flatter spectral profiles and faster autocorrelation decay. Diel patterns were reflected in hourly Leq trends, while fractal analysis revealed frequency- and site-dependent acoustic memory. These automated findings were corroborated by expert annotations of bird activity and traffic. The integration of Leq1s, peak metrics, and wavelet decomposition offers a suitable framework for soundscape characterization, with strong potential for long-term ecoacoustic monitoring and habitat quality assessment in complex environments. Full article
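The 1 s equivalent continuous sound pressure level (Leq1s) mentioned in this abstract follows the standard definition Leq = 10·log10(mean(p²) / p0²), evaluated over consecutive one-second windows. The sketch below is an illustration only, not the authors' pipeline: it assumes the samples are already calibrated to pascals, uses the in-air reference pressure p0 = 20 µPa, and applies no frequency weighting. The name `leq_per_second` is hypothetical.

```python
import math

P_REF = 20e-6  # reference sound pressure in air, in pascals (assumed calibration target)


def leq_per_second(pressure, sample_rate):
    """Return one Leq value in dB re 20 µPa for each full 1 s window of pressure samples (Pa)."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be a positive integer")
    levels = []
    # Step through the signal one second at a time; a trailing partial window is ignored.
    for start in range(0, len(pressure) - sample_rate + 1, sample_rate):
        window = pressure[start:start + sample_rate]
        mean_square = sum(p * p for p in window) / sample_rate
        if mean_square > 0:
            levels.append(10.0 * math.log10(mean_square / P_REF ** 2))
        else:
            levels.append(float("-inf"))  # digital silence has no finite level
    return levels
```

For a signal with an RMS pressure of 1 Pa, each window evaluates to roughly 94 dB, which is the level commonly used by acoustic calibrators.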