Search Results (3,295)

Search Parameters:
Keywords = acoustic signal

22 pages, 5584 KB  
Article
Experimental Study on the Effect of Rubber Fibre Content on the Mechanical Properties and Failure Mode of Grouting Materials
by Yixiang Wang, Xianzhang Ling, Xipeng Qin, Zhongnian Yang, Mingyu Liu, Runqi Guo and Yingying Zhang
Appl. Sci. 2026, 16(2), 931; https://doi.org/10.3390/app16020931 - 16 Jan 2026
Abstract
To promote waste tyre resource utilisation and reduce environmental pressure, this study prepared five stone sample groups using waste tyre rubber fibre (RF) as a modifier, combined with blast furnace slag, fly ash, carbide slag, and calcium chloride, with RF contents of 0%, 6%, 10%, 14%, and 18%. Working performance was analysed via density, fluidity, and water separation rate tests, while mechanical properties and failure mechanisms were explored through uniaxial compression tests, acoustic emission (AE) monitoring, and SEM microstructure observations. Results showed that as RF content increased, slurry density and fluidity decreased nonlinearly, water separation rate first rose then fell, and uniaxial compressive strength dropped significantly (64.97% lower at 18% RF than at 0%). The failure mode shifted from shear to tensile–shear mixed failure, AE signal activity weakened, energy release became more gradual, and crack propagation was delayed. Microstructurally, 6–10% RF ensured uniform fibre dispersion, blocking microcracks and optimising interfacial zones, while 14–18% RF caused agglomeration and pore defects. The optimal grouting material ratio was determined as 10% RF, blast furnace slag: fly ash = 4:1, 40% carbide slag, 1% calcium chloride, and a 0.7 water–cement ratio (total solid component 100%).

18 pages, 3170 KB  
Article
A Terrain Perception Method for Quadruped Robots Based on Acoustic Signal Fusion
by Meng Hong, Nian Wang, Xingyu Liu, Chao Huang, Ganchang Li, Zijian Li, Shuai Shu, Ruixuan Chen, Jincheng Sheng, Zhongren Wang, Sijia Guan and Min Guo
Sensors 2026, 26(2), 594; https://doi.org/10.3390/s26020594 - 15 Jan 2026
Viewed by 36
Abstract
In unstructured environments, terrain perception is essential for the stability and environmental awareness of quadruped robot locomotion. Existing approaches primarily rely on visual or proprioceptive signals, but their effectiveness is limited under conditions of visual occlusion or ambiguous terrain features. To address this, this study proposes a multimodal terrain perception method that integrates acoustic features with proprioceptive signals. The method collects environmental acoustic information through an externally mounted sound sensor and combines the sound signal with proprioceptive sensor data from the IMU and joint encoders of the quadruped robot. The method was deployed on the quadruped robot Lite2 platform developed by Deep Robotics, and experiments were conducted on four representative terrain types: concrete, gravel, sand, and carpet. Mel-spectrogram features are extracted from the acoustic signals and concatenated with the IMU and joint encoder data to form feature vectors, which are subsequently fed into a support vector machine (SVM) for terrain classification. For each terrain type, 400 s of data were collected. Experimental results show that the terrain classification accuracy reaches 78.28% without using acoustic signals, while increasing to 82.52% when acoustic features are incorporated. To further enhance the classification performance, this study performs a combined exploration of the SVM hyperparameters C and γ as well as the time-window length win. The final results demonstrate that the classification accuracy can be improved to as high as 99.53% across all four terrains.
(This article belongs to the Special Issue Dynamics and Control System Design for Robotics)

32 pages, 483 KB  
Review
The Complexity of Communication in Mammals: From Social and Emotional Mechanisms to Human Influence and Multimodal Applications
by Krzysztof Górski, Stanisław Kondracki and Katarzyna Kępka-Borkowska
Animals 2026, 16(2), 265; https://doi.org/10.3390/ani16020265 - 15 Jan 2026
Viewed by 23
Abstract
Communication in mammals constitutes a complex, multimodal system that integrates visual, acoustic, tactile, and chemical signals whose functions extend beyond simple information transfer to include the regulation of social relationships, coordination of behaviour, and expression of emotional states. This article examines the fundamental mechanisms of communication from biological, neuroethological, and behavioural perspectives, with particular emphasis on domesticated and farmed species. Analysis of sensory signals demonstrates that their perception and interpretation are closely linked to the physiology of sensory organs as well as to social experience and environmental context. In companion animals such as dogs and cats, domestication has significantly modified communicative repertoires ranging from the development of specialised facial musculature in dogs to adaptive diversification of vocalisations in cats. The neurobiological foundations of communication, including the activity of the amygdala, limbic structures, and mirror-neuron systems, provide evidence for homologous mechanisms of emotion recognition across species. The article also highlights the role of communication in shaping social structures and the influence of husbandry conditions on the behaviour of farm animals. In intensive production environments, acoustic, visual, and chemical signals are often shaped or distorted by crowding, noise, and chronic stress, with direct consequences for welfare. Furthermore, the growing importance of multimodal technologies such as Precision Livestock Farming (PLF) and Animal–Computer Interaction (ACI) is discussed, particularly their role in enabling objective monitoring of emotional states and behaviour and supporting individualised care. 
Overall, the analysis underscores that communication forms the foundation of social functioning in mammals, and that understanding this complexity is essential for ethology, animal welfare, training practices, and the design of modern technologies facilitating human–animal interaction.
(This article belongs to the Section Human-Animal Interactions, Animal Behaviour and Emotion)
37 pages, 12417 KB  
Article
Rate-Dependent Fracturing Mechanisms of Granite Under Different Levels of Initial Damage
by Chunde Ma, Chenyang Li, Wenyuan Yang, Chenyu Wang, Qiang Gong and Hongbo Zhou
Appl. Sci. 2026, 16(2), 871; https://doi.org/10.3390/app16020871 - 14 Jan 2026
Viewed by 75
Abstract
Excavation of underground spaces often causes significant initial damage to surrounding rock, which can notably alter its mechanical properties. However, most studies on loading rate effects neglect the role of initial damage. This study investigates how initial damage and loading rate together affect granite’s mechanical behavior and fracturing characteristics. Granite specimens with different initial damage levels were subjected to uniaxial compression at varying loading rates to assess their mechanical parameters, stress thresholds, failure modes, energy evolution, and associated acoustic emission (AE) activity. Results indicate that granite’s mechanical behavior exhibits greater sensitivity to loading rate than to initial damage. As the loading rate increases, both strength and elastic modulus initially decrease and then rise, while the dissipated-to-input energy ratio reaches a maximum when the strength is at its lowest. This phenomenon occurs because, when cracks are allowed to fully develop, a relatively higher loading rate increases the likelihood of crack initiation and propagation, thereby reducing strength. The AE responses of initial damage granite samples (IDGSs), including counts, RA/AF value, b-value, and entropy, exhibit stage-dependent variations and contain precursory information before failure. Moreover, AE signals display multifractal characteristics across different loading rates. These findings reveal the mechanisms underlying granite’s mechanical response when both initial damage and loading rate act together: initial damage primarily affects the complexity and number of local microcracks, while loading rate determines the dominant crack initiation and propagation modes. Moreover, how the failure time of IDGSs varies with loading rate can be described by an inverse exponential function. 
These findings enhance insight into the coupling mechanism of initial damage and loading rate, with significant implications for failure warning and the cost-effectiveness of underground excavation.
12 pages, 1892 KB  
Article
Effects of Bubbles During Water Resistance Therapy on the Vibration Characteristics of Vocal Folds During the Phonation of Different Vowels
by Marie-Anne Kainz, Rebekka Hoppermann, Theresa Pilsl, Marie Köberlein, Jonas Kirsch, Michael Döllinger and Matthias Echternach
J. Clin. Med. 2026, 15(2), 669; https://doi.org/10.3390/jcm15020669 - 14 Jan 2026
Viewed by 73
Abstract
Background: Semi-occluded vocal tract exercises (SOVTE) improve vocal quality and capacity. Water resistance therapy (WRT), a specific form of SOVTE with a tube submerged under water, generates increased and oscillating oral pressure through bubble formation during phonation, thereby influencing transglottal pressure and vocal fold dynamics. While the physiological effects of WRT using tube-based systems have been extensively studied, the influence of vowel-specific vocal tract configurations during WRT remains unclarified. This study examined how different vowel qualities during WRT affect vocal fold oscillation using the DoctorVox® mask, which allows near-natural mouth opening and vowel articulation. Methods: Ten vocally healthy, untrained adults (25–50 years) performed a continuous vowel glide (/i/–/a/–/u/-/i/) at constant fundamental frequency and habitual loudness during WRT using the DoctorVox® mask, with the tube submerged 2 cm in water. Simultaneous recordings included transnasal high-speed videoendoscopy (20,000 fps), electroglottography (EGG), acoustic signals and intra-tube oral pressure measurements. Glottal area waveforms (GAW) were derived to calculate the open quotient (OQGAW) and closing quotient (ClQGAW). Analyses were conducted separately for intra-tube pressure maxima, minima and intermediate phases within the bubble cycle during WRT. Statistical analysis used Wilcoxon signed-rank tests with Bonferroni correction. Results: In the baseline condition without WRT, significant vowel-related differences were found: /u/ showed a higher open quotient than /i/ and /a/ (p < 0.05) and a higher closing quotient than /a/ (p < 0.05). During WRT, these vowel-specific differences were no longer statistically significant. A non-significant trend toward reduced OQGAW during WRT was observed, most notably for /u/, while differences between pressure phases within the bubble cycle were minimal. 
Conclusions: WRT using the DoctorVox® mask reduces vowel-specific differences in vocal fold vibration patterns, suggesting that for voice therapy, vowel quality modifications during WRT have little impact on vocal outcomes.
(This article belongs to the Special Issue New Advances in the Management of Voice Disorders: 2nd Edition)

22 pages, 5277 KB  
Article
High-Speed Microprocessor-Based Optical Instrumentation for the Detection and Analysis of Hydrodynamic Cavitation Downstream of an Additively Manufactured Nozzle
by Luís Gustavo Macêdo West, André Jackson Ramos Simões, Leandro do Rozário Teixeira, Lucas Ramalho Oliveira, Juliane Grasiela de Carvalho Gomes, Igor Silva Moreira dos Anjos, Antonio Samuel Bacelar de Freitas Devesa, Leonardo Rafael Teixeira Cotrim Gomes, Lucas Gomes Pereira, Iran Eduardo Lima Neto, Júlio Cesar de Souza Inácio Gonçalves, Luiz Carlos Simões Soares Junior, Germano Pinto Guedes, Geydison Gonzaga Demetino, Marcus Vinícius Santos da Silva, Vitor Leão Filardi, Vitor Pinheiro Ferreira, André Luiz Andrade Simões, Luciano Matos Queiroz and Iuri Muniz Pepe
Fluids 2026, 11(1), 21; https://doi.org/10.3390/fluids11010021 - 14 Jan 2026
Viewed by 125
Abstract
This study presents the development and validation of a high-speed optical data acquisition system for detecting and characterizing hydrodynamic cavitation downstream of a triangular nozzle. The system integrates a PIN photodiode, a transimpedance amplifier, and a high-sampling-rate microcontroller. Its performance was first evaluated using controlled sinusoidal signals, and statistical stability was assessed as a function of the number of acquired samples. Experiments were subsequently conducted in a converging–diverging conduit under biphasic flow conditions, where mean irradiance, standard deviation, and frequency spectra were analyzed downstream of the nozzle. The optical signal distributions revealed transitions in flow behavior associated with cavitation development, which were quantified through statistical metrics and spectral features. The Strouhal number was estimated from dominant frequencies extracted from the spectra, exhibiting a non-monotonic dependence on the Reynolds number, consistent with changes in flow structure and turbulence intensity. Spectral analysis further indicated frequency bands associated with energy transfer across turbulent scales and bubble dynamics. Overall, the results demonstrate that the proposed optical system constitutes a viable and non-intrusive methodology for detecting and characterizing cavitation intensity in a way that complements other optical and acoustic methods.
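The abstract above mentions estimating the Strouhal number from the dominant frequency of the optical signal. A minimal sketch of that calculation follows, assuming the usual definitions St = f·L/U and Re = U·L/ν. The function names, the naive DFT peak search, and the choice of characteristic length and velocity are illustrative assumptions, not the authors' processing chain.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the mean level (DC term)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


def strouhal_number(frequency_hz, length_m, velocity_m_s):
    """St = f * L / U (dimensionless); L and U are assumed reference scales."""
    if velocity_m_s <= 0:
        raise ValueError("velocity must be positive")
    return frequency_hz * length_m / velocity_m_s


def reynolds_number(velocity_m_s, length_m, kinematic_viscosity_m2_s):
    """Re = U * L / nu (dimensionless)."""
    if kinematic_viscosity_m2_s <= 0:
        raise ValueError("kinematic viscosity must be positive")
    return velocity_m_s * length_m / kinematic_viscosity_m2_s


# Synthetic photodiode-like signal: constant offset plus a 50 Hz oscillation.
signal = [2.0 + math.sin(2 * math.pi * 50 * i / 1000) for i in range(200)]
f_dom = dominant_frequency(signal, sample_rate_hz=1000)
st = strouhal_number(f_dom, length_m=0.01, velocity_m_s=5.0)
```

The naive DFT costs O(n²) and is only meant for short records; a real acquisition pipeline would more likely use an FFT with windowing and averaging.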

18 pages, 1419 KB  
Review
How the Vestibular Labyrinth Encodes Air-Conducted Sound: From Pressure Waves to Jerk-Sensitive Afferent Pathways
by Leonardo Manzari
J. Otorhinolaryngol. Hear. Balance Med. 2026, 7(1), 5; https://doi.org/10.3390/ohbm7010005 - 14 Jan 2026
Viewed by 192
Abstract
Background/Objectives: The vestibular labyrinth is classically viewed as a sensor of low-frequency head motion—linear acceleration for the otoliths and angular velocity/acceleration for the semicircular canals. However, there is now substantial evidence that air-conducted sound (ACS) can also activate vestibular receptors and afferents in mammals and other vertebrates. This sound sensitivity underlies sound-evoked vestibular-evoked myogenic potentials (VEMPs), sound-induced eye movements, and several clinical phenomena in third-window pathologies. The cellular and biophysical mechanisms by which a pressure wave in the cochlear fluids is transformed into a vestibular neural signal remain incompletely integrated into a single framework. This study aimed to provide a narrative synthesis of how ACS activates the vestibular labyrinth, with emphasis on (1) the anatomical and biophysical specializations of the maculae and cristae, (2) the dual-channel organization of vestibular hair cells and afferents, and (3) the encoding of fast, jerk-rich acoustic transients by irregular, striolar/central afferents. Methods: We integrate experimental evidence from single-unit recordings in animals, in vitro hair cell and calyx physiology, anatomical studies of macular structure, and human clinical data on sound-evoked VEMPs and sound-induced eye movements. Key concepts from vestibular cellular neurophysiology and from the physics of sinusoidal motion (displacement, velocity, acceleration, jerk) are combined into a unified interpretative scheme. Results: ACS transmitted through the middle ear generates pressure waves in the perilymph and endolymph not only in the cochlea but also in vestibular compartments. These waves produce local fluid particle motions and pressure gradients that can deflect hair bundles in selected regions of the otolith maculae and canal cristae. 
Irregular afferents innervating type I hair cells in the striola (maculae) and central zones (cristae) exhibit phase locking to ACS up to at least 1–2 kHz, with much lower thresholds than regular afferents. Cellular and synaptic specializations (transducer adaptation, low-voltage-activated K⁺ conductances (KLV), fast quantal and non-quantal transmission, and afferent spike-generator properties) implement effective high-pass filtering and phase lead, making these pathways particularly sensitive to rapid changes in acceleration, i.e., mechanical jerk, rather than to slowly varying displacement or acceleration. Clinically, short-rise-time ACS stimuli (clicks and brief tone bursts) elicit robust cervical and ocular VEMPs with clear thresholds and input–output relationships, reflecting the recruitment of these jerk-sensitive utricular and saccular pathways. Sound-induced eye movements and nystagmus in third-window syndromes similarly reflect abnormally enhanced access of ACS-generated pressure waves to canal and otolith receptors. Conclusions: The vestibular labyrinth does not merely “tolerate” air-conducted sound as a spill-over from cochlear mechanics; it contains a dedicated high-frequency, transient-sensitive channel (dominated by type I hair cells and irregular afferents) that is well suited to encoding jerk-rich acoustic events. We propose that ACS-evoked vestibular responses, including VEMPs, are best interpreted within a dual-channel framework in which (1) regular, extrastriolar/peripheral pathways encode sustained head motion and low-frequency acceleration, while (2) irregular, striolar/central pathways encode fast, sound-driven transients distinguished by high jerk, steep onset, and precise spike timing.
(This article belongs to the Section Otology and Neurotology)
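The review above refers to the physics of sinusoidal motion (displacement, velocity, acceleration, jerk). As a brief worked illustration, assuming a pure sinusoidal displacement of amplitude A and angular frequency ω = 2πf (symbols introduced here for illustration, not taken from the article), successive time derivatives give:

```latex
\begin{aligned}
x(t) &= A \sin(\omega t) && \text{(displacement)} \\
v(t) &= \dot{x}(t) = A\,\omega \cos(\omega t) && \text{(velocity)} \\
a(t) &= \ddot{x}(t) = -A\,\omega^{2} \sin(\omega t) && \text{(acceleration)} \\
j(t) &= \dddot{x}(t) = -A\,\omega^{3} \cos(\omega t) && \text{(jerk)} \\[4pt]
\frac{|j|_{\max}}{|a|_{\max}} &= \omega = 2\pi f
\end{aligned}
```

Under this idealisation, for a fixed peak acceleration the peak jerk grows in proportion to frequency, so a 500 Hz stimulus carries 100 times the peak jerk of a 5 Hz motion with the same peak acceleration. This is consistent with the abstract's point that jerk-sensitive pathways favour fast acoustic transients over slow head motion.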

21 pages, 1089 KB  
Article
Data Augmentation and Time–Frequency Joint Attention for Underwater Acoustic Communication Modulation Classification
by Mingyu Cao, Qi Chen, Jinsong Tang and Haoran Wu
J. Mar. Sci. Eng. 2026, 14(2), 172; https://doi.org/10.3390/jmse14020172 - 13 Jan 2026
Viewed by 73
Abstract
This paper presents a modulation signal classification and recognition algorithm based on data augmentation and time–frequency joint attention (DA-TFJA) for underwater acoustic (UWA) communication systems. UWA communication, as an important means of marine information transmission, plays a key role in fields such as marine engineering, military reconnaissance, and marine science research. Accurate recognition of modulated signals is a core technology for ensuring the reliability of UWA communication systems. Traditional classification and recognition methods, mostly based on pure neural network algorithms, suffer from insufficient feature representation and limited generalization performance in complex and changing UWA channel environments. They also struggle to address complex factors such as multipath, Doppler shift, and noise interference, often resulting in scarce effective training samples and inadequate classification accuracy. To overcome these limitations, the proposed DA-TFJA algorithm simulates the characteristics of real UWA channels through two novel data augmentation strategies: the adaptive time–frequency transform enhancement algorithm (ATFT) and the dynamic path superposition enhancement algorithm (DPSE). An end-to-end recognition network is developed that integrates a multiscale time–frequency feature extractor (MTFE), two-layer long short-term memory (LSTM) temporal modeling, and a time–frequency joint attention mechanism (TFAM). This comprehensive architecture achieves high-precision recognition of six modulation types: 2FSK, 4FSK, BPSK, QPSK, DSSS, and OFDM. Experimental results demonstrate that compared with existing advanced methods, DA-TFJA achieves a classification accuracy of 98.36% on the measured reservoir dataset, representing an improvement of 3.09 percentage points, which fully verifies the effectiveness and practical value of the proposed approach.
(This article belongs to the Section Ocean Engineering)

24 pages, 5571 KB  
Article
Bearing Fault Diagnosis Based on a Depthwise Separable Atrous Convolution and ASPP Hybrid Network
by Xiaojiao Gu, Chuanyu Liu, Jinghua Li, Xiaolin Yu and Yang Tian
Machines 2026, 14(1), 93; https://doi.org/10.3390/machines14010093 - 13 Jan 2026
Viewed by 74
Abstract
To address the computational redundancy, inadequate multi-scale feature capture, and poor noise robustness of traditional deep networks used for bearing vibration and acoustic signal feature extraction, this paper proposes a fault diagnosis method based on Depthwise Separable Atrous Convolution (DSAC) and Acoustic Spatial Pyramid Pooling (ASPP). First, the Continuous Wavelet Transform (CWT) is applied to the vibration and acoustic signals to convert them into time–frequency representations. The vibration CWT is then fed into a multi-scale feature extraction module to obtain preliminary vibration features, whereas the acoustic CWT is processed by a Deep Residual Shrinkage Network (DRSN). The two feature streams are concatenated in a feature fusion module and subsequently fed into the DSAC and ASPP modules, which together expand the effective receptive field and aggregate multi-scale contextual information. Finally, global pooling followed by a classifier outputs the bearing fault category, enabling high-precision bearing fault identification. Experimental results show that, under both clean data and multiple low signal-to-noise ratio (SNR) noise conditions, the proposed DSAC-ASPP method achieves higher accuracy and lower variance than baselines such as ResNet, VGG, and MobileNet, while requiring fewer parameters and FLOPs and exhibiting superior robustness and deployability.

19 pages, 3746 KB  
Article
Fault Diagnosis and Classification of Rolling Bearings Using ICEEMDAN–CNN–BiLSTM and Acoustic Emission
by Jinliang Li, Haoran Sheng, Bin Liu and Xuewei Liu
Sensors 2026, 26(2), 507; https://doi.org/10.3390/s26020507 - 12 Jan 2026
Viewed by 186
Abstract
Reliable operation of rolling bearings is essential for mechanical systems. Acoustic emission (AE) offers a promising approach for bearing fault detection because of its high-frequency response and strong noise-suppression capability. This study proposes an intelligent diagnostic method that combines an improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) and a convolutional neural network–bidirectional long short-term memory (CNN–BiLSTM) architecture. The method first applies wavelet denoising to AE signals, then uses ICEEMDAN decomposition followed by kurtosis-based screening to extract key fault components and construct feature vectors. Subsequently, a CNN automatically learns deep time–frequency features, and a BiLSTM captures temporal dependencies among these features, enabling end-to-end fault identification. Experiments were conducted on a bearing acoustic emission dataset comprising 15 operating conditions, five fault types, and three rotational speeds; comparative model tests were also performed. Results indicate that ICEEMDAN effectively suppresses mode mixing (average mixing rate 6.08%), and the proposed model attained an average test-set recognition accuracy of 98.00%, significantly outperforming comparative models. Moreover, the model maintained 96.67% accuracy on an independent validation set, demonstrating strong generalization and practical application potential.
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)
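The abstract above describes kurtosis-based screening of decomposed components. The sketch below shows one plausible form of such a screen, assuming the decomposition has already produced a list of components. The Pearson kurtosis definition, the threshold of 3.0 (the Gaussian value), and the function names are assumptions made for illustration; the abstract does not state the selection rule actually used.

```python
import math


def kurtosis(values):
    """Pearson kurtosis m4 / m2**2 (about 3 for Gaussian data, 1.5 for a sine)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        return 0.0  # constant component: treat as non-impulsive
    return m4 / m2 ** 2


def select_impulsive_components(components, threshold=3.0):
    """Return indices of components whose kurtosis exceeds the threshold."""
    return [i for i, c in enumerate(components) if kurtosis(c) > threshold]


# Two synthetic "components": a smooth sine and a sparse impulse train.
sine = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
impulses = [1.0 if i % 100 == 0 else 0.0 for i in range(1000)]
selected = select_impulsive_components([sine, impulses])
```

Impulsive content, such as the repetitive bursts a localized bearing defect tends to produce, raises kurtosis well above that of smooth or noise-like components, which is why it is a common screening statistic.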

26 pages, 29009 KB  
Article
Quantifying the Relationship Between Speech Quality Metrics and Biometric Speaker Recognition Performance Under Acoustic Degradation
by Ajan Ahmed and Masudul H. Imtiaz
Signals 2026, 7(1), 7; https://doi.org/10.3390/signals7010007 - 12 Jan 2026
Viewed by 256
Abstract
Self-supervised learning (SSL) models have achieved remarkable success in speaker verification tasks, yet their robustness to real-world audio degradation remains insufficiently characterized. This study presents a comprehensive analysis of how audio quality degradation affects three prominent SSL-based speaker verification systems (WavLM, Wav2Vec2, and HuBERT) across three diverse datasets: TIMIT, CHiME-6, and Common Voice. We systematically applied 21 degradation conditions spanning noise contamination (SNR levels from 0 to 20 dB), reverberation (RT60 from 0.3 to 1.0 s), and codec compression (various bit rates), then measured both objective audio quality metrics (PESQ, STOI, SNR, SegSNR, fwSNRseg, jitter, shimmer, HNR) and speaker verification performance metrics (EER, AUC-ROC, d-prime, minDCF). At the condition level, multiple regression with all eight quality metrics explained up to 80% of the variance in minDCF for HuBERT and 78% for WavLM, but only 35% for Wav2Vec2; EER predictability was lower (69%, 67%, and 28%, respectively). PESQ was the strongest single predictor for WavLM and HuBERT, while Shimmer showed the highest single-metric correlation for Wav2Vec2; fwSNRseg yielded the top single-metric R² for WavLM, and PESQ for HuBERT and Wav2Vec2 (with much smaller gains for Wav2Vec2). WavLM and HuBERT exhibited more predictable quality-performance relationships compared to Wav2Vec2. These findings establish quantitative relationships between measurable audio quality and speaker verification accuracy at the condition level, though substantial within-condition variability limits utterance-level prediction accuracy.
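Among the verification metrics listed above, the equal error rate (EER) is the operating point where the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of a threshold-sweep estimate, assuming higher scores mean "same speaker". The function name and toy scores are hypothetical, and this simple sweep is not the evaluation code of the study.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER by sweeping every observed score as a threshold.

    A trial is accepted when score >= threshold. The returned value is the
    mean of FAR and FRR at the threshold where the two rates are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy similarity scores: one genuine trial scores low, one impostor scores high.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(genuine, impostor)
```

With few trials the FAR and FRR curves are step functions, so averaging the two rates at the closest crossing is a common convention; with large trial lists, interpolation along the ROC curve gives a smoother estimate.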

18 pages, 5084 KB  
Article
Angle Modulation Phase Shift in Vibro-Acoustic Modulation: A Novel Approach for Early Crack Detection
by Mohammad M. Bazrafkan, Norbert Hoffmann and Marcus Rutner
NDT 2026, 4(1), 5; https://doi.org/10.3390/ndt4010005 - 9 Jan 2026
Viewed by 118
Abstract
Detecting structural defects is one of the primary challenges engineers face. Consequently, the development of techniques and methods capable of detecting structural defects has always been critical. It should be emphasized that crack detection is only meaningful if it occurs before the final stages of structural failure. Accordingly, the early identification of structural defects has become a significant research challenge, motivating the development of techniques and diagnostic parameters that can effectively capture and reflect the structure’s nonlinearity or non-uniform behavior. This study aims to provide a more detailed examination of modulation phenomena observed in the measured response using the vibro-acoustic modulation (VAM) method, and propose a new model that simultaneously incorporates all three conventional modulation types (amplitude, frequency, and phase), which may offer a more accurate representation of the response signal behavior. Both theoretical and experimental results clearly confirm that the phase shifts of individual frequency components in the frequency domain vary throughout the lifetime of the tested specimen. This behavior, as anticipated by the proposed model, reveals a strong correlation between phase shifts and modulation indices (MIs). Furthermore, the relative sensitivity analysis indicates that the phase shift is more sensitive than the modulation index (MI), suggesting its strong potential as an indicator for early defect detection in structural components.

18 pages, 3957 KB  
Article
Real-Time Acoustic Telemetry Buoys as Tools for Nearshore Monitoring and Management
by James M. Anderson, Brian S. Stirling, Patrick T. Rex, Emily A. Spurgeon, Anthony McGinnis, Zachariah S. Merson, Darnell Gadberry and Christopher G. Lowe
J. Mar. Sci. Eng. 2026, 14(2), 128; https://doi.org/10.3390/jmse14020128 - 8 Jan 2026
Abstract
Acoustic telemetry monitoring for tagged sharks in nearshore waters has become an important tool for beach safety management; however, detection performance can vary widely in shallow, high-energy nearshore environments where management decisions are often most time-sensitive. Real-time acoustic telemetry buoys offer the potential to deliver live detections and system diagnostics, but their performance relative to autonomous bottom-mounted receivers remains poorly evaluated under realistic coastal conditions. We compared the detection efficiency of real-time buoy-mounted acoustic receivers and autonomous bottom-mounted receivers across five nearshore sites in southern California. Using paired long-term reference tag deployments and short-term range tests, we quantified detection probability, effective detection range, and the influence of environmental conditions and receiver placement. Detection performance was evaluated in relation to wind speed, water temperature, receiver tilt, and signal-to-noise ratio (SNR). Both buoy-mounted and bottom-mounted receivers maintained high long-term detection efficiency, recovering 77–99% of expected transmissions at 82–250 m. Range tests indicated greater effective detection distances for buoy-mounted receivers, with 50% detection probabilities occurring at approximately 471 m compared to 282 m for bottom-mounted receivers. Receiver placement strongly influenced performance, with surface-mounted receivers outperforming bottom-mounted units regardless of receiver model. Environmental effects on detections were site-specific and variable. Detection probability varied predictably with environmental conditions. Higher SNR increased detection success, particularly for bottom/substrate-mounted receivers, while warm water significantly reduced detection probability across placement configurations.
These results demonstrate that real-time acoustic telemetry buoys provide reliable detection performance in dynamic nearshore environments while offering key operational advantages, including immediate data access and system diagnostics. The observed relationships demonstrate that receiver performance is dynamic rather than fixed, and that real-time buoy systems therefore represent a practical tool for coastal monitoring programs that require timely information to support adaptive management, public safety, and conservation decision making.
(This article belongs to the Section Physical Oceanography)

32 pages, 32603 KB  
Article
Convolutional Neural Network-Based Detection of Booming Noise in Internal Combustion Engine Vehicles Using Simulated Acoustic Spectrograms
by Pedro Leite, Joaquim Mendes, Filipe Pereira, António Mendes Lopes and António Ramos Silva
Appl. Sci. 2026, 16(2), 616; https://doi.org/10.3390/app16020616 - 7 Jan 2026
Abstract
In this work, we tested the use of Convolutional Neural Networks (CNNs) to classify booming noise inside vehicles. Instead of relying only on long experimental campaigns, we generated a synthetic dataset from Sound Quality Equivalent (SQE) models that were originally built from real acoustic measurements collected with sensors. By applying smoothing functions and Hann windows, we were able to vary the intensity of the booming effect across different mission profiles. The CNNs were trained on spectrograms derived from these signals, with labels informed by psychoacoustic evaluations. The best model reached about 95.5% accuracy in the binary task (booming vs. no booming) and around 93.3% when using three classes (severe, mild, none). Tests with data from three different car models showed that the method can generalize across platforms. These results suggest that CNNs may become a practical tool for NVH analysis, offering a simpler and cheaper complement to traditional end-of-line testing, and one that could be adapted for real-time embedded systems.

27 pages, 11379 KB  
Article
Performance Analysis and Comparison of Two Deep Learning Methods for Direction-of-Arrival Estimation with Observed Data
by Shuo Liu, Wen Zhang, Junqiang Song, Jian Shi, Hongze Leng and Qiankun Yu
Electronics 2026, 15(2), 261; https://doi.org/10.3390/electronics15020261 - 7 Jan 2026
Abstract
Direction-of-arrival (DOA) estimation is fundamental in array signal processing, yet classical algorithms suffer from significant performance degradation under low signal-to-noise ratio (SNR) conditions and require computationally intensive eigenvalue decomposition. This study presents a systematic comparative analysis of two backbone networks for DOA estimation, a convolutional neural network (CNN) and a long short-term memory (LSTM) network, addressing two critical research gaps: the lack of a mechanistic understanding of architecture-dependent performance under varying conditions and insufficient validation using real measured data. Both networks are trained using cross-spectral density matrices (CSDMs) from simulated uniform linear array (ULA) signals. Under baseline conditions (1° classification interval), both the CNN and LSTM methods reach an accuracy (ACC) above 98%, with errors of ±1° for CNN and ±2° for LSTM that occur only in the end-fire direction. Key findings reveal that LSTM maintains above 90% accuracy down to −20 dB SNR, demonstrating superior noise robustness, whereas CNN exhibits better angular resolution. Four performance boundaries are identified: optimal performance is achieved at half-wavelength element spacing; the SNR crossover occurs at −20 dB, below which accuracy drops sharply; a snapshot threshold of 32 marks the transition from snapshot-deficient to snapshot-sufficient conditions; and an array size of 8 is the turning point for the performance variation rate. Comparative analysis against traditional methods demonstrates that deep learning approaches achieve superior resolution ability, batch processing efficiency, and noise robustness. Critically, models trained exclusively on single-target simulated data successfully generalize to multi-target experimental data from the Shallow Water Array Performance (SWAP) program, recovering primary target trajectories without domain adaptation.
These results provide concrete engineering guidelines for architecture selection and validate the sim-to-real generalization capability of CSDM-based deep learning approaches in underwater acoustic environments.
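
To make the network input described in this abstract more concrete, the following sketch builds a sample CSDM from simulated ULA snapshots, using the half-wavelength spacing, 8 elements, and 32 snapshots that the abstract names as performance boundaries. The signal model, the 10 dB SNR, the 20° source angle, and all function names are illustrative assumptions. The conventional (Bartlett) beamformer at the end is swapped in purely as a sanity check on the matrix; it stands in for, and is not, the CNN or LSTM estimators evaluated in the paper.

```python
import cmath
import math
import random


def steering_vector(theta_deg, n_elem, spacing_wl=0.5):
    """Plane-wave steering vector for a ULA; spacing is given in wavelengths."""
    step = 2 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * step * m) for m in range(n_elem)]


def simulate_snapshots(theta_deg, n_elem, n_snap, snr_db, rng):
    """One narrowband source plus complex white noise (illustrative model)."""
    a = steering_vector(theta_deg, n_elem)
    noise_std = 10 ** (-snr_db / 20)
    snaps = []
    for _ in range(n_snap):
        s = cmath.exp(1j * rng.uniform(0, 2 * math.pi))  # unit-power source
        snaps.append([
            s * a_m
            + noise_std * complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
            for a_m in a
        ])
    return snaps


def csdm(snaps):
    """Sample cross-spectral density matrix R = (1/L) * sum over snapshots of x x^H."""
    n = len(snaps[0])
    return [[sum(x[i] * x[j].conjugate() for x in snaps) / len(snaps)
             for j in range(n)] for i in range(n)]


def bartlett_doa(R, grid_deg):
    """Grid angle that maximises the conventional beamformer power a^H R a."""
    n = len(R)

    def power(theta):
        a = steering_vector(theta, n)
        return sum(a[i].conjugate() * R[i][j] * a[j]
                   for i in range(n) for j in range(n)).real

    return max(grid_deg, key=power)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
snapshots = simulate_snapshots(theta_deg=20.0, n_elem=8, n_snap=32,
                               snr_db=10.0, rng=rng)
R = csdm(snapshots)
estimate = bartlett_doa(R, range(-90, 91))  # 1° grid, matching the abstract's interval
```

The resulting `R` is an 8 × 8 Hermitian matrix; a learning-based estimator would typically take its real and imaginary parts as input features, though the exact input encoding used by the authors is not stated in the abstract.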
