Search Results (29)

Search Parameters:
Keywords = multi-channel audio signal

18 pages, 10692 KB  
Article
Short-Time Homomorphic Deconvolution (STHD): A Novel 2D Feature for Robust Indoor Direction of Arrival Estimation
by Yeonseok Park and Jun-Hwa Kim
Sensors 2026, 26(2), 722; https://doi.org/10.3390/s26020722 - 21 Jan 2026
Viewed by 207
Abstract
Accurate indoor positioning and navigation remain significant challenges, with audio sensor-based sound source localization emerging as a promising sensing modality. Conventional methods, often reliant on multi-channel processing or time-delay estimation techniques such as Generalized Cross-Correlation, encounter difficulties regarding computational complexity, hardware synchronization, and reverberant environments where time-difference-of-arrival cues are masked. While machine learning approaches have shown potential, their performance depends heavily on the discriminative power of input features. This paper proposes a novel feature extraction method named Short-Time Homomorphic Deconvolution, which transforms multi-channel audio signals into a 2D Time × Time-of-Flight representation. Unlike prior 1D methods, this feature effectively captures the temporal evolution and stability of time-of-flight differences between microphone pairs, offering a rich and robust input for deep learning models. We validate this feature using a lightweight Convolutional Neural Network integrated with a dual-stage channel attention mechanism, designed to prioritize reliable spatial cues. The system was trained on a large-scale dataset generated via simulations and rigorously tested using real-world data acquired in an ISO-certified anechoic chamber. Experimental results demonstrate that the proposed model achieves precise Direction of Arrival estimation with a Mean Absolute Error of 1.99 degrees in real-world scenarios. Notably, the system exhibits remarkable consistency between simulation and physical experiments, proving its effectiveness for robust indoor navigation and positioning systems.
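
The core of the proposed feature is frame-wise homomorphic deconvolution between a microphone pair, stacked over time into a 2D map. Below is a minimal numpy sketch of one plausible formulation; the function name and the exact cepstral steps are illustrative assumptions, not the authors' published STHD definition.

```python
import numpy as np

def sthd_map(x_ref, x_mic, frame_len=1024, hop=512, eps=1e-8):
    """Frame-wise homomorphic deconvolution between one microphone pair.

    Returns a 2D array (frames x lags) whose rows approximate the relative
    impulse response, so the peak lag per row tracks the time-of-flight
    difference over time.  Illustrative only; the published STHD feature
    may use a different cepstral formulation.
    """
    n_frames = 1 + (len(x_ref) - frame_len) // hop
    win = np.hanning(frame_len)
    rows = []
    for i in range(n_frames):
        s = slice(i * hop, i * hop + frame_len)
        X1 = np.fft.rfft(win * x_ref[s])
        X2 = np.fft.rfft(win * x_mic[s])
        # Log-spectral division turns convolution into subtraction (the
        # homomorphic step); the inverse FFT then yields a lag-domain
        # estimate of the relative impulse response for this frame.
        log_ratio = np.log(np.abs(X2) + eps) - np.log(np.abs(X1) + eps) \
                    + 1j * np.angle(X2 * np.conj(X1))
        rows.append(np.fft.irfft(np.exp(log_ratio)))
    return np.stack(rows)  # shape: (time frames, time-of-flight lags)
```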

19 pages, 4574 KB  
Article
Multi-Service Multiplexing System Based on Visible Light Communication
by Yangyu Zhang
Sensors 2025, 25(23), 7207; https://doi.org/10.3390/s25237207 - 26 Nov 2025
Viewed by 471
Abstract
As the Internet of Things (IoT) and communication technologies continue to evolve, the value of multi-service multiplexing in visible light communication (VLC) systems has been increasingly recognized, particularly in addressing the scarcity of wireless spectrum resources. This study reconstructed the stereo transmission protocol through methods such as dynamic level control, designed a timer interrupt service routine with a double buffer, and reassigned channel status bits in the frame processing function. Consequently, a multi-service multiplexing system based on VLC was designed and implemented. The system enables hybrid transmission of audio signals (1–21.6 kHz) and character data (300–1200 bps) via a single channel, accurately reproducing both voice and text input over a 3.2 m communication range. The system, benefiting from the directional nature of visible light communication, exhibits inherent robustness to multipath-induced interference in dominant line-of-sight (LoS) scenarios and can be easily integrated into existing lighting networks. Featuring a simple architecture and cost-effective design, this solution shows promise for deployment in RF-sensitive areas requiring multi-service communication.
(This article belongs to the Collection Visible Light Communication (VLC))
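
The multiplexing hinges on channel status bits that tell the receiver what each frame's payload slot carries. A toy sketch of such a frame layout follows; the field widths and flag values are hypothetical, not the paper's actual protocol.

```python
import struct

AUDIO_ONLY, WITH_TEXT = 0, 1  # hypothetical channel status flags

def build_frame(audio_sample: int, text_byte=None) -> bytes:
    """Pack one hybrid frame: an audio sample goes out every frame, while
    a status bit tells the receiver whether the extra payload byte carries
    character data this frame (illustrative layout only)."""
    status = AUDIO_ONLY if text_byte is None else WITH_TEXT
    payload = 0 if text_byte is None else text_byte
    # ">BHB": 1-byte status flag, 16-bit audio sample, 1-byte text payload
    return struct.pack(">BHB", status, audio_sample & 0xFFFF, payload)

def parse_frame(frame: bytes):
    """Unpack a frame; returns (audio_sample, text_byte_or_None)."""
    status, sample, payload = struct.unpack(">BHB", frame)
    return sample, (payload if status == WITH_TEXT else None)
```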

20 pages, 2021 KB  
Article
Crosstalk Suppression in a Multi-Channel, Multi-Speaker System Using Acoustic Vector Sensors
by Grzegorz Szwoch
Sensors 2025, 25(21), 6731; https://doi.org/10.3390/s25216731 - 3 Nov 2025
Viewed by 773
Abstract
Automatic speech recognition in a scenario with multiple speakers in a reverberant space, such as a small courtroom, often requires multiple sensors. This leads to a problem of crosstalk that must be removed before the speech-to-text transcription is performed. This paper presents an algorithm intended for application in multi-speaker scenarios requiring speech-to-text transcription, such as court sessions or conferences. The proposed method uses Acoustic Vector Sensors to acquire audio streams. Speaker detection is performed using statistical analysis of the direction of arrival. This information is then used to perform source separation. Next, speakers’ activity in each channel is analyzed, and signal fragments containing direct speech and crosstalk are identified. Crosstalk is then suppressed using a dynamic gain processor, and the resulting audio streams may be passed to a speech recognition system. The algorithm was evaluated using a custom set of speech recordings. An increase in SI-SDR (Scale-Invariant Signal-to-Distortion Ratio) over the unprocessed signal was achieved: 7.54 dB and 19.53 dB for the algorithm with and without the source separation stage, respectively.
(This article belongs to the Special Issue Acoustic Sensors and Their Applications—2nd Edition)
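
SI-SDR, the metric reported above, has a standard definition and is easy to reproduce. A reference implementation:

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-Invariant Signal-to-Distortion Ratio in dB (standard
    definition; shown here because it is the evaluation metric reported)."""
    # Project the estimate onto the reference to find the scaled target.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    noise = estimate - target
    return 10.0 * np.log10((np.sum(target**2) + eps) / (np.sum(noise**2) + eps))
```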

18 pages, 916 KB  
Article
Real-Time Electroencephalography-Guided Binaural Beat Audio Enhances Relaxation and Cognitive Performance: A Randomized, Double-Blind, Sham-Controlled Repeated-Measures Crossover Trial
by Chanaka N. Kahathuduwa, Jessica Blume, Chinnadurai Mani and Chathurika S. Dhanasekara
Physiologia 2025, 5(4), 44; https://doi.org/10.3390/physiologia5040044 - 24 Oct 2025
Viewed by 5591
Abstract
Background/Objectives: Binaural beat audio has gained popularity as a non-invasive tool to promote relaxation and enhance cognitive performance, though empirical support has been inconsistent. We developed a novel algorithm integrating real-time electroencephalography (EEG) feedback to dynamically tailor binaural beats to induce relaxed brain states. This study aimed to examine the efficacy and feasibility of this algorithm in a clinical trial. Methods: In a randomized, double-blinded, sham-controlled crossover trial, 25 healthy adults completed two 30 min sessions (EEG-guided intervention versus sham). EEG (Fp1) was recorded using a consumer-grade single-electrode headset, with auditory stimulation adjusted in real time based on EEG data. Outcomes included EEG frequency profiles, stop signal reaction time (SSRT), and novelty encoding task performance. Results: The intervention rapidly reduced dominant EEG frequency in all participants, with 100% achieving <8 Hz and 96% achieving <4 Hz within medians of 7.4 and 9.0 min, respectively. Compared to the sham, the intervention was associated with a faster novelty encoding reaction time (p = 0.039, dz = −0.225) and trends towards improved SSRT (p = 0.098, dz = −0.209), increased boundary separation in stop trials (p = 0.065, dz = 0.350), and improved inhibitory drift rate (p = 0.067, dz = 0.452) within the limits of the exploratory nature of these findings. Twenty-four (96%) participants reached a target level of <4 Hz with the intervention, while none reached this level with the sham. Conclusions: Real-time EEG-guided binaural beats may rapidly induce low-frequency brain states while potentially preserving or enhancing aspects of executive function. These findings support the feasibility of personalized, closed-loop auditory entrainment for promoting “relaxed alertness.” The results are preliminary and hypothesis-generating, warranting larger, multi-channel EEG studies in ecologically valid contexts.
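
The closed loop reduces to two steps: estimate the dominant EEG frequency in near-real time, then steer the beat frequency accordingly. A sketch under assumed parameters; the study's proprietary control rule is not published, so the update rule below is hypothetical.

```python
import numpy as np
from scipy.signal import welch

def dominant_frequency(eeg, fs=256, band=(1.0, 30.0)):
    """Peak of the Welch PSD within a band: one simple way to track the
    dominant EEG frequency from a single channel (Fp1 in the study)."""
    f, pxx = welch(eeg, fs=fs, nperseg=fs * 2)
    mask = (f >= band[0]) & (f <= band[1])
    return f[mask][np.argmax(pxx[mask])]

def next_beat_frequency(current_beat, dominant_hz, target_hz=4.0, step=0.5):
    """Hypothetical control rule: once the dominant frequency has followed
    the current beat, nudge the beat down toward the target band."""
    if abs(dominant_hz - current_beat) < 1.0 and current_beat > target_hz:
        return max(target_hz, current_beat - step)
    return current_beat
```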

16 pages, 1530 KB  
Article
Enhanced Respiratory Sound Classification Using Deep Learning and Multi-Channel Auscultation
by Yeonkyeong Kim, Kyu Bom Kim, Ah Young Leem, Kyuseok Kim and Su Hwan Lee
J. Clin. Med. 2025, 14(15), 5437; https://doi.org/10.3390/jcm14155437 - 1 Aug 2025
Cited by 2 | Viewed by 3062
Abstract
Background/Objectives: Identifying and classifying abnormal lung sounds is essential for diagnosing patients with respiratory disorders. In particular, the simultaneous recording of auscultation signals from multiple clinically relevant positions offers greater diagnostic potential compared to traditional single-channel measurements. This study aims to improve the accuracy of respiratory sound classification by leveraging multichannel signals and capturing positional characteristics from multiple sites in the same patient. Methods: We evaluated the performance of respiratory sound classification using multichannel lung sound data with a deep learning model that combines a convolutional neural network (CNN) and long short-term memory (LSTM), based on mel-frequency cepstral coefficients (MFCCs). We analyzed the impact of the number and placement of channels on classification performance. Results: The results demonstrated that using four-channel recordings improved accuracy, sensitivity, specificity, precision, and F1-score by factors of approximately 1.11, 1.15, 1.05, 1.08, and 1.13, respectively, compared to using three, two, or single-channel recordings. Conclusions: This study confirms that multichannel data capture a richer set of features corresponding to various respiratory sound characteristics, leading to significantly improved classification performance. The proposed method holds promise for enhancing sound classification accuracy not only in clinical applications but also in broader domains such as speech and audio processing.
(This article belongs to the Section Respiratory Medicine)
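
A generic PyTorch sketch of the CNN-LSTM-on-MFCC architecture for four-channel input. Layer sizes here are arbitrary choices for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CnnLstmClassifier(nn.Module):
    """CNN front end over (channels x MFCC x time), then an LSTM over the
    time axis.  A generic sketch of the CNN-LSTM-on-MFCC design."""
    def __init__(self, in_channels=4, n_mfcc=40, n_classes=4):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),                     # pool the MFCC axis only
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        self.lstm = nn.LSTM(64 * (n_mfcc // 4), 128, batch_first=True)
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):                    # x: (batch, 4, n_mfcc, frames)
        z = self.cnn(x)                      # (batch, 64, n_mfcc // 4, frames)
        z = z.permute(0, 3, 1, 2).flatten(2) # (batch, frames, features)
        out, _ = self.lstm(z)
        return self.head(out[:, -1])         # classify from the last time step
```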

23 pages, 4973 KB  
Article
Detection of Electric Network Frequency in Audio Using Multi-HCNet
by Yujin Li, Tianliang Lu, Shufan Peng, Chunhao He, Kai Zhao, Gang Yang and Yan Chen
Sensors 2025, 25(12), 3697; https://doi.org/10.3390/s25123697 - 13 Jun 2025
Viewed by 1408
Abstract
With the increasing application of electrical network frequency (ENF) in forensic audio and video analysis, ENF signal detection has emerged as a critical technology. However, high-pass filtering operations commonly employed in modern communication scenarios, while effectively removing infrasound to enhance communication quality at reduced costs, result in a substantial loss of fundamental frequency information, thereby degrading the performance of existing detection methods. To tackle this issue, this paper introduces Multi-HCNet, an innovative deep learning model specifically tailored for ENF signal detection in high-pass filtered environments. Specifically, the model incorporates an array of high-order harmonic filters (AFB), which compensates for the loss of fundamental frequency by capturing high-order harmonic components. Additionally, a grouped multi-channel adaptive attention mechanism (GMCAA) is proposed to precisely distinguish between multiple frequency signals, demonstrating particular effectiveness in differentiating between 50 Hz and 60 Hz fundamental frequency signals. Furthermore, a sine activation function (SAF) is utilized to better align with the periodic nature of ENF signals, enhancing the model’s capacity to capture periodic oscillations. Experimental results indicate that after hyperparameter optimization, Multi-HCNet exhibits superior performance across various experimental conditions. Compared to existing approaches, this study not only significantly improves the detection accuracy of ENF signals in complex environments, achieving a peak accuracy of 98.84%, but also maintains an average detection accuracy exceeding 80% under high-pass filtering conditions. These findings demonstrate that even in scenarios where fundamental frequency information is lost, the model remains capable of effectively detecting ENF signals, offering a novel solution for ENF signal detection under extreme conditions of fundamental frequency absence. Moreover, this study successfully distinguishes between 50 Hz and 60 Hz fundamental frequency signals, providing robust support for the practical deployment and extension of ENF signal applications.
(This article belongs to the Section Sensor Networks)
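
The compensation idea, recovering ENF evidence from higher harmonics when the fundamental has been filtered out, can be sketched with a simple scipy filter bank. Band count and width below are assumptions; the paper's AFB sits inside the network and may differ.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def harmonic_bands(x, fs, base=50.0, harmonics=(2, 3, 4, 5), half_bw=1.0):
    """Extract narrow bands around higher harmonics of the nominal mains
    frequency, since a high-pass filter removes the fundamental but often
    leaves its harmonics intact."""
    bands = []
    for k in harmonics:
        lo, hi = k * base - half_bw, k * base + half_bw
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        bands.append(sosfiltfilt(sos, x))   # zero-phase band extraction
    return np.stack(bands)                  # (n_harmonics, n_samples)
```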

24 pages, 7390 KB  
Article
Algorithm for Extraction of Reflection Waves in Single-Well Imaging Based on MC-ConvTasNet
by Wanting Lin, Jiaqi Xu and Hengshan Hu
Appl. Sci. 2025, 15(8), 4189; https://doi.org/10.3390/app15084189 - 10 Apr 2025
Viewed by 1628
Abstract
Single-well imaging makes use of reflected waves to image geological structures outside a borehole, with a detection distance expected to reach tens of meters. However, in the received full wave signal, reflected waves have much smaller amplitudes than borehole-guided waves, which travel directly through the borehole. To obtain clear reflected waves, we use a deep neural network, the multi-channel convolutional time-domain audio separation network (MC-ConvTasNet), to extract reflected waves. In the signal channels of the common-source gather, there exists a notable arrival time difference between direct waves and reflected waves. Leveraging this characteristic, we train MC-ConvTasNet on the common-source gathers, ultimately achieving satisfactory results in wave separation. For the hard-to-hard single-interface, soft-to-hard single-interface, and double-interface models, the reflected waves extracted by MC-ConvTasNet are closer to the theoretical reflected waves in both phase and shape (the average scale-invariant signal-to-distortion ratio exceeds 32 dB) than those extracted by parameter estimation, a median filter, or an F-K filter. Meanwhile, MC-ConvTasNet generalizes naturally to scenarios with various inclined interfaces and interfaces parallel to the borehole axis. As an application, the method is applied to field logging data, and its ability to separate waves is verified.
(This article belongs to the Special Issue Seismic Analysis and Design of Ocean and Underground Structures)
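
For intuition, one of the baselines the network is compared against exploits the same property: direct and guided waves are far more consistent across receiver channels than reflections. A rough median-subtraction sketch, assuming the channels have already been moveout-aligned:

```python
import numpy as np

def suppress_direct_waves(gather):
    """gather: (n_receivers, n_samples) common-source gather whose direct
    and guided waves have been moveout-aligned across channels.  The
    cross-channel median then estimates those coherent arrivals, and
    subtracting it leaves mostly reflected energy (rough baseline only)."""
    coherent = np.median(gather, axis=0, keepdims=True)
    return gather - coherent
```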

22 pages, 32472 KB  
Article
Multi-Scale Feature Fusion GANomaly with Dilated Neighborhood Attention for Oil and Gas Pipeline Sound Anomaly Detection
by Yizhuo Zhang, Zhengfeng Sun, Shen Shi and Huiling Yu
Information 2025, 16(4), 279; https://doi.org/10.3390/info16040279 - 30 Mar 2025
Viewed by 1543
Abstract
Anomaly detection in oil and gas pipelines based on acoustic signals currently faces challenges, including limited anomalous samples, varying audio data distributions across different operating conditions, and interference from background noise. These challenges lead to reduced accuracy and efficiency in pipeline anomaly detection. The primary challenge in reconstruction-based pipeline audio anomaly detection is to prevent the loss of critical information and ensure the high-quality reconstruction of feature maps. This paper proposes a pipeline anomaly detection method termed Multi-scale Feature Fusion GANomaly with Dilated Neighborhood Attention. Firstly, to mitigate information loss during network deepening, a Multi-scale Feature Fusion module is proposed to merge the encoded and decoded feature maps at different dimensions, enhancing low-level detail and high-level semantic information. Secondly, a Dilated Neighborhood Attention module is introduced to assign varying weights to neighborhoods at various dilation rates, extracting channel interactions and spatial relationships between the current pixel and its neighborhoods. Finally, to enhance the quality of the reconstructed spectrum, a loss function based on the Structure Similarity Index Measure is designed, considering both pixel-level and structural differences to maintain the structural characteristics of the reconstructed spectrum. MFDNA-GANomaly achieved 92.06% AUC, 93.96% Accuracy, and 0.955 F1-score on the test set, demonstrating that the proposed method can effectively enhance pipeline anomaly detection performance. Additionally, MFDNA-GANomaly exhibited competitive performance on the ToyTrain and Bearing subsets of the development dataset in the DCASE Challenge 2023 Task 2, confirming the generalization capability of the model.
(This article belongs to the Section Artificial Intelligence)
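
A reconstruction loss mixing pixel-level and structural terms can be written compactly. The sketch below uses global image statistics and an assumed 50/50 weighting; the paper's loss likely uses windowed SSIM and its own weights.

```python
import torch

def ssim_pixel_loss(x, y, alpha=0.5, c1=0.01**2, c2=0.03**2):
    """Reconstruction loss mixing pixel-level L1 with a simplified,
    global-statistics SSIM term, mirroring the idea of preserving both
    pixel and structural fidelity (illustrative, not the paper's loss)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))
    return alpha * (1 - ssim) + (1 - alpha) * (x - y).abs().mean()
```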

17 pages, 2555 KB  
Article
Spatial Sound Rendering Using Intensity Impulse Response and Cardioid Masking Function
by Witold Mickiewicz and Mirosław Łazoryszczak
Appl. Sci. 2025, 15(3), 1112; https://doi.org/10.3390/app15031112 - 23 Jan 2025
Viewed by 1676
Abstract
This study presents a new technique for creating spatial sounds based on a convolution processor. The main objective of this research was to propose a new method for generating a set of impulse responses that guarantee a realistic spatial experience based on the fusion of amplitude data acquired from an omnidirectional microphone and directional data acquired from an intensity probe. The advantages of the proposed approach are its versatility and easy adaptation to playback in a variety of multi-speaker systems, as well as a reduction in the amount of data, thereby simplifying the measurement procedure required to create any set of channel responses at the post-production stage. This paper describes the concept behind the method, the data acquisition method, and the signal processing algorithm required to generate any number of high-quality channel impulse responses. Experimental results are presented to confirm the suitability of the proposed solution by comparing the results obtained for a traditional surround 5.1 recording system and the proposed approach. This study aims to highlight the potential of intensity impulse responses in the audio recording and virtual reality industries.
(This article belongs to the Special Issue Spatial Audio and Sound Design)
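
The cardioid masking function itself is simple: each loudspeaker channel's impulse response is the omnidirectional response weighted by a cardioid gain aimed at that channel. A sketch, assuming a per-sample direction-of-arrival track derived from the intensity probe:

```python
import numpy as np

def channel_ir(omni_ir, doa_rad, speaker_az_rad):
    """Weight the omnidirectional impulse response by a cardioid gain
    aimed at one loudspeaker azimuth.  omni_ir and doa_rad are per-sample
    arrays of equal length (illustrative; the paper's exact masking
    function may differ)."""
    gain = 0.5 * (1.0 + np.cos(doa_rad - speaker_az_rad))  # cardioid pattern
    return omni_ir * gain
```

Repeating this with masks pointed at the standard 5.1 azimuths would yield a full set of channel responses from a single omni/intensity measurement, which is the data-reduction advantage the abstract describes.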

28 pages, 7710 KB  
Article
Research on Underwater Acoustic Target Recognition Based on a 3D Fusion Feature Joint Neural Network
by Weiting Xu, Xingcheng Han, Yingliang Zhao, Liming Wang, Caiqin Jia, Siqi Feng, Junxuan Han and Li Zhang
J. Mar. Sci. Eng. 2024, 12(11), 2063; https://doi.org/10.3390/jmse12112063 - 14 Nov 2024
Cited by 8 | Viewed by 3592
Abstract
In the context of a complex marine environment, extracting and recognizing underwater acoustic target features using ship-radiated noise present significant challenges. This paper proposes a novel deep neural network model for underwater target recognition, which integrates 3D Mel frequency cepstral coefficients (3D-MFCC) and 3D Mel features derived from ship audio signals as inputs. The model employs a serial architecture that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network. It replaces the traditional CNN with a multi-scale depthwise separable convolutional network (MSDC) and incorporates a multi-scale channel attention mechanism (MSCA). The experimental results demonstrate that the average recognition rate of this method reaches 87.52% on the DeepShip dataset and 97.32% on the ShipsEar dataset, indicating a strong classification performance.
(This article belongs to the Section Ocean Engineering)
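
One common reading of a "3D MFCC" feature is the static coefficients stacked with their first- and second-order deltas into a three-channel image. A librosa sketch under that assumption (the paper's exact construction may differ):

```python
import librosa
import numpy as np

def mfcc_3d(path, n_mfcc=40):
    """Stack static MFCCs with their delta and delta-delta coefficients
    into a (3, n_mfcc, frames) tensor suitable for a CNN front end."""
    y, sr = librosa.load(path, sr=None)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.stack([m,
                     librosa.feature.delta(m),
                     librosa.feature.delta(m, order=2)])
```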

20 pages, 3003 KB  
Article
Equipment Sounds’ Event Localization and Detection Using Synthetic Multi-Channel Audio Signal to Support Collision Hazard Prevention
by Kehinde Elelu, Tuyen Le and Chau Le
Buildings 2024, 14(11), 3347; https://doi.org/10.3390/buildings14113347 - 23 Oct 2024
Viewed by 1972
Abstract
Construction workplaces often face unforeseen collision hazards due to a decline in auditory situational awareness among on-foot workers, leading to severe injuries and fatalities. Previous studies that used auditory signals to prevent collision hazards focused on employing a classical beamforming approach to determine equipment sounds’ Direction of Arrival (DOA). No existing frameworks implement a neural network-based approach for both equipment sound classification and localization. This paper presents an innovative framework for sound classification and localization using multichannel sound datasets artificially synthesized in a virtual three-dimensional space. The simulation synthesized 10,000 multi-channel datasets from just fourteen single-sound-source recordings. The framework trains a two-stage convolutional recurrent neural network (CRNN): the first stage learns multi-label sound event classes, and the second stage estimates their DOA. The proposed framework achieves a low average DOA error of 30 degrees and a high F-score of 0.98, demonstrating accurate localization and classification of equipment near workers’ positions on the site.
(This article belongs to the Special Issue Big Data Technologies in Construction Management)
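
The synthesis recipe, spatializing a mono source recording in a simulated room so the azimuth label is known exactly, can be sketched with pyroomacoustics. Geometry values below are placeholders; the paper's simulator and array layout may differ.

```python
import numpy as np
import pyroomacoustics as pra

def synthesize_multichannel(source_wav, fs, az_deg, n_mics=4):
    """Place a mono equipment recording at a known azimuth around a small
    circular array in a simulated room, returning labeled multi-channel
    audio for DOA training (illustrative geometry only)."""
    room = pra.ShoeBox([10, 8, 3], fs=fs, max_order=10)
    az = np.deg2rad(az_deg)
    # Source 3 m from the array center at the requested azimuth.
    room.add_source([5 + 3 * np.cos(az), 4 + 3 * np.sin(az), 1.5],
                    signal=source_wav)
    mics_xy = pra.circular_2D_array(center=[5, 4], M=n_mics, phi0=0,
                                    radius=0.05)
    room.add_microphone_array(pra.MicrophoneArray(
        np.vstack([mics_xy, 1.5 * np.ones(n_mics)]), fs))
    room.simulate()
    return room.mic_array.signals  # (n_mics, n_samples)
```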

18 pages, 3703 KB  
Article
Joint Spatio-Temporal-Frequency Representation Learning for Improved Sound Event Localization and Detection
by Baoqing Chen, Mei Wang and Yu Gu
Sensors 2024, 24(18), 6090; https://doi.org/10.3390/s24186090 - 20 Sep 2024
Cited by 3 | Viewed by 2776
Abstract
Sound event localization and detection (SELD) is a crucial component of machine listening that aims to simultaneously identify and localize sound events in multichannel audio recordings. This task demands an integrated analysis of spatial, temporal, and frequency domains to accurately characterize sound events. The spatial domain pertains to the varying acoustic signals captured by multichannel microphones, which are essential for determining the location of sound sources. However, the majority of recent studies have focused on time-frequency correlations and spatio-temporal correlations separately, leading to inadequate performance in real-life scenarios. In this paper, we propose a novel SELD method that utilizes the newly developed Spatio-Temporal-Frequency Fusion Network (STFF-Net) to jointly learn comprehensive features across spatial, temporal, and frequency domains of sound events. The backbone of our STFF-Net is the Enhanced-3D (E3D) residual block, which combines 3D convolutions with a parameter-free attention mechanism to capture and refine the intricate correlations among these domains. Furthermore, our method incorporates the multi-ACCDOA format to effectively handle homogeneous overlaps between sound events. During the evaluation, we conduct extensive experiments on three de facto benchmark datasets, and our results demonstrate that the proposed SELD method significantly outperforms current state-of-the-art approaches.
(This article belongs to the Section Physical Sensors)
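
A PyTorch sketch of a residual 3D-convolution block with parameter-free attention. SimAM-style energy weighting is used here as a stand-in, since the abstract does not specify which parameter-free mechanism E3D uses.

```python
import torch
import torch.nn as nn

class E3DBlock(nn.Module):
    """Residual block of two 3D convolutions plus a parameter-free
    attention step (SimAM-style stand-in; the paper's exact E3D block
    may differ)."""
    def __init__(self, ch, lam=1e-4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(ch, ch, 3, padding=1), nn.BatchNorm3d(ch), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, padding=1), nn.BatchNorm3d(ch))
        self.lam = lam

    def forward(self, x):                       # x: (B, C, D, H, W)
        z = self.body(x)
        # Parameter-free attention: weight each activation by how much it
        # deviates from its channel mean (no learnable parameters).
        mu = z.mean(dim=(2, 3, 4), keepdim=True)
        d = (z - mu).pow(2)
        v = d.mean(dim=(2, 3, 4), keepdim=True)
        e = d / (4 * (v + self.lam)) + 0.5
        z = z * torch.sigmoid(e)
        return torch.relu(z + x)
```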

14 pages, 1709 KB  
Article
Multi-Channel Audio Completion Algorithm Based on Tensor Nuclear Norm
by Lin Zhu, Lidong Yang, Yong Guo, Dawei Niu and Dandan Zhang
Electronics 2024, 13(9), 1745; https://doi.org/10.3390/electronics13091745 - 1 May 2024
Viewed by 1781
Abstract
Multi-channel audio signals provide a better auditory sensation to the audience. However, data may be lost during the collection, transmission, compression, or other processing of audio signals, degrading audio quality and the listening experience. As a result, audio signal completion has become a popular research topic in the field of signal processing. In this paper, the tensor nuclear norm is introduced into the audio signal completion algorithm, and multi-channel audio signals with missing data are restored using a completion algorithm based on the tensor nuclear norm. First, the multi-channel audio signals are preprocessed and transformed from the time domain to the frequency domain. The multi-channel audio with missing data is then modeled as a third-order tensor. Next, the tensor completion algorithm is applied to this third-order tensor: the optimal solution of the convex optimization model is obtained using the convex relaxation technique, ultimately recovering the missing data of the multi-channel audio. The experimental results of the tensor completion algorithm and the traditional matrix completion algorithm are compared using both objective and subjective indicators. The final result shows that the higher-order tensor completion algorithm has a better completion ability and restores the audio signal more faithfully.
(This article belongs to the Section Circuit and Signal Processing)
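
The standard tensor-nuclear-norm completion recipe thresholds the singular values of each frontal slice in the Fourier domain along the third mode, re-imposing the observed entries each iteration. A bare-bones numpy sketch with a fixed threshold and no convergence test; the paper's tuned solver will differ.

```python
import numpy as np

def tnn_complete(T, mask, tau=1.0, iters=100):
    """Complete a third-order tensor T (observed where mask is True) by
    iterative singular value thresholding under the tensor nuclear norm
    (t-SVD along the third mode).  Minimal sketch, not a tuned solver."""
    X = np.where(mask, T, 0.0)
    for _ in range(iters):
        F = np.fft.fft(X, axis=2)           # to the Fourier domain (mode 3)
        for k in range(F.shape[2]):
            # Soft-threshold the singular values of each frontal slice.
            U, s, Vh = np.linalg.svd(F[:, :, k], full_matrices=False)
            F[:, :, k] = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vh
        X = np.real(np.fft.ifft(F, axis=2))
        X[mask] = T[mask]                   # re-impose the observed entries
    return X
```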

30 pages, 6523 KB  
Article
On the Challenges of Acoustic Energy Mapping Using a WASN: Synchronization and Audio Capture
by Emiliano Ehecatl García-Unzueta, Paul Erick Mendez-Monroy and Caleb Rascon
Sensors 2023, 23(10), 4645; https://doi.org/10.3390/s23104645 - 10 May 2023
Cited by 1 | Viewed by 2716
Abstract
Acoustic energy mapping provides a way to obtain characteristics of acoustic sources, such as their presence, localization, type, and trajectory. Several beamforming-based techniques can be used for this purpose. However, they rely on the difference in arrival times of the signal at each capture node (or microphone), so synchronized multi-channel recordings are of major importance. A Wireless Acoustic Sensor Network (WASN) can be very practical to install for mapping the acoustic energy of a given environment, but WASNs are known for low synchronization between the recordings from each node. The objective of this paper is to characterize the impact of currently popular synchronization methodologies on a WASN's ability to capture data reliable enough for acoustic energy mapping. The two evaluated synchronization protocols are the Network Time Protocol (NTP) and the Precision Time Protocol (PTP). Additionally, three different audio capture methodologies were proposed for the WASN: two record the data locally, and one sends the data through a local wireless network. As a real-life evaluation scenario, a WASN was built using nodes consisting of a Raspberry Pi 4B+ with a single MEMS microphone. Experimental results demonstrate that the most reliable methodology combines the PTP synchronization protocol with local audio recording.
(This article belongs to the Special Issue Reliability Analysis of Wireless Sensor Network)
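
Synchronization quality in such a WASN is typically quantified by cross-correlating two nodes' recordings of the same reference sound. A minimal sketch:

```python
import numpy as np

def inter_node_offset(ref_node, other_node, fs):
    """Estimate the residual clock offset (in seconds) between two nodes'
    recordings of the same reference sound via cross-correlation: the
    usual way to measure how well NTP/PTP kept a WASN synchronized."""
    n = len(ref_node)
    xc = np.correlate(other_node, ref_node, mode="full")
    lag = np.argmax(xc) - (n - 1)   # samples by which other_node is late
    return lag / fs
```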

10 pages, 3010 KB  
Article
Multi-Channel Long-Distance Audio Transmission System Using Power-over-Fiber Technology
by Can Guo, Chenggang Guan, Hui Lv, Shiyi Chai and Hao Chen
Photonics 2023, 10(5), 521; https://doi.org/10.3390/photonics10050521 - 1 May 2023
Cited by 9 | Viewed by 3549
Abstract
To establish stable communication networks in harsh environments where power supply is difficult, such as coal mines and underwater, we propose an effective scheme for the co-transmission of analog audio signals and energy. Leveraging the advantages of optical fibers, such as corrosion resistance and strong immunity to electromagnetic interference, the scheme uses a 1550 nm laser beam as the carrier for the analog audio signal; after 25 km of fiber transmission, the light is converted to electrical energy by a custom InGaAs/InP photovoltaic power converter (PPC), providing both power and information transfer without an external power supply. In the experiment, with 160 mW of optical power injection, the scheme not only provides 4 mW of electrical power but also transmits an analog signal with an acoustic overload point (AOP) of 105 dBSPL and a signal-to-noise ratio (SNR) of 50 dB. In addition, the system employs wavelength division multiplexing (WDM) to scale from single-channel to multi-channel communication over a single fiber, enabling arrayed receiving terminals. The passive arrayed terminals make this multi-channel long-distance audio transmission system using power-over-fiber (PoF) technology a superior choice for establishing stable communication networks in harsh environments.
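
The reported numbers imply a plausible link budget; a quick back-of-the-envelope check. The 0.2 dB/km attenuation is a typical value for 1550 nm single-mode fiber, assumed here rather than taken from the paper.

```python
# Rough power-over-fiber link budget for the reported figures.
p_in_mw = 160.0                                  # injected optical power
loss_db = 0.2 * 25                               # assumed ~5 dB over 25 km
p_at_ppc_mw = p_in_mw * 10 ** (-loss_db / 10)    # ~50.6 mW reaching the PPC
efficiency = 4.0 / p_at_ppc_mw                   # ~8% optical-to-electrical
print(f"{p_at_ppc_mw:.1f} mW at the PPC, {efficiency:.1%} O/E efficiency")
```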
