MDPI - Publisher of Open Access Journals

18 pages, 10724 KiB

Open Access Article

Object Detection in Complex Traffic Scenes Based on Environmental Perception Attention and Three-Scale Feature Fusion

by Chunmiao Yuan, Jinlong Liu, Haobo Wang and Qingyong Yang

Appl. Sci. 2025, 15(6), 3163; https://doi.org/10.3390/app15063163 - 14 Mar 2025

Viewed by 624

With recent advances in automated driving technology, object detection algorithms that respond effectively to complex and diverse road traffic scenarios are especially important for driving safety. In this context, we conduct an in-depth study of object detection algorithms for diverse autonomous driving scenarios. To handle diverse, changing backgrounds and multi-scale targets, we propose environmental perception attention (EPA) and the three-scale fusion module (TSFM) to improve the accuracy of object detection in complex traffic scenes. Environmental perception attention improves the model’s ability to perceive objects by modeling long-range information and inter-channel relationships, directing the model’s attention to task-relevant regions and important features in the image. The three-scale fusion module mixes features from different scales while introducing low-level feature map information, enabling the model to account for the features of objects at different scales. In our experiments, we apply the proposed method to the YOLOv8 model for validation. The results show that, on the BDD100K automated driving dataset with its diverse and complex backgrounds, the improved model increases the mAP@0.5 metric by 1.3% over the baseline model. This makes YOLOv8 more accurate and effective at detecting different objects in the scene and better able to adapt to different traffic scenarios and environmental changes.

26 pages, 66184 KiB

Open Access Article

Advanced Seismic Sedimentology Techniques for Characterizing Shallow-Water Fan Deltas: Reservoir Architecture and Sedimentary Evolution of the Upper Karamay Formation, Bai21 Area, NW Junggar Basin, China

by Derong Huang, Xinmin Song, Youjing Wang and Guosheng Qin

Appl. Sci. 2025, 15(5), 2306; https://doi.org/10.3390/app15052306 - 21 Feb 2025

Viewed by 674

Abstract

Various glutenite reservoirs, developed by fans, can be found in the Junggar Basin. Among these, there are different interpretations of the glutenite reservoirs formed by shallow-water fan deltas in the Triassic system in the northwestern margin of the basin. The characteristics of these deltas and their reservoir architecture have not been understood clearly. It seriously restricts the advancement of the subsequent development of the oilfield. Therefore, it is of great significance to carry out the fine reservoir architecture characterization of the shallow-water fan delta in this area. In this study, the upper member of the Triassic Karamay Formation in the Bai 21 area was selected as the study site. Through core analysis, nine types of sedimentary microfacies, including mudflow deposit, braided river, flood plain, underwater main channel, underwater distributary channel, overbank channel, interchannel deposition, estuary bar, and sheet sand, are found. Through mixed-phase wavelet frequency extension, the main frequency of seismic data is moderately increased and the frequency band is broadened, which makes it possible to identify the thin layer of about 10 m. Through continuous stratal slicing, the thin-layer sedimentary bodies that are difficult to be distinguished in the vertical direction are depicted, and the distribution of sedimentary bodies at different vertical positions is obtained by browsing the slices. Through color fusion based on seismic frequency decomposition, the fusion results contain information reflecting the thickness, and the characterization effect of the fan boundary is significantly improved. In summary, this study depicts the distribution of single-stage fans and recognizes the sand body development characteristics of the two-stage fans. Research suggests that two large shallow-water fan-delta complexes were discovered in the S3 sand group within the study area. 
Each fan possesses a multilevel branching distributary channel system, resulting in multiple horizontally oriented lobes. Within S3, the third sand group in the Upper Triassic Karamay Formation, the fan-delta complex can be divided into two single-stage fans recorded, in ascending order, in sublayers S31 and S32. The two-stage fan deltas show inherited sedimentary characteristics and formed in a regressive sequence. The sand bodies formed during the low-water-level stage in S31 are thick, with few interlayers developed. Most sand bodies intersect each other vertically. In the shallow fan delta, a widespread estuary bar is deposited, which develops along the underwater distributary channel. This research enhances the understanding of shallow-water fan-delta reservoirs in the study area, provides a precise target for oilfield development, and addresses the key problem of unclear sand body distribution and combination relationships, which has restricted development.

(This article belongs to the Special Issue Advances in Seismic Sedimentology and Geomorphology)

28 pages, 1581 KiB

Open Access Article

Sustainable Operation Mode Choices for Second-Hand Inspection Platforms

by Han Yue and Min Huang

Systems 2024, 12(12), 512; https://doi.org/10.3390/systems12120512 - 21 Nov 2024

Viewed by 1069

Abstract

The sale of second-hand goods has formed a complete industrial chain, and second-hand product testing is a crucial part of it. Second-hand inspection platforms (SIPs) have achieved remarkable commercial success by providing inspection services that alleviate consumers’ quality concerns. Different SIPs typically adopt various operation modes, such as consignment, resale, or hybrid modes. Appropriate operation modes not only help SIPs maintain profitability but also contribute to the sustainable development of the sharing economy. To realize the sustainable operation of second-hand inspection platforms, we construct a platform-dominated Stackelberg model to explore the motivations behind SIPs’ choices of different operation modes and investigate the impacts of changes in the inspection service level on the platform’s optimal decisions and market performance. System data analysis results show that the cost of guarantee significantly influences SIPs’ choices of operation modes. Specifically, SIPs are inclined to adopt the consignment mode when the cost of guarantee is relatively high, the resale mode when it is relatively low, and the hybrid mode when it is moderate. Furthermore, in the presence of inter-channel competition, if the inspection failure loss is relatively high, SIPs may lower the prices of used products as the inspection service level increases. Additionally, although inspection service can disclose the true quality of used products, a higher inspection service level may attract more low-quality sellers into the market when the inspection failure loss is substantial. Finally, under the resale mode, consumer surplus and social welfare decrease with the inspection service level. Conversely, under the consignment or hybrid mode, both consumer surplus and social welfare increase with the inspection service level when the inspection failure loss is relatively low.

(This article belongs to the Special Issue New Trends in Sustainable Operations and Supply Chain Management)

15 pages, 2704 KiB

Open Access Article

An Improved YOLOv5 Model for Concrete Bubble Detection Based on Area K-Means and ECANet

by Wei Tian, Bazhou Li, Jingjing Cao, Feichao Di, Yang Li and Jun Liu

Mathematics 2024, 12(17), 2777; https://doi.org/10.3390/math12172777 - 8 Sep 2024

Cited by 1 | Viewed by 1054

Abstract

The appearance quality of fair-faced concrete plays a crucial role in evaluating the engineering quality, as the abundance of small-area bubbles generated during construction diminishes the surface quality of concrete. However, existing methods are plagued by sluggish detection speed and inadequate accuracy. Therefore, this paper proposes an improved method based on YOLOv5 to rapidly and accurately detect small bubble defects on the surface of fair-faced concrete. Firstly, to address the issue of YOLOv5 in generating prior boxes for imbalanced samples, we divide the image preprocessing part into small-, medium-, and large-area intervals corresponding to the number of heads. Additionally, we propose an area-based k-means clustering approach specifically tailored for the anchor boxes within each of these intervals. Moreover, we adjust the number of prior boxes generated by k-means clustering according to the training loss function to adapt to bubbles of different sizes. Then, we introduce the ECA (Efficient Channel Attention) mechanism into the neck part of the model to effectively capture inter-channel interactions and enhance feature representation. Subsequently, we incorporate feature concatenation in the neck part to facilitate the fusion of low-level and high-level features, thereby improving the accuracy and generalization ability of the network. Finally, we construct our own dataset containing 980 images of two classes: cement and bubbles. Comparative experiments are conducted on our dataset using YOLOv5s, YOLOv6s, YOLOxs, and our method. Experimental results demonstrate that the proposed method achieves the highest detection accuracy in terms of mAP0.5, mAP0.75, and mAP0.5:0.95. Compared to YOLOv5s, our method achieves a 7.1% improvement in mAP0.5, a 3.7% improvement in mAP0.75, and a 4.5% improvement in mAP0.5:0.95. Full article

(This article belongs to the Special Issue Application of Machine Learning and Data Mining, 2nd Edition)

23 pages, 10188 KiB

Open Access Article

Sparse-View Spectral CT Reconstruction Based on Tensor Decomposition and Total Generalized Variation

by Xuru Li, Kun Wang, Xiaoqin Xue and Fuzhong Li

Electronics 2024, 13(10), 1868; https://doi.org/10.3390/electronics13101868 - 10 May 2024

Cited by 1 | Viewed by 1357

Abstract

Spectral computed tomography (CT)-reconstructed images often exhibit severe noise and artifacts, which compromise the practical application of spectral CT imaging technology. Methods that use tensor dictionary learning (TDL) have shown superior performance, but it is difficult to obtain a high-quality pre-trained global tensor dictionary in practice. To resolve this problem, this paper develops an algorithm called tensor decomposition with total generalized variation (TGV) for sparse-view spectral CT reconstruction. When constructing tensor volumes, the proposed algorithm uses the non-local similarity of images to build fourth-order tensor volumes and uses Canonical Polyadic (CP) tensor decomposition instead of pre-trained tensor dictionaries to further explore the inter-channel correlation of images. In addition, a TGV regularization term is introduced to characterize spatial sparsity; its use of higher-order derivatives adapts better to different image structures and noise levels. The proposed objective minimization model is solved using the split-Bregman algorithm. To assess the performance of the proposed algorithm, several numerical simulations and real preclinical mouse data are studied. The final results demonstrate that the proposed algorithm greatly improves the quality of spectral CT images compared with several existing competing algorithms.

(This article belongs to the Special Issue Pattern Recognition and Machine Learning Applications, 2nd Edition)

16 pages, 5554 KiB

Open Access Article

A Method for Estimating the Hydrodynamic Values of Anastomosing Rivers: The Expression of Channel Morphological Parameters

by Suiji Wang

Water 2024, 16(1), 163; https://doi.org/10.3390/w16010163 - 31 Dec 2023

Cited by 6 | Viewed by 2373

Abstract

An anastomosing river is a stable multiple-channel system separated by inter-channel wetlands, and there are serious difficulties in observing the hydrodynamics of such river patterns in situ. Therefore, there are few reports on the hydrodynamic data of such rivers, for example, the upper Columbia and Pearl Rivers. In order to obtain the hydrodynamic parameter values at flow cross-sections of anastomosing rivers, without having to observe hydraulic radius, this study proposes a method called the Expression of Channel Morphological Parameters (ECMP) for hydrodynamic estimation. The calculation formula of the ECMP method is based on the shape factor (width–depth ratio), scale factor (mean depth), and gradient factor of the channel cross-sections of anastomosing rivers below a given water level as independent variables. This method can be used to calculate the mean velocity, discharge, specific stream power, and gross stream power of the flow cross-section at different water levels, only requiring the measurements of channel morphological parameters such as the mean depth, width–depth ratio, and gradient at the channel cross-section below the corresponding water level. The applicability of the ECMP method was verified using measured hydrological data. The results showed that the ECMP method is a practical estimation method with higher accuracy that is convenient for calculating the hydrodynamic parameters of anastomosing rivers. It can also be used to reconstruct ancient anastomosing rivers using the channel morphological parameters revealed from the fill sediments in ancient channels. Full article

(This article belongs to the Special Issue Landscape Dynamics and Fluvial Geomorphology)

16 pages, 8476 KiB

Open Access Article

SAR and Multi-Spectral Data Fusion for Local Climate Zone Classification with Multi-Branch Convolutional Neural Network

by Guangjun He, Zhe Dong, Jian Guan, Pengming Feng, Shichao Jin and Xueliang Zhang

Remote Sens. 2023, 15(2), 434; https://doi.org/10.3390/rs15020434 - 11 Jan 2023

Cited by 10 | Viewed by 3406

Abstract

The local climate zone (LCZ) scheme is of great value for urban heat island (UHI) effect studies by providing a standard classification framework to describe the local physical structure at a global scale. In recent years, with the rapid development of satellite imaging techniques, both multi-spectral (MS) and synthetic aperture radar (SAR) data have been widely used in LCZ classification tasks. However, the fusion of MS and SAR data still faces the challenges of the different imaging mechanisms and the feature heterogeneity. In this study, to fully exploit and utilize the features of SAR and MS data, a data-grouping method was firstly proposed to divide multi-source data into several band groups according to the spectral characteristics of different bands. Then, a novel network architecture, namely Multi-source data Fusion Network for Local Climate Zone (MsF-LCZ-Net), was introduced to achieve high-precision LCZ classification, which contains a multi-branch CNN for multi-modal feature extraction and fusion, followed by a classifier for LCZ prediction. In the proposed multi-branch structure, a split–fusion-aggregate strategy was adopted to capture multi-level information and enhance the feature representation. In addition, a self channel attention (SCA) block was introduced to establish long-range spatial and inter-channel dependencies, which made the network pay more attention to informative features. Experiments were conducted on the So2Sat LCZ42 dataset, and the results show the superiority of our proposed method when compared with state-of-the-art methods. Moreover, the LCZ maps of three main cities in China were generated and analyzed to demonstrate the effectiveness of our proposed method. Full article

(This article belongs to the Special Issue Information Extraction, Processing and Analysis Methods for Remote Sensing Multi-Modal Information Navigation Applications)

14 pages, 1293 KiB

Open Access Article

A Modified KNN Algorithm for High-Performance Computing on FPGA of Real-Time m-QAM Demodulators

by David Marquez-Viloria, Luis Castano-Londono and Neil Guerrero-Gonzalez

Electronics 2021, 10(5), 627; https://doi.org/10.3390/electronics10050627 - 9 Mar 2021

Cited by 7 | Viewed by 4020

Abstract

A methodology for scalable and concurrent real-time implementation of highly recurrent algorithms is presented and experimentally validated using the AWS-FPGA. This paper presents a parallel implementation of a KNN algorithm focused on m-QAM demodulators using high-level synthesis for fast prototyping, parameterization, and scalability of the design. The proposed design shows the successful implementation of the KNN algorithm for interchannel interference mitigation in a 3 × 16 Gbaud 16-QAM Nyquist WDM system. Additionally, we present a modified version of the KNN algorithm in which comparisons among data symbols are reduced by identifying the closest neighbor using the rule of the 8-connected clusters used for image processing. The real-time implementation of the modified KNN on a Xilinx Virtex UltraScale+ VU9P AWS-FPGA board was compared with results obtained in previous work using the same data from the same experimental setup but with offline DSP in Matlab. The results show that the difference is negligible below the FEC limit. Additionally, the modified KNN shows a reduction of operations from 43 percent to 75 percent, depending on the symbol’s position in the constellation, achieving a 47.25% reduction in total computational time for 100 K input symbols processed on 20 parallel cores compared to the KNN algorithm.
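
As a rough illustration of KNN-based symbol decisions for 16-QAM, the sketch below classifies a received sample by majority vote among its nearest labeled training samples. It is plain KNN in software, not the authors' modified 8-connected variant or their FPGA design; the unit-grid constellation, the fixed `OFFSETS`, and the names `knn_demodulate` and `TRAINING` are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Ideal 16-QAM constellation on a unit grid (illustrative normalization).
LEVELS = (-3, -1, 1, 3)
CONSTELLATION = [complex(i, q) for i, q in product(LEVELS, LEVELS)]


def knn_demodulate(received, training, k=3):
    """Return the majority symbol index among the k nearest training samples."""
    nearest = sorted(training, key=lambda pair: abs(pair[0] - received))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled training set: each ideal point plus small fixed offsets.
OFFSETS = (0.10 + 0.10j, -0.10 + 0.05j, 0.05 - 0.10j)
TRAINING = [
    (point + offset, index)
    for index, point in enumerate(CONSTELLATION)
    for offset in OFFSETS
]

# Decide which constellation point a noisy sample most likely represents.
decided = knn_demodulate(0.8 + 1.2j, TRAINING)
print(CONSTELLATION[decided])
```

Sorting the whole training set for every symbol is the brute-force baseline; reducing those comparisons is the kind of saving the abstract attributes to the modified algorithm.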

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

18 pages, 6031 KiB

Open Access Article

Denoising Algorithm for the FY-4A GIIRS Based on Principal Component Analysis

by Sihui Fan, Wei Han, Zhiqiu Gao, Ruoying Yin and Yu Zheng

Remote Sens. 2019, 11(22), 2710; https://doi.org/10.3390/rs11222710 - 19 Nov 2019

Cited by 9 | Viewed by 5989

Abstract

The Geostationary Interferometric Infrared Sounder (GIIRS) is the first high-spectral resolution advanced infrared (IR) sounder onboard the new-generation Chinese geostationary meteorological satellite FengYun-4A (FY-4A). The GIIRS has 1650 channels, and its spectrum ranges from 700 to 2250 cm⁻¹ with an unapodized spectral resolution of 0.625 cm⁻¹. It represents a significant breakthrough for measurements with high temporal, spatial and spectral resolutions worldwide. Many GIIRS channels have quite similar spectral signal characteristics that are highly correlated with each other in content and have a high degree of information redundancy. Therefore, this paper applies a principal component analysis (PCA)-based denoising algorithm (PDA) to reduce noise in simulation data with different noise levels and in observation data. The results show that channel reconstruction using inter-channel spatial dependency and spectral similarity can reduce the noise in the observation brightness temperature (BT). A comparison of the BT observed by the GIIRS (O) with the BT simulated by the radiative transfer model (B) shows that a deviation occurs in the observation channel depending on the observation array. The results show that, after the PDA procedure is applied, the array features of the reconstructed observation BT (rrO) that depend on the observation array are weakened and the effect of the array position on the observations in the sub-center of the field of regard (FOR) is partially eliminated. The high observation and simulation differences (O-B) in the sub-center of the FOR array are notably reduced after the PDA procedure is implemented. The improvement of the high O-B is more distinct, and the low O-B becomes smoother. In each scan line, the standard deviation of the reconstructed background departures (rrO-B) is lower than that of the background departures (O-B).
The observation error calculated by posterior estimation based on variational assimilation also verifies the efficiency of the PDA. The typhoon experiment also shows that among the 29 selected assimilation channels, the observation error of 65% of the channels was reduced as calculated by the triangle method.

(This article belongs to the Special Issue Feature Papers for Section Atmosphere Remote Sensing)

16 pages, 1188 KiB

Open Access Article

RGB Inter-Channel Measures for Morphological Color Texture Characterization

by Nelson Luis Durañona Sosa, José Luis Vázquez Noguera, Juan José Cáceres Silva, Miguel García Torres and Horacio Legal-Ayala

Symmetry 2019, 11(10), 1190; https://doi.org/10.3390/sym11101190 - 20 Sep 2019

Cited by 6 | Viewed by 3741

Abstract

The perception of textures is based on high-level features such as symmetry, brightness, color or direction. Texture characterization is a widely studied topic in the image processing community. The normalized volume of morphological series is used as a texture descriptor in RGB images. However, the correlation between different color channels is not exploited with this descriptor. We propose the use of inter-channel measures in addition to the volume to enhance the descriptor’s potential to discriminate textures. The experiments show that the performance of standard texture classification techniques increases by 3–10% when using our descriptor instead of other state-of-the-art descriptors that do not use inter-channel measures.

(This article belongs to the Special Issue Mathematical Modeling and Computational Methods in Science and Engineering)

9 pages, 494 KiB

Open Access Article

Dual Microphone Voice Activity Detection Based on Reliable Spatial Cues

by Soojoong Hwang, Yu Gwang Jin and Jong Won Shin

Sensors 2019, 19(14), 3056; https://doi.org/10.3390/s19143056 - 11 Jul 2019

Cited by 7 | Viewed by 3769

Abstract

Two main spatial cues that can be exploited for dual microphone voice activity detection (VAD) are the interchannel time difference (ITD) and the interchannel level difference (ILD). While both ITD and ILD provide information on the location of audio sources, they may be impaired in different manners by background noises and reverberation and therefore can have complementary information. Conventional approaches utilize the statistics from all frequencies with fixed weight, although the information from some time–frequency bins may degrade the performance of VAD. In this letter, we propose a dual microphone VAD scheme based on the spatial cues in reliable frequency bins only, considering the sparsity of the speech signal in the time–frequency domain. The reliability of each time–frequency bin is determined by three conditions on signal energy, ILD, and ITD. ITD-based and ILD-based VADs and statistics are evaluated using the information from selected frequency bins and then combined to produce the final VAD results. Experimental results show that the proposed frequency selective approach enhances the performances of VAD in realistic environments. Full article
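
To make the two cues concrete, the following sketch estimates a broadband ILD (energy ratio in dB) and ITD (lag of the cross-correlation peak) for one frame of a two-channel signal. It is only a time-domain illustration: the scheme in the abstract works per time–frequency bin with reliability conditions, which are not reproduced here. The function names, the toy signals `LEFT` and `RIGHT`, and the sign conventions are assumptions.

```python
import math


def ild_db(left, right, eps=1e-12):
    """Interchannel level difference in dB (positive when left is louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


def itd_samples(left, right, max_lag):
    """Lag in samples that maximizes the cross-correlation of the two channels.

    A positive result means the right channel lags the left channel.
    Both channels are assumed to have the same length.
    """
    n = len(left)
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        value = sum(left[i] * right[i + lag] for i in range(n) if 0 <= i + lag < n)
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


# Toy frame: the right channel is the left one delayed by 3 samples at half amplitude.
pulse = [1.0, 2.0, 3.0, 2.0, 1.0]
LEFT = [0.0] * 10 + pulse + [0.0] * 10
RIGHT = [0.0] * 3 + [0.5 * x for x in LEFT[:-3]]

print(itd_samples(LEFT, RIGHT, max_lag=5), round(ild_db(LEFT, RIGHT), 2))
```

Dividing the lag by the sampling rate would convert the ITD from samples to seconds.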

(This article belongs to the Special Issue Speech, Acoustics, Audio Signal Processing and Applications in Sensors)

24 pages, 4193 KiB

Open Access, Editor’s Choice Article

Transfer Entropy as a Tool for Hydrodynamic Model Validation

by Alicia Sendrowski, Kazi Sadid, Ehab Meselhe, Wayne Wagner, David Mohrig and Paola Passalacqua

Entropy 2018, 20(1), 58; https://doi.org/10.3390/e20010058 - 12 Jan 2018

Cited by 19 | Viewed by 6460

Abstract

The validation of numerical models is an important component of modeling to ensure reliability of model outputs under prescribed conditions. In river deltas, robust validation of models is paramount given that models are used to forecast land change and to track water, solid, and solute transport through the deltaic network. We propose using transfer entropy (TE) to validate model results. TE quantifies the information transferred between variables in terms of strength, timescale, and direction. Using water level data collected in the distributary channels and inter-channel islands of Wax Lake Delta, Louisiana, USA, along with modeled water level data generated for the same locations using Delft3D, we assess how well couplings between external drivers (river discharge, tides, wind) and modeled water levels reproduce the observed data couplings. We perform this operation through time using ten-day windows. Modeled and observed couplings compare well; their differences reflect the spatial parameterization of wind and roughness in the model, which prevents the model from capturing high frequency fluctuations of water level. The model captures couplings better in channels than on islands, suggesting that mechanisms of channel-island connectivity are not fully represented in the model. Overall, TE serves as an additional validation tool to quantify the couplings of the system of interest at multiple spatial and temporal scales. Full article
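
For readers unfamiliar with transfer entropy, the sketch below gives a minimal plug-in estimate for discrete sequences with a history length of one step. It assumes already-discretized data and a single lag, so it does not cover the timescale analysis or windowing described in the abstract; the function name and the synthetic `DRIVER` and `RESPONSE` series are hypothetical.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Both inputs are equal-length sequences of discrete (hashable) states.
    """
    future, past, driver = target[1:], target[:-1], source[:-1]
    n = len(future)
    joint = Counter(zip(future, past, driver))
    past_driver = Counter(zip(past, driver))
    future_past = Counter(zip(future, past))
    past_only = Counter(past)
    te = 0.0
    for (y_next, y_now, x_now), count in joint.items():
        # p(y_next | y_now, x_now) versus p(y_next | y_now)
        p_given_both = count / past_driver[(y_now, x_now)]
        p_given_past = future_past[(y_next, y_now)] / past_only[y_now]
        te += (count / n) * log2(p_given_both / p_given_past)
    return te


# Synthetic check: RESPONSE copies DRIVER with a one-step delay.
rng = random.Random(0)
DRIVER = [rng.randint(0, 1) for _ in range(2000)]
RESPONSE = [0] + DRIVER[:-1]
TE_FORWARD = transfer_entropy(DRIVER, RESPONSE)   # information flows this way
TE_REVERSE = transfer_entropy(RESPONSE, DRIVER)   # should be close to zero
print(TE_FORWARD > TE_REVERSE)
```

The asymmetry between the two directions is what lets TE indicate the direction of coupling, for example from an external driver to water level.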

(This article belongs to the Special Issue Transfer Entropy II)

19 pages, 3053 KiB

Open Access Article

Stereophonic Microphone Array for the Recording of the Direct Sound Field in a Reverberant Environment

by Jonathan Albert Gößwein, Julian Grosse and Steven Van de Par

Appl. Sci. 2017, 7(6), 541; https://doi.org/10.3390/app7060541 - 24 May 2017

Cited by 1 | Viewed by 5944

Abstract

State-of-the-art stereo recording techniques using two microphones have two main disadvantages: first, a limited reduction of the reverberation in the direct sound component, and second, compression or expansion of the angular position of sound sources. To address these disadvantages, the aim of this study is the development of a true stereo recording microphone array that aims to record the direct and reverberant sound field separately. This array can be used within the recording and playback configuration developed in Grosse and van de Par, 2015. Instead of using only two microphones, the proposed method combines two logarithmically-spaced microphone arrays, whose directivity patterns are optimized with a superdirective beamforming algorithm. The optimization allows us to have a better control of the overall beam pattern and of interchannel level differences. A comparison between the newly-proposed system and existing microphone techniques shows a lower percentage of the recorded reverberance within the sound field. Full article

(This article belongs to the Special Issue Spatial Audio)

13 pages, 1736 KiB

Open Access Article

Frequency-Dependent Amplitude Panning for the Stereophonic Image Enhancement of Audio Recorded Using Two Closely Spaced Microphones

by Chan Jun Chun and Hong Kook Kim

Appl. Sci. 2016, 6(2), 39; https://doi.org/10.3390/app6020039 - 1 Feb 2016

Cited by 5 | Viewed by 5875

Abstract

In this paper, we propose a new frequency-dependent amplitude panning method for stereophonic image enhancement applied to a sound source recorded using two closely spaced omni-directional microphones. The ability to detect the direction of such a sound source is limited due to weak spatial information, such as the inter-channel time difference (ICTD) and inter-channel level difference (ICLD). Moreover, when sound sources are recorded in a convolutive or a real room environment, the detection of sources is affected by reverberation effects. Thus, the proposed method first tries to estimate the source direction depending on the frequency using azimuth-frequency analysis. Then, a frequency-dependent amplitude panning technique is proposed to enhance the stereophonic image by modifying the stereophonic law of sines. To demonstrate the effectiveness of the proposed method, we compare its performance with that of a conventional method based on the beamforming technique in terms of directivity pattern, perceived direction, and quality degradation under three different recording conditions (anechoic, convolutive, and real reverberant). The comparison shows that the proposed method gives us better stereophonic images in a stereo loudspeaker reproduction than the conventional method without any annoying effects. Full article
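
For context, the standard (unmodified) stereophonic law of sines relates the perceived source angle θ to the loudspeaker gains by sin θ / sin θ₀ = (g_L − g_R) / (g_L + g_R), where θ₀ is the loudspeaker half-angle. The sketch below solves this for the two gains with constant-power normalization. It shows only the textbook law, not the frequency-dependent modification proposed in the paper; the function name, the 30° default, and the sign convention are assumptions.

```python
import math


def sine_law_gains(source_deg, speaker_deg=30.0):
    """Left/right gains from the stereophonic law of sines.

    Angles are in degrees; positive angles point toward the left loudspeaker.
    Gains are normalized so that gain_left**2 + gain_right**2 == 1.
    """
    if abs(source_deg) > speaker_deg:
        raise ValueError("source must lie between the loudspeakers")
    ratio = math.sin(math.radians(source_deg)) / math.sin(math.radians(speaker_deg))
    # Any pair with (g_l - g_r) / (g_l + g_r) == ratio works; pick one and normalize.
    gain_left, gain_right = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(gain_left, gain_right)
    return gain_left / norm, gain_right / norm


for angle in (-30.0, 0.0, 15.0, 30.0):
    print(angle, sine_law_gains(angle))
```

A frequency-dependent variant would apply such a gain pair per frequency band, using the direction estimated for that band.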

(This article belongs to the Special Issue Audio Signal Processing)

Search Results (14)
