Advances in Audio/Image Signals Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 31 January 2026 | Viewed by 2578

Special Issue Editors


Dr. Karolina Nurzynska
Guest Editor
Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Interests: machine learning; deep neural network architectures; medical image analysis; emotional state analysis

Dr. Sebastian Iwaszenko
Guest Editor
Department of Acoustic, Electronic and IT Solutions, GIG National Research Institute, Gwarków 1, 40-166 Katowice, Poland
Interests: computer vision; machine learning; deep learning; artificial intelligence; industrial applications of AI tools and methods

Special Issue Information

Dear Colleagues,

We are pleased to announce a call for submissions for our Special Issue entitled “Advances in Audio/Image Signals Processing”.

This Special Issue will focus on advances in the design and development of methods for processing audio and image signals. The increasing availability and accessibility of hardware for capturing these signals have led to a surge in data collection. This growing volume of data demands faster and more efficient methods for analyzing and processing signals. As data acquisition has become easier and more cost-effective, there is a growing need for automated processing and analysis that make sense of the collected information.

This Special Issue welcomes papers that explore cutting-edge research and recent advances in visual and audio signal analysis across all domains. We particularly encourage submissions that introduce novel approaches and compare them against the current state of the art. Case studies and theoretical studies will also be considered, provided that they do not constitute the majority of the papers published.

Dr. Karolina Nurzynska
Dr. Sebastian Iwaszenko
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing and understanding
  • image recognition
  • computer vision
  • multimodal image understanding
  • multispectral image understanding
  • visual attention
  • image denoising
  • knowledge discovery
  • image-understanding applications
  • audio processing and understanding
  • frequency analysis
  • wavelet analysis
  • speech recognition
  • signal denoising
  • signal decomposition
  • time-series analysis
  • feature discovery
  • pattern recognition
  • pattern identification

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (2 papers)


Research

20 pages, 8888 KiB  
Article
E2-VINS: An Event-Enhanced Visual–Inertial SLAM Scheme for Dynamic Environments
by Jiafeng Huang, Shengjie Zhao and Lin Zhang
Appl. Sci. 2025, 15(3), 1314; https://doi.org/10.3390/app15031314 - 27 Jan 2025
Viewed by 951
Abstract
Simultaneous Localization and Mapping (SLAM) technology has garnered significant interest in the robotic vision community over the past few decades. The rapid development of SLAM technology has resulted in its widespread application across various fields, including autonomous driving, robot navigation, and virtual reality. Although SLAM, especially Visual–Inertial SLAM (VI-SLAM), has made substantial progress, most classic algorithms in this field are designed on the assumption that the observed scene is static. In complex real-world environments, dynamic objects such as pedestrians and vehicles can seriously degrade the robustness and accuracy of such systems. Event cameras, built on recently introduced motion-sensitive biomimetic sensors, efficiently capture scene changes (referred to as “events”) with high temporal resolution, offering new opportunities to enhance VI-SLAM performance in dynamic environments. By integrating this innovative sensor, we propose the first event-enhanced Visual–Inertial SLAM framework specifically designed for dynamic environments, termed E2-VINS. Specifically, the system uses a visual–inertial alignment strategy to estimate IMU biases and correct the IMU measurements. The calibrated IMU measurements assist in motion compensation, achieving spatiotemporal alignment of the events. Event-based dynamicity metrics, which quantify the dynamicity of each pixel, are then computed from the aligned events. Based on these metrics, the visual residual terms of different pixels are adaptively assigned weights, namely, dynamicity weights. Subsequently, E2-VINS jointly and alternately optimizes the system state (camera poses and map points) and the dynamicity weights, effectively filtering out dynamic features through a soft-threshold mechanism. The scheme makes classic VI-SLAM robust to dynamic features, yielding an average improvement of 1.884% in mean position error over state-of-the-art methods. The superior performance of E2-VINS is validated through both qualitative and quantitative experimental results. To ensure that the results are fully reproducible, all relevant data and code have been released.
(This article belongs to the Special Issue Advances in Audio/Image Signals Processing)
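
The adaptive weighting described in the abstract can be illustrated with a minimal sketch, shown below. It assumes per-feature dynamicity scores in [0, 1] and scalar reprojection residuals; the function names, the logistic soft threshold, and all parameter values are illustrative assumptions, not the authors' released implementation.

    import numpy as np

    def soft_threshold_weights(dynamicity, tau=0.5, beta=10.0):
        # Hypothetical soft threshold: map per-feature dynamicity scores
        # in [0, 1] to weights in (0, 1), smoothly suppressing features
        # whose score exceeds tau rather than hard-rejecting them.
        return 1.0 / (1.0 + np.exp(beta * (dynamicity - tau)))

    def weighted_visual_cost(residuals, weights):
        # Dynamicity-weighted visual cost: features judged dynamic
        # contribute little to the pose/map optimization.
        return np.sum(weights * residuals ** 2)

    # Toy example: three features; the last lies on a moving object.
    residuals = np.array([0.1, 0.2, 3.0])      # reprojection errors
    dynamicity = np.array([0.05, 0.10, 0.90])  # event-based dynamicity
    w = soft_threshold_weights(dynamicity)
    print(weighted_visual_cost(residuals, w))  # dominated by static features

In the full system, such a cost would be minimized alternately with re-estimation of the dynamicity weights, in the spirit of the joint optimization the abstract describes.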

18 pages, 5732 KiB  
Article
AFT-SAM: Adaptive Fusion Transformer with a Sparse Attention Mechanism for Audio–Visual Speech Recognition
by Na Che, Yiming Zhu, Haiyan Wang, Xianwei Zeng and Qinsheng Du
Appl. Sci. 2025, 15(1), 199; https://doi.org/10.3390/app15010199 - 29 Dec 2024
Cited by 1 | Viewed by 1024
Abstract
Audio–visual speech recognition systems face serious information redundancy, complex inter-modal interactions, and difficult multimodal fusion when handling complex multimodal information. To address these problems, this paper proposes an adaptive fusion transformer algorithm with a sparse attention mechanism (AFT-SAM). The algorithm adopts a sparse attention mechanism during feature encoding to reduce excessive attention to unimportant regions, and it dynamically adjusts the attention weights through adaptive fusion to capture and integrate multimodal information more effectively while reducing the impact of redundant information on model performance. Experiments on the audio–visual speech recognition dataset LRS2 show that the proposed algorithm achieves significantly lower word error rates (WERs) than competing algorithms in the audio-only, visual-only, and audio–visual bimodal settings.
(This article belongs to the Special Issue Advances in Audio/Image Signals Processing)
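
As a rough illustration of the sparse attention idea, the sketch below keeps only the top-k attention scores per query and masks the rest before the softmax. This is one common variant, offered purely as an assumption-laden example; it is not the AFT-SAM implementation, and the paper's adaptive fusion step is omitted.

    import torch
    import torch.nn.functional as F

    def topk_sparse_attention(q, k, v, keep=4):
        # Scaled dot-product scores, as in a standard transformer.
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        # Keep only the top-k scores per query; the rest are set to -inf
        # so the softmax assigns them zero weight, reducing attention
        # paid to unimportant regions.
        kth = scores.topk(keep, dim=-1).values[..., -1:]
        scores = scores.masked_fill(scores < kth, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

    # Toy usage: a single sequence of 16 hypothetical audio/visual tokens.
    q = torch.randn(1, 16, 32)
    k = torch.randn(1, 16, 32)
    v = torch.randn(1, 16, 32)
    out = topk_sparse_attention(q, k, v)
    print(out.shape)  # torch.Size([1, 16, 32])

A fixed top-k is only one way to sparsify attention; the paper pairs its sparse attention with adaptive fusion of the audio and visual streams.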
