Real-Time Audio, Video and Image Processing: Latest Advances and Prospects

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 October 2025 | Viewed by 2575

Special Issue Editors


Guest Editor
Department of Computer Science, University of Iowa, Iowa City, IA 52242, USA
Interests: AI safety and multi-modal learning

Guest Editor
Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
Interests: computer vision; machine learning; deep learning; smart infrastructure; intelligent transportation

Special Issue Information

Dear Colleagues,

Rapid advances in the real-time processing of audio, image, and video data are driving significant progress across industries, in applications ranging from autonomous vehicles and smart cities to healthcare and entertainment. As these technologies evolve, they play an increasingly critical role in next-generation systems that require ultra-low latency, high reliability, and efficient processing.

This Special Issue aims to explore the latest advances and future prospects in real-time audio, image, and video processing. We invite original research articles, comprehensive reviews, and case studies that address the challenges of, propose solutions for, and demonstrate applications of these technologies. We are particularly interested in contributions that highlight innovative approaches and cutting-edge techniques in this fast-evolving field.

Topics of interest include, but are not limited to:

  • Real-time audio and speech processing techniques;
  • Real-time image and video processing algorithms;
  • Applications of AI and machine learning in real-time processing;
  • 5G/6G-enabled real-time communication systems;
  • Edge computing for real-time audio, image, and video processing;
  • IoT and wearable devices for multimedia applications;
  • Low-latency streaming and broadcasting technologies;
  • Data compression and transmission;
  • Security and privacy in multimedia processing;
  • Applications in autonomous vehicles, smart cities, and healthcare.

Dr. Muchao Ye
Dr. Pan He
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • real-time processing
  • artificial intelligence
  • machine learning
  • computer vision

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information is available on MDPI's Special Issue policies page.

Published Papers (3 papers)


Research

17 pages, 7350 KiB  
Article
Lightweight Network for Spoof Fingerprint Detection by Attention-Aggregated Receptive Field-Wise Feature
by Md Al Amin, Naim Reza and Ho Yub Jung
Electronics 2025, 14(9), 1823; https://doi.org/10.3390/electronics14091823 - 29 Apr 2025
Viewed by 258
Abstract
The spread of biometric systems utilizing fingerprints has increased the need for advanced spoof detection techniques, but training convolutional neural networks (CNNs) with the limited number of images available in fingerprint datasets poses significant challenges. In this paper, we propose a lightweight network architecture which addresses the challenges inherent in small fingerprint datasets by employing a moderately deep network architecture which is sufficient for extracting essential features from fingerprint images. We apply a hyperbolic tangent activation to the final feature map, which has features from local receptive fields, and average the responses into a single value. Thus, our architecture reduces overfitting by increasing the number of effective labels during training. Additionally, the incorporation of the spatial attention module enhances feature representation, culminating in improved accuracy. The evaluation results show that the proposed model, with only 0.14 million parameters, outperforms existing techniques including lightweight models and transfer-learning-based models, achieving superior average test accuracies of 98.30% and 95.57% on the LivDet-2015 and -2017 datasets, respectively. It also delivers state-of-the-art cross-material performance, with corresponding average classification error values of 0.81% and 1.91%, making it highly effective for on-device fingerprint authentication.
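The aggregation step mentioned in this abstract (a hyperbolic tangent applied to the final feature map, followed by averaging into a single value) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_patch_scores`, the example grid values, and the rule that a positive score means "live" are all assumptions made for illustration.

```python
import math
from typing import List


def aggregate_patch_scores(feature_map: List[List[float]]) -> float:
    """Apply tanh to every cell of a 2D map, then average into one score.

    Each cell is assumed to hold the raw response of one local receptive
    field, so the result lies strictly between -1 and 1.
    """
    activated = [math.tanh(value) for row in feature_map for value in row]
    if not activated:
        raise ValueError("feature_map must contain at least one value")
    return sum(activated) / len(activated)


# Hypothetical 2x3 final feature map (values are made up).
feature_map = [[2.0, 1.5, -0.3], [0.8, 2.2, 1.1]]
score = aggregate_patch_scores(feature_map)

# Assumed decision rule: positive average means "live", otherwise "spoof".
label = "live" if score > 0.0 else "spoof"
print(f"score={score:.3f} label={label}")
```

Because every cell contributes to the average, each local receptive field is effectively supervised by the image-level label, which is one way to read the abstract's remark about increasing the number of effective labels.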

34 pages, 122053 KiB  
Article
Development of a Virtual Environment for Rapid Generation of Synthetic Training Images for Artificial Intelligence Object Recognition
by Chenyu Wang, Lawrence Tinsley and Barmak Honarvar Shakibaei Asli
Electronics 2024, 13(23), 4740; https://doi.org/10.3390/electronics13234740 - 29 Nov 2024
Cited by 1 | Viewed by 942
Abstract
In the field of machine learning and computer vision, the lack of annotated datasets is a major challenge for model development and accuracy improvement. Synthetic data generation addresses this issue by providing large, diverse, and accurately annotated datasets, thereby enhancing model training and validation. This study presents a Unity-based virtual environment that utilises the Unity Perception package to generate high-quality datasets. First, high-precision 3D (Three-Dimensional) models are created using a 3D structured light scanner, with textures processed to remove specular reflections. These models are then imported into Unity to generate diverse and accurately annotated synthetic datasets. The experimental results indicate that object recognition models trained with synthetic data achieve a high rate of performance on real images, validating the effectiveness of synthetic data in improving model generalisation and application performance. Monocular distance measurement verification shows that the synthetic data closely matches real-world physical scales, confirming its visual realism and physical accuracy.

17 pages, 4207 KiB  
Article
Deep Multi-Similarity Hashing with Spatial-Enhanced Learning for Remote Sensing Image Retrieval
by Huihui Zhang, Qibing Qin, Meiling Ge and Jianyong Huang
Electronics 2024, 13(22), 4520; https://doi.org/10.3390/electronics13224520 - 18 Nov 2024
Viewed by 1011
Abstract
Remote sensing image retrieval (RSIR) plays a crucial role in remote sensing applications, focusing on retrieving a collection of items that closely match a specified query image. Due to the advantages of low storage cost and fast search speed, deep hashing has been one of the most active research problems in remote sensing image retrieval. However, remote sensing images contain many content-irrelevant backgrounds or noises, and they often lack the ability to capture essential fine-grained features. In addition, existing hash learning often relies on random sampling or semi-hard negative mining strategies to form training batches, which could be overwhelmed by some redundant pairs that slow down the model convergence and compromise the retrieval performance. To solve these problems effectively, a novel Deep Multi-similarity Hashing with Spatial-enhanced Learning, termed DMsH-SL, is proposed to learn compact yet discriminative binary descriptors for remote sensing image retrieval. Specifically, to suppress interfering information and accurately localize the target location, by introducing a spatial enhancement learning mechanism, the spatial group-enhanced hierarchical network is firstly designed to learn the spatial distribution of different semantic sub-features, capturing the noise-robust semantic embedding representation. Furthermore, to fully explore the similarity relationships of data points in the embedding space, the multi-similarity loss is proposed to construct informative and representative training batches, which is based on pairwise mining and weighting to compute the self-similarity and relative similarity of the image pairs, effectively mitigating the effects of redundant and unbalanced pairs. Experimental results on three benchmark datasets validate the superior performance of our approach.
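The low storage cost and fast search that this abstract attributes to hashing come from comparing short binary codes by Hamming distance. The sketch below shows only that generic retrieval step. It does not implement DMsH-SL, its network, or the multi-similarity loss. The sign-threshold binarization, the function names, and the four-dimensional example embeddings are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def binarize(embedding: Sequence[float]) -> Tuple[int, ...]:
    """Turn a real-valued embedding into a binary code by thresholding at 0."""
    return tuple(1 if value >= 0.0 else 0 for value in embedding)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def retrieve(query: Sequence[float],
             database: List[Sequence[float]],
             top_k: int) -> List[int]:
    """Return indices of the top_k database items closest to the query."""
    query_code = binarize(query)
    ranked = sorted(
        range(len(database)),
        key=lambda i: hamming(query_code, binarize(database[i])),
    )
    return ranked[:top_k]


# Hypothetical embeddings; a real system would get these from a trained model.
database = [
    [0.9, -0.2, 0.4, -0.7],   # code 1010
    [-0.5, 0.3, -0.1, 0.8],   # code 0101
    [0.7, -0.6, -0.2, -0.9],  # code 1000
]
query = [0.8, -0.1, 0.5, -0.3]  # code 1010
top = retrieve(query, database, top_k=2)
print(top)
```

In practice the database codes would be computed once and stored as packed bits, so each comparison reduces to an XOR and a bit count.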
