New Trends in Computer Vision and Image Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Electronic Multimedia".

Deadline for manuscript submissions: 10 June 2025 | Viewed by 3793

Special Issue Editors


Guest Editor
National Centre for Computer Animation, Bournemouth University, Bournemouth BH12 5BB, UK
Interests: geometric modeling; computer animation; computer graphics; image and point cloud-based shape reconstruction; machine learning; applications of ODEs and PDEs in geometric modeling and computer animation

Guest Editor
School of Creative and Digital Industries, Buckinghamshire New University, High Wycombe HP11 2JZ, UK
Interests: artificial intelligence; computer vision; deep learning; computer graphics

Guest Editor Assistant
Centre for Computer Animation, Bournemouth University, Bournemouth BH12 5BB, UK
Interests: 3D face reconstruction; face animation; audio to mesh

Special Issue Information

Dear Colleagues,

With advancements in artificial intelligence, especially artificial neural networks, many new computer vision and image processing techniques have been developed to improve or transform digital images and to analyze and interpret information in image and video data. This Special Issue on "New Trends in Computer Vision and Image Processing" aims to present new advancements and applications in these fields, advance the state of the art, and facilitate the translation of research findings into practical applications. It focuses on leveraging artificial intelligence techniques to enhance visual perception and understanding, and on addressing challenges and exploring opportunities in domains such as healthcare, autonomous vehicles, surveillance, and augmented reality. By bringing together contributions from researchers across academia and industry, it seeks to provide a platform for researchers to exchange ideas, showcase innovative approaches in computer vision and image processing, and promote the development of novel methodologies and algorithms for solving real-world problems.

This Special Issue entitled “New Trends in Computer Vision and Image Processing” invites original research and comprehensive reviews, including but not limited to the following topics:

  • Image classification and recognition;
  • Object detection and tracking;
  • Image segmentation and feature extraction;
  • Artificial intelligence and machine learning techniques for computer animation and image processing;
  • 3D vision and reconstruction;
  • Image enhancement and restoration techniques;
  • Image retrieval;
  • Biomedical image processing and analysis;
  • Applications of computer vision in robotics and automation;
  • Ethical considerations and societal impacts of computer vision technologies.

Prof. Dr. Lihua You
Dr. Shaojun Bian
Dr. Diqiong Jiang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts are available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • image processing
  • artificial intelligence
  • deep learning
  • object detection
  • image analysis
  • 3D vision and reconstruction

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)


Research

22 pages, 46829 KiB  
Article
Waveshift 2.0: An Improved Physics-Driven Data Augmentation Strategy in Fine-Grained Image Classification
by Gent Imeraj and Hitoshi Iyatomi
Electronics 2025, 14(9), 1735; https://doi.org/10.3390/electronics14091735 - 24 Apr 2025
Viewed by 109
Abstract
This paper presents Waveshift Augmentation 2.0 (WS 2.0), an enhanced version of the previously proposed Waveshift Augmentation (WS 1.0), a novel data augmentation technique inspired by light propagation dynamics in optical systems. While WS 1.0 introduced phase-based wavefront transformations under the assumption of an infinitesimally small aperture, WS 2.0 incorporates an additional aperture-dependent hyperparameter that models real-world optical attenuation. This refinement enables broader frequency modulation and greater diversity in image transformations while preserving compatibility with well-established data augmentation pipelines such as CLAHE, AugMix, and RandAugment. Evaluated across a wide range of tasks, including medical imaging, fine-grained object recognition, and grayscale image classification, WS 2.0 consistently outperformed both WS 1.0 and standard geometric augmentation. Notably, when benchmarked against geometric augmentation alone, it achieved average macro-F1 improvements of +1.48 (EfficientNetV2), +0.65 (ConvNeXt), and +0.73 (Swin Transformer), with gains of up to +9.32 points in medical datasets. These results demonstrate that WS 2.0 advances physics-based augmentation by enhancing generalization without sacrificing modularity or preprocessing efficiency, offering a scalable and realistic augmentation strategy for complex imaging domains. Full article
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)
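The paper's exact transform is not reproduced here; as a rough illustration of the core idea the abstract describes (a phase shift in the Fourier domain combined with an aperture-dependent attenuation of high frequencies), a minimal NumPy sketch with hypothetical parameter names might look like:

```python
import numpy as np

def waveshift_augment(img, distance=0.5, aperture=0.8):
    """Illustrative sketch (not the published WS 2.0 code): apply a
    propagation-style phase shift plus an aperture-dependent Gaussian
    attenuation to the image spectrum, then return the real part."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]          # vertical spatial frequencies
    fx = np.fft.fftfreq(w)[None, :]          # horizontal spatial frequencies
    r2 = fx**2 + fy**2                       # squared radial frequency
    phase = np.exp(-1j * np.pi * distance * r2)   # wavefront-style phase shift
    atten = np.exp(-r2 / (2 * aperture**2))       # aperture-dependent attenuation
    spectrum = np.fft.fft2(img) * phase * atten
    return np.real(np.fft.ifft2(spectrum))

aug = waveshift_augment(np.random.rand(32, 32))
```

Varying `distance` and `aperture` per sample would yield the kind of augmentation diversity the abstract attributes to the added hyperparameter.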

18 pages, 4891 KiB  
Article
A Lightweight Detection Model Without Convolutions for Complex Stacked Grasping Tasks
by Li Ren, Yiming Kang, Haitao Jia, Shaojiang Wang and Rui Zhong
Electronics 2025, 14(3), 437; https://doi.org/10.3390/electronics14030437 - 22 Jan 2025
Viewed by 647
Abstract
In complex environments with multi-object stacking, the spatial relationships between objects necessitate a sequential grasping strategy to ensure both the safety of the target objects and the efficiency of robotic arm operations. To address this challenge, this study introduces a Visual Manipulation Relationship Network (VMRN) to determine the optimal grasping sequence. Traditional VMRN frameworks typically rely on convolutional neural networks (CNNs) for feature extraction, which often struggle with high-frequency feature extraction, long-tail data distributions, and real-time computational demands in multi-object stacking scenarios. To overcome these limitations, we propose a lightweight, convolution-free Transformer-based feature extraction network integrated into the visual detection model. This model is specifically designed for visual reasoning, with a focus on lightweight optimization to enhance the extraction of features for stacked objects. The proposed network incorporates local window attention, global information aggregation and broadcasting, and a dual-dimensional attention-based feedforward network to improve feature representation. Additionally, a novel loss function is designed to address the performance degradation in detecting long-tail categories, effectively mitigating the over-suppression of rare objects in imbalanced datasets. Experimental results demonstrate that the proposed model significantly improves both detection accuracy and computational efficiency, making it particularly suitable for real-time robotic grasping tasks in complex environments due to its lightweight design. Full article
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)
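The local-window attention the abstract mentions restricts self-attention to small neighborhoods, which is what makes convolution-free backbones lightweight. A toy NumPy sketch of the mechanism (names and window size hypothetical, not the paper's implementation) over a 1D token sequence:

```python
import numpy as np

def window_attention(x, window=4):
    """Toy local window self-attention: tokens attend only to tokens
    inside their own non-overlapping window, so cost grows linearly
    with sequence length instead of quadratically."""
    n, d = x.shape
    out = np.empty_like(x)
    for start in range(0, n, window):
        w = x[start:start + window]                    # tokens in this window
        scores = w @ w.T / np.sqrt(d)                  # scaled dot-product
        scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=-1, keepdims=True)       # softmax per token
        out[start:start + window] = attn @ w           # weighted sum of values
    return out

y = window_attention(np.random.rand(16, 8))
```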

16 pages, 2489 KiB  
Article
A Method for Retina Segmentation by Means of U-Net Network
by Antonella Santone, Rosamaria De Vivo, Laura Recchia, Mario Cesarelli and Francesco Mercaldo
Electronics 2024, 13(22), 4340; https://doi.org/10.3390/electronics13224340 - 5 Nov 2024
Cited by 1 | Viewed by 1302
Abstract
Retinal image segmentation plays a critical role in diagnosing and monitoring ophthalmic diseases such as diabetic retinopathy and age-related macular degeneration. We propose a deep learning-based approach utilizing the U-Net network for the accurate and efficient segmentation of retinal images. U-Net, a convolutional neural network widely used for its performance in medical image segmentation, is employed to segment key retinal structures, including the optic disc and blood vessels. We evaluate the proposed model on a publicly available retinal image dataset, demonstrating its effectiveness in automatic retina segmentation. Our proposal provides a promising method for automated retinal image analysis, aiding in early disease detection and personalized treatment planning. Full article
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)
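Segmentation models like the U-Net used here are conventionally scored with overlap metrics; a self-contained sketch of the Dice coefficient (the paper's exact evaluation protocol is not given here) for binary retina masks:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice score for two binary segmentation masks:
    2*|A∩B| / (|A| + |B|), with eps guarding the empty-mask case."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((8, 8)); a[2:6, 2:6] = 1   # 4x4 predicted mask
b = np.zeros((8, 8)); b[3:7, 3:7] = 1   # 4x4 ground-truth mask, shifted
score = dice_coefficient(a, b)          # 2*9 / (16+16) = 0.5625
```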

23 pages, 76553 KiB  
Article
3DRecNet: A 3D Reconstruction Network with Dual Attention and Human-Inspired Memory
by Muhammad Awais Shoukat, Allah Bux Sargano, Lihua You and Zulfiqar Habib
Electronics 2024, 13(17), 3391; https://doi.org/10.3390/electronics13173391 - 26 Aug 2024
Viewed by 1118
Abstract
Humans inherently perceive 3D scenes using prior knowledge and visual perception, but 3D reconstruction in computer graphics is challenging due to complex object geometries, noisy backgrounds, and occlusions, leading to high time and space complexity. To address these challenges, this study introduces 3DRecNet, a compact 3D reconstruction architecture optimized for both efficiency and accuracy through five key modules. The first module, the Human-Inspired Memory Network (HIMNet), is designed for initial point cloud estimation, assisting in identifying and localizing objects in occluded and complex regions while preserving critical spatial information. Next, separate image and 3D encoders perform feature extraction from input images and initial point clouds. These features are combined using a dual attention-based feature fusion module, which emphasizes features from the image branch over those from the 3D encoding branch. This approach ensures independence from proposals at inference time and filters out irrelevant information, leading to more accurate and detailed reconstructions. Finally, a Decoder Branch transforms the fused features into a 3D representation. The integration of attention-based fusion with the memory network in 3DRecNet significantly enhances the overall reconstruction process. Experimental results on benchmark datasets such as ShapeNet, ObjectNet3D, and Pix3D demonstrate that 3DRecNet outperforms existing methods. Full article
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)
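The dual attention fusion step described above combines image-branch and 3D-branch features while biasing toward the image branch. A toy NumPy sketch of that idea (the weighting scheme and names are illustrative assumptions, not 3DRecNet's actual module):

```python
import numpy as np

def fuse_features(img_feat, pc_feat, image_bias=1.0):
    """Illustrative attention-style fusion: softmax-weight the two
    feature branches, with a fixed bias term favoring the image
    branch, then blend them into one fused vector."""
    scores = np.array([img_feat.mean() + image_bias, pc_feat.mean()])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over the two branches
    return weights[0] * img_feat + weights[1] * pc_feat

fused = fuse_features(np.random.rand(64), np.random.rand(64))
```

In the real architecture the weights would be learned per feature rather than fixed scalars; the sketch only shows the gating-and-blend structure.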
