Convolutional Neural Networks and Vision Applications, 4th Edition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 20 December 2025

Special Issue Information

Dear Colleagues,

Processing speed is critical for visual inspection automation and mobile visual computing applications. Many powerful and sophisticated computer vision algorithms produce accurate results but demand computational resources that make them poorly suited to real-time vision applications. Conversely, some vision algorithms and convolutional neural networks (CNNs) run at camera frame rates with moderately reduced accuracy, a tradeoff that is often preferable for real-time use. This Special Issue is reserved for research on the design, optimization, and implementation of machine-learning-based vision algorithms and convolutional neural networks suitable for real-time vision applications.

General topics covered in this Special Issue include the following:

  • Optimization of software-based vision algorithms;
  • CNN architecture optimizations for real-time performance;
  • CNN acceleration through approximate computing;
  • CNN applications that require real-time performance;
  • Tradeoff analysis between speed and accuracy in CNNs (see the benchmarking sketch after this list);
  • GPU-based implementations for real-time CNN performance;
  • FPGA-based implementations for real-time CNN performance;
  • Embedded vision systems for applications that require real-time performance;
  • Machine vision applications that require real-time performance.
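For the tradeoff analysis item above, a minimal benchmarking sketch in Python illustrates how single-frame inference latency is typically measured against a camera frame-rate budget. The model choice (MobileNetV3-Small), input size, and CPU-only timing are illustrative assumptions, not a prescribed protocol.

```python
import time

import torch
import torchvision.models as models

# Illustrative model: any classifier can be substituted here.
model = models.mobilenet_v3_small(weights=None).eval()
frame = torch.randn(1, 3, 224, 224)  # one 224x224 RGB frame

with torch.no_grad():
    # Warm-up runs so one-time initialization does not skew the timing.
    for _ in range(10):
        model(frame)

    n_runs = 100
    start = time.perf_counter()
    for _ in range(n_runs):
        model(frame)
    elapsed = time.perf_counter() - start

latency_ms = 1000 * elapsed / n_runs
print(f"mean latency: {latency_ms:.2f} ms (~{1000 / latency_ms:.1f} FPS)")
# A model meets a 30 FPS camera budget only if mean latency < 33.3 ms.
```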

Prof. Dr. D. J. Lee
Prof. Dr. Dong Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • convolutional neural networks
  • vision applications
  • CNN architecture

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (4)

Research

23 pages, 3506 KiB  
Article
Evaluation of Vision Transformers for Multi-Organ Tumor Classification Using MRI and CT Imaging
by Óscar A. Martín and Javier Sánchez
Electronics 2025, 14(15), 2976; https://doi.org/10.3390/electronics14152976 - 25 Jul 2025
Abstract
Neural networks have become the standard technique for medical diagnostics, especially in cancer detection and classification. This work evaluates the performance of Vision Transformer architectures, including the Swin Transformer and MaxViT, on several datasets of magnetic resonance imaging (MRI) and computed tomography (CT) scans. We used three training sets of images with brain, lung, and kidney tumors. Each dataset included different classification labels, from brain gliomas and meningiomas to benign and malignant lung conditions and kidney anomalies such as cysts and cancers. This work aims to analyze the behavior of the neural networks on each dataset and the benefits of combining different image modalities and tumor classes. We designed several experiments by fine-tuning the models on combined and individual datasets. The results revealed that the Swin Transformer achieved the highest accuracy, with an average of 99.0% on single datasets, reaching 99.43% on the combined dataset. This research highlights the adaptability of Transformer-based models to various human organs and image modalities. The main contribution lies in evaluating multiple ViT architectures across multi-organ tumor datasets, demonstrating their generalization to multi-organ classification. Integrating these models across diverse datasets could mark a significant advance in precision medicine, paving the way for more efficient healthcare solutions.
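As a rough illustration of the fine-tuning setup this abstract describes, the sketch below loads a pretrained Swin Transformer through the timm library and runs one training step; the model variant, four-class head, optimizer, and learning rate are assumptions, not the authors' reported configuration.

```python
import timm
import torch
from torch import nn

num_classes = 4  # assumed label count; the paper's datasets differ
model = timm.create_model(
    "swin_tiny_patch4_window7_224", pretrained=True, num_classes=num_classes
)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative step on a dummy batch standing in for MRI/CT slices.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```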

21 pages, 3250 KiB  
Article
Deploying Optimized Deep Vision Models for Eyeglasses Detection on Low-Power Platforms
by Henrikas Giedra, Tomyslav Sledevič and Dalius Matuzevičius
Electronics 2025, 14(14), 2796; https://doi.org/10.3390/electronics14142796 - 11 Jul 2025
Abstract
This research addresses the optimization and deployment of convolutional neural networks for eyeglasses detection on low-power edge devices. Multiple convolutional neural network architectures were trained and evaluated using the FFHQ dataset, which contains annotated eyeglasses in the context of faces with diverse facial features and eyewear styles. Several post-training quantization techniques, including Float16, dynamic range, and full integer quantization, were applied to reduce model size and computational demand while preserving detection accuracy. The impact of model architecture and quantization methods on detection accuracy and inference latency was systematically evaluated. The optimized models were deployed and benchmarked on Raspberry Pi 5 and NVIDIA Jetson Orin Nano platforms. Experimental results show that full integer quantization reduces model size by up to 75% while maintaining competitive detection accuracy. Among the evaluated models, MobileNet architectures achieved the most favorable balance between inference speed and accuracy, demonstrating their suitability for real-time eyeglasses detection in resource-constrained environments. These findings enable efficient on-device eyeglasses detection, supporting applications such as virtual try-ons and IoT-based facial analysis systems.
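The Float16, dynamic range, and full integer options named in this abstract match the TensorFlow Lite post-training quantization modes, so a minimal sketch of the full integer path is given below; the Keras model, the dummy calibration data, and the assumption that TensorFlow Lite was the deployment toolchain are mine, not details taken from the paper.

```python
import numpy as np
import tensorflow as tf

# Stand-in model and calibration images; any Keras network works here.
model = tf.keras.applications.MobileNetV2(weights=None, classes=2)
calibration = np.random.rand(32, 224, 224, 3).astype(np.float32)

def representative_dataset():
    # Yield sample inputs so the converter can calibrate activation ranges.
    for image in calibration:
        yield [image[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to integer-only kernels, as many edge accelerators require.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("detector_int8.tflite", "wb") as f:
    f.write(converter.convert())
```

Omitting the representative dataset and the integer-only settings yields dynamic range quantization, while setting `converter.target_spec.supported_types = [tf.float16]` instead gives the Float16 variant.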

24 pages, 985 KiB  
Article
Attention-Based Deep Feature Aggregation Network for Skin Lesion Classification
by Siddiqui Muhammad Yasir and Hyun Kim
Electronics 2025, 14(12), 2364; https://doi.org/10.3390/electronics14122364 - 9 Jun 2025
Abstract
Early and accurate detection of dermatological conditions, particularly melanoma, is critical for effective treatment and improved patient outcomes. In medical image analysis, misclassifications may lead to delayed diagnosis, disease progression, and severe complications. Hence, robust and reliable classification techniques are essential to enhance diagnostic precision in clinical practice. This study presents a deep learning-based framework designed to improve feature representation while maintaining computational efficiency. The proposed architecture integrates multi-level feature aggregation with a squeeze-and-excitation attention mechanism to effectively extract salient patterns from dermoscopic images. The model is rigorously evaluated on five publicly available benchmark datasets (ISIC-2019, ISIC-2020, SKINL2, MED-NODE, and HAM10000), covering a diverse spectrum of dermatological disorders. Experimental results demonstrate that the proposed method consistently outperforms existing approaches in classification performance, achieving accuracy rates of 94.41% and 97.45% on the MED-NODE and HAM10000 datasets, respectively. These results underscore the method's potential for real-world deployment in automated skin lesion analysis and clinical decision support.
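The squeeze-and-excitation attention named in this abstract is a standard channel-attention block; the PyTorch sketch below shows its usual form (global pooling, a bottleneck MLP, channel-wise rescaling). The reduction ratio of 16 is the conventional default, assumed here rather than taken from the paper.

```python
import torch
from torch import nn

class SqueezeExcitation(nn.Module):
    """Standard squeeze-and-excitation channel-attention block."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # "squeeze" to 1 value/channel
        self.fc = nn.Sequential(             # "excitation" bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gates  # reweight each channel by its learned gate

# Usage: insert after any convolutional stage of the backbone.
features = torch.randn(2, 64, 56, 56)
out = SqueezeExcitation(channels=64)(features)  # same shape, reweighted
```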

25 pages, 9742 KiB  
Article
Autism Spectrum Disorder Detection Using Skeleton-Based Body Movement Analysis via Dual-Stream Deep Learning
by Jungpil Shin, Abu Saleh Musa Miah, Manato Kakizaki, Najmul Hassan and Yoichi Tomioka
Electronics 2025, 14(11), 2231; https://doi.org/10.3390/electronics14112231 - 30 May 2025
Abstract
Autism Spectrum Disorder (ASD) poses significant challenges in diagnosis due to its diverse symptomatology and the complexity of early detection. Atypical gait and gesture patterns, prominent behavioural markers of ASD, hold immense potential for facilitating early intervention and optimising treatment outcomes. These patterns can be efficiently and non-intrusively captured using modern computational techniques, making them valuable for ASD recognition. A variety of deep learning approaches have been explored for ASD detection, including facial feature analysis, eye gaze analysis, and movement and gesture analysis. In this study, we optimise a dual-stream architecture that combines image classification and skeleton recognition models to analyse video data for body motion analysis. The first stream processes Skepxels (spatial representations derived from skeleton data) using ConvNeXt-Base, a robust image recognition model that efficiently captures aggregated spatial embeddings. The second stream encodes angular features, embedding relative joint angles into the skeleton sequence and extracting spatiotemporal dynamics using the Multi-Scale Graph 3D Convolutional Network (MSG3D), a combination of Graph Convolutional Networks (GCNs) and Temporal Convolutional Networks (TCNs). We replace the ViT model from the original architecture with ConvNeXt-Base to evaluate the efficacy of CNN-based models in capturing gesture-related features for ASD detection. Additionally, we experimented with a Stack Transformer in the second stream instead of MSG3D but found that it resulted in lower accuracy, highlighting the importance of GCN-based models for motion analysis. The integration of these two streams ensures comprehensive feature extraction, capturing both global and detailed motion patterns. A pairwise Euclidean distance loss is employed during training to enhance the consistency and robustness of feature representations. Our experiments demonstrate that the two-stream approach, combining ConvNeXt-Base and MSG3D, offers a promising method for effective autism detection. This approach not only enhances accuracy but also contributes valuable insights into optimising deep learning models for gesture-based recognition. By integrating image classification and skeleton recognition, we can better capture both global and detailed motion patterns, which are crucial for improving early ASD diagnosis and intervention strategies.
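The pairwise Euclidean distance loss that ties the two streams together can be sketched as below; this minimal PyTorch illustration assumes the loss penalizes the distance between the two streams' embeddings of the same clip, and the projection heads, feature dimensions, and loss weighting are hypothetical rather than taken from the paper.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Assumed feature sizes: image stream (e.g., ConvNeXt-Base) and
# skeleton stream (e.g., MSG3D), projected into a shared space.
image_dim, skeleton_dim, embed_dim = 1024, 384, 256
image_head = nn.Linear(image_dim, embed_dim)
skeleton_head = nn.Linear(skeleton_dim, embed_dim)

# Dummy per-clip features from each stream for a batch of 8 clips.
z_img = image_head(torch.randn(8, image_dim))
z_skel = skeleton_head(torch.randn(8, skeleton_dim))

# Consistency term: Euclidean distance between the two embeddings of
# the same clip, averaged over the batch.
consistency_loss = F.pairwise_distance(z_img, z_skel).mean()

# In training this would be combined with the classification loss,
# e.g. total = cls_loss + lambda_weight * consistency_loss (assumed).
print(consistency_loss.item())
```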
