Recent Progress in Visual AI: Architectures, Learning, and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 30 September 2025 | Viewed by 3394

Special Issue Editors


Guest Editor
Cyber Security Research Centre, Nanyang Technological University, Singapore, Singapore
Interests: scene understanding and generation; multimodal representation; human-centered visual understanding

Guest Editor
National Heart and Lung Institute, Imperial College London, South Kensington, London SW7 2AZ, UK
Interests: medical image analysis; multimodal information fusion; data synthesis; data harmonisation

Guest Editor
School of Information Technology, Halmstad University, Halmstad, Sweden
Interests: artificial intelligence; quantum machine learning; neural networks; graph neural networks; federated learning; reinforcement learning; cognitive science; healthcare; bioinformatics; IoT

Guest Editor
School of Computing and Data Engineering, NingboTech University, Ningbo 315100, China
Interests: image/video compression algorithms; intelligent image/video analysis

Guest Editor
Cyber Security Research Centre, Nanyang Technological University, Singapore, Singapore
Interests: computer vision; trajectory prediction

Special Issue Information

Dear Colleagues,

With the rapid development of artificial intelligence (AI) and computer vision, novel model architectures and multimodal representation learning have become research hotspots. These new technologies not only advance visual tasks, such as segmentation, detection, and re-identification, but also show great potential in applications like autonomous driving, robotics, and medical imaging. We are curating this Special Issue to gather and showcase the latest research achievements in the field of visual AI and to explore further directions and technological innovations.

This Special Issue aims to collect innovative and emerging research findings covering a wide range of topics, from foundational model design to multimodal representation learning and specific applications. We hope that this call for papers will promote communication and co-operation between academia and industry and further drive development in this field.

We welcome submissions on topics including, but not limited to, the following:

Novel Model Architecture Design

  • Applications of CNNs/Transformers in visual tasks;
  • Visual state-space models (SSMs);
  • Mamba models;
  • Kolmogorov–Arnold networks (KANs).

Multimodal Representation Learning

  • Joint understanding and generation of multimodal data;
  • Cross-modal representation and alignment;
  • The CLIP model and its extensions;
  • Few-shot and zero-shot learning for multimodal data.

General Vision Models

  • Universal segmentation models;
  • Universal detection models;
  • Universal depth-estimation models.

Human-Centric Tasks

  • Pose estimation;
  • Human parsing;
  • Human detection;
  • Human segmentation;
  • Person re-identification.

Scene-Centric Tasks

  • Indoor and outdoor scene segmentation;
  • Scene detection;
  • Scene depth estimation.

3D Vision

  • Point-cloud understanding;
  • Multi-view processing;
  • RGB-D processing;
  • 3D reconstruction and generation.

Integration of Large Language Models with Vision Tasks

  • Applications of large language models in vision tasks;
  • Fusion and innovation in vision–language

Application Domains

  • Embodied intelligence;
  • Autonomous driving;
  • Robotic vision;
  • Remote sensing image analysis;
  • Medical image analysis.

Dr. Changshuo Wang
Dr. Guang Yang
Dr. Prayag Tiwari
Dr. Gang Wang
Dr. Ruiping Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • visual AI
  • pattern recognition
  • transformer models
  • multimodal representation learning
  • 3D vision
  • autonomous driving

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information is available in MDPI's Special Issue policies.

Published Papers (1 paper)

Research

15 pages, 393 KiB  
Article
Facial Age Estimation Using Multi-Stage Deep Neural Networks
by Salah Eddine Bekhouche, Azeddine Benlamoudi, Fadi Dornaika, Hichem Telli and Yazid Bounab
Electronics 2024, 13(16), 3259; https://doi.org/10.3390/electronics13163259 - 16 Aug 2024
Cited by 2 | Viewed by 3342
Abstract
Over the last decade, the world has witnessed many breakthroughs in artificial intelligence, largely due to advances in deep learning technology. Notably, computer vision solutions have significantly contributed to these achievements. Human face analysis, a core area of computer vision, has gained considerable attention due to its wide applicability in fields such as law enforcement, social media, and marketing. However, existing methods for facial age estimation often struggle with accuracy due to limited feature extraction capabilities and inefficiencies in learning hierarchical representations. This paper introduces a novel framework to address these issues by proposing a Multi-Stage Deep Neural Network (MSDNN) architecture. The MSDNN architecture divides each CNN backbone into multiple stages, enabling more comprehensive feature extraction, thereby improving the accuracy of age predictions from facial images. Our framework demonstrates a significant performance improvement over traditional solutions, with its effectiveness validated through comparisons with the EfficientNet and MobileNetV3 architectures. The proposed MSDNN architecture achieves a notable decrease in Mean Absolute Error (MAE) across three widely used public datasets (MORPH2, CACD, and AFAD) while maintaining a virtually identical parameter count compared to the initial backbone architectures. These results underscore the effectiveness and feasibility of our methodology in advancing the field of age estimation, showcasing it as a robust solution for enhancing the accuracy of age prediction algorithms.