Journal of Imaging

AI-Driven Robot Vision: Progress, Challenges, and Perspectives

Editor

Dr. Deping Li

Guest Editor

School of Intelligent Systems Science and Engineering, Jinan University, Zhuhai 519070, China
Interests: pattern recognition; machine vision; grasping; point cloud registration

Special Issue Information

Dear Colleagues,

Robot vision serves as a fundamental technology enabling intelligent robots to perceive, understand, and interact with their environments. In recent years, the rapid evolution of artificial intelligence, particularly the rise of Convolutional Neural Networks (CNNs) and Vision Transformers, has significantly enhanced the precision, speed, and efficiency of visual understanding systems. These advancements have driven remarkable progress in core tasks such as semantic segmentation, 3D machine vision, and complex pattern recognition.

Furthermore, the integration of multimodal data (e.g., RGB, depth, point clouds, and tactile information) and the rapid development of Embodied AI have empowered robots with unprecedented capabilities. Modern visual systems can now achieve robust object detection and tracking, comprehensive scene understanding, and precise robotic grasping across highly diverse and uncontrolled environments.

Despite these remarkable breakthroughs, the field continues to face critical technical challenges. Persistent issues include ensuring robustness in dynamic environments, handling severe occlusions or perceiving challenging targets, and achieving real-time processing efficiency.

This Special Issue focuses on cutting-edge research in robot vision, covering key areas including visual SLAM, object detection and tracking, scene understanding, visual servoing, and human–robot interaction. We also address technical challenges in dynamic environments, occlusion handling, and real-time processing requirements. Original research on novel sensor fusion, end-to-end learning, applications of neural radiance fields, and embodied AI vision is particularly welcome. This Special Issue aims to promote innovative applications of robot vision in industrial manufacturing, autonomous driving, medical surgery, and service robotics, while exploring future development trends.

Research areas may include, but are not limited to, the following:

Image and video analysis, pattern recognition, and semantic segmentation in complex scenarios;
Three-dimensional machine vision, point cloud registration, and processing;
Self-supervised, semi-supervised, and few-shot learning;
Generative models and foundation models for vision tasks;
Pose estimation;
Depth estimation;
Object detection, tracking, and perception of challenging targets;
Visual SLAM, scene reconstruction, and environment understanding;
Vision-driven robotic grasping, manipulation, and visual servoing;
Multimodal and cross-domain representation learning for robot vision;
Embodied AI and human–robot interaction.

Ultimately, this Special Issue seeks to bridge the critical gap between advanced visual perception and autonomous robotic action. By assembling these diverse research efforts, we aim to pave the way for robust, adaptable, and highly intelligent robotic systems capable of thriving in unconstrained real-world environments.

Dr. Deping Li
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-anonymized peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.

Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.

Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.

External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.

Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

Research

21 pages, 2332 KB

Open Access Article

GCA-Trans: Global Context-Aware Transformer for Robust Transparent Object Segmentation in Robotic Environments

by Deping Li, Zujian Dong, Zilong Yang, Ka-Kui Li and Yushen Huang

J. Imaging 2026, 12(5), 212; https://doi.org/10.3390/jimaging12050212 - 16 May 2026

Viewed by 716

Abstract

Transparent object segmentation plays a critical role in indoor and outdoor scene understanding, particularly driven by the rapid advancements in autonomous driving and robotics. However, this task presents significant challenges due to the lack of distinct texture and chromatic features in transparent objects, causing their appearance to blend into the background. Existing methods face inherent architectural limitations: CNNs are restricted by limited receptive fields, while Transformer-based methods may inadvertently suppress the weak feature details of transparent surfaces due to the inherent low-pass filtering property of self-attention mechanisms, treating them as background noise. Consequently, these approaches struggle to consistently segment transparent objects across diverse scales, failing to preserve both fine details and large-scale structures. To address these limitations, we propose the Global Context-Aware Transformer (GCA-Trans). Specifically, we design a Multi-scale Context Mining (MCM) module that leverages parallel dilated convolutions with varying receptive fields to simultaneously extract features at multiple scales. This design allows the model to capture and fuse fine-grained local details (e.g., edges and textures) with coarse-grained global spatial context (e.g., overall object shapes), ensuring robust segmentation performance for transparent objects of varying scales. Extensive experiments on four benchmark datasets demonstrate that GCA-Trans sets a new state of the art, achieving significant improvements of 2.53% mIoU on Trans10K-v2, 2.1% IoU on RGB-D GSD, 2.2% IoU on GDD, and 1.9% IoU on GSD, validating the effectiveness and robustness of our approach.

(This article belongs to the Special Issue AI-Driven Robot Vision: Progress, Challenges, and Perspectives)
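
The abstract above describes parallel dilated convolutions with varying receptive fields. As a rough illustration of that general idea only, the toy sketch below runs one 1D kernel at several dilation rates and averages the branches. It is not the authors' MCM module: the function names, the dilation rates (1, 2, 4), the odd-sized kernel, and the averaging fusion are all assumptions made for this example.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in samples) covered by a kernel with the given dilation."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 'same' 1D cross-correlation with a dilated kernel.

    Assumes an odd kernel length so the output stays centered on the input.
    """
    k = len(kernel)
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_features(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel dilated branches and fuse them.

    Fusion by element-wise averaging is an assumption for illustration;
    the abstract does not specify how the branches are fused.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    kernel = [1.0, 1.0, 1.0]
    for d in (1, 2, 4):
        print(d, effective_kernel_size(len(kernel), d), dilated_conv1d(impulse, kernel, d))
    print(multi_scale_features(impulse, kernel))
```

With a kernel of length 3, the three branches span 3, 5, and 9 samples respectively, so the small dilation responds to local detail while the large one reaches wider context without adding kernel weights.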
