
Artificial Intelligence in Computer Vision: Methods and Applications, 2nd Edition

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 20 December 2025 | Viewed by 20490

Special Issue Editors


Dr. Zhaoyang Wang
Guest Editor
Department of Mechanical Engineering, The Catholic University of America, Washington, DC 20064, USA
Interests: optics; mechanics; robotics; computer vision

Dr. Minh P. Vo
Guest Editor
Spree3D, Alameda, CA 94502, USA
Interests: computer vision; computational photography; machine learning

Dr. Hieu Nguyen
Guest Editor
Neuroimaging Research Branch, National Institute on Drug Abuse, National Institutes of Health, Baltimore, MD 21224, USA
Interests: computer vision; machine learning; deep learning; computer hardware; neuroimaging

Dr. John Hyatt
Guest Editor
U.S. Army Research Laboratory, 2201 Aberdeen Boulevard, Aberdeen, MD 21005, USA
Interests: machine learning

Special Issue Information

Dear Colleagues,

In recent years, there has been intense interest in the research and development of artificial intelligence techniques. At the same time, computer vision methods have been enhanced and extended to encompass a remarkable range of novel sensors and measurement systems. As artificial intelligence spreads across almost all fields of science and engineering, computer vision remains one of its primary application areas. Notably, incorporating artificial intelligence into computer vision-based sensing and measurement techniques has enabled unprecedented performance in tasks such as high-accuracy object detection, image segmentation, human pose estimation, and real-time 3D sensing, which cannot be achieved with conventional methods.

This Special Issue aims to cover recent advancements in computer vision that involve the use of artificial intelligence methods, with a particular focus on sensors and sensing. Both original research and review articles are welcome. Typical topics include, but are not limited to, the following:

  • Physical, chemical, biological, and healthcare sensors and sensing techniques with deep learning approaches;
  • Localization, mapping, and navigation techniques with artificial intelligence;
  • Artificial intelligence-based recognition of objects, scenes, actions, faces, gestures, expressions, and emotions, as well as object relations and interactions;
  • 3D imaging and sensing with deep learning schemes;
  • Accurate learning with simulation datasets or with a small number of training labels for sensors and sensing;
  • Supervised and unsupervised learning for sensors and sensing;
  • Broad computer vision methods and applications that involve using deep learning or artificial intelligence.

Dr. Zhaoyang Wang
Dr. Minh P. Vo
Dr. Hieu Nguyen
Dr. John Hyatt
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • deep learning
  • computer vision
  • smart sensors
  • intelligent sensing
  • 3D imaging and sensing
  • localization and mapping
  • navigation and positioning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.


Published Papers (11 papers)


Editorial


17 pages, 11202 KB  
Editorial
AI-Powered Visual Sensors and Sensing: Where We Are and Where We Are Going
by Hieu Nguyen, Minh Vo, John Hyatt and Zhaoyang Wang
Sensors 2025, 25(6), 1758; https://doi.org/10.3390/s25061758 - 12 Mar 2025
Cited by 1 | Viewed by 2947
Abstract
Deep learning, a machine learning method that mimics the neural network structures of the human brain to process data, recognize patterns, and make decisions, traces its origins back to the 1950s [...] Full article

Research


21 pages, 9540 KB  
Article
Ghost-Free HDR Imaging in Dynamic Scenes via High–Low-Frequency Decomposition
by Xiang Zhang, Genggeng Chen, Fan Zhang and Yongzhong Zhang
Sensors 2025, 25(22), 7013; https://doi.org/10.3390/s25227013 - 17 Nov 2025
Viewed by 417
Abstract
Generating high-quality high-dynamic-range (HDR) images in dynamic scenes remains a challenging task. Recently, Transformers have been introduced into HDR imaging and have demonstrated superior performance over traditional convolutional neural networks (CNNs) in handling large-scale motion. However, due to the low-pass filtering nature of self-attention, Transformers tend to weaken the capture of high-frequency information, which impairs the recovery of structural details. In addition, their high computational complexity limits practical applications. To address these issues, we propose HL-HDR, a high–low-frequency-aware ghost-free HDR reconstruction network for dynamic scenes. By decomposing features into high- and low-frequency components, HL-HDR effectively overcomes the limitations of existing Transformer and CNN-based methods. The Frequency Alignment Module (FAM) captures large-scale motion in the low-frequency branch while refining local details in the high-frequency branch. The Frequency Decomposition Processing Block (FDPB) fuses local high-frequency details and global low-frequency context, enabling precise HDR reconstruction. Extensive experiments on five public HDR datasets demonstrate that HL-HDR consistently outperforms state-of-the-art methods in both quantitative metrics and qualitative evaluation. Full article
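To make the frequency-decomposition idea concrete, the snippet below is a minimal sketch (not the authors' code) of splitting a feature map into low- and high-frequency components, assuming a simple average-pooling low-pass filter; the paper's FAM and FDPB modules build far richer processing on top of such a split.

```python
import torch
import torch.nn.functional as F

def split_frequencies(feat: torch.Tensor, pool: int = 4):
    """Split a feature map into low- and high-frequency components.

    A blurred copy (average-pooled, then upsampled) serves as the
    low-frequency part; the residual carries the high-frequency detail.
    """
    low = F.avg_pool2d(feat, kernel_size=pool)                    # crude low-pass filter
    low = F.interpolate(low, size=feat.shape[-2:],
                        mode="bilinear", align_corners=False)     # back to input size
    high = feat - low                                             # residual = fine detail
    return low, high

# toy usage on a batch of exposure-aligned feature maps
feat = torch.randn(2, 64, 128, 128)
low, high = split_frequencies(feat)
print(low.shape, high.shape)  # torch.Size([2, 64, 128, 128]) twice
```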

26 pages, 20666 KB  
Article
DRC2-Net: A Context-Aware and Geometry-Adaptive Network for Lightweight SAR Ship Detection
by Abdelrahman Yehia, Naser El-Sheimy, Ashraf Helmy, Ibrahim Sh. Sanad and Mohamed Hanafy
Sensors 2025, 25(22), 6837; https://doi.org/10.3390/s25226837 - 8 Nov 2025
Viewed by 376
Abstract
Synthetic Aperture Radar (SAR) ship detection remains challenging due to background clutter, target sparsity, and fragmented or partially occluded ships, particularly at small scales. To address these issues, we propose the Deformable Recurrent Criss-Cross Attention Network (DRC2-Net), a lightweight and efficient detection framework built upon the YOLOX-Tiny architecture. The model incorporates two SAR-specific modules: a Recurrent Criss-Cross Attention (RCCA) module to enhance contextual awareness and reduce false positives and a Deformable Convolutional Networks v2 (DCNv2) module to capture geometric deformations and scale variations adaptively. These modules expand the Effective Receptive Field (ERF) and improve feature adaptability under complex conditions. DRC2-Net is trained on the SSDD and iVision-MRSSD datasets, encompassing highly diverse SAR imagery including inshore and offshore scenes, variable sea states, and complex coastal backgrounds. The model maintains a compact architecture with 5.05 M parameters, ensuring strong generalization and real-time applicability. On the SSDD dataset, it outperforms the YOLOX-Tiny baseline with AP@50 of 93.04% (+0.9%), APs of 91.15% (+1.31%), APm of 88.30% (+1.22%), and APl of 89.47% (+13.32%). On the more challenging iVision-MRSSD dataset, it further demonstrates improved scale-aware detection, achieving higher AP across small, medium, and large targets. These results confirm the effectiveness and robustness of DRC2-Net for multi-scale ship detection in complex SAR environments, consistently surpassing state-of-the-art detectors. Full article
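As an illustration of the DCNv2 component mentioned above, here is a hedged sketch of a modulated deformable 3x3 block built from torchvision's DeformConv2d (assuming torchvision >= 0.12, where the modulation mask is supported); it is not the authors' DRC2-Net implementation, which combines this idea with recurrent criss-cross attention inside YOLOX-Tiny.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class ModulatedDeformBlock(nn.Module):
    """DCNv2-style block: offsets and modulation masks are predicted from
    the input itself, so the sampling grid can follow ship geometry."""
    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.n_offsets = 2 * k * k                                 # (dy, dx) per kernel tap
        self.offset_mask = nn.Conv2d(channels, 3 * k * k, k, padding=k // 2)
        self.dcn = DeformConv2d(channels, channels, k, padding=k // 2)

    def forward(self, x):
        om = self.offset_mask(x)
        offset, mask = om[:, :self.n_offsets], om[:, self.n_offsets:]
        return self.dcn(x, offset, torch.sigmoid(mask))            # modulated deformable sampling

x = torch.randn(1, 64, 40, 40)                                     # a toy SAR feature map
print(ModulatedDeformBlock(64)(x).shape)                           # torch.Size([1, 64, 40, 40])
```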

19 pages, 3612 KB  
Article
CA-YOLO: An Efficient YOLO-Based Algorithm with Context-Awareness and Attention Mechanism for Clue Cell Detection in Fluorescence Microscopy Images
by Can Cui, Xi Chen, Lijun He and Fan Li
Sensors 2025, 25(19), 6001; https://doi.org/10.3390/s25196001 - 29 Sep 2025
Cited by 1 | Viewed by 905
Abstract
Automatic detection of clue cells is crucial for rapid diagnosis of bacterial vaginosis (BV), but existing algorithms suffer from low sensitivity. This is because clue cells are highly similar to normal epithelial cells in terms of macroscopic size and shape. The key difference between clue cells and normal epithelial cells lies in the surface texture and edge morphology. To address this specific problem, we propose a clue cell detection algorithm named CA-YOLO. The contributions of our approach lie in two synergistic and custom-designed feature extraction modules: the context-aware module (CAM) extracts and captures bacterial distribution patterns on the surface of clue cells, and the shuffle global attention mechanism (SGAM) enhances cell edge features and suppresses irrelevant information. In addition, we integrate focal loss into the classification loss to alleviate the severe class imbalance problem inherent in clinical samples. Experimental results show that the proposed CA-YOLO achieves a sensitivity of 0.778, which is 9.2% higher than the baseline model, making automated BV detection more reliable and feasible. Full article
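The focal loss the authors fold into the classification loss is a standard formulation (Lin et al., 2017); a minimal binary version is sketched below for readers unfamiliar with it. The alpha and gamma values are the usual defaults, not necessarily those used in CA-YOLO.

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Focal loss: down-weights easy examples so training focuses on hard,
    rare positives (e.g., clue cells among many normal epithelial cells)."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)              # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balance weight
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()
print(binary_focal_loss(logits, targets))
```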

17 pages, 2418 KB  
Article
InstructSee: Instruction-Aware and Feedback-Driven Multimodal Retrieval with Dynamic Query Generation
by Guihe Gu, Yuan Xue, Zhengqian Wu, Lin Song and Chao Liang
Sensors 2025, 25(16), 5195; https://doi.org/10.3390/s25165195 - 21 Aug 2025
Viewed by 1251
Abstract
In recent years, cross-modal retrieval has garnered significant attention due to its potential to bridge heterogeneous data modalities, particularly in aligning visual content with natural language. Despite notable progress, existing methods often struggle to accurately capture user intent when queries are expressed through complex or evolving instructions. To address this challenge, we propose a novel cross-modal representation learning framework that incorporates an instruction-aware dynamic query generation mechanism, augmented by the semantic reasoning capabilities of large language models (LLMs). The framework dynamically constructs and iteratively refines query representations conditioned on natural language instructions and guided by user feedback, thereby enabling the system to effectively infer and adapt to implicit retrieval intent. Extensive experiments on standard multimodal retrieval benchmarks demonstrate that our method significantly improves retrieval accuracy and adaptability, outperforming fixed-query baselines and showing enhanced cross-modal alignment and generalization across diverse retrieval tasks. Full article
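The abstract does not detail the feedback mechanism, so the sketch below only illustrates the general idea of feedback-driven query refinement with a classical Rocchio-style update over embedding vectors; the paper's actual mechanism is LLM-based and instruction-aware, and all names and weights here are hypothetical.

```python
import torch
import torch.nn.functional as F

def refine_query(query, positives, negatives, alpha=0.7, beta=0.2, gamma=0.1):
    """Rocchio-style refinement: pull the query embedding toward items the
    user marked relevant and away from rejected ones."""
    q = alpha * query + beta * positives.mean(dim=0) - gamma * negatives.mean(dim=0)
    return F.normalize(q, dim=-1)

def retrieve(query, gallery, k=3):
    return (gallery @ query).topk(k).indices     # cosine scores on unit-norm embeddings

gallery = F.normalize(torch.randn(100, 256), dim=-1)   # toy image embeddings
query = F.normalize(torch.randn(256), dim=-1)          # initial text-query embedding
top = retrieve(query, gallery)
query = refine_query(query, gallery[top[:1]], gallery[top[1:]])  # user keeps only the first hit
print(retrieve(query, gallery))
```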

16 pages, 5435 KB  
Article
PAPRec: 3D Point Cloud Reconstruction Based on Prior-Guided Adaptive Probabilistic Network
by Caixia Liu, Minhong Zhu, Yali Chen, Xiulan Wei and Haisheng Li
Sensors 2025, 25(5), 1354; https://doi.org/10.3390/s25051354 - 22 Feb 2025
Cited by 1 | Viewed by 2158
Abstract
Inferring a complete 3D shape from a single-view image is an ill-posed problem. Existing methods often suffer from insufficient feature expression, unstable training, and limited constraints, resulting in low-accuracy and ambiguous reconstructions. To address these problems, we propose a prior-guided adaptive probabilistic network for single-view 3D reconstruction, called PAPRec. In the training stage, PAPRec encodes a single-view image and its corresponding 3D prior into an image feature distribution and a point cloud feature distribution, respectively. PAPRec then utilizes a latent normalizing flow to fit the two distributions and obtains a latent vector with rich cues. PAPRec finally introduces an adaptive probabilistic network consisting of a shape normalizing flow and a diffusion model in order to decode the latent vector as a complete 3D point cloud. Unlike existing methods, PAPRec fully learns the global and local features of objects by innovatively integrating 3D prior guidance and the adaptive probabilistic network under the optimization of a loss function combining prior, flow and diffusion losses. The experimental results on the public ShapeNet dataset show that PAPRec, on average, improves CD by 2.62%, EMD by 5.99% and F1 by 4.41%, in comparison to several state-of-the-art methods. Full article
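For reference, the CD metric quoted in the results is the Chamfer distance between point clouds; a minimal implementation is given below (some papers report the squared-distance variant, so the exact definition may differ from the one used by the authors).

```python
import torch

def chamfer_distance(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer distance between point clouds of shape (N, 3) and (M, 3):
    the mean nearest-neighbour distance computed in both directions."""
    d = torch.cdist(p1, p2)                                  # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

pred = torch.rand(2048, 3)   # reconstructed point cloud
gt = torch.rand(2048, 3)     # ground-truth point cloud
print(chamfer_distance(pred, gt))
```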

19 pages, 8290 KB  
Article
Multi-Scale Contrastive Learning with Hierarchical Knowledge Synergy for Visible-Infrared Person Re-Identification
by Yongheng Qian and Su-Kit Tang
Sensors 2025, 25(1), 192; https://doi.org/10.3390/s25010192 - 1 Jan 2025
Cited by 2 | Viewed by 1908
Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task to match a person across different spectral camera views. Most existing works focus on learning shared feature representations from the final embedding space of advanced networks to alleviate modality differences between visible and infrared images. However, exclusively relying on high-level semantic information from the network’s final layers can restrict shared feature representations and overlook the benefits of low-level details. Different from these methods, we propose a multi-scale contrastive learning network (MCLNet) with hierarchical knowledge synergy for VI-ReID. MCLNet is a novel two-stream contrastive deep supervision framework designed to train low-level details and high-level semantic representations simultaneously. MCLNet utilizes supervised contrastive learning (SCL) at each intermediate layer to strengthen visual representations and enhance cross-modality feature learning. Furthermore, a hierarchical knowledge synergy (HKS) strategy for pairwise knowledge matching promotes explicit information interaction across multi-scale features and improves information consistency. Extensive experiments on three benchmarks demonstrate the effectiveness of MCLNet. Full article
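The supervised contrastive learning (SCL) objective applied at each intermediate layer is, in its standard form (Khosla et al., 2020), roughly the loss sketched below; this is a generic reference implementation rather than the MCLNet code, and the temperature is only an assumed default.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Embeddings sharing a person ID attract each other (regardless of
    visible/infrared modality); all other pairs repel."""
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                                   # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))                 # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)      # row-wise log-softmax
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(1)
    return -(pos_log_prob / pos_mask.sum(1).clamp(min=1)).mean()

feats = torch.randn(8, 128)                   # features from one intermediate layer
ids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])  # person identities in the batch
print(supervised_contrastive_loss(feats, ids))
```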

27 pages, 12241 KB  
Article
SURABHI: Self-Training Using Rectified Annotations-Based Hard Instances for Eidetic Cattle Recognition
by Manu Ramesh and Amy R. Reibman
Sensors 2024, 24(23), 7680; https://doi.org/10.3390/s24237680 - 30 Nov 2024
Cited by 3 | Viewed by 1020
Abstract
We propose a self-training scheme, SURABHI, that trains deep-learning keypoint detection models on machine-annotated instances, together with the methodology to generate those instances. SURABHI aims to improve the keypoint detection accuracy not by altering the structure of a deep-learning-based keypoint detector model but by generating highly effective training instances. The machine-annotated instances used in SURABHI are hard instances—instances that require a rectifier to correct the keypoints misplaced by the keypoint detection model. We engineer this scheme for the task of predicting keypoints of cattle from the top, in conjunction with our Eidetic Cattle Recognition System, which is dependent on accurate prediction of keypoints for predicting the correct cow ID. We show that the final cow ID prediction accuracy on previously unseen cows also improves significantly after applying SURABHI to a deep-learning detection model with high capacity, especially when available training data are minimal. SURABHI helps us achieve a top-6 cow recognition accuracy of 91.89% on a dataset of cow videos. Using SURABHI on this dataset also improves the number of cow instances with correct identification by 22% over the baseline result from fully supervised training. Full article

22 pages, 12107 KB  
Article
Deep Learning-Based Classification of Macrofungi: Comparative Analysis of Advanced Models for Accurate Fungi Identification
by Sifa Ozsari, Eda Kumru, Fatih Ekinci, Ilgaz Akata, Mehmet Serdar Guzel, Koray Acici, Eray Ozcan and Tunc Asuroglu
Sensors 2024, 24(22), 7189; https://doi.org/10.3390/s24227189 - 9 Nov 2024
Cited by 12 | Viewed by 3357
Abstract
This study focuses on the classification of six different macrofungi species using advanced deep learning techniques. Fungi species, such as Amanita pantherina, Boletus edulis, Cantharellus cibarius, Lactarius deliciosus, Pleurotus ostreatus and Tricholoma terreum were chosen based on their ecological importance and distinct morphological characteristics. The research employed 5 different machine learning techniques and 12 deep learning models, including DenseNet121, MobileNetV2, ConvNeXt, EfficientNet, and swin transformers, to evaluate their performance in identifying fungi from images. The DenseNet121 model demonstrated the highest accuracy (92%) and AUC score (95%), making it the most effective in distinguishing between species. The study also revealed that transformer-based models, particularly the swin transformer, were less effective, suggesting room for improvement in their application to this task. Further advancements in macrofungi classification could be achieved by expanding datasets, incorporating additional data types such as biochemical, electron microscopy, and RNA/DNA sequences, and using ensemble methods to enhance model performance. The findings contribute valuable insights into both the use of deep learning for biodiversity research and the ecological conservation of macrofungi species. Full article
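Since DenseNet121 is reported as the strongest model, the snippet below shows the usual transfer-learning setup such a study is likely to start from: swap the ImageNet classifier head for a 6-way species head (assuming torchvision >= 0.13; the authors' exact training pipeline is not described in the abstract).

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_SPECIES = 6  # one class per macrofungi species in the study

# Load ImageNet-pre-trained DenseNet121 and replace its classifier head.
net = models.densenet121(weights="IMAGENET1K_V1")
net.classifier = nn.Linear(net.classifier.in_features, NUM_SPECIES)

x = torch.randn(2, 3, 224, 224)   # dummy batch of fungi photographs
print(net(x).shape)               # torch.Size([2, 6])
```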

23 pages, 1025 KB  
Article
Adversarial Examples on XAI-Enabled DT for Smart Healthcare Systems
by Niddal H. Imam
Sensors 2024, 24(21), 6891; https://doi.org/10.3390/s24216891 - 27 Oct 2024
Cited by 5 | Viewed by 2839
Abstract
There have recently been rapid developments in smart healthcare systems, such as precision diagnosis, smart diet management, and drug discovery. These systems require the integration of the Internet of Things (IoT) for data acquisition, Digital Twins (DT) for data representation into a digital replica, and Artificial Intelligence (AI) for decision-making. DT is a digital copy or replica of physical entities (e.g., patients), one of the emerging technologies that enable the advancement of smart healthcare systems. AI and Machine Learning (ML) offer great benefits to DT-based smart healthcare systems. They also pose certain risks, including security risks, and bring up issues of fairness, trustworthiness, explainability, and interpretability. One of the challenges that still makes the full adoption of AI/ML in healthcare questionable is the explainability of AI (XAI) and the interpretability of ML (IML). Although the study of the explainability and interpretability of AI/ML is now a trend, there is a lack of research on the security of XAI-enabled DT for smart healthcare systems. Existing studies limit their focus to either the security of XAI or DT. This paper provides a brief overview of the research on the security of XAI-enabled DT for smart healthcare systems. It also explores potential adversarial attacks against XAI-enabled DT for smart healthcare systems. Additionally, it proposes a framework for designing XAI-enabled DT for smart healthcare systems that are secure and trusted. Full article
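As a concrete example of the kind of adversarial attack such a survey considers, the sketch below applies the classic Fast Gradient Sign Method (FGSM) to a toy classifier standing in for a DT diagnostic model; the model, data, and epsilon here are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.03):
    """FGSM: perturb the input in the direction that increases the loss,
    producing an adversarial example within an L-infinity budget eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

# toy stand-in for a digital-twin diagnostic classifier
model = nn.Sequential(nn.Flatten(), nn.Linear(16, 2))
x = torch.rand(4, 1, 4, 4)          # e.g., normalized sensor readings
y = torch.randint(0, 2, (4,))
x_adv = fgsm_attack(model, x, y)
print((x_adv - x).abs().max())      # perturbation magnitude is at most eps
```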

Review


30 pages, 3117 KB  
Review
Computer Vision for Glass Waste: Technologies and Sensors
by Eduardo Adán and Antonio Adán
Sensors 2025, 25(21), 6634; https://doi.org/10.3390/s25216634 - 29 Oct 2025
Viewed by 887
Abstract
Several reviews have been published addressing the challenges of waste collection and recycling across various sectors, including municipal, industrial, construction, and agricultural domains. These studies often emphasize the role of existing technologies in addressing recycling-related issues. Among the diverse range of waste materials, glass remains a significant component, frequently grouped with other multi-class waste types (such as plastic, cardboard, and metal) for segregation and classification processes. The primary aim of this review is to examine the technologies specifically involved in the collection and separation stages of waste in which glass represents a major or exclusive fraction. The second objective is to present the main technologies and computer vision sensors currently used in managing glass waste. This study not only references laboratory developments or experiments on standard datasets, but also includes projects, patents, and real-world implementations that are already delivering measurable results. The review discusses the technological possibilities, gaps, and challenges faced in this specialized field of research. Full article
