Recent Advances in Computer Vision: Technologies and Applications, 2nd Edition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 June 2026

Special Issue Editors


Dr. Mingliang Gao
Guest Editor
College of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255049, China
Interests: computer vision; machine learning; intelligent optimization control

Dr. Guofeng Zou
Guest Editor
College of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255049, China
Interests: person re-identification; power vision technology

Special Issue Information

Dear Colleagues,

In recent decades, computer vision has achieved remarkable success across many fields. This Special Issue aims to encompass a broad spectrum of topics within computer vision, including image processing, pattern recognition, machine learning, and artificial intelligence techniques applied to visual data analysis. It seeks to showcase innovative methodologies and algorithms for tasks such as image classification, object detection and tracking, semantic segmentation, scene understanding, and beyond. Furthermore, this Special Issue welcomes contributions that explore the integration of computer vision into emerging technologies such as augmented reality, virtual reality, and robotics, as well as its applications in diverse fields such as healthcare, transportation, surveillance, entertainment, and manufacturing. By providing a platform for the dissemination of cutting-edge research and practical applications, this Special Issue endeavors to facilitate knowledge exchange, foster interdisciplinary collaboration, and propel the advancement of computer vision technologies to address real-world challenges effectively.

Potential topics include, but are not limited to, the following:

  1. Image processing, analysis, and understanding;
  2. Object detection and visual tracking;
  3. Human behavior analysis;
  4. Computer vision and smart cities;
  5. Industrial machine vision;
  6. Supervised/semi-supervised/unsupervised learning;
  7. Reinforcement learning;
  8. Deep learning theory and applications;
  9. Pattern recognition;
  10. Data mining;
  11. Feature engineering;
  12. Information fusion;
  13. Object classification;
  14. Network analytics for efficient and reliable network operation.

Technical Program Committee Members:

  1. Qilei (Kevin) Li, Queen Mary University of London
  2. Wenzhe Zhai, Harbin Engineering University

Dr. Mingliang Gao
Dr. Guofeng Zou
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • machine learning
  • image processing
  • pattern recognition
  • deep learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)

Research

15 pages, 3658 KiB  
Article
A Hard Negatives Mining and Enhancing Method for Multi-Modal Contrastive Learning
by Guangping Li, Yanan Gao, Xianhui Huang and Bingo Wing-Kuen Ling
Electronics 2025, 14(4), 767; https://doi.org/10.3390/electronics14040767 - 16 Feb 2025
Abstract
Contrastive learning has emerged as a dominant paradigm for understanding 3D open-world environments, particularly in the realm of multi-modalities. However, due to the nature of self-supervised learning and the limited size of 3D datasets, pre-trained models in the 3D point cloud domain often suffer from overfitting in downstream tasks, especially in zero-shot classification. To tackle this problem, we design a module to mine and enhance hard negatives from datasets, which are useful to improve the discrimination of models. This module could be seamlessly integrated into cross-modal contrastive learning frameworks, addressing the overfitting issue by enhancing the mined hard negatives during the process of training. This module consists of two key components: mining and enhancing. In the process of mining, we identify hard negative samples by examining similarity relationships between vision–vision and vision–text modalities, locating hard negative pairs within the visual domain. In the process of enhancing, we compute weighting coefficients via the similarity differences of these mined hard negatives. By enhancing the mined hard negatives while leaving others unchanged, we improve the overall performance and discrimination of models. A series of experiments demonstrate that our module can be easily incorporated into various contrastive learning frameworks, leading to improved model performance in both zero-shot and few-shot tasks.
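To make the mining-and-enhancing recipe above concrete, below is a minimal PyTorch sketch of a hard-negative-weighted, InfoNCE-style cross-modal loss. The exponential weighting scheme, the beta coefficient, and the function name are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of hard-negative mining and enhancing in a
# cross-modal, InfoNCE-style contrastive loss; not the paper's code.
import torch
import torch.nn.functional as F

def hard_negative_infonce(img_emb, txt_emb, tau=0.07, beta=2.0):
    """img_emb, txt_emb: (N, D) L2-normalized embeddings of matched pairs."""
    logits = img_emb @ txt_emb.t() / tau        # (N, N) cross-modal similarities
    targets = torch.arange(logits.size(0), device=logits.device)

    # Mining: a negative is "hard" when its similarity to the anchor
    # approaches that of the true (diagonal) pair.
    pos = logits.diag().unsqueeze(1)            # (N, 1) positive similarities
    gap = pos - logits                          # small gap => hard negative

    # Enhancing: upweight hard negatives (assumed exponential scheme),
    # leaving the positives themselves unchanged.
    weights = 1.0 + beta * torch.exp(-gap)
    weights.fill_diagonal_(1.0)

    # Adding log-weights to the logits reweights each negative's term
    # inside the softmax denominator.
    return F.cross_entropy(logits + weights.log(), targets)
```

In a CLIP-style pre-training loop, such a term would simply stand in for the plain InfoNCE loss, so the surrounding framework needs no other changes.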

18 pages, 8632 KiB  
Article
CT and MRI Image Fusion via Coupled Feature-Learning GAN
by Qingyu Mao, Wenzhe Zhai, Xiang Lei, Zenghui Wang and Yongsheng Liang
Electronics 2024, 13(17), 3491; https://doi.org/10.3390/electronics13173491 - 3 Sep 2024
Abstract
The fusion of multimodal medical images, particularly CT and MRI, is driven by the need to enhance the diagnostic process by providing clinicians with a single, comprehensive image that encapsulates all necessary details. Existing fusion methods often exhibit a bias towards features from one of the source images, making it challenging to simultaneously preserve both structural information and textural details. Designing an effective fusion method that can preserve more discriminative information is therefore crucial. In this work, we propose a Coupled Feature-Learning GAN (CFGAN) to fuse the multimodal medical images into a single informative image. The proposed method establishes an adversarial game between the discriminators and a couple of generators. First, the coupled generators are trained to generate two real-like fused images, which are then used to deceive the two coupled discriminators. Subsequently, the two discriminators are devised to minimize the structural distance to ensure the abundant information in the original source images is well-maintained in the fused image. We further empower the generators to be robust under various scales by constructing a discriminative feature extraction (DFE) block with different dilation rates. Moreover, we introduce a cross-dimension interaction attention (CIA) block to refine the feature representations. The qualitative and quantitative experiments on common benchmarks demonstrate the competitive performance of the CFGAN compared to other state-of-the-art methods.
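As a rough illustration of the multi-dilation idea behind the DFE block, here is a short PyTorch sketch of a feature block with parallel dilated convolutions. The channel widths, dilation rates, and activation choice are assumptions for illustration, not the paper's architecture.

```python
# Illustrative multi-dilation feature block, loosely inspired by the
# DFE idea; layer widths and dilation rates are assumptions.
import torch
import torch.nn as nn

class DilatedFeatureBlock(nn.Module):
    def __init__(self, in_ch=64, out_ch=64, rates=(1, 2, 4)):
        super().__init__()
        # One branch per dilation rate; padding=rate keeps spatial size.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r),
                nn.LeakyReLU(0.2, inplace=True),
            )
            for r in rates
        )
        # Fuse the concatenated multi-scale branches back to one map.
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```

Running the same 3×3 kernel at several dilation rates widens the receptive field without downsampling, which suits fusion tasks where fine texture must survive alongside large structures.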

25 pages, 3845 KiB  
Article
Bud-YOLOv8s: A Potato Bud-Eye-Detection Algorithm Based on Improved YOLOv8s
by Wenlong Liu, Zhao Li, Shaoshuang Zhang, Ting Qin and Jiaqi Zhao
Electronics 2024, 13(13), 2541; https://doi.org/10.3390/electronics13132541 - 28 Jun 2024
Abstract
The key to intelligent seed potato cutting technology lies in the accurate and rapid identification of potato bud eyes. Existing detection algorithms suffer from low recognition accuracy and high model complexity, resulting in an increased miss rate. To address these issues, this study proposes a potato bud-eye-detection algorithm based on an improved YOLOv8s. First, by integrating the Faster Neural Network (FasterNet) with the Efficient Multi-scale Attention (EMA) module, a novel Faster Block-EMA network structure is designed to replace the bottleneck components within the C2f module of YOLOv8s. This enhancement improves the model’s feature-extraction capability and computational efficiency for bud detection. Second, this study introduces a weighted bidirectional feature pyramid network (BiFPN) to optimize the neck network, achieving multi-scale fusion of potato bud eye features while significantly reducing the model’s parameters, computation, and size due to its flexible network topology. Finally, the Efficient Intersection over Union (EIoU) loss function is employed to optimize the bounding box regression process, further enhancing the model’s localization capability. The experimental results show that the improved model achieves a mean average precision (mAP@0.5) of 98.1% with a model size of only 11.1 MB. Compared to the baseline model, the mAP@0.5 and mAP@0.5:0.95 were improved by 3.1% and 4.5%, respectively, while the model’s parameters, size, and computation were reduced by 49.1%, 48.1%, and 31.1%, respectively. Additionally, compared to the YOLOv3, YOLOv5s, YOLOv6s, YOLOv7-tiny, and YOLOv8m algorithms, the mAP@0.5 was improved by 4.6%, 3.7%, 5.6%, 5.2%, and 3.3%, respectively. Therefore, the proposed algorithm not only significantly enhances the detection accuracy, but also greatly reduces the model complexity, providing essential technical support for the application and deployment of intelligent potato cutting technology.
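For readers unfamiliar with EIoU, the sketch below implements the standard EIoU penalty (an IoU term plus center-distance, width, and height penalties) in plain PyTorch. The (x1, y1, x2, y2) box convention and the function name are assumptions; this is a generic illustration, not the authors' training code.

```python
# Generic EIoU bounding-box loss sketch; boxes are (x1, y1, x2, y2).
import torch

def eiou_loss(pred, target, eps=1e-7):
    """pred, target: (N, 4) corner-format boxes."""
    # IoU term from intersection and union areas.
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest box enclosing both, used to normalize the penalties.
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])

    # Center-distance penalty, normalized by the enclosing diagonal.
    pcx = (pred[:, 0] + pred[:, 2]) / 2 - (target[:, 0] + target[:, 2]) / 2
    pcy = (pred[:, 1] + pred[:, 3]) / 2 - (target[:, 1] + target[:, 3]) / 2
    dist = (pcx ** 2 + pcy ** 2) / (cw ** 2 + ch ** 2 + eps)

    # Separate width and height penalties (the "E" in EIoU).
    dw = (pred[:, 2] - pred[:, 0]) - (target[:, 2] - target[:, 0])
    dh = (pred[:, 3] - pred[:, 1]) - (target[:, 3] - target[:, 1])
    wh = dw ** 2 / (cw ** 2 + eps) + dh ** 2 / (ch ** 2 + eps)

    return (1 - iou + dist + wh).mean()
```

In a YOLO-style detector, such a term would replace the default box-regression loss while the classification and objectness losses stay unchanged.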

17 pages, 2252 KiB  
Article
Enhanced Multi-View Low-Rank Graph Optimization for Dimensionality Reduction
by Haohao Li and Huibing Wang
Electronics 2024, 13(12), 2421; https://doi.org/10.3390/electronics13122421 - 20 Jun 2024
Abstract
In the last decade, graph embedding-based dimensionality reduction for multi-view data has been extensively studied. However, constructing a high-quality graph for dimensionality reduction is still a significant challenge. Herein, we propose a new algorithm, named multi-view low-rank graph optimization for dimensionality reduction (MvLRGO), which integrates graph optimization with dimensionality reduction into one objective function in order to simultaneously determine the optimal subspace and graph. The subspace learning of each view is conducted independently by the general graph embedding framework. For graph construction, we exploit low-rank representation (LRR) to obtain reconstruction relationships as the affinity weight of the graph. Subsequently, the learned graph of each view is further optimized throughout the learning process to obtain the ideal assignment of relations. Moreover, to integrate information from multiple views, MvLRGO regularizes each of the view-specific optimal graphs such that they align with one another. Benefiting from this term, MvLRGO can achieve flexible multi-view communication without constraining the subspaces of all views to be the same. Various experimental results obtained with different datasets show that the proposed method outperforms many state-of-the-art multi-view and single-view dimensionality reduction algorithms.
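To give a flavor of how LRR coefficients can serve as graph affinities, here is a minimal NumPy sketch based on the known closed-form LRR solution for noise-free data: for the skinny SVD X = USV^T, the minimizer of ||Z||_* subject to X = XZ is Z = VV^T. The energy-based truncation and the symmetrization step are illustrative assumptions, not the MvLRGO objective itself.

```python
# Minimal sketch: closed-form low-rank representation (LRR) for
# noise-free data, turned into a symmetric graph affinity.
import numpy as np

def lrr_affinity(X, energy=0.99):
    """X: (d, n) data matrix with samples as columns."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep enough singular vectors to capture `energy` of the spectrum
    # (an assumed truncation; exact LRR keeps all nonzero values).
    r = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    V = Vt[:r].T                    # (n, r) right singular vectors
    Z = V @ V.T                     # reconstruction coefficients, X ~= XZ
    return (np.abs(Z) + np.abs(Z.T)) / 2   # symmetrize into an affinity graph

# Example: per-view graphs that a multi-view method could then align.
# X1, X2 = np.random.randn(50, 200), np.random.randn(40, 200)
# W1, W2 = lrr_affinity(X1), lrr_affinity(X2)
```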
