Intelligent Image and Video Processing: Quality, Compression and Vision Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Electronic Multimedia".

Deadline for manuscript submissions: 15 December 2026 | Viewed by 3769

Special Issue Editor

Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Interests: deep learning; electrode implantation robot for brain-computer interface; industrial vision detection

Special Issue Information

Dear Colleagues,

Intelligent image and video processing for quality, compression and vision applications represents a paradigm shift in signal processing, driven by the synergistic innovation of deep learning and multi-modal fusion technologies. This transformation fundamentally redefines traditional theoretical frameworks and methodological systems, manifesting in three key dimensions: from local optimization to global perception in quality assessment systems, from general-purpose computing to edge intelligence in architectural paradigms, and from single-modal analysis to cross-modal understanding in cognitive approaches.

In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:

  • Image Quality Enhancement Techniques;
  • Low-Light/Super-Resolution Reconstruction;
  • Model Compression Paradigms;
  • Knowledge Distillation Techniques;
  • Visual Inspection and Measurement;
  • Surface Defect Detection;
  • Industrial Anomaly Detection;
  • Vision-Based Industrial Applications.

I look forward to receiving your contributions.

Dr. Xian Tao
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image quality enhancement techniques
  • low-light/super-resolution reconstruction
  • model compression paradigms
  • knowledge distillation techniques
  • visual inspection and measurement
  • surface defect detection
  • industrial anomaly detection
  • vision-based industrial applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information is available on MDPI's Special Issue policies page.

Published Papers (5 papers)

Research

28 pages, 4645 KB  
Article
Impact of Environmental Control on Subjective Video Quality Assessment in Crowdsourced QoE Experiments
by Avrajyoti Dutta, Mohamedalfateh T. M. Saeed, Swapnil Arawade, Andreja Samčović, Syed Uddin, Dawid Juszka, Michał Grega and Mikołaj Leszczuk
Electronics 2026, 15(8), 1666; https://doi.org/10.3390/electronics15081666 - 16 Apr 2026
Viewed by 723
Abstract
This research investigates the influence of environmental control on subjective evaluations of video quality within the Quality of Experience (QoE) paradigm. This work presents a supplementary experiment conducted in a controlled laboratory setting, building on our previous crowdsourcing studies carried out in uncontrolled, web-based conditions using the Prolific platform. Both experiments used the same crowdsourcing platform and complied with International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Recommendation P.910, ensuring external validity and methodological consistency. Participants assessed a collection of processed video sequences (PVS) comprising 46 distinct video clips using the 5-point Absolute Category Rating (ACR) scale, while their response times were recorded in milliseconds as measures of cognitive exertion and decision delay. The comparative analysis employs nonparametric tests (Mann–Whitney U and Kolmogorov–Smirnov) and a hierarchical Linear Mixed-Effects Model (LMM) to examine disparities in reaction time distributions, rating consistency, and the incidence of outliers across both environments. The results indicate that controlled settings produce statistically significantly lower response variability and enhanced data reliability, whereas uncontrolled settings encompass greater external diversity and real-world unpredictability. These findings offer significant insights into the balance between experimental control and external validity in crowdsourced video quality assessment, advancing the development of scalable approaches for Quality of Experience research.
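
The Mann–Whitney U comparison of response times mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names and the response-time values are hypothetical, and only the U statistic is computed (no p-value).

```python
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (U_a, U_b) computed from the rank sum of sample a."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical response times in milliseconds (illustrative values only).
lab_ms = [1200, 1350, 1100, 1500]
crowd_ms = [1600, 1450, 2100, 1900]
print(mann_whitney_u(lab_ms, crowd_ms))
```

A U value close to 0 or to len(a) * len(b) suggests that one sample tends to rank below the other. In practice a library routine such as scipy.stats.mannwhitneyu would normally be used, since it also returns a p-value.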

21 pages, 4628 KB  
Article
BAG-CLIP: Bifurcated Attention Graph-Enhanced CLIP for Zero-Shot Industrial Anomaly Detection
by Hua Wu, Tingting Zhang and Shubo Li
Electronics 2026, 15(8), 1659; https://doi.org/10.3390/electronics15081659 - 15 Apr 2026
Viewed by 392
Abstract
While vision-language models (VLMs) have been widely applied in zero-shot anomaly detection (ZSAD), their performance remains limited by the inability to distinguish fine-grained normal and abnormal textures, coupled with inadequate capabilities in detecting complex morphological anomalies. To address these limitations, this paper proposes BAG-CLIP (Bifurcated Attention Graph-Enhanced CLIP), a dual-path graph-enhanced zero-shot anomaly detection method. This approach employs a Bifurcated Self-Attention (BSA) module to decouple visual features, processing global semantics and spatial details separately to mitigate the inherent conflict between abstract semantic representation and precise spatial localization. A Self-Attention Graph (SAG) module is designed to model the topological structure of complex morphological anomalies. This module dynamically constructs visual features’ topological relationships and utilizes graph convolutions to aggregate neighborhood information, thereby enhancing the model’s representational capacity for diverse and complex morphological anomalies. Extensive experiments are conducted on five diverse industrial datasets, featuring complex transmission line backgrounds alongside general industrial scenarios. The proposed method is comprehensively evaluated against 11 state-of-the-art (SOTA) methods. On the EPED (Electrical Power Equipment Dataset) and MPDD datasets, BAG-CLIP outperforms the second-best methods in image-level AUROC (Area Under the Receiver Operating Characteristic Curve) by 3.7% and 2.8%, respectively. BAG-CLIP achieves superior performance in both zero-shot anomaly detection and segmentation.
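
For readers unfamiliar with the image-level AUROC metric cited above, the sketch below computes it as the probability that a randomly chosen anomalous image scores higher than a randomly chosen normal one, with ties counted as half. The function name, scores and labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def image_level_auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via pairwise comparison; labels use 1 for anomalous, 0 for normal."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one anomalous and one normal image")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ordering
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores for four images (illustrative values only).
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
print(image_level_auroc(example_scores, example_labels))
```

The pairwise form is quadratic in the number of images, so it suits small illustrations only; library implementations such as sklearn.metrics.roc_auc_score are the usual choice for real evaluations.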

22 pages, 4595 KB  
Article
Toward Real-Time Industrial Small Object Inspection: Decoupled Attention and Multi-Scale Aggregation for PCB Defect Detection
by Yuting Wang, Bingyang Guo, Liming Sun and Ruiyun Yu
Electronics 2026, 15(6), 1191; https://doi.org/10.3390/electronics15061191 - 12 Mar 2026
Viewed by 609
Abstract
PCB surface defect detection plays a critical role in ensuring electronics manufacturing quality. To address the challenges of small target defect detection, this study proposes PCB-YOLO, an enhanced lightweight detector based on YOLOv8n. PCB-YOLO introduces three key improvements. First, a RepViT-EMA Fusion Architecture (REFA) module is designed for deep backbone layers to strengthen feature extraction while suppressing background interference from complex circuit patterns. Second, a Multi-Scale Grouped Aggregation (MSGA) module is developed to reduce feature redundancy and improve spatial-semantic information extraction for multi-scale defects. Third, a Pixel-level Intersection over Union (PIoU) loss function is proposed to enable pixel-level IoU calculation with enhanced angular and area constraints for more precise localization. Extensive experiments on the PKU-Market-PCB dataset demonstrate that PCB-YOLO achieves 98.4% mAP@0.5, 97.4% recall, and 96.1% precision with only 2.4 M parameters, 6.9 G FLOPs, and an inference speed of 224 FPS, outperforming multiple state-of-the-art methods while maintaining real-time capability. Additional experiments on the DeepPCB dataset yield 99.0% mAP@0.5 and 80.4% mAP@0.5:0.95, confirming the cross-dataset generalization ability of the proposed method.
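
The "@0.5" in mAP@0.5 refers to the IoU threshold at which a predicted box is counted as matching a ground-truth box. The sketch below shows the standard axis-aligned box IoU and that threshold check; it does not reproduce the paper's PIoU loss, and the function names and box coordinates are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2, y1 < y2


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as a match when its IoU reaches the threshold."""
    return box_iou(pred, truth) >= threshold


# Hypothetical boxes in pixel coordinates (illustrative values only).
predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (0.0, 0.0, 2.0, 1.0)
print(box_iou(predicted, ground_truth), is_match(predicted, ground_truth))
```

For small defects, a shift of a few pixels can push the IoU below the threshold, which is one reason small-object detectors often refine the localization loss.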

20 pages, 4373 KB  
Article
SO-YOLO11-CDP: An Instance Segmentation-Based Approach for Cross-Depth-of-Field Positioning Micro Image Sensor Modules in Precision Assembly
by Xi Lu, Juan Zhang, Yi Yang and Lie Bi
Electronics 2026, 15(2), 411; https://doi.org/10.3390/electronics15020411 - 16 Jan 2026
Viewed by 493
Abstract
During batch soldering and assembly of micro image sensor modules, random initial poses and partially occluded features in the target micro-component image lead to missed and erroneous detections, and cross-depth-of-field detection errors in microscopic vision result in low 3D spatial positioning accuracy. This paper proposes Small object-YOLO11-Cross-Depth-of-field Positioning (SO-YOLO11-CDP), an instance segmentation-based approach for precise cross-depth-of-field positioning of micro-components. First, an improved Small object-YOLO11 (SO-YOLO11) image segmentation algorithm is designed: a coordinate attention (CA) mechanism is incorporated into the segmentation head to enhance localization of micro-targets, the backbone uses non-stride convolution to preserve fine-grained features, and target regression performance is boosted via Efficient-IoU (EIoU) loss combined with normalized Wasserstein distance (NWD). Subsequently, to further improve spatial position detection accuracy in cross-depth-of-field detection, a calibration error compensation model for the image Jacobian matrix is established based on pinhole imaging principles. Experimental results indicate that SO-YOLO11 achieves a 16.1% increase in precision, a 4.0% increase in recall, and a 9.9% increase in mean average precision (mAP0.5) over the baseline YOLO11. Furthermore, it achieves a spatial detection accuracy better than 6.5 μm for target micro-components. The method presented in this paper holds significant engineering application value for high-precision spatial position detection of micro image sensor components.
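
As background on the normalized Wasserstein distance (NWD) named in the abstract, the sketch below follows the commonly cited formulation in which each box (cx, cy, w, h) is modelled as a 2D Gaussian and the similarity is exp(-W2 / c). The abstract does not say how the paper combines NWD with the EIoU loss, so that combination is not shown; the function name and the constant c are assumptions chosen for illustration.

```python
import math
from typing import Tuple

CenterBox = Tuple[float, float, float, float]  # (cx, cy, w, h)


def normalized_wasserstein_distance(a: CenterBox, b: CenterBox, c: float = 10.0) -> float:
    """Similarity in (0, 1]; c is a dataset-dependent constant (hypothetical default)."""
    # Squared 2-Wasserstein distance between the two Gaussian box models.
    w2_squared = (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + (a[2] / 2 - b[2] / 2) ** 2
        + (a[3] / 2 - b[3] / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


# Hypothetical tiny boxes in pixels that do not overlap (illustrative values only).
box_a = (0.0, 0.0, 2.0, 2.0)
box_b = (3.0, 4.0, 2.0, 2.0)
print(normalized_wasserstein_distance(box_a, box_b, c=5.0))
```

Unlike IoU, this measure stays non-zero and varies smoothly when two small boxes do not overlap at all, which is the usual motivation for using it with tiny targets.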

14 pages, 3077 KB  
Article
Visual Localization and Policy Learning for Robotic Large-Diameter Peg-in-Hole Assembly Tasks
by Tao Liang, Dingrong Wang, Wenzhi Ma, Lei Zhang and Dongsheng Chen
Electronics 2025, 14(23), 4592; https://doi.org/10.3390/electronics14234592 - 23 Nov 2025
Viewed by 857
Abstract
The conventional component assembly techniques employed in manufacturing industries typically necessitate laborious manual parameter calibration prior to system deployment, while existing vision-based control algorithms suffer from limited adaptability and inefficient learning capabilities. This paper presents a novel framework for automated large-diameter peg-in-hole assembly through convolutional network-based perception and reinforcement learning-driven control. Our methodology introduces three key innovations: (1) an enhanced deep segmentation architecture for precise identification and spatial localization of peg-end centroids, enabling accurate preliminary peg-in-hole; (2) a hybrid control strategy combining deep deterministic policy gradient (DDPG) reinforcement learning with classical control theory, augmented by real-time force feedback data acquisition; (3) systematic integration of visual–spatial information and haptic feedback for robust error compensation. Experimental validation on an industrial robotic platform demonstrates the method’s superior performance, achieving an Intersection over Union (IoU) score of 0.946 in peg segmentation tasks and maintaining insertion stability with maximum radial forces below 5.34 N during assembly operations. The proposed approach significantly reduces manual intervention requirements while exhibiting remarkable tolerance to positional deviations (±2.5 mm) and angular misalignments (±3°) commonly encountered in industrial assembly scenarios.