Artificial Intelligence for Smart Image Perception, Recognition and Understanding

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 April 2026

Special Issue Editors


Prof. Dr. Yanbin Hao
Guest Editor
School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
Interests: deep neural networks; image analysis and understanding; video action understanding; human pose estimation; applications of large foundation models; concept editing on generative models; cross-modal content alignment; multi-modal fusion

Dr. Shuo Wang
Guest Editor
School of Information Science and Technology, University of Science and Technology of China, Hefei 230001, China
Interests: multimodal content analysis; model lightweighting; multimodal large models

Dr. Jinmeng Wu
Guest Editor
School of Electrical and Information Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Interests: semantic matching; 3D classification; question answering

Special Issue Information

Dear Colleagues,

This Special Issue focuses on smart, data-efficient, and trustworthy image understanding across tasks such as recognition, segmentation, visual question answering (VQA), captioning, generation, and cross-modal alignment. We welcome contributions ranging from theoretical advances to practical systems, covering (but not limited to) deep architecture design; self-, un-, and weakly supervised training; vision–language models; generative–discriminative integration; 2D–3D geometric reasoning; efficiency strategies; robustness and uncertainty; and benchmarking and reproducibility. Application areas include media data processing, medical imaging, industrial inspection, autonomous driving, and remote sensing, among others. By emphasizing shared architectures, data-efficient training paradigms, and rigorous evaluation protocols, this Special Issue aims to transcend task-specific silos and accelerate the development of practical, reliable, and transparent image understanding technologies.

Prof. Dr. Yanbin Hao
Dr. Shuo Wang
Dr. Jinmeng Wu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image classification/recognition/retrieval/detection
  • deep image neural networks
  • vision–language models
  • image generative models
  • open-vocabulary detection/segmentation
  • image processing/augmentation
  • self-/un-/weakly/fully supervised learning
  • instruction tuning and prompting for vision
  • 2D–3D fusion and depth estimation
  • anomaly/defect detection
  • cross-modal retrieval and alignment
  • human pose estimation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (2 papers)


Research

17 pages, 16728 KB  
Article
Semantic and Sketch-Guided Diffusion Model for Fine-Grained Restoration of Damaged Ancient Paintings
by Li Zhao, Yingzhi Chen, Guangqi Du and Xiaojun Wu
Electronics 2025, 14(21), 4187; https://doi.org/10.3390/electronics14214187 - 27 Oct 2025
Abstract
Ancient paintings, as invaluable cultural heritage, often suffer from damage such as creases, mold, and missing regions. Current restoration methods, while effective for natural images, struggle with the fine-grained control required for ancient paintings’ artistic styles and brushstroke patterns. We propose the Semantic and Sketch-Guided Restoration (SSGR) framework, which uses pixel-level semantic maps to restore missing and mold-affected areas and depth-aware sketch maps to ensure texture continuity in creased regions. The sketch maps are extracted automatically using methods that preserve original brushstroke styles while conveying geometry and semantics. SSGR employs a semantic segmentation network to categorize painting regions and depth-sensitive sketch extraction to guide a diffusion model. To enhance style controllability, we cluster diverse attributes of landscape paintings and incorporate a Semantic-Sketch-Attribute-Normalization (SSAN) block that explores consistent patterns across styles through spatial semantic and attribute-adaptive normalization modules. Evaluated on the CLP-2K dataset, SSGR achieves an mIoU of 53.30%, an SSIM of 0.42, and a PSNR of 13.11 dB, outperforming state-of-the-art methods. This approach not only preserves historical aesthetics but also advances digital heritage preservation with a tailored, controllable technique for ancient paintings.
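For readers who want to prototype the conditioning idea, the following is a minimal PyTorch sketch of a SPADE-style normalization block in the spirit of SSAN: features are first normalized, then re-modulated spatially by the semantic and sketch maps and channel-wise by a global style-attribute code. The module and parameter names (SSANBlock, attr_dim, hidden) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of an SSAN-like conditional normalization block.
# Normalize features, then modulate them with (gamma, beta) predicted from
# the semantic map + sketch map (spatial) and a style-attribute code (channel).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSANBlock(nn.Module):
    def __init__(self, channels, num_classes, attr_dim, hidden=128):
        super().__init__()
        # Parameter-free normalization; all modulation is injected below.
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        # Spatial branch: one-hot semantic map + 1-channel sketch map.
        self.shared = nn.Sequential(
            nn.Conv2d(num_classes + 1, hidden, 3, padding=1), nn.ReLU())
        self.gamma_spatial = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta_spatial = nn.Conv2d(hidden, channels, 3, padding=1)
        # Attribute branch: global style code -> per-channel gamma/beta.
        self.gamma_attr = nn.Linear(attr_dim, channels)
        self.beta_attr = nn.Linear(attr_dim, channels)

    def forward(self, x, semantic_onehot, sketch, attr_code):
        # Resize the guidance maps to the current feature resolution.
        cond = torch.cat([semantic_onehot, sketch], dim=1)
        cond = F.interpolate(cond, size=x.shape[2:], mode="nearest")
        h = self.shared(cond)
        x = self.norm(x)
        # Spatially varying modulation from semantics and sketch...
        x = x * (1 + self.gamma_spatial(h)) + self.beta_spatial(h)
        # ...followed by channel-wise modulation from the style attribute.
        g = self.gamma_attr(attr_code).unsqueeze(-1).unsqueeze(-1)
        b = self.beta_attr(attr_code).unsqueeze(-1).unsqueeze(-1)
        return x * (1 + g) + b
```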

20 pages, 4451 KB  
Article
Skeleton-Guided Diffusion for Font Generation
by Li Zhao, Shan Dong, Jiayi Liu, Xijin Zhang, Xiaojiao Gao and Xiaojun Wu
Electronics 2025, 14(19), 3932; https://doi.org/10.3390/electronics14193932 - 3 Oct 2025
Abstract
Generating non-standard fonts, such as running script (e.g., XingShu), poses significant challenges due to their high stroke continuity, structural flexibility, and stylistic diversity, which traditional methods built on component-level prior knowledge struggle to model effectively. While diffusion models excel at capturing continuous feature spaces and stroke variations through iterative denoising, they face critical limitations: (1) style leakage, where large stylistic differences lead to inconsistent outputs due to noise interference; (2) structural distortion, caused by the absence of explicit structural guidance, resulting in broken strokes or deformed glyphs; and (3) style confusion, where similar font styles are inadequately distinguished, producing ambiguous results. To address these issues, we propose a novel skeleton-guided diffusion model with three key innovations: (1) a skeleton-constrained style rendering module that enforces semantic alignment and balanced energy constraints to amplify critical skeletal features, mitigating style leakage and ensuring stylistic consistency; (2) a cross-scale skeleton preservation module that integrates multi-scale glyph skeleton information through cross-dimensional interactions, effectively modeling macro-level layouts and micro-level stroke details to prevent structural distortion; and (3) a contrastive style refinement module that leverages skeleton decomposition and recombination strategies, coupled with contrastive learning on positive and negative samples, to establish robust style representations and disambiguate similar styles. Extensive experiments on diverse font datasets demonstrate that our approach significantly improves generation quality, achieving superior style fidelity, structural integrity, and style differentiation compared with state-of-the-art diffusion-based font generation methods.
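As a rough illustration of skeleton conditioning (not the paper's full three-module pipeline), the sketch below shows one DDPM-style training step in which the glyph skeleton is concatenated with the noisy image so the denoiser receives explicit structural guidance at every step. It assumes a Hugging Face diffusers-style noise scheduler and a placeholder unet accepting a timestep and style code; both interfaces are hypothetical stand-ins.

```python
# Hypothetical sketch: one skeleton-conditioned DDPM training step.
import torch
import torch.nn.functional as F

def skeleton_guided_loss(unet, x0, skeleton, style_code, scheduler):
    """Predict the noise added to a clean glyph image x0, conditioned on
    its skeleton map and a target style code (assumed interfaces)."""
    b = x0.size(0)
    # Sample a random diffusion timestep per example.
    t = torch.randint(0, scheduler.config.num_train_timesteps, (b,),
                      device=x0.device)
    noise = torch.randn_like(x0)
    # Forward diffusion: corrupt the clean glyph image.
    x_t = scheduler.add_noise(x0, noise, t)
    # Structural guidance: stack the skeleton channels with the noisy input
    # so the denoiser sees the glyph layout at every denoising step.
    model_in = torch.cat([x_t, skeleton], dim=1)
    pred_noise = unet(model_in, t, style_code)
    return F.mse_loss(pred_noise, noise)
```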
