Advances in Deep Learning for Open-World Computer Vision and Pattern Recognition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 May 2026

Special Issue Editors


Guest Editor
School of Software, Shandong University, Jinan 250101, China
Interests: computer vision; multimedia computing and information retrieval; explainable AI

Guest Editor
School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China
Interests: computer vision; embedded intelligent computing; perception and decision-making for unmanned systems

Guest Editor
College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Interests: computer vision; visual salient object detection; intelligent perception and deep learning; visual navigation

Special Issue Information

Dear Colleagues,

In recent years, deep learning has significantly advanced the fields of computer vision and pattern recognition, and its applications have been widely integrated into various aspects of daily life and industrial production. While encouraging progress has been achieved in relatively simple or closed scenarios, performance in open-world environments remains unsatisfactory. Challenges stem not only from the openness of visual scenes (such as illumination variations, multi-scale objects, rainy or foggy conditions, and occlusion), but also from the openness of training and learning paradigms, including few-shot and zero-shot conditions where annotated data is scarce or unavailable. These challenges underscore the urgent need to investigate advanced deep learning techniques, encompassing diverse neural architectures (such as convolutional networks, Transformers, graph convolutional networks, and Mamba) as well as innovative learning strategies (such as few-shot or zero-shot learning), to enhance the robustness, adaptability, and generalization capability of computer vision and pattern recognition systems in open-world settings.

We are pleased to invite you to contribute to this Special Issue on “Advances in Deep Learning for Open-World Computer Vision and Pattern Recognition”. The aim of this Special Issue is to bring together cutting-edge research efforts that leverage the latest advances in deep learning to tackle open-world challenges in computer vision and pattern recognition.

This Special Issue seeks to provide a platform for researchers and practitioners to share original contributions, novel methodologies, and comprehensive reviews that address both theoretical and practical aspects. Submissions exploring innovative learning paradigms, robust model architectures, and application-driven solutions are highly encouraged.

In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:

  • Multimodal computer vision and pattern recognition;
  • Open-world image recognition and understanding (including detection, classification, segmentation, and enhancement);
  • Advanced neural network architectures for visual representation;
  • Few-shot, zero-shot, and other data-efficient learning strategies;
  • Generative and self-supervised methods for robust visual understanding;
  • Novel benchmarks, datasets, applications, and evaluation protocols for open-world vision tasks.

We look forward to receiving your valuable contributions.

Dr. Mingzhu Xu
Prof. Dr. Bing Liu
Dr. Lina Gao
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • open-world computer vision
  • pattern recognition
  • multimodal learning
  • neural network architectures
  • few-shot learning
  • zero-shot learning
  • self-supervised learning
  • generative models
  • benchmarks and evaluation protocols

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

Research

15 pages, 978 KB  
Article
SpectTrans: Joint Spectral–Temporal Modeling for Polyphonic Piano Transcription via Spectral Gating Networks
by Rui Cao, Yan Liang, Lei Feng and Yuanzi Li
Electronics 2026, 15(3), 665; https://doi.org/10.3390/electronics15030665 - 3 Feb 2026
Abstract
Automatic Music Transcription (AMT) plays a fundamental role in Music Information Retrieval (MIR) by converting raw audio signals into symbolic representations such as MIDI or musical scores. Despite advances in deep learning, accurately transcribing piano performances remains challenging due to dense polyphony, wide dynamic range, sustain pedal effects, and harmonic interactions between simultaneous notes. Existing approaches using convolutional and recurrent architectures, or autoregressive models, often fail to capture long-range temporal dependencies and global harmonic structures, while conventional Vision Transformers overlook the anisotropic characteristics of audio spectrograms, leading to harmonic neglect. In this work, we propose SpectTrans, a novel piano transcription framework that integrates a Spectral Gating Network with a multi-head self-attention Transformer to jointly model spectral and temporal dependencies. Latent CNN features are projected into the frequency domain via a Real Fast Fourier Transform, enabling adaptive filtering of overlapping harmonics and suppression of non-stationary noise, while deeper layers capture long-term melodic and chordal relationships. Experimental evaluation on polyphonic piano datasets demonstrates that this architecture produces acoustically coherent representations, improving the robustness and precision of transcription under complex performance conditions. These results suggest that combining frequency-domain refinement with global temporal modeling provides an effective strategy for high-fidelity AMT.