Multimodal Deep Learning and Its Applications

Special Issue Editors


Guest Editor
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Interests: computer vision; deep learning; large language models

Guest Editor
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Interests: large language models; information retrieval; multimodal learning

Guest Editor
School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, China
Interests: computer vision; deep learning; video image analysis

Special Issue Information

Dear Colleagues,

The rapid evolution of artificial intelligence, coupled with the widespread availability of heterogeneous data sources, has brought multimodal deep learning to the forefront of modern research. Real-world data are no longer limited to a single modality; instead, they consist of complex combinations of text, images, videos, audio signals, graphs, and structured records. Effectively modeling and reasoning over such diverse information has become essential in a wide range of domains, including intelligent recommendation systems, video understanding, information retrieval, healthcare analytics, and decision support systems.

Recent advances in multimodal deep learning—particularly multimodal large language models—have demonstrated remarkable capabilities in integrating vision, language, and other modalities for reasoning, generation, and interaction. However, significant challenges remain in scalability, robustness, interpretability, and real-world deployment. Tasks such as video anomaly detection, cross-modal retrieval, and multimodal recommendation require models that can capture fine-grained temporal dynamics, semantic alignment across modalities, and complex relational structures. Moreover, emerging paradigms such as generative recommender systems and graph-based multimodal representation learning demand novel architectures and learning strategies that go beyond traditional fusion techniques.

This Special Issue, entitled “Multimodal Deep Learning and Its Applications,” aims to bring together cutting-edge research that advances the foundations, methodologies, and applications of multimodal deep learning. The goal is to highlight innovative models and systems that effectively combine multiple modalities, leverage large-scale pretraining, and support intelligent reasoning, generation, and decision-making in complex environments. We particularly encourage contributions that explore the synergy between multimodal learning, large language models, graph representations, and real-world applications.

We welcome original research papers and comprehensive review articles. Topics of interest include, but are not limited to, the following:

  • Multimodal large language models and foundation models;
  • Video understanding and video anomaly detection;
  • Multimodal recommendation systems;
  • Generative recommender models and user behavior modeling;
  • Cross-modal and multimodal retrieval;
  • Graph-based multimodal representation learning;
  • Multimodal fusion, alignment, and representation techniques;
  • Zero-shot and few-shot multimodal learning;
  • Multimodal content generation and editing;
  • Multimodal learning for big data analytics;
  • Trustworthy, explainable, and efficient multimodal models;
  • Continual and adaptive multimodal learning;
  • Applications of multimodal deep learning in healthcare, economics, and social computing.

We believe this Special Issue will provide a timely and meaningful platform for researchers and practitioners to share novel ideas, methodological advances, and practical insights into multimodal deep learning. We look forward to your valuable contributions and to fostering interdisciplinary collaboration within this rapidly growing research area.

Prof. Dr. Ping Hu
Prof. Dr. Jie Zou
Dr. Lu Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • multimodal deep learning
  • multimodal large language models
  • multimodal recommendation
  • cross-modal retrieval
  • graph representation learning
  • multimodal data fusion

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information is available in MDPI's Special Issue policies.

Published Papers

This special issue is now open for submission.