Multimodal Deep Learning and Its Application in Healthcare

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E1: Mathematics and Computer Science".

Deadline for manuscript submissions: 30 April 2026

Special Issue Editor


Dr. Xiaoshui Huang
Guest Editor
School of Public Health, Shanghai Jiao Tong University, Shanghai 200127, China
Interests: multimodal foundation models; computer vision; AI for healthcare

Special Issue Information

Dear Colleagues,

We are pleased to invite you to contribute to this Special Issue entitled "Multimodal Deep Learning and Its Application in Healthcare" in the journal Mathematics.

Recent years have witnessed remarkable progress in deep learning, particularly with the advent of foundation models and multimodal architectures. These models have shown strong capabilities in learning representations across multiple data types, such as text, images, signals, and structured data. In healthcare and medicine, where data are often heterogeneous and high-dimensional (ranging from clinical notes and genomic profiles to radiological and pathological images), multimodal learning is particularly transformative.

At the same time, the mathematical underpinnings of these powerful models are still under active development. A better understanding of their theoretical properties, generalization capabilities, and optimization dynamics is critical to ensuring robustness, interpretability, and trustworthiness in medical and healthcare applications.

This special issue aims to bridge the gap between the mathematical theory of deep and foundation models and their applications in multimodal healthcare data. It seeks to explore both rigorous theoretical developments and practical implementations that advance our understanding and use of deep learning in real-world healthcare contexts.

The journal Mathematics focuses on mathematical and theoretical aspects across various scientific fields. Therefore, this special issue is particularly interested in contributions that either:

  • Provide mathematical insights into deep learning and multimodal models, including generalization bounds, convergence analyses, optimization theory, or information-theoretic frameworks, or
  • Apply these methods to solve important problems in healthcare and biomedical imaging, demonstrating how mathematical foundations contribute to practical advances.

This special issue is expected to gather at least 10 high-quality articles and may be published as a printed book if this goal is achieved.

In this Special Issue, original research articles, theoretical studies, and comprehensive review papers are welcome. Research areas may include (but are not limited to) the following topics:

  • Mathematical foundations of deep learning and foundation models
  • Theoretical analyses of multimodal architectures (e.g., transformers, diffusion models)
  • Convergence theory and generalization bounds for deep neural networks
  • Optimization and training dynamics of large-scale models
  • Information theory and statistical learning theory in multimodal learning
  • Applications of multimodal deep learning in healthcare and medicine
  • Medical image analysis using deep learning (e.g., radiology, pathology, 3D reconstruction)
  • Fusion of clinical, imaging, and omics data for disease diagnosis and prognosis
  • Trustworthy and interpretable AI for healthcare
  • Computational models of disease progression using multimodal data

We look forward to receiving your contributions and building a strong collection of high-quality research at the intersection of mathematics, deep learning, and healthcare.

Dr. Xiaoshui Huang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • multimodal learning
  • deep learning theory
  • foundation models
  • optimization
  • generalization
  • medical imaging
  • healthcare AI
  • mathematical modeling
  • biomedical data fusion
  • theoretical machine learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (1 paper)

Research

18 pages, 3066 KB  
Article
A Tree-Based Search Algorithm with Global Pheromone and Local Signal Guidance for Scientific Chart Reasoning
by Min Zhou, Zhiheng Qi, Tianlin Zhu, Jan Vijg and Xiaoshui Huang
Mathematics 2025, 13(17), 2739; https://doi.org/10.3390/math13172739 - 26 Aug 2025
Abstract
Chart reasoning, a critical task for automating data interpretation in domains such as scientific data analysis and medical diagnostics, leverages large-scale vision language models (VLMs) to interpret chart images and answer natural language questions, enabling semantic understanding that enhances knowledge accessibility and supports data-driven decision-making across diverse domains. In this work, we formalize chart reasoning as a sequential decision-making problem governed by a Markov Decision Process (MDP), thereby providing a mathematically grounded framework for analyzing visual question answering tasks. While recent advances such as multi-step reasoning with Monte Carlo tree search (MCTS) offer interpretable and stochastic planning capabilities, these methods often suffer from redundant path exploration and inefficient reward propagation. To address these challenges, we propose a novel algorithmic framework that integrates a pheromone-guided search strategy inspired by Ant Colony Optimization (ACO). In our approach, chart reasoning is cast as a combinatorial optimization problem over a dynamically evolving search tree, where path desirability is governed by pheromone concentration functions that capture global phenomena across search episodes and are reinforced through trajectory-level rewards. Transition probabilities are further modulated by local signals, which are evaluations derived from the immediate linguistic feedback of large language models. This enables fine-grained decision-making at each step while preserving long-term planning efficacy. Extensive experiments across four benchmark datasets, ChartQA, MathVista, GRAB, and ChartX, demonstrate the effectiveness of our approach, with multi-agent reasoning and pheromone guidance yielding success rate improvements of +18.4% and +7.6%, respectively.
(This article belongs to the Special Issue Multimodal Deep Learning and Its Application in Healthcare)
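
For readers unfamiliar with pheromone-guided search, the idea in the abstract above can be illustrated with a minimal sketch. It assumes the textbook Ant Colony Optimization transition rule (probability proportional to pheromone raised to `alpha` times a local score raised to `beta`) and a simple evaporate-then-reinforce update; the paper's actual formulation may differ. All function names, parameters (`alpha`, `beta`, `rho`), and numeric values below are hypothetical and chosen only for illustration.

```python
import random


def transition_probs(pheromone, local_signal, alpha=1.0, beta=2.0):
    """Textbook ACO rule (an assumption, not the paper's exact formula):
    p_i is proportional to pheromone_i**alpha * local_signal_i**beta."""
    weights = [(t ** alpha) * (s ** beta) for t, s in zip(pheromone, local_signal)]
    total = sum(weights)
    if total == 0:
        # No information at all: fall back to a uniform choice.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def update_pheromone(pheromone, chosen, reward, rho=0.1):
    """Evaporate every trail by a factor (1 - rho), then reinforce the
    chosen child with the trajectory-level reward."""
    updated = [(1.0 - rho) * t for t in pheromone]
    updated[chosen] += reward
    return updated


def select_child(probs, rng):
    """Sample one child index according to the transition probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical node with three candidate reasoning steps.
pheromone = [1.0, 1.0, 1.0]      # global signal shared across search episodes
local_signal = [0.2, 0.5, 0.3]   # stand-in for per-step scores from a language model
rng = random.Random(0)

probs = transition_probs(pheromone, local_signal)
child = select_child(probs, rng)
pheromone = update_pheromone(pheromone, child, reward=0.5)
print(probs, child, pheromone)
```

In this sketch the pheromone list plays the role of the global signal and `local_signal` the role of the step-level evaluation; raising `beta` makes the search follow the local score more greedily, while a larger `rho` forgets earlier episodes faster.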