Mathematics-Driven Computer Vision and Multi-Modal Learning

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E: Applied Mathematics".

Deadline for manuscript submissions: 31 July 2026

Special Issue Editor


Dr. Junjie Zhang
Guest Editor
School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
Interests: computer vision; machine learning; multimedia data analysis; multi-modal learning; remote sensing interpretation; intelligent object perception

Special Issue Information

Dear Colleagues,

This Special Issue of Mathematics focuses on the critical role of mathematics in advancing computer vision and multi-modal learning for remote sensing and low-altitude scene analysis. We seek high-quality articles that explore how mathematical frameworks, such as optimization theory, statistical learning, geometric modeling, signal processing, and differential geometry, drive innovations in these specialized domains. Submissions may cover theoretical advancements, including novel mathematical models for multi-modal data fusion, scene feature representation, or robust inference under complex environmental conditions, as well as practical applications such as enhancing the accuracy of remote sensing image interpretation, low-altitude target detection, or dynamic scene understanding.

We encourage studies that bridge mathematical rigor with real-world challenges in remote sensing and low-altitude scenarios, for example by comparing classical and emerging mathematical tools for handling data sparsity, noise, or multi-source heterogeneity, or by demonstrating how mathematical insights solve critical problems such as multi-modal alignment, small-target recognition, or 3D scene reconstruction. Manuscripts that are purely speculative, without empirical validation, theoretical proof, or practical application verification, will not be considered.

We welcome contributions from researchers worldwide, especially those pioneering cross-disciplinary breakthroughs at the intersection of mathematics, computer vision, multi-modal learning, and the rapidly evolving fields of remote sensing and low-altitude scene analysis.

Dr. Junjie Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts are available on the Instructions for Authors page. Mathematics is an international, peer-reviewed, open access, semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • mathematics
  • computer vision
  • multi-modal learning
  • remote sensing
  • low-altitude scene analysis
  • mathematical modeling
  • data fusion
  • scene understanding
  • optimization theory
  • statistical learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)


Research

36 pages, 1375 KB  
Article
SGMT with S-PACE: A Framework for Temporal Alignment and Quality-Aware Multimodal Fusion in Emotion Recognition
by Jun-Young Ahn, Sathiyamoorthi Arthanari, Sathishkumar Moorthy and Yeon-Kug Moon
Mathematics 2026, 14(10), 1743; https://doi.org/10.3390/math14101743 - 19 May 2026
Abstract
Multimodal emotion recognition is challenging because behavioral signals and physiological responses evolve at different temporal rates. Facial expressions and speech often change rapidly after an emotional event, whereas peripheral biosignals such as electrodermal activity, blood volume pulse, and skin temperature exhibit delayed and smoother dynamics. This temporal inconsistency can degrade fusion performance, particularly in real-world recordings with noisy or missing modalities. To address this issue, this study proposes SGMT, an S-PACE Gated Multimodal Transformer for emotion recognition using speech, facial video, and physiological signals. The proposed SGMT introduces S-PACE, a physiology-guided cross-attention mechanism that aligns fast behavioral cues with slower biosignal representations without assuming a fixed temporal delay. A Quality-Aware Gate further improves robustness by adaptively weighting modalities according to signal reliability. The fused representations are processed using a Temporal Swin Transformer and a Perceiver Fusion module for arousal–valence prediction and emotion quadrant classification. Experiments are conducted on the Korean multimodal emotion datasets KEMDy20 and K-EmoCon under different modality settings. SGMT achieves arousal UARs of 68.4% on KEMDy20 and 62.9% on K-EmoCon, with quadrant accuracies of 44.7% and 62.5%, respectively. Ablation studies demonstrate that the proposed alignment and gating strategies provide more stable multimodal fusion than conventional feature concatenation. The results indicate that SGMT effectively adapts to varying modality availability and improves multimodal emotion recognition in naturalistic environments.
(This article belongs to the Special Issue Mathematics-Driven Computer Vision and Multi-Modal Learning)