Search Results (193)

Search Parameters:
Keywords = modal fusion network framework

23 pages, 6498 KB  
Article
A Cross-Modal Deep Feature Fusion Framework Based on Ensemble Learning for Land Use Classification
by Xiaohuan Wu, Houji Qi, Keli Wang, Yikun Liu and Yang Wang
ISPRS Int. J. Geo-Inf. 2025, 14(11), 411; https://doi.org/10.3390/ijgi14110411 - 23 Oct 2025
Abstract
Land use classification based on multi-modal data fusion has gained significant attention due to its potential to capture the complex characteristics of urban environments. However, effectively extracting and integrating discriminative features derived from heterogeneous geospatial data remains challenging. This study proposes an ensemble learning framework for land use classification by fusing cross-modal deep features from both physical and socioeconomic perspectives. Specifically, the framework utilizes the Masked Autoencoder (MAE) to extract global spatial dependencies from remote sensing imagery and applies long short-term memory (LSTM) networks to model spatial distribution patterns of points of interest (POIs) based on type co-occurrence. Furthermore, we employ inter-modal contrastive learning to enhance the representation of physical and socioeconomic features. To verify the superiority of the ensemble learning framework, we apply it to map the land use distribution of Beijing. By coupling various physical and socioeconomic features, the framework achieves an average accuracy of 84.33%, surpassing several comparative baseline methods. In addition, the framework demonstrates comparable performance when applied to a Shenzhen dataset, confirming its robustness and generalizability. The findings highlight the importance of fully extracting and effectively integrating multi-source deep features in land use classification, providing a robust solution for urban planning and sustainable development. Full article
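
The abstract pairs an MAE image branch with an LSTM POI branch and aligns them through inter-modal contrastive learning. Below is a minimal, hypothetical sketch of such an alignment objective (a symmetric InfoNCE loss); the embedding dimensions, batch size, and temperature are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): an InfoNCE-style inter-modal contrastive loss
# that pulls matched image/POI embeddings together and pushes mismatched pairs apart.
import torch
import torch.nn.functional as F

def inter_modal_contrastive_loss(img_emb, poi_emb, temperature=0.07):
    """img_emb, poi_emb: (batch, dim) embeddings of the same parcels from two modalities."""
    img_emb = F.normalize(img_emb, dim=-1)
    poi_emb = F.normalize(poi_emb, dim=-1)
    logits = img_emb @ poi_emb.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(img_emb.size(0))               # positives sit on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random features standing in for MAE / LSTM outputs.
loss = inter_modal_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())
```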

72 pages, 2054 KB  
Article
Neural Network IDS/IPS Intrusion Detection and Prevention System with Adaptive Online Training to Improve Corporate Network Cybersecurity, Evidence Recording, and Interaction with Law Enforcement Agencies
by Serhii Vladov, Victoria Vysotska, Svitlana Vashchenko, Serhii Bolvinov, Serhii Glubochenko, Andrii Repchonok, Maksym Korniienko, Mariia Nazarkevych and Ruslan Herasymchuk
Big Data Cogn. Comput. 2025, 9(11), 267; https://doi.org/10.3390/bdcc9110267 - 22 Oct 2025
Abstract
This article examines the problem of reliable online detection and IDS/IPS intrusion prevention in dynamic corporate networks, where traditional signature-based methods fail to keep pace with new and evolving attacks, and streaming data is susceptible to drift and targeted “poisoning” of the training dataset. As a solution, we propose a hybrid neural network system with adaptive online training, a formal minimax false-positive control framework, and a set of robustness mechanisms (a Huber model, a pruned learning rate, DRO, a gradient-norm regularizer, and prioritized replay). In practice, the system combines modal encoders for traffic, logs, and metrics; a temporal GNN for entity correlation; a variational module for uncertainty assessment; a differentiable symbolic unit for logical rules; an RL agent for incident prioritization; and an NLG module for explanations and the preparation of forensically relevant artifacts. These components are connected via a cognitive layer (cross-modal fusion memory), a Bayesian neural network fuser, and a single multi-task loss function. The practical implementation includes a “novelty detection → active labelling → incremental supervised update” pipeline and chain-of-custody mechanisms for evidential fitness. Experiments demonstrate a significant improvement in quality: the developed system achieves an ROC AUC of 0.96, an F1-score of 0.95, and a significantly lower FPR than baseline architectures (MLP, CNN, and LSTM). In applied validation tasks, detection rates of ≈92–94% and resistance to distribution drift are noted. Full article
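
The description combines modality-specific encoders with a fusion layer and a single multi-task loss. The following is a deliberately simplified sketch of that pattern (concatenation fusion, two toy heads); layer sizes and the auxiliary loss weight are placeholders, and the paper's GNN, variational, symbolic, RL, and NLG modules are omitted entirely.

```python
# Illustrative sketch only: per-modality encoders (traffic / logs / metrics) fused into a
# single representation, trained with a weighted multi-task loss.
import torch
import torch.nn as nn

class ModalFusionIDS(nn.Module):
    def __init__(self, dims=(64, 32, 16), hidden=128, n_classes=2):
        super().__init__()
        self.encoders = nn.ModuleList(nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in dims)
        self.fuser = nn.Linear(hidden * len(dims), hidden)
        self.cls_head = nn.Linear(hidden, n_classes)     # intrusion classification head
        self.sev_head = nn.Linear(hidden, 1)             # auxiliary severity regression head

    def forward(self, traffic, logs, metrics):
        z = torch.cat([enc(x) for enc, x in zip(self.encoders, (traffic, logs, metrics))], dim=-1)
        h = torch.relu(self.fuser(z))
        return self.cls_head(h), self.sev_head(h)

model = ModalFusionIDS()
logits, severity = model(torch.randn(4, 64), torch.randn(4, 32), torch.randn(4, 16))
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (4,))) \
       + 0.5 * nn.functional.mse_loss(severity.squeeze(-1), torch.rand(4))
print(logits.shape, loss.item())
```
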
(This article belongs to the Special Issue Internet Intelligence for Cybersecurity)
24 pages, 2308 KB  
Review
Review on Application of Machine Vision-Based Intelligent Algorithms in Gear Defect Detection
by Dehai Zhang, Shengmao Zhou, Yujuan Zheng and Xiaoguang Xu
Processes 2025, 13(10), 3370; https://doi.org/10.3390/pr13103370 - 21 Oct 2025
Abstract
Gear defect detection directly affects the operational reliability of critical equipment in fields such as automotive and aerospace. Gear defect detection technology based on machine vision, leveraging the advantages of non-contact measurement, high efficiency, and cost-effectiveness, has become a key support for quality control in intelligent manufacturing. However, it still faces challenges including difficulties in semantic alignment of multimodal data, the imbalance between real-time detection requirements and computational resources, and poor model generalization in few-shot scenarios. This paper takes the paradigm evolution of gear defect detection technology as its main line, systematically reviews its development from traditional image processing to deep learning, and focuses on the innovative application of intelligent algorithms. A research framework of “technical bottleneck–breakthrough path–application verification” is constructed: for the problem of multimodal fusion, the cross-modal feature alignment mechanism based on Transformer networks is analyzed in depth, clarifying its technical path of realizing joint embedding of visual and vibration signals by establishing global correlation mapping; for resource constraints, the performance of lightweight models such as MobileNet and ShuffleNet is quantitatively compared, verifying that these models reduce parameter counts by 40–60% while keeping mean Average Precision essentially unchanged; for small-sample scenarios, few-shot generation models based on contrastive learning are systematically organized, confirming that their accuracy in the 10-shot scenario can reach 90% of that of fully supervised models, thus enhancing generalization ability. Future research can focus on the collaboration between few-shot generation and physical simulation, edge–cloud dynamic scheduling, defect evolution modeling driven by multiphysics fields, and standardization of explainable artificial intelligence. These directions aim to construct a gear detection system with autonomous perception capabilities, promoting the development of industrial quality inspection toward high-precision, high-robustness, and low-cost intelligence. Full article
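
The review's lightweight-backbone comparison can be illustrated with a quick parameter count; this hedged snippet assumes a recent torchvision and uses ResNet-50 as a stand-in heavy baseline, so the exact 40–60% figures reported above depend on which models are compared.

```python
# Quick illustration of the kind of parameter-count comparison the review reports
# for lightweight backbones (MobileNet, ShuffleNet) against a heavier baseline.
import torchvision.models as models

def param_count(m):
    return sum(p.numel() for p in m.parameters())

for name, ctor in [("resnet50", models.resnet50),
                   ("mobilenet_v2", models.mobilenet_v2),
                   ("shufflenet_v2_x1_0", models.shufflenet_v2_x1_0)]:
    print(f"{name:20s} {param_count(ctor(weights=None)) / 1e6:6.1f} M parameters")
```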

20 pages, 2565 KB  
Article
GBV-Net: Hierarchical Fusion of Facial Expressions and Physiological Signals for Multimodal Emotion Recognition
by Jiling Yu, Yandong Ru, Bangjun Lei and Hongming Chen
Sensors 2025, 25(20), 6397; https://doi.org/10.3390/s25206397 - 16 Oct 2025
Viewed by 436
Abstract
A core challenge in multimodal emotion recognition lies in the precise capture of the inherent multimodal interactive nature of human emotions. Addressing the limitation of existing methods, which often process visual signals (facial expressions) and physiological signals (EEG, ECG, EOG, and GSR) in isolation and thus fail to exploit their complementary strengths effectively, this paper presents a new multimodal emotion recognition framework called the Gated Biological Visual Network (GBV-Net). This framework enhances emotion recognition accuracy through deep synergistic fusion of facial expressions and physiological signals. GBV-Net integrates three core modules: (1) a facial feature extractor based on a modified ConvNeXt V2 architecture incorporating lightweight Transformers, specifically designed to capture subtle spatio-temporal dynamics in facial expressions; (2) a hybrid physiological feature extractor combining 1D convolutions, Temporal Convolutional Networks (TCNs), and convolutional self-attention mechanisms, adept at modeling local patterns and long-range temporal dependencies in physiological signals; and (3) an enhanced gated attention fusion module capable of adaptively learning inter-modal weights to achieve dynamic, synergistic integration at the feature level. A thorough investigation of the publicly accessible DEAP and MAHNOB-HCI datasets reveals that GBV-Net surpasses contemporary methods. Specifically, on the DEAP dataset, the model attained classification accuracies of 95.10% for Valence and 95.65% for Arousal, with F1-scores of 95.52% and 96.35%, respectively. On MAHNOB-HCI, the accuracies achieved were 97.28% for Valence and 97.73% for Arousal, with F1-scores of 97.50% and 97.74%, respectively. These experimental findings substantiate that GBV-Net effectively captures deep-level interactive information between multimodal signals, thereby improving emotion recognition accuracy. Full article
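
A minimal sketch of a gated fusion block in the spirit of the fusion module described above, not the published GBV-Net architecture: a learned gate weighs the facial and physiological feature vectors before they are combined. Feature dimensions are illustrative assumptions.

```python
# Minimal gated-fusion sketch: a sigmoid gate produces per-feature weights in (0, 1)
# and the two modality vectors are combined as a convex combination.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, face_feat, physio_feat):
        g = self.gate(torch.cat([face_feat, physio_feat], dim=-1))  # adaptive inter-modal weights
        return g * face_feat + (1 - g) * physio_feat

fused = GatedFusion(128)(torch.randn(4, 128), torch.randn(4, 128))
print(fused.shape)  # torch.Size([4, 128])
```
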
(This article belongs to the Section Biomedical Sensors)

24 pages, 2289 KB  
Article
Improving Early Prediction of Sudden Cardiac Death Risk via Hierarchical Feature Fusion
by Xin Huang, Guangle Jia, Mengmeng Huang, Xiaoyu He, Yang Li and Mingfeng Jiang
Symmetry 2025, 17(10), 1738; https://doi.org/10.3390/sym17101738 - 15 Oct 2025
Viewed by 237
Abstract
Sudden cardiac death (SCD) is a leading cause of mortality worldwide, with arrhythmia serving as a major precursor. Early and accurate prediction of SCD using non-invasive electrocardiogram (ECG) signals remains a critical clinical challenge, particularly due to the inherent asymmetric and non-stationary characteristics of ECG signals, which complicate feature extraction and model generalization. In this study, we propose a novel SCD prediction framework based on hierarchical feature fusion, designed to capture both non-stationary and asymmetrical patterns in ECG data across six distinct time intervals preceding the onset of ventricular fibrillation (VF). First, linear features are extracted from ECG signals using waveform detection methods; nonlinear features are derived from RR interval sequences via second-order detrended fluctuation analysis (DFA2); and multi-scale deep learning features are captured using a Temporal Convolutional Network-based sequence-to-vector (TCN-Seq2vec) model. These multi-scale deep learning features, along with linear and nonlinear features, are then hierarchically fused. Finally, two fully connected layers are employed as a classifier to estimate the probability of SCD occurrence. The proposed method is evaluated under an inter-patient paradigm using the Sudden Cardiac Death Holter (SCDH) Database and the Normal Sinus Rhythm (NSR) Database. This method achieves average prediction accuracies of 97.48% and 98.8% for the 60 and 30 min periods preceding SCD, respectively. The findings suggest that integrating traditional and deep learning features effectively enhances the discriminability of abnormal samples, thereby improving SCD prediction accuracy. Ablation studies confirm that multi-feature fusion significantly improves performance compared to single-modality models, and validation on the Creighton University Ventricular Tachyarrhythmia Database (CUDB) demonstrates strong generalization capability. This approach offers a reliable, long-horizon early warning tool for clinical SCD risk assessment. Full article
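
One of the nonlinear features named above, second-order detrended fluctuation analysis (DFA2) of RR intervals, can be sketched compactly; the window sizes and synthetic data below are illustrative choices, not the study's configuration.

```python
# Sketch of DFA2 on an RR-interval series: integrate the series, detrend each window with a
# quadratic fit, and estimate the scaling exponent from the log-log fluctuation curve.
import numpy as np

def dfa2(rr, scales=(4, 8, 16, 32, 64)):
    y = np.cumsum(rr - np.mean(rr))                    # integrated (profile) series
    fluct = []
    for s in scales:
        n_win = len(y) // s
        segs = y[:n_win * s].reshape(n_win, s)
        x = np.arange(s)
        f2 = []
        for seg in segs:
            coeffs = np.polyfit(x, seg, 2)             # local quadratic trend (order 2 -> "DFA2")
            f2.append(np.mean((seg - np.polyval(coeffs, x)) ** 2))
        fluct.append(np.sqrt(np.mean(f2)))
    # scaling exponent alpha = slope of log F(s) versus log s
    return np.polyfit(np.log(scales), np.log(fluct), 1)[0]

rr = 0.8 + 0.05 * np.random.randn(1000)               # synthetic RR intervals (seconds)
print(round(dfa2(rr), 3))                              # ~0.5 for uncorrelated noise
```
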
(This article belongs to the Section Life Sciences)

19 pages, 4569 KB  
Article
NeuroNet-AD: A Multimodal Deep Learning Framework for Multiclass Alzheimer’s Disease Diagnosis
by Saeka Rahman, Md Motiur Rahman, Smriti Bhatt, Raji Sundararajan and Miad Faezipour
Bioengineering 2025, 12(10), 1107; https://doi.org/10.3390/bioengineering12101107 - 15 Oct 2025
Viewed by 478
Abstract
Alzheimer’s disease (AD) is the most prevalent form of dementia. This disease significantly impacts cognitive functions and daily activities. Early and accurate diagnosis of AD, including the preliminary stage of mild cognitive impairment (MCI), is critical for effective patient care and treatment development. Although advancements in deep learning (DL) and machine learning (ML) models improve diagnostic precision, the lack of large datasets limits further enhancements, necessitating the use of complementary data. Existing convolutional neural networks (CNNs) process visual features effectively but struggle to fuse multimodal data for AD diagnosis. To address these challenges, we propose NeuroNet-AD, a novel multimodal CNN framework designed to enhance AD classification accuracy. NeuroNet-AD integrates Magnetic Resonance Imaging (MRI) images with clinical text-based metadata, including psychological test scores, demographic information, and genetic biomarkers. In NeuroNet-AD, we incorporate Convolutional Block Attention Modules (CBAMs) within the ResNet-18 backbone, enabling the model to focus on the most informative spatial and channel-wise features. We introduce an attention computation and multimodal fusion module, named Meta Guided Cross Attention (MGCA), which facilitates effective cross-modal alignment between images and meta-features through a multi-head attention mechanism. Additionally, we employ an ensemble-based feature selection strategy to identify the most discriminative features from the textual data, improving model generalization and performance. We evaluate NeuroNet-AD on the Alzheimer’s Disease Neuroimaging Initiative (ADNI1) dataset using subject-level 5-fold cross-validation and a held-out test set to ensure robustness. NeuroNet-AD achieved 98.68% accuracy in multiclass classification of normal control (NC), MCI, and AD and 99.13% accuracy in the binary setting (NC vs. AD) on the ADNI dataset, outperforming state-of-the-art models. External validation on the OASIS-3 dataset further confirmed the model’s generalization ability, achieving 94.10% accuracy in the multiclass setting and 98.67% accuracy in the binary setting, despite variations in demographics and acquisition protocols. Further extensive evaluation studies demonstrate the effectiveness of each component of NeuroNet-AD in improving performance. Full article
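
A hedged sketch of cross-modal attention between image tokens and clinical metadata, analogous in spirit to the MGCA module described above; the dimensions, head count, and query/key/value assignment are assumptions for illustration, not the published design.

```python
# Image tokens query embedded metadata tokens through standard multi-head attention,
# followed by a residual connection and layer normalization.
import torch
import torch.nn as nn

class MetaGuidedCrossAttention(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens, meta_tokens):
        out, _ = self.attn(query=img_tokens, key=meta_tokens, value=meta_tokens)
        return self.norm(img_tokens + out)

img = torch.randn(2, 49, 256)                          # e.g. a 7x7 CNN feature map as tokens
meta = torch.randn(2, 5, 256)                          # embedded psych / demographic / genetic features
print(MetaGuidedCrossAttention()(img, meta).shape)     # torch.Size([2, 49, 256])
```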

26 pages, 1049 KB  
Article
Graph-Driven Medical Report Generation with Adaptive Knowledge Distillation
by Jingqian Chen, Xin Huang, Mingfeng Jiang, Yang Li, Zimin Zou and Diqing Qian
Appl. Sci. 2025, 15(20), 10974; https://doi.org/10.3390/app152010974 - 13 Oct 2025
Viewed by 288
Abstract
Automated medical report generation (MRG) faces a critical hurdle in seamlessly integrating detailed visual evidence with accurate clinical diagnoses. Current approaches often rely on static knowledge transfer, overlooking the complex interdependencies among pathological findings and their nuanced alignment with visual evidence, yielding reports that are linguistically sound but clinically misaligned. To address these limitations, we propose a novel graph-driven medical report generation framework with adaptive knowledge distillation. Our architecture leverages a dual-phase optimization process. First, visual–semantic enhancement proceeds through the explicit correlation of image features with a structured knowledge network and their concurrent enrichment via cross-modal semantic fusion, ensuring that generated descriptions are grounded in anatomical and pathological context. Second, a knowledge distillation mechanism iteratively refines both global narrative flow and local descriptive precision, enhancing the consistency between images and text. Comprehensive experiments on the MIMIC-CXR and IU X-Ray datasets demonstrate the effectiveness of our approach, which achieves state-of-the-art performance in clinical efficacy metrics across both datasets. Full article
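
For orientation only, here is a generic sketch of the basic distillation mechanism (temperature-scaled KL divergence between teacher and student token distributions); the paper's adaptive, graph-driven distillation is considerably more involved, so treat this purely as an illustration of the underlying loss term.

```python
# Generic report-level distillation term: KL(teacher || student) with temperature scaling.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """logits: (batch, seq_len, vocab). Temperature T softens both distributions."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * T * T

loss = distillation_loss(torch.randn(2, 10, 500), torch.randn(2, 10, 500))
print(loss.item())
```
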
(This article belongs to the Section Computing and Artificial Intelligence)

20 pages, 5086 KB  
Article
A Multi-Modal Attention Fusion Framework for Road Connectivity Enhancement in Remote Sensing Imagery
by Yongqi Yuan, Yong Cheng, Bo Pan, Ge Jin, De Yu, Mengjie Ye and Qian Zhang
Mathematics 2025, 13(20), 3266; https://doi.org/10.3390/math13203266 - 13 Oct 2025
Viewed by 336
Abstract
Ensuring the structural continuity and completeness of road networks in high-resolution remote sensing imagery remains a major challenge for current deep learning methods, especially under conditions of occlusion caused by vegetation, buildings, or shadows. To address this, we propose a novel post-processing enhancement framework that improves the connectivity and accuracy of initial road extraction results produced by any segmentation model. The method employs a dual-stream encoder architecture, which jointly processes RGB images and preliminary road masks to obtain complementary spatial and semantic information. A core component is the MAF (Multi-Modal Attention Fusion) module, designed to capture fine-grained, long-range, and cross-scale dependencies between image and mask features. This fusion leads to the restoration of fragmented road segments, the suppression of noise, and overall improvement in road completeness. Experiments on benchmark datasets (DeepGlobe and Massachusetts) demonstrate substantial gains in precision, recall, F1-score, and mIoU, confirming the framework’s effectiveness and generalization ability in real-world scenarios. Full article
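
A simplified sketch of the dual-stream post-processing idea (RGB image plus preliminary road mask in, refined mask logits out); the paper's MAF module uses attention-based fusion, which is replaced here by a plain concatenation, and all layer sizes are illustrative assumptions.

```python
# Dual-stream refiner sketch: encode RGB and coarse mask separately, fuse, predict refined mask.
import torch
import torch.nn as nn

class DualStreamRefiner(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.rgb_enc = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.mask_enc = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(2 * ch, 1, 1)            # fuse streams, predict refined mask logits

    def forward(self, rgb, coarse_mask):
        feats = torch.cat([self.rgb_enc(rgb), self.mask_enc(coarse_mask)], dim=1)
        return self.head(feats)

refined = DualStreamRefiner()(torch.randn(1, 3, 128, 128), torch.rand(1, 1, 128, 128))
print(refined.shape)  # torch.Size([1, 1, 128, 128])
```
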
(This article belongs to the Special Issue Mathematical Methods for Machine Learning and Computer Vision)

31 pages, 2953 KB  
Article
A Balanced Multimodal Multi-Task Deep Learning Framework for Robust Patient-Specific Quality Assurance
by Xiaoyang Zeng, Awais Ahmed and Muhammad Hanif Tunio
Diagnostics 2025, 15(20), 2555; https://doi.org/10.3390/diagnostics15202555 - 10 Oct 2025
Viewed by 416
Abstract
Background: Multimodal deep learning has emerged as a crucial method for automated patient-specific quality assurance (PSQA) in radiotherapy research. Integrating image-based dose matrices with tabular plan complexity metrics enables more accurate prediction of quality indicators, including the Gamma Passing Rate (GPR) and dose difference (DD). However, modality imbalance remains a significant challenge, as tabular encoders often dominate training, suppressing image encoders and reducing model robustness. This issue becomes more pronounced under task heterogeneity, with GPR prediction relying more on tabular data and dose difference prediction (DDP) depending heavily on image features. Methods: We propose BMMQA (Balanced Multi-modal Quality Assurance), a novel framework that achieves modality balance by adjusting modality-specific loss factors to control convergence dynamics. The framework introduces four key innovations: (1) task-specific fusion strategies (softmax-weighted attention for GPR regression and spatial cascading for DD prediction); (2) a balancing mechanism supported by Shapley values to quantify modality contributions; (3) a fast network forward mechanism for efficient computation of different modality combinations; and (4) a modality-contribution-based task weighting scheme for multi-task multimodal learning. A large-scale multimodal dataset comprising 1370 IMRT plans was curated in collaboration with Peking Union Medical College Hospital (PUMCH). Results: Experimental results demonstrate that, under the standard 2%/3 mm GPR criterion, BMMQA outperforms existing fusion baselines. Under the stricter 2%/2 mm criterion, it achieves a 15.7% reduction in mean absolute error (MAE). The framework also enhances robustness in critical failure cases (GPR < 90%) and achieves a peak SSIM of 0.964 in dose distribution prediction. Conclusions: Explicit modality balancing improves predictive accuracy and strengthens clinical trustworthiness by mitigating overreliance on a single modality. This work highlights the importance of addressing modality imbalance for building trustworthy and robust AI systems in PSQA and establishes a pioneering framework for multi-task multimodal learning. Full article
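
Quantifying per-modality contribution with Shapley values, as the balancing mechanism above does, is exact and cheap when there are only two modalities; the sketch below enumerates all subsets, with the subset scores being purely hypothetical stand-ins for validation performance.

```python
# Exact Shapley values for two "players" (image dose matrix vs. tabular complexity metrics).
# v() maps a modality subset to a hypothetical validation score; empty set = chance baseline.
from itertools import combinations
from math import factorial

def shapley(players, v):
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (v(frozenset(subset) | {p}) - v(frozenset(subset)))
        phi[p] = total
    return phi

scores = {frozenset(): 0.50, frozenset({"image"}): 0.78,
          frozenset({"tabular"}): 0.85, frozenset({"image", "tabular"}): 0.93}
print(shapley(["image", "tabular"], lambda s: scores[frozenset(s)]))
# contributions sum to v(full) - v(empty) = 0.43
```
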
(This article belongs to the Special Issue Deep Learning in Medical and Biomedical Image Processing)

21 pages, 14964 KB  
Article
An Automated Framework for Abnormal Target Segmentation in Levee Scenarios Using Fusion of UAV-Based Infrared and Visible Imagery
by Jiyuan Zhang, Zhonggen Wang, Jing Chen, Fei Wang and Lyuzhou Gao
Remote Sens. 2025, 17(20), 3398; https://doi.org/10.3390/rs17203398 - 10 Oct 2025
Viewed by 326
Abstract
Levees are critical for flood defence, but their integrity is threatened by hazards such as piping and seepage, especially during high-water-level periods. Traditional manual inspections for these hazards and associated emergency response elements, such as personnel and assets, are inefficient and often impractical. While UAV-based remote sensing offers a promising alternative, the effective fusion of multi-modal data and the scarcity of labelled data for supervised model training remain significant challenges. To overcome these limitations, this paper reframes levee monitoring as an unsupervised anomaly detection task. We propose a novel, fully automated framework that unifies geophysical hazards and emergency response elements into a single analytical category of “abnormal targets” for comprehensive situational awareness. The framework consists of three key modules: (1) a state-of-the-art registration algorithm to precisely align infrared and visible images; (2) a generative adversarial network to fuse the thermal information from IR images with the textural details from visible images; and (3) an adaptive, unsupervised segmentation module where a mean-shift clustering algorithm, with its hyperparameters automatically tuned by Bayesian optimization, delineates the targets. We validated our framework on a real-world dataset collected from a levee on the Pajiang River, China. The proposed method demonstrates superior performance over all baselines, achieving an Intersection over Union of 0.348 and a macro F1-Score of 0.479. This work provides a practical, training-free solution for comprehensive levee monitoring and demonstrates the synergistic potential of multi-modal fusion and automated machine learning for disaster management. Full article
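
The final segmentation step can be illustrated on its own: mean-shift clustering of per-pixel features from a fused image. The paper tunes the clustering hyperparameters with Bayesian optimization; in this sketch a heuristic bandwidth estimate stands in for that step, and the random array stands in for an IR/visible fused image.

```python
# Mean-shift segmentation of a fused image's per-pixel intensities (illustrative only).
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

fused = np.random.rand(64, 64).astype(np.float32)        # stand-in for an IR/visible fused image
feats = fused.reshape(-1, 1)                              # per-pixel intensity features
bw = estimate_bandwidth(feats, quantile=0.2, n_samples=500)
labels = MeanShift(bandwidth=bw, bin_seeding=True).fit_predict(feats)
segmentation = labels.reshape(fused.shape)                # cluster id per pixel
print(segmentation.shape, len(np.unique(labels)), "clusters")
```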

21 pages, 2189 KB  
Article
Hybrid CNN-Swin Transformer Model to Advance the Diagnosis of Maxillary Sinus Abnormalities on CT Images Using Explainable AI
by Mohammad Alhumaid and Ayman G. Fayoumi
Computers 2025, 14(10), 419; https://doi.org/10.3390/computers14100419 - 2 Oct 2025
Viewed by 316
Abstract
Accurate diagnosis of sinusitis is essential due to its widespread prevalence and its considerable impact on patient quality of life. While multiple imaging techniques are available for detecting maxillary sinus abnormalities, computed tomography (CT) remains the preferred modality because of its high sensitivity and spatial resolution. Although recent advances in deep learning have led to the development of automated methods for sinusitis classification, many existing models perform poorly in the presence of complex pathological features and offer limited interpretability, which hinders their integration into clinical workflows. In this study, we propose a hybrid deep learning framework that combines EfficientNetB0, a convolutional neural network, with the Swin Transformer, a vision transformer, to improve feature representation. An attention-based fusion module is used to integrate both local and global information, thereby enhancing diagnostic accuracy. To improve transparency and support clinical adoption, the model incorporates explainable artificial intelligence (XAI) techniques using Gradient-weighted Class Activation Mapping (Grad-CAM). This allows for visualization of the regions influencing the model’s predictions, helping radiologists assess the clinical relevance of the results. We evaluate the proposed method on a curated maxillary sinus CT dataset covering four diagnostic categories: Normal, Opacified, Polyposis, and Retention Cysts. The model achieves a classification accuracy of 95.83%, with precision, recall, and F1 score all at 95%. Grad-CAM visualizations indicate that the model consistently focuses on clinically significant regions of the sinus anatomy, supporting its potential utility as a reliable diagnostic aid in medical practice. Full article
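
A rough sketch of the dual-backbone idea using the timm library (an assumption; the paper may rely on other implementations): CNN and Swin features are extracted and combined for four-class classification. The attention-based fusion module described above is simplified to a concatenation here.

```python
# Dual-backbone feature extraction (EfficientNetB0 + Swin-Tiny) with a simple fused classifier.
import timm
import torch
import torch.nn as nn

cnn = timm.create_model("efficientnet_b0", pretrained=False, num_classes=0)   # feature mode
swin = timm.create_model("swin_tiny_patch4_window7_224", pretrained=False, num_classes=0)
classifier = nn.Linear(cnn.num_features + swin.num_features, 4)  # 4 sinus CT classes

x = torch.randn(2, 3, 224, 224)
with torch.no_grad():
    logits = classifier(torch.cat([cnn(x), swin(x)], dim=-1))
print(logits.shape)  # torch.Size([2, 4])
```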

18 pages, 748 KB  
Review
Statistical Methods for Multi-Omics Analysis in Neurodevelopmental Disorders: From High Dimensionality to Mechanistic Insight
by Manuel Airoldi, Veronica Remori and Mauro Fasano
Biomolecules 2025, 15(10), 1401; https://doi.org/10.3390/biom15101401 - 2 Oct 2025
Viewed by 721
Abstract
Neurodevelopmental disorders (NDDs), including autism spectrum disorder, intellectual disability, and attention-deficit/hyperactivity disorder, are genetically and phenotypically heterogeneous conditions affecting millions worldwide. High-throughput omics technologies—transcriptomics, proteomics, metabolomics, and epigenomics—offer a unique opportunity to link genetic variation to molecular and cellular mechanisms underlying these disorders. However, the high dimensionality, sparsity, batch effects, and complex covariance structures of omics data present significant statistical challenges, requiring robust normalization, batch correction, imputation, dimensionality reduction, and multivariate modeling approaches. This review provides a comprehensive overview of statistical frameworks for analyzing high-dimensional omics datasets in NDDs, including univariate and multivariate models, penalized regression, sparse canonical correlation analysis, partial least squares, and integrative multi-omics methods such as DIABLO, similarity network fusion, and MOFA. We illustrate how these approaches have revealed convergent molecular signatures—synaptic, mitochondrial, and immune dysregulation—across transcriptomic, proteomic, and metabolomic layers in human cohorts and experimental models. Finally, we discuss emerging strategies, including single-cell and spatially resolved omics, machine learning-driven integration, and longitudinal multi-modal analyses, highlighting their potential to translate complex molecular patterns into mechanistic insights, biomarkers, and therapeutic targets. Integrative multi-omics analyses, grounded in rigorous statistical methodology, are poised to advance mechanistic understanding and precision medicine in NDDs. Full article
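
As a tiny illustration of one of the reviewed tools, the snippet below runs canonical correlation analysis between two synthetic omics blocks (e.g., transcriptomics versus proteomics); sparse CCA, PLS, and MOFA-style factor models follow the same "shared latent variation" logic with additional structure, and the data here are simulated, not from any cohort.

```python
# CCA between two omics blocks sharing a low-dimensional latent signal (synthetic data).
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))                                              # shared signal, 100 samples
omics_x = latent @ rng.normal(size=(2, 50)) + 0.5 * rng.normal(size=(100, 50))  # 50 transcripts
omics_y = latent @ rng.normal(size=(2, 30)) + 0.5 * rng.normal(size=(100, 30))  # 30 proteins

cca = CCA(n_components=2).fit(omics_x, omics_y)
x_scores, y_scores = cca.transform(omics_x, omics_y)
print(np.corrcoef(x_scores[:, 0], y_scores[:, 0])[0, 1])  # first canonical correlation
```
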
(This article belongs to the Section Bioinformatics and Systems Biology)

16 pages, 1698 KB  
Article
Fall Detection by Deep Learning-Based Bimodal Movement and Pose Sensing with Late Fusion
by Haythem Rehouma and Mounir Boukadoum
Sensors 2025, 25(19), 6035; https://doi.org/10.3390/s25196035 - 1 Oct 2025
Viewed by 455
Abstract
The timely detection of falls among the elderly remains challenging. Single-modality sensing approaches using inertial measurement units (IMUs) or vision-based monitoring systems frequently exhibit high false-positive rates and compromised accuracy under suboptimal operating conditions. We propose a novel deep learning-based bimodal sensing framework to address the problem, leveraging a memory-based autoencoder neural network for inertial abnormality detection and an attention-based neural network for visual pose assessment, with late fusion at the decision level. Our experimental evaluation with a custom dataset of simulated falls and routine activities, captured with waist-mounted IMUs and RGB cameras under dim lighting, shows significant performance improvement by the described bimodal late-fusion system, with an F1-score of 97.3% and, most notably, a false-positive rate of 3.6%, significantly lower than the 11.3% and 8.9% of the IMU-only and vision-only baselines, respectively. These results confirm the robustness of the described fall detection approach and validate its applicability to real-time fall detection under different light settings, including nighttime conditions. Full article
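
Decision-level late fusion of the kind described above can be stated in a few lines; the weights and threshold below are illustrative placeholders, not the values used in the paper.

```python
# Minimal late-fusion rule: each branch outputs a fall probability; a weighted sum decides.
def late_fusion(p_imu: float, p_vision: float, w_imu: float = 0.5, threshold: float = 0.5) -> bool:
    """Return True if the fused fall probability exceeds the decision threshold."""
    p_fused = w_imu * p_imu + (1.0 - w_imu) * p_vision
    return p_fused >= threshold

print(late_fusion(0.82, 0.35))   # IMU confident, vision unsure -> fused decision
print(late_fusion(0.20, 0.30))   # both low -> no fall
```
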
(This article belongs to the Special Issue Sensor-Based Human Activity Recognition)

29 pages, 3761 KB  
Article
An Adaptive Transfer Learning Framework for Multimodal Autism Spectrum Disorder Diagnosis
by Wajeeha Malik, Muhammad Abuzar Fahiem, Jawad Khan, Younhyun Jung and Fahad Alturise
Life 2025, 15(10), 1524; https://doi.org/10.3390/life15101524 - 26 Sep 2025
Viewed by 558
Abstract
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition with diverse behavioral, genetic, and structural characteristics. Due to its heterogeneous nature, early diagnosis of ASD is challenging, and conventional unimodal approaches often fail to capture cross-modal dependencies. To address these challenges, this study introduces an adaptive multimodal fusion framework that integrates behavioral, genetic, and structural MRI (sMRI) data. Each modality undergoes a dedicated preprocessing and feature optimization phase. For behavioral data, an ensemble of classifiers using a stacking technique and attention mechanism is applied for feature extraction, achieving an accuracy of 95.5%. The genetic data is analyzed using Gradient Boosting, which attained a classification accuracy of 86.6%. For the sMRI data, a Hybrid Convolutional Neural Network–Graph Neural Network (Hybrid-CNN-GNN) architecture is proposed, demonstrating strong performance with an accuracy of 96.32%, surpassing existing methods. To unify these modalities, they are fused using an adaptive late fusion strategy implemented with a Multilayer Perceptron (MLP), where adaptive weighting adjusts each modality’s contribution based on validation performance. The integrated framework addresses the limitations of unimodal approaches by creating a unified diagnostic model. The transfer learning framework achieves superior diagnostic accuracy (98.7%) compared to unimodal baselines, demonstrating strong generalization across heterogeneous datasets and offering a promising step toward reliable, multimodal ASD diagnosis. Full article
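
The adaptive-weighting idea alone can be sketched directly: each modality's contribution to the late fusion is scaled by its validation performance. The per-modality probabilities below are hypothetical branch outputs; only the three validation accuracies come from the abstract.

```python
# Validation-performance-weighted late fusion of three modality-specific branch outputs.
val_acc = {"behavioral": 0.955, "genetic": 0.866, "smri": 0.9632}
weights = {m: a / sum(val_acc.values()) for m, a in val_acc.items()}   # normalise to sum to 1

probs = {"behavioral": 0.91, "genetic": 0.72, "smri": 0.88}            # hypothetical ASD probabilities
fused = sum(weights[m] * probs[m] for m in probs)
print({m: round(w, 3) for m, w in weights.items()}, round(fused, 3))
```
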
(This article belongs to the Special Issue Advanced Machine Learning for Disease Prediction and Prevention)

25 pages, 1432 KB  
Article
GATransformer: A Network Threat Detection Method Based on Graph-Sequence Enhanced Transformer
by Qigang Zhu, Xiong Zhan, Wei Chen, Yuanzhi Li, Hengwei Ouyang, Tian Jiang and Yu Shen
Electronics 2025, 14(19), 3807; https://doi.org/10.3390/electronics14193807 - 25 Sep 2025
Viewed by 468
Abstract
Emerging complex multi-step attacks such as Advanced Persistent Threats (APTs) pose significant risks to national economic development, security, and social stability. Effectively detecting these sophisticated threats is a critical challenge. While deep learning methods show promise in identifying unknown malicious behaviors, they often struggle with fragmented modal information, limited feature representation, and limited generalization. To address these limitations, we propose GATransformer, a new dual-modal detection method that integrates topological structure analysis with temporal sequence modeling. Its core lies in a cross-attention semantic fusion mechanism, which deeply integrates heterogeneous features and effectively mitigates the constraints of unimodal representations. GATransformer reconstructs network behavior representation via a parallel processing framework in which graph attention captures intricate spatial dependencies and self-attention focuses on modeling long-range temporal correlations. Experimental results on the CIDDS-001 and CIDDS-002 datasets demonstrate the superior performance of our method compared to baseline methods, with detection accuracies of 99.74% (nodes) and 88.28% (edges) on CIDDS-001 and 99.99% and 99.98% on CIDDS-002, respectively. Full article
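
A skeleton sketch of the parallel graph/sequence design described above, assuming torch_geometric is available; the paper's cross-attention fusion is simplified to a concatenation, and all shapes and layer sizes are illustrative, not the published configuration.

```python
# Parallel branches: graph attention over host connectivity, self-attention over per-host
# flow sequences, concatenated for per-node threat classification.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv

class GraphSequenceDetector(nn.Module):
    def __init__(self, node_dim=16, seq_dim=16, hidden=32, n_classes=2):
        super().__init__()
        self.gat = GATConv(node_dim, hidden, heads=2, concat=False)        # topological branch
        enc_layer = nn.TransformerEncoderLayer(d_model=seq_dim, nhead=2, batch_first=True)
        self.seq = nn.TransformerEncoder(enc_layer, num_layers=1)          # temporal branch
        self.head = nn.Linear(hidden + seq_dim, n_classes)

    def forward(self, x, edge_index, seqs):
        g = self.gat(x, edge_index)                      # (num_nodes, hidden)
        s = self.seq(seqs).mean(dim=1)                   # (num_nodes, seq_dim), pooled over time
        return self.head(torch.cat([g, s], dim=-1))      # per-node threat logits

x = torch.randn(5, 16)                                   # 5 hosts, 16 node features each
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])  # directed flows between hosts
seqs = torch.randn(5, 10, 16)                            # 10 time steps of flow features per host
print(GraphSequenceDetector()(x, edge_index, seqs).shape)  # torch.Size([5, 2])
```
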
(This article belongs to the Special Issue Advances in Information Processing and Network Security)