Search Results (93)

Search Parameters:
Keywords = tabular neural network

26 pages, 709 KB  
Article
A Tabular Data Imputation Technique Using Transformer and Convolutional Neural Networks
by Charlène Béatrice Bridge-Nduwimana, Salah Eddine El Harrauss, Aziza El Ouaazizi and Majid Benyakhlef
Big Data Cogn. Comput. 2025, 9(12), 321; https://doi.org/10.3390/bdcc9120321 - 13 Dec 2025
Viewed by 527
Abstract
Upstream processes strongly influence downstream analysis in sequential data-processing workflows, particularly in machine learning, where data quality directly affects model performance. Conventional statistical imputations often fail to capture nonlinear dependencies, while deep learning approaches typically lack uncertainty quantification. We introduce a hybrid imputation model that integrates a deep learning autoencoder with Convolutional Neural Network (CNN) layers and a Transformer-based contextual modeling architecture to address systematic variation across heterogeneous data sources. Performing multiple imputations in the autoencoder–transformer latent space and averaging the resulting representations provides implicit batch correction that suppresses context-specific residual effects without explicit batch identifiers. We performed experiments on datasets in which 10% missingness was artificially introduced under missing completely at random (MCAR) and missing not at random (MNAR) mechanisms. The model demonstrated strong practical performance, jointly ranking first among the imputation methods evaluated. It reduced the root mean square error (RMSE) by 50% compared to denoising autoencoders (DAE) and by 46% compared to iterative imputation (MICE). Its performance was comparable to that of adversarial (GAIN) and attention-based (MIDA) models, while additionally providing interpretable uncertainty estimates (CV = 0.08–0.15). Validation on datasets from multiple sources confirmed the robustness of the technique: notably, on a forensic dataset from multiple laboratories, our imputation technique achieved a practical improvement over GAIN (0.146 vs. 0.189 RMSE), highlighting its effectiveness in mitigating batch effects. Full article
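The latent-space averaging idea can be sketched with stand-in components. The linear encoder/decoder, dimensions, and noise-fill strategy below are illustrative placeholders, not the paper's CNN–Transformer autoencoder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in encoder/decoder: random linear maps (placeholders for the
# paper's CNN-Transformer autoencoder, chosen only for illustration).
W_enc = rng.normal(size=(8, 4))   # 8 features -> 4-dim latent
W_dec = np.linalg.pinv(W_enc)     # latent -> 8 features

def impute_once(x, mask, rng):
    """Fill missing entries with noise, then encode and decode."""
    x_filled = np.where(mask, x, rng.normal(size=x.shape))
    z = x_filled @ W_enc
    return z @ W_dec

x = rng.normal(size=8)
mask = np.ones(8, dtype=bool)
mask[[2, 5]] = False              # positions 2 and 5 are "missing"

# Multiple imputations averaged over reconstructions; observed entries
# are kept from the original vector.
draws = np.stack([impute_once(x, mask, rng) for _ in range(50)])
x_hat = np.where(mask, x, draws.mean(axis=0))

# Spread across draws gives a per-cell uncertainty estimate.
spread = draws.std(axis=0)[~mask]
```

Averaging many noisy reconstructions is what yields both a point estimate and the dispersion used for uncertainty reporting.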

27 pages, 1614 KB  
Article
Comparative Analysis of Neural Network Models for Predicting Peach Maturity on Tabular Data
by Dejan Ljubobratović, Marko Vuković, Marija Brkić Bakarić, Tomislav Jemrić and Maja Matetić
Computers 2025, 14(12), 554; https://doi.org/10.3390/computers14120554 - 13 Dec 2025
Viewed by 279
Abstract
Peach maturity at harvest is a critical factor influencing fruit quality and postharvest life. Traditional destructive methods for maturity assessment, although effective, compromise fruit integrity and are unsuitable for practical implementation in modern production. This study presents a machine learning approach for non-destructive peach maturity prediction using tabular data collected from 701 ‘Redhaven’ peaches. Three neural network models suited to small tabular datasets (TabNet, SAINT, and NODE) were applied and evaluated using classification metrics, including accuracy, F1-score, and AUC. The models demonstrated consistently strong performance across several feature configurations, with TabNet achieving the highest accuracy when all non-destructive measurements were available and providing the most robust and practical performance both on the comprehensive non-destructive subset and in optimized minimal-feature settings. These findings indicate that non-destructive sensing methods, particularly when combined with modern neural architectures, can reliably predict maturity and offer potential for real-time, automated fruit selection after harvest. The integration of such models into autonomous harvesting systems, for instance through drone-based platforms equipped with appropriate sensors, could significantly improve efficiency and fruit quality management in horticultural peach production. Full article
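The evaluation protocol (accuracy, F1-score, and AUC on a small tabular dataset) can be reproduced in miniature. The synthetic data and logistic baseline below are stand-ins for the peach measurements and the TabNet/SAINT/NODE models:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Toy binary "maturity" task with roughly the paper's sample size;
# a logistic baseline stands in for the neural tabular models.
X, y = make_classification(n_samples=701, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
proba = clf.predict_proba(X_te)[:, 1]   # AUC needs scores, not labels

acc = accuracy_score(y_te, pred)
f1 = f1_score(y_te, pred)
auc = roc_auc_score(y_te, proba)
print(f"accuracy={acc:.3f}  F1={f1:.3f}  AUC={auc:.3f}")
```

Reporting all three metrics matters on small datasets, where accuracy alone can hide class-dependent failure modes.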
(This article belongs to the Special Issue Machine Learning and Statistical Learning with Applications 2025)

26 pages, 2602 KB  
Article
A Big Data Pipeline Approach for Predicting Real-Time Pandemic Hospitalization Risk
by Vishnu S. Pendyala, Mayank Kapadia, Basanth Periyapatnaroopakumar, Manav Anandani and Nischitha Nagendran
Algorithms 2025, 18(12), 730; https://doi.org/10.3390/a18120730 - 21 Nov 2025
Viewed by 523
Abstract
Pandemics emphasize the importance of real-time, interpretable clinical decision-support systems for identifying high-risk patients and assisting with prompt triage, particularly in data-intensive healthcare systems. This paper describes a novel dual big-data pipeline that includes (i) a streaming module for real-time epidemiological hospitalization risk prediction and (ii) a supplementary imaging-based detection and reasoning module for chest X-rays, with COVID-19 as an example. The first pipeline uses state-of-the-art machine learning algorithms to estimate patient-level hospitalization risk based on data from the Centers for Disease Control and Prevention’s (CDC) COVID-19 Case Surveillance dataset. A Bloom filter accelerated triage through constant-time pre-screening of high-risk profiles. After extensive experimentation and optimization, XGBoost was selected because it achieved the best minority-class F1-score (0.76) and recall (0.80), outperforming the baseline models. Synthetic data generation was employed to mimic streaming workloads, including a strategy based on the Conditional Tabular Generative Adversarial Network (CTGAN), which produced the most balanced and realistic distributions. The second pipeline focuses on diagnostic imaging and combines an advanced convolutional neural network, EfficientNet-B0, with Grad-CAM visual explanations, achieving 99.5% internal and 99.3% external accuracy. A lightweight Generative Pre-trained Transformer (GPT)-based reasoning layer converts model predictions into auditable triage comments (ALERT/FLAG/LOG), yielding traceable and interpretable decision logs. This scalable, explainable, and near-real-time framework provides a foundation for future multimodal and genomic advancements in public health readiness. Full article
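The Bloom-filter pre-screen works because a membership test touches a fixed number of bit positions regardless of how many profiles are stored: false positives are possible, false negatives are not. A minimal stdlib sketch, with hypothetical profile keys:

```python
import hashlib

class BloomFilter:
    """Constant-time membership pre-screen: may report false positives,
    never false negatives."""
    def __init__(self, size_bits=1 << 16, n_hashes=4):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions from salted SHA-256 digests.
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

# Hypothetical high-risk profile keys (age band | condition | setting);
# the paper's actual key schema is not reproduced here.
bloom = BloomFilter()
for profile in ["65+|diabetes|icu", "50-64|copd|icu"]:
    bloom.add(profile)

hit = bloom.might_contain("65+|diabetes|icu")   # route to the full model
miss = bloom.might_contain("18-29|none|ward")   # skip expensive scoring
```

In a streaming setting, the filter cheaply diverts the bulk of low-risk records away from the full XGBoost scoring path.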

32 pages, 18645 KB  
Article
More Trustworthy Prediction of Elastic Modulus of Recycled Aggregate Concrete Using MCBE and TabPFN
by Wei-Tian Lu, Ze-Zhao Wang and Xin-Yu Zhao
Materials 2025, 18(22), 5221; https://doi.org/10.3390/ma18225221 - 18 Nov 2025
Viewed by 440
Abstract
The sustainable use of recycled aggregate concrete (RAC) is a critical pathway toward resource-efficient and environmentally responsible construction. However, the mechanical performance of RAC—particularly its elastic modulus—exhibits pronounced variability due to the heterogeneous quality and microstructural defects of recycled aggregates. This variability complicates the establishment of reliable predictive models and equations for elastic modulus estimation and restricts RAC’s broader structural implementation. Conventional empirical and machine-learning-based models (e.g., support vector machine, random forest, and artificial neural networks) are typically dataset-specific, prone to overfitting, and incapable of quantifying bias and uncertainty, making them unsuitable for heterogeneous materials data. This study introduces a bias-aware and more accurate predictive framework that integrates the Tabular Prior-data Fitted Network (TabPFN) with Monte Carlo Bias Estimation (MCBE)—for the first time applied in RAC materials research. A database containing 1161 RAC samples from diverse literature sources was established. This database includes key parameters such as apparent density ranging from 2270 kg/m³ to 3150 kg/m³, water absorption from 0.75% to 7.81%, replacement ratio from 0% to 100%, and compressive strength values ranging from 10.00 MPa to 108.51 MPa. MCBE quantified representational bias and guided targeted data augmentation, while TabPFN—pretrained on millions of Bayesian inference tasks—achieved R² = 0.912 and RMSE = 1.65 GPa without any hyperparameter tuning. Feature attribution analysis confirmed compressive strength as the most influential factor governing the elastic modulus, consistent with established composite mechanics principles. The proposed TabPFN–MCBE framework provides a reliable, bias-corrected, and transferable approach for modeling recycled aggregate concrete (RAC). It enables accurate predictions that are both trustworthy and interpretable, advancing the use of data-driven methods in sustainable materials design. Full article
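The MCBE procedure itself is not specified in this abstract; a generic Monte Carlo (bootstrap) estimate of estimator bias illustrates the underlying idea, with a synthetic skewed sample standing in for the RAC database:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative skewed "elastic modulus" sample, not the RAC database.
sample = rng.lognormal(mean=3.0, sigma=0.4, size=200)

def estimator(x):
    return np.median(x)  # some statistic whose bias we want to probe

# Monte Carlo (bootstrap) bias estimate: resample with replacement,
# re-estimate, and compare the average resampled statistic with the
# statistic on the original sample.
theta_hat = estimator(sample)
boot = np.array([estimator(rng.choice(sample, size=sample.size))
                 for _ in range(2000)])
bias_mc = boot.mean() - theta_hat

# Bias-corrected estimate: subtract the estimated bias.
theta_corrected = theta_hat - bias_mc
```

The same resampling logic extends to representational bias in a database: quantify where resampled estimates drift, then target augmentation at the under-represented regions.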

26 pages, 2931 KB  
Review
Prospects of AI-Powered Bowel Sound Analytics for Diagnosis, Characterization, and Treatment Management of Inflammatory Bowel Disease
by Divyanshi Sood, Zenab Muhammad Riaz, Jahnavi Mikkilineni, Narendra Nath Ravi, Vineeta Chidipothu, Gayathri Yerrapragada, Poonguzhali Elangovan, Mohammed Naveed Shariff, Thangeswaran Natarajan, Jayarajasekaran Janarthanan, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Keerthy Gopalakrishnan and Shivaram P. Arunachalam
Med. Sci. 2025, 13(4), 230; https://doi.org/10.3390/medsci13040230 - 13 Oct 2025
Cited by 2 | Viewed by 2046
Abstract
Background: This narrative review examines the role of artificial intelligence (AI) in bowel sound analysis for the diagnosis and management of inflammatory bowel disease (IBD). IBD, encompassing Crohn’s disease and ulcerative colitis, presents a significant clinical burden due to its unpredictable course, variable symptomatology, and reliance on invasive procedures for diagnosis and disease monitoring. Despite advances in imaging and biomarkers, tools such as colonoscopy and fecal calprotectin remain costly, uncomfortable, and impractical for frequent or real-time assessment. Meanwhile, bowel sounds—an overlooked physiologic signal—reflect underlying gastrointestinal motility and inflammation but have historically lacked objective quantification. With recent advances in AI and acoustic signal processing, there is growing interest in leveraging bowel sound analysis as a novel, non-invasive biomarker for detecting IBD, monitoring disease activity, and predicting disease flares. This approach holds the promise of continuous, low-cost, and patient-friendly monitoring, which could transform IBD management. Objectives: This review assesses the clinical utility, methodological rigor, and potential future integration of AI-driven bowel sound analysis in IBD, with a focus on its potential as a non-invasive biomarker for disease activity, flare prediction, and differential diagnosis. Methods: We examine bowel sounds as an alternative biomarker analyzed with AI techniques such as convolutional neural networks (CNNs), transformers, and gradient boosting. We analyze data on acoustic signal acquisition (e.g., smart T-shirts, smartphones), signal processing methodologies (e.g., MFCCs, spectrograms, empirical mode decomposition), and validation metrics (e.g., accuracy, F1 scores, AUC). Studies were assessed for clinical relevance, methodological rigor, and translational potential. Results: Across studies enrolling 16–100 participants, AI models achieved diagnostic accuracies of 88–96%, with AUCs ≥ 0.83 and F1 scores ranging from 0.71 to 0.85 for differentiating IBD from healthy controls and IBS. Transformer-based approaches (e.g., HuBERT, Wav2Vec 2.0) consistently outperformed CNNs and tabular models, yielding F1 scores of 80–85%, while gradient boosting on wearable multi-microphone recordings demonstrated robustness to background noise. Distinct acoustic signatures were identified, including prolonged sound-to-sound intervals in Crohn’s disease (mean 1232 ms vs. 511 ms in IBS) and high-pitched tinkling in stricturing phenotypes. Despite promising performance, current models remain below established biomarkers such as fecal calprotectin (~90% sensitivity for active disease), and generalizability is limited by small, heterogeneous cohorts and the absence of prospective validation. Conclusions: AI-powered bowel sound analysis represents a promising, non-invasive tool for IBD monitoring. However, widespread clinical integration requires standardized data acquisition protocols, large multi-center datasets with clinical correlates, explainable AI frameworks, ethical data governance, and integration of these tools within clinical workflows. Future directions include wearable-enabled remote monitoring platforms and multi-modal decision support systems that combine bowel sounds with biomarker and symptom data, aiming to transform IBD care into a more personalized and proactive model. Full article
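The signal-processing step these studies share (framing an acoustic recording and summarizing its spectral content over time) can be sketched without an audio library. The synthetic "burst" below is illustrative; real MFCC extraction would add a mel filterbank and DCT on top of this plain spectrogram:

```python
import numpy as np

fs = 4000                              # Hz, assumed sampling rate
t = np.arange(0, 2.0, 1 / fs)

# Toy "bowel sound": a short high-pitched burst on low-level noise.
sig = 0.05 * np.random.default_rng(2).normal(size=t.size)
burst = (t > 0.5) & (t < 0.6)
sig[burst] += np.sin(2 * np.pi * 800 * t[burst])

# Frame the signal and take windowed magnitude FFTs: a bare-bones
# spectrogram (frames x frequency bins).
win, hop = 256, 128
frames = np.lib.stride_tricks.sliding_window_view(sig, win)[::hop]
spec = np.abs(np.fft.rfft(frames * np.hanning(win), axis=1))

# Per-frame energy localizes the event in time, the kind of feature
# behind sound-to-sound interval measurements.
frame_energy = spec.sum(axis=1)
peak_time = int(frame_energy.argmax()) * hop / fs  # inside 0.5-0.6 s
```

From here, interval statistics (time between detected bursts) or learned features (CNN/transformer inputs) are both computed on such frame-level representations.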

19 pages, 685 KB  
Article
Intent-Based Resource Allocation in Edge and Cloud Computing Using Reinforcement Learning
by Dimitrios Konidaris, Polyzois Soumplis, Andreas Varvarigos and Panagiotis Kokkinos
Algorithms 2025, 18(10), 627; https://doi.org/10.3390/a18100627 - 4 Oct 2025
Viewed by 1045
Abstract
Managing resource use in cloud and edge environments is crucial for optimizing performance and efficiency. Traditionally, this process is performed with detailed knowledge of the available infrastructure while being application-specific. However, it is common that users cannot accurately specify their applications’ low-level requirements, and they tend to overestimate them—a problem further intensified by their lack of detailed knowledge of the infrastructure’s characteristics. In this context, resource orchestration mechanisms perform allocations based on the provided worst-case assumptions, with a direct impact on the performance of the whole infrastructure. In this work, we propose a resource orchestration mechanism based on intents, in which users provide their high-level workload requirements by specifying their intended preferences for how the workload should be managed, such as prioritizing high capacity, low cost, or other criteria. Building on this, the proposed mechanism dynamically assigns resources to applications through a Reinforcement Learning method that leverages feedback from users and from the infrastructure providers’ monitoring systems. We formulate the respective problem as a discrete-time, finite-horizon Markov decision process. Initially, we solve the problem using a tabular Q-learning method. However, due to the large state space inherent in real-world scenarios, we also employ Deep Reinforcement Learning, utilizing a neural network for the Q-value approximation. The presented mechanism is capable of continuously adapting the manner in which resources are allocated based on feedback from users and infrastructure providers. A series of simulation experiments demonstrates the applicability of the proposed methodologies to intent-based resource allocation, examining various aspects and characteristics and providing a comparative analysis. Full article
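The tabular Q-learning step can be illustrated on a toy MDP. The chain environment, rewards, and hyperparameters below are invented for illustration and are unrelated to the paper's allocation problem; they only show the update rule that Deep RL later replaces with a neural Q-approximator:

```python
import random

# Tiny chain MDP: states 0..4, actions 0 (left) / 1 (right);
# reward arrives on reaching the right end.
N_STATES, GAMMA, ALPHA, EPS = 5, 0.9, 0.5, 0.3
Q = {(s, a): 0.0 for s in range(N_STATES) for a in (0, 1)}

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

random.seed(0)
for _ in range(2000):                      # episodes, random start states
    s = random.randrange(N_STATES)
    for _ in range(20):                    # fixed horizon
        if random.random() < EPS:          # epsilon-greedy exploration
            a = random.choice((0, 1))
        else:
            a = max((0, 1), key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Tabular Q-learning update toward the bootstrapped target.
        target = r + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)])
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# The greedy policy should move right toward the reward in every state.
policy = [max((0, 1), key=lambda act: Q[(s, act)]) for s in range(N_STATES)]
```

The table `Q` is exactly what blows up with a realistic state space, which motivates the paper's switch to a neural network for Q-value approximation.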
(This article belongs to the Special Issue Emerging Trends in Distributed AI for Smart Environments)

17 pages, 1548 KB  
Article
Hybrid Deep-Ensemble Network with VAE-Based Augmentation for Imbalanced Tabular Data Classification
by Sang-Jeong Lee and You-Suk Bae
Appl. Sci. 2025, 15(19), 10360; https://doi.org/10.3390/app151910360 - 24 Sep 2025
Viewed by 885
Abstract
Background: Severe class imbalance limits reliable tabular AI in manufacturing, finance, and healthcare. Methods: We built a modular pipeline comprising correlation-aware seriation; a hybrid convolutional neural network (CNN)–transformer–Bidirectional Long Short-Term Memory (BiLSTM) encoder; variational autoencoder (VAE)-based minority augmentation; and deep/tree ensemble heads (XGBoost and Support Vector Machine, SVM). We benchmarked the Synthetic Minority Oversampling Technique (SMOTE) and ADASYN under identical protocols. Focal loss and ensemble weights were tuned per dataset. The primary metric was the Area Under the Precision–Recall Curve (AUPRC), with receiver operating characteristic area under the curve (ROC AUC) as complementary. Synthetic-data fidelity was quantified by train-on-synthetic/test-on-real (TSTR) utility, two-sample discriminability (ROC AUC of a real-vs-synthetic classifier), and Maximum Mean Discrepancy (MMD²). Results: Across five datasets (SECOM, CREDIT, THYROID, APS, and UCI), augmentation was data-dependent: the VAE led on APS (+3.66 pp AUPRC vs. SMOTE) and was competitive on CREDIT (+0.10 pp vs. None); SMOTE dominated on SECOM; no augmentation performed best for THYROID and UCI. Positional embedding (PE) with seriation helped when strong local correlations were present. Ensembles typically favored XGBoost while benefiting from the hybrid encoder. Efficiency profiling and a slim variant supported latency-sensitive use. Conclusions: A data-aware recipe emerged: prefer the VAE when synthetic fidelity is high, SMOTE on smoother minority manifolds, and no augmentation when baselines suffice; apply PE/seriation selectively and tune per dataset for robust, reproducible deployment. Full article
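Why AUPRC is the primary metric under severe imbalance can be seen on a toy task: a random classifier's AUPRC equals the positive rate, so seemingly small absolute values can still be large relative gains. The dataset and baseline below are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Heavily imbalanced toy task (~2% positives), mimicking the setting
# where AUPRC is primary and ROC AUC complementary.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.98], flip_y=0.01, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

proba = (LogisticRegression(max_iter=1000)
         .fit(X_tr, y_tr)
         .predict_proba(X_te)[:, 1])

auprc = average_precision_score(y_te, proba)   # chance level = pos. rate
roc = roc_auc_score(y_te, proba)               # chance level = 0.5
baseline = y_te.mean()
print(f"AUPRC={auprc:.3f} (chance {baseline:.3f}), ROC AUC={roc:.3f}")
```

ROC AUC can look comfortable on such data even when precision at useful recall is poor, which is exactly what AUPRC exposes.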
(This article belongs to the Section Computing and Artificial Intelligence)

18 pages, 6012 KB  
Article
Vision-AQ: Explainable Multi-Modal Deep Learning for Air Pollution Classification in Smart Cities
by Faisal Mehmood, Sajid Ur Rehman and Ahyoung Choi
Mathematics 2025, 13(18), 3017; https://doi.org/10.3390/math13183017 - 18 Sep 2025
Cited by 4 | Viewed by 1355
Abstract
Accurate air quality prediction (AQP) is crucial for safeguarding public health and guiding smart city management. However, reliable assessment remains challenging due to complex emission patterns, meteorological variability, and chemical interactions, compounded by the limited coverage of ground-based monitoring networks. To address this gap, we propose Vision-AQ (Visual Integrated Operational Network for Air Quality), a novel multi-modal deep learning framework that classifies Air Quality Index (AQI) levels by integrating environmental imagery with pollutant data. Vision-AQ employs a dual-input neural architecture: (1) a pre-trained ResNet50 convolutional neural network (CNN) that extracts high-level features from city-scale environmental photographs in India and Nepal, capturing haze, smog, and visibility patterns, and (2) a multi-layer perceptron (MLP) that processes tabular sensor data, including PM2.5, PM10, and AQI values. The fused representations are passed to a classifier to predict six AQI categories. Trained on a comprehensive dataset, the model achieves strong predictive performance, with accuracy, precision, recall, and F1-score all at 99%, using 23.7 million parameters. To ensure interpretability, we use Grad-CAM visualization to highlight the model’s reliance on meaningful atmospheric features, confirming its explainability. The results demonstrate that Vision-AQ is a reliable, scalable, and cost-effective approach for localized AQI classification, offering the potential to augment conventional monitoring networks and enable more granular air quality management in urban South Asia. Full article
(This article belongs to the Special Issue Explainable and Trustworthy AI Models for Data Analytics)

49 pages, 3209 KB  
Article
SAFE-MED for Privacy-Preserving Federated Learning in IoMT via Adversarial Neural Cryptography
by Mohammad Zubair Khan, Waseem Abbass, Nasim Abbas, Muhammad Awais Javed, Abdulrahman Alahmadi and Uzma Majeed
Mathematics 2025, 13(18), 2954; https://doi.org/10.3390/math13182954 - 12 Sep 2025
Cited by 2 | Viewed by 2080
Abstract
Federated learning (FL) offers a promising paradigm for distributed model training in Internet of Medical Things (IoMT) systems, where patient data privacy and device heterogeneity are critical concerns. However, conventional FL remains vulnerable to gradient leakage, model poisoning, and adversarial inference, particularly in privacy-sensitive and resource-constrained medical environments. To address these challenges, we propose SAFE-MED, a secure and adversarially robust framework for privacy-preserving FL tailored for IoMT deployments. SAFE-MED integrates neural encryption, adversarial co-training, anomaly-aware gradient filtering, and trust-weighted aggregation into a unified learning pipeline. The encryption and decryption components are jointly optimized with a simulated adversary under a minimax objective, ensuring high reconstruction fidelity while suppressing inference risk. To enhance robustness, the system dynamically adjusts client influence based on behavioral trust metrics and detects malicious updates using entropy-based anomaly scores. Comprehensive experiments are conducted on three representative medical datasets: Cleveland Heart Disease (tabular), MIT-BIH Arrhythmia (ECG time series), and PhysioNet Respiratory Signals. SAFE-MED achieves near-baseline accuracy with less than 2% degradation, while reducing gradient leakage by up to 85% compared to vanilla FedAvg and over 66% compared to recent neural cryptographic FL baselines. The framework maintains over 90% model accuracy under 20% poisoning attacks and reduces communication cost by 42% relative to homomorphic encryption-based methods. SAFE-MED demonstrates strong scalability, reliable convergence, and practical runtime efficiency across heterogeneous network conditions. These findings validate its potential as a secure, efficient, and deployable FL solution for next-generation medical AI applications. Full article

23 pages, 1584 KB  
Article
Image-Based Formalization of Tabular Data for Threshold-Based Prediction of Hospital Stay Using Convolutional Neural Networks: An Intelligent Decision Support System Applied in COPD
by Alberto Pinheira, Manuel Casal-Guisande, Julia López-Canay, Alberto Fernández-García, Rafael Golpe, Cristina Represas-Represas, María Torres-Durán, Jorge Cerqueiro-Pequeño, Alberto Comesaña-Campos and Alberto Fernández-Villar
Appl. Syst. Innov. 2025, 8(5), 128; https://doi.org/10.3390/asi8050128 - 2 Sep 2025
Cited by 4 | Viewed by 1231
Abstract
Background: Chronic Obstructive Pulmonary Disease (COPD) often leads to acute exacerbations requiring hospitalization. While artificial intelligence (AI) has been increasingly used to improve COPD management, predicting whether the length of hospital stay (LOHS) will exceed clinically relevant thresholds remains insufficiently explored. Methods: This study presents a novel clinical decision support system to predict whether LOHS following an acute exacerbation will surpass specific cutoffs (6 or 10 days). The approach involves two stages: (1) clinical, demographic, and social variables are encoded into structured signals and transformed into spectrogram-like images via a pipeline based on sinusoidal encoding and Mel-frequency cepstral coefficients (MFCCs); and (2) these images are fed into a Convolutional Neural Network (CNN) to estimate the probability of extended hospitalization. Feature selection with XGBoost reduced dimensionality to 16 variables. The model was trained and tested on a dataset of over 500 patients. Results: On the test set, the model achieved an AUC of 0.77 for predicting stays longer than 6 days and 0.75 for stays over 10 days. Sensitivity and specificity were 0.79/0.72 and 0.74/0.80, respectively. Conclusions: This system leverages image-based data formalization to predict LOHS in COPD patients, facilitating early risk stratification and more informed clinical planning. Results are promising, but external validation with larger and more diverse datasets remains essential. Full article
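The sinusoidal-encoding stage can be sketched in a few lines: each scaled variable modulates the amplitude of a sinusoid at its own frequency, so a spectrogram of the summed signal carries the tabular values as frequency-band intensities. The frequencies, sampling rate, and feature values below are assumptions, not the paper's configuration:

```python
import numpy as np

# Hypothetical 16-variable patient record, min-max scaled to [0, 1]
# (the paper's exact variables and encoding are not reproduced here).
rng = np.random.default_rng(3)
features = rng.uniform(size=16)

fs, duration = 1000, 1.0                 # assumed sampling parameters
t = np.arange(0, duration, 1 / fs)

# Sinusoidal encoding: variable i drives the amplitude of a sinusoid
# at its own frequency; the signal is the sum of all components.
freqs = 10 * (np.arange(16) + 1)         # 10, 20, ..., 160 Hz
signal = sum(a * np.sin(2 * np.pi * f * t)
             for a, f in zip(features, freqs))

# The spectrum recovers each variable at its assigned frequency,
# which is what makes a spectrogram-style image informative for a CNN.
spectrum = np.abs(np.fft.rfft(signal)) * 2 / t.size
recovered = spectrum[freqs]              # 1 Hz bins: bin index == freq
```

MFCC computation then compresses such spectra further before they are stacked into the image fed to the CNN.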

24 pages, 3133 KB  
Article
A Feature Selection-Based Multi-Stage Methodology for Improving Driver Injury Severity Prediction on Imbalanced Crash Data
by Çiğdem İnan Acı, Gizen Mutlu, Murat Ozen, Esra Sarac and Vahide Nida Kılıç Uzel
Electronics 2025, 14(17), 3377; https://doi.org/10.3390/electronics14173377 - 25 Aug 2025
Cited by 1 | Viewed by 1216
Abstract
Predicting driver injury severity is critical for enhancing road safety, but it is complicated because fatal accidents inherently create class imbalance within datasets. This study conducts a comparative analysis of machine-learning (ML) and deep-learning (DL) models for multi-class driver injury severity prediction using a comprehensive dataset of 107,195 traffic accidents from the Adana, Mersin, and Antalya provinces in Turkey (2018–2023). To address the significant imbalance between fatal, injury, and non-injury classes, the hybrid SMOTE-ENN algorithm was employed for data balancing. Subsequently, feature selection techniques, including Relief-F, Extra Trees, and Recursive Feature Elimination (RFE), were utilized to identify the most influential predictors. Various ML models (K-Nearest Neighbors (KNN), XGBoost, Random Forest) and DL architectures (Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN)) were developed and rigorously evaluated. The findings demonstrate that traditional ML models, particularly KNN (0.95 accuracy, 0.95 F1-macro) and XGBoost (0.92 accuracy, 0.92 F1-macro), significantly outperformed DL models. The SMOTE-ENN technique proved effective in managing class imbalance, and RFE identified a critical 25-feature subset including driver fault, speed limit, and road conditions. This research highlights the efficacy of well-preprocessed ML approaches for tabular crash data, offering valuable insights for developing robust predictive tools to improve traffic safety outcomes. Full article
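The RFE stage can be sketched with scikit-learn. The toy data and random-forest estimator below are illustrative (the paper pairs RFE with its own models and crash features), with only the selected-subset size matching the reported 25 features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Toy stand-in for the crash dataset: 40 candidate features, of which
# RFE keeps 25, mirroring the paper's selected subset size.
X, y = make_classification(n_samples=1000, n_features=40,
                           n_informative=10, random_state=0)

# Recursive Feature Elimination: fit, drop the 5 least important
# features, refit, and repeat until 25 remain.
selector = RFE(RandomForestClassifier(n_estimators=50, random_state=0),
               n_features_to_select=25, step=5).fit(X, y)

kept = [i for i, keep in enumerate(selector.support_) if keep]
```

`selector.ranking_` additionally records the elimination order, useful for reporting which predictors (e.g., driver fault, speed limit) survive longest.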
(This article belongs to the Special Issue Machine Learning Approach for Prediction: Cross-Domain Applications)

70 pages, 4767 KB  
Review
Advancements in Breast Cancer Detection: A Review of Global Trends, Risk Factors, Imaging Modalities, Machine Learning, and Deep Learning Approaches
by Md. Atiqur Rahman, M. Saddam Hossain Khan, Yutaka Watanobe, Jarin Tasnim Prioty, Tasfia Tahsin Annita, Samura Rahman, Md. Shakil Hossain, Saddit Ahmed Aitijjo, Rafsun Islam Taskin, Victor Dhrubo, Abubokor Hanip and Touhid Bhuiyan
BioMedInformatics 2025, 5(3), 46; https://doi.org/10.3390/biomedinformatics5030046 - 20 Aug 2025
Cited by 5 | Viewed by 9863
Abstract
Breast cancer remains a critical global health challenge, with over 2.1 million new cases annually. This review systematically evaluates recent advancements (2022–2024) in machine and deep learning approaches for breast cancer detection and risk management. Our analysis demonstrates that deep learning models achieve 90–99% accuracy across imaging modalities, with convolutional neural networks showing particular promise in mammography (99.96% accuracy) and ultrasound (100% accuracy) applications. Tabular data models using XGBoost achieve comparable performance (99.12% accuracy) for risk prediction. The study confirms that lifestyle modifications (dietary changes, BMI management, and alcohol reduction) significantly mitigate breast cancer risk. Key findings include the following: (1) hybrid models combining imaging and clinical data enhance early detection, (2) thermal imaging achieves high diagnostic accuracy (97–100% in optimized models) while offering a cost-effective, less hazardous screening option, (3) challenges persist in data variability and model interpretability. These results highlight the need for integrated diagnostic systems combining technological innovations with preventive strategies. The review underscores AI’s transformative potential in breast cancer diagnosis while emphasizing the continued importance of risk factor management. Future research should prioritize multi-modal data integration and clinically interpretable models. Full article
(This article belongs to the Section Imaging Informatics)

25 pages, 28131 KB  
Article
Landslide Susceptibility Assessment in Ya’an Based on Coupling of GWR and TabNet
by Jiatian Li, Ruirui Wang, Wei Shi, Le Yang, Jiahao Wei, Fei Liu and Kaiwei Xiong
Remote Sens. 2025, 17(15), 2678; https://doi.org/10.3390/rs17152678 - 2 Aug 2025
Cited by 2 | Viewed by 1418
Abstract
Landslides are destructive geological hazards, making accurate landslide susceptibility assessment essential for disaster prevention and mitigation. However, existing studies often lack scientific rigor in negative sample construction and have unclear model applicability. This study focuses on Ya’an City, Sichuan Province, China, and proposes an innovative approach to negative sample construction using Geographically Weighted Regression (GWR), which is then integrated with Tabular Network (TabNet), a deep learning architecture tailored to structured tabular data, to assess landslide susceptibility. The performance of TabNet is compared against Random Forest, Light Gradient Boosting Machine, Deep Neural Networks, and Residual Networks. The experimental results indicate that (1) the GWR-based sampling strategy substantially improves model performance across all tested models; (2) TabNet trained using the GWR-based negative samples achieves superior performance over all other evaluated models, with an average AUC of 0.9828, exhibiting both high accuracy and interpretability; and (3) elevation, land cover, and annual Normalized Difference Vegetation Index are identified as dominant predictors through TabNet’s feature importance analysis. The results demonstrate that combining GWR and TabNet substantially enhances landslide susceptibility modeling by improving both accuracy and interpretability, establishing a more scientifically grounded approach to negative sample construction, and providing an interpretable, high-performing modeling framework for geological hazard risk assessment. Full article
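The abstract does not spell out the GWR-based negative-sampling step, so the following is an illustrative sketch only: it stands in a simpler geographically weighted kernel score for a full Geographically Weighted Regression fit, and the coordinates, bandwidth, and data sizes are all assumptions. Candidate cells with the lowest locally weighted landslide density are kept as negative samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical candidate grid cells (x, y) and a mapped landslide
# inventory; in practice these would come from the study-area raster.
cells = rng.uniform(0, 100, size=(500, 2))
landslides = rng.uniform(0, 100, size=(40, 2))

def weighted_score(cells, landslides, bandwidth=15.0):
    """Geographically weighted density of known landslides around each
    candidate cell (Gaussian kernel), used as a susceptibility proxy."""
    d = np.linalg.norm(cells[:, None, :] - landslides[None, :, :], axis=2)
    w = np.exp(-(d / bandwidth) ** 2)
    return w.sum(axis=1)

score = weighted_score(cells, landslides)

# Negative samples: the cells with the lowest locally weighted scores,
# i.e., far (in the kernel-weighted sense) from any mapped landslide.
n_neg = len(landslides)
neg_idx = np.argsort(score)[:n_neg]
negatives = cells[neg_idx]
print(negatives.shape)  # (40, 2)
```

In the paper itself, the GWR model would score cells from the full set of conditioning factors (slope, lithology, NDVI, etc.); the kernel-density proxy here only conveys the spatial-weighting idea behind choosing low-susceptibility negatives.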

31 pages, 7946 KB  
Article
EpInflammAge: Epigenetic-Inflammatory Clock for Disease-Associated Biological Aging Based on Deep Learning
by Alena Kalyakulina, Igor Yusipov, Arseniy Trukhanov, Claudio Franceschi, Alexey Moskalev and Mikhail Ivanchenko
Int. J. Mol. Sci. 2025, 26(13), 6284; https://doi.org/10.3390/ijms26136284 - 29 Jun 2025
Cited by 3 | Viewed by 4374
Abstract
We present EpInflammAge, an explainable deep learning tool that integrates epigenetic and inflammatory markers to create a highly accurate, disease-sensitive biological age predictor. This novel approach bridges two key hallmarks of aging—epigenetic alterations and immunosenescence. First, epigenetic and inflammatory data from the same participants were used for AI models predicting levels of 24 cytokines from blood DNA methylation. Second, open-source epigenetic data (25,000 samples) were used for generating synthetic inflammatory biomarkers and training an age estimation model. Using state-of-the-art deep neural networks optimized for tabular data analysis, EpInflammAge achieves competitive performance metrics against 34 epigenetic clock models, including an overall mean absolute error of 7 years and a Pearson correlation coefficient of 0.85 in healthy controls, while demonstrating robust sensitivity across multiple disease categories. Explainable AI revealed the contribution of each feature to the age prediction. The sensitivity to multiple diseases due to combining inflammatory and epigenetic profiles is promising for both research and clinical applications. EpInflammAge is released as an easy-to-use web tool that generates age estimates and levels of inflammatory parameters from methylation data, with a detailed report on the contribution of input variables to the model output for each sample. Full article
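As an illustration of the two-stage pipeline described above (not the authors' implementation), the sketch below stands in linear least-squares fits for the deep tabular networks; the data shapes, the linear generative model, and the noise levels are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-ins: 300 samples, 100 CpG sites, 24 cytokines.
n, n_cpg, n_cyto = 300, 100, 24
methylation = rng.uniform(0.0, 1.0, size=(n, n_cpg))
cytokines = (methylation @ rng.normal(size=(n_cpg, n_cyto))
             + rng.normal(scale=0.1, size=(n, n_cyto)))
age = 50 + 0.1 * (cytokines @ rng.normal(size=n_cyto))

# Stage 1: fit a map methylation -> 24 cytokine levels
# (least squares here; the paper uses deep tabular networks).
W1, *_ = np.linalg.lstsq(methylation, cytokines, rcond=None)

# Stage 2: generate "synthetic" cytokine levels from methylation alone
# and fit the age estimator on them, mirroring the two-step pipeline.
synthetic = methylation @ W1
X = np.column_stack([synthetic, np.ones(n)])
w2, *_ = np.linalg.lstsq(X, age, rcond=None)

pred = X @ w2
mae = float(np.mean(np.abs(pred - age)))
```

The point of the two-stage design is that, at inference time, only methylation data are required: the cytokine levels the age model consumes are themselves predicted, which is what lets the tool report both an age estimate and inflammatory parameter levels from a single methylation input.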
(This article belongs to the Section Molecular Biology)
Show Figures

Figure 1

16 pages, 871 KB  
Article
XSQ-Learning: Adaptive Similarity Thresholds for Accelerated and Stable Q-Learning
by Ansel Y. Rodríguez González, Roberto E. López Díaz, Shender M. Ávila Sansores and María G. Sánchez Cervantes
Appl. Sci. 2025, 15(13), 7281; https://doi.org/10.3390/app15137281 - 27 Jun 2025
Viewed by 1360
Abstract
Reinforcement Learning (RL) enables agents to learn optimal policies through environment interaction, with Q-learning being a fundamental algorithm for Markov Decision Processes (MDPs). However, Q-learning suffers from slow convergence due to its exhaustive exploration requirements, particularly in large state spaces where Q-value estimation becomes computationally expensive, whether using tabular methods or Deep Neural Networks (DNNs). To address this limitation, we propose XSQ-Learning, a novel algorithm that accelerates convergence by leveraging similarities between state–action pairs to generalize Q-value updates intelligently. XSQ-Learning introduces two key innovations: (1) an adaptive update mechanism that propagates temporal-difference errors to similar states proportionally to their similarity, and (2) a similarity-aware control strategy that regulates which updates are propagated and to what extent. Our experiments demonstrate that XSQ-Learning can reduce the required iterations by 36.83% compared to standard Q-learning and by 24.43% versus state-of-the-art similarity-based methods, while maintaining policy stability. These results show that similarity-based value propagation can significantly enhance RL efficiency without compromising learning reliability. Full article
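The two innovations can be illustrated with a minimal tabular sketch; the Gaussian similarity kernel, the threshold value, and the state features below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

# Toy state features: 5 states on a line; states 0 and 1 are similar.
states = np.array([[0.0], [0.1], [2.0], [4.0], [6.0]])
Q = np.zeros((5, 2))  # 5 states, 2 actions

def xsq_update(Q, states, s, a, r, s_next,
               alpha=0.5, gamma=0.9, theta=0.8):
    # Standard temporal-difference error for the visited pair (s, a).
    td = r + gamma * Q[s_next].max() - Q[s, a]
    # Assumed similarity measure (the paper's kernel is not given here).
    sim = np.exp(-np.linalg.norm(states - states[s], axis=1))
    # Similarity-aware control: propagate the update only to states
    # above the threshold, scaled by their similarity -- the core idea.
    mask = sim >= theta
    Q[mask, a] += alpha * sim[mask] * td
    return Q

Q = xsq_update(Q, states, s=0, a=1, r=1.0, s_next=2)
print(Q[:, 1])  # states 0 and 1 both updated; distant states untouched
```

A single environment interaction thus updates several Q-values at once, which is where the reported reduction in required iterations comes from; the threshold `theta` is what keeps dissimilar states from receiving spurious updates and preserves policy stability.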
(This article belongs to the Section Computing and Artificial Intelligence)
