Search Results (589)

Search Parameters:
Keywords = bagging technique

25 pages, 4780 KB  
Article
Vibration and Stray Flux Signal Fusion for Corrosion Damage Detection in Rolling Bearings Using Ensemble Learning Algorithms
by José Pablo Pacheco-Guerrero, Israel Zamudio-Ramírez, Larisa Dunai and Jose Alfonso Antonino-Daviu
Sensors 2026, 26(1), 233; https://doi.org/10.3390/s26010233 (registering DOI) - 30 Dec 2025
Abstract
Early fault diagnosis in induction motors is important to maintain correct operation in terms of energy and efficiency, as well as to achieve a reduction in costs associated with maintenance or unexpected stoppages in production processes. These motors are widely used in industry due to their reliability, low cost, and great robustness; however, over time, they may be exposed to wear that can affect their performance, endanger the integrity of operators, or cause unexpected shutdowns that generate economic losses. Corrosion in the bearings is one of the most common failures, which is mainly triggered by high humidity in combination with high temperatures. However, despite its relevance, it has not been widely explored as a cause of failure in induction motors. Unlike failures that occur in specific or localized areas, corrosion in bearings does not manifest through specific frequencies associated with the phenomenon, since the corrosion occurs extensively on the surface of the raceway, making early diagnosis difficult with conventional techniques based on spectral analysis. Therefore, this work proposes an approach for the analysis of magnetic stray flux and vibration signals under different levels of corrosion using statistical and non-statistical parameters to capture variations in the dynamic behavior of the motors while employing genetic algorithms to select the most relevant parameters for each signal and optimize the configuration of an ensemble learning algorithm. The classification of the bearing condition is achieved using support vector machines in combination with the bagging method, which increases the robustness and accuracy of the model in the presence of signal variability. A classification accuracy greater than 99% was obtained in distinguishing the healthy state from two corrosion severity levels, indicating that the proposed approach is reliable and efficient for corrosion diagnosis.
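The bagging step named in the abstract above (bootstrap resampling followed by majority voting) can be illustrated with a minimal, dependency-free sketch. The paper pairs bagging with support vector machines; the sketch below swaps in a simple nearest-centroid base learner instead, and the function names and toy data are hypothetical, not taken from the paper.

```python
import random
from collections import Counter


def fit_centroids(samples, labels):
    """Base learner (assumption: nearest-centroid stands in for the paper's SVM)."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroids(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def bagging_fit(samples, labels, n_estimators=15, seed=0):
    """Train each base learner on a bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(samples)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([samples[i] for i in idx], [labels[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy two-feature data (hypothetical): two well-separated classes.
X = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [0.3, 0.0],
     [2.0, 2.1], [2.2, 1.9], [1.9, 2.0], [2.1, 2.2]]
y = ["healthy"] * 4 + ["corroded"] * 4
ensemble = bagging_fit(X, y)
print(bagging_predict(ensemble, [0.1, 0.1]), bagging_predict(ensemble, [2.0, 2.0]))
```

Because every base learner sees a different resample, individual errors tend to average out in the vote, which is the robustness to signal variability that the abstract attributes to bagging.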
(This article belongs to the Special Issue Feature Papers in Fault Diagnosis & Sensors 2025)

18 pages, 973 KB  
Review
Brain Age as a Biomarker in Alzheimer’s Disease: Narrative Perspectives on Imaging, Biomarkers, Machine Learning, and Intervention Potential
by Lan Lin, Yanxue Li, Shen Sun, Jeffery Lin, Ziyi Wang, Yutong Wu, Zhenrong Fu and Hongjian Gao
Brain Sci. 2026, 16(1), 33; https://doi.org/10.3390/brainsci16010033 - 25 Dec 2025
Viewed by 149
Abstract
Background/Objectives: Alzheimer’s disease (AD) has a prolonged preclinical phase and marked heterogeneity. Brain age and the Brain Age Gap (BAG), derived from neuroimaging and machine learning (ML), offer a non-invasive, system-level indicator of brain integrity, with potential relevance for early detection, risk stratification, and intervention monitoring. This review summarizes the conceptual basis, imaging characteristics, and biological relevance of BAG and explores its potential clinical utility across the AD continuum. Methods: We conducted a narrative synthesis of evidence from morphometric structural magnetic resonance imaging (sMRI), connectivity-based functional magnetic resonance imaging (fMRI), positron emission tomography (PET), and diffusion tensor imaging (DTI), alongside recent advances in deep learning architectures and multimodal fusion techniques. We further examined associations between BAG and the Amyloid/Tau/Neurodegeneration (A/T/N) framework, neuroinflammation, cognitive reserve, and lifestyle interventions. Results: BAG may reflect neurodegeneration associated with AD, showing greater deviations in individuals with mild cognitive impairment (MCI) and early AD, and is correlated with tau pathology, neuroinflammation, and metabolic or functional network dysregulation. Multimodal and deep learning approaches enhance the sensitivity of BAG to disease-related deviations. Longitudinal BAG changes outperform static BAG in forecasting cognitive decline, and lifestyle or exercise interventions can attenuate BAG acceleration. Conclusions: BAG emerges as a promising, dynamic, integrative, and modifiable complementary biomarker with the potential for assessing neurobiological resilience, disease staging, and personalized intervention monitoring in AD. While further standardization and large-scale validation are essential to support clinical translation, BAG provides a novel systems-level perspective on brain health across the AD continuum.

19 pages, 1172 KB  
Article
Shelf Life Prediction of Longan with Intermediate Moisture Using Osmotic Dehydration, Combined with Different Packaging and Storage Temperatures
by Hong Phuc Vu Le, Napapan Chokumnoyporn, Jurmkwan Sangsuwan, Witoon Prinyawiwatkul and Sujinda Sriwattana
Foods 2026, 15(1), 40; https://doi.org/10.3390/foods15010040 - 23 Dec 2025
Viewed by 242
Abstract
This study aimed to evaluate the shelf life of intermediate moisture longan (IML). A hurdle technology approach was applied, combining osmotic dehydration (OD), hot-air drying, and packaging methods: aluminum foil-laminated plastic bags with nitrogen flushing (Al bag with nitrogen), aluminum foil-laminated plastic bags without nitrogen (Al bag without nitrogen), and clear plastic bags. Samples were stored at 4, 25, 35, and 45 °C for 24 weeks (six months). The combination of these preservation techniques was effective in extending the shelf life of IML products. Quality changes in IML during storage were significantly influenced by packaging type, storage temperature, and storage duration (p ≤ 0.05). Products stored in all three types of packaging at low temperatures retained better color (L* 31.92 ± 0.97–32.67 ± 1.47) and higher sensory scores (6.5 ± 1.4–6.6 ± 1.5) compared to those stored at higher temperatures (L* 19.54 ± 1.00–20.90 ± 1.48, 3.3 ± 1.6–4.1 ± 1.7). Accelerated shelf life testing using the Arrhenius equation was applied to predict changes in color and sensory acceptance. The kinetics of these quality changes followed first-order reaction models. Among the packaging types, IML stored in Al bags with nitrogen exhibited the lowest rate constants, indicating slower quality deterioration and better protection compared to Al bags without nitrogen and clear plastic bags. The predictive model demonstrated strong agreement with the experimental data, accurately predicting shelf life at 25 °C and above. For IML samples packaged in Al bags with nitrogen and stored at 4 °C, the model projected a potential shelf life of up to 58 weeks; this projection extends beyond the 24-week experimental period, which verified a minimum shelf life of 24 weeks. This technology reduces post-harvest food loss, advances packaging innovation for agro-industry, and strengthens food security.
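The accelerated shelf life approach described in the abstract above combines first-order kinetics, C(t) = C0·exp(−k·t), with the Arrhenius equation, k = A·exp(−Ea/(R·T)). A minimal sketch of that calculation follows. The rate constants, quality limits, and function names are hypothetical illustrations; none of the numbers are values reported in the paper.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def fit_arrhenius(temps_c, rate_constants):
    """Least-squares fit of ln k = ln A - Ea / (R * T); returns (A, Ea)."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(k) for k in rate_constants]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope * R


def rate_constant(a, ea, temp_c):
    """Arrhenius rate constant at temp_c (degrees Celsius)."""
    return a * math.exp(-ea / (R * (temp_c + 273.15)))


def shelf_life(k, initial, limit):
    """First-order decay: time until the quality index falls from initial to limit."""
    return math.log(initial / limit) / k


# Hypothetical rate constants (per week) at the accelerated temperatures;
# illustrative numbers only, not values from the paper.
temps = [25.0, 35.0, 45.0]
ks = [0.010, 0.022, 0.046]
A, Ea = fit_arrhenius(temps, ks)
k_cold = rate_constant(A, Ea, 4.0)
print(f"Ea = {Ea / 1000:.1f} kJ/mol, k at 4 C = {k_cold:.5f} per week")
print(f"Weeks for an index to fall from 32 to 20: {shelf_life(k_cold, 32.0, 20.0):.1f}")
```

Extrapolating to 4 °C in this way relies on the Arrhenius fit holding outside the measured range, which is why a projection beyond the experimental period is best read as an estimate rather than a verified shelf life.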

14 pages, 2954 KB  
Article
Research on the Efficient Industrial Scale-Up Cultivation of a Novel Aromatic and Disease-Resistant Mushroom Rhodotus palmatus
by Xi Luo, Shaoxiong Liu, Fan Zhou, Jianying Li, Junbo Zhang, Qimeng Liu, Chunli Liu, Lei Wang, Rong Hua and Dafeng Sun
Agronomy 2025, 15(12), 2882; https://doi.org/10.3390/agronomy15122882 - 15 Dec 2025
Viewed by 283
Abstract
Rhodotus palmatus is commercially marketed as ‘fruity-scented mushroom’, a novel cultivar first brought into commercial cultivation in China at the end of 2023. It is characterized by a distinct pinkish pigmentation, a pileus with a distinct reticulated surface pattern, and an intense fruity aroma. To date, only a few natural strains have been documented in China, and scientific research on this species remains scarce. This study successfully bred a new variety of Rhodotus palmatus and established corresponding efficient techniques for its industrial-scale cultivation. As a result, strain ZJGWG001, known for its short growth cycle (23 d), high yield potential (177.43 ± 10.08 g·bag⁻¹) and biological efficiency (59.1%), and distinctive, stable phenotypic traits, is well suited for industrial cultivation. This cultivar is the first newly registered Rhodotus palmatus variety in China to receive provincial-level certification and has been officially designated as ‘Zhongjunguoweigu No. 1’. It represents an important discovery in the collection and exploration of wild fungal germplasm resources in recent years.

29 pages, 8414 KB  
Article
Optimized Explainable Machine Learning Protocol for Battery State-of-Health Prediction Based on Electrochemical Impedance Spectra
by Lamia Akther, Md Shafiul Alam, Mohammad Ali, Mohammed A. AlAqil, Tahmida Khanam and Md. Feroz Ali
Electronics 2025, 14(24), 4869; https://doi.org/10.3390/electronics14244869 - 10 Dec 2025
Viewed by 393
Abstract
Monitoring the battery state of health (SOH) has become increasingly important for electric vehicles (EVs), renewable storage systems, and consumer gadgets. It indicates the residual usable capacity and performance of a battery in relation to its original specifications. This information is crucial for the safety and performance enhancement of the overall system. This paper develops an explainable machine learning protocol with Bayesian optimization techniques trained on electrochemical impedance spectroscopy (EIS) data to predict battery SOH. Various robust ensemble algorithms, including HistGradientBoosting (HGB), Random Forest, AdaBoost, Extra Trees, Bagging, CatBoost, Decision Tree, LightGBM, Gradient Boost, and XGB, have been developed and fine-tuned for predicting battery health. Eight comprehensive metrics are employed to estimate the model’s performance rigorously: coefficient of determination (R²), mean squared error (MSE), median absolute error (MedAE), mean absolute error (MAE), correlation coefficient (R), Nash–Sutcliffe efficiency (NSE), Kling–Gupta efficiency (KGE), and root mean squared error (RMSE). Bayesian optimization techniques were developed to optimize hyperparameters across all models, ensuring optimal implementation of each algorithm. Feature importance analysis was performed to thoroughly evaluate the models and assess the features with the most influence on battery health degradation. The comparison indicated that the GradientBoosting model outperformed others, achieving an MAE of 0.1041 and an R² of 0.9996. The findings suggest that Bayesian-optimized tree-based ensemble methods, particularly gradient boosting, excel at forecasting battery health status from electrochemical impedance spectroscopy data. This result offers an excellent opportunity for practical use in battery management systems that employ diverse industrial state-of-health assessment techniques to enhance battery longevity, contributing to sustainability initiatives for second-life lithium-ion batteries. This capability enables the recycling of vehicle batteries for application in static storage systems, which is environmentally advantageous and ensures continuity.
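The eight evaluation metrics listed in the abstract above have standard textbook definitions, sketched below in plain Python. This is an illustration under stated assumptions rather than the paper's evaluation code: the function name and the sample SOH values are hypothetical, population standard deviations are used for KGE, and R² is computed as 1 − SSE/SST, which makes it numerically identical to NSE.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, MedAE, R, R2, NSE, and KGE for paired values."""
    n = len(observed)
    abs_errors = [abs(p - o) for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in abs_errors)
    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    sd_o = statistics.pstdev(observed)
    sd_p = statistics.pstdev(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    r = cov / (sd_o * sd_p)  # Pearson correlation coefficient
    sst = sum((o - mean_o) ** 2 for o in observed)
    nse = 1.0 - sse / sst  # same form as R2 when R2 is defined as 1 - SSE/SST
    # KGE combines correlation, variability ratio, and bias ratio.
    kge = 1.0 - math.sqrt(
        (r - 1.0) ** 2 + (sd_p / sd_o - 1.0) ** 2 + (mean_p / mean_o - 1.0) ** 2
    )
    return {
        "MSE": sse / n,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs_errors) / n,
        "MedAE": statistics.median(abs_errors),
        "R": r,
        "R2": nse,
        "NSE": nse,
        "KGE": kge,
    }


# Hypothetical SOH values in percent (not data from the paper).
observed = [98.0, 95.5, 92.1, 88.7, 85.0]
predicted = [97.6, 95.9, 92.0, 89.1, 84.6]
for name, value in regression_metrics(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```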
(This article belongs to the Special Issue Advanced Control and Power Electronics for Electric Vehicles)

31 pages, 1941 KB  
Article
Boosting Traffic Crash Prediction Performance with Ensemble Techniques and Hyperparameter Tuning
by Naima Goubraim, Zouhair Elamrani Abou Elassad, Hajar Mousannif and Mohamed Ameksa
Safety 2025, 11(4), 121; https://doi.org/10.3390/safety11040121 - 9 Dec 2025
Viewed by 1010
Abstract
Road traffic crashes are a major global challenge, resulting in significant loss of life, economic burden, and societal impact. This study seeks to enhance the precision of traffic accident prediction using advanced machine learning techniques. It employs an ensemble learning approach combining the Random Forest, Bagging Classifier (Bootstrap Aggregating), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) algorithms. To address class imbalance and feature relevance, we implement feature selection using the Extra Trees Classifier and oversampling using the Synthetic Minority Over-sampling Technique (SMOTE). Rigorous hyperparameter tuning is applied to optimize model performance. Our results show that the ensemble approach, coupled with hyperparameter optimization, significantly improves prediction accuracy. This research contributes to the development of more effective road safety strategies and can help to reduce the number of road accidents.
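The SMOTE oversampling step mentioned in the abstract above creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows that core idea in plain Python. It is a simplification (base points are drawn at random rather than iterated in order), and the function name, parameters, and toy data are hypothetical.

```python
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([a + gap * (b - a) for a, b in zip(base, target)])
    return synthetic


# Toy minority class with two features (hypothetical values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6)
print(len(new_points), "synthetic samples generated")
```

Every synthetic point lies on a segment between two real minority samples, so the oversampled class stays inside the region the minority data already occupies.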
(This article belongs to the Special Issue Road Traffic Risk Assessment: Control and Prevention of Collisions)

20 pages, 1030 KB  
Article
VISTA: A Multi-View, Hierarchical, and Interpretable Framework for Robust Topic Modelling
by Tvrtko Glunčić, Domjan Barić and Matko Glunčić
Mach. Learn. Knowl. Extr. 2025, 7(4), 162; https://doi.org/10.3390/make7040162 - 8 Dec 2025
Viewed by 435
Abstract
Topic modeling is a fundamental technique in natural language processing used to uncover latent themes in large text corpora, yet existing approaches struggle to jointly achieve interpretability, semantic coherence, and scalability. Classical probabilistic models such as LDA and NMF rely on bag-of-words assumptions that obscure contextual meaning, while embedding-based methods (e.g., BERTopic, Top2Vec) improve coherence at the expense of diversity and stability. Prompt-based frameworks (e.g., TopicGPT) enhance interpretability but remain sensitive to prompt design and are computationally costly on large datasets. This study introduces VISTA (Vector-Similarity Topic Analysis), a multi-view, hierarchical, and interpretable framework that integrates complementary document embeddings, mutual-nearest-neighbor hierarchical clustering with selective dimension analysis, and large language model (LLM)-based topic labeling enforcing hierarchical coherence. Experiments on three heterogeneous corpora (BBC News, BillSum, and a mixed U.S. Government agency news + Twitter dataset) show that VISTA consistently ranks among the top-performing models, achieving the highest C_UCI coherence and a strong balance between topic diversity and semantic consistency. Qualitative analyses confirm that VISTA identifies domain-relevant themes overlooked by probabilistic or prompt-based models. Overall, VISTA provides a scalable, semantically robust, and interpretable framework for topic discovery, bridging probabilistic, embedding-based, and LLM-driven paradigms in a unified and reproducible design.
(This article belongs to the Section Visualization)

22 pages, 1178 KB  
Article
Identification of Potential Biomarkers in Prostate Cancer Microarray Gene Expression Leveraging Explainable Machine Learning Classifiers
by Ahmed Al Marouf, Jon George Rokne and Reda Alhajj
Cancers 2025, 17(23), 3853; https://doi.org/10.3390/cancers17233853 - 30 Nov 2025
Viewed by 362
Abstract
Background and Objective: Prostate cancer remains one of the most prevalent and potentially lethal malignancies among men worldwide, and timely and accurate diagnosis, along with the stratification of patients by disease severity, is critical for personalized treatment and improved outcomes for this cancer. One of the tools used for diagnosis is bioinformatics. However, traditional biomarker discovery methods often lack transparency and interpretability, which means that clinicians find it difficult to trust biomarkers for their application in a clinical setting. Methods: This paper introduces a novel approach that leverages Explainable Machine Learning (XML) techniques to identify and prioritize biomarkers associated with different levels of severity of prostate cancer. The proposed XML approach presented in this study incorporates some traditional machine learning (ML) algorithms with transparent models to facilitate understanding of the importance of the characteristics for bioinformatics analysis, allowing for more informed clinical decisions. The proposed method comprises the implementation of several ML classifiers, namely Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), and Bagging (Bg), followed by SHAP values for the XML pipeline. In this study, imputation was applied to pre-process missing values; SMOTE (Synthetic Minority Oversampling Technique) and the Tomek link method were applied to handle the class imbalance problem. Stratified k-fold validation of the ML models and SHAP values (SHapley Additive exPlanations) were used for explainability. Results: This study utilized a novel tissue microarray data set containing data from 102 subjects, comprising prostate cancer patients and healthy individuals. The proposed model satisfactorily identifies genes as biomarkers, with the highest accuracy obtained being 81.01% using RF. The top 10 potential biomarkers identified in this study are DEGS1, HPN, ERG, CFD, TMPRSS2, PDLIM5, XBP1, AJAP1, NPM1 and C7. Conclusions: As XML continues to unravel the complexities within prostate cancer datasets, the identification of severity-specific biomarkers is poised at the forefront of precision oncology. This integration paves the way for targeted interventions, improving patient outcomes, and heralding a new era of individualized care in the fight against prostate cancer.

32 pages, 6248 KB  
Article
AI-Driven Resilient Fault Diagnosis of Bearings in Rotating Machinery
by Syed Muhammad Wasi ul Hassan Naqvi, Arsalan Arif, Asif Khan, Fazail Bangash, Ghulam Jawad Sirewal and Bin Huang
Sensors 2025, 25(22), 7092; https://doi.org/10.3390/s25227092 - 20 Nov 2025
Viewed by 752
Abstract
Predictive maintenance is increasingly important in rotating machinery to prevent unexpected failures, reduce downtime, and improve operational efficiency. This study compares the efficacy of traditional machine learning (ML) and deep learning (DL) techniques in diagnosing bearing faults under varying load and speed conditions. Two classification tasks were conducted: a simpler three-class task that distinguishes healthy bearings, inner race faults, and outer race faults, and a more complex nine-class task that includes faults of varying severity in the inner and outer races. Among the machine learning algorithms, ensemble bagged trees achieved the highest accuracies, 93.04% for the three-class and 87.13% for the nine-class classification, followed by neural network, SVM, KNN, decision tree, and other algorithms. For deep learning, the CNN model, trained on scalograms (time–frequency images generated by continuous wavelet transform), demonstrated superior performance, reaching up to 100% accuracy in both classification tasks after six training epochs for the nine-class classification. Although CNNs take longer to train, their superior accuracy and capability to automatically extract complex features make the investment worthwhile. Consequently, the results demonstrate that the CNN model trained on CWT-based scalogram images achieved remarkably high classification accuracy, confirming that deep learning methods can outperform traditional ML algorithms in handling complex, non-linear, and dynamic diagnostic scenarios.
(This article belongs to the Special Issue AI-Assisted Condition Monitoring and Fault Diagnosis)

19 pages, 4138 KB  
Article
Machinability Analysis of LPBF-AlSi10Mg: A Study on SL-MQL Efficiency and ML Prediction Models
by Zhenhua Dou, Kai Guo, Jie Sun and Xiaoming Huang
Processes 2025, 13(11), 3687; https://doi.org/10.3390/pr13113687 - 14 Nov 2025
Viewed by 508
Abstract
Because of their exceptional strength, corrosion resistance, and low weight, materials such as titanium, aluminum, and others are becoming increasingly popular. The application scope of additive manufacturing (AM) in the aerospace sector continues to expand. Because of its high performance and low coefficient of thermal expansion, AlSi10Mg processed by laser-based powder bed fusion (LPBF) is becoming increasingly popular in lightweight aerospace component design. Although the AM technique has a number of benefits, its only drawback, poor surface quality, necessitates post-processing. This study focuses on the machinability of AlSi10Mg under three distinct environmental conditions (dry, minimum quantity lubrication (MQL), and SL-MQL). The experimental investigations were centered on chip morphology, flank wear (Vb), surface roughness (Ra), and cutting temperature (Tc). SL-MQL reduced the roughness by 53–57% relative to dry machining and by 23–29% relative to the MQL condition, and similarly lessened the flank wear by 36–40% relative to dry machining and by 12–15% relative to the MQL condition. In addition, to check the predictive accuracy and optimize machining parameters, four machine learning models were used: Gaussian Process Regression (GPR), Bagging, Multilayer Perceptron (MLP), and Random Forest (RF). In both the training and testing stages, MLP consistently demonstrated superior performance across all parameters in comparison to other algorithms, achieving high levels of accuracy and low error rates.

26 pages, 2975 KB  
Article
CTGAN-Augmented Ensemble Learning Models for Classifying Dementia and Heart Failure
by Pornthep Phanbua, Sujitra Arwatchananukul, Georgi Hristov and Punnarumol Temdee
Inventions 2025, 10(6), 101; https://doi.org/10.3390/inventions10060101 - 6 Nov 2025
Viewed by 823
Abstract
Research shows that individuals with heart failure are 60% more likely to develop dementia because of their shared metabolic risk factors. Developing a classification model to differentiate between these two conditions effectively is crucial for improving diagnostic accuracy, guiding clinical decision-making, and supporting timely interventions in older adults. This study proposes a novel method for dementia classification, distinguishing it from its common comorbidity, heart failure, using blood testing and personal data. A dataset comprising 11,124 imbalanced electronic health records of older adults from hospitals in Chiang Rai, Thailand, was utilized. Conditional tabular generative adversarial networks (CTGANs) were employed to generate synthetic data while preserving key statistical relationships, diversity, and distributions of the original dataset. Two groups of ensemble models were analyzed: the boosting group (extreme gradient boosting and light gradient boosting machine) and the bagging group (random forest and extra trees). Performance metrics, including accuracy, precision, recall, F1-score, and area under the receiver-operating characteristic curve, were evaluated. Compared with the synthetic minority oversampling technique, CTGAN-based synthetic data generation significantly enhanced the performance of ensemble learning models in classifying dementia and heart failure.
(This article belongs to the Special Issue Machine Learning Applications in Healthcare and Disease Prediction)

8 pages, 561 KB  
Proceeding Paper
Connected Health Revolution: Deployment of an Intelligent Chatbot for Efficient Management of Online Medical Information Requests
by Achraf Berrajaa, Issam Berrajaa and Naoufal Rouky
Eng. Proc. 2025, 112(1), 50; https://doi.org/10.3390/engproc2025112050 - 27 Oct 2025
Viewed by 554
Abstract
Within the rapidly advancing disciplines of natural language processing (NLP) and artificial intelligence (AI), this paper introduces an innovative approach aimed at improving access to health-related information. Fueled by the growing reliance on digital platforms for health inquiries, our research unveils an intelligent chatbot designed to categorize health-related queries and deliver personalized advice. By leveraging a diverse dataset and employing advanced NLP techniques, our models, which include Support Vector Machines, Random Forest, and a Bagging Classifier, among others, assist in building a flexible conversational agent. The evaluation metrics demonstrate that the Bagging Classifier delivers outstanding results, reaching an accuracy of 99%. The study concludes with a comparative analysis, positioning the Bagging Classifier as a benchmark for accuracy and performance in the classification of health-related queries.

30 pages, 379 KB  
Article
An Enhanced Discriminant Analysis Approach for Multi-Classification with Integrated Machine Learning-Based Missing Data Imputation
by Autcha Araveeporn and Atid Kangtunyakarn
Mathematics 2025, 13(21), 3392; https://doi.org/10.3390/math13213392 - 24 Oct 2025
Viewed by 627
Abstract
This study addresses the challenge of accurate classification under missing data conditions by integrating multiple imputation strategies with discriminant analysis frameworks. The proposed approach evaluates six imputation methods (Mean, Regression, KNN, Random Forest, Bagged Trees, MissRanger) across several discriminant techniques. Simulation scenarios varied in sample size, predictor dimensionality, and correlation structure, while the real-world application employed the Cirrhosis Prediction Dataset. The results consistently demonstrate that ensemble-based imputations, particularly regression, KNN, and MissRanger, outperform simpler approaches by preserving multivariate structure, especially in high-dimensional and highly correlated settings. MissRanger yielded the highest classification accuracy across most discriminant analysis methods in both simulated and real data, with performance gains most pronounced when combined with flexible or regularized classifiers. Regression imputation showed notable improvements under low correlation, aligning with the theoretical benefits of shrinkage-based covariance estimation. Across all methods, larger sample sizes and high correlation enhanced classification accuracy by improving parameter stability and imputation precision.
(This article belongs to the Section D1: Probability and Statistics)
27 pages, 4945 KB  
Article
A Robust Framework for Coffee Bean Package Label Recognition: Integrating Image Enhancement with Vision–Language OCR Models
by Thi-Thu-Huong Le, Yeonjeong Hwang, Ahmada Yusril Kadiptya, JunYoung Son and Howon Kim
Sensors 2025, 25(20), 6484; https://doi.org/10.3390/s25206484 - 20 Oct 2025
Cited by 1 | Viewed by 1365
Abstract
Text recognition on coffee bean package labels is important for product tracking and brand verification, but it is challenging because of variations in image quality, packaging materials, and environmental conditions. In this paper, we propose a pipeline that combines several image enhancement techniques with an Optical Character Recognition (OCR) model based on Qwen-VL vision–language (VL) variants, conditioned by structured prompts. To facilitate evaluation, we construct a coffee bean package image set with two subsets, a low-resolution set (LRCB) and a high-resolution set (HRCB), that cover multiple real-world challenges: various packaging types (bottles and bags), label sides (front and back), rotation, and different illumination. To address the image quality problem, we design a dedicated preprocessing pipeline for package labels. We develop and evaluate four Qwen-VL OCR variants with prompt engineering and compare them against four baselines: DocTR, PaddleOCR, EasyOCR, and Tesseract. Extensive comparison using various metrics, including the Levenshtein distance, cosine similarity, Jaccard index, Exact Match, BLEU score, and ROUGE scores (ROUGE-1, ROUGE-2, and ROUGE-L), shows significant improvements over the baselines. In addition, validation on the public POIE dataset indicates that the framework generalizes well, supporting its practicality and reliability for label recognition.
(This article belongs to the Special Issue Digital Imaging Processing, Sensing, and Object Recognition)
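Three of the metrics named in the abstract above (Levenshtein distance, Jaccard index, Exact Match) can be sketched in a few lines. This is a minimal sketch, not the paper's evaluation code: the function names, the whitespace tokenization and lowercasing used for the Jaccard index, and the sample strings are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance with unit-cost insert, delete, substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard index over lowercased whitespace tokens (an assumed tokenization)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def exact_match(prediction: str, reference: str) -> bool:
    """True when the strings are identical after trimming outer whitespace."""
    return prediction.strip() == reference.strip()


# Hypothetical OCR output versus ground-truth label text.
reference = "arabica dark roast"
prediction = "arabica roast"
distance = levenshtein(prediction, reference)
overlap = jaccard(prediction, reference)
```

A lower Levenshtein distance and a higher Jaccard index both indicate that the recognized text is closer to the reference; Exact Match is the strictest of the three.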
24 pages, 2310 KB  
Article
Optimizing Mycophenolate Therapy in Renal Transplant Patients Using Machine Learning and Population Pharmacokinetic Modeling
by Anastasia Tsyplakova, Aleksandra Catic-Djorđevic, Nikola Stefanović and Vangelis D. Karalis
Med. Sci. 2025, 13(4), 235; https://doi.org/10.3390/medsci13040235 - 20 Oct 2025
Viewed by 971
Abstract
Background/Objectives: Mycophenolic acid (MPA) is used as part of first-line combination immunosuppressive therapy for renal transplant recipients. Personalized dosing approaches are needed to balance efficacy and minimize toxicity due to the pharmacokinetic variability of the drug. In this study, population pharmacokinetic (PopPK) modeling and machine learning (ML) techniques are coupled to provide valuable insights into optimizing MPA therapy. Methods: Using data from 76 renal transplant patients, two PopPK models were developed to describe and predict MPA levels for two different formulations (enteric-coated mycophenolate sodium and mycophenolate mofetil). Covariate effects on drug clearance were assessed, and Monte Carlo simulations were used to evaluate exposure under normal and reduced clearance conditions. ML techniques, including principal component analysis (PCA) and ensemble tree models (bagging and boosting), were applied to identify predictive factors and explore associations between MPA plasma/saliva concentrations and the examined covariates. Results: Total daily dose and post-transplant time (PTP) were identified as key covariates affecting clearance. PCA highlighted MPA dose as the primary determinant of plasma levels, with urea and PTP also playing significant roles. Boosted tree analysis confirmed these findings, demonstrating strong predictive accuracy (R² > 0.91). Incorporating saliva MPA levels improved predictive performance, suggesting that saliva may be a complementary monitoring tool, although plasma monitoring remained superior. Simulations allowed exploration of potential dosing adjustments for patients with reduced clearance. Conclusions: This study demonstrates the potential of integrating machine learning with population pharmacokinetic modeling to improve the understanding of MPA variability and support individualized dosing strategies in renal transplant recipients. The developed PopPK/ML models provide a methodological foundation for future research toward more personalized immunosuppressive therapy.
(This article belongs to the Section Translational Medicine)
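The idea of using Monte Carlo simulation to compare exposure under normal and reduced clearance can be sketched with the standard linear-pharmacokinetics relation AUC = dose / CL. This is an illustrative sketch only, not the study's PopPK model: the function name `simulate_auc`, the log-normal between-subject variability, complete bioavailability, and all numeric values (dose, typical clearance, `omega`) are hypothetical assumptions.

```python
import math
import random
import statistics


def simulate_auc(dose_mg, typical_cl_l_per_h, omega=0.3, n=1000, seed=0):
    """Simulate exposure (AUC, mg*h/L) for n virtual patients.

    Individual clearance is drawn as typical_cl * exp(eta), with
    eta ~ Normal(0, omega); AUC = dose / clearance assumes linear
    kinetics and complete bioavailability (both assumptions).
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n):
        cl = typical_cl_l_per_h * math.exp(rng.gauss(0.0, omega))
        aucs.append(dose_mg / cl)
    return aucs


# Hypothetical scenario: the same dose with typical versus halved clearance.
normal = simulate_auc(dose_mg=1000, typical_cl_l_per_h=15.0)
reduced = simulate_auc(dose_mg=1000, typical_cl_l_per_h=7.5)
median_normal = statistics.median(normal)
median_reduced = statistics.median(reduced)
```

Because both runs share the same seed, the simulated patients differ only in typical clearance, so halving clearance doubles each simulated AUC; a dose adjustment could then be explored by rerunning with a lower `dose_mg`.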