MDPI - Publisher of Open Access Journals

19 pages, 1039 KiB

Open Access Article

Prediction of Parkinson Disease Using Long-Term, Short-Term Acoustic Features Based on Machine Learning

by Mehdi Rashidi, Serena Arima, Andrea Claudio Stetco, Chiara Coppola, Debora Musarò, Marco Greco, Marina Damato, Filomena My, Angela Lupo, Marta Lorenzo, Antonio Danieli, Giuseppe Maruccio, Alberto Argentiero, Andrea Buccoliero, Marcello Dorian Donzella and Michele Maffia

Brain Sci. 2025, 15(7), 739; https://doi.org/10.3390/brainsci15070739 - 10 Jul 2025

Viewed by 504

Abstract

Background: Parkinson’s disease (PD) is the second most common neurodegenerative disorder after Alzheimer’s disease, affecting countless individuals worldwide. PD is characterized by the onset of marked motor symptomatology in association with several non-motor manifestations. The clinical phase of the disease is usually preceded by a long prodromal phase, devoid of overt motor symptomatology but often showing conditions such as sleep disturbance, constipation, anosmia, and phonatory changes. To date, speech analysis appears to be a promising digital biomarker that can anticipate the disease even 10 years before the onset of clinical PD, as well as serving as a useful prognostic tool for patient follow-up. For this reason, the voice can be proposed as a non-invasive means of distinguishing PD patients from healthy subjects (HS). Methods: Our study used a cross-sectional design to analyze voice impairment. A dataset comprising 81 voice samples (41 from healthy individuals and 40 from PD patients) was used to train and evaluate common machine learning (ML) models using various types of features, including long-term features (jitter, shimmer, and cepstral peak prominence (CPP)), short-term features (Mel-frequency cepstral coefficients (MFCC)), and non-standard measurements (pitch period entropy (PPE) and recurrence period density entropy (RPDE)). The study adopted multiple ML algorithms, including random forest (RF), K-nearest neighbors (KNN), decision tree (DT), naïve Bayes (NB), support vector machines (SVM), and logistic regression (LR). Cross-validation was applied to ensure the reliability of performance metrics on the train and test subsets. These metrics (accuracy, recall, and precision) help determine the most effective models for distinguishing PD from healthy subjects. Results: Among all the algorithms used in this research, random forest (RF) was the best-performing model, achieving an accuracy of 82.72% with a ROC-AUC score of 89.65%.
Although other models, such as support vector machine (SVM), could be considered, with an accuracy of 75.29% and a ROC-AUC score of 82.63%, RF was by far the best when evaluated across all metrics. K-nearest neighbors (KNN) and decision tree (DT) performed the worst. Notably, by combining a comprehensive set of long-term, short-term, and non-standard acoustic features, unlike previous studies that typically focused on only a subset, our study achieved higher predictive performance, offering a more robust model for early PD detection. Conclusions: This study highlights the potential of combining advanced acoustic analysis with ML algorithms to develop non-invasive and reliable tools for early PD detection, offering substantial benefits for the healthcare sector.

(This article belongs to the Section Neurodegenerative Diseases)
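The abstract above mentions cross-validation on 81 samples (41 HS, 40 PD) without stating the scheme. As a rough illustration only, the sketch below assumes a stratified 5-fold split; the fold count, the seed, and the helper name `stratified_folds` are assumptions for illustration and are not taken from the paper.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Split sample indices into k test folds with balanced class counts.

    Hypothetical helper: the paper does not describe its splitting code.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # rotate the starting fold so leftovers are spread out
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % k].append(idx)
        offset += len(indices)
    return folds

# Class sizes as reported in the abstract: 41 healthy subjects, 40 PD patients.
labels = ["HS"] * 41 + ["PD"] * 40
folds = stratified_folds(labels, k=5)  # k=5 is an assumption

fold_sizes = [len(f) for f in folds]
pd_per_fold = [sum(labels[i] == "PD" for i in f) for f in folds]
print(fold_sizes, pd_per_fold)
```

Each fold serves once as the test subset while the remaining indices form the training subset, so every sample is scored exactly once.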

19 pages, 3291 KiB

Open Access Article

Predicting High-Cost Healthcare Utilization Using Machine Learning: A Multi-Service Risk Stratification Analysis in EU-Based Private Group Health Insurance

by Eslam Abdelhakim Seyam

Risks 2025, 13(7), 133; https://doi.org/10.3390/risks13070133 - 8 Jul 2025

Viewed by 317

Abstract

Healthcare cost acceleration and resource allocation issues have worsened across European health systems, where a small group of patients drives excessive healthcare spending. The prediction of high-cost utilization patterns is important for the sustainable management of healthcare and focused intervention measures. The aim of our study was to derive and validate machine learning algorithms for high-cost healthcare utilization prediction based on detailed administrative data and by comparing three algorithmic methods for the best risk stratification performance. The research analyzed extensive insurance beneficiary records which compile data from health group collective funds operated by non-life insurers across EU countries, across multiple service classes. The definition of high utilization was equivalent to the upper quintile of overall health expenditure using a moderate cost threshold. The research applied three machine learning algorithms, namely logistic regression using elastic net regularization, the random forest, and support vector machines. The models used a comprehensive set of predictor variables including demographics, policy profiles, and patterns of service utilization across multiple domains of healthcare. The performance of the models was evaluated using the standard train–test methodology and rigorous cross-validation procedures. All three models demonstrated outstanding discriminative ability by achieving area under the curve values at near-perfect levels. The random forest achieved the best test performance with exceptional metrics, closely followed by logistic regression with comparable exceptional performance. Service diversity proved to be the strongest predictor across all models, while dentistry services produced an extraordinarily high odds ratio with robust confidence intervals. The group of high utilizers comprised approximately one-fifth of the sample but demonstrated significantly higher utilization across all service classes. 
Machine learning algorithms are capable of classifying patients eligible for the high utilization of healthcare services with nearly perfect discriminative ability. The findings justify the application of predictive analytics for proactive case management, resource planning, and focused intervention measures across private group health insurance providers in EU countries.

18 pages, 357 KiB

Open Access Article

Hybrid CNN-LSTM Model with Custom Activation and Loss Functions for Predicting Fan Actuator States in Smart Greenhouses

by Gregorius Airlangga, Julius Bata, Oskar Ika Adi Nugroho and Boby Hartanto Pramudita Lim

AgriEngineering 2025, 7(4), 118; https://doi.org/10.3390/agriengineering7040118 - 10 Apr 2025

Viewed by 1544

Abstract

Smart greenhouses rely on precise environmental control to optimize crop yields and resource efficiency. In this study, we propose a novel hybrid Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) architecture to predict fan actuator states based on environmental data. The hybrid model integrates CNNs for spatial feature extraction and LSTMs for temporal dependency modeling, enhanced by a custom activation function and loss function tailored for the problem’s characteristics. The model was trained and evaluated on a comprehensive dataset containing 37,923 samples with 13 environmental features, collected from a smart greenhouse. Experimental results demonstrate the superior performance of the hybrid CNN-LSTM model, achieving an accuracy of 0.9992, precision of 0.9989, recall of 0.9996, and an F1 score of 0.9992, significantly outperforming traditional machine learning methods such as Random Forest and Gradient Boosting, as well as standalone CNN and LSTM architectures. The high recall underscores the model’s reliability in identifying positive actuator states, critical for greenhouse management. This study highlights the importance of hybrid architectures in handling complex spatiotemporal data, offering potential applications beyond greenhouses, such as healthcare monitoring and predictive maintenance. Despite the model’s strengths, limitations include computational complexity and limited interpretability, necessitating future work on optimization and explainability. These findings establish a foundation for integrating deep learning into smart agricultural systems, advancing the automation and efficiency of environmental control mechanisms.

12 pages, 461 KiB

Open Access Article

The Application of Machine Learning Models to Predict Stillbirths

by Oguzhan Gunenc, Sukran Dogru, Fikriye Karanfil Yaman, Huriye Ezveci, Ulfet Sena Metin and Ali Acar

Medicina 2025, 61(3), 472; https://doi.org/10.3390/medicina61030472 - 7 Mar 2025

Viewed by 1165

Abstract

Background and Objectives: This study aims to evaluate the predictive value of comprehensive data obtained in obstetric clinics for the detection of stillbirth and the predictive ability of a set of machine learning models for stillbirth. Material and Method: The study retrospectively included all stillbirths followed up at a hospital between January 2015 and March 2024 and randomly selected pregnancies that resulted in a live birth. Pregnant women’s maternal, fetal, and obstetric characteristics were retrieved from the electronic record system. Based on the perinatal characteristics of the cases, four distinct machine learning classifiers were developed: logistic regression (LR), Support Vector Machine (SVM), Random Forest (RF), and multilayer perceptron (MLP). Results: The study included a total of 951 patients, 499 of whom had live births and 452 of whom had stillbirths. The consanguinity rate, fetal anomalies, history of previous stillbirth, maternal thrombosis, oligohydramnios, and abruption of the placenta were significantly higher in the stillbirth group (p = 0.001). Previous stillbirth histories resulted in a higher rate of stillbirth (OR: 7.31, 95%CI: 2.76–19.31, p = 0.001). Previous thrombosis histories resulted in a higher rate of stillbirth (OR: 14.13, 95%CI: 5.08–39.31, p = 0.001). According to the accuracy estimates of the machine learning models, RF is the most successful model with 96.8% accuracy, 96.3% sensitivity, and 97.2% specificity. Conclusions: The RF machine learning approach employed to predict stillbirths had an accuracy rate of 96.8%. We believe that the elevated success rate of stillbirth prediction using maternal, neonatal, and obstetric risk factors will assist healthcare providers in reducing stillbirth rates through prenatal care interventions.

(This article belongs to the Section Obstetrics and Gynecology)
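The odds ratios quoted above come with 95% confidence intervals. As a hedged sketch, assuming these are Wald-type intervals built on the log-odds scale (the abstract does not say how they were obtained), the point estimate should sit at the geometric mean of the two bounds. The 2x2 counts below are toy values chosen for illustration, not data from the study, and the function names are hypothetical.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI from a 2x2 table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def geometric_midpoint(lower, upper):
    """Centre of an interval that is symmetric on the log scale."""
    return math.sqrt(lower * upper)

# Toy counts (hypothetical, not from the paper).
toy_or, toy_lo, toy_hi = odds_ratio_wald_ci(30, 10, 70, 90)
print(round(toy_or, 2), round(toy_lo, 2), round(toy_hi, 2))

# Consistency check on the intervals reported in the abstract.
print(round(geometric_midpoint(2.76, 19.31), 2))  # compare with OR 7.31
print(round(geometric_midpoint(5.08, 39.31), 2))  # compare with OR 14.13
```

Both reported intervals land close to their stated odds ratios under this check, which is what one would expect from log-scale intervals after rounding.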

34 pages, 2988 KiB

Open Access Article

Improving Surgical Site Infection Prediction Using Machine Learning: Addressing Challenges of Highly Imbalanced Data

by Salha Al-Ahmari and Farrukh Nadeem

Diagnostics 2025, 15(4), 501; https://doi.org/10.3390/diagnostics15040501 - 19 Feb 2025

Viewed by 1113

Abstract

Background: Surgical site infections (SSIs) lead to higher hospital readmission rates and healthcare costs, representing a significant global healthcare burden. Machine learning (ML) has demonstrated potential in predicting SSIs; however, the challenge of addressing imbalanced class ratios remains. Objectives: The aim of this study is to evaluate and enhance the predictive capabilities of machine learning models for SSIs by assessing the effects of feature selection, resampling techniques, and hyperparameter optimization. Methods: Using routine SSI surveillance data from multiple hospitals in Saudi Arabia, we analyzed a dataset of 64,793 surgical patients, of whom 1632 developed SSI. Seven machine learning algorithms were created and tested: Decision Tree (DT), Gaussian Naive Bayes (GNB), Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Stochastic Gradient Boosting (SGB), and K-Nearest Neighbors (KNN). We also improved several resampling strategies, such as undersampling and oversampling. Grid search five-fold cross-validation was employed for comprehensive hyperparameter optimization, in conjunction with balanced sampling techniques. Features were selected using a filter method based on their relationships with the target variable. Results: Our findings revealed that RF achieves the highest performance, with an MCC of 0.72. The synthetic minority oversampling technique (SMOTE) is the best-performing resampling technique, consistently enhancing the performance of most machine learning models, except for LR and GNB. LR struggles with class imbalance due to its linear assumptions and bias toward the majority class, while GNB’s reliance on feature independence and Gaussian distribution make it unreliable for under-represented minority classes. For computational efficiency, the Instance Hardness Threshold (IHT) offers a viable alternative undersampling technique, though it may compromise performance to some extent. 
Conclusions: This study underscores the potential of ML models as effective tools for assessing SSI risk, warranting further clinical exploration to improve patient outcomes. By employing advanced ML techniques and robust validation methods, these models demonstrate promising accuracy and reliability in predicting SSI events, even in the face of significant class imbalances. In addition, using MCC in this study ensures a more reliable and robust evaluation of the model’s predictive performance, particularly in the presence of an imbalanced dataset, where other metrics may fail to provide an accurate evaluation.

(This article belongs to the Special Issue Artificial Intelligence for Clinical Diagnostic Decision Making)
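To make the argument for MCC concrete, the sketch below uses the class counts reported in the abstract (64,793 patients, 1632 with SSI) and scores a trivial classifier that predicts "no SSI" for everyone. This baseline is an illustration added here, not an experiment from the paper; returning 0 when the MCC denominator is zero is a common convention and is an assumption of this sketch.

```python
import math

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention: undefined MCC is reported as 0
    return (tp * tn - fp * fn) / denom

# Class counts as reported in the abstract.
total, positives = 64793, 1632
negatives = total - positives

# Trivial baseline: every patient is predicted to be SSI-free.
all_negative = dict(tp=0, fp=0, fn=positives, tn=negatives)

print(f"accuracy = {accuracy(**all_negative):.4f}")
print(f"MCC      = {mcc(**all_negative):.4f}")
```

The baseline reaches high accuracy purely because SSI cases are rare, while its MCC is zero, which is why accuracy alone can mislead on data this imbalanced.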

12 pages, 312 KiB

Open Access Article

Assessment of Quality of Life in Lithuanian Patients with Multimorbidity Using the EQ-5D-5L Questionnaire

by Olga Vasiliauskienė, Dovydas Vasiliauskas, Aušrinė Kontrimienė, Lina Jaruševičienė and Ida Liseckienė

Medicina 2025, 61(2), 292; https://doi.org/10.3390/medicina61020292 - 8 Feb 2025

Viewed by 877

Abstract

Background and Objectives: Despite the critical importance of effective healthcare management for patients with multimorbidity, robust and reliable tools for assessing health-related quality of life in Lithuania remain scarce. We aim to identify trends in the quality of life of patients with multimorbidity and to evaluate the effectiveness of the Lithuanian version of the EuroQol EQ-5D-5L questionnaire. Materials and Methods: The study included patients between the ages of 40 and 85 (N = 498) who had at least two chronic conditions, arterial hypertension being a prerequisite. The participants completed a comprehensive set of questionnaires specifically prepared for the TELELISPA “Improved healthcare quality for patients with multimorbidity in Lithuania” project which included the translated EQ-5D-5L questionnaire. The predictive validity of the EQ-5D-5L questionnaire was assessed using correlations with the SF-36 and EQ-VAS scores, a random forest regression model. Reliability was evaluated using Cronbach’s alpha and inter-item correlations. Trends in the quality of life in different patient groups were assessed with Chi-square tests. Results: The EQ-5D-5L questionnaire demonstrated high reliability and validity with a Cronbach’s alpha value of 0.737, EQ-5D-5L random forest machine learning regression model RMSE value of 0.1396, and adequate scores from other measures. Lower quality of life was found in patients with multimorbidity who had chronic conditions such as angina pectoris, heart failure, atrial fibrillation, or joint diseases, as well as the patients who were older than 60 years of age, women, or unemployed. Different aspects of quality of life were also significantly negatively impacted by diabetes, asthma, and chronic kidney disease. Heart failure, joint diseases, and older age had the biggest negative effect on quality of life. 
Conclusions: The Lithuanian EQ-5D-5L questionnaire is suitable for assessing quality of life in patients with multimorbidity and indicates lower quality of life among those with specific cardiovascular and joint chronic conditions and in particular demographic groups.

(This article belongs to the Special Issue Providing Primary Care to Those with Multimorbidities: Current Challenges and Technologies)
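The reliability figure above is a Cronbach's alpha. A minimal sketch of how such a coefficient is computed is given below, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is invented toy data with five items (matching the five EQ-5D-5L dimensions) and is not drawn from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses: 6 respondents x 5 items, levels 1 to 5.
toy_rows = [
    [1, 1, 2, 1, 2],
    [2, 2, 2, 3, 2],
    [3, 2, 3, 3, 4],
    [4, 4, 3, 5, 4],
    [2, 1, 1, 2, 3],
    [5, 4, 4, 5, 5],
]
alpha = cronbach_alpha(toy_rows)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together; the toy data are deliberately consistent, so the result is higher than the 0.737 reported in the abstract.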

20 pages, 1215 KiB

Open Access Systematic Review

Machine Learning and Deep Learning Models for Dengue Diagnosis Prediction: A Systematic Review

by Daniel Cristobal Andrade Girón, William Joel Marín Rodriguez, Flor de María Lioo-Jordan and Jose Luis Ausejo Sánchez

Informatics 2025, 12(1), 15; https://doi.org/10.3390/informatics12010015 - 6 Feb 2025

Cited by 1 | Viewed by 3434

Abstract

The global crisis triggered by the dengue outbreak has increased mortality and placed significant pressure on healthcare services worldwide. In response to this crisis, there has been a notable increase in research employing machine learning and deep learning algorithms to anticipate diagnosis in patients with suspected dengue. To conduct a comprehensive systematic review, a detailed analysis was carried out to explore and examine the machine learning methodologies applied in diagnosing this disease. An exhaustive search was conducted across numerous scientific databases, including Scopus, IEEE Xplore, PubMed, ACM, ScienceDirect, Wiley, and Sage, encompassing studies up to May 2024. This extensive search yielded a total of 2723 relevant articles. Following a rigorous evaluation, 32 scientific studies were selected for the final review, meeting the established criteria. A comprehensive analysis of these studies revealed the implementation of 48 distinct machine learning and deep learning algorithms, showcasing the heterogeneity of methodological approaches employed in the research domain. The results indicated that, in terms of performance, the support vector machine (SVM) algorithm was the most efficient, being reported in 25% of the analyzed studies. The Random Forest algorithm was the second most frequently used, appearing in 15.62% of the 32 reviewed articles. The PCA-SVM algorithm (poly-5), a variant of SVM, emerged as the best-performing model, achieving 99.52% accuracy, 99.75% sensitivity, and 99.09% specificity. These findings offer significant insights into the potential of machine learning techniques in the early diagnosis of dengue, underscoring the necessity to persist in exploring and refining these methodologies to enhance clinical care in cases of this disease.

18 pages, 514 KiB

Open Access Systematic Review

Exploring Applications of Artificial Intelligence in Critical Care Nursing: A Systematic Review

by Elena Porcellato, Corrado Lanera, Honoria Ocagli and Matteo Danielis

Nurs. Rep. 2025, 15(2), 55; https://doi.org/10.3390/nursrep15020055 - 4 Feb 2025

Cited by 3 | Viewed by 5472

Abstract

Background: Artificial intelligence (AI) has been increasingly employed in healthcare across diverse domains, including medical imaging, personalized diagnostics, therapeutic interventions, and predictive analytics using electronic health records. Its integration is particularly impactful in critical care, where AI has demonstrated the potential to enhance patient outcomes. This systematic review critically evaluates the current applications of AI within the domain of critical care nursing. Methods: This systematic review is registered with PROSPERO (CRD42024545955) and was conducted in accordance with PRISMA guidelines. Comprehensive searches were performed across MEDLINE/PubMed, SCOPUS, CINAHL, and Web of Science. Results: The initial review identified 1364 articles, of which 24 studies met the inclusion criteria. These studies employed diverse AI techniques, including classical models (e.g., logistic regression), machine learning approaches (e.g., support vector machines, random forests), deep learning architectures (e.g., neural networks), and generative AI tools (e.g., ChatGPT). The analyzed health outcomes encompassed postoperative complications, ICU admissions and discharges, triage assessments, pressure injuries, sepsis, delirium, and predictions of adverse events or critical vital signs. Most studies relied on structured data from electronic medical records, such as vital signs and laboratory results, supplemented by unstructured data, including nursing notes and patient histories; two studies also integrated audio data. Conclusion: AI demonstrates significant potential in nursing, facilitating the use of clinical practice data for research and decision-making. The choice of AI techniques varies based on the specific objectives and requirements of the model. However, the heterogeneity of the studies included in this review limits the ability to draw definitive conclusions about the effectiveness of AI applications in critical care nursing. 
Future research should focus on more robust, interventional studies to assess the impact of AI on nursing-sensitive outcomes. Additionally, exploring a broader range of health outcomes and AI applications in critical care will be crucial for advancing AI integration in nursing practices.

(This article belongs to the Special Issue Advances in Critical Care Nursing)

20 pages, 942 KiB

Open Access Systematic Review

Evaluating the Performance of Artificial Intelligence-Based Large Language Models in Orthodontics—A Systematic Review and Meta-Analysis

by Farraj Albalawi, Sanjeev B. Khanagar, Kiran Iyer, Nora Alhazmi, Afnan Alayyash, Anwar S. Alhazmi, Mohammed Awawdeh and Oinam Gokulchandra Singh

Appl. Sci. 2025, 15(2), 893; https://doi.org/10.3390/app15020893 - 17 Jan 2025

Cited by 3 | Viewed by 1996

Abstract

Background: In recent years, there has been remarkable growth in AI-based applications in healthcare, with a significant breakthrough marked by the launch of large language models (LLMs) such as ChatGPT and Google Bard. Patients and health professional students commonly utilize these models due to their accessibility. The increasing use of LLMs in healthcare necessitates an evaluation of their ability to generate accurate and reliable responses. Objective: This study assessed the performance of LLMs in answering orthodontic-related queries through a systematic review and meta-analysis. Methods: A comprehensive search of PubMed, Web of Science, Embase, Scopus, and Google Scholar was conducted up to 31 October 2024. The quality of the included studies was evaluated using the Prediction model Risk of Bias Assessment Tool (PROBAST), and R Studio software (Version 4.4.0) was employed for meta-analysis and heterogeneity assessment. Results: Out of 278 retrieved articles, 10 studies were included. The most commonly used LLM was ChatGPT (10/10, 100% of papers), followed by Google’s Bard/Gemini (3/10, 30% of papers), and Microsoft’s Bing/Copilot AI (2/10, 20% of papers). Accuracy was primarily evaluated using Likert scales, while the DISCERN tool was frequently applied for reliability assessment. The meta-analysis indicated that the LLMs, such as ChatGPT-4 and other models, do not significantly differ in generating responses to queries related to the specialty of orthodontics. The forest plot revealed a Standard Mean Deviation of 0.01 [CI: 0.42–0.44]. No heterogeneity was observed between the experimental group (ChatGPT-3.5, Gemini, and Copilot) and the control group (ChatGPT-4). However, most studies exhibited a high PROBAST risk of bias due to the lack of standardized evaluation tools. 
Conclusions: ChatGPT-4 has been extensively used for a variety of tasks and has demonstrated advanced and encouraging outcomes compared to other LLMs, and thus can be regarded as a valuable tool for enhancing educational and learning experiences. While LLMs can generate comprehensive responses, their reliability is compromised by the absence of peer-reviewed references, necessitating expert oversight in healthcare applications.

(This article belongs to the Special Issue Applied and Innovative Computational Intelligence Systems: 3rd Edition)

21 pages, 4884 KiB

Open Access Article

Evaluation of Machine Learning Algorithms for Classification of Visual Stimulation-Induced EEG Signals in 2D and 3D VR Videos

by Mingliang Zuo, Xiaoyu Chen and Li Sui

Brain Sci. 2025, 15(1), 75; https://doi.org/10.3390/brainsci15010075 - 16 Jan 2025

Cited by 3 | Viewed by 1550

Abstract

Background: Virtual reality (VR) has become a transformative technology with applications in gaming, education, healthcare, and psychotherapy. The subjective experiences in VR vary based on the virtual environment’s characteristics, and electroencephalography (EEG) is instrumental in assessing these differences. By analyzing EEG signals, researchers can explore the neural mechanisms underlying cognitive and emotional responses to VR stimuli. However, distinguishing EEG signals recorded in two-dimensional (2D) versus three-dimensional (3D) VR environments remains underexplored. Current research primarily utilizes power spectral density (PSD) features to differentiate between 2D and 3D VR conditions, but the potential of other feature parameters for enhanced discrimination is unclear. Additionally, the use of machine learning techniques to classify EEG signals from 2D and 3D VR using alternative features has not been thoroughly investigated, highlighting the need for further research to identify robust EEG features and effective classification methods. Methods: This study recorded EEG signals from participants exposed to 2D and 3D VR video stimuli to investigate the neural differences between these conditions. Key features extracted from the EEG data included PSD and common spatial patterns (CSPs), which capture frequency-domain and spatial-domain information, respectively. To evaluate classification performance, several classical machine learning algorithms were employed: support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), naive Bayes, decision tree, AdaBoost, and a voting classifier. The study systematically compared the classification performance of PSD and CSP features across these algorithms, providing a comprehensive analysis of their effectiveness in distinguishing EEG signals in response to 2D and 3D VR stimuli.
Results: The study demonstrated that machine learning algorithms can effectively classify EEG signals recorded while watching 2D and 3D VR videos. CSP features outperformed PSD in classification accuracy, indicating their superior ability to capture EEG signal differences between the VR conditions. Among the machine learning algorithms, the Random Forest classifier achieved the highest accuracy at 95.02%, followed by KNN with 93.16% and SVM with 91.39%. The combination of CSP features with RF, KNN, and SVM consistently showed superior performance compared to other feature-algorithm combinations, underscoring the effectiveness of CSP and these algorithms in distinguishing EEG responses to different VR experiences. Conclusions: This study demonstrates that EEG signals recorded while watching 2D and 3D VR videos can be effectively classified using machine learning algorithms with extracted feature parameters. The findings highlight the superiority of CSP features over PSD in distinguishing EEG signals under different VR conditions, emphasizing CSP’s value in VR-induced EEG analysis. These results expand the application of feature-based machine learning methods in EEG studies and provide a foundation for future research into the brain cortical activity of VR experiences, supporting the broader use of machine learning in EEG-based analyses.

(This article belongs to the Section Computational Neuroscience, Neuroinformatics, and Neurocomputing)
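Among the algorithms listed above is a voting classifier. The abstract does not say whether hard or soft voting was used; the sketch below assumes hard (majority) voting over the label predictions of several base classifiers. The predictions are invented toy labels and `majority_vote` is a hypothetical helper, not code from the study.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label predictions by hard (majority) voting.

    predictions_per_model: one list of predicted labels per base model,
    all of the same length (one entry per sample).
    """
    combined = []
    for sample_votes in zip(*predictions_per_model):
        label, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined

# Toy predictions from three hypothetical base models on three EEG epochs.
model_a = ["2D", "3D", "3D"]
model_b = ["2D", "2D", "3D"]
model_c = ["3D", "2D", "3D"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With an odd number of base models and two classes, ties cannot occur, which keeps the combination rule unambiguous.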

21 pages, 5273 KiB

Open Access Article

Integrating Statistical Methods and Machine Learning Techniques to Analyze and Classify COVID-19 Symptom Severity

by Yaqeen Raddad, Ahmad Hasasneh, Obada Abdallah, Camil Rishmawi and Nouar Qutob

Big Data Cogn. Comput. 2024, 8(12), 192; https://doi.org/10.3390/bdcc8120192 - 16 Dec 2024

Cited by 1 | Viewed by 2097

Abstract

Background/Objectives: The COVID-19 pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), led to significant global health challenges, including the urgent need for accurate symptom severity prediction aimed at optimizing treatment. While machine learning (ML) and deep learning (DL) models have shown promise in predicting COVID-19 severity using imaging and clinical data, there is limited research utilizing comprehensive tabular symptom datasets. This study aims to address this gap by leveraging a detailed symptom dataset to develop robust models for categorizing COVID-19 symptom severity, thereby enhancing clinical decision making. Methods: A unique tabular dataset was created using questionnaire responses from 5654 individuals, including demographic information, comorbidities, travel history, and medical data. Both unsupervised and supervised ML techniques were employed, including k-means clustering to categorize symptom severity into mild, moderate, and severe clusters. In addition, classification models, namely, Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), random forest, and a deep neural network (DNN) were used to predict symptom severity levels. Feature importance was analyzed using the random forest model for its robustness with high-dimensional data and ability to capture complex non-linear relationships, and statistical significance was evaluated through ANOVA and Chi-square tests. Results: Our study showed that fatigue, joint pain, and headache were the most important features in predicting severity. SVM, AdaBoost, and random forest achieved an accuracy of 94%, while XGBoost achieved an accuracy of 96%. DNN showed robust performance in handling complex patterns with 98% accuracy. In terms of precision and recall metrics, both the XGBoost and DNN models demonstrated robust performance, particularly for the moderate class. XGBoost recorded 98% precision and 97% recall, while DNN achieved 99% precision and recall. The clustering approach improved classification accuracy by reducing noise and dimensionality. Statistical tests confirmed the significance of additional features like Body Mass Index (BMI), age, and dominant variant type. Conclusions: Integrating symptom data with advanced ML models offers a promising approach for accurate COVID-19 severity classification. This method provides a reliable tool for healthcare professionals to optimize patient care and resource management, particularly in managing COVID-19 and potential future pandemics. Future work should focus on incorporating imaging and clinical data to further enhance model accuracy and clinical applicability.

24 pages, 12705 KiB

Open AccessArticle

Site Selection of Elderly Care Facilities Based on Multi-Source Spatial Big Data and Integrated Learning

by Yin Zhang, Junhong Zhu, Fangyi Li and Yingjie Wang

ISPRS Int. J. Geo-Inf. 2024, 13(12), 451; https://doi.org/10.3390/ijgi13120451 - 15 Dec 2024

Cited by 2 | Viewed by 1492

Abstract

This study explores a method to improve the site selection for elderly care facilities in an aging region, using Hefei City, China, as the study area. It combines topographic conditions, population distribution, economic development status, and other multi-source spatial big data at a 500 m grid scale; constructs a prediction model for the suitability of sites for elderly care facilities based on integrated learning; and carries out a comprehensive evaluation and feature importance analysis. Finally, it uses trained random forest (RF) and gradient boosting decision tree (GBDT) models to predict preliminary site selection results for elderly care facilities. A second screening that compares three degrees of population aging is conducted to obtain the final site selection results. The results show the following: (1) The comprehensive evaluation indexes of the two integrated learning models, RF and GBDT, are above 80%, which is better than the four single learning models. (2) The prediction results of the RF and GBDT models have 87.9% and 78.4% fit to existing elderly facilities, respectively, which indicates that the methods are reasonable and reliable. (3) The results of both the RF and GBDT models indicate that the closest distance to healthcare facilities and the size of the population distribution are the two most important factors affecting the location of elderly care facilities. (4) The results of the preliminary site selection show an overall spatial distribution of higher suitability in the main urban area and lower suitability in the suburban counties. The secondary screening finds that, in the site selection of new elderly care facilities, priority should be given as soon as possible to the periphery of the main urban area and to Lujiang County and other surrounding townships that have a more serious degree of aging.

19 pages, 576 KiB

Open AccessArticle

Predicting Missing Values in Survey Data Using Prompt Engineering for Addressing Item Non-Response

by Junyung Ji, Jiwoo Kim and Younghoon Kim

Future Internet 2024, 16(10), 351; https://doi.org/10.3390/fi16100351 - 27 Sep 2024

Viewed by 2366

Abstract

Survey data play a crucial role in various research fields, including economics, education, and healthcare, by providing insights into human behavior and opinions. However, item non-response, where respondents fail to answer specific questions, presents a significant challenge by creating incomplete datasets that undermine data integrity and can hinder or even prevent accurate analysis. Traditional methods for addressing missing data, such as statistical imputation techniques and deep learning models, often fall short when dealing with the rich linguistic content of survey data. These approaches are also hampered by high time complexity for training and the need for extensive preprocessing or feature selection. In this paper, we introduce an approach that leverages Large Language Models (LLMs) through prompt engineering for predicting item non-responses in survey data. Our method combines the strengths of both traditional imputation techniques and deep learning methods with the advanced linguistic understanding of LLMs. By integrating respondent similarities, question relevance, and linguistic semantics, our approach enhances the accuracy and comprehensiveness of survey data analysis. The proposed method bypasses the need for complex preprocessing and additional training, making it adaptable, scalable, and capable of generating explainable predictions in natural language. We evaluated the effectiveness of our LLM-based approach through a series of experiments, demonstrating its competitive performance against established methods such as Multivariate Imputation by Chained Equations (MICE), MissForest, and deep learning models like TabTransformer. The results show that our approach not only matches but, in some cases, exceeds the performance of these methods while significantly reducing the time required for data processing.

23 pages, 1151 KiB

Open AccessArticle

Enhancing Cybersecurity in Healthcare: Evaluating Ensemble Learning Models for Intrusion Detection in the Internet of Medical Things

by Theyab Alsolami, Bader Alsharif and Mohammad Ilyas

Sensors 2024, 24(18), 5937; https://doi.org/10.3390/s24185937 - 13 Sep 2024

Cited by 11 | Viewed by 4193

Abstract

This study investigates the efficacy of machine learning models for intrusion detection in the Internet of Medical Things, aiming to enhance cybersecurity defenses and protect sensitive healthcare data. The analysis focuses on evaluating the performance of ensemble learning algorithms, specifically Stacking, Bagging, and Boosting, using Random Forest and Support Vector Machines as base models on the WUSTL-EHMS-2020 dataset. Through a comprehensive examination of performance metrics such as accuracy, precision, recall, and F1-score, Stacking demonstrates exceptional accuracy and reliability in detecting and classifying cyber attack incidents with an accuracy rate of 98.88%. Bagging is ranked second, with an accuracy rate of 97.83%, while Boosting yielded the lowest accuracy rate of 88.68%.

(This article belongs to the Special Issue Recent Trends and Advances in Sensors Cybersecurity)

17 pages, 12206 KiB

Open AccessArticle

Machine Learning versus Cox Models for Predicting Overall Survival in Patients with Osteosarcoma: A Retrospective Analysis of the EURAMOS-1 Clinical Trial Data

by Marta Spreafico, Audinga-Dea Hazewinkel, Michiel A. J. van de Sande, Hans Gelderblom and Marta Fiocco

Cancers 2024, 16(16), 2880; https://doi.org/10.3390/cancers16162880 - 19 Aug 2024

Viewed by 1661

Abstract

Since the mid-1980s, there has been little progress in improving survival of patients diagnosed with osteosarcoma. Survival prediction models play a key role in clinical decision-making, guiding healthcare professionals in tailoring treatment strategies based on individual patient risks. The increasing interest of the medical community in using machine learning (ML) for predicting survival has sparked an ongoing debate on the value of ML techniques versus more traditional statistical modelling (SM) approaches. This study investigates the use of SM versus ML methods in predicting overall survival (OS) using osteosarcoma data from the EURAMOS-1 clinical trial (NCT00134030). The well-established Cox proportional hazard model is compared to the extended Cox model that includes time-varying effects, and to the ML methods random survival forests and survival neural networks. The impact of eight variables on OS predictions is explored. Results are compared on different model performance metrics, variable importance, and patient-specific predictions. The article provides comprehensive insights to aid healthcare researchers in evaluating diverse survival prediction models for low-dimensional clinical data.

(This article belongs to the Special Issue Surgery for Osteosarcoma)
