MDPI - Publisher of Open Access Journals

18 pages, 1197 KiB

Open Access Article

Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy

by Fatma Hilal Yagin, Yasin Gormez, Cemil Colak, Abdulmohsen Algarni, Fahaid Al-Hashem and Luca Paolo Ardigò

Pharmaceuticals 2025, 18(7), 975; https://doi.org/10.3390/ph18070975 - 28 Jun 2025

Viewed by 820

Abstract

Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. While tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework—leveraging deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints—to accurately predict TKI bioactivity, ultimately accelerating the preclinical phase of drug development. Methods: A curated dataset of 28,314 small molecules from the ChEMBL database targeting 11 tyrosine kinases was analyzed. Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC–AUC, precision–recall curves, and log loss. Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC–AUC and precision–recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition. Conclusions: This work highlights the transformative role of machine learning—particularly dANNs and SVM—in rational drug discovery. By enabling accurate bioactivity prediction, our model pipeline can effectively reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology. 
By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly with regard to cancer therapy and kinase-associated disorders.
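
The abstract reports log loss as its measure of probabilistic reliability. As a rough illustration of why that metric can rank models differently from accuracy or F1-score, the following minimal sketch computes binary log loss in plain Python. The function name, the clipping constant `eps`, and the toy labels and probabilities are assumptions made for illustration; they are not taken from the article.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted P(class = 1)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical toy data: both models classify every case correctly at a 0.5
# threshold, but the first one assigns more confident probabilities.
labels = [1, 0, 1, 1, 0]
confident = [0.9, 0.1, 0.8, 0.95, 0.2]
hedged = [0.6, 0.4, 0.6, 0.6, 0.4]

print(log_loss(labels, confident))
print(log_loss(labels, hedged))
```

Two models with identical accuracy can therefore differ in log loss, which is consistent with one model leading on F1-score while another leads on log loss.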

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)

20 pages, 2067 KiB

Open Access Article

Explainable Boosting Machines Identify Key Metabolomic Biomarkers in Rheumatoid Arthritis

by Fatma Hilal Yagin, Cemil Colak, Abdulmohsen Algarni, Ali Algarni, Fahaid Al-Hashem and Luca Paolo Ardigò

Medicina 2025, 61(5), 833; https://doi.org/10.3390/medicina61050833 - 30 Apr 2025

Viewed by 991

Abstract

Background and Objectives: Rheumatoid arthritis (RA) is a chronic autoimmune disease characterised by joint inflammation and pain. Metabolomics approaches, which provide high-throughput profiling of small-molecule metabolites in the plasma or serum of RA patients, have so far supported biomarker discovery for clinical subgroups, risk factors, and predictors of treatment response using classical statistical approaches or machine learning models. Despite these recent developments, an explainable artificial intelligence (XAI)-based methodology has not been used to identify RA metabolomic biomarkers and distinguish patients with RA. This study constructed an XAI-based explainable boosting machine (EBM) model using global plasma metabolomics profiling to identify metabolites predictive of RA and to develop a classification model that can distinguish RA patients from healthy controls. Materials and Methods: Global plasma metabolomics data were analysed from RA patients (49 samples) and healthy individuals (10 samples). The SMOTE technique was used to address class imbalance during data preprocessing. EBM, LightGBM, and AdaBoost algorithms were applied to generate a discriminatory model between RA and controls. Comprehensive performance metrics were calculated, and the interpretability of the optimal model was assessed using global and local feature explanations. Results: A total of 59 samples were analysed, 49 from RA patients and 10 from healthy subjects. The EBM outperformed LightGBM and AdaBoost, attaining an AUC of 0.901 (95% CI: 0.847–0.955) with 87.8% sensitivity, which helps prevent false-negative results in early RA diagnosis. The primary biomarkers identified by the EBM-based XAI were N-acetylleucine, pyruvic acid, and glycerol-3-phosphate. EBM global explanation analysis indicated that elevated pyruvic acid levels were significantly correlated with RA, whereas N-acetylleucine exhibited a nonlinear relationship, implying possible protective effects at specific concentrations.
Conclusions: This study underscores the promise of XAI and EBM methodology in developing biomarkers for RA through metabolomics. The discovered metabolites offer significant insights into RA pathophysiology and may function as diagnostic biomarkers or therapeutic targets. Incorporating EBM methodologies integrated with XAI improves model transparency and increases the therapeutic applicability of predictive models for RA diagnosis/management. Furthermore, the transparent structure of the EBM model empowers clinicians to understand and verify the reasoning behind each prediction, thereby fostering trust in AI-assisted decision-making and facilitating the integration of metabolomic insights into routine clinical practice.
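
The abstract names SMOTE as the remedy for the 49-versus-10 class imbalance but does not describe its settings. The following is a minimal plain-Python sketch of the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours). The function name `smote_sketch`, the neighbour count `k`, the seed, and the toy feature vectors are all hypothetical; the authors' actual implementation and parameters are not given in the abstract.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature "control" samples standing in for metabolite levels.
controls = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(controls, n_new=6)
print(len(new_points))
```

If the goal were full parity between the two groups, which the abstract does not state, 49 − 10 = 39 synthetic control samples would be needed. Oversampling is normally applied to training folds only, so that synthetic points do not leak into the evaluation data.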

(This article belongs to the Special Issue New Strategies for the Diagnosis and Treatment of Rheumatic and Musculoskeletal Diseases)

18 pages, 2075 KiB

Open Access Article

Proposed Comprehensive Methodology Integrated with Explainable Artificial Intelligence for Prediction of Possible Biomarkers in Metabolomics Panel of Plasma Samples for Breast Cancer Detection

by Cemil Colak, Fatma Hilal Yagin, Abdulmohsen Algarni, Ali Algarni, Fahaid Al-Hashem and Luca Paolo Ardigò

Medicina 2025, 61(4), 581; https://doi.org/10.3390/medicina61040581 - 25 Mar 2025

Cited by 2 | Viewed by 1209

Abstract

Aim: Breast cancer (BC) is the most common type of cancer in women, accounting for more than 30% of new female cancers each year. Although various treatments are available for BC, most cancer-related deaths are due to incurable metastases. Therefore, the early diagnosis and treatment of BC are crucial before metastasis. Mammography and ultrasonography are primarily used in the clinic for the initial identification and staging of BC; these methods are useful for general screening but have limitations in terms of sensitivity and specificity. Omics-based biomarkers, such as those from metabolomics, can make early diagnosis and the tracking of disease progression more accurate and can support personalized treatment plans tailored to each tumor’s specific molecular profile. Metabolomics technology is a feasible and comprehensive method for early disease detection and biomarker identification at the molecular level. This research aimed to establish an interpretable predictive artificial intelligence (AI) model using plasma-based metabolomics panel data to identify potential biomarkers that distinguish BC individuals from healthy controls. Methods: A cohort of 138 BC patients and 76 healthy controls was studied. Plasma metabolites were examined using LC-TOFMS and GC-TOFMS techniques. Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Random Forest (RF) were evaluated using performance metrics such as Receiver Operating Characteristic-Area Under the Curve (ROC AUC), accuracy, sensitivity, specificity, and F1 score. ROC and Precision-Recall (PR) curves were generated for comparative analysis. SHapley Additive exPlanations (SHAP) analysis was used to interpret the optimal prediction model. Results: The RF algorithm showed improved accuracy (0.963 ± 0.043) and sensitivity (0.977 ± 0.051); however, LightGBM achieved the highest ROC AUC (0.983 ± 0.028).
RF also achieved the best Precision-Recall Area under the Curve (PR AUC) at 0.989. SHAP analysis identified glycerophosphocholine and pentosidine as the most significant discriminatory metabolites. Uracil, glutamine, and butyrylcarnitine were also among the significant metabolites. Conclusions: Metabolomics biomarkers and an explainable AI (XAI)-based prediction model showed significant diagnostic accuracy and sensitivity in the detection of BC. The proposed XAI system using interpretable metabolite data can serve as a clinical decision support tool to improve early diagnosis processes.

(This article belongs to the Special Issue Insights and Advances in Cancer Biomarkers)

15 pages, 1180 KiB

Open Access Article

Untargeted Lipidomic Biomarkers for Liver Cancer Diagnosis: A Tree-Based Machine Learning Model Enhanced by Explainable Artificial Intelligence

by Cemil Colak, Fatma Hilal Yagin, Abdulmohsen Algarni, Ali Algarni, Fahaid Al-Hashem and Luca Paolo Ardigò

Medicina 2025, 61(3), 405; https://doi.org/10.3390/medicina61030405 - 26 Feb 2025

Cited by 1 | Viewed by 1561

Abstract

Background and Objectives: Liver cancer ranks among the leading causes of cancer-related mortality, necessitating the development of novel diagnostic methods. Deregulated lipid metabolism, a hallmark of hepatocarcinogenesis, offers compelling prospects for biomarker identification. This study aims to employ explainable artificial intelligence (XAI) to identify lipidomic biomarkers for liver cancer and to develop a robust predictive model for early diagnosis. Materials and Methods: This study included 219 patients diagnosed with liver cancer and 219 healthy controls. Serum samples underwent untargeted lipidomic analysis with LC-QTOF-MS. Lipidomic data underwent univariate and multivariate analyses, including fold change (FC), t-tests, PLS-DA, and Elastic Net feature selection, to identify significant biomarker candidate lipids. Machine learning models (AdaBoost, Random Forest, Gradient Boosting) were developed and evaluated utilizing these biomarkers to differentiate liver cancer. The AUC metric was employed to identify the optimal predictive model, whereas SHAP was utilized to achieve interpretability of the model’s predictive decisions. Results: Notable alterations in lipid profiles were observed: decreased sphingomyelins (SM d39:2, SM d41:2) and increased fatty acids (FA 14:1, FA 22:2) and phosphatidylcholines (PC 34:1, PC 32:1). AdaBoost exhibited a superior classification performance, achieving an AUC of 0.875. SHAP identified PC 40:4 as the most influential lipid for model predictions. The SM d41:2 and SM d36:3 lipids were specifically associated with an increased risk of low-onset cancer and elevated levels of the PC 40:4 lipid. Conclusions: This study demonstrates that untargeted lipidomics, in conjunction with explainable artificial intelligence (XAI) and machine learning, may effectively identify biomarkers for the early detection of liver cancer.
The results suggest that alterations in lipid metabolism are crucial to the progression of liver cancer and provide valuable insights for incorporating lipidomics into precision oncology.

(This article belongs to the Special Issue Insights and Advances in Cancer Biomarkers)

16 pages, 1777 KiB

Open Access Article

Metabolomics Biomarker Discovery to Optimize Hepatocellular Carcinoma Diagnosis: Methodology Integrating AutoML and Explainable Artificial Intelligence

by Fatma Hilal Yagin, Radwa El Shawi, Abdulmohsen Algarni, Cemil Colak, Fahaid Al-Hashem and Luca Paolo Ardigò

Diagnostics 2024, 14(18), 2049; https://doi.org/10.3390/diagnostics14182049 - 15 Sep 2024

Cited by 5 | Viewed by 2014

Abstract

Background: This study aims to assess the efficacy of combining automated machine learning (AutoML) and explainable artificial intelligence (XAI) in identifying metabolomic biomarkers that can differentiate between hepatocellular carcinoma (HCC) and liver cirrhosis in patients with hepatitis C virus (HCV) infection. Methods: We investigated publicly accessible data encompassing HCC patients and cirrhotic controls. The TPOT tool, which is an AutoML tool, was used to optimize the preparation of features and data, as well as to select the most suitable machine learning model. The TreeSHAP approach, which is a type of XAI, was used to interpret the model by assessing each metabolite’s individual contribution to the categorization process. Results: TPOT had superior performance in distinguishing between HCC and cirrhosis compared to the other AutoML approaches (Auto-sklearn and H2O AutoML) and to traditional machine learning models such as random forest, support vector machine, and k-nearest neighbor. The TPOT technique attained an AUC value of 0.81, showcasing superior accuracy, sensitivity, and specificity in comparison to the other models. Key metabolites, including L-valine, glycine, and DL-isoleucine, were identified as essential by TPOT and subsequently verified by TreeSHAP analysis. TreeSHAP provided a comprehensive explanation of the contribution of these metabolites to the model’s predictions, thereby increasing the interpretability and dependability of the results. This thorough assessment highlights the strength and reliability of the AutoML framework in the development of clinical biomarkers. Conclusions: This study shows that AutoML and XAI can be used together to create metabolomic biomarkers that are specific to HCC. The exceptional performance of TPOT in comparison to traditional models highlights its capacity to identify biomarkers. Furthermore, TreeSHAP boosted model transparency by highlighting the relevance of certain metabolites.
This comprehensive method has the potential to enhance the identification of biomarkers and generate precise, easily understandable, AI-driven solutions for diagnosing HCC.

(This article belongs to the Special Issue Artificial Intelligence and Deep Learning in Clinical Classification and Prediction)

19 pages, 1470 KiB

Open Access Article

Platelet Metabolites as Candidate Biomarkers in Sepsis Diagnosis and Management Using the Proposed Explainable Artificial Intelligence Approach

by Fatma Hilal Yagin, Umran Aygun, Abdulmohsen Algarni, Cemil Colak, Fahaid Al-Hashem and Luca Paolo Ardigò

J. Clin. Med. 2024, 13(17), 5002; https://doi.org/10.3390/jcm13175002 - 23 Aug 2024

Cited by 4 | Viewed by 2038

Abstract

Background: Sepsis is characterized by an atypical immune response to infection and is a dangerous health problem leading to significant mortality. Current diagnostic methods exhibit insufficient sensitivity and specificity and require the discovery of precise biomarkers for the early diagnosis and treatment of sepsis. Platelets, known for their hemostatic abilities, also play an important role in immunological responses. This study aims to develop a model integrating machine learning and explainable artificial intelligence (XAI) to identify novel platelet metabolomics markers of sepsis. Methods: A total of 39 participants, 25 diagnosed with sepsis and 14 control subjects, were included in the study. The profiles of platelet metabolites were analyzed using quantitative 1H-nuclear magnetic resonance (NMR) technology. Data were processed using the synthetic minority oversampling technique (SMOTE)-Tomek to address the issue of class imbalance. In addition, missing data were filled using a technique based on random forests. Three machine learning models, namely extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and kernel tree boosting (KTBoost), were used for sepsis prediction. The models were validated using cross-validation. Clinical annotations of the optimal sepsis prediction model were analyzed using SHapley Additive exPlanations (SHAP), an XAI technique. Results: The results showed that the KTBoost model (0.900 accuracy and 0.943 AUC) achieved better performance than the other models in sepsis diagnosis. SHAP results revealed that metabolites such as carnitine, glutamate, and myo-inositol are important biomarkers in sepsis prediction and intuitively explained the prediction decisions of the model. Conclusion: Platelet metabolites identified by the KTBoost model and XAI have significant potential for the early diagnosis and monitoring of sepsis and improving patient outcomes.

(This article belongs to the Special Issue Sepsis and Organ Dysfunction: New Insights into Diagnosis and Treatment)

15 pages, 2431 KiB

Open Access Article

Hybrid Explainable Artificial Intelligence Models for Targeted Metabolomics Analysis of Diabetic Retinopathy

by Fatma Hilal Yagin, Cemil Colak, Abdulmohsen Algarni, Yasin Gormez, Emek Guldogan and Luca Paolo Ardigò

Diagnostics 2024, 14(13), 1364; https://doi.org/10.3390/diagnostics14131364 - 27 Jun 2024

Cited by 6 | Viewed by 2570

Abstract

Background: Diabetic retinopathy (DR) is a prevalent microvascular complication of diabetes mellitus, and early detection is crucial for effective management. Metabolomics profiling has emerged as a promising approach for identifying potential biomarkers associated with DR progression. This study aimed to develop a hybrid explainable artificial intelligence (XAI) model for targeted metabolomics analysis of patients with DR, utilizing a focused approach to identify specific metabolites exhibiting varying concentrations among individuals without DR (NDR), those with non-proliferative DR (NPDR), and individuals with proliferative DR (PDR) who have type 2 diabetes mellitus (T2DM). Methods: A total of 317 T2DM patients, including 143 NDR, 123 NPDR, and 51 PDR cases, were included in the study. Serum samples underwent targeted metabolomics analysis using liquid chromatography and mass spectrometry. Several machine learning models, including Support Vector Machines (SVC), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), and Multilayer Perceptrons (MLP), were implemented as solo models and in a two-stage ensemble hybrid approach. The models were trained and validated using 10-fold cross-validation. SHapley Additive exPlanations (SHAP) were employed to interpret the contributions of each feature to the model predictions. Statistical analyses were conducted using the Shapiro–Wilk test for normality, the Kruskal–Wallis H test for group differences, and the Mann–Whitney U test with Bonferroni correction for post-hoc comparisons. Results: The hybrid SVC + MLP model achieved the highest performance, with an accuracy of 89.58%, a precision of 87.18%, an F1-score of 88.20%, and an F-beta score of 87.55%. 
SHAP analysis revealed that glucose, glycine, and age were consistently important features across all DR classes, while creatinine and various phosphatidylcholines exhibited higher importance in the PDR class, suggesting their potential as biomarkers for severe DR. Conclusion: The hybrid XAI models, particularly the SVC + MLP ensemble, demonstrated superior performance in predicting DR progression compared to solo models. The application of SHAP facilitates the interpretation of feature importance, providing valuable insights into the metabolic and physiological markers associated with different stages of DR. These findings highlight the potential of hybrid XAI models combined with explainable techniques for early detection, targeted interventions, and personalized treatment strategies in DR management.
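
The results above report an F1-score and a separate F-beta score without stating the value of beta or how the three DR classes were averaged. As a hedged illustration of how the two metrics relate, this minimal sketch implements the general F-beta formula for a single precision and recall pair. The function name and the example values are assumptions chosen for illustration, not figures from the article.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily, beta > 1 weights recall more
    heavily, and beta == 1 gives the usual F1-score.
    """
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta > 0")
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical single-class example in which precision exceeds recall.
p, r = 0.8, 0.6
print(f_beta(p, r, beta=0.5))
print(f_beta(p, r, beta=1.0))
print(f_beta(p, r, beta=2.0))
```

When precision is higher than recall, the score falls as beta grows, so a reported F-beta value is only interpretable once beta and the multi-class averaging scheme are known.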

(This article belongs to the Special Issue Digital Technology and Artificial Intelligence in Ophthalmology)

14 pages, 3297 KiB

Open Access Article

Combining the Strengths of the Explainable Boosting Machine and Metabolomics Approaches for Biomarker Discovery in Acute Myocardial Infarction

by Ahmet Kadir Arslan, Fatma Hilal Yagin, Abdulmohsen Algarni, Fahaid AL-Hashem and Luca Paolo Ardigò

Diagnostics 2024, 14(13), 1353; https://doi.org/10.3390/diagnostics14131353 - 26 Jun 2024

Viewed by 2395

Abstract

Acute Myocardial Infarction (AMI), a common disease that can have serious consequences, occurs when myocardial blood flow stops due to occlusion of the coronary artery. Early and accurate prediction of AMI is critical for rapid prognosis and improved patient outcomes. Metabolomics, the study of small molecules within biological systems, is an effective tool used to discover biomarkers associated with many diseases. This study intended to construct a predictive model for AMI utilizing metabolomics data and an explainable machine learning approach called Explainable Boosting Machines (EBM). The EBM model was trained on a dataset of 102 prognostic metabolites gathered from 99 individuals, including 34 healthy controls and 65 AMI patients. After comprehensive data preprocessing, 21 metabolites were selected as candidate predictors of AMI. The EBM model displayed satisfactory performance in predicting AMI across various classification performance metrics. The model’s predictions were based on the combined effects of individual metabolites and their interactions. In this context, the results of two EBM models were discussed: one using only individual metabolite features and one that also included their interaction effects. The most important predictors included creatinine, nicotinamide, and isocitrate. These metabolites are involved in different biological activities, such as energy metabolism, DNA repair, and cellular signaling. The results demonstrate the potential of the combination of metabolomics and the EBM model in constructing reliable and interpretable prediction outputs for AMI. The discussed metabolite biomarkers may assist in early diagnosis, risk assessment, and personalized treatment methods for AMI patients. This study successfully developed a pipeline incorporating extensive data preprocessing and the EBM model to identify potential metabolite biomarkers for predicting AMI.
The EBM model, with its ability to incorporate interaction terms, demonstrated satisfactory classification performance and revealed significant metabolite interactions that could be valuable in assessing AMI risk. However, the results obtained from this study should be validated in larger and well-defined samples.

(This article belongs to the Special Issue Artificial Intelligence in Cardiology Diagnosis)

28 pages, 9511 KiB

Open Access Article

Design and Evaluation of a Low-Power Wide-Area Network (LPWAN)-Based Emergency Response System for Individuals with Special Needs in Smart Buildings

by Habibullah Safi, Ali Imran Jehangiri, Zulfiqar Ahmad, Mohammed Alaa Ala’anzy, Omar Imhemed Alramli and Abdulmohsen Algarni

Sensors 2024, 24(11), 3433; https://doi.org/10.3390/s24113433 - 26 May 2024

Cited by 7 | Viewed by 2702

Abstract

The Internet of Things (IoT) is a growing network of interconnected devices used in transportation, finance, public services, healthcare, smart cities, surveillance, and agriculture. IoT devices are increasingly integrated into mobile assets like trains, cars, and airplanes. Among the IoT components, wearable sensors are expected to reach three billion by 2050, becoming more common in smart environments like buildings, campuses, and healthcare facilities. A notable IoT application is the smart campus for educational purposes. Timely notifications are essential in critical scenarios. IoT devices gather and relay important information in real time to individuals with special needs via mobile applications and connected devices, aiding health-monitoring and decision-making. Ensuring IoT connectivity with end users requires long-range communication, low power consumption, and cost-effectiveness. The LPWAN is a promising technology for meeting these needs, offering low cost, long range, and minimal power use. Despite their potential, mobile IoT and LPWANs in healthcare, especially for emergency response systems, have not received adequate research attention. Our study evaluated an LPWAN-based emergency response system for visually impaired individuals on the Hazara University campus in Mansehra, Pakistan. Experiments showed that the LPWAN technology achieved 98% reliability and is suitable for implementing emergency response systems in smart campus environments.

(This article belongs to the Special Issue LoRa Communication Technology for IoT Applications)

25 pages, 4684 KiB

Open Access Article

A Meta-Heuristic Sustainable Intelligent Internet of Things Framework for Bearing Fault Diagnosis of Electric Motor under Variable Load Conditions

by Swarnali Deb Bristi, Mehtar Jahin Tatha, Md. Firoj Ali, Uzair Aslam Bhatti, Subrata K. Sarker, Mehdi Masud, Yazeed Yasin Ghadi, Abdulmohsen Algarni and Dip K. Saha

Sustainability 2023, 15(24), 16722; https://doi.org/10.3390/su152416722 - 11 Dec 2023

Cited by 11 | Viewed by 1780

Abstract

The study introduces an Intelligent Diagnosis Framework (IDF) optimized using the Grasshopper Optimization Algorithm (GOA), an advanced swarm intelligence method, to enhance the precision of bearing defect diagnosis in electrical machinery. This area is vital for the energy sector and IoT manufacturing, but the evolving designs of electric motors add complexity to fault identification. Machine learning offers potential solutions but faces challenges due to computational intensity and the need for fine-tuning hyperparameters. The optimized framework, named GOA-IDF, is rigorously tested using experimental bearing fault data from the CWRU database, focusing on the 12,000 drive end and fan end datasets. Compared to existing machine learning algorithms, GOA-IDF shows superior diagnostic capabilities, especially in processing high-frequency data that are susceptible to noise interference. This research confirms that GOA-IDF excels in accurately categorizing faults and operates with increased computational efficiency. This advancement is a significant contribution to fault diagnosis in electrical motors. It suggests that integrating intelligent frameworks with meta-heuristic optimization techniques can greatly improve the standards of health monitoring and maintenance in the electrical machinery domain.

(This article belongs to the Special Issue Artificial Intelligence (AI) and the Internet of Things (IoT) for Sustainable Applications)

23 pages, 3004 KiB

Open Access Article

Revolutionizing Diabetes Diagnosis: Machine Learning Techniques Unleashed

by Zain Shaukat, Wisal Zafar, Waqas Ahmad, Ihtisham Ul Haq, Ghassan Husnain, Mosleh Hmoud Al-Adhaileh, Yazeed Yasin Ghadi and Abdulmohsen Algarni

Healthcare 2023, 11(21), 2864; https://doi.org/10.3390/healthcare11212864 - 31 Oct 2023

Cited by 9 | Viewed by 3454

Abstract

The intricate and multifaceted nature of diabetes disrupts the body’s crucial glucose processing mechanism, which serves as a fundamental energy source for the cells. This research aims to predict the occurrence of diabetes in individuals by harnessing the power of machine learning algorithms, utilizing the PIMA diabetes dataset. The algorithms employed in this study are Decision Tree, K-Nearest Neighbor, Random Forest, Logistic Regression, and Support Vector Machine. To execute the experiments, two software tools, namely Waikato Environment for Knowledge Analysis (WEKA) version 3.8.1 and Python version 3.10, were utilized. To evaluate the performance of the algorithms, several metrics were employed, including true positive rate, false positive rate, precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic area, and precision–recall curve area. Furthermore, various errors such as Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, and Root Relative Squared Error were examined to assess the accuracy of the models. Upon conducting the experiments, it was observed that Logistic Regression outperformed the other techniques, exhibiting the highest precision of 81 percent using Python and 80.43 percent using WEKA. These findings shed light on the efficacy of machine learning in predicting diabetes and highlight the potential of Logistic Regression as a valuable tool in this domain.
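
The four error measures listed in the abstract are closely related, and a short sketch can make the relationship concrete. The code below uses the common textbook definitions, in which the two relative errors compare the model against a baseline that always predicts the mean of the actual values. This is an assumption: the abstract does not give formulas, and a particular tool may compute its baseline differently. The function name and toy values are hypothetical.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for two equal-length numeric sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Errors of the baseline that always predicts the mean of the actual values.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    if base_abs == 0:
        raise ValueError("relative errors are undefined when all actual values are equal")
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / base_abs,
        "RRSE": math.sqrt(sq_err / base_sq),
    }


# Hypothetical 0/1 outcomes and predicted probabilities.
actual = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.7, 0.1]
print(error_metrics(actual, predicted))
```

Under these definitions, RAE and RRSE values below 1 (or below 100 percent) mean the model beats the mean-predicting baseline, which is what makes them comparable across datasets in a way that MAE and RMSE are not.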

(This article belongs to the Special Issue Data Mining and Sentiment Analysis in Healthcare)

24 pages, 9790 KiB

Open AccessReview

Ransomware Detection Using Machine Learning: A Survey

by Amjad Alraizza and Abdulmohsen Algarni

Big Data Cogn. Comput. 2023, 7(3), 143; https://doi.org/10.3390/bdcc7030143 - 16 Aug 2023

Cited by 55 | Viewed by 30595

Abstract

Ransomware attacks pose significant security threats to personal and corporate data and information. The owners of computer-based resources suffer from verification and privacy violations, monetary losses, and reputational damage due to successful ransomware assaults. As a result, it is critical to accurately and swiftly identify ransomware. Numerous methods have been proposed for identifying ransomware, each with its own advantages and disadvantages. The main objective of this research is to discuss current trends in and potential future debates on automated ransomware detection. This document includes an overview of ransomware, a timeline of assaults, and details on their background. It also provides comprehensive research on existing methods for identifying, avoiding, minimizing, and recovering from ransomware attacks. An analysis of studies between 2017 and 2022 is another advantage of this research. This provides readers with up-to-date knowledge of the most recent developments in ransomware detection and highlights advancements in methods for combating ransomware attacks. In conclusion, this research highlights unanswered concerns and potential research challenges in ransomware detection.

(This article belongs to the Special Issue Managing Cybersecurity Threats and Increasing Organizational Resilience)

14 pages, 1737 KiB

Open AccessArticle

Empowering Short Answer Grading: Integrating Transformer-Based Embeddings and BI-LSTM Network

by Wael H. Gomaa, Abdelrahman E. Nagib, Mostafa M. Saeed, Abdulmohsen Algarni and Emad Nabil

Big Data Cogn. Comput. 2023, 7(3), 122; https://doi.org/10.3390/bdcc7030122 - 21 Jun 2023

Cited by 6 | Viewed by 3976

Abstract

Automated scoring systems have been revolutionized by natural language processing, enabling the evaluation of students’ diverse answers across various academic disciplines. However, this presents a challenge as students’ responses may vary significantly in terms of length, structure, and content. To tackle this challenge, this research introduces a novel automated model for short answer grading. The proposed model uses pretrained “transformer” models, specifically T5, in conjunction with a BI-LSTM architecture which is effective in processing sequential data by considering the past and future context. This research evaluated several preprocessing techniques and different hyperparameters to identify the most efficient architecture. Experiments were conducted using a standard benchmark dataset named the North Texas Dataset. This research achieved a state-of-the-art correlation value of 92.5 percent. The proposed model’s accuracy has significant implications for education as it has the potential to save educators considerable time and effort, while providing a reliable and fair evaluation for students, ultimately leading to improved learning outcomes.

(This article belongs to the Special Issue Artificial Intelligence and Natural Language Processing)

25 pages, 885 KiB

Open AccessArticle

TF-TDA: A Novel Supervised Term Weighting Scheme for Sentiment Analysis

by Arwa Alshehri and Abdulmohsen Algarni

Electronics 2023, 12(7), 1632; https://doi.org/10.3390/electronics12071632 - 30 Mar 2023

Cited by 6 | Viewed by 3604

Abstract

In text classification tasks, such as sentiment analysis (SA), feature representation and weighting schemes play a crucial role in classification performance. Traditional term weighting schemes depend on the term frequency within the entire document collection; therefore, they are called unsupervised term weighting (UTW) schemes. One of the most popular UTW schemes is term frequency–inverse document frequency (TF-IDF); however, this is not sufficient for SA tasks. Newer weighting schemes have been developed to take advantage of the membership of documents in their categories. These are called supervised term weighting (STW) schemes; however, most of them weigh the extracted features without considering the characteristics of some noisy features and data imbalances. Therefore, in this study, a novel STW approach was proposed, known as term frequency–term discrimination ability (TF-TDA). TF-TDA mainly presents the extracted features with different degrees of discrimination by categorizing them into several groups. Subsequently, each group is weighted based on its contribution. The proposed method was examined over four SA datasets using naive Bayes (NB) and support vector machine (SVM) models. The experimental results proved the superiority of TF-TDA over two baseline term weighting approaches, with improvements ranging from 0.52% to 3.99% in the F1 score. The statistical test results verified the significant improvement obtained by TF-TDA in most cases, where the p-value ranged from 0.0000597 to 0.0455.

(This article belongs to the Special Issue Innovative Solutions for Pervasive Sentiment Analysis)

12 pages, 1614 KiB

Open AccessArticle

Investigating Students’ Pre-University Admission Requirements and Their Correlation with Academic Performance for Medical Students: An Educational Data Mining Approach

by Ayman Qahmash, Naim Ahmad and Abdulmohsen Algarni

Brain Sci. 2023, 13(3), 456; https://doi.org/10.3390/brainsci13030456 - 8 Mar 2023

Cited by 6 | Viewed by 2987

Abstract

Medical education is one of the most sought-after disciplines for its prestigious and noble status. Institutions endeavor to identify admissions criteria to register bright students who can handle the complexity of medical training and become competent clinicians. This study aims to apply statistical and educational data mining approaches to study the relationship between pre-admission criteria and student performance in medical programs at a public university in Saudi Arabia. The present study is a retrospective cohort study conducted at the College of Computer Science, King Khalid University, Abha, Kingdom of Saudi Arabia between February and November 2022. The current pre-admission criterion is the admission score taken as the weighted average of high school percentage (HSP), general aptitude test (GAT) and standard achievement admission test (SAAT), with respective weights of 0.3, 0.3 and 0.4. Regression and optimization techniques have been applied to identify weightages that better fit the data. Five classification techniques (Decision Tree, Neural Network, Random Forest, Naïve Bayes and K-Nearest Neighbors) are employed to develop models to predict student performance. The regression and optimization analyses show that optimized weights of HSP, GAT and SAAT are 0.3, 0.2 and 0.5, respectively. The results show that the performance of the models improves with admission scores based on optimized weightages. Further, the Neural Network and Naïve Bayes techniques outperform other techniques. Firstly, this study proposes to revise the weights of HSP, GAT and SAAT to 0.3, 0.2 and 0.5, respectively. Secondly, as the evaluation metrics of models remain less than 0.75, this study proposes to identify additional student features for calculating admission scores to select ideal candidates for medical programs.

(This article belongs to the Special Issue Intelligent Neural Systems for Solving Real Problems)

Search Results (16)
