Search Results (21)

Search Parameters:
Keywords = AutoML TPOT

5 pages, 314 KB  
Proceeding Paper
Bespoke Biomarker Combinations for Cancer Survival Prognosis Using Artificial Intelligence on Tumour Transcriptomics
by Ricardo Jorge Pais, Tiago Alexandre Pais and Uraquitan Lima Filho
Med. Sci. Forum 2025, 37(1), 18; https://doi.org/10.3390/msf2025037018 - 2 Sep 2025
Viewed by 155
Abstract
Accurate cancer prognosis remains a major challenge, as single gene expression biomarkers often lack clinical reliability, and most ML approaches fail even when considering large gene panels. In this study, we used a novel AutoML framework (O2Pmgen) benchmarked with a well-established framework (TPOT) on TCGA transcriptomic data for breast, lung, and renal cancers to identify small gene panels predictive of patient survival. From 58 EMT-related genes, we found models based on panels of 6–10 genes that outperformed single-marker models and ML models that considered the 58 EMT genes, with performance gains up to 21%. Further, the generated models achieved good predictive power with AUCs of 71–83%. Our results demonstrated that affordable and efficient prognostic tools using small, biologically relevant gene sets can provide better risk stratification in clinical oncology.

21 pages, 3353 KB  
Article
Automated Machine Learning-Based Significant Wave Height Prediction for Marine Operations
by Yuan Zhang, Hao Wang, Bo Wu, Jiajing Sun, Mingli Fan, Shu Dai, Hengyi Yang and Minyi Xu
J. Mar. Sci. Eng. 2025, 13(8), 1476; https://doi.org/10.3390/jmse13081476 - 31 Jul 2025
Viewed by 716
Abstract
Determining/predicting the environment dominates a variety of marine operations, such as route planning and offshore installation. Significant wave height (Hs) is a critical parameter-defining wave, a dominating marine load. Data-driven machine learning methods have been increasingly applied to Hs prediction, but challenges remain in hyperparameter tuning and spatial generalization. This study explores a novel effective approach for intelligent Hs forecasting for marine operations. Multiple automated machine learning (AutoML) frameworks, namely H2O, PyCaret, AutoGluon, and TPOT, have been systematically evaluated on buoy-based Hs prediction tasks, which reveal their advantages and limitations under various forecast horizons and data quality scenarios. The results indicate that PyCaret achieves superior accuracy in short-term forecasts, while AutoGluon demonstrates better robustness in medium-term and long-term predictions. To address the limitations of single-point prediction models, which often exhibit high dependence on localized data and limited spatial generalization, a multi-point data fusion framework incorporating Principal Component Analysis (PCA) is proposed. The framework utilizes Hs data from two stations near the California coast to predict Hs at another adjacent station. The results indicate that it is possible to realize cross-station predictions based on the data from adjacent (high relevance) stations.
(This article belongs to the Section Physical Oceanography)

24 pages, 5075 KB  
Article
Automated Machine Learning-Based Prediction of the Effects of Physicochemical Properties and External Experimental Conditions on Cadmium Adsorption by Biochar
by Shuoyang Wang, Xiangyu Song, Jicheng Duan, Shuo Li, Dangdang Gao, Jia Liu, Fanjing Meng, Wen Yang, Shixin Yu, Fangshu Wang, Jie Xu, Siyi Luo, Fangchao Zhao and Dong Chen
Water 2025, 17(15), 2266; https://doi.org/10.3390/w17152266 - 30 Jul 2025
Cited by 1 | Viewed by 833
Abstract
Biochar serves as an effective adsorbent for the heavy metal cadmium, with its performance significantly influenced by its physicochemical properties and various environmental features. Traditional machine learning models, though adept at managing complex multi-feature relationships, rely heavily on expertise in feature engineering and hyperparameter optimization. To address these issues, this study employs an automated machine learning (AutoML) approach, automating feature selection and model optimization, coupled with an intuitive online graphical user interface, enhancing accessibility and generalizability. Comparative analysis of four AutoML frameworks (TPOT, FLAML, AutoGluon, H2O AutoML) demonstrated that H2O AutoML achieved the highest prediction accuracy (R² = 0.918). Key features influencing adsorption performance were identified as initial cadmium concentration (23%), stirring rate (14.7%), and the biochar H/C ratio (9.7%). Additionally, the maximum adsorption capacity of the biochar was determined to be 105 mg/g. Optimal production conditions for biochar were determined to be a pyrolysis temperature of 570–800 °C, a residence time of ≥2 h, and a heating rate of 3–10 °C/min to achieve an H/C ratio of <0.2. An online graphical user interface was developed to facilitate user interaction with the model. This study not only provides practical guidelines for optimizing biochar but also introduces a novel approach to modeling using AutoML.

19 pages, 1425 KB  
Article
Early Detection of Autism Spectrum Disorder Through Automated Machine Learning
by Khafsa Ehsan, Kashif Sultan, Abreen Fatima, Muhammad Sheraz and Teong Chee Chuah
Diagnostics 2025, 15(15), 1859; https://doi.org/10.3390/diagnostics15151859 - 24 Jul 2025
Cited by 1 | Viewed by 2767
Abstract
Background/Objectives: Autism spectrum disorder (ASD) is a neurodevelopmental disorder distinguished by an extensive range of symptoms, including reduced social interaction, communication difficulties and tiresome behaviors. Early detection of ASD is important because it allows for timely intervention, which significantly improves developmental, behavioral, and communicative outcomes in children. However, traditional diagnostic procedures for identifying autism spectrum disorder (ASD) typically involve lengthy clinical examinations, which can be both time-consuming and costly. This research proposes leveraging automated machine learning (AUTOML) to streamline the diagnostic process and enhance its accuracy. Methods: In this study, by collecting data from various rehabilitation centers across Pakistan, we applied a specific AUTOML tool known as Tree-based Pipeline Optimization Tool (TPOT) for ASD detection. Notably, this study marks one of the initial explorations into utilizing AUTOML for ASD detection. The experimentations indicate that the TPOT provided the best pipeline for the dataset, which was verified using a manual machine learning method. Results: The study contributes to the field of ASD diagnosis by using AUTOML to determine the likelihood of ASD in children at prompt stages of evolution. The study also provides an evaluation of precision, recall, and F1-score metrics to confirm the correctness of the diagnosis. The proposed TPOT-based AUTOML framework attained an overall accuracy of 78%, with a precision of 83%, a recall of 90%, and an F1-score of 86% for the autistic class. Conclusions: In summary, this research offers an encouraging approach to improve the detection of autism spectrum disorders (ASD) in children, which could lead to better results for affected individuals and their families.
(This article belongs to the Special Issue Artificial Intelligence in Biomedical Diagnostics and Analysis 2024)

25 pages, 7504 KB  
Article
Explainable Artificial Intelligence (XAI) for Flood Susceptibility Assessment in Seoul: Leveraging Evolutionary and Bayesian AutoML Optimization
by Kounghoon Nam, Youngkyu Lee, Sungsu Lee, Sungyoon Kim and Shuai Zhang
Remote Sens. 2025, 17(13), 2244; https://doi.org/10.3390/rs17132244 - 30 Jun 2025
Viewed by 1190
Abstract
This study aims to enhance the accuracy and interpretability of flood susceptibility mapping (FSM) in Seoul, South Korea, by integrating automated machine learning (AutoML) with explainable artificial intelligence (XAI) techniques. Ten topographic and environmental conditioning factors were selected as model inputs. We first employed the Tree-based Pipeline Optimization Tool (TPOT), an evolutionary AutoML algorithm, to construct baseline ensemble models using Gradient Boosting (GB), Random Forest (RF), and XGBoost (XGB). These models were further fine-tuned using Bayesian optimization via Optuna. To interpret the model outcomes, SHAP (SHapley Additive exPlanations) was applied to analyze both the global and local contributions of each factor. The SHAP analysis revealed that lower elevation, slope, and stream distance, as well as higher stream density and built-up areas, were the most influential factors contributing to flood susceptibility. Moreover, interactions between these factors, such as built-up areas located on gentle slopes near streams, further intensified flood risk. The susceptibility maps were reclassified into five categories (very low to very high), and the GB model identified that approximately 15.047% of the study area falls under very-high-flood-risk zones. Among the models, the GB classifier achieved the highest performance, followed by XGB and RF. The proposed framework, which integrates TPOT, Optuna, and SHAP within an XAI pipeline, not only improves predictive capability but also offers transparent insights into feature behavior and model logic. These findings support more robust and interpretable flood risk assessments for effective disaster management in urban areas.
(This article belongs to the Special Issue Artificial Intelligence for Natural Hazards (AI4NH))

19 pages, 4395 KB  
Article
Web-Based Baseflow Estimation in SWAT Considering Spatiotemporal Recession Characteristics Using Machine Learning
by Jimin Lee, Jeongho Han, Bernard Engel and Kyoung Jae Lim
Environments 2025, 12(3), 94; https://doi.org/10.3390/environments12030094 - 17 Mar 2025
Cited by 1 | Viewed by 1145
Abstract
The increasing frequency and severity of hydrological extremes due to climate change necessitate accurate baseflow estimation and effective watershed management for sustainable water resource use. The Soil and Water Assessment Tool (SWAT) is widely utilized for hydrological modeling but shows limitations in baseflow simulation due to its uniform application of the alpha factor across Hydrologic Response Units (HRUs), neglecting spatial and temporal variability. To address these challenges, this study integrated SWAT with the Tree-Based Pipeline Optimization Tool (TPOT), an automated machine learning (AutoML) framework, to predict HRU-specific alpha factors. Furthermore, a user-friendly web-based program was developed to improve the accessibility and practical application of these optimized alpha factors, supporting more accurate baseflow predictions, even in ungauged watersheds. The proposed HRU-specific alpha factor approach in the study area significantly enhanced the recession and baseflow predictions compared to the traditional uniform alpha factor method. This improvement was supported by key performance metrics, including the Nash–Sutcliffe Efficiency (NSE), the coefficient of determination (R²), the percent bias (PBIAS), and the mean absolute percentage error (MAPE). This integrated framework effectively improves the accuracy and practicality of hydrological modeling, offering scalable and innovative solutions for sustainable watershed management in the face of increasing water stress.
(This article belongs to the Special Issue Hydrological Modeling and Sustainable Water Resources Management)

22 pages, 11145 KB  
Article
Regional Soil Moisture Estimation Leveraging Multi-Source Data Fusion and Automated Machine Learning
by Shenglin Li, Pengyuan Zhu, Ni Song, Caixia Li and Jinglei Wang
Remote Sens. 2025, 17(5), 837; https://doi.org/10.3390/rs17050837 - 27 Feb 2025
Cited by 6 | Viewed by 2282
Abstract
Soil moisture (SM) monitoring in farmland at a regional scale is crucial for precision irrigation management and ensuring food security. However, existing methods for SM estimation encounter significant challenges related to accuracy, generalizability, and automation. This study proposes an integrated data fusion method to systematically assess the potential of three automated machine learning (AutoML) frameworks—tree-based pipeline optimization tool (TPOT), AutoGluon, and H2O AutoML—in retrieving SM. To evaluate the impact of input variables on estimation accuracy, six input scenarios were designed: multispectral data (MS), thermal infrared data (TIR), MS combined with TIR, MS with auxiliary data, TIR with auxiliary data, and a comprehensive combination of MS, TIR, and auxiliary data. The research was conducted in a winter wheat cultivation area within the People’s Victory Canal Irrigation Area, focusing on the 0–40 cm soil layer. The results revealed that the scenario incorporating all data types (MS + TIR + auxiliary) achieved the highest retrieval accuracy. Under this scenario, all three AutoML frameworks demonstrated optimal performance. AutoGluon demonstrated superior performance in most scenarios, particularly excelling in the MS + TIR + auxiliary data scenario. It achieved the highest retrieval accuracy with a Pearson correlation coefficient (R) value of 0.822, root mean square error (RMSE) of 0.038 cm³/cm³, and relative root mean square error (RRMSE) of 16.46%. This study underscores the critical role of input data types and fusion strategies in enhancing SM estimation accuracy and highlights the significant advantages of AutoML frameworks for regional-scale SM retrieval. The findings offer a robust technical foundation and theoretical guidance for advancing precision irrigation management and efficient SM monitoring.

28 pages, 4440 KB  
Article
Simplatab: An Automated Machine Learning Framework for Radiomics-Based Bi-Parametric MRI Detection of Clinically Significant Prostate Cancer
by Dimitrios I. Zaridis, Vasileios C. Pezoulas, Eugenia Mylona, Charalampos N. Kalantzopoulos, Nikolaos S. Tachos, Nikos Tsiknakis, George K. Matsopoulos, Daniele Regge, Nikolaos Papanikolaou, Manolis Tsiknakis, Kostas Marias and Dimitrios I. Fotiadis
Bioengineering 2025, 12(3), 242; https://doi.org/10.3390/bioengineering12030242 - 26 Feb 2025
Cited by 3 | Viewed by 1990
Abstract
Background: Prostate cancer (PCa) diagnosis using MRI is often challenged by lesion variability. Methods: This study introduces Simplatab, an open-source automated machine learning (AutoML) framework designed for, but not limited to, automating the entire machine learning pipeline to facilitate the detection of clinically significant prostate cancer (csPCa) using radiomics features. Unlike existing AutoML tools such as Auto-WEKA, Auto-Sklearn, ML-Plan, ATM, Google AutoML, and TPOT, Simplatab offers a comprehensive, user-friendly framework that integrates data bias detection, feature selection, model training with hyperparameter optimization, explainable AI (XAI) analysis, and post-training model vulnerabilities detection. Simplatab requires no coding expertise, provides detailed performance reports, and includes robust data bias detection, making it particularly suitable for clinical applications. Results: Evaluated on a large pan-European cohort of 4816 patients from 12 clinical centers, Simplatab supports multiple machine learning algorithms. The most notable features that differentiate Simplatab include ease of use, a user interface accessible to those with no coding experience, comprehensive reporting, XAI integration, and thorough bias assessment, all provided in a human-understandable format. Conclusions: Our findings indicate that Simplatab can significantly enhance the usability, accountability, and explainability of machine learning in clinical settings, thereby increasing trust and accessibility for AI non-experts.

21 pages, 7635 KB  
Article
Developing an Hourly Water Level Prediction Model for Small- and Medium-Sized Agricultural Reservoirs Using AutoML: Case Study of Baekhak Reservoir, South Korea
by Jeongho Han and Joo Hyun Bae
Agriculture 2025, 15(1), 71; https://doi.org/10.3390/agriculture15010071 - 30 Dec 2024
Cited by 3 | Viewed by 1642
Abstract
This study focuses on developing an hourly water level prediction model for small- and medium-sized agricultural reservoirs using the Tree-based Pipeline Optimization Tool (TPOT), an automated machine learning (AutoML) technique. The study area is the Baekhak Reservoir in South Korea, and various precipitation-related and reservoir water storage data were collected. Using these collected data, we compared widely used individual machine learning and deep learning models with the pipeline models generated by TPOT. The comparison showed that pipeline models, which included various preprocessing and ensemble techniques, exhibited higher predictive accuracy than individual machine learning and even deep learning models. The optimal pipeline model was evaluated for its performance in predicting water levels during an extreme rainfall event, demonstrating its effectiveness for hourly water level prediction. However, issues such as the overprediction of peak water levels and delays in predicting sudden water level changes were observed, likely due to inaccuracies in the ultra-short-term forecast precipitation data and the lack of information on reservoir operations (e.g., gate openings and drainage plans for agriculture). This study highlights the potential of AutoML techniques for use in hydrological modeling, and demonstrates their contribution to more efficient water management and flood prevention strategies in agricultural reservoirs.

14 pages, 8478 KB  
Article
Estimating Rainfall Erosivity in North Korea Using Automated Machine Learning: Insights into Regional Soil Erosion Risks
by Jeongho Han and Seoro Lee
Land 2024, 13(12), 2038; https://doi.org/10.3390/land13122038 - 28 Nov 2024
Cited by 1 | Viewed by 1141
Abstract
Soil erosion due to rainfall is a critical environmental issue in North Korea, exacerbated by deforestation and climate change. This study aims to estimate rainfall erosivity (RE) in North Korea using automated machine learning (AutoML), with a particular focus on regional soil erosion risks. North Korean data were sourced from the European Centre for Medium-Range Weather Forecasts (ECMWF) ReAnalysis 5 dataset, while South Korean data were obtained from the Korea Meteorological Administration. Data from 50 stations in South Korea (2013–2019) and 27 stations in North Korea (1980–2020) were used. The GradientBoostingRegressor (GBR) model, optimized using the Tree-based Pipeline Optimization Tool (TPOT), was trained on South Korean data. The model’s performance was evaluated using metrics such as the root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²), achieving high predictive accuracy across eight stations in South Korea. Using the optimized model, RE in North Korea was estimated, and the spatial distribution of RE was analyzed using the Kriging interpolation. Results reveal significant regional variability, with the southern and western areas displaying the highest erosivity. These findings provide valuable insights into soil erosion management and the development of sustainable agricultural and environmental strategies in North Korea.
(This article belongs to the Section Land, Soil and Water)

17 pages, 4057 KB  
Article
A Comparative Analysis of Automated Machine Learning Tools: A Use Case for Autism Spectrum Disorder Detection
by Rana Tuqeer Abbas, Kashif Sultan, Muhammad Sheraz and Teong Chee Chuah
Information 2024, 15(10), 625; https://doi.org/10.3390/info15100625 - 11 Oct 2024
Cited by 3 | Viewed by 2078
Abstract
Automated Machine Learning (AutoML) enhances productivity and efficiency by automating the entire process of machine learning model development, from data preprocessing to model deployment. These tools are accessible to users with varying levels of expertise and enable efficient, scalable, and accurate classification across different applications. This paper evaluates two popular AutoML tools, the Tree-Based Pipeline Optimization Tool (TPOT) version 0.10.2 and Konstanz Information Miner (KNIME) version 5.2.5, comparing their performance in a classification task. Specifically, this work analyzes autism spectrum disorder (ASD) detection in toddlers as a use case. The dataset for ASD detection was collected from various rehabilitation centers in Pakistan. TPOT and KNIME were applied to the ASD dataset, with TPOT achieving an accuracy of 85.23% and KNIME achieving 83.89%. Evaluation metrics such as precision, recall, and F1-score validated the reliability of the models. After selecting the best models with optimal accuracy, the most important features for ASD detection were identified using these AutoML tools. The tools optimized the feature selection process and significantly reduced diagnosis time. This study demonstrates the potential of AutoML tools and feature selection techniques to improve early ASD detection and outcomes for affected children and their families.
(This article belongs to the Special Issue Real-World Applications of Machine Learning Techniques)

31 pages, 1004 KB  
Article
Daily Streamflow Forecasting Using AutoML and Remote-Sensing-Estimated Rainfall Datasets in the Amazon Biomes
by Matteo Bodini
Signals 2024, 5(4), 659-689; https://doi.org/10.3390/signals5040037 - 10 Oct 2024
Cited by 2 | Viewed by 3017
Abstract
Reliable streamflow forecasting is crucial for several tasks related to water-resource management, including planning reservoir operations, power generation via Hydroelectric Power Plants (HPPs), and flood mitigation, thus resulting in relevant social implications. The present study is focused on the application of Automated Machine-Learning (AutoML) models to forecast daily streamflow in the area of the upper Teles Pires River basin, located in the region of the Amazon biomes. The latter area is characterized by extensive water-resource utilization, mostly for power generation through HPPs, and it has a limited hydrological data-monitoring network. Five different AutoML models were employed to forecast the streamflow daily, i.e., auto-sklearn, Tree-based Pipeline Optimization Tool (TPOT), H2O AutoML, AutoKeras, and MLBox. The AutoML input features were set as the time-lagged streamflow and average rainfall data sourced from four rain gauge stations and one streamflow gauge station. To overcome the lack of training data, in addition to the previous features, products estimated via remote sensing were leveraged as training data, including PERSIANN, PERSIANN-CCS, PERSIANN-CDR, and PDIR-Now. The selected AutoML models proved their effectiveness in forecasting the streamflow in the considered basin. In particular, the reliability of streamflow predictions was high both in the case when training data came from rain and streamflow gauge stations and when training data were collected by the four previously mentioned estimated remote-sensing products. Moreover, the selected AutoML models showed promising results in forecasting the streamflow up to a three-day horizon, relying on the two available kinds of input features. As a final result, the present research underscores the potential of employing AutoML models for reliable streamflow forecasting, which can significantly advance water-resource planning and management within the studied geographical area.
(This article belongs to the Special Issue Rainfall Estimation Using Signals)

16 pages, 1777 KB  
Article
Metabolomics Biomarker Discovery to Optimize Hepatocellular Carcinoma Diagnosis: Methodology Integrating AutoML and Explainable Artificial Intelligence
by Fatma Hilal Yagin, Radwa El Shawi, Abdulmohsen Algarni, Cemil Colak, Fahaid Al-Hashem and Luca Paolo Ardigò
Diagnostics 2024, 14(18), 2049; https://doi.org/10.3390/diagnostics14182049 - 15 Sep 2024
Cited by 9 | Viewed by 2383
Abstract
Background: This study aims to assess the efficacy of combining automated machine learning (AutoML) and explainable artificial intelligence (XAI) in identifying metabolomic biomarkers that can differentiate between hepatocellular carcinoma (HCC) and liver cirrhosis in patients with hepatitis C virus (HCV) infection. Methods: We investigated publicly accessible data encompassing HCC patients and cirrhotic controls. The TPOT tool, which is an AutoML tool, was used to optimize the preparation of features and data, as well as to select the most suitable machine learning model. The TreeSHAP approach, which is a type of XAI, was used to interpret the model by assessing each metabolite’s individual contribution to the categorization process. Results: TPOT had superior performance in distinguishing between HCC and cirrhosis compared to other AutoML approaches AutoSKlearn and H2O AutoML, in addition to traditional machine learning models such as random forest, support vector machine, and k-nearest neighbor. The TPOT technique attained an AUC value of 0.81, showcasing superior accuracy, sensitivity, and specificity in comparison to the other models. Key metabolites, including L-valine, glycine, and DL-isoleucine, were identified as essential by TPOT and subsequently verified by TreeSHAP analysis. TreeSHAP provided a comprehensive explanation of the contribution of these metabolites to the model’s predictions, thereby increasing the interpretability and dependability of the results. This thorough assessment highlights the strength and reliability of the AutoML framework in the development of clinical biomarkers. Conclusions: This study shows that AutoML and XAI can be used together to create metabolomic biomarkers that are specific to HCC. The exceptional performance of TPOT in comparison to traditional models highlights its capacity to identify biomarkers. Furthermore, TreeSHAP boosted model transparency by highlighting the relevance of certain metabolites. This comprehensive method has the potential to enhance the identification of biomarkers and generate precise, easily understandable, AI-driven solutions for diagnosing HCC.

18 pages, 1554 KB  
Article
Towards Cleaner Ports: Predictive Modeling of Sulfur Dioxide Shipping Emissions in Maritime Facilities Using Machine Learning
by Carlos D. Paternina-Arboleda, Dayana Agudelo-Castañeda, Stefan Voß and Shubhendu Das
Sustainability 2023, 15(16), 12171; https://doi.org/10.3390/su151612171 - 9 Aug 2023
Cited by 14 | Viewed by 2900
Abstract
Maritime ports play a pivotal role in fostering the growth of domestic and international trade and economies. As ports continue to expand in size and capacity, the impact of their operations on air quality and climate change becomes increasingly significant. While nearby regions may experience economic benefits, there are significant concerns regarding the emission of atmospheric pollutants, which have adverse effects on both human health and climate change. Predictive modeling of port emissions can serve as a valuable tool in identifying areas of concern, evaluating the effectiveness of emission reduction strategies, and promoting sustainable development within ports. The primary objective of this research is to utilize machine learning frameworks to estimate the emissions of SO2 from ships during various port activities, including hoteling, maneuvering, and cruising. By employing these models, we aim to gain insights into the emission patterns and explore strategies to mitigate their impact. Through our analysis, we have identified the most effective models for estimating SO2 emissions. The AutoML TPOT framework emerges as the top-performing model, followed by Non-Linear Regression with interaction effects. On the other hand, Linear Regression exhibited the lowest performance among the models evaluated. By employing these advanced machine learning techniques, we aim to contribute to the body of knowledge surrounding port emissions and foster sustainable practices within the maritime industry.
(This article belongs to the Special Issue Sustainability in Logistics and Supply Chain Management)

21 pages, 3234 KB  
Article
Automated Classification of Atherosclerotic Radiomics Features in Coronary Computed Tomography Angiography (CCTA)
by Mardhiyati Mohd Yunus, Ahmad Khairuddin Mohamed Yusof, Muhd Zaidi Ab Rahman, Xue Jing Koh, Akmal Sabarudin, Puteri N. E. Nohuddin, Kwan Hoong Ng, Mohd Mustafa Awang Kechik and Muhammad Khalis Abdul Karim
Diagnostics 2022, 12(7), 1660; https://doi.org/10.3390/diagnostics12071660 - 8 Jul 2022
Cited by 10 | Viewed by 4109
Abstract
Radiomics is the process of extracting useful quantitative features from high-dimensional data that allows for automated disease classification, including atherosclerotic disease. Hence, this study aimed to quantify and extract the radiomic features from Coronary Computed Tomography Angiography (CCTA) images and to evaluate the performance of an automated machine learning (AutoML) model in classifying the atherosclerotic plaques. In total, 202 patients who underwent CCTA examination at Institut Jantung Negara (IJN) between September 2020 and May 2021 were selected as they met the inclusion criteria. Three primary coronary arteries were segmented on axial sectional images, yielding a total of 606 volumes of interest (VOIs). Subsequently, the first-order, second-order, and shape-based radiomic features were extracted for each VOI. Model 1, Model 2, Model 3, and Model 4 were constructed using the AutoML-based Tree-based Pipeline Optimization Tool (TPOT). The heatmap confusion matrix, recall (sensitivity), precision (PPV), F1 score, accuracy, receiver operating characteristic (ROC), and area under the curve (AUC) were analysed. Notably, Model 1 with the first-order features showed superior performance in classifying the normal coronary arteries (F1 score: 0.88; Inverse F1 score: 0.94), as well as in classifying the calcified (F1 score: 0.78; Inverse F1 score: 0.91) and mixed plaques (F1 score: 0.76; Inverse F1 score: 0.86). Moreover, Model 2, consisting of second-order features, proved useful specifically in classifying the non-calcified plaques (F1 score: 0.63; Inverse F1 score: 0.92), which are key to predicting cardiac events. Nevertheless, Model 3, comprising the shape-based features, did not contribute to the classification of atherosclerotic plaques. Overall, TPOT showed promising capabilities in terms of finding the best pipeline and tailoring the model using CCTA-based radiomic datasets.
(This article belongs to the Section Medical Imaging and Theranostics)