MDPI - Publisher of Open Access Journals

6 pages, 380 KB

Open Access Proceeding Paper

Bridging the Data Gap in ML-Based NIDS: An Automated Honeynet Platform for Generating Real-World Malware Traffic Datasets

by Gabriel Ulloa Cano, Gabriel Sánchez Pérez, José Portillo-Portillo, Linda Karina Toscano Medina, Aldo Hernández Suárez, Jesús Olivares Mercado, Héctor Manuel Pérez Meana, Luis Javier García Villalba and Pablo Velarde Alvarado

Eng. Proc. 2026, 123(1), 36; https://doi.org/10.3390/engproc2026123036 - 13 Feb 2026

Viewed by 227

Abstract

The effectiveness of Machine Learning (ML)-based Network Intrusion Detection Systems (NIDS) is critically hampered by the scarcity of realistic and up-to-date malware traffic datasets. To address this gap, we present an automated platform for generating real-world malware traffic datasets. Our solution leverages a production-environment honeynet (T-Pot), deployed within a university network and segmented via a secure WireGuard VPN, to capture live attacks using high-interaction honeypots (Dionaea, Cowrie, ADBhoney). A fully automated pipeline handles traffic capture, transfer, filtering based on honeypot logs, and malware analysis (VirusTotal, VxAPI). The output is the IPN-UAN-23 dataset, a curated, labeled corpus of malicious network traffic. This platform functions as a vital automated security tool, providing the continuous stream of actionable intelligence required to develop and refine robust ML-based NIDS within a DevSecOps lifecycle.
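
The "filtering based on honeypot logs" step can be pictured with a minimal sketch. The record schemas (`Flow`, `HoneypotEvent`), the function name, and the five-minute matching window are all hypothetical and are not taken from the paper; the sketch only illustrates the general idea of keeping captured flows that a honeypot log corroborates.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Flow:
    """One captured network flow (hypothetical, simplified schema)."""
    src_ip: str
    dst_port: int
    start: datetime


@dataclass(frozen=True)
class HoneypotEvent:
    """One honeypot log entry (hypothetical, simplified schema)."""
    honeypot: str  # e.g. "cowrie", "dionaea", "adbhoney"
    src_ip: str
    timestamp: datetime


def filter_malicious_flows(flows, events, window=timedelta(minutes=5)):
    """Keep flows whose source IP appears in a honeypot event within `window`.

    Returns (flow, honeypot_name) pairs so each kept flow carries a label.
    The matching rule is an assumption made for illustration only.
    """
    events_by_ip = {}
    for ev in events:
        events_by_ip.setdefault(ev.src_ip, []).append(ev)

    labeled = []
    for flow in flows:
        for ev in events_by_ip.get(flow.src_ip, []):
            if abs(flow.start - ev.timestamp) <= window:
                labeled.append((flow, ev.honeypot))
                break  # one matching event is enough to label the flow
    return labeled
```

Indexing events by source IP keeps the matching step close to linear in the number of flows, which matters when the capture files are large.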

(This article belongs to the Proceedings of First Summer School on Artificial Intelligence in Cybersecurity)

48 pages, 5217 KB

Open Access Article

AutoML-Based Prediction of Unconfined Compressive Strength of Stabilized Soils: A Multi-Dataset Evaluation on Worldwide Experimental Data

by Romulo Murucci Oliveira, Deivid Campos, Katia Vanessa Bicalho, Bruno da S. Macêdo, Matteo Bodini, Camila Martins Saporetti and Leonardo Goliatt

Forecasting 2025, 7(4), 80; https://doi.org/10.3390/forecast7040080 - 18 Dec 2025

Cited by 1 | Viewed by 1028

Abstract

Unconfined Compressive Strength (UCS) of stabilized soils is commonly used for evaluating the effectiveness of soil improvement techniques. Achieving target UCS values through conventional trial-and-error approaches requires extensive laboratory experiments, which are time-consuming and resource-intensive. Automated Machine Learning (AutoML) frameworks offer a promising alternative by enabling automated, reproducible, and accessible predictive modeling of UCS values from more readily obtainable index and physical soil and stabilizer properties, reducing the reliance on experimental testing and empirical relationships, and allowing systematic exploration of multiple models and configurations. This study evaluates the predictive performance of five state-of-the-art AutoML frameworks (i.e., AutoGluon, AutoKeras, FLAML, H2O, and TPOT) using analyses of results from 10 experimental datasets comprising 2083 samples from laboratory experiments spanning diverse soil types, stabilizers, and experimental conditions across many countries worldwide. Comparative analyses revealed that FLAML achieved the highest overall performance (average PI score of 0.7848), whereas AutoKeras exhibited lower accuracy on complex datasets; AutoGluon, H2O, and TPOT also demonstrated strong predictive capabilities, with performance varying with dataset characteristics. Despite the promising potential of AutoML, prior research has shown that fully automated frameworks have limited applicability to UCS prediction, highlighting a gap in end-to-end pipeline automation. The findings provide practical guidance for selecting AutoML tools based on dataset characteristics and research objectives, and suggest avenues for future studies, including expanding the range of AutoML frameworks and integrating interpretability techniques, such as feature importance analysis, to deepen understanding of soil–stabilizer interactions.
Overall, the results indicate that AutoML frameworks can effectively accelerate UCS prediction, reduce laboratory workload, and support data-driven decision-making in geotechnical engineering.

32 pages, 28258 KB

Open Access Article

Machine Learning-Based Classification of ICU-Acquired Neuromuscular Weakness: A Comparative Study in Survivors of Critical Illness

by David Estévez-Freire, Ivan Cangas, Andrés Tirado-Espín, Johanna Pozo-Neira, Fernando Villalba-Meneses, Diego Almeida-Galárraga and Omar Alvarado-Cando

Life 2025, 15(12), 1802; https://doi.org/10.3390/life15121802 - 25 Nov 2025

Viewed by 926

Abstract

Classifying the severity of intensive-care-unit-acquired muscle atrophy (ICU-AW) is essential for early prognosis and individualized neurorehabilitation, improving functional outcomes in survivors of critical illness. This study evaluated and compared advanced machine learning (ML) algorithms for classifying neuromuscular atrophy in neurocritical patients. Clinical, biochemical, anthropometric, and morphometric data from 198 neuro-ICU patients were retrospectively analyzed. Six supervised ML models were trained using stratified cross-validation, synthetic oversampling, and hyperparameter optimization: Support Vector Machine (SVM), Multilayer Perceptron (MLP), Extreme Gradient Boosting (XGBoost), TPOT AutoML, AdaBoost, and Multinomial Logistic Regression. Among these models, SVM achieved the best performance (accuracy = 93%, ROC-AUC = 0.95), followed by MLP (accuracy = 82.8%, ROC-AUC = 0.93) and XGBoost (accuracy = 80%, ROC-AUC = 0.94). Stability analyses across random seeds confirmed the robustness of SVM and TPOT, which had the highest median AUPRC (>0.90). Explainable AI methods (LIME and SHAP) identified BMI, serum albumin, and body surface area as the most influential variables, showing physiologically consistent patterns associated with the classification of muscle loss.

(This article belongs to the Section Biochemistry, Biophysics and Computational Biology)

34 pages, 7119 KB

Open Access Article

A Deployment-Aware Framework for Carbon- and Water-Efficient LLM Serving

by Julian Hoxha, Marsela Thanasi-Boçe and Tarek Khalifa

Sustainability 2025, 17(23), 10473; https://doi.org/10.3390/su172310473 - 22 Nov 2025

Viewed by 1173

Abstract

Inference now dominates the lifecycle footprint of large language models, yet published estimates often use inconsistent boundaries and optimize carbon while ignoring water. We present a provider-agnostic framework that unifies scope-transparent measurement with time-resolved, SLO-aware orchestration and jointly optimizes carbon and consumptive water. Measurement reports daily medians at a comprehensive serving boundary that includes accelerators, host CPU/DRAM, provisioned idle, and PUE uplift, and provides accelerator-only whiskers for reconciliation. Optimization uses a mixed-integer linear program solved over five-minute windows; it selects region, batch size, and phase-aware hardware for prefill and decode while enforcing p95 TTFT (time to first token) and TPOT (time per output token) as well as capacity constraints. Applied to four representative models, a single SLO-aware policy reduces comprehensive-boundary medians by 57 to 59 percent for energy, 59 to 60 percent for water, and 78 to 80 percent for location-based CO₂, with SLOs met in every window. For a day with 500 million queries on GPT-4o, totals fall from 0.344 to 0.145 GWh, 1.196 to 0.490 ML, and 121 to 25 t CO₂ (location-based). The framework offers a deployable template for carbon- and water-aware LLM serving with auditable and scope-transparent reporting.
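
The daily totals quoted for the 500-million-query day can be turned into percentage reductions and per-query figures with simple arithmetic. The sketch below uses only the numbers stated in the abstract; reading "ML" as megaliters (1 ML = 10^6 L) is an assumption, and the variable names are illustrative.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


QUERIES_PER_DAY = 500_000_000

# (before, after) daily totals as quoted in the abstract.
totals = {
    "energy_GWh": (0.344, 0.145),
    "water_ML": (1.196, 0.490),  # assumed to mean megaliters
    "co2_t": (121.0, 25.0),
}

reductions = {name: percent_reduction(b, a) for name, (b, a) in totals.items()}

# Per-query energy in watt-hours (1 GWh = 1e9 Wh).
energy_wh_before = totals["energy_GWh"][0] * 1e9 / QUERIES_PER_DAY
energy_wh_after = totals["energy_GWh"][1] * 1e9 / QUERIES_PER_DAY

# Per-query water in milliliters (1 megaliter = 1e9 mL), under the assumption above.
water_ml_before = totals["water_ML"][0] * 1e9 / QUERIES_PER_DAY
water_ml_after = totals["water_ML"][1] * 1e9 / QUERIES_PER_DAY

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% lower")
print(f"energy per query: {energy_wh_before:.3f} Wh before, {energy_wh_after:.3f} Wh after")
print(f"water per query: {water_ml_before:.2f} mL before, {water_ml_after:.2f} mL after")
```

The computed reductions (about 57.8% for energy, 59.0% for water, and 79.3% for CO₂) fall inside the ranges the abstract reports for the median results.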

19 pages, 981 KB

Open Access Review

Molecular Self-Reassembled Regenerated Fibres and Their Significance in Tissue Engineering Bio-Composites

by Kristiyan Stiliyanov-Atanasov and Probal Basu

Fibers 2025, 13(11), 149; https://doi.org/10.3390/fib13110149 - 4 Nov 2025

Viewed by 970

Abstract

Due to their interesting physicochemical and bioactive properties, regenerated fibres (including cellulose and collagen regenerated fibres) have been considered attractive biomaterials for biomedical applications. These regenerated fibres have an altered molecular arrangement compared to the native fibres and exhibit unique properties. Despite their distinctive structural characteristics, little research has explored their potential for the development of tissue-engineering bio-composites. This work focuses on exploring the promise of cellulose and collagen-based regenerated fibres in tissue-regeneration bio-composite development. Initially, the work investigates the similarities and dissimilarities between the collagen and cellulose structures, which are linked to their specific properties, such as crystallinity, chemical characteristics, and mechanical properties. It then delves deeper into their molecular structural reassembly and various aspects of the already reported bio-composites developed using them. Finally, their promise in the development of tissue-engineering bio-composites is explored through a comparative analysis of their advantages and challenges. It was found that efficient biodegradability is one of the key advantages of regenerated fibres, whereas difficulty in processing presents a significant disadvantage. Despite this drawback, regenerated fibres can incorporate enhanced and desired properties into the bio-composite matrix, which could lead to tissue-specific bio-regenerative applications.

15 pages, 2365 KB

Open Access Article

Leveraging Explainable Automated Machine Learning (AutoML) and Metabolomics for Robust Diagnosis and Pathophysiological Insights in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS)

by Fatma Hilal Yagin, Cemil Colak, Fahaid Al-Hashem, Sarah A. Alzakari, Amel Ali Alhussan and Mohammadreza Aghaei

Diagnostics 2025, 15(21), 2755; https://doi.org/10.3390/diagnostics15212755 - 30 Oct 2025

Cited by 1 | Viewed by 1205

Abstract

Background/Objectives: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a debilitating complex disease with an elusive etiology, lacking objective diagnostic biomarkers. This study leverages advanced Automated Machine Learning (AutoML) to analyze plasma metabolomic and lipidomic profiles to detect ME/CFS. Methods: We utilized a publicly available dataset comprising 888 metabolic features from 106 ME/CFS patients and 91 matched controls. Three AutoML frameworks (TPOT, Auto-Sklearn, and H2O AutoML) were benchmarked under identical time constraints. Univariate ROC and PLS-DA analyses with cross-validation, permutation testing, and VIP-based feature selection were applied to standardized, log-transformed omics data to identify significant discriminatory metabolites/lipids and assess their intercorrelations. Results: TPOT significantly outperformed its counterparts, achieving an area under the curve (AUC) of 92.1%, accuracy of 87.3%, sensitivity of 85.8%, and specificity of 89.0%. The PLS-DA model revealed a moderate but statistically significant discrimination between ME/CFS and controls. Explainable artificial intelligence (XAI) via SHAP analysis of the optimal TPOT model identified key metabolites implicating dysregulated pathways in mitochondrial energy metabolism (succinic acid, pyruvic acid, leucine), chronic inflammation (prostaglandin D₂, 11,12-EET), gut–brain axis communication (glycocholic acid), and cell membrane integrity (pc(35:2)a). Conclusions: Our results demonstrate that TPOT-derived models not only provide a highly accurate and robust diagnostic tool but also yield biologically interpretable insights into the pathophysiology of ME/CFS, highlighting their potential for clinical decision support and elucidating novel therapeutic targets.
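
The reported accuracy can be cross-checked against the reported sensitivity, specificity, and class sizes, because accuracy is the class-size-weighted average of the two rates. The sketch below assumes the rates apply to the full cohort of 106 patients and 91 controls; if the paper reports cross-validated averages, the identity holds only approximately. The function name is illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Accuracy implied by sensitivity/specificity and the class sizes.

    accuracy = (TP + TN) / N, with TP = sensitivity * n_positive
    and TN = specificity * n_negative.
    """
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)


# Values quoted in the abstract: 106 ME/CFS patients, 91 matched controls.
implied_accuracy = accuracy_from_rates(0.858, 0.890, 106, 91)
print(f"implied accuracy: {100 * implied_accuracy:.1f}%")
```

Under that assumption the implied accuracy rounds to 87.3%, which agrees with the figure stated in the abstract.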

(This article belongs to the Special Issue The Future of Diagnostics: Exploring the Role of Artificial Intelligence in Medicine)

13 pages, 1587 KB

Open Access Article

Glioma Grading by Integrating Radiomic Features from Peritumoral Edema in Fused MRI Images and Automated Machine Learning

by Amir Khorasani

J. Imaging 2025, 11(10), 336; https://doi.org/10.3390/jimaging11100336 - 27 Sep 2025

Cited by 1 | Viewed by 1207

Abstract

We aimed to investigate the utility of peritumoral edema-derived radiomic features from magnetic resonance imaging (MRI) image weights and fused MRI sequences for enhancing the performance of machine learning-based glioma grading. The present study utilized the Multimodal Brain Tumor Segmentation Challenge 2023 (BraTS 2023) dataset. Laplacian Re-decomposition (LRD) was employed to fuse multimodal MRI sequences. The fused image quality was evaluated using the entropy, standard deviation (STD), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) metrics. A comprehensive set of radiomic features was subsequently extracted from peritumoral edema regions using PyRadiomics. The Boruta algorithm was applied for feature selection, and an optimized classification pipeline was developed using the Tree-based Pipeline Optimization Tool (TPOT). Model performance for glioma grade classification was evaluated based on accuracy, precision, recall, F1-score, and area under the curve (AUC). Analysis of fused image quality metrics confirmed that the LRD method produces high-quality fused images. From 851 radiomic features extracted from peritumoral edema regions, the Boruta algorithm selected different sets of informative features in both standard MRI and fused images. Subsequent TPOT automated machine learning optimization identified a fine-tuned Stochastic Gradient Descent (SGD) classifier, trained on features from T₁Gd+FLAIR fused images, as the top-performing model. This model achieved superior performance in glioma grade classification (Accuracy = 0.96, Precision = 1.0, Recall = 0.94, F1-Score = 0.96, AUC = 1.0). Radiomic features derived from peritumoral edema in fused MRI images using the LRD method demonstrated distinct, grade-specific patterns and can be utilized as a non-invasive, accurate, and rapid glioma grade classification method.
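
Two of the fused-image quality metrics named above, entropy and PSNR, have compact textbook definitions. The sketch below implements those standard definitions on flat lists of 8-bit pixel values; it is not the authors' code, and which reference image the study used for PSNR is not stated in the abstract.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (bits) of the grey-level histogram of `pixels`."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def psnr(reference, fused, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(fused):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - f) ** 2 for r, f in zip(reference, fused)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Tiny illustrative "images" given as flat pixel lists.
print(shannon_entropy([0, 0, 255, 255]))
print(psnr([10, 20, 30, 40], [12, 18, 33, 40]))
```

Higher entropy suggests that the fused image carries more grey-level information, and higher PSNR indicates less distortion relative to the chosen reference.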

(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)

5 pages, 314 KB

Open Access Proceeding Paper

Bespoke Biomarker Combinations for Cancer Survival Prognosis Using Artificial Intelligence on Tumour Transcriptomics

by Ricardo Jorge Pais, Tiago Alexandre Pais and Uraquitan Lima Filho

Med. Sci. Forum 2025, 37(1), 18; https://doi.org/10.3390/msf2025037018 - 2 Sep 2025

Viewed by 467

Abstract

Accurate cancer prognosis remains a major challenge, as single gene expression biomarkers often lack clinical reliability, and most ML approaches fail even when considering large gene panels. In this study, we used a novel AutoML framework (O2Pmgen) benchmarked with a well-established framework (TPOT) on TCGA transcriptomic data for breast, lung, and renal cancers to identify small gene panels predictive of patient survival. From 58 EMT-related genes, we found models based on panels of 6–10 genes that outperformed single-marker models and ML models that considered the 58 EMT genes, with performance gains up to 21%. Further, the generated models achieved good predictive power with AUCs of 71–83%. Our results demonstrated that affordable and efficient prognostic tools using small, biologically relevant gene sets can provide better risk stratification in clinical oncology.

(This article belongs to the Proceedings of 7th CiiEM International Congress 2025—Empowering One Health to Reduce Social Vulnerabilities)

21 pages, 3353 KB

Open Access Article

Automated Machine Learning-Based Significant Wave Height Prediction for Marine Operations

by Yuan Zhang, Hao Wang, Bo Wu, Jiajing Sun, Mingli Fan, Shu Dai, Hengyi Yang and Minyi Xu

J. Mar. Sci. Eng. 2025, 13(8), 1476; https://doi.org/10.3390/jmse13081476 - 31 Jul 2025

Cited by 2 | Viewed by 1369

Abstract

Predicting environmental conditions is central to a variety of marine operations, such as route planning and offshore installation. Significant wave height (Hs) is a critical parameter for characterizing waves, a dominant marine load. Data-driven machine learning methods have been increasingly applied to Hs prediction, but challenges remain in hyperparameter tuning and spatial generalization. This study explores a novel and effective approach for intelligent Hs forecasting for marine operations. Multiple automated machine learning (AutoML) frameworks, namely H2O, PyCaret, AutoGluon, and TPOT, were systematically evaluated on buoy-based Hs prediction tasks, revealing their advantages and limitations under various forecast horizons and data quality scenarios. The results indicate that PyCaret achieves superior accuracy in short-term forecasts, while AutoGluon demonstrates better robustness in medium-term and long-term predictions. To address the limitations of single-point prediction models, which often exhibit high dependence on localized data and limited spatial generalization, a multi-point data fusion framework incorporating Principal Component Analysis (PCA) is proposed. The framework utilizes Hs data from two stations near the California coast to predict Hs at another adjacent station. The results indicate that cross-station prediction is feasible when it is based on data from adjacent (high-relevance) stations.

(This article belongs to the Section Physical Oceanography)

24 pages, 5075 KB

Open Access Article

Automated Machine Learning-Based Prediction of the Effects of Physicochemical Properties and External Experimental Conditions on Cadmium Adsorption by Biochar

by Shuoyang Wang, Xiangyu Song, Jicheng Duan, Shuo Li, Dangdang Gao, Jia Liu, Fanjing Meng, Wen Yang, Shixin Yu, Fangshu Wang, Jie Xu, Siyi Luo, Fangchao Zhao and Dong Chen

Water 2025, 17(15), 2266; https://doi.org/10.3390/w17152266 - 30 Jul 2025

Cited by 3 | Viewed by 1711

Abstract

Biochar serves as an effective adsorbent for the heavy metal cadmium, with its performance significantly influenced by its physicochemical properties and various environmental features. Traditional machine learning models, though adept at managing complex multi-feature relationships, rely heavily on expertise in feature engineering and hyperparameter optimization. To address these issues, this study employs an automated machine learning (AutoML) approach, automating feature selection and model optimization, coupled with an intuitive online graphical user interface, enhancing accessibility and generalizability. Comparative analysis of four AutoML frameworks (TPOT, FLAML, AutoGluon, H2O AutoML) demonstrated that H2O AutoML achieved the highest prediction accuracy (R² = 0.918). Key features influencing adsorption performance were identified as initial cadmium concentration (23%), stirring rate (14.7%), and the biochar H/C ratio (9.7%). Additionally, the maximum adsorption capacity of the biochar was determined to be 105 mg/g. Optimal production conditions for biochar were determined to be a pyrolysis temperature of 570–800 °C, a residence time of ≥2 h, and a heating rate of 3–10 °C/min to achieve an H/C ratio of <0.2. An online graphical user interface was developed to facilitate user interaction with the model. This study not only provides practical guidelines for optimizing biochar but also introduces a novel approach to modeling using AutoML.

(This article belongs to the Special Issue Advanced Adsorbent-Based Technologies for Efficient Wastewater Treatment)

19 pages, 1425 KB

Open Access Article

Early Detection of Autism Spectrum Disorder Through Automated Machine Learning

by Khafsa Ehsan, Kashif Sultan, Abreen Fatima, Muhammad Sheraz and Teong Chee Chuah

Diagnostics 2025, 15(15), 1859; https://doi.org/10.3390/diagnostics15151859 - 24 Jul 2025

Cited by 5 | Viewed by 6693

Abstract

Background/Objectives: Autism spectrum disorder (ASD) is a neurodevelopmental disorder distinguished by an extensive range of symptoms, including reduced social interaction, communication difficulties, and repetitive behaviors. Early detection of ASD is important because it allows for timely intervention, which significantly improves developmental, behavioral, and communicative outcomes in children. However, traditional diagnostic procedures for identifying ASD typically involve lengthy clinical examinations, which can be both time-consuming and costly. This research proposes leveraging automated machine learning (AutoML) to streamline the diagnostic process and enhance its accuracy. Methods: Using data collected from various rehabilitation centers across Pakistan, we applied a specific AutoML tool, the Tree-based Pipeline Optimization Tool (TPOT), for ASD detection. Notably, this study marks one of the initial explorations into utilizing AutoML for ASD detection. The experiments indicate that TPOT provided the best pipeline for the dataset, which was verified using a manual machine learning method. Results: The study contributes to the field of ASD diagnosis by using AutoML to determine the likelihood of ASD in children at early stages of development. The study also provides an evaluation of precision, recall, and F1-score metrics to confirm the correctness of the diagnosis. The proposed TPOT-based AutoML framework attained an overall accuracy of 78%, with a precision of 83%, a recall of 90%, and an F1-score of 86% for the autistic class. Conclusions: In summary, this research offers an encouraging approach to improve the detection of ASD in children, which could lead to better results for affected individuals and their families.
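
The F1-score reported for the autistic class follows from the reported precision and recall, since F1 is their harmonic mean. A minimal check using only the figures in the abstract (the function name is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall for the autistic class, as quoted in the abstract.
f1 = f1_score(0.83, 0.90)
print(f"F1 = {100 * f1:.1f}%")
```

The result rounds to 86%, matching the stated F1-score.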

(This article belongs to the Special Issue Artificial Intelligence in Biomedical Diagnostics and Analysis 2024)

13 pages, 3490 KB

Open Access Article

Enhanced Realism in Animal Fur Simulation for Digital Conservation: A Physically-Based Rendering and Augmented Reality Approach

by Xuewei Xu, Chuanqian Tang, Xiaodan Zhang and Zhiqiang Liu

Appl. Sci. 2025, 15(14), 8049; https://doi.org/10.3390/app15148049 - 19 Jul 2025

Viewed by 963

Abstract

The rising popularity of ecotourism on the Tibetan Plateau has intensified the tension between wildlife conservation and economic development. Conventional wildlife displays often fail to achieve high-fidelity, non-invasive representations of animal morphology and typically lack immersive, interactive features, limiting public engagement in ecological protection. To address these limitations, this study presents a fur simulation algorithm based on the Texture Procedural Overlay Technique (TPOT), integrated with Augmented Reality (AR) technology, focusing on the endangered white-lipped deer in the Sanjiangyuan region. The proposed TPOT-based algorithm enhances the visual realism of fur through multi-layered procedural texturing and physical property fusion. Combined with an AR-driven interactive framework, it seamlessly integrates high-resolution 3D models into real-world environments, significantly improving user immersion and engagement. Comparative experiments demonstrate that the approach surpasses traditional methods in static display fidelity and animation rendering efficiency. User feedback further validates its effectiveness for scientific research and environmental education. This work introduces an innovative technological solution for wildlife conservation on the Tibetan Plateau and provides a practical reference for applying digital technologies in ecotourism.

25 pages, 7504 KB

Open Access Article

Explainable Artificial Intelligence (XAI) for Flood Susceptibility Assessment in Seoul: Leveraging Evolutionary and Bayesian AutoML Optimization

by Kounghoon Nam, Youngkyu Lee, Sungsu Lee, Sungyoon Kim and Shuai Zhang

Remote Sens. 2025, 17(13), 2244; https://doi.org/10.3390/rs17132244 - 30 Jun 2025

Cited by 4 | Viewed by 2224

Abstract

This study aims to enhance the accuracy and interpretability of flood susceptibility mapping (FSM) in Seoul, South Korea, by integrating automated machine learning (AutoML) with explainable artificial intelligence (XAI) techniques. Ten topographic and environmental conditioning factors were selected as model inputs. We first employed the Tree-based Pipeline Optimization Tool (TPOT), an evolutionary AutoML algorithm, to construct baseline ensemble models using Gradient Boosting (GB), Random Forest (RF), and XGBoost (XGB). These models were further fine-tuned using Bayesian optimization via Optuna. To interpret the model outcomes, SHAP (SHapley Additive exPlanations) was applied to analyze both the global and local contributions of each factor. The SHAP analysis revealed that lower elevation, slope, and stream distance, as well as higher stream density and built-up areas, were the most influential factors contributing to flood susceptibility. Moreover, interactions between these factors, such as built-up areas located on gentle slopes near streams, further intensified flood risk. The susceptibility maps were reclassified into five categories (very low to very high), and the GB model identified that approximately 15.047% of the study area falls under very-high-flood-risk zones. Among the models, the GB classifier achieved the highest performance, followed by XGB and RF. The proposed framework, which integrates TPOT, Optuna, and SHAP within an XAI pipeline, not only improves predictive capability but also offers transparent insights into feature behavior and model logic. These findings support more robust and interpretable flood risk assessments for effective disaster management in urban areas.
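
The reclassification of a continuous susceptibility score into five categories, and the share of the area in each, can be sketched as follows. The equal-interval break values are an assumption made purely for illustration; the abstract does not state which classification scheme the authors used, and the function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]
BREAKS = [0.2, 0.4, 0.6, 0.8]  # assumed equal-interval breaks, not from the paper


def classify(probability):
    """Map a susceptibility probability in [0, 1] to one of five classes."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return CLASS_NAMES[bisect_right(BREAKS, probability)]


def class_shares(probabilities):
    """Percentage of cells (i.e. of the mapped area) falling in each class."""
    counts = Counter(classify(p) for p in probabilities)
    total = len(probabilities)
    return {name: 100.0 * counts[name] / total for name in CLASS_NAMES}


# Hypothetical per-cell model outputs.
print(class_shares([0.05, 0.35, 0.55, 0.85, 0.95]))
```

A figure such as the very-high-risk share quoted above would correspond to the "very high" entry of such a table, computed over all raster cells of the study area.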

(This article belongs to the Special Issue Artificial Intelligence for Natural Hazards (AI4NH))

19 pages, 4395 KB

Open Access Article

Web-Based Baseflow Estimation in SWAT Considering Spatiotemporal Recession Characteristics Using Machine Learning

by Jimin Lee, Jeongho Han, Bernard Engel and Kyoung Jae Lim

Environments 2025, 12(3), 94; https://doi.org/10.3390/environments12030094 - 17 Mar 2025

Cited by 2 | Viewed by 1923

Abstract

The increasing frequency and severity of hydrological extremes due to climate change necessitate accurate baseflow estimation and effective watershed management for sustainable water resource use. The Soil and Water Assessment Tool (SWAT) is widely utilized for hydrological modeling but shows limitations in baseflow simulation due to its uniform application of the alpha factor across Hydrologic Response Units (HRUs), neglecting spatial and temporal variability. To address these challenges, this study integrated SWAT with the Tree-Based Pipeline Optimization Tool (TPOT), an automated machine learning (AutoML) framework, to predict HRU-specific alpha factors. Furthermore, a user-friendly web-based program was developed to improve the accessibility and practical application of these optimized alpha factors, supporting more accurate baseflow predictions, even in ungauged watersheds. The proposed HRU-specific alpha factor approach in the study area significantly enhanced the recession and baseflow predictions compared to the traditional uniform alpha factor method. This improvement was supported by key performance metrics, including the Nash–Sutcliffe Efficiency (NSE), the coefficient of determination (R²), the percent bias (PBIAS), and the mean absolute percentage error (MAPE). This integrated framework effectively improves the accuracy and practicality of hydrological modeling, offering scalable and innovative solutions for sustainable watershed management in the face of increasing water stress.
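
Three of the performance metrics named above, along with the role of the alpha factor, can be illustrated with standard textbook formulas. This is a sketch, not the authors' implementation: the PBIAS sign convention used here (positive means the simulation underestimates) is one common choice, and the exponential recession form is the usual interpretation of a baseflow recession constant rather than something stated in the abstract.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(obs, sim):
    """Percent bias; with this sign convention, positive means underestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def mape(obs, sim):
    """Mean absolute percentage error (observations must be non-zero)."""
    return 100.0 / len(obs) * sum(abs((o - s) / o) for o, s in zip(obs, sim))


def baseflow_recession(q0, alpha, days):
    """Exponential recession Q(t) = Q0 * exp(-alpha * t), assumed form."""
    return q0 * math.exp(-alpha * days)


observed = [1.0, 2.0, 3.0]   # hypothetical baseflow values
simulated = [2.0, 2.0, 2.0]  # a model that always predicts the mean
print(nse(observed, simulated), pbias(observed, simulated), mape(observed, simulated))
```

A larger alpha makes the simulated baseflow decay faster after a recharge event, which is why a single uniform value can misrepresent HRUs whose recession behavior differs.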

(This article belongs to the Special Issue Hydrological Modeling and Sustainable Water Resources Management)
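For readers unfamiliar with the metrics named in the abstract above, the following is a minimal sketch of how NSE, PBIAS, and MAPE are commonly computed from observed and simulated series. The function names are illustrative, and the PBIAS sign convention (positive values indicating underestimation) is one common choice that the paper may or may not follow.

```python
# Illustrative helpers only; not code from the paper.

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(obs, sim):
    """Percent bias, assuming the convention 100 * sum(obs - sim) / sum(obs)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def mape(obs, sim):
    """Mean absolute percentage error; undefined if any observation is zero."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]    # made-up flow values
    simulated = [1.1, 2.2, 3.3, 4.4]   # made-up model output, 10% too high
    print(nse(observed, simulated), pbias(observed, simulated), mape(observed, simulated))
```

Under this convention, a simulation that is uniformly 10% too high yields a PBIAS of about -10% and a MAPE of about 10%.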

22 pages, 11145 KB

Open AccessArticle

Regional Soil Moisture Estimation Leveraging Multi-Source Data Fusion and Automated Machine Learning

by Shenglin Li, Pengyuan Zhu, Ni Song, Caixia Li and Jinglei Wang

Remote Sens. 2025, 17(5), 837; https://doi.org/10.3390/rs17050837 - 27 Feb 2025

Cited by 15 | Viewed by 3391

Abstract

Soil moisture (SM) monitoring in farmland at a regional scale is crucial for precision irrigation management and ensuring food security. However, existing methods for SM estimation encounter significant challenges related to accuracy, generalizability, and automation. This study proposes an integrated data fusion method to systematically assess the potential of three automated machine learning (AutoML) frameworks, namely the tree-based pipeline optimization tool (TPOT), AutoGluon, and H2O AutoML, in retrieving SM. To evaluate the impact of input variables on estimation accuracy, six input scenarios were designed: multispectral data (MS), thermal infrared data (TIR), MS combined with TIR, MS with auxiliary data, TIR with auxiliary data, and a comprehensive combination of MS, TIR, and auxiliary data. The research was conducted in a winter wheat cultivation area within the People’s Victory Canal Irrigation Area, focusing on the 0–40 cm soil layer. The results revealed that the scenario incorporating all data types (MS + TIR + auxiliary) achieved the highest retrieval accuracy, and all three AutoML frameworks performed best under this scenario. AutoGluon outperformed the other frameworks in most scenarios, particularly in the MS + TIR + auxiliary scenario, where it achieved the highest retrieval accuracy with a Pearson correlation coefficient (R) of 0.822, a root mean square error (RMSE) of 0.038 cm³/cm³, and a relative root mean square error (RRMSE) of 16.46%. This study underscores the critical role of input data types and fusion strategies in enhancing SM estimation accuracy and highlights the advantages of AutoML frameworks for regional-scale SM retrieval. The findings offer a technical foundation and theoretical guidance for advancing precision irrigation management and efficient SM monitoring.
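As a rough illustration of the accuracy measures quoted in the abstract above, the sketch below computes the Pearson correlation coefficient, RMSE, and RRMSE for paired observed and estimated values. The function names are illustrative, and RRMSE is assumed here to be RMSE divided by the mean of the observations, expressed as a percentage; the paper may define it differently.

```python
import math

# Illustrative helpers only; not code from the paper.

def rmse(obs, est):
    """Root mean square error, in the same units as the inputs."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def rrmse(obs, est):
    """Relative RMSE in percent, assuming normalization by the observed mean."""
    return 100.0 * rmse(obs, est) / (sum(obs) / len(obs))


def pearson_r(obs, est):
    """Pearson correlation coefficient between observed and estimated values."""
    mean_o = sum(obs) / len(obs)
    mean_e = sum(est) / len(est)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    spread_e = math.sqrt(sum((e - mean_e) ** 2 for e in est))
    return cov / (spread_o * spread_e)


if __name__ == "__main__":
    observed = [0.20, 0.25, 0.30]    # made-up SM values, cm^3/cm^3
    estimated = [0.22, 0.27, 0.32]   # made-up retrievals with a constant offset
    print(pearson_r(observed, estimated), rmse(observed, estimated), rrmse(observed, estimated))
```

If RRMSE is indeed normalized by the observed mean, the reported RMSE of 0.038 cm³/cm³ and RRMSE of 16.46% would together imply a mean observed SM of roughly 0.23 cm³/cm³.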

Search Results (39)
