Search Results (310)

Search Parameters:
Keywords = robust outlier analysis

19 pages, 9838 KB  
Article
Processing of Large Underground Excavation System—Skeleton Based Section Segmentation for Point Cloud Regularization
by Przemysław Dąbek, Jacek Wodecki, Adam Wróblewski and Sebastian Gola
Appl. Sci. 2026, 16(1), 313; https://doi.org/10.3390/app16010313 (registering DOI) - 28 Dec 2025
Abstract
Numerical modelling of airflow in underground mines is gaining importance in modern ventilation system design and safety assessment. Computational Fluid Dynamics (CFD) simulations enable detailed analyses of air movement, contaminant dispersion, and heat transfer, yet their reliability depends strongly on the accuracy of the geometric representation of excavations. Raw point cloud data obtained from laser scanning of underground workings are typically irregular, noisy, and contain discontinuities that must be processed before being used for CFD meshing. This study presents a methodology for automatic segmentation and regularization of large-scale point cloud data of underground excavation systems. The proposed approach is based on skeleton extraction and trajectory analysis, which enable the separation of excavation networks into individual tunnel segments and crossings. The workflow includes outlier removal, alpha-shape generation, voxelization, medial-axis skeletonization, and topology-based segmentation using neighbor relationships within the voxel grid. A proximity-based correction step is introduced to handle doubled crossings produced by the skeletonization process. The segmented sections are subsequently regularized through radial analysis and surface reconstruction to produce uniform and watertight models suitable for mesh generation in CFD software (Ansys 2024 R1). The methodology was tested on both synthetic datasets and real-world laser scans acquired in underground mine conditions. The results demonstrate that the proposed segmentation approach effectively isolates single-line drifts and crossings, ensuring continuous and smooth geometry while preserving the overall excavation topology. The developed method provides a robust preprocessing framework that bridges the gap between point cloud acquisition and numerical modelling, enabling automated transformation of raw data into CFD-ready geometric models for ventilation and safety analysis of complex underground excavation systems. Full article
(This article belongs to the Special Issue Mining Engineering: Present and Future Prospectives)
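A minimal Python sketch of two early steps of the kind of workflow described above, k-nearest-neighbour statistical outlier removal followed by voxelization, using NumPy and SciPy; the point array, parameter values, and function names are illustrative placeholders, not the authors' implementation.

```python
# Sketch: statistical outlier removal and voxelization for a raw point cloud.
# `points`, k, std_ratio, and voxel_size are all illustrative assumptions.
import numpy as np
from scipy.spatial import cKDTree

def remove_statistical_outliers(points, k=16, std_ratio=2.0):
    """Drop points whose mean distance to their k neighbours is anomalously large."""
    tree = cKDTree(points)
    dists, _ = tree.query(points, k=k + 1)        # first column is the point itself
    mean_d = dists[:, 1:].mean(axis=1)
    keep = mean_d < mean_d.mean() + std_ratio * mean_d.std()
    return points[keep]

def voxelize(points, voxel_size=0.25):
    """Snap points to a regular voxel grid and keep one representative per voxel."""
    idx = np.floor(points / voxel_size).astype(np.int64)
    _, first = np.unique(idx, axis=0, return_index=True)
    return points[np.sort(first)]

points = np.random.rand(10000, 3) * 50.0          # placeholder for a laser scan
clean = remove_statistical_outliers(points)
voxels = voxelize(clean)
print(clean.shape, voxels.shape)
```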

25 pages, 2770 KB  
Article
Analysis of the Travelling Time According to Weather Conditions Using Machine Learning Algorithms
by Gülçin Canbulut
Appl. Sci. 2026, 16(1), 6; https://doi.org/10.3390/app16010006 - 19 Dec 2025
Viewed by 147
Abstract
A large share of the global population now lives in urban areas, which creates growing challenges for city life. Local authorities are seeking ways to enhance urban livability, with transportation emerging as a major focus. Developing smart public transit systems is therefore a key priority. Accurately estimating travel times is essential for managing transport operations and supporting strategic decisions. Previous studies have used statistical, mathematical, or machine learning models to predict travel time, but most examined these approaches separately. This study introduces a hybrid framework that combines statistical regression models and machine learning algorithms to predict public bus travel times. The analysis is based on 1410 bus trips recorded between November 2021 and July 2022 in Kayseri, Turkey, including detailed meteorological and operational data. A distinctive aspect of this research is the inclusion of weather variables—temperature, humidity, precipitation, air pressure, and wind speed—which are often neglected in the literature. Additionally, sensitivity analyses are conducted by varying k values in the K-nearest neighbors (KNN) algorithm and threshold values for outlier detection to test model robustness. Among the tested models, CatBoost achieved the best performance with a mean squared error (MSE) of approximately 18.4, outperforming random forest (MSE = 25.3) and XGBoost (MSE = 23.9). The empirical results show that the CatBoost algorithm consistently achieves the lowest mean squared error across different preprocessing and parameter settings. Overall, this study presents a comprehensive and environmentally aware approach to travel time prediction, contributing to the advancement of intelligent and adaptive urban transportation systems. Full article
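A minimal sketch of the kind of sensitivity analysis described above, varying k in a k-NN regressor and the IQR multiplier used for outlier removal and comparing cross-validated MSE; the synthetic regression data and parameter grids are assumptions for illustration, not the Kayseri bus data.

```python
# Sketch: joint sweep over an IQR-based outlier threshold and the k-NN neighbour count.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=1410, n_features=6, noise=15.0, random_state=0)

for iqr_mult in (1.5, 3.0):                      # outlier-removal threshold sweep
    q1, q3 = np.percentile(y, [25, 75])
    mask = (y > q1 - iqr_mult * (q3 - q1)) & (y < q3 + iqr_mult * (q3 - q1))
    for k in (3, 5, 10, 20):                     # k-NN sensitivity sweep
        mse = -cross_val_score(KNeighborsRegressor(n_neighbors=k),
                               X[mask], y[mask], cv=5,
                               scoring="neg_mean_squared_error").mean()
        print(f"iqr_mult={iqr_mult}, k={k}, MSE={mse:.1f}")
```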

21 pages, 893 KB  
Article
Enhancing Diagnostic Infrastructure Through Innovation-Driven Technological Capacity in Healthcare
by Nicoleta Mihaela Doran
Healthcare 2025, 13(24), 3328; https://doi.org/10.3390/healthcare13243328 - 18 Dec 2025
Viewed by 182
Abstract
Background: This study examines how national innovation performance shapes the diffusion of advanced diagnostic technologies across European healthcare systems. Strengthening technological capacity through innovation is increasingly essential for resilient and efficient health services. The analysis quantifies the influence of innovation capacity on the availability of medical imaging technologies in 26 EU Member States between 2018 and 2024. Methods: A balanced panel dataset was assembled from Eurostat, the European Innovation Scoreboard, and World Bank indicators. Dynamic relationships between innovation performance and the adoption of CT, MRI, gamma cameras, and PET scanners were estimated using a two-step approach combining General-to-Specific (GETS) outlier detection with Robust Least Squares regression to address heterogeneity and specification uncertainty. Results: Higher innovation scores significantly increase the diffusion of R&D-intensive technologies such as MRI and PET, while CT availability shows limited responsiveness due to market maturity. Public health expenditure supports frontier technologies when strategically targeted, whereas GDP growth has no significant effect. Population size consistently enhances technological capacity through scale and system-integration effects. Conclusions: The findings show that innovation ecosystems, rather than economic growth alone, drive the modernization of diagnostic infrastructure in the EU. Integrating innovation metrics into health-technology assessments offers a more accurate basis for designing innovation-oriented investment policies in European healthcare. Full article
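A minimal sketch of a robust least-squares regression in the spirit of the second estimation step described above, using statsmodels' RLM with a Huber loss; the GETS outlier-detection step is omitted, and the synthetic panel and variable names are illustrative assumptions.

```python
# Sketch: robust regression of a diffusion indicator on innovation and expenditure covariates.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
panel = pd.DataFrame({
    "innovation_score": rng.normal(100, 15, 182),   # 26 countries x 7 years (illustrative)
    "health_exp":       rng.normal(8, 2, 182),
    "population_log":   rng.normal(16, 1.2, 182),
})
panel["mri_per_million"] = (0.08 * panel["innovation_score"]
                            + 0.5 * panel["health_exp"] + rng.normal(0, 2, 182))

X = sm.add_constant(panel[["innovation_score", "health_exp", "population_log"]])
robust_fit = sm.RLM(panel["mri_per_million"], X, M=sm.robust.norms.HuberT()).fit()
print(robust_fit.params)
```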

22 pages, 2261 KB  
Article
Statistical and Multivariate Analysis of the IoT-23 Dataset: A Comprehensive Approach to Network Traffic Pattern Discovery
by Humera Ghani, Shahram Salekzamankhani and Bal Virdee
J. Cybersecur. Priv. 2025, 5(4), 112; https://doi.org/10.3390/jcp5040112 - 16 Dec 2025
Viewed by 340
Abstract
The rapid expansion of Internet of Things (IoT) technologies has introduced significant challenges in understanding the complexity and structure of network traffic data, which is essential for developing effective cybersecurity solutions. This research presents a comprehensive statistical and multivariate analysis of the IoT-23 dataset to identify meaningful network traffic patterns and assess the effectiveness of various analytical methods for IoT security research. The study applies descriptive statistics, inferential analysis, and multivariate techniques, including Principal Component Analysis (PCA), DBSCAN clustering, and factor analysis (FA), to the publicly available IoT-23 dataset. Descriptive analysis reveals clear evidence of non-normal distributions: for example, the features src_bytes, dst_bytes, and src_pkts have skewness values of −4.21, −3.87, and −2.98, and kurtosis values of 38.45, 29.67, and 18.23, respectively. These values indicate highly skewed, heavy-tailed distributions with frequent outliers. Correlation analysis revealed a strong positive correlation (0.97) between orig_bytes and resp_bytes, and a strong negative correlation (−0.76) between duration and resp_bytes, while inferential statistics indicate that linear regression provides optimal modeling of data relationships. Key findings show that PCA is highly effective, capturing 99% of the dataset’s variance and enabling significant dimensionality reduction. DBSCAN clustering identifies six distinct clusters, highlighting diverse network traffic behaviors within IoT environments. In contrast, FA explains only 11.63% of the variance, indicating limited suitability for this dataset. These results establish important benchmarks for future IoT cybersecurity research and demonstrate the superior effectiveness of PCA and DBSCAN for analyzing complex IoT network traffic data. The findings offer practical guidance for researchers in selecting appropriate statistical methods for IoT dataset analysis, ultimately supporting the development of more robust cybersecurity solutions. Full article
(This article belongs to the Special Issue Intrusion/Malware Detection and Prevention in Networks—2nd Edition)
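A minimal sketch of the descriptive and multivariate steps named above (per-feature skewness and kurtosis, PCA retaining 99% of variance, DBSCAN clustering) using SciPy and scikit-learn; the heavy-tailed random matrix stands in for the IoT-23 flow features and the DBSCAN parameters are illustrative.

```python
# Sketch: distribution shape statistics, PCA variance capture, and density-based clustering.
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
flows = rng.lognormal(mean=2.0, sigma=1.5, size=(5000, 8))   # heavy-tailed stand-in

print("skewness:", np.round(skew(flows, axis=0), 2))
print("kurtosis:", np.round(kurtosis(flows, axis=0), 2))

X = StandardScaler().fit_transform(flows)
pca = PCA(n_components=0.99).fit(X)          # keep components explaining 99% of variance
print("components for 99% variance:", pca.n_components_)

labels = DBSCAN(eps=0.8, min_samples=20).fit_predict(pca.transform(X))
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
```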

28 pages, 4625 KB  
Article
Hybrid PCA-Based and Machine Learning Approaches for Signal-Based Interference Detection and Anomaly Classification Under Synthetic Data Conditions
by Sebastián Čikovský, Patrik Šváb and Peter Hanák
Sensors 2025, 25(24), 7585; https://doi.org/10.3390/s25247585 - 14 Dec 2025
Viewed by 322
Abstract
This article addresses anomaly detection in multichannel spatiotemporal data under strict low-false-alarm constraints (e.g., 1% False Positive Rate, FPR), a requirement essential for safety-critical applications such as signal interference monitoring in sensor networks. We introduce a lightweight, interpretable pipeline that deliberately avoids deep learning dependencies, implemented solely in NumPy and scikit-learn. The core innovation lies in fusing three complementary anomaly signals in an ensemble: (i) Principal Component Analysis (PCA) Reconstruction Error (MSE) to capture global structure deviations, (ii) Local Outlier Factor (LOF) on residual maps to detect local rarity, and (iii) Monte Carlo Variance as a measure of epistemic uncertainty in model predictions. These signals are combined via learned logistic regression (F*) and specialized Neyman–Pearson optimized fusion (F** and F***) to rigorously enforce bounded false alarms. Evaluated on synthetic benchmarks that simulate realistic anomalies and extensive SNR shifts (±12 dB), the fusion approach demonstrates exceptional robustness. While the best single baseline (MC-variance) achieves a True Positive Rate (TPR) of ≈0.60 at 1% FPR on the 0 dB hold-out, the fusion significantly raises this to ≈0.74 (F**), avoiding the performance collapse of baselines under degraded SNR (maintaining ≈ 0.62 TPR at −12 dB). This deployable solution provides a transparent, edge-ready anomaly detection capability that is highly effective at operating points critical for reliable monitoring in dynamic environments. Full article
(This article belongs to the Section Intelligent Sensors)
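A minimal sketch of the score-fusion idea described above: PCA reconstruction error and Local Outlier Factor scores fused with logistic regression, with the decision threshold set on normal-only data to target roughly a 1% false-positive rate; the Monte Carlo variance signal is omitted and all data and parameters are synthetic placeholders.

```python
# Sketch: fuse two anomaly scores and bound the false-positive rate on normal data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
normal = rng.normal(0, 1, (4000, 30))
anom = rng.normal(0.8, 1.5, (400, 30))
X = np.vstack([normal, anom])
y = np.r_[np.zeros(4000), np.ones(400)]

pca = PCA(n_components=10).fit(normal)
recon_err = ((X - pca.inverse_transform(pca.transform(X))) ** 2).mean(axis=1)

lof = LocalOutlierFactor(n_neighbors=35, novelty=True).fit(normal)
lof_score = -lof.score_samples(X)                     # higher = more anomalous

scores = np.column_stack([recon_err, lof_score])
fused = LogisticRegression().fit(scores, y).predict_proba(scores)[:, 1]

thr = np.quantile(fused[y == 0], 0.99)                # bound FPR at ~1%
print("TPR at ~1% FPR:", (fused[y == 1] > thr).mean())
```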

30 pages, 10644 KB  
Article
Integrating PCA and Fractal Modeling for Identifying Geochemical Anomalies in the Tropics: The Malang–Lumajang Volcanic Arc, Indonesia
by Wahyu Widodo, Ernowo Ernowo, Ridho Nanda Pratama, Mochamad Rifat Noor, Denni Widhiyatna, Edya Putra, Arifudin Idrus, Bambang Pardiarto, Zach Boakes, Martua Raja Parningotan, Triswan Suseno, Retno Damayanti, Purnama Sendjaja, Dwi Rachmawati and Ayumi Hana Putri Ramadani
Geosciences 2025, 15(12), 470; https://doi.org/10.3390/geosciences15120470 - 12 Dec 2025
Viewed by 321
Abstract
Intense chemical weathering in tropical environments poses challenges for conventional geochemical exploration, as primary lithological signatures become heavily altered. Stream sediment geochemistry provides a robust alternative for detecting anomalous geochemical patterns under these conditions. In this study, 636 stream sediment samples and 15 rock samples were evaluated using Principal Component Analysis (PCA), Median + 2 Median Absolute Deviation (MAD), and Concentration–Area (C–A) fractal modeling to identify potential anomaly zones. These results were compared with the traditional Mean plus 2 Standard Deviation (SD) approach. The findings indicated that Mean + 2SD offers a conservative threshold but overlooks anomalies in heterogeneous datasets, while Median + 2MAD provides robustness against outliers. The C–A fractal model effectively characterizes low- and high-order anomalies by capturing multiscale variability. Elements such as Au–Ag–Hg–Se–Sb–As form a system indicating low- to intermediate-sulphidation epithermal mineralization. Au–Pb points to polymetallic hydrothermal mineralization along intrusive contacts. The southern region is a primary mineralization center controlled by an intrusive–volcanic boundary, whereas the east and west areas exhibit secondary mineralization, characterized by altered lava breccia. The correlation between shallow epithermal and deeper intrusive-related porphyry systems, especially regarding Au–Ag, offers new insights into the metallogenic landscape of the Sunda–Banda arc. Beyond regional significance, this research presents a geostatistical workflow designed to mitigate exploration uncertainty in geochemically complex zones, providing a structured approach applicable to volcanic-arc mineralized provinces worldwide. Full article
(This article belongs to the Section Geochemistry)
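A minimal sketch contrasting the two thresholds named above, Mean + 2SD versus the outlier-resistant Median + 2MAD, and building the log-log concentration-area relation used in C–A fractal modeling; the element concentrations are synthetic stand-ins and the tail-slope fit is only illustrative.

```python
# Sketch: classical vs robust anomaly thresholds and a crude concentration-area curve.
import numpy as np
from scipy.stats import median_abs_deviation

rng = np.random.default_rng(3)
au_ppb = rng.lognormal(mean=1.0, sigma=1.2, size=636)     # stand-in for Au in ppb

mean_2sd = au_ppb.mean() + 2 * au_ppb.std()
median_2mad = np.median(au_ppb) + 2 * median_abs_deviation(au_ppb, scale="normal")
print(f"Mean+2SD threshold:    {mean_2sd:.1f}")
print(f"Median+2MAD threshold: {median_2mad:.1f}")

# Concentration-area relation: "area" (here, sample count) above each concentration,
# examined on log-log axes; breaks in slope separate background from anomalies.
conc = np.sort(au_ppb)
area_above = np.arange(len(conc), 0, -1)
log_c, log_a = np.log10(conc), np.log10(area_above)
tail = conc > median_2mad
slope = np.polyfit(log_c[tail], log_a[tail], 1)[0]
print(f"upper-tail C-A slope: {slope:.2f}")
```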

32 pages, 7383 KB  
Article
Vertebra Segmentation and Cobb Angle Calculation Platform for Scoliosis Diagnosis Using Deep Learning: SpineCheck
by İrfan Harun İlkhan, Halûk Gümüşkaya and Firdevs Turgut
Informatics 2025, 12(4), 140; https://doi.org/10.3390/informatics12040140 - 11 Dec 2025
Viewed by 530
Abstract
This study presents SpineCheck, a fully integrated deep-learning-based clinical decision support platform for automatic vertebra segmentation and Cobb angle (CA) measurement from scoliosis X-ray images. The system unifies end-to-end preprocessing, U-Net-based segmentation, geometry-driven angle computation, and a web-based clinical interface within a single deployable architecture. For secure clinical use, SpineCheck adopts a stateless “process-and-delete” design, ensuring that no radiographic data or Protected Health Information (PHI) are permanently stored. Five U-Net family models (U-Net, optimized U-Net-2, Attention U-Net, nnU-Net, and UNet3++) are systematically evaluated under identical conditions using Dice similarity, inference speed, GPU memory usage, and deployment stability, enabling deployment-oriented model selection. A robust CA estimation pipeline is developed by combining minimum-area rectangle analysis with Theil–Sen regression and spline-based anatomical modeling to suppress outliers and improve numerical stability. The system is validated on a large-scale dataset of 20,000 scoliosis X-ray images, demonstrating strong agreement with expert measurements based on Mean Absolute Error, Pearson correlation, and Intraclass Correlation Coefficient metrics. These findings confirm the reliability and clinical robustness of SpineCheck. By integrating large-scale validation, robust geometric modeling, secure stateless processing, and real-time deployment capabilities, SpineCheck provides a scalable and clinically reliable framework for automated scoliosis assessment. Full article
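A minimal sketch of the robust geometric idea described above: endplate slopes fitted with the Theil–Sen estimator, with the Cobb angle taken as the angle between two endplate lines; the endplate point sets and noise levels are illustrative, not output of the SpineCheck segmentation.

```python
# Sketch: robust endplate slope estimation and a Cobb angle from two endplates.
import numpy as np
from scipy.stats import theilslopes

def endplate_angle_deg(xs, ys):
    slope, _, _, _ = theilslopes(ys, xs)      # robust slope of the endplate line
    return np.degrees(np.arctan(slope))

rng = np.random.default_rng(4)
xs = np.linspace(0, 40, 25)
upper = endplate_angle_deg(xs, 0.35 * xs + rng.normal(0, 0.8, xs.size))
lower = endplate_angle_deg(xs, -0.18 * xs + rng.normal(0, 0.8, xs.size))
cobb = abs(upper - lower)
print(f"estimated Cobb angle: {cobb:.1f} degrees")
```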

41 pages, 7185 KB  
Article
Two-Stage Dam Displacement Analysis Framework Based on Improved Isolation Forest and Metaheuristic-Optimized Random Forest
by Zhihang Deng, Qiang Wu and Minshui Huang
Buildings 2025, 15(24), 4467; https://doi.org/10.3390/buildings15244467 - 10 Dec 2025
Viewed by 264
Abstract
Dam displacement monitoring is crucial for assessing structural safety; however, conventional models often prioritize single-task prediction, leading to an inherent difficulty in balancing monitoring data quality with model performance. To bridge this gap, this study proposes a novel two-stage analytical framework that synergistically integrates an improved isolation forest (iForest) with a metaheuristic-optimized random forest (RF). The first stage focuses on data cleaning, where Kalman filtering is applied for denoising, and a newly developed Dynamic Threshold Isolation Forest (DTIF) algorithm is introduced to effectively isolate noise and outliers amidst complex environmental loads. In the second stage, the model’s predictive capability is enhanced by first employing the LASSO algorithm for feature importance analysis and optimal subset selection, followed by an Improved Reptile Search Algorithm (IRSA) for fine-tuning RF hyperparameters, thereby significantly boosting the model’s robustness. The IRSA incorporates several key improvements: Tent chaotic mapping during initialization to ensure population diversity, an adaptive parameter adjustment mechanism combined with a Lévy flight strategy in the encircling phase to dynamically balance global exploration and convergence, and the integration of elite opposition-based learning with Gaussian perturbation in the hunting phase to refine local exploitation. Validated against field data from a concrete hyperbolic arch dam, the proposed DTIF algorithm demonstrates superior anomaly detection accuracy across nine distinct outlier distribution scenarios. Moreover, for long-term displacement prediction tasks, the IRSA-RF model substantially outperforms traditional benchmark models in both predictive accuracy and generalization capability, providing a reliable early risk warning and decision-support tool for engineering practice. Full article
(This article belongs to the Special Issue Structural Health Monitoring Through Advanced Artificial Intelligence)
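A minimal sketch of the two-stage idea described above under simplifying assumptions: an Isolation Forest scores displacement records, a rolling quantile of those scores acts as a dynamic threshold, and a random forest is then fitted on the cleaned series; the DTIF and IRSA components of the paper are not reproduced and all data are synthetic.

```python
# Sketch: anomaly scoring with a moving threshold, then regression on the cleaned series.
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest, RandomForestRegressor

rng = np.random.default_rng(5)
t = np.arange(2000)
disp = 5 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 0.3, t.size)
disp[rng.choice(t.size, 20, replace=False)] += rng.normal(0, 5, 20)   # injected outliers

X_d = disp.reshape(-1, 1)
score = IsolationForest(random_state=0).fit(X_d).score_samples(X_d)
dyn_thr = pd.Series(score).rolling(200, min_periods=50).quantile(0.02)   # moving threshold
keep = score > dyn_thr.fillna(score.min()).to_numpy()

X = np.column_stack([t[keep], np.sin(2 * np.pi * t[keep] / 365)])
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, disp[keep])
print("records kept after dynamic thresholding:", keep.sum())
```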

24 pages, 2288 KB  
Article
Anomaly Detection in Imbalanced Network Traffic Using a ResCAE-BiGRU Framework
by Xiaofeng Nong, Kuangyu Qin and Xingliu Xie
Symmetry 2025, 17(12), 2087; https://doi.org/10.3390/sym17122087 - 5 Dec 2025
Viewed by 405
Abstract
To address the critical challenge of low detection rates for rare anomaly classes in network traffic, a problem exacerbated by severe data imbalance, this paper proposes a deep learning framework for anomaly detection in imbalanced network traffic. Initially, the framework employs the Isolation Forest (iForest) and SMOTE-Tomek techniques for outlier removal and data balancing, respectively, to enhance data quality. The model first undergoes unsupervised pre-training using a symmetrically designed Residual Convolutional Autoencoder (ResCAE) to learn robust feature representations. Subsequently, the pre-trained encoder is integrated with a Bidirectional Gated Recurrent Unit (BiGRU) to capture temporal dependencies within the traffic features. During the fine-tuning phase, a Sharpness-Aware Minimization (SAM) optimizer is employed to enhance the model’s generalization capability. The experimental results on the public CICIDS2017 and UNSW-NB15 datasets reveal the model’s outstanding performance, achieving an accuracy, precision, recall, and F1-score of 99.33%, 99.53%, 99.33%, and 99.41%, respectively. Comparative analysis against baseline models confirms that the proposed method not only surpasses traditional machine learning algorithms but also holds a significant advantage over contemporary deep learning models. The results validate that this framework effectively resolves the issue of low detection rates for rare anomaly classes caused by data imbalance, offering a powerful and robust solution for building high-performance anomaly detection frameworks. Full article
(This article belongs to the Section Computer)
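A minimal sketch of the data-preparation stage described above, Isolation Forest outlier removal followed by SMOTE-Tomek rebalancing via the imbalanced-learn package; the ResCAE-BiGRU model itself is not sketched, and the class imbalance and contamination rate below are illustrative.

```python
# Sketch: outlier removal with iForest, then hybrid over/under-sampling of the rare class.
import numpy as np
from sklearn.ensemble import IsolationForest
from imblearn.combine import SMOTETomek     # requires the imbalanced-learn package

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, (9000, 20)), rng.normal(2, 1, (200, 20))])
y = np.r_[np.zeros(9000, dtype=int), np.ones(200, dtype=int)]   # ~2% rare class

inlier = IsolationForest(contamination=0.01, random_state=0).fit_predict(X) == 1
X_bal, y_bal = SMOTETomek(random_state=0).fit_resample(X[inlier], y[inlier])
print("class counts after balancing:", np.bincount(y_bal))
```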

20 pages, 609 KB  
Article
Geometric Fusion Mechanism for Reliable Central Measure Construction Amid Partial and Distorted Information
by Mohammed Ahmed Alomair and Muhammad Raza
Axioms 2025, 14(12), 883; https://doi.org/10.3390/axioms14120883 - 29 Nov 2025
Viewed by 135
Abstract
Biased estimates and fluctuating measures of central tendency are significant impediments to statistical inference and computational data analysis and are often caused by partial and distorted observations. Imputed least-squares-based estimators are highly sensitive to non-normality, outliers, and missing data; thus, they cannot guarantee reliability in the presence of anomalous data. To overcome these inadequacies, this paper utilizes a geometric fusion scheme, the Minimum Regularized Covariance Determinant (MRCD), to construct high-quality central measures. The suggested mechanism incorporates the concept of geometric dispersion and resistance-based principles of covariance to form stable dispersion structures, irrespective of data contamination and incompleteness. In this computational scheme, three estimators are developed, all of which use adaptive logarithmic transformations to boost efficiency and robustness. The theoretical foundation is established through analytical derivations of bias and Mean Squared Error (MSE) in large-sample settings, and empirical gains are verified by large-scale Monte Carlo experiments using synthetic populations and real-world datasets. The proposed MRCD-driven estimators achieve lower MSE and higher percentage relative efficiency (PRE) than classical estimators. Overall, the findings indicate that the geometric fusion mechanism (MRCD) is a powerful, self-scaling, and statistically sound way of computing central measures when information is incomplete and distorted. Full article
(This article belongs to the Section Mathematical Analysis)
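A minimal sketch of a covariance-based robust central measure in the spirit of the estimators described above, substituting scikit-learn's plain Minimum Covariance Determinant for the regularized MRCD (which scikit-learn does not provide); the contaminated sample and contamination fraction are synthetic assumptions.

```python
# Sketch: robust location estimate from a high-breakdown covariance estimator.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(7)
clean = rng.multivariate_normal([10.0, 5.0], [[2.0, 0.6], [0.6, 1.0]], size=450)
contaminated = np.vstack([clean, rng.uniform(30, 60, size=(50, 2))])   # 10% outliers

mcd = MinCovDet(random_state=0).fit(contaminated)
print("robust center:", np.round(mcd.location_, 2))        # resistant to the outliers
print("classical mean:", np.round(contaminated.mean(axis=0), 2))
```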

19 pages, 428 KB  
Article
Vocational Education and Training in the European Union: A Data-Driven Comparative Analysis
by Alicia Vila, Laura Calvet, Josep Prieto and Angel A. Juan
Information 2025, 16(12), 1037; https://doi.org/10.3390/info16121037 - 27 Nov 2025
Viewed by 908
Abstract
Vocational education and training (VET) is a strategic driver of national education and skills development systems. It covers both Initial VET (IVET), which provides young people with vocational qualifications before they enter the labor market, and Continuing VET (CVET), which supports adults in updating or expanding their skills throughout their working lives. VET provides individuals with essential skills for employment and supports economies in adapting to technological, labor market, and social changes. Within the European Union (EU), VET plays a central role in addressing labor market transformation, the green and digital transitions, the rise of artificial intelligence, and the pursuit of social equity. This paper presents a data-driven analysis of VET in the EU countries. It reviews the relevant literature and outlines the role of Cedefop, the European Centre for the Development of Vocational Training, together with its main VET performance indicators. The analysis draws on publicly available Cedefop data on key VET indicators, filtered for reliability and systematically processed to ensure robust results. This research focuses on a selected set of key indicators covering participation in IVET at upper- and post-secondary levels, adult participation in both formal and non-formal learning, government and enterprise expenditure on training, the gender employment gap, and adult employment rates. These indicators are derived from Cedefop data spanning the period 2010–2024, with coverage varying across indicators. This study applies descriptive analysis to identify outlier countries, correlation analysis to explore relationships between indicators, and cluster analysis to group countries with similar VET profiles. It also compares the largest EU countries using common indicators. The results suggest key patterns, differences, and connections in VET performance across EU countries, offering insights for policy development and future research in VET. Full article
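A minimal sketch of the three analysis steps named above, z-score screening for outlier countries, indicator correlations, and k-means clustering on standardized indicators; the indicator table is a random stand-in for the Cedefop data, and the column names and cluster count are assumptions.

```python
# Sketch: outlier screening, correlation analysis, and clustering of country indicators.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)
countries = [f"C{i:02d}" for i in range(27)]
vet = pd.DataFrame(rng.normal(50, 12, (27, 4)), index=countries,
                   columns=["ivet_participation", "adult_learning",
                            "training_expenditure", "employment_rate"])

z = (vet - vet.mean()) / vet.std()
print("outlier countries:", list(vet.index[(z.abs() > 2).any(axis=1)]))
print(vet.corr().round(2))

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(vet))
print(pd.Series(labels, index=vet.index).sort_values())
```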

20 pages, 2950 KB  
Article
The Role of MER Processing Pipelines for STN Functional Identification During DBS Surgery: A Feature-Based Machine Learning Approach
by Vincenzo Levi, Stefania Coelli, Chiara Gorlini, Federica Forzanini, Sara Rinaldo, Nico Golfrè Andreasi, Luigi Romito, Roberto Eleopra and Anna Maria Bianchi
Bioengineering 2025, 12(12), 1300; https://doi.org/10.3390/bioengineering12121300 - 26 Nov 2025
Viewed by 422
Abstract
Microelectrode recording (MER) is commonly used to validate preoperative targeting during subthalamic nucleus (STN) deep brain stimulation (DBS) surgery for Parkinson’s Disease (PD). Although machine learning (ML) has been used to improve STN localization using MER data, the impact of preprocessing steps on the accuracy of classifiers has received little attention. We evaluated 24 distinct preprocessing pipelines combining four artifact removal strategies, three outlier handling methods, and optional feature normalization. The effect of each preprocessing component of interest was evaluated in terms of the classification performance obtained with three ML models. Artifact rejection methods (i.e., an unsupervised variance-based algorithm (COV) and background noise estimation (BCK)), combined with optimized outlier management (i.e., statistical outlier identification per hemisphere (ORH)), consistently improved classification performance. In contrast, applying hemisphere-specific feature normalization prior to classification led to performance degradation across all metrics. SHAP (SHapley Additive exPlanations) analysis, performed to determine feature importance across pipelines, revealed stable agreement on influential features across diverse preprocessing configurations. In conclusion, optimal artifact rejection and outlier treatment are essential when preprocessing MER for STN identification in DBS, whereas preliminary feature normalization strategies may impair model performance. Overall, the best classification performance was obtained by applying the Random Forest model to the dataset treated using COV artifact rejection and ORH outlier management (accuracy = 0.945). SHAP-based interpretability offers valuable guidance for refining ML pipelines. These insights can inform robust protocol development for MER-guided DBS targeting. Full article
(This article belongs to the Special Issue AI and Data Analysis in Neurological Disease Management)
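A minimal sketch of two pipeline components named above, per-hemisphere statistical outlier handling (in the spirit of ORH) and a Random Forest classifier on the resulting features; the artifact-rejection strategies (COV, BCK) are not reproduced, and the feature table, column names, and clipping threshold are hypothetical.

```python
# Sketch: clip feature outliers separately per hemisphere, then cross-validate a classifier.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(9)
df = pd.DataFrame({
    "hemisphere": rng.choice(["left", "right"], 800),
    "rms": rng.normal(1.0, 0.3, 800),
    "spike_rate": rng.normal(20, 6, 800),
})
df["in_stn"] = (df["rms"] + 0.05 * df["spike_rate"]
                + rng.normal(0, 0.3, 800) > 2.0).astype(int)

def clip_outliers(group, cols=("rms", "spike_rate"), z=3.0):
    """Clip features to mean +/- z*std within a single hemisphere."""
    for c in cols:
        lo = group[c].mean() - z * group[c].std()
        hi = group[c].mean() + z * group[c].std()
        group[c] = group[c].clip(lo, hi)
    return group

df = df.groupby("hemisphere", group_keys=False).apply(clip_outliers)
acc = cross_val_score(RandomForestClassifier(random_state=0),
                      df[["rms", "spike_rate"]], df["in_stn"], cv=5).mean()
print(f"cross-validated accuracy: {acc:.3f}")
```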

16 pages, 2090 KB  
Article
Bidirectional Mendelian Randomization and Multi-Omics Uncover Causal Serum Metabolites and Neuro-Related Mechanistic Pathways in Acute Myeloid Leukemia
by Haohan Ye, Yuanheng Liu, Jun Tang and Xiaoli Li
Int. J. Mol. Sci. 2025, 26(23), 11307; https://doi.org/10.3390/ijms262311307 - 22 Nov 2025
Viewed by 624
Abstract
Acute myeloid leukemia (AML) is a lethal clonal hematopoietic malignancy. Several reports have shown that serum metabolite alterations have been implicated in AML, but the causal relationship and underlying biological mechanisms remain unclear. We performed bidirectional Mendelian randomization (MR) to evaluate the association between 486 serum metabolites and AML. The analytical approaches used to minimize research bias included the inverse variance weighting (IVW), MR-Egger and weighted median (WM) methods. Sensitivity analyses were performed using Cochran’s Q Test, MR-Egger, MR pleiotropy residual sum and outlier (MR-PRESSO), and Leave-one-out (LOO) analysis. Metabolic pathway analysis was conducted using the MetaboAnalyst 6.0 platform. We utilized RNA-seq data to explore the potential genes and mechanisms underlying the regulation of AML occurrence by serum metabolites. We identified 23 serum metabolites (13 known and 10 unknown) significantly associated with AML. Sensitivity analyses further validated the robustness of these associations. No evidence of reverse causality was detected by reverse MR analysis. The core pathways were histidine metabolism and fructose/mannose metabolism. Transcriptomic integration revealed 39 overlapping genes (differentially expressed genes vs. metabolite-associated genes) as key mediators, enriched in neuroactive ligand signaling, synaptic vesicle cycle, and GABAergic synapse (KEGG), plus synapse assembly and calmodulin binding and neuron-to-neuron synapse (GO). This study establishes causal links between specific serum metabolites and AML, revealing neuro-related mechanistic pathways. These findings provide novel biomarkers and therapeutic targets for AML precision medicine. Full article
(This article belongs to the Special Issue 25th Anniversary of IJMS: Updates and Advances in Molecular Oncology)
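A minimal sketch of the inverse-variance-weighted (IVW) estimator named above, combining per-SNP Wald ratios with weights proportional to the inverse variance of the outcome association; the SNP-level summary statistics and the assumed causal effect are synthetic placeholders, not the study's GWAS data.

```python
# Sketch: fixed-effect IVW Mendelian randomization estimate from summary statistics.
import numpy as np

rng = np.random.default_rng(10)
n_snps = 30
beta_exp = rng.normal(0.1, 0.03, n_snps)          # SNP -> metabolite effects
true_effect = 0.4                                 # assumed causal effect for the demo
beta_out = true_effect * beta_exp + rng.normal(0, 0.01, n_snps)   # SNP -> outcome effects
se_out = np.full(n_snps, 0.01)

weights = beta_exp ** 2 / se_out ** 2             # inverse variance of the Wald ratios
ivw = np.sum(weights * (beta_out / beta_exp)) / np.sum(weights)
se_ivw = np.sqrt(1.0 / np.sum(weights))
print(f"IVW causal estimate: {ivw:.3f} (SE {se_ivw:.3f})")
```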

31 pages, 3054 KB  
Article
Outlier Detection in EEG Signals Using Ensemble Classifiers
by Agnieszka Duraj, Natalia Łukasik and Piotr S. Szczepaniak
Appl. Sci. 2025, 15(22), 12343; https://doi.org/10.3390/app152212343 - 20 Nov 2025
Viewed by 397
Abstract
Epilepsy is one of the most prevalent neurological disorders, affecting over 50 million people worldwide. Accurate detection and characterization of epileptic activity are clinically critical, as seizures are associated with substantial morbidity, mortality, and impaired quality of life. Electroencephalography (EEG) remains the gold standard for epilepsy assessment; however, its manual interpretation is time-consuming, subjective, and prone to inter-rater variability, emphasizing the need for automated analytical approaches. This study proposes an automated ensemble classification framework for outlier detection in EEG signals. Three interpretable baseline models—Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and decision tree (DT-CART)—were screened. Ensembles were formed only from base models that satisfied a pre-registered meta-selection rule (F1 on the outlier class > 0.60). Under this criterion, DT-CART did not qualify and was excluded from all ensembles; final ensembles combined SVM and k-NN. The framework was evaluated on two publicly available datasets with distinct acquisition conditions. The Bonn EEG dataset comprises 500 artifact-free single-channel recordings from healthy subjects and epilepsy patients under controlled laboratory settings. In contrast, the Guinea-Bissau and Nigeria Epilepsy (GBNE) dataset contains multi-channel EEG recordings from 97 participants acquired in field conditions using low-cost equipment, reflecting real-world diagnostic challenges such as motion artifacts and signal variability. The ensemble framework substantially improved outlier detection performance, with stacking achieving up to a 95.0% F1-score (accuracy 95.0%) on the Bonn dataset and 85.5% F1-score (accuracy 85.5%) on the GBNE dataset. These findings demonstrate that the proposed approach provides a robust, interpretable, and generalizable solution for EEG analysis, with strong potential to enhance reliable, efficient, and scalable epilepsy detection in both laboratory and resource-limited clinical environments. Full article
(This article belongs to the Special Issue EEG Signal Processing in Medical Diagnosis Applications)
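A minimal sketch of the ensemble construction described above: candidate base models are screened by cross-validated F1 on the outlier class against the 0.60 rule, and the survivors are stacked; the synthetic features and class balance are assumptions, so which candidates pass the rule here need not match the paper's result.

```python
# Sketch: F1-based meta-selection of base models, then a stacking ensemble of the survivors.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           class_sep=2.0, weights=[0.8, 0.2], random_state=0)

candidates = {"svm": SVC(probability=True), "knn": KNeighborsClassifier(),
              "cart": DecisionTreeClassifier(random_state=0)}
selected = []
for name, model in candidates.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1").mean()   # F1 on class 1
    print(f"{name}: outlier-class F1 = {f1:.2f}")
    if f1 > 0.60:                      # pre-registered meta-selection rule
        selected.append((name, model))

stack = StackingClassifier(estimators=selected,
                           final_estimator=LogisticRegression(), cv=5)
acc = cross_val_score(stack, X, y, cv=5).mean()
print(f"stacking accuracy: {acc:.3f}")
```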

15 pages, 763 KB  
Article
Verification of Accuracy of Genomically Enhanced Predicted Transmitting Ability Techniques in Predicting Milk and Fat Production in Holstein Cattle in Taiwan
by Chun-Hsuan Chao and Jen-Wen Shiau
Animals 2025, 15(22), 3334; https://doi.org/10.3390/ani15223334 - 19 Nov 2025
Viewed by 379
Abstract
This study evaluated the predictive performance of genomically enhanced predicted transmitting abilities for milk (gPTAM) and fat yield (gPTAF) in 986 first-lactation Holstein cows from 25 herds in Taiwan. Ordinary least squares and linear mixed models revealed significant positive associations between genomic predictions and observed yields (milk: β = 1.201, R² = 0.469; fat: β = 1.444, R² = 0.507). Incorporating herd and birth-year effects improved model fit and reduced residual variability. Five-fold cross-validation confirmed the robustness of the full mixed model, with predictive R² increasing to 0.293 for milk and 0.363 for fat, distinct from the OLS R² (0.469 and 0.507) representing phenotypic variation explained, indicating moderate predictive ability of genomic PTA values under subtropical production conditions. Model adequacy checks supported appropriate model specification, with only a mild outlier signal in the milk model. Regional analysis revealed a significant genotype-by-environment interaction for PTAF (p = 0.018) but not for PTAM, indicating reduced prediction accuracy in environmentally variable regions and highlighting trait-specific environmental sensitivity. Quartile stratification by gPTA and Net Merit Score demonstrated clear yield gradients, confirming both the predictive and economic value of genomic evaluations under subtropical dairy production systems. Full article
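A minimal sketch of the distinction drawn above between in-sample OLS R² and 5-fold cross-validated predictive R² for a genomic predictor; the herd and birth-year effects of the mixed model are not reproduced, and all records and coefficients below are synthetic assumptions.

```python
# Sketch: in-sample fit versus cross-validated predictive ability for one predictor.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)
gptam = rng.normal(0, 300, 986).reshape(-1, 1)           # illustrative gPTA for milk
milk = 9000 + 1.2 * gptam.ravel() + rng.normal(0, 450, 986)

ols = LinearRegression().fit(gptam, milk)
r2_in_sample = ols.score(gptam, milk)                    # phenotypic variance explained
r2_cv = cross_val_score(LinearRegression(), gptam, milk, cv=5, scoring="r2").mean()
print(f"in-sample R^2: {r2_in_sample:.3f}, 5-fold CV R^2: {r2_cv:.3f}")
```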
