Search Results (368)

Search Parameters:
Keywords = imbalance handling

23 pages, 2318 KB  
Article
Transformer Tokenization Strategies for Network Intrusion Detection: Addressing Class Imbalance Through Architecture Optimization
by Gulnur Aksholak, Agyn Bedelbayev, Raiymbek Magazov and Kaplan Kaplan
Computers 2026, 15(2), 75; https://doi.org/10.3390/computers15020075 (registering DOI) - 1 Feb 2026
Abstract
Network intrusion detection has challenges that fundamentally differ from language and vision tasks typically addressed by Transformer models. In particular, network traffic features lack inherent ordering, datasets are extremely class-imbalanced (with benign traffic often exceeding 80%), and reported accuracies in the literature vary widely (57–95%) without systematic explanation. To address these challenges, we propose a controlled experimental study that isolates and quantifies the impact of tokenization strategies on Transformer-based intrusion detection systems. Specifically, we introduce and compare three tokenization approaches—feature-wise tokenization (78 tokens) based on CICIDS2017, a sample-wise single-token baseline, and an optimized sample-wise tokenization—under identical training and evaluation protocols on a highly imbalanced intrusion detection dataset. We demonstrate that tokenization choice alone accounts for an accuracy gap of 37.43 percentage points, improving performance from 57.09% to 94.52% (100 K data). Furthermore, we show that architectural mechanisms for handling class imbalance—namely Batch Normalization and capped loss weights—yield an additional 15.05% improvement, making them approximately 21× more effective than increasing the training data by 50%. We achieve a macro-average AUC of 0.98, improve minority-class recall by 7–12%, and maintain strong discrimination even for classes with as few as four samples (AUC 0.9811). These results highlight tokenization and imbalance-aware architectural design as primary drivers of performance in Transformer-based intrusion detection and contribute practical guidance for deploying such models in modern network infrastructures, including IoT and cloud environments where extreme class imbalance is inherent. 
This study also presents a practical implementation scheme recommending sample-wise tokenization, constrained class weighting, and Batch Normalization after embedding and classification layers to improve stability and performance in highly unstable table-based IDS problems.
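The capped (constrained) class weighting mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-frequency formula, the function name `capped_class_weights`, the cap value, and the toy label counts are all assumptions chosen for illustration.

```python
from collections import Counter


def capped_class_weights(labels, cap=10.0):
    """Inverse-frequency class weights, clipped at `cap`.

    The uncapped weight for class c is n_samples / (n_classes * count_c),
    a common "balanced" heuristic. The cap (an assumed value) keeps very
    rare classes from dominating the loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: min(n_samples / (n_classes * k), cap) for c, k in counts.items()}


# Hypothetical toy label distribution: 80% benign, two rarer attack classes.
labels = ["benign"] * 80 + ["dos"] * 16 + ["infiltration"] * 4
weights = capped_class_weights(labels, cap=5.0)
```

In this toy case the rarest class would receive an uncapped weight of about 8.3, which the cap reduces to 5.0; the other two weights are unchanged.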

20 pages, 1245 KB  
Review
The Interplay Between Bone Biology and Iron Metabolism: Molecular Mechanisms and Clinical Implications
by Margherita Correnti, Elena Gammella, Gaetano Cairo and Stefania Recalcati
Biomedicines 2026, 14(2), 301; https://doi.org/10.3390/biomedicines14020301 - 29 Jan 2026
Viewed by 254
Abstract
The maintenance of bone homeostasis requires the coordinated activity of specialized cells (osteoblasts, osteoclasts and osteocytes), soluble factors and hormones with regulatory functions. Disruption of this tightly controlled balance contributes to several skeletal pathological conditions, among which osteoporosis is one of the most prevalent. Iron, an essential element for the basic cellular functions of both osteoblasts and osteoclasts, plays a pivotal role in preserving bone homeostasis and skeletal integrity. Both iron deficiency and iron overload impair bone remodeling through distinct but converging mechanisms. Iron deficiency compromises collagen synthesis, alters hypoxia-dependent signaling, and may affect vitamin D metabolism, collectively predisposing the individual to reduced bone mineral density and increased fracture risk. Conversely, excess iron enhances oxidative stress, promotes osteoclastogenesis, and suppresses osteoblast differentiation and function, thereby favoring bone loss, particularly in the aging population and postmenopausal individuals. Hepcidin, the master regulator of systemic iron availability, has emerged as a key modulator of bone turnover, whereas the bone-derived hormone fibroblast growth factor 23 (FGF23) links iron imbalance to phosphate homeostasis, vitamin D metabolism, and inflammation. Beyond metabolic bone diseases, dysregulated iron handling is increasingly recognized as a hallmark of osteosarcoma biology, influencing tumor growth, metabolic reprogramming, and an individual’s susceptibility to ferroptosis. The emerging, albeit only preclinical, evidence of the roles of iron and ferroptosis in osteosarcoma is therefore also covered. This review summarizes the current understanding of the interactions between iron metabolism and bone biology and addresses how an imbalance in iron metabolism may lead to major skeletal disorders. 
Overall, iron homeostasis could represent a potential target for preventing and treating osteoporosis and for improving therapeutic strategies for osteosarcoma.
(This article belongs to the Special Issue The Role of Iron in Human Diseases)

25 pages, 2638 KB  
Article
Toward Personalized ACS Therapy: How Disease Status and Patient Lifestyle Shape the Molecular Signature of Autologous Conditioned Serum
by Christoph Bauer, Daniela Kern, Kalojan Petkin and Stefan Nehrer
J. Clin. Med. 2026, 15(3), 1014; https://doi.org/10.3390/jcm15031014 - 27 Jan 2026
Viewed by 103
Abstract
Background/Objectives: Autologous conditioned serum (ACS) is an intra-articular orthobiologic for osteoarthritis (OA) intended to shift the joint cytokine milieu toward an anti-inflammatory, pro-regenerative profile. In the present study, we compared the molecular composition of ACS (specifically IMPACT® ACS) from OA patients with that of healthy controls and assessed demographic and lifestyle influences on mediator levels. Methods: ACS was prepared from the whole blood of 50 OA patients and 20 healthy controls using the IMPACT® centrifugation system (Plasmaconcept, Cologne, Germany) with glass-bead incubation and standardized handling. Cytokines, growth factors, and matrix metalloproteinases (MMPs) were quantified using multiplex immunoassays and ELISA. To account for demographic imbalances across cohorts, the primary findings were verified using age- and sex-adjusted multiple linear regression models. Results: Pro-inflammatory mediators were minimal in both cohorts, with IL-1β undetectable and IL-6 and TNF-α at very low levels. IL-1 receptor antagonist (IL-1RA) was consistently present. Notably, OA-derived ACS exhibited a catabolic shift compared to controls, characterized by significantly higher MMP-2 and MMP-3 levels. Growth factor profiling showed lower TGF-β1 and TGF-β3 in OA-derived ACS, with TGF-β2 showing no significant difference after adjustment. Exploratory stratified analyses indicated potential differences across sex, BMI, smoking status, and diet for select mediators, though subgroup sizes were limited. Conclusions: ACS prepared with a standardized IMPACT® protocol displays a broad anti-inflammatory profile. However, increased MMPs and isoform-specific differences in TGF-β reflect a disease-associated molecular imprint. Consequently, patient-related heterogeneity supports the need for standardized reporting and motivates further research into stratified ACS therapy.

17 pages, 566 KB  
Article
AE-CTGAN: Autoencoder–Conditional Tabular GAN for Multi-Omics Imbalanced Class Handling and Cancer Outcome Prediction
by Ibrahim Al-Hurani, Sara H. ElFar, Abedalrhman Alkhateeb and Salama Ikki
Algorithms 2026, 19(2), 95; https://doi.org/10.3390/a19020095 - 25 Jan 2026
Viewed by 126
Abstract
The rapid advancement of sequencing technologies has led to the generation of complex multi-omics data, which are often high-dimensional, noisy, and imbalanced, posing significant challenges for traditional machine learning methods. The novelty of this work resides in the architecture-level integration of autoencoders with Generative Adversarial Network (GAN) and Conditional Tabular Generative Adversarial Network (CTGAN) models, where the autoencoder is employed for latent feature extraction and noise reduction, while GAN-based models are used for realistic sample generation and class imbalance mitigation in multi-omics cancer datasets. This study proposes a novel framework that combines an autoencoder for dimensionality reduction and a CTGAN for generating synthetic samples to balance underrepresented classes. The process starts with selecting the most discriminative features, then extracting latent representations for each omic type, merging them, and generating new minority samples. Finally, all samples are used to train a neural network to predict specific cancer outcomes, defined here as clinically relevant biomarkers or patient characteristics. In this work, the considered outcome in the bladder cancer is Tumor Mutational Burden (TMB), while the breast cancer outcome is menopausal status, a key factor in treatment planning. Experimental results show that the proposed model achieves high precision, with an average precision of 0.9929 for TMB prediction in bladder cancer and 0.9748 for menopausal status in breast cancer, and reaches perfect precision (1.000) for the positive class in both cases. In addition, the proposed AE–CTGAN framework consistently outperformed an autoencoder combined with a standard GAN across all evaluation metrics, achieving average accuracies of 0.9929 and 0.9748, recall values of 0.9846 and 0.9777, and F1-scores of 0.9922 for bladder and breast cancer datasets, respectively. 
A comparative fidelity analysis in the latent space further demonstrated the superiority of CTGAN, reducing the average Euclidean distance between real and synthetic samples by approximately 72% for bladder cancer and by up to 84% for breast cancer compared to a standard GAN. These findings confirm that CTGAN generates high-fidelity synthetic samples that preserve the structural characteristics of real multi-omics data, leading to more reliable class balancing and improved predictive performance. Overall, the proposed framework provides an effective and robust solution for handling class imbalance in multi-omics cancer data and enhances the accuracy of clinically relevant outcome prediction.

53 pages, 3615 KB  
Review
Progress in Aero-Engine Fault Signal Recognition and Intelligent Diagnosis
by Shunming Li, Wenbei Shi, Jiantao Lu, Haibo Zhang, Yanfeng Wang, Peng Zhang, Mengqi Feng and Yan Wang
Machines 2026, 14(1), 118; https://doi.org/10.3390/machines14010118 - 19 Jan 2026
Viewed by 190
Abstract
Accurate diagnosis of aero-engine faults and precise signal characterization are crucial to ensuring operational reliability and service life prediction. The structural complexity of engines and the variability of operating conditions pose significant challenges for fault diagnosis and identification. After establishing the critical importance of aero-engine fault signal recognition and diagnosis, this paper comprehensively reviews and discusses the classification and evolution of aero-engine fault signal recognition techniques. The review traces this evolution along its developmental trajectory, from classical methods to emerging approaches such as quantum signal processing for weak feature extraction. It also examines characteristics of different types of aviation engine failures and the progression of diagnostic research over time. This review provides multiple tables to compare the applicability, advantages, and limitations of various signal recognition methods and deep learning diagnostic architectures. Detailed discussions synthesize the relative merits of different approaches and their selection trade-offs. Building on this overview, the paper addresses the complexity of real aero-engine faults and highlights key research areas, including handling data imbalance, adapting to variable and cross-domain conditions, and advancing diagnostic and data enhancement methods for weak composite faults. Finally, the paper analyzes the multifaceted challenges in the field and identifies future trends in aero-engine fault signal recognition and intelligent diagnosis.

26 pages, 3132 KB  
Article
An Unsupervised Cloud-Centric Intrusion Diagnosis Framework Using Autoencoder and Density-Based Learning
by Suresh K. S, Thenmozhi Elumalai, Radhakrishnan Rajamani, Anubhav Kumar, Balamurugan Balusamy, Sumendra Yogarayan and Kaliyaperumal Prabu
Future Internet 2026, 18(1), 54; https://doi.org/10.3390/fi18010054 - 19 Jan 2026
Viewed by 138
Abstract
Cloud computing environments generate high-dimensional, large-scale, and highly dynamic network traffic, making intrusion diagnosis challenging due to evolving attack patterns, severe traffic imbalance, and limited availability of labeled data. To address these challenges, this study presents an unsupervised, cloud-centric intrusion diagnosis framework that integrates autoencoder-based representation learning with density-based attack categorization. A dual-stage autoencoder is trained exclusively on benign traffic to learn compact latent representations and to identify anomalous flows using reconstruction-error analysis, enabling effective anomaly detection without prior attack labels. The detected anomalies are subsequently grouped using density-based learning to uncover latent attack structures and support fine-grained multiclass intrusion diagnosis under varying attack densities. Experiments conducted on the large-scale CSE-CIC-IDS2018 dataset demonstrate that the proposed framework achieves an anomaly detection accuracy of 99.46%, with high recall and low false-negative rates in the optimal latent-space configuration. The density-based classification stage achieves an overall multiclass attack classification accuracy of 98.79%, effectively handling both majority and minority attack categories. Clustering quality evaluation reports a Silhouette Score of 0.9857 and a Davies–Bouldin Index of 0.0091, indicating strong cluster compactness and separability. Comparative analysis against representative supervised and unsupervised baselines confirms the framework’s scalability and robustness under highly imbalanced cloud traffic, highlighting its suitability for future Internet cloud security ecosystems.
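The reconstruction-error step described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the autoencoder itself is omitted and its per-flow reconstruction errors are taken as given, the mean-plus-k-standard-deviations threshold rule is a common heuristic rather than the rule reported by the authors, and the function names and numbers are hypothetical.

```python
import statistics


def fit_threshold(benign_errors, k=3.0):
    """Derive an anomaly threshold from reconstruction errors on benign traffic.

    Uses mean + k * standard deviation; both the rule and k are assumptions.
    """
    mu = statistics.fmean(benign_errors)
    sigma = statistics.pstdev(benign_errors)
    return mu + k * sigma


def flag_anomalies(errors, threshold):
    """Mark a flow as anomalous when its reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]


# Hypothetical reconstruction errors from a model trained only on benign flows.
benign_errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010]
threshold = fit_threshold(benign_errors)

# New flows: the second one reconstructs poorly and is flagged.
flags = flag_anomalies([0.011, 0.250, 0.012], threshold)
```

Flows flagged this way would then be passed to a density-based clustering stage, which is not shown here.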
(This article belongs to the Special Issue Cloud and Edge Computing for the Next-Generation Networks)

14 pages, 792 KB  
Entry
Legislative Cost Estimation Systems in South Korea and the U.S.
by Joochul Yoon and Hyungjo Hur
Encyclopedia 2026, 6(1), 21; https://doi.org/10.3390/encyclopedia6010021 - 19 Jan 2026
Viewed by 142
Definition
This entry aims to enhance the legislative cost estimation system in South Korea by conducting a comparative analysis with the Congressional Budget Office (CBO) in the United States. The analysis reveals significant structural divergences between the two systems. First, the US system operates under binding fiscal rules like PAYGO, whereas the South Korean system functions primarily as an informational tool. Second, a severe workload imbalance exists; South Korean analysts at the National Assembly Budget Office (NABO) handle approximately 12.7 times more estimates annually than their US counterparts, placing a substantial burden on personnel. Third, unlike the US, South Korea lacks institutional mechanisms to alleviate this workload or enforce the utilization of cost estimates. The findings suggest that expanding NABO’s analytical workforce and institutionalizing the linkage between cost estimates and legislative decision-making are essential for improving fiscal efficiency.
(This article belongs to the Section Social Sciences)

18 pages, 977 KB  
Article
BI-GBDT: A Graph-Free Behavioral Interaction-Aware Gradient Boosting Framework for Fraud Detection in Large-Scale Payment Systems
by Mustafa Berk Keles and Mehmet Gokturk
Appl. Sci. 2026, 16(2), 876; https://doi.org/10.3390/app16020876 - 14 Jan 2026
Viewed by 165
Abstract
Detecting fraudulent and anomalous transactions in large-scale digital payment systems is significantly challenging due to severe class imbalance and the fact that transactional risk is tightly coupled to the historical interactions and behaviors of transacting parties. In this study, a scalable Behavioral Interaction-Aware Gradient Boosting (BI-GBDT) framework is proposed for anomaly detection in tabular transaction data to overcome these challenges. The methodology models sending and receiving behaviors separately through direction-specific clustering based on transaction frequency and amount. Each transaction is characterized by cluster-pair prevalence ratios, which capture the population-level prevalence of sender–receiver interaction patterns. To handle extreme class imbalance, all transactions are clustered, and a cluster-level risk score is computed as the ratio of anomalous transactions to the total number of transactions within each cluster. This score is incorporated as a feature, serving as a behavioral risk prior that highlights concentrated anomalies. These interaction-aware features are integrated into a GBDT in a big data environment. Experiments were conducted on a large masked real-world payment dataset spanning six months and containing more than 456 million transactions, with the prediction task defined as binary classification between fraudulent and non-fraudulent transactions. Unlike standard GBDT models trained only on transactional attributes and graph-based approaches, BI-GBDT captures sender–receiver interaction patterns in a graph-free manner and outperforms a baseline GBDT, reducing the false positive rate from 37.0% to 4.3%, increasing recall from 52.3% to 72.0%, and improving accuracy from 63.0% to 95.7%.
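The cluster-level risk score in this abstract is defined as the ratio of anomalous transactions to all transactions within a cluster. A minimal sketch of that ratio follows; the function name, the toy cluster assignments, and the labels are hypothetical, and the clustering step that produces the cluster IDs is assumed to have already run.

```python
from collections import defaultdict


def cluster_risk_scores(cluster_ids, is_anomalous):
    """Return {cluster_id: anomalous_count / total_count} for each cluster."""
    totals = defaultdict(int)
    anomalies = defaultdict(int)
    for cid, flag in zip(cluster_ids, is_anomalous):
        totals[cid] += 1
        anomalies[cid] += int(flag)
    return {cid: anomalies[cid] / totals[cid] for cid in totals}


# Hypothetical cluster assignments and anomaly labels for ten transactions.
cluster_ids = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
is_anomalous = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]

scores = cluster_risk_scores(cluster_ids, is_anomalous)

# Attach each transaction's cluster score as an extra model feature.
risk_feature = [scores[cid] for cid in cluster_ids]
```

Because the score is built from labels, a general precaution with label-derived features is to compute it on training data only and then map it onto validation and test transactions, so that label information does not leak into evaluation.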
(This article belongs to the Special Issue Machine Learning and Its Application for Anomaly Detection)

39 pages, 3907 KB  
Article
RoadMark-cGAN: Generative Conditional Learning to Directly Map Road Marking Lines from Aerial Orthophotos via Image-to-Image Translation
by Calimanut-Ionut Cira, Naoto Yokoya, Miguel-Ángel Manso-Callejo, Ramon Alcarria, Clifford Broni-Bediako, Junshi Xia and Borja Bordel
Electronics 2026, 15(1), 224; https://doi.org/10.3390/electronics15010224 - 3 Jan 2026
Viewed by 355
Abstract
Road marking lines can be extracted from aerial images using semantic segmentation (SS) models; however, in this work, a conditional generative adversarial network, RoadMark-cGAN, is proposed for direct extraction of these representations with image-to-image translation techniques. The generator features residual and attention blocks added in a functional bottleneck, while the discriminator features a modified PatchGAN, with an optimized encoder and an attention block added. The proposed model is improved in three versions (v2 to v4), in which dynamic dropout techniques and a novel “Morphological Boundary-Sensitive Class-Balanced” (MBSCB) loss are progressively added to better handle the high class imbalance present in the data. All models were trained on a novel “RoadMarking-binary” dataset (29,405 RGB orthoimage tiles of 256 × 256 pixels and their corresponding ground truth masks) to learn the distribution of road marking lines found on pavement. Quantitative evaluation on the test set containing 2045 unseen images showed that the best proposed model achieved average improvements of 45.2% and 1.7% in the Intersection-over-Union (IoU) score for the positive, underrepresented class when compared to the best Pix2Pix and SS models, respectively, trained for the same task. Finally, a qualitative, visual comparison was conducted to assess the quality of the road marking predictions of the best models and their mapping performance.
Show Figures

Figure 1

43 pages, 554 KB  
Review
A Survey of Six Classical Classifiers, Including Algorithms, Methodological Characteristics, Foundational Variants, and Recent Advances
by Ali Hussein Alshammari, Gergely Bencsik and Almashhadani Hasnain Ali
Algorithms 2026, 19(1), 37; https://doi.org/10.3390/a19010037 - 1 Jan 2026
Viewed by 634
Abstract
Classification is a core supervised learning task in data analysis, and six classical classifier families (k-Nearest Neighbors, Support Vector Machine, Decision Tree, Random Forest, Logistic Regression, and Naïve Bayes) remain widely used in practice and underpin many subsequent variants. Although both single-family and multi-classifier surveys exist, there is still a gap for a method-centered study that, within a coherent framework, combines algorithmic representations for training and prediction, methodological characteristics, an explicit methodological comparison of the foundational variants within each family, and method-oriented advances published between 2020 and 2025. The survey is organized around a fixed set of performance-related perspectives, including accuracy, hyperparameter tuning, scalability, class imbalance, behavior in high-dimensional settings, decision-boundary complexity, interpretability, computational efficiency, and multiclass handling. It highlights strengths, weaknesses, and trade-offs across the six families and their variants, helping researchers and practitioners select or extend classification approaches. It also outlines future research directions arising from the limitations across the examined methods.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))

22 pages, 932 KB  
Review
Absorption of Energy in Excess, Photoinhibition, Transpiration, and Foliar Heat Emission Feedback Loops During Global Warming
by Roshanak Zarrin Ghalami, Maria Duszyn and Stanisław Karpiński
Cells 2026, 15(1), 75; https://doi.org/10.3390/cells15010075 - 1 Jan 2026
Viewed by 555
Abstract
Global warming is increasingly constraining plant productivity by altering the photosynthetic energy balance and leaf thermoregulation. Under high light and elevated temperatures, absorption of energy in excess (AEE) by photosystem II disrupts photosynthetic electron transport, oxygen evolution, and CO2 assimilation, often accompanied by reduced foliar transpiration. These conditions promote photoinhibition, as reflected by a decrease in maximal photosynthetic efficiency (Fv/Fm), an increase in non-photochemical quenching (NPQ), and photooxidative stress associated with enhanced reactive oxygen species (ROS) production. In addition to environmental heat stress, AEE influences foliar temperature through internal energy partitioning, including regulated dissipation of AEE as heat and changes in transpirational cooling. The relative contributions of NPQ, photochemistry, and transpiration to leaf temperature regulation are strongly context dependent and vary with light intensity, temperature changes, and water availability. Under global warming, rising background temperatures and increased vapor pressure deficit may constrain transpirational cooling and alter the balance between non-photochemical and photochemical energy dissipation and usage, respectively. In this review, we synthesize current knowledge on AEE handling, photoinhibition, NPQ and other quenching processes, and on transpiration cooling, and discuss a conceptual framework in which sustained imbalance among these processes under global warming conditions could amplify foliar heat stress and increase the risk of cellular damage. Rather than proposing new physiological mechanisms, this work integrates existing evidence across molecular, leaf, and ecosystem scales to highlight potential feedbacks relevant to plant performance under future climate prediction scenarios.
(This article belongs to the Special Issue Plant Stress and Acclimation Responses During Global Warming)

32 pages, 2805 KB  
Article
Geologically Constrained Multi-Scale Transformer for Lithology Identification Under Extreme Class Imbalance
by Xiao Li, Puhong Feng, Baohua Yu, Chun-Ping Li, Junbo Liu and Jie Zhao
Eng 2026, 7(1), 8; https://doi.org/10.3390/eng7010008 - 25 Dec 2025
Viewed by 262
Abstract
Accurate identification of lithology is considered very important in oil and gas exploration because it has a direct impact on the evaluation and development planning of any reservoir. In complex reservoirs where extreme class imbalance occurs, as critical minority lithologies cover less than 5%, the identification accuracy is severely constrained. Recent deep learning methods include convolutional neural networks, recurrent architectures, and transformer-based models that have achieved substantial improvements over traditional machine learning approaches in identifying lithology. These methods demonstrate great performance in catching spatial patterns and sequential dependencies from well log data, and they show great recognition accuracy, up to 85–88%, in the case of a moderate imbalance scenario. However, when these methods are extended to complex reservoirs under extreme class imbalance, the following three major limitations have been identified: (1) single-scale architectures, such as CNNs or standard Transformers, cannot capture thin-layer details less than 0.5 m and regional geological trends larger than 2 m simultaneously; (2) generic imbalance handling techniques, including focal loss alone or basic SMOTE, prove to be insufficient for extreme ratios larger than 50:1; and (3) conventional Transformers lack depth-dependent attention mechanisms incorporating stratigraphic continuity principles. This paper is dedicated to proposing a geological-constrained multi-scale Transformer framework tailored for 1D well-log sequences under extreme imbalance larger than 50:1. The systematic approach addresses the extreme imbalance by deep-feature fusion and advanced class-rebalancing strategies. 
Accordingly, this framework integrates multi-scale convolutional feature extraction using 1 × 3, 1 × 5, 1 × 7 kernels, hierarchical attention mechanisms with depth-aware position encoding based on Walther’s Law to model long-range dependencies, and adaptive three-stage class-rebalancing through SMOTE–Tomek hybrid resampling, focal loss, and CReST self-training. The experimental validation based on 32,847 logging samples demonstrates significant improvements: overall accuracy reaches 90.3% with minority class F1 scores improving by 20–25 percentage points (argillaceous siltstone 73.5%, calcareous sandstone 68.2%, coal seams 65.8%), and a G-mean of 0.804 confirming the balanced recognition. Of note, the framework maintains stable performance even under extreme class imbalance at a ratio of up to 100:1, with minority class F1 scores above 64%, representing a two-fold improvement over the state-of-the-art methods, whereas earlier Transformer-based approaches degrade. This paper provides the fundamental technical development for the intelligent transformation of oil and gas exploration, with extensive application prospects.
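Two quantities named in this abstract, focal loss and G-mean, have standard textbook definitions that a short sketch can make concrete. The code below is not the authors' implementation: it uses the usual single-sample focal loss, -alpha * (1 - p_t)^gamma * log(p_t), and the multiclass G-mean taken as the geometric mean of per-class recalls. The default `gamma` and `alpha` values and the toy labels are assumptions.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls; zero if any class is never recovered."""
    recalls = []
    for c in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support)
    return math.prod(recalls) ** (1.0 / len(recalls))


# Hypothetical toy predictions: the minority class "coal" is half recovered.
y_true = ["sand"] * 8 + ["coal"] * 2
y_pred = ["sand"] * 8 + ["coal", "sand"]
score = g_mean(y_true, y_pred)

# A confidently correct sample contributes far less loss than a badly wrong one.
easy = focal_loss(0.9)
hard = focal_loss(0.1)
```

Plain accuracy on this toy example is 90%, yet the G-mean is only about 0.71 because the minority-class recall is 0.5, which is why the metric is often reported for imbalanced problems.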
(This article belongs to the Section Chemical, Civil and Environmental Engineering)
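The abstract above names focal loss as one rebalancing stage and reports G-mean as its balance metric. As a rough illustration of those two quantities only (not the authors' implementation), the sketch below uses the standard single-sample focal loss and the multi-class G-mean taken as the geometric mean of per-class recalls. The function names, the default `gamma` and `alpha` values, and the toy labels are assumptions made for this example.

```python
import math
from collections import defaultdict


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample, given the predicted probability of its true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def g_mean(y_true, y_pred) -> float:
    """Geometric mean of per-class recalls (assumes y_true is non-empty)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy labels (hypothetical): class 1 is the minority and is half missed.
print(g_mean([0, 0, 1, 1], [0, 0, 1, 0]))
# A confidently correct sample is down-weighted relative to cross-entropy.
print(focal_loss(0.9, gamma=2.0), focal_loss(0.9, gamma=0.0))
```

Because G-mean multiplies recalls, a single class with zero recall drives the score to zero, which is why it is often preferred over overall accuracy when minority classes matter.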

28 pages, 765 KB  
Systematic Review
Radiomic-Based Machine Learning Classifiers for HPV Status Prediction in Oropharyngeal Cancer: A Systematic Review and Meta-Analysis
by Anna Luíza Damaceno Araújo, Luiz Paulo Kowalski, Alan Roger Santos-Silva, Brendo Vinícius Rodrigues Louredo, Cristina Saldivia-Siracusa, Otávio Augusto A. M. de Melo, Deivid Cabral, Andrés Coca-Pelaz, Orlando Guntinas-Lichius, Remco de Bree, Pawel Golusinski, Karthik N. Rao, Robert P. Takes, Nabil F. Saba and Alfio Ferlito
Diagnostics 2026, 16(1), 68; https://doi.org/10.3390/diagnostics16010068 - 24 Dec 2025
Viewed by 573
Abstract
Background: The aim of the present systematic review (SR) is to compile evidence regarding the use of radiomic-based machine learning (ML) models for predicting human papillomavirus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC) patients and to assess their reliability, methodological frameworks, and clinical applicability. The SR was conducted following PRISMA 2020 guidelines and registered in PROSPERO (CRD42025640065). Methods: Using the PICOS framework, the review question was defined as follows: “Can radiomic-based ML models accurately predict HPV status in OPSCC?” Electronic databases (Cochrane, Embase, IEEE Xplore, BVS, PubMed, Scopus, Web of Science) and gray literature (arXiv, Google Scholar and ProQuest) were searched. Retrospective cohort studies assessing radiomics for HPV prediction were included. Risk of bias (RoB) was evaluated using Prediction model Risk Of Bias ASsessment Tool (PROBAST), and data were synthesized based on imaging modality, architecture type/learning modalities, and the presence of external validation. Meta-analysis was performed for externally validated models using MetaBayesDTA and RStudio. Results: Twenty-four studies including 8627 patients were analyzed. Imaging modalities included computed tomography (CT), magnetic resonance imaging (MRI), contrast-enhanced computed tomography (CE-CT), and 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET). Logistic regression, random forest, eXtreme Gradient Boosting (XGBoost), and convolutional neural networks (CNNs) were commonly used. Most datasets were imbalanced with a predominance of HPV+ cases. Only eight studies reported external validation results. AUROC values ranged between 0.59 and 0.87 in the internal validation and between 0.48 and 0.91 in the external validation results. RoB was high in most studies, mainly due to reliance on p16-only HPV testing, insufficient events, or inadequate handling of class imbalance. 
Deep learning (DL) models achieved moderate performance with considerable heterogeneity (sensitivity: 0.61; specificity: 0.65). In contrast, traditional models provided higher, more consistent performance (sensitivity: 0.72; specificity: 0.77). Conclusions: Radiomic-based ML models show potential for HPV status prediction in OPSCC, but methodological heterogeneity and a high RoB limit current clinical applicability.
(This article belongs to the Special Issue Clinical Diagnosis of Otorhinolaryngology)
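The sensitivity and specificity figures quoted in the abstract above are ratios taken from a binary confusion matrix. A minimal sketch of that arithmetic follows; the function name and the counts in the usage line are hypothetical and are not drawn from the review.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from binary confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of HPV-positive cases detected.
    specificity = TN / (TN + FP): share of HPV-negative cases correctly ruled out.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical, imbalanced counts: 100 positives and 40 negatives.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=30, fp=10)
print(sens, spec)
```

Reporting both values matters on imbalanced cohorts, because a classifier that labels nearly every case as the majority class can still score high accuracy while one of the two ratios collapses.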

20 pages, 2966 KB  
Article
EMAFG-RTDETR: An Improved RTDETR Algorithm for UAV-Based Concrete Defect Detection
by Jinlong Yang, Shaojiang Dong, Jun Luo, Shizheng Sun, Jiayuan Luo, Kaibo Yan, Cai Chen and Xin Zhou
Drones 2026, 10(1), 6; https://doi.org/10.3390/drones10010006 - 23 Dec 2025
Viewed by 462
Abstract
To address the challenges of varying concrete defect scales, class imbalance, and hardware limitations, we propose EMAFG-RTDETR, a UAV-based concrete defect detection algorithm built upon RTDETR. In the feature extraction stage, a lightweight multi-scale attention feature extraction module (EMA-PRepFaster block) is designed, in which PConv and RepConv are fused to improve the FasterNet block. An Efficient Multi-scale Attention (EMA) module is also introduced to enhance spatial feature extraction while reducing computational redundancy. For feature fusion, the Gather-and-Distribute mechanism of GOLD-YOLO is adopted to improve the fusion of multi-scale features. Powerful-IoU v2 is introduced, which both accelerates training and enhances the model’s ability to capture defects of different sizes. To handle sample imbalance, a novel classification loss function, EMASVLoss, is proposed. It adjusts classification loss values through piecewise weighting and integrates an exponential moving average mechanism for dynamic weight smoothing, improving model adaptability. Finally, the algorithm was deployed and validated on an octocopter UAV developed by our team. Experimental results demonstrate that EMAFG-RTDETR achieves a 2.5% improvement in mean Average Precision (mAP@0.5), reaching 90% on the concrete defect dataset, with reductions in both parameter size and computational cost. Moreover, the UAV equipped with the proposed algorithm can accurately detect cracks and spalling defects on concrete surfaces, validating the effectiveness of the improved model.
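The abstract does not give the form of EMASVLoss, so the sketch below shows only the generic idea of smoothing class weights with an exponential moving average. The inverse-frequency weighting, the `decay` value, the function names, and the batch counts are all assumptions chosen for illustration; this is not the authors' loss.

```python
def inverse_frequency_weights(counts):
    """Per-class weights proportional to 1 / frequency (a generic baseline)."""
    total = sum(counts)
    k = len(counts)
    return [total / (k * max(c, 1)) for c in counts]  # max() guards empty classes


def ema_update(previous, current, decay=0.9):
    """Blend new weights into the running weights: w = decay * prev + (1 - decay) * cur."""
    return [decay * p + (1.0 - decay) * c for p, c in zip(previous, current)]


# Hypothetical per-batch class counts for two classes (e.g. crack, spalling).
running = [1.0, 1.0]
for batch_counts in ([90, 10], [70, 30], [95, 5]):
    running = ema_update(running, inverse_frequency_weights(batch_counts))
print(running)
```

The smoothing keeps a single unusually skewed batch from abruptly changing the loss weights, which is the general motivation for applying a moving average to dynamic weighting.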

19 pages, 4215 KB  
Article
Modeling and Evaluation of Reversible Traction Substations in DC Railway Systems: A Real-Time Simulation Platform Toward a Digital Twin
by Dario Zaninelli, Hamed Jafari Kaleybar and Morris Brenna
Appl. Sci. 2026, 16(1), 80; https://doi.org/10.3390/app16010080 - 21 Dec 2025
Viewed by 308
Abstract
Traditional diode-based rectifiers (TDRs) in railway traction substations (TSSs) are inefficient at handling bidirectional power flow and cannot recover regenerative braking energy (RBE). Replacing these conventional systems with reversible traction substations (RTSSs) requires detailed modeling, extensive simulations, and validation using real data. This paper presents a digital twin (DT)-oriented real-time modeling and Hardware-in-the-Loop (HIL) platform for the analysis and performance assessment of RTSSs in DC railway systems. The integration of interleaved PWM rectifiers enables bidirectional power flow, allowing efficient RBE recovery and its return to the main grid. Modeling railway networks with moving trains is complex due to nonlinear dynamics arising from continuously varying positions, speeds, and accelerations. The proposed approach introduces an innovative multi-train simulation method combined with low-level transient and power-quality analysis. The validated DT model, supported by HIL emulation using OPAL-RT, accurately reproduces real-world system behavior, enabling optimal component sizing and evaluation of key performance indicators such as voltage ripple, total harmonic distortion, passive-component stress, and current imbalance. The results demonstrate improved energy efficiency, enhanced system design, and reduced operational costs. In addition, experimental validation on a small-scale RTSS prototype, based on data from the Italian 3 kV DC railway system, confirms the accuracy and applicability of the proposed DT-oriented framework.
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
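Total harmonic distortion, one of the performance indicators listed in the abstract above, is conventionally defined as the RMS of the higher harmonics divided by the RMS of the fundamental. The sketch below applies that textbook definition to a list of harmonic magnitudes. The function name and the sample values are hypothetical, and the paper's own measurement procedure is not described in the abstract.

```python
import math


def total_harmonic_distortion(harmonic_rms):
    """THD = sqrt(V2^2 + V3^2 + ...) / V1.

    harmonic_rms[0] is the RMS of the fundamental; the remaining entries are
    the RMS values of the 2nd, 3rd, ... harmonics.
    """
    fundamental, *higher = harmonic_rms
    if fundamental <= 0:
        raise ValueError("fundamental RMS must be positive")
    return math.sqrt(sum(v * v for v in higher)) / fundamental


# Hypothetical magnitudes: fundamental of 10 with harmonics of 3 and 4.
print(total_harmonic_distortion([10.0, 3.0, 4.0]))
```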