Search Results (2,328)

Search Parameters:
Keywords = dataset bias

33 pages, 43253 KB  
Article
Multi-Domain Interference-Suppressed DETR for SAR Object Detection
by Zhibin Zhang, Ruihui Peng, Dianxing Sun, Shuncheng Tan and Zhaozheng Wei
Remote Sens. 2026, 18(13), 2076; https://doi.org/10.3390/rs18132076 (registering DOI) - 24 Jun 2026
Abstract
Synthetic aperture radar (SAR) object detection has long been affected by spatial speckle interference, spectral energy imbalance, and structural bias in cross-scale feature fusion. In this article, we propose the Multi-Domain Interference-Suppressed Detection Transformer (MDIS-DETR), a unified multi-domain interference-suppressed detection framework built on the Real-Time Detection Transformer (RT-DETR) architecture. Specifically, spatial-domain interference is suppressed by learnable fusion of complementary denoising responses at the input stage. Furthermore, frequency-domain interference is suppressed by polarization-guided attention together with adaptive frequency refinement within the encoder. In addition, structural-domain interference is suppressed by non-sequential cross-scale interaction to enhance multi-scale consistency. Extensive experiments on multiple SAR benchmarks demonstrate that MDIS-DETR establishes state-of-the-art (SOTA) performance across datasets. Notably, on SARDet-100K, currently the largest SAR detection dataset with a scale comparable to the Common Objects in Context (COCO) dataset, it achieves 58.82% mAP, surpassing the RT-DETR baseline by 4.58%.
(This article belongs to the Section Remote Sensing Image Processing)

29 pages, 1861 KB  
Article
Physics-Supported Linear and Nonlinear Dimensionality Reduction for Supervised Adaptive Channel Selection in Hybrid RF-FSO-THz Communication Systems
by Luis Miguel Pires and Vitor Fialho
Electronics 2026, 15(13), 2778; https://doi.org/10.3390/electronics15132778 (registering DOI) - 24 Jun 2026
Abstract
Hybrid RF-FSO-THz communication systems are promising candidates for future Internet of Things (IoT) and 6G networks because they combine the robustness of radio frequency links, the high-capacity potential of Free-Space Optical communications, and the ultra-wideband capabilities of terahertz transmission. Adaptive channel selection in such systems depends on multiple correlated environmental and physical-layer variables, including distance, rain intensity, humidity, visibility, turbulence strength, signal-to-noise ratio, channel capacity, and energy-efficiency metrics. This paper presents a physics-supported benchmark framework for supervised adaptive channel selection in hybrid RF-FSO-THz systems and systematically investigates the impact of linear and nonlinear dimensionality-reduction techniques on predictive performance, statistical robustness, computational complexity, and physical interpretability. A multi-scenario dataset comprising 5000 samples was generated using calibrated RF, FSO, and THz propagation models under clear, rain, fog, and worst-case environmental conditions. Principal Component Analysis (PCA) and Kernel PCA were evaluated together with Random Forest, Support Vector Machines (SVMs), XGBoost, Gradient Boosting (GB), Multi-Layer Perceptron (MLP), Logistic Regression, and Decision Trees. The results demonstrate that PCA preserves nearly all predictive capabilities while reducing the original 33-dimensional feature space by approximately 81.8%, maintaining accuracies close to 97–98% with the best-performing classifiers. Statistical significance analysis confirms that PCA introduces only modest degradations, whereas Kernel PCA consistently reduces the predictive performance while increasing memory requirements and inference latency. 
Additional environmental-only validation experiments indicate that adaptive channel selection remains highly learnable even when only pre-selection environmental descriptors are available, partially mitigating concerns regarding self-consistency bias. Overall, the results suggest that PCA provides an advantageous compromise among predictive accuracy, computational efficiency, statistical robustness, and physical interpretability for supervised adaptive channel selection in physics-supported hybrid wireless communication systems.
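As a rough check of the dimensionality figure in the abstract above: cutting a 33-dimensional feature space by about 81.8% corresponds to keeping 6 components, since 1 − 6/33 ≈ 0.818. The component count is an inference from the stated percentage, not a number given in the abstract. The sketch below runs PCA via NumPy's SVD on synthetic data; the helper name `pca_project` and the random data are illustrative assumptions, not the authors' pipeline or dataset.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                        # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T                # scores on leading components

# Synthetic stand-in with the same shape as described: 5000 samples, 33 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 33))

n_keep = 6                                         # assumed from the 81.8% figure
Z = pca_project(X, n_keep)
reduction = 1 - n_keep / X.shape[1]                # fraction of dimensions removed
print(f"{X.shape[1]} -> {Z.shape[1]} features ({100 * reduction:.1f}% reduction)")
```

Any downstream classifier would then be trained on `Z` rather than `X`; which classifier and which preprocessing the authors used is not specified beyond the list in the abstract.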
23 pages, 12628 KB  
Article
Bioinformatics-Based Data Mining of GenBank and Diversity Patterns of Soil Fungal Sequences
by Željko Savković, Miloš Stupar, Andrija Finka, Slaven Zjalić and Jelena Lončar
Forests 2026, 17(7), 731; https://doi.org/10.3390/f17070731 (registering DOI) - 24 Jun 2026
Abstract
Soil fungi are key drivers of terrestrial ecosystem functioning, contributing to organic matter decomposition, nutrient cycling, and plant–microorganism interactions. Despite their importance, the global distribution and structural biases of public sequence records for soil fungi remain incompletely characterized. In this study, we analyzed soil-associated fungal DNA sequences retrieved from the NCBI GenBank database using a custom R-based bioinformatics pipeline. Following filtering and metadata standardization, 544,554 filtered sequence records were obtained. The taxonomic composition of the dataset consisted primarily of Ascomycota (69.62%), followed by Basidiomycota, Glomeromycota, and Mucoromycota, with Trichoderma, Penicillium, and Aspergillus representing the most frequent genera. The geographic distribution revealed strong sampling bias, with China and the United States accounting for over one-third of all records. Ecological metadata indicated that rhizospheric and forest soils were the most common sources of the deposited sequences. At the same time, gene marker analyses confirmed the widespread use of the ITS region as the primary fungal barcode. Sequence diversity analyses revealed continental variation, with Europe and Asia showing higher medians, while the ordination highlighted clustering of sequence profiles, particularly among records from extreme environments. This study demonstrates the potential of public sequence databases for large-scale biodiversity assessments while highlighting the influence of sampling bias and the limitations of metadata.

30 pages, 3927 KB  
Systematic Review
Current Trends in AI Gait Analysis for the Detection and Assessment of Parkinson’s Disease Severity: Systematic Review and Meta-Analysis of Performance Using Logit Transformation
by Philippe Gorce and Julien Jacquier-Bret
Healthcare 2026, 14(13), 1820; https://doi.org/10.3390/healthcare14131820 (registering DOI) - 23 Jun 2026
Abstract
Background/Objectives: Artificial intelligence (AI) offers a promising approach for detecting and classifying symptom severity in patients with Parkinson’s disease (PD). The objective was to provide an overview of the performance of AI methods used for this classification through a systematic review and meta-analysis conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Methods: The Google Scholar, IEEE Xplore, PubMed/MedLine, and ScienceDirect databases were searched for the period 2015–2025. The studies included were original, peer-reviewed studies written in English that addressed an AI method based on machine learning (ML) or deep learning (DL) for the classification of PD patients. The dataset used had to be “Gait in Parkinson’s Disease,” in which the severity of disease symptoms was assessed using the Hoehn and Yahr (H&Y) scale. Studies had to report at least one of the five performance metrics: accuracy, sensitivity, specificity, precision, and F1 score. Two reviewers independently selected articles, assessed the risk of bias using PROBAST (Prediction Model Study Risk of Bias Assessment Tool), and extracted data. The logit-transformed values were pooled separately by performance metric and by severity level using a random-effects model. Cochran’s Q test, the I² statistic, and inter-study variability (τ²), computed using the generalized inverse variance method with the restricted maximum likelihood model, were used to assess heterogeneity. Forest plots with 95% confidence intervals were used to present the results. Possible causes of heterogeneity were explored using a subgroup analysis (ML vs. DL) and a sensitivity analysis. Finally, publication bias (Egger’s test) and the certainty of the evidence (using GRADE, the Grading of Recommendations Assessment, Development, and Evaluation framework) were assessed to verify the generalizability of the results.
Results: Among the 257 unique records, 12 studies were included. The methods demonstrated very high overall performance (>92%): accuracy (96.4%, 95% CI: 95.9–96.9%), specificity (97.7%, 95% CI: 97.3–98.1%), sensitivity (94.0%, 95% CI: 92.7–95.2%), precision (93.4%, 95% CI: 92.0–94.6%), F1 score (92.1%, 95% CI: 90.6–93.4%). Accuracy, specificity, and precision were high for all H&Y levels. However, the more advanced the symptoms, the lower the sensitivity (97.3% for H&Y0 vs. 92.1% for H&Y3). ML models achieved the best results for classifying healthy patients (H&Y0: 95.7% to 98.2%), while DL approaches performed better for classifying higher severity levels (>92%). Heterogeneity and inter-study variability were moderate (I²: 40–50% and τ²: 0.3–0.4) for precision and F1 score, and high (I² > 90% and τ² > 0.6) for accuracy, specificity, and sensitivity. The GRADE analysis revealed low-quality evidence for precision and F1 score and very-low quality for accuracy, specificity, and sensitivity. Conclusions: Thus, AI-based wearable gait assessment devices show great promise in terms of aiding clinical decision-making and treatment personalization. However, further research using a rigorous methodology (PROBAST) is needed to ensure the generalizability of the results and the clinical viability of the proposed solutions.
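The pooling step described in the Methods above (logit transform, random-effects pooling, back-transform) can be sketched as follows. This is a minimal illustration under stated assumptions, not the review's analysis: it uses the DerSimonian–Laird moment estimator for τ² in place of the restricted maximum likelihood model named in the abstract, it approximates the variance of each logit as 1 / (n·p·(1 − p)), and the proportions and sample sizes are made-up placeholders.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def pool_logit_dl(props, ns):
    """Random-effects pooling of proportions on the logit scale.

    Uses the DerSimonian-Laird estimator of tau^2 (an assumption here;
    the abstract reports REML). Returns the pooled proportion, its 95% CI,
    tau^2, and I^2 (as a fraction).
    """
    y = [logit(p) for p in props]
    v = [1 / (n * p * (1 - p)) for p, n in zip(props, ns)]   # approx. logit variance
    w = [1 / vi for vi in v]                                  # inverse-variance weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y)) # Cochran's Q
    k = len(y)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                        # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0        # heterogeneity share
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    ci = (inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se))
    return inv_logit(mu), ci, tau2, i2

# Hypothetical accuracies and test-set sizes for three studies.
pooled, (lo, hi), tau2, i2 = pool_logit_dl([0.95, 0.97, 0.92], [120, 300, 80])
print(f"pooled={pooled:.3f}, 95% CI=({lo:.3f}, {hi:.3f}), tau2={tau2:.3f}, I2={100 * i2:.0f}%")
```

Pooling on the logit scale keeps the back-transformed estimate and its interval inside (0, 1), which matters when the reported metrics sit close to 100%.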
21 pages, 11840 KB  
Article
Rehospitalization Burden Profiles After Traumatic Spinal Cord Injury: A Data-Driven Latent Class Analysis of the SCIMS Public-Use Database
by Andrea Calderone, Maria Pia Onesta, Laura Simoncini, Antonino Nunnari, Fabrizio Sottile, Angelo Quartarone and Rocco Salvatore Calabrò
J. Clin. Med. 2026, 15(13), 4890; https://doi.org/10.3390/jcm15134890 (registering DOI) - 23 Jun 2026
Abstract
Background/Objectives: Rehospitalization after traumatic spinal cord injury (SCI) is common, but binary or count summaries may obscure heterogeneity in timing, recurrence, frequency, and duration. We aimed to identify clinically interpretable rehospitalization burden profiles in the SCIMS 2021ARPublic dataset and examine descriptive associations with clinical correlates and participation outcomes. Methods: We analyzed Form I, Form II, and Record Status public-use files. Among 29,310 individuals with at least one non-lost follow-up interview, 28,745 with at least one non-missing rehospitalization indicator entered latent class analysis. Four prespecified indicators captured early, recurrent, frequent, and prolonged rehospitalization. Candidate two- through six-class models were compared using AIC, BIC, entropy, class size, posterior probabilities, and interpretability. Pairwise adjusted logistic models examined candidate clinical correlates in 10,407 participants with complete 2016+ follow-up data. Adjusted linear models examined CHART participation domains in 20,766–20,949 participants. Results: A four-profile solution was retained: low rehospitalization burden (59.8%), early/prolonged rehospitalization (18.9%), frequent/prolonged rehospitalization (7.7%), and high recurrent/frequent/prolonged burden (13.6%). UTI and pressure ulcer history showed the most consistent associations with burdened profiles. Severe pain and frequent sleep problems were associated with selected heavier-burden profiles, while depressive symptoms showed smaller and less precise associations. Sensitivity analyses supported structural stability while highlighting observation-time bias and classification uncertainty inherent to wave-based public-use data. Compared with the low-burden profile, burden profiles showed lower CHART scores, especially for mobility and occupation. Conclusions: Rehospitalization after traumatic SCI is heterogeneous. 
These utilization burden profiles summarize distinct observed patterns but require prospective validation before use in risk stratification or follow-up planning.

16 pages, 1554 KB  
Review
Explainable and Trustworthy Artificial Intelligence in Cardiology: A Narrative Review of Clinical Applications, Operational Integration, and Future Directions
by Mateusz Lucki, Ewa Lucka, Jacek Żak, Przemysław Mitkowski and Maciej Lesiak
J. Clin. Med. 2026, 15(13), 4885; https://doi.org/10.3390/jcm15134885 (registering DOI) - 23 Jun 2026
Abstract
Background/Objectives: Artificial intelligence (AI) is increasingly transforming cardiology through advanced analytical tools capable of identifying complex patterns across cardiovascular imaging, electrophysiology, and clinical datasets. Machine learning (ML) and deep learning (DL) algorithms are being integrated into echocardiography, cardiac computed tomography (CT), cardiac magnetic resonance imaging (MRI), and electrocardiography (ECG), enabling earlier diagnosis and more personalized cardiovascular care. This narrative review summarizes current clinical and organizational applications of AI in cardiology and discusses emerging concepts related to explainable and trustworthy AI. Methods: A narrative review was conducted according to SANRA recommendations using the PubMed, MEDLINE, Web of Science, and Scopus databases, including peer-reviewed publications from 2015 to 2026 addressing clinical, organizational, and ethical applications of AI in cardiology, with particular emphasis on cardiovascular imaging, electrocardiography, heart failure, digital health, and explainable AI frameworks. Results: Substantial evidence demonstrates that AI-based tools can achieve expert-level performance in cardiovascular imaging interpretation, automated electrocardiographic analysis, and clinical risk prediction. Across multiple cardiovascular settings, AI has been associated with improved diagnostic accuracy, enhanced workflow efficiency, and earlier detection of cardiovascular disease. Predictive models support risk stratification in heart failure and ischemic heart disease, while chatbots and digital health platforms may facilitate patient engagement, remote monitoring, and continuity of care. Despite these advances, important challenges remain, including algorithmic bias, limited transparency, insufficient external validation, data heterogeneity, and barriers to routine clinical implementation. 
Emerging explainable AI approaches may improve model interpretability, clinician confidence, and the safe adoption of AI-driven decision support systems. Conclusions: Artificial intelligence is rapidly evolving from a research-oriented technology into a clinically relevant component of cardiovascular care. Current evidence indicates that AI can enhance diagnostic performance, improve risk prediction, streamline clinical workflows, and facilitate more personalized management across multiple cardiovascular domains. However, the successful translation of AI into routine practice will depend on robust external validation, transparent decision-making mechanisms, regulatory oversight, and clinician acceptance. The development of explainable and trustworthy AI frameworks represents a critical step toward the safe, ethical, and sustainable integration of AI into modern cardiology.

17 pages, 2596 KB  
Article
Intelligent Injection Molding: Machine Learning-Driven Optimization of Processing Parameters for Enhanced Mechanical Properties in Short-Fiber-Reinforced Thermoplastics
by Rafael Aguirre Flores, Francisco J. González, Felipe Avalos Belmontes and Jesús Francisco Lara Sánchez
Processes 2026, 14(13), 2037; https://doi.org/10.3390/pr14132037 (registering DOI) - 23 Jun 2026
Abstract
Optimizing the injection molding of short-fiber-reinforced thermoplastics (SFRTs) is a persistent challenge due to the complex interplay between processing parameters and final mechanical performance. To address this, we developed and validated a machine learning (ML) pipeline to maximize both the tensile strength and Charpy impact resistance in polyamide 6 with 30% glass fiber (PA6-GF30). Through a designed experimental campaign, we systematically varied four key process parameters: melt temperature (260–300 °C), injection pressure (600–1000 bar), packing pressure (400–800 bar), and cooling time (15–35 s). The resulting dataset was used to train and compare three different regression models: Random Forest (RF), Gradient Boosting (GB), and Support Vector Regression (SVR). Our findings indicate that the Gradient Boosting (GB) algorithm yielded the most reliable predictions, significantly outperforming the other evaluated models. Further analysis using SHAP (Shapley Additive exPlanations) identified packing pressure as the dominant factor influencing tensile strength (contributing approximately 40% to the prediction), while melt temperature emerged as the key driver for impact resistance (around 35% contribution). By integrating our best-performing GB model with a multi-objective genetic algorithm, we identified an optimal set of parameters that simultaneously enhances both mechanical properties. Among the evaluated models (Random Forest, Support Vector Regression, and Gradient Boosting), the Gradient Boosting algorithm achieved the highest predictive accuracy. Compared to the baseline condition (280 °C melt temperature, 800 bar injection pressure, 600 bar packing pressure, 25 s cooling time), experimental validation of these optimized settings demonstrated substantial improvement: tensile strength increased from 145 MPa to 171 MPa (an 18% enhancement), and impact resistance rose from 45 kJ/m² to 55 kJ/m² (a 22% gain).
This work establishes that an integrated ML and optimization framework can serve as a transformative approach for high-precision manufacturing of advanced engineering polymers. The primary novelty of this work lies in the development of a fully integrated, bias-free methodological framework that explicitly couples physical interpretability with multi-objective optimization, bridging the critical gap between black-box predictions and actionable industrial insights.
(This article belongs to the Special Issue Processing and Applications of Polymer Composite Materials)

35 pages, 3804 KB  
Article
A Confound-Aware Framework for Multi-Class EEG Classification and Explainable Model Evaluation
by Ahmed Alqurashi and Abdullah Alharthi
Mathematics 2026, 14(13), 2239; https://doi.org/10.3390/math14132239 (registering DOI) - 23 Jun 2026
Abstract
Objective diagnosis in psychiatry remains challenging due to the lack of reliable biological markers and the presence of confounding variables in observational data. While EEG-based machine learning models have shown promising classification performance, their validity remains unclear when confounding factors such as age are not explicitly controlled. In this work, we propose a confound-aware mathematical framework for supervised learning, where classification is formulated as a mapping $f:\mathbb{R}^{E\times C\times T}\to\mathcal{Y}$ under the presence of a confounding variable $A$. Within this formulation, model performance is interpreted as a function of both predictive structure and confound dependence. The proposed framework integrates classification, regression, and feature selection into a unified evaluation pipeline. A central contribution is the Cross-Task Explanation Concordance (CTEC) index, a rank-based metric that quantifies the stability of feature importance across models and predictive tasks. Experimental results on a large-scale EEG dataset (N = 670) demonstrate that deep learning models outperform handcrafted approaches under standard evaluation. However, under confound-controlled settings, handcrafted models show a dual response to confound control: age residualization improves classification by removing feature-level noise (+20.3%), while age-matching collapses performance to chance (balanced accuracy, BA = 0.238) by eliminating demographic separability. Deep learning models retain partial robustness under both conditions. These findings highlight that conventional performance metrics may overestimate model validity in the presence of structured bias. The proposed framework provides a general mathematical approach for evaluating supervised learning models under confounding effects and is applicable to a wide range of data-driven systems beyond EEG.
(This article belongs to the Special Issue Artificial Intelligence and Data Science, 2nd Edition)
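The "age residualization" mentioned in the EEG abstract above is commonly done by regressing each feature on the confound and keeping the residuals. The sketch below shows that generic technique with ordinary least squares in NumPy; it is an assumption about the general approach, not the authors' implementation, and the function name `residualize` and the synthetic data are illustrative only.

```python
import numpy as np

def residualize(X, confound):
    """Remove the linear effect of a confound from every column of X.

    Fits X[:, j] ~ intercept + confound by least squares and returns
    the residuals, which are uncorrelated with the confound in-sample.
    """
    A = np.column_stack([np.ones_like(confound, dtype=float), confound])
    beta, *_ = np.linalg.lstsq(A, X, rcond=None)   # shape (2, n_features)
    return X - A @ beta

# Synthetic example: four features that all drift linearly with age.
rng = np.random.default_rng(0)
age = rng.uniform(18, 70, size=200)
X = 0.5 * age[:, None] + rng.normal(size=(200, 4))

R = residualize(X, age)
print(np.corrcoef(age, X[:, 0])[0, 1])   # strongly correlated before
print(np.corrcoef(age, R[:, 0])[0, 1])   # approximately zero after
```

In a cross-validated evaluation, the regression coefficients would normally be estimated on the training fold only and then applied to the test fold, so that confound removal does not leak test information.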

30 pages, 3047 KB  
Article
Air Pollution Prediction Based on Stacked Deep Autoencoder Network Model
by Dhuha Saad Ismael, Nurulkamal Masseran and Sakhinah Abu Bakar
Electronics 2026, 15(13), 2756; https://doi.org/10.3390/electronics15132756 (registering DOI) - 23 Jun 2026
Abstract
Urban air pollution, especially the problem of PM2.5, is one of the major health challenges facing the planet today. To provide accurate PM2.5 predictions despite data noise and missing data, we constructed a Stacked Autoencoder–Convolutional Neural Network–Bidirectional Long Short-Term Memory–Long Short-Term Memory (SAE-CNN-BiLSTM-LSTM) model, a deep learning architecture that (1) utilises convolutional layers to extract spatial features from the input data, (2) employs bidirectional LSTM layers to capture long-term temporal dependencies, and (3) utilises an autoencoder to learn latent representations of the data to mitigate the effects of missing data. The model was trained on a large dataset of hourly measurements of air quality and meteorological parameters collected between 2018 and 2020 in Klang, Malaysia. The performance of the model on data that were not used during training was evaluated using a range of metrics. The SAE-CNN-BiLSTM-LSTM model achieved a test RMSE of approximately 11.97 µg/m³ and an R² statistic of approximately 0.85 for PM2.5 concentrations, outperforming the other models tested on the same datasets. The additional metrics of MAE, MAPE, Mean Bias Error, and Index of Agreement confirmed the model’s accuracy and low bias in the prediction of air pollution levels. Statistical tests, such as the Diebold–Mariano test, confirmed that the model’s accuracy gain over the CNN-LSTM models was statistically significant. These findings indicate that the proposed model effectively captures the dynamics of the air pollution data. The proposed architecture yields an accurate and lightweight model for urban air pollution forecasting.
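The evaluation metrics named in the abstract above (RMSE, MAE, MAPE, Mean Bias Error, Index of Agreement) can be computed as in the sketch below. Two conventions are assumed because the abstract does not state them: the bias is taken as prediction minus observation, and the Index of Agreement is Willmott's d. The function name `forecast_metrics` and the sample values are illustrative, not taken from the paper.

```python
import numpy as np

def forecast_metrics(obs, pred):
    """Common forecast error metrics for paired observations and predictions."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs                                   # assumed sign convention
    obs_mean = obs.mean()
    # Denominator of Willmott's index of agreement (assumed definition).
    denom = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "mape": float(100 * np.mean(np.abs(err / obs))),   # undefined if obs has zeros
        "mbe": float(np.mean(err)),
        "ioa": float(1 - np.sum(err ** 2) / denom),
    }

# Toy values: every prediction overshoots the observation by 2 units.
m = forecast_metrics([10, 20, 30, 40], [12, 22, 32, 42])
print(m)
```

A positive `mbe` here means the model over-predicts on average, while `ioa` approaches 1 as predictions track the observations more closely.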

29 pages, 1519 KB  
Article
Spatial Multi-Sensor Fusion with Heterogeneous Error Characteristics
by Ben Ingram, Rodrigo Paredes, Joel Díaz, Felipe Besoaín and Ricardo Baettig
Appl. Sci. 2026, 16(13), 6294; https://doi.org/10.3390/app16136294 (registering DOI) - 23 Jun 2026
Abstract
Fusing spatial observations from sensors with heterogeneous error characteristics is a persistent challenge in geostatistics. Classical kriging assumes a Gaussian likelihood for all observations, an assumption that fails when sensors exhibit one-sided or asymmetric noise. We present a Variable Rank Kriging (VRK) formulation that supports per-observation heterogeneous likelihoods where each observation may define its own likelihood function, thus enabling principled fusion of sensors whose noise structures are significantly different in terms of distribution family and magnitude. Within this framework, we use the exponential (one-sided) likelihood as a case study to demonstrate the method and compare it with sampling-based numerical alternatives for general likelihoods without closed forms. A non-collocated RTK calibration workflow uses kriging predictions from a sparse high-accuracy reference to characterise sensor-specific likelihood parameters without requiring co-located paired observations. Synthetic 1-D and 2-D experiments show that correct per-point likelihood specification reduces RMSE by up to 92% (1-D) and 57% (2-D) relative to a misspecified Gaussian model while also eliminating systematic positive bias. A demonstration using NEON Airborne Observation Platform lidar data at Harvard Forest confirms these findings in a practical, real-world scenario. Across multiple subsamples of the lidar dataset, the exponential likelihood reduces vegetated-zone RMSE by 20.6% (open zone: 18.6%) and mean absolute bias by 26.5% relative to a heteroscedastic Gaussian baseline. The open-source vrk Python (>=3.10) package provides a reproducible implementation that can be applied to any spatial domain that requires multi-sensor spatial fusion with heterogeneous error structures.
(This article belongs to the Section Computing and Artificial Intelligence)

19 pages, 635 KB  
Article
Noise-Adjusted Shrinkage Covariance Estimation in High Dimensions
by Esra Pamukçu
Axioms 2026, 15(6), 468; https://doi.org/10.3390/axioms15060468 (registering DOI) - 22 Jun 2026
Abstract
High-dimensional covariance estimation remains a fundamental challenge when the number of variables (p) substantially exceeds the sample size (n). In such settings, the sample covariance matrix is unstable, singular, and heavily contaminated by estimation noise. Although shrinkage estimators improve stability and thresholding methods promote sparsity, each approach alone may introduce bias or lose structural information. This study proposes a Noise-Adjusted Shrinkage Covariance (NASC) framework as a post-processing enhancement strategy for shrinkage-based covariance estimators. The framework first stabilizes the covariance structure through shrinkage toward a structured target, then suppresses noise-induced small covariance entries via thresholding, and finally applies a stabilization step to ensure positive definiteness of the resulting estimator. Sensitivity analyses were conducted to investigate the effects of the shrinkage and thresholding parameters, and the Monte Carlo simulations were subsequently performed using the best-performing parameter configuration. The simulation results showed that shrinkage alone may not sufficiently suppress entrywise noise, whereas NASC-adjusted estimators improved upon their corresponding shrinkage baselines in many scenarios, with the strongest gains observed for sparse covariance structures and for shrinkage estimators that do not explicitly suppress entrywise estimation noise. Improvements were more limited for highly optimized shrinkage estimators. Real-data analyses were conducted on the SRBCT and colon cancer benchmark datasets. On the SRBCT dataset, numerical stability and positive-definiteness properties were examined, while LOOCV-LDA classification performance without prior feature selection or dimensionality reduction was evaluated on the colon cancer dataset. 
The results suggest that NASC provides a computationally simple and numerically stable extension to classical shrinkage covariance estimation methods in high-dimensional settings. Full article
(This article belongs to the Special Issue Recent Developments in Statistical Research)
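The three stages named in the abstract above (shrinkage toward a structured target, thresholding of small entries, and a stabilization step for positive definiteness) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the target, the thresholding rule, or the stabilization method, so the diagonal target, the hard threshold, the eigenvalue clipping, the function name `nasc_sketch`, and all parameter values below are assumptions chosen only for illustration.

```python
import numpy as np


def nasc_sketch(X, shrink=0.5, thresh=0.1, eps=1e-6):
    """Hypothetical shrink -> threshold -> stabilize pipeline (illustrative only)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)  # p x p sample covariance (singular when p > n)

    # Step 1 (assumed form): linear shrinkage toward the diagonal of S.
    target = np.diag(np.diag(S))
    sigma = (1.0 - shrink) * S + shrink * target

    # Step 2 (assumed form): hard-threshold small off-diagonal entries to zero.
    off_diag = ~np.eye(p, dtype=bool)
    small = off_diag & (np.abs(sigma) < thresh)
    sigma = np.where(small, 0.0, sigma)

    # Step 3 (assumed form): clip eigenvalues at eps to enforce positive definiteness.
    sigma = (sigma + sigma.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sigma)
    eigvals = np.maximum(eigvals, eps)
    out = (eigvecs * eigvals) @ eigvecs.T  # V diag(w) V^T
    return (out + out.T) / 2.0  # remove floating-point asymmetry


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((20, 50))  # n = 20 samples, p = 50 variables
    est = nasc_sketch(X)
    print(est.shape, np.linalg.eigvalsh(est).min() > 0)
```

One design caveat of this particular sketch: eigenvalue clipping can reintroduce small nonzero values in entries that the thresholding step set to zero, so exact sparsity is not guaranteed after the final step.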

26 pages, 4710 KB  
Article
ST-CDF: A Generative AI Framework for Physics-Consistent Imputation and Simulation in Precision Agriculture
by Chenkai Guo, Hui Fan, Shenghua Dong, Minhua Yin, Guangping Qi, Yanlin Ma, Chungang Jing, Hao Liu, Ni Song and Yanxia Kang
Appl. Sci. 2026, 16(12), 6250; https://doi.org/10.3390/app16126250 (registering DOI) - 22 Jun 2026
Viewed by 73
Abstract
Incomplete spatio-temporal (ST) data from sensor networks in precision agriculture often limits environmental modeling and decision-making accuracy. To address this, we propose the Spatio-Temporal Conditional Diffusion Framework (ST-CDF), a generative approach for high-fidelity data reconstruction. The framework’s core is a deep denoising network that integrates a Graph Attention Network (GAT) to explicitly model non-Euclidean spatial correlations, a Differential Attention Transformer to capture abrupt temporal dynamics, and an Inverse Discrete Wavelet Transform (IDWT) module to preserve multi-scale signal details. The generative process is constrained by a physics-informed training objective, which injects known physical laws (i.e., the Penman–Monteith equation for reference evapotranspiration, ET₀) as an inductive bias, ensuring the imputed data maintains physical consistency. For privacy-preserving deployment on resource-constrained IoT devices, we extend the framework with a Federated Cluster-Guided Distillation (Fed-CGD) strategy. We conducted extensive experiments against established methods on two real-world agricultural datasets. ST-CDF demonstrated improved imputation accuracy across evaluated metrics. Its efficacy was most pronounced in the physically demanding ET₀ calculation task, where data imputed by ST-CDF at an 80% missing rate achieved a Root Mean Square Error (RMSE) of 0.3485 and a Coefficient of Determination (R²) of 0.7558, outperforming the baseline models. Furthermore, we explore ST-CDF as an explainable AI (XAI) framework for active agricultural decision support, demonstrating its utility in performing counterfactual simulations of “what-if” interventions, such as irrigation. The findings highlight ST-CDF as an effective, physically grounded, and interpretable tool for data-driven scientific computation and precision agriculture. Full article

24 pages, 10285 KB  
Article
Intelligent Veterinary Disease Management Driven by Knowledge Graph for Conservation Breeding of Captive Forest Musk Deer
by Dequan Guo, Xin Fan, Zijie Lan, Chengli Zheng, Dapeng Zhang, Zhenyu Wang and Minyao Tan
Vet. Sci. 2026, 13(6), 602; https://doi.org/10.3390/vetsci13060602 (registering DOI) - 21 Jun 2026
Viewed by 98
Abstract
In artificial breeding of forest musk deer (Moschus berezovskii), common diseases such as abscess, enteritis, pneumonia, and parasitic infections exhibit persistently high morbidity rates. The early symptoms of certain diseases are often insidious and difficult to discern. Conventional manual inspection routines not only fail to achieve accurate diagnosis but also frequently disturb the animals, induce stress responses, and consequently delay optimal treatment windows. To address this practical challenge, this study employs an improved BRW-GPLinker joint entity-relationship extraction approach to perform integrated extraction and structural organization of disease entities, symptom manifestations, etiological associations, and preventive and therapeutic measures from farming literature and clinical records, thereby constructing a disease knowledge graph for forest musk deer. Through the introduction of a Boundary-Aware Module for refined entity boundary detection, a Relative Distance Bias Module to mitigate pairing errors in dense contexts, and a Weighted Sparse Multi-label Cross-Entropy loss function to enhance recall for infrequent relations, the proposed model achieves an F1 score of 0.887 on a self-constructed dataset and demonstrates favorable generalization capability on medical-domain datasets. By transforming fragmented clinical logs and manuals into structured medical associations, this knowledge graph facilitates rapid retrieval of forest musk deer disease information, thereby enhancing veterinary decision-making efficiency and assisting forest musk deer health management. Full article

22 pages, 1470 KB  
Article
Predicting District Heating Networks Fault Location with Graph Neural Networks
by Ivan Plokhikh, Dmitriy Pushkarev, Oleg Gobyzov, Sergey Filimonov, Alexander Dekterev, Rustam Mullyadzhanov and Sergey Alekseenko
Energies 2026, 19(12), 2920; https://doi.org/10.3390/en19122920 (registering DOI) - 20 Jun 2026
Viewed by 192
Abstract
District heating networks (DHNs) are critical infrastructure prone to physical failures such as leakage-related faults, which cause significant energy and financial losses. Traditional physics-based monitoring methods are computationally expensive and require the continual recalibration of complex mathematical models, while standard data-driven approaches often fail due to the scarcity of real-world sensor data. This study addresses these challenges by proposing a topology-aware graph neural network (GNN) architecture for fault localization. The methodology follows a two-stage process: first, a graph attention-based architecture is designed and optimized using a synthetic dataset to effectively capture multi-step neighborhood dependencies. Second, the model is adapted and evaluated on a physically simulated dataset of a real urban DHN, comprising 187 nodes and 42,570 operational states. The problem is formulated as a multi-class classification task across supply and return subnets. The results demonstrate high predictive performance, achieving an accuracy of 96% on the supply subnet and 91% on the return subnet. Analysis of prediction errors reveals a strong bias towards local topological mistakes, indicating the model’s ability to capture the physical propagation of disturbances. These findings highlight the efficacy of GNNs in handling sparse data and exploiting network topology for robust DHN monitoring. Full article

21 pages, 673 KB  
Review
Bridging Ancestry-Stratified Bias in Pharmacogenomics AI: Toward Metabolomics-Inclusive Multi-Omics Precision Medicine
by Heayyean Lee, Khadijah Sajid and Dayeon Lee
J. Pers. Med. 2026, 16(6), 332; https://doi.org/10.3390/jpm16060332 (registering DOI) - 20 Jun 2026
Viewed by 188
Abstract
Pharmacogenomics AI offers significant potential for individualized drug therapy; however, its clinical benefits remain unevenly distributed. Models trained predominantly on European-ancestry data consistently underperform in non-European populations, with polygenic risk scores (PRS) showing an estimated 39–73% reduction in predictive accuracy in African-ancestry cohorts across complex traits. These disparities have driven increased interest in moving beyond single-layer genomic approaches. Multi-omics frameworks integrating genomic, transcriptomic, proteomic, and metabolomic data have emerged as a promising strategy to improve prediction across heterogeneous clinical populations, as each molecular layer provides distinct and complementary biological information. Among these layers, metabolomics may represent a particularly transferable component across populations. Metabolite profiles capture the downstream functional output of biological systems influenced by genetic, environmental, dietary, and microbiome-related factors, and may therefore be less reliant on ancestry-stratified allele frequency structures that underlie performance disparities in genomic models. This review synthesizes evidence regarding the mechanistic basis of genomic bias in pharmacogenomics AI, the emerging role of multi-omics integration, especially metabolomics, in improving predictive performance, and the current landscape of computational strategies for bias mitigation, including federated learning, transfer learning, domain adaptation, and synthetic data generation. Collectively, current evidence supports metabolomics-inclusive multi-omics frameworks as a biologically plausible, hypothesis-generating strategy to reduce reliance on ancestry-linked genomic features. However, direct evidence that such frameworks reduce ancestry-related bias in clinical AI outputs remains limited, underscoring the need for globally diverse datasets and prospective multi-population validation. Full article
(This article belongs to the Section Omics/Informatics)
