MDPI - Publisher of Open Access Journals

34 pages, 1829 KB

Open Access | Feature Paper | Article

Sparse Simulation of Autoregressive Gaussian Processes

by Tadej Krivec and Juš Kocijan

Mathematics 2026, 14(12), 2111; https://doi.org/10.3390/math14122111 - 13 Jun 2026

Viewed by 105

This study proposes a novel and improved numerical approximation of the simulation of Gaussian process autoregressive models. As a Bayesian nonparametric regression method, Gaussian process models offer the unique advantage of providing closed-form uncertainty quantification. When Gaussian process models are used for autoregressive models, the validation procedure requires the model’s simulation or multi-step-ahead prediction. However, simulating dynamical Gaussian process models is complex due to the intractable propagation of uncertain inputs through the nonlinear model. Numerical approximation, namely Monte Carlo simulation, is one of the most frequent options for simulating dynamical models based on Gaussian processes. The computational burden of Monte Carlo simulation algorithms increases cubically with data size, representing a challenge. This paper introduces a unified simulation framework invariant to sparse and variational approximations to obtain a static sample from the pseudo-point posterior. Furthermore, we propose an innovative method for simulating Gaussian process dynamical models. A single parameter is proposed to regulate the trade-off between computational complexity and algorithmic accuracy. This innovation demonstrates the potential to replace the conditionally independent Monte Carlo method with no additional computational burden, thereby enhancing estimates of latent responses. The proposed simulation method is demonstrated using two synthetic examples and a realistic case study. Full article

(This article belongs to the Special Issue Nonlinear Dynamics and Control: Challenges and Innovations)

28 pages, 25036 KB

Open Access | Article

Non-Invasive Blood Glucose Estimation from Exhaled Breath: Patient-Level Validation of a Compact Electronic Nose Approach

by Alberto Gudiño-Ochoa, Eduardo Ruiz-Velázquez, Julio Alberto García-Rodríguez, Raquel Ochoa-Ornelas and Sofia Uribe-Toscano

AI 2026, 7(6), 213; https://doi.org/10.3390/ai7060213 - 11 Jun 2026

Viewed by 313

Abstract

Non-invasive blood glucose estimation from exhaled breath has been proposed as a painless alternative to repeated capillary measurements; however, performance evaluation remains challenging in small-sample settings. This study investigates the estimation of blood glucose from human breath using volatile organic compound (VOC) signals acquired with an electronic nose. Responses from three metal-oxide sensor channels sensitive to CO, alcohol, and acetone were collected from 58 individuals, with one measurement per subject, and analyzed using strictly patient-level five-fold cross-validation, in which test folds comprised only real subjects. Two experimental factors were examined. First, model performance was evaluated with and without an additional interpretable alcohol–acetone log-ratio capturing relative variation between compounds. Second, model training was performed using either real data only or fold-wise tabular synthetic augmentation generated via a Gaussian copula fitted exclusively on training subjects, while evaluation remained strictly real-only. Under real-only training, classical machine learning models achieved the lowest prediction errors (approximately 6–7 mg/dL), whereas under synthetic augmentation FTTransformer was the best-performing deep learning model. These findings should be understood as a constrained proof-of-concept analysis rather than as evidence of diagnostic capability or clinical readiness. Full article

(This article belongs to the Special Issue AI-Driven Innovations in Medical Computer Engineering and Healthcare)

22 pages, 2625 KB

Open Access | Article

Lens Antenna Arrays for THz Superconducting HEB Mixers: A Review and a Metasurface Coupling Approach

by Yuner Gan, Ruiguang Peng, Shijia Feng, Maimai Mu and Qian Wang

Sensors 2026, 26(10), 3258; https://doi.org/10.3390/s26103258 - 21 May 2026

Viewed by 586

Abstract

Terahertz hot electron bolometer (HEB) mixers, which offer the highest sensitivity in the frequency range above 1.5 THz, are equipped on space observatories to detect the terahertz radiation emitted from the interstellar medium within galaxies. To increase the mapping speed, it is essential to develop large HEB mixer arrays. However, conventional quasi-optical coupling methods, including single large silicon lens approaches and silicon lens array approaches, suffer from the conflict of achieving high filling factor and uniform illumination on the HEB mixer array. This paper reviews the research progress on quasi-optical coupled HEB mixer arrays and proposes an innovative array coupling scheme to overcome the existing limitation. We designed a metasurface beam shaper based on the Gerchberg–Saxton algorithm and COMSOL simulation to transform an incoming Gaussian beam into a flattop beam in the focal plane, thereby forming uniform illumination for an antenna-coupled HEB mixer array. The metasurface is intended primarily for uniform local oscillator (LO) distribution across the array. The simulation of the metasurface beam shaper at 0.6 THz demonstrates a flattop beam with a flat region approximately 3 mm wide, and the intensity across this region varies by only 4.2%. The same simulation is also performed at 1.6 THz, and the flat region is 1.5 mm wide with a 5.5% intensity variation. This work demonstrates the feasibility of using a metasurface to convert a Gaussian beam into a flattop beam at terahertz frequencies as well as a pathway for array-level coupling schemes for HEB mixer arrays with high filling factor and uniform illumination. Full article

(This article belongs to the Section Physical Sensors)
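As a rough illustration of the Gerchberg–Saxton iteration named in the abstract above, the following is a minimal sketch, not the authors' design flow. It assumes scalar Fourier optics, so that a single FFT stands in for propagation to the focal plane; it does not model the metasurface unit cells or the COMSOL simulation. The function names, grid size, beam widths, and super-Gaussian target are all hypothetical choices made for illustration.

```python
import numpy as np

def gerchberg_saxton(source_amp, target_amp, n_iter=50, seed=0):
    """Estimate a phase mask that reshapes source_amp towards target_amp.

    Assumption: the focal plane is the 2-D Fourier transform of the mask plane.
    """
    rng = np.random.default_rng(seed)
    phase = rng.uniform(-np.pi, np.pi, size=source_amp.shape)
    for _ in range(n_iter):
        # Propagate the phase-modulated source to the focal plane.
        focal = np.fft.fft2(source_amp * np.exp(1j * phase))
        # Keep the focal-plane phase, impose the desired amplitude.
        focal = target_amp * np.exp(1j * np.angle(focal))
        # Propagate back and keep only the phase; the source amplitude
        # is re-imposed at the start of the next iteration.
        phase = np.angle(np.fft.ifft2(focal))
    return phase

def focal_intensity(source_amp, phase):
    """Focal-plane intensity produced by a given phase mask."""
    return np.abs(np.fft.fft2(source_amp * np.exp(1j * phase))) ** 2

# Hypothetical 64 x 64 grid in arbitrary units.
n = 64
x = np.linspace(-1.0, 1.0, n)
xx, yy = np.meshgrid(x, x)
r2 = xx**2 + yy**2
source = np.exp(-r2 / (2 * 0.4**2))                 # Gaussian input beam
target = np.fft.ifftshift(np.exp(-(r2 / 0.3**2) ** 4))  # flat-top (super-Gaussian)
mask = gerchberg_saxton(source, target)
```

The `ifftshift` call moves the centred flat-top to the zero-frequency corner expected by `fft2`. Flatness figures such as those quoted in the abstract would have to come from a full electromagnetic simulation rather than from this scalar sketch.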

15 pages, 3660 KB

Open Access | Article

Relative Entropy Computations for Nonlinear Deformations of the Porous Steel Structures

by Michał Strąkowski and Marcin Kamiński

Materials 2026, 19(9), 1783; https://doi.org/10.3390/ma19091783 - 28 Apr 2026

Viewed by 287

Abstract

In this paper, we investigate the application of the relative entropy framework for safety assessments of steel elements with structural defects at the micro- and macro-scales. Mathematical theories developed by Bhattacharyya and by Kullback and Leibler (K-L) have been used for this purpose. This approach uses both expectations and variations, similar to the First-Order Reliability Method (FORM), but is extended to include 3rd- and 4th-order central probabilistic moments. It is necessary to use a hybrid computational technique that combines the Finite Element Method (FEM) software ABAQUS CAE 2017 with the implemented Gurson–Tvergaard–Needleman (GTN) damage model and the computer algebra system MAPLE. The iterative generalized stochastic perturbation technique has been used to determine the probabilistic moments of structural response, to utilize the Weighted Least Squares Method to approximate the structural response function, and to determine uncertainty in the stress, strain, and displacement state functions. This approach is based on relative entropy because of its universality. There is no need to assume a type of distribution of the state functions, in contrast to FORM, where a Gaussian distribution is required. This paper verifies whether relative entropy can serve as an alternative to FORM for determining reliability. The yield surface of the porous material with random values of the void volume fraction f is also presented. Full article

(This article belongs to the Section Metals and Alloys)
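For orientation, the two relative entropy measures named in the abstract above have well-known closed forms when both densities are univariate Gaussians. The sketch below shows only that special case; the abstract stresses that the approach itself does not require a Gaussian assumption and also uses 3rd- and 4th-order moments, which are not covered here. The function names are hypothetical.

```python
import math

def kl_gaussian(mu1, s1, mu2, s2):
    """Kullback-Leibler divergence KL(N(mu1, s1^2) || N(mu2, s2^2))."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2) ** 2) / (2 * s2**2) - 0.5

def bhattacharyya_gaussian(mu1, s1, mu2, s2):
    """Bhattacharyya distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    var_sum = s1**2 + s2**2
    return 0.25 * (mu1 - mu2) ** 2 / var_sum + 0.5 * math.log(var_sum / (2 * s1 * s2))
```

Both measures are zero for identical distributions and grow as the two densities separate. The Bhattacharyya distance is symmetric in its arguments, whereas the K-L divergence is not.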

24 pages, 2639 KB

Open Access | Article

Machine Learning-Assisted Modal Sensitivity and Parameter Ranking in Systems with Viscoelastic Damping

by Jakub Porysek and Magdalena Łasecka-Plura

Appl. Sci. 2026, 16(8), 3749; https://doi.org/10.3390/app16083749 - 11 Apr 2026

Viewed by 544

Abstract

This paper proposes a machine-learning-assisted framework for modal sensitivity analysis of systems with viscoelastic damping elements, including both classical and fractional rheological models. Surrogate models are trained to approximate natural frequencies over a prescribed parameter space using two sampling strategies (Grid and Latin Hypercube) and two regression approaches: multi-layer perceptron (MLP) and Gaussian process regression (GPR). Sensitivities are obtained from the surrogates by finite differences and complemented by model-interpretability measures, namely permutation feature importance (PFI) and Shapley Additive Explanations (SHAP). The surrogate-based results are compared with analytically obtained sensitivities. Local first- and second-order sensitivities of natural frequencies are derived analytically using the direct differentiation method (DDM) for a nonlinear eigenvalue problem formulated in the Laplace domain and further transformed into dimensionless sensitivity measures. The methodology is demonstrated for a single-degree-of-freedom oscillator with classical and fractional Kelvin damper models and a two-story frame equipped with a fractional Kelvin damper. The results show very good agreement between analytical and surrogate-based sensitivities. Feature-importance rankings obtained by PFI and SHAP are consistent with the dimensionless sensitivities and capture changes in parameter influence under varying damping levels. Dispersion studies indicate only minor ranking variations. Full article

(This article belongs to the Section Civil Engineering)

25 pages, 2055 KB

Open Access | Article

Simultaneous Confidence Intervals for All Pairwise Differences of Coefficients of Variation of Delta-Inverse Gaussian Distributions

by Wasurat Khumpasee, Sa-Aat Niwitpong and Suparat Niwitpong

Symmetry 2026, 18(4), 604; https://doi.org/10.3390/sym18040604 - 2 Apr 2026

Viewed by 510

Abstract

This study develops and evaluates simultaneous confidence interval procedures for all pairwise differences of coefficients of variation under delta-inverse Gaussian distributions. The objective is to provide reliable comparative inference for relative variability in zero-inflated and highly skewed data, where standard normal-based methods may be unreliable. Five approaches were studied and compared in terms of coverage probabilities and average widths: generalized confidence interval, adjusted generalized confidence interval, fiducial confidence interval, method of variance estimates recovery, and normal approximation. A Monte Carlo simulation study was conducted under varying shape parameters, zero-inflation probabilities, sample sizes, and numbers of populations (k = 3, 6, and 10). Although most methods produced CPs near the nominal 0.95 level, meaningful differences emerged when both coverage accuracy and interval efficiency were considered. The AGCI method consistently delivered stable coverage across parameter settings and remained robust as dimensionality increased. The MOVER approach achieved competitive coverage while frequently yielding narrower intervals. In contrast, GCI occasionally showed mild undercoverage, and FCI tended to produce overly wide intervals. An empirical application to zero-inflated mortality data supports the simulation findings. Overall, AGCI and MOVER provide reliable and practical tools for simultaneous inference on differences in CVs across delta-IG populations. Full article

(This article belongs to the Special Issue Skewed (Asymmetrical) Probability Distributions and Applications Across Disciplines, 5th Edition)

25 pages, 3347 KB

Open Access | Article

Variational Bayesian-Based Reliability Evaluation of Nonlinear Structures by Active Learning Gaussian Process Modeling

by Wei-Chao Hou, Yu Xin, Ding-Tang Wang, Zuo-Cai Wang and Zong-Zu Liu

Infrastructures 2026, 11(4), 118; https://doi.org/10.3390/infrastructures11040118 - 27 Mar 2026

Viewed by 520

Abstract

In this study, variational Bayesian inference (VBI) with Gaussian mixture models is applied to update models of nonlinear structures, and then, the calibrated model is employed to estimate the failure probability of structures using a subset simulation (SS) algorithm. To improve the computation efficiency of probabilistic nonlinear model updating, a Gaussian Process (GP) model is used to construct a surrogate likelihood function in Bayesian inference using an active learning algorithm, and then, Gaussian mixture models (GMMs) are employed to approximate the unknown posterior probabilistic density functions (PDFs) of model parameters. The optimized hyperparameters of GMMs can be obtained by maximizing the evidence lower bound (ELBO), and the stochastic gradient search method is used to solve this optimization problem. Based on the optimized hyperparameters, the posterior distributions of model parameters can be approximated using a combination of multiple Gaussian components. Subsequently, the SS algorithm is used to calculate the earthquake-induced failure probability of structures based on the calibrated nonlinear model. To verify the feasibility and effectiveness of the proposed method, a numerical simulation of a two-span bridge structure subjected to seismic excitations was developed. Moreover, the proposed strategy is further applied to estimate the failure probability of a scaled monolithic column structure subjected to bi-directional earthquake excitations. Both numerical and experimental results indicate that the proposed method is feasible and effective for probabilistic nonlinear model updates, and the updated model can significantly enhance the accuracy of structural failure probability predictions. Full article

(This article belongs to the Section Infrastructures and Structural Engineering)

44 pages, 4394 KB

Open Access | Article

Data-Driven Yield Estimation and Maximization Using Bayesian Optimization Under Uncertainty

by Kei Sano, Daiki Kawahito, Yukiya Saito, Hironori Moki and Dragan Djurdjanovic

Appl. Sci. 2026, 16(7), 3213; https://doi.org/10.3390/app16073213 - 26 Mar 2026

Viewed by 429

Abstract

In this paper, we propose a novel method which utilizes samples of measured product quality characteristics to efficiently estimate the probabilities of those quality characteristics being within the desired specifications and, consequently, the process yield. Specifically, when dealing with 1D Gaussian distributions, we formally prove that the proposed yield estimator asymptotically gives a lower Mean Squared Error compared to the best unbiased estimator. In order to enable maximization of yield, this novel estimator is incorporated into the framework of Bayesian Optimization which iteratively seeks controllable tool parameters under which the outgoing product yield is maximized. The newly proposed yield maximization method is demonstrated in an application involving high-fidelity simulations of a reactive ion etch chamber, a tool component commonly used in semiconductor manufacturing. The aim of these simulations was to rapidly and reliably determine tool parameters that maximize the probability of delivering desired plasma density characteristics under stochastic variations in chamber conditions. The novel yield estimation and optimization methods show superiority when the number of experimental observations is limited and the distributions of outgoing product characteristics can be approximated well by a Gaussian distribution. Full article

(This article belongs to the Section Computing and Artificial Intelligence)

52 pages, 51167 KB

Open Access | Article

Detection and Comparative Evaluation of Noise Perturbations in Simulated Dynamical Systems and ECG Signals Using Complexity-Based Features

by Kevin Mallinger, Sebastian Raubitzek, Sebastian Schrittwieser and Edgar Weippl

Mach. Learn. Knowl. Extr. 2026, 8(4), 85; https://doi.org/10.3390/make8040085 - 25 Mar 2026

Viewed by 598

Abstract

Noise contamination is a common challenge in the analysis of time series data, where stochastic perturbations can obscure deterministic dynamics and complicate the interpretation of signals from chaotic and physiological systems. Reliable identification of noise regimes and their intensity is therefore essential for robust analysis of dynamical and biomedical signals, where incorrect attribution of stochastic perturbations can lead to misleading interpretations of system behavior. For this reason, the present study examines the role of complexity-based descriptors for identifying stochastic perturbations in time series and analyzes how these metrics respond to different noise regimes across heterogeneous dynamical systems. A supervised learning approach based on complexity descriptors was developed to analyze controlled perturbations in multiple signal types. Gaussian, pink, and low-frequency noise disturbances were injected at predefined intensity levels into the Rössler and Lorenz chaotic systems, the Hénon map, and synthetic electrocardiogram signals, while AR(1) processes were used for validation on inherently stochastic signals. From these systems, eighteen entropy-based, fractal, statistical, and singular value decomposition-based complexity metrics were extracted from either raw signals or reconstructed phase spaces. These features were used to perform three classification tasks that capture different aspects of noise characterization, including detecting the presence of noise, identifying the perturbation type, and discriminating between different noise intensities. In addition to predictive modeling, the study evaluates the complexity profiles and feature relevance of the metrics under varying perturbation regimes. The results show that no single complexity metric consistently discriminates noise regimes across all systems. Instead, system-specific relevance patterns emerge. 
Under given experimental constraints (data partitioning, machine learning algorithm, etc.), Approximate Entropy provides the strongest discrimination for the Lorenz system and the Hénon map, the Coefficient of Variation, Sample and Permutation Entropy dominate classification for ECG signals, and the Condition Number and Variance of first derivative together with Fisher Information are most informative for the Rössler system. Across all datasets, the proposed framework achieves an average accuracy of 99% for noise presence detection, 98.4% for noise type classification, and 98.5% for noise intensity classification. These findings demonstrate that complexity metrics capture structural and statistical signatures of stochastic perturbations across a diverse set of dynamic systems. Full article

54 pages, 54419 KB

Open Access | Article

An Investigation into Uncertainty Quantification of Shallow Foundation Failure Mechanisms in Horizontally Stratified Layered Soil Strata

by Ambrosios-Antonios Savvides

Appl. Sci. 2026, 16(6), 3051; https://doi.org/10.3390/app16063051 - 21 Mar 2026

Viewed by 522

Abstract

In light of the evolution of computer science and computational mechanics, an uncertainty analysis of engineering systems has become increasingly feasible. In this paper, the failure of shallow foundations in layered soil continua is examined. It is shown that Gaussian input distributions lead to approximately Gaussian output response distributions even in the presence of an extensive nonlinear relationship between them. Soil configurations that provide larger average values and higher output variability in terms of bearing capacity force are those in which cohesive, stronger soils such as clays exist in the upper layers. Configurations with sandy soils in the upper layers, in several cases, provide greater average values of maximum displacements, rotations, and output variation. In this paper, the probabilities of the Meyerhof spline onset point are also estimated. Therefore, the proposed framework can support shallow foundation design decisions. Full article

(This article belongs to the Special Issue Digital Multi-Hazard Risk Modelling and Life-Cycle Assessment for Next-Generation Resilient and Sustainable Built Environments)

14 pages, 2938 KB

Open Access | Article

Effect of Crystal-to-Detector Distance Variations on Serial Femtosecond Crystallography Data Collected at PAL-XFEL

by Ki Hyun Nam, Sehan Park and Jaehyun Park

Crystals 2026, 16(3), 203; https://doi.org/10.3390/cryst16030203 - 17 Mar 2026

Viewed by 605

Abstract

Serial femtosecond crystallography (SFX) using X-ray free electron lasers (XFELs) enables the determination of room-temperature structures of biological macromolecules without radiation damage. The accuracy of detector geometry parameters, including the crystal-to-detector distance (CTDD), is critical for reliable data processing. In SFX experiments, the CTDD may shift during data collection due to changes in the experimental setup or installation of the sample delivery system. Such CTDD variations can affect the quality of SFX datasets; however, their impact has not been fully elucidated in the context of SFX data processing. In this study, we investigated the influence of CTDD variations on SFX datasets collected at Pohang Accelerator Laboratory X-ray Free Electron Laser (PAL-XFEL) with thermolysin, lysozyme, and glucose isomerase crystals processed by four indexing algorithms. At the optimized CTDD, the distribution of unit cell parameters exhibited a Gaussian pattern; however, it became distorted as the CTDD deviated further from the optimal value. Data analysis indicated that the CTDD tolerance for successful data processing and structure determination was approximately ±3–5 mm from the optimized CTDD. These findings provide insight into indexing behavior in SFX data processing at PAL-XFEL and offer practical guidance for improving data processing efficiency. Full article

(This article belongs to the Section Biomolecular Crystals)

18 pages, 2747 KB

Open Access | Article

Stochastic Air Quality Modelling of Ship Emissions in Port Areas for Maritime Decarbonization Pathways

by Ramazan Şener and Yordan Garbatov

J. Mar. Sci. Eng. 2026, 14(6), 542; https://doi.org/10.3390/jmse14060542 - 13 Mar 2026

Viewed by 469

Abstract

Decarbonizing the maritime sector requires not only adopting alternative fuels and propulsion technologies but also quantitatively assessing their impacts on coastal and urban air quality. This study develops a stochastic, time-resolved air-quality modelling framework to evaluate ship-related pollutant dispersion in port environments. The approach integrates Automatic Identification System (AIS) trajectories, vessel-specific emission factors, and meteorological inputs within a moving-source Gaussian dispersion model to simulate the spatio-temporal evolution of pollutant concentrations. A 24 h case study for the Ports of Los Angeles and Long Beach demonstrates highly intermittent emission behaviour, with peak aggregated emission rates reaching approximately 1.2 kg/s for CO₂ and 3.8 g/s for SO₂. Temporally integrated concentration fields reveal maximum cumulative dosages of 0.145 g·s/m³ for NO_x, 0.023 g·s/m³ for SO₂, 0.014 g·s/m³ for total PM, and 7.5 g·s/m³ for CO₂ in near-port traffic corridors. Sensitivity analysis indicates that effective emission height variations alter cumulative exposure by up to 17%, whereas temporal resolution changes produce deviations below 7%, confirming numerical stability. Monte Carlo uncertainty propagation demonstrates bounded but non-negligible variability in exposure estimates under realistic emission and wind uncertainties. Results show that cumulative exposure patterns differ substantially from short-term concentration peaks, highlighting the importance of time-integrated and receptor-based metrics for port air quality assessment. The proposed AIS-driven stochastic framework provides a reproducible and computationally efficient tool for evaluating operational mitigation strategies and supporting evidence-based maritime decarbonization pathways. Full article

(This article belongs to the Special Issue Towards Net-Zero Shipping Innovation and Integration in Maritime Decarbonization)
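To make the dispersion step in the abstract above more concrete, the sketch below evaluates the textbook steady-state Gaussian plume with ground reflection. This is a stand-in: the study describes a moving-source, time-resolved model whose exact formulation is not given in the abstract. The function name, the wind speed, the dispersion widths `sigma_y` and `sigma_z`, and the 30 m effective height are hypothetical values; only the 3.8 g/s emission rate is taken from the abstract.

```python
import numpy as np

def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Steady-state Gaussian plume concentration with ground reflection.

    q: emission rate (g/s); u: wind speed (m/s); y: crosswind offset (m);
    z: receptor height (m); h: effective emission height (m);
    sigma_y, sigma_z: dispersion widths (m) at the downwind distance of interest.
    Returns the concentration in g/m^3.
    """
    lateral = np.exp(-(y**2) / (2 * sigma_y**2))
    vertical = (np.exp(-((z - h) ** 2) / (2 * sigma_z**2))
                + np.exp(-((z + h) ** 2) / (2 * sigma_z**2)))
    return q / (2 * np.pi * u * sigma_y * sigma_z) * lateral * vertical

# Ground-level receptor on the plume centreline, using hypothetical
# meteorology and the peak SO2 emission rate quoted in the abstract.
conc = gaussian_plume(q=3.8, u=5.0, y=0.0, z=0.0, h=30.0, sigma_y=80.0, sigma_z=40.0)

# A cumulative dosage (g*s/m^3) is the time integral of concentration;
# here it is approximated for a constant concentration over 60 s.
dosage = conc * 60.0
```

The effective emission height `h` enters only through the vertical term, so changing it alters the concentration seen at a ground-level receptor, which is the parameter examined in the study's sensitivity analysis.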

32 pages, 4390 KB

Open Access | Article

Predicting the Remaining Useful Life of Ship Shafting Using Bayesian Networks with Asymmetric Probability Distributions

by Peng Dong, Ge Han and Luwen Yuan

Symmetry 2026, 18(3), 443; https://doi.org/10.3390/sym18030443 - 4 Mar 2026

Viewed by 507

Abstract

Accurately predicting the remaining useful life (RUL) of ship shafting is crucial for ensuring navigation safety and optimizing operation and maintenance. Traditional Bayesian Network (BN) methods are usually based on the assumption of symmetric distributions. They struggle to effectively characterize common statistical properties such as asymmetry and heavy tails during the shafting degradation process, leading to biases in prediction results. To address this issue, this study proposes an Asymmetric Distribution Bayesian Network (ADBN) method. The method consists of three key components. Firstly, each node selects the optimal asymmetric distribution form based on the Bayesian Information Criterion (BIC) to better fit data characteristics. Secondly, a Generalized Linear Model (GLM) is used to associate distribution parameters (e.g., location, scale, shape) with parent node states, enabling the conditional distribution to adaptively evolve with the system degradation process. Finally, to tackle the complex inference problem under asymmetric distributions, an approximate algorithm based on stochastic gradient variational inference is designed to ensure prediction timeliness. Experimental results show that the ADBN method outperforms traditional Gaussian networks in terms of Mean Absolute Error in the early, middle, and late stages of RUL prediction, and can provide more accurate prediction intervals. This research offers a probabilistic approach that better aligns with actual statistical properties for modeling ship shafting degradation. Full article

(This article belongs to the Special Issue Symmetry in Fault Detection, Diagnosis, and Prognostics)

27 pages, 1628 KB

Open Access | Article

Synthetic Data Augmentation for Imbalanced Tabular Data: A Comparative Study of Generation Methods

by Dong-Hyun Won, Kwang-Seong Shin and Sungkwan Youm

Electronics 2026, 15(4), 883; https://doi.org/10.3390/electronics15040883 - 20 Feb 2026

Cited by 3 | Viewed by 2074

Abstract

Class imbalance in tabular datasets poses a challenge for machine learning classification tasks, often leading to biased models that underperform in predicting minority class instances. This study presents a comparative analysis of synthetic data generation methods for addressing class imbalance in tabular data. We evaluate four augmentation approaches—Synthetic Minority Over-sampling Technique (SMOTE), Gaussian Copula, Tabular Variational Autoencoder (TVAE), and Conditional Tabular Generative Adversarial Network (CTGAN)—using the University of California Irvine (UCI) Bank Marketing dataset, which exhibits a class imbalance ratio of approximately 7.88:1. Our experimental framework assesses each method across three dimensions: statistical fidelity to the original data distribution evaluated through four complementary metrics (marginal numerical similarity, categorical distribution similarity, correlation structure preservation, and Kolmogorov–Smirnov test), machine learning utility measured through classification performance, and minority class detection capability. Results indicate that all augmentation methods achieved statistically significant improvements over the baseline (p < 0.05). SMOTE achieved the highest recall (54.2%, a 117.6% relative improvement over the baseline) and F1-Score (0.437, +22.4% over the baseline) for minority class detection, while Gaussian Copula provided the highest composite fidelity score (0.930) with competitive predictive performance. A weak negative correlation (ρ = −0.30) between composite fidelity and classification performance was observed, suggesting that higher statistical fidelity does not necessarily translate to better downstream task performance. Deep learning-based methods (TVAE, CTGAN) showed statistically significant improvements over the baseline (recall: +58% to +63%) but underperformed compared to simpler methods under default configurations, suggesting the need for larger training samples or more extensive hyperparameter tuning. These findings offer reference points for practitioners working with moderately imbalanced tabular data with limited minority class samples, supporting the selection of generation strategies based on specific requirements regarding data fidelity and classification objectives. Full article

(This article belongs to the Special Issue Data-Related Challenges in Machine Learning: Theory and Application)

24 pages, 2846 KB

Open Access Article

Efficient Hierarchical Latent Gaussian Models for Heterogeneous and Skewed IoT Reliability Data

by Adrian Dudek and Jerzy Baranowski

Symmetry 2026, 18(2), 325; https://doi.org/10.3390/sym18020325 - 11 Feb 2026

Viewed by 729

Abstract

The reliability of Internet of Things systems is critical for industrial applications; however, operational reliability data are often heterogeneous and strongly right-skewed, exhibiting non-Gaussian behaviour, overdispersion, and production-level variability that challenge classical predictive maintenance models. Existing approaches frequently rely on pooled assumptions or simplified error structures, limiting their ability to identify latent batch-level degradation and to jointly interpret discrete failure events and continuous lifetime information. To address these limitations, this study proposes a hierarchical Bayesian framework based on Integrated Nested Laplace Approximation (INLA) to jointly model discrete reset counts and continuous failure times. Three Latent Gaussian Models are evaluated—ranging from pooled baseline specifications to a fully joint model with shared latent batch effects—using a synthetic dataset designed to mimic realistic industrial fault patterns. The analysis demonstrates that standard pooled models fail to capture the degradation dynamics of defective device batches. In contrast, the hierarchical joint model successfully recovers latent quality variations, accurately links high reset intensity with shortened lifetimes, and substantially improves model fit, achieving a DIC reduction of over 67% compared to baseline approaches. INLA provides a computationally efficient and rigorously calibrated alternative to MCMC-based methods for modelling skewed and heterogeneous reliability data. The proposed framework enables reliable identification of defective production batches and robust uncertainty quantification, offering a practical tool for data-driven predictive maintenance in Industry 4.0. Future work will focus on validating the proposed framework using real industrial IoT datasets. Full article

(This article belongs to the Special Issue Skewed (Asymmetrical) Probability Distributions and Applications Across Disciplines, Fourth Edition)

Search Results (122)
