Search Results (7,196)

Search Parameters:
Keywords = machine learning regression model

34 pages, 5939 KB  
Article
Explainable Machine Learning for Volatile Fatty Acid Soft-Sensing in Anaerobic Digestion: A Pilot Feasibility Study
by Bibars Amangeldy, Assiya Boltaboyeva, Nurdaulet Tasmurzayev, Zhanel Baigarayeva, Baglan Imanbek, Aliya Jemal Getahun, Dinara Turmakhanbet, Moldir Kuatova and Waldemar Wojcik
Algorithms 2026, 19(3), 183; https://doi.org/10.3390/a19030183 (registering DOI) - 1 Mar 2026
Abstract
Sustainable energy systems such as anaerobic digestion (AD) bioreactors exhibit complex nonlinear dynamics that complicate the monitoring of key stability indicators using conventional laboratory-based methods. As a preliminary investigation, this pilot study explores the feasibility of using machine learning-based soft sensing to estimate Total Volatile Fatty Acids (TVFA(M)) from routinely measured physicochemical parameters. Using a short-term laboratory dataset obtained from controlled CO2 biomethanisation experiments, several regression models were benchmarked, including an attention-based deep learning architecture (TabNet), multi-architecture artificial neural networks (ANNs), gradient-boosting ensembles (CatBoost, XGBoost, LightGBM), and classical kernel-based approaches. Model performance was evaluated under a cross-validated framework to assess predictive capability and consistency across folds within the limited experimental scope. Among the tested models, TabNet achieved highly competitive performance, yielding an R2 of 0.8551, an RMSE of 0.0090, and an MAE of 0.0067. To support model transparency and interpretability, Explainable Artificial Intelligence (XAI) techniques based on SHapley Additive exPlanations (SHAP) were applied, identifying pCO2 as the dominant contributor to TVFA(M) predictions within the studied operational range. The results demonstrate the potential of explainable machine learning models as soft sensors for TVFA(M) estimation under controlled laboratory conditions. Although restricted to controlled laboratory conditions and a short observation period, this pilot study demonstrates the potential of explainable machine learning models for TVFA(M) estimation and provides a methodological benchmark for future validation using larger and more diverse datasets.
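The three scores quoted in this abstract (R2, RMSE, MAE) can be computed from paired observations and predictions as in the minimal sketch below. The function name `regression_metrics` and the sample values are hypothetical illustrations, not taken from the study, and the sketch ignores the cross-validation folds the authors describe.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical measurements and predictions, for illustration only.
observed = [0.10, 0.12, 0.15, 0.11]
predicted = [0.11, 0.12, 0.14, 0.10]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.4f} MAE={mae:.4f}")
```

In a cross-validated setting these metrics would typically be computed per fold and then averaged; the abstract does not state which aggregation was used.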
28 pages, 1904 KB  
Article
Environmental Drivers and Explainable Modeling to Resolve Trace Metal Dynamics in a Lotic System
by Akasya Topçu, Dilara Gerdan Koç, İlknur Meriç Turgut and Serkan Taşdemir
Toxics 2026, 14(3), 215; https://doi.org/10.3390/toxics14030215 (registering DOI) - 28 Feb 2026
Abstract
Trace metal contamination in lotic freshwater systems exhibits pronounced heterogeneity arising from coupled hydrological connectivity, geochemical partitioning, and anthropogenic forcing, complicating exposure characterization in urban and peri-urban catchments. Addressing this complexity requires integrative analytical approaches capable of deciphering system-level controls, prompting an investigation of the environmental structuring and governing controls of dissolved trace metal signatures in a human-impacted stream using a system-oriented computational framework. To capture temporal variability associated with seasonal hydrological contrasts and heterogeneous pollution inputs, a station-based, season-resolved sampling strategy was implemented during the wet and dry seasons. Physicochemical gradients (pH, temperature, dissolved oxygen, and electrical conductivity), inorganic nitrogen species (NH3, NO2, and NO3), and phosphorus fractions (total phosphorus, TP; total orthophosphate, TOP; soluble reactive P, SRP) were jointly analyzed with dissolved concentrations of chromium (Cr), copper (Cu), nickel (Ni), lead (Pb), cadmium (Cd), mercury (Hg), and arsenic (As). Regression-based machine learning models were used to quantify element-specific sensitivities to hydrochemical drivers under wet–dry periods and to identify optimal predictive configurations. Predictive performance was consistently high for trace metals (R2 generally >0.95), with Random Forest providing the best accuracy for Cr, Ni, Pb, Cd, As, and Hg, whereas Cu was most reliably captured by an XGBoost tree ensemble (R2 = 0.994). Explainability analyses revealed heterogeneous, metal-specific control regimes: Cr was primarily driven by temperature, Ni by NO2 and redox-sensitive conditions, Cd by NH3 and temperature, and As by Hg in combination with phosphorus-related and redox-linked proxies, while Pb showed comparatively lower predictability relative to other metals. 
Trace metal distributions are therefore structured primarily by differential environmental sensitivity rather than uniform source-driven inputs, reinforcing the need for integrative computational frameworks when interpreting freshwater contamination under intensifying anthropogenic and climatic pressures.
(This article belongs to the Special Issue Distribution and Behavior of Trace Metals in the Environment)
20 pages, 4514 KB  
Article
Hybrid Physical–Machine Learning Soil Moisture Modeling at Orchard Scale in Irrigated Citrus Orchards Using Sentinel 1 and 2 and Agroclimatic Data
by Héctor Izquierdo-Sanz and Enrique Moltó
Agronomy 2026, 16(5), 541; https://doi.org/10.3390/agronomy16050541 (registering DOI) - 28 Feb 2026
Abstract
Accurate orchard-scale soil moisture information is a key requirement for efficient irrigation management in perennial crops such as citrus orchards, particularly in Mediterranean environments characterized by water scarcity and strong spatial and temporal variability in soil moisture, canopy structure, and irrigation scheduling. This study proposes a hybrid physical–machine learning methodology for soil moisture estimation that integrates in situ capacitance sensor measurements, Sentinel-1 SAR observations, Sentinel-2 optical imagery, and ERA5-Land agroclimatic variables. Physically based soil moisture estimates were first obtained through the inversion of Sentinel-1 backscatter using integral equation scattering models, a physically based soil dielectric model, and a simplified vegetation attenuation scheme. These physically derived estimates were subsequently incorporated as predictors within supervised machine learning models, together with multi-source remote sensing and meteorological variables. Several algorithms were evaluated, including regularized linear models, support vector regression, random forests, and gradient boosting methods. Model performance was assessed using a strict interannual validation strategy based on independent-year predictions to ensure robust generalization. Within this methodology, tree-based ensemble models achieved the highest and most consistent performance at the orchard scale, with coefficients of determination ranging from 0.55 to 0.76 and root mean square errors typically between 0.7 and 1.1% volumetric soil moisture in the best-performing cases. Benchmarking against a physical-only baseline demonstrated that the hybrid methodology consistently reduced prediction errors and improved temporal robustness under independent-year validation. 
Overall, the results demonstrate that hybrid physical–machine learning approaches provide a robust and scalable solution for orchard-scale soil moisture monitoring in irrigated citrus orchards using operational data streams, supporting advanced irrigation management and precision agriculture applications in Mediterranean perennial cropping systems.
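One plausible reading of the "strict interannual validation strategy based on independent-year predictions" mentioned in this abstract is a leave-one-year-out split, where every sample from the held-out year is excluded from training. The sketch below illustrates that idea only; the abstract does not give the exact protocol, and the function name `leave_one_year_out` and the sample years are hypothetical.

```python
from collections import defaultdict


def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices) for each distinct year.

    `years` holds one entry per sample, giving the year that sample belongs to.
    """
    by_year = defaultdict(list)
    for index, year in enumerate(years):
        by_year[year].append(index)
    for held_out in sorted(by_year):
        test_indices = by_year[held_out]
        # Training uses every sample whose year differs from the held-out year.
        train_indices = sorted(
            i for year, indices in by_year.items() if year != held_out for i in indices
        )
        yield held_out, train_indices, test_indices


# Hypothetical acquisition years for five samples.
sample_years = [2020, 2020, 2021, 2022, 2021]
for year, train_idx, test_idx in leave_one_year_out(sample_years):
    print(year, train_idx, test_idx)
```

Splitting by year rather than at random keeps temporally correlated samples from the same season out of both the training and test sets at once.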
21 pages, 9850 KB  
Article
A Bias Correction Scheme for FY-3E/HIRAS-II Data Assimilation Based on EXtreme Gradient Boosting
by Hongtao Chen and Li Guan
Remote Sens. 2026, 18(5), 744; https://doi.org/10.3390/rs18050744 (registering DOI) - 28 Feb 2026
Abstract
More and more spaceborne infrared hyperspectral atmospheric observations are assimilated into data assimilation systems. The key to bias correction (BC) of these instruments is the selection of predictors. However, it is difficult to find a set of predictors that are highly correlated with the O-B biases in all FY-3E/HIRAS-II channels, due to its multi-channel characteristics. This article establishes a BC scheme for FY-3E/HIRAS-II based on the machine learning model XGBoost (EXtreme Gradient Boosting). The selected predictors include model skin temperature, model total column water vapor, 1000–300 hPa thickness, 200–50 hPa thickness, scan position, observed brightness temperature (BT) and simulated BT. The method is also compared with the operational static BC and the variational BC to validate its effect. The two-week data assimilation experiments show that the XGBoost BC is the most effective among the three BC schemes. The mean and standard deviation of O-B in all channels are the smallest after BC, and the number of effective observations passing quality control is the largest, followed by the static BC. The static BC and variational BC are performed based on linear regression, which may lead to a small loss of valid observations in some channels that are weakly correlated with the predictor, whereas machine learning algorithms can search for the nonlinear correlation between biases and predictors. Both the temperature and humidity analysis fields based on XGBoost BC are the closest to ERA5 at all levels, and the root mean square errors do not change much over time.
23 pages, 7334 KB  
Article
Shallow Water Bathymetry Inversion Method Based on Spatiotemporal Coupling Correlation Adaptive Spectroscopy
by Jiaxing Du, Houpu Li, Shuaidong Jia, Gaixiao Li, Jian Dong, Bing Liu and Shaofeng Bian
Remote Sens. 2026, 18(5), 741; https://doi.org/10.3390/rs18050741 (registering DOI) - 28 Feb 2026
Abstract
Shallow water bathymetry data underpins maritime shipping and marine resource survey/protection, but its accuracy is constrained by water heterogeneity and spectral interference. To address this, this study proposes a Spatio-Temporal Coupling and Correlation Adaptive Spectral (STCCAS) inversion method, integrating four machine learning models: Random Forest (RF), XGBoost, Support Vector Regression (SVR), and Multi-Layer Perceptron (MLP). Experiments were conducted in Tampa Bay’s nearshore waters, using Sentinel-2 imagery and Airborne LiDAR Bathymetry (ALB) data. Core to STCCAS, the Temporal Stability Index (TSI) quantifies spectral temporal consistency, while the Normalized Difference Turbidity Index (NDTI) characterizes water turbidity, and the two indices synergistically form a dual-scale “spectral reliability-turbidity stability” evaluation system for pixel-level feature quality assessment. Coupled with spectral fusion features and spatial location, they jointly realize pixel-level feature reliability weighting and dynamic filtering to build a water condition-adaptive input set. Comparative analysis of inversion performance under the original spectral features (OSFs) inversion method vs. the STCCAS inversion method confirms that STCCAS significantly boosts accuracy. XGBoost outperforms the other models, achieving a coefficient of determination (R2) of 0.93, root mean square error (RMSE) of 0.16 m, and mean absolute error (MAE) of 0.12 m. STCCAS breaks the limitations of traditional fixed feature combinations, effectively adapting to nearshore water heterogeneity. It provides a novel method for high-frequency, high-precision shallow water bathymetry inversion, with important practical value for marine environmental monitoring and resource management.
22 pages, 1258 KB  
Article
Raman Spectroscopy Assisted by Machine Learning Algorithms for the Prediction of Different Types of Oral Cancer Cells
by Maria Lasalvia, Vito Capozzi and Giuseppe Perna
Appl. Sci. 2026, 16(5), 2380; https://doi.org/10.3390/app16052380 (registering DOI) - 28 Feb 2026
Abstract
Oral squamous cell carcinoma (OSCC) cytology involves extracting a cell sample consisting of single cells or small clusters of cells from patients’ head and neck area in order to identify abnormal morphological characteristics after staining it. This method is used to screen for early cancer and the formation of metastases within the oral cavity. OSCC diagnosis partly depends on pathologists’ skills and also laboratories’ instrumentation. The use of Raman spectroscopy could support diagnoses performed using traditional methods, providing information based on the cellular biochemical environment. Technical drawbacks related to low signal-to-noise ratios of Raman spectroscopy and the need to obtain diagnostic information within a reasonable time frame have recently led to the analysis of Raman spectra using machine learning (ML) methods in order to obtain reliable information about the correct attribution of unknown cellular spectra. So, we used Raman micro-spectroscopy combined with machine learning methods to build classification models, which allow the diagnosis of different grades of OSCC in cell samples. The Raman spectra were analysed in the 980–1800 cm−1 range by focusing the laser beam onto the nucleus and the cytoplasm regions of single cells from different cell lines modelling healthy (HaCaT) and cancer (Cal-27, SAS and HSC-3) cytological samples. We considered six classification algorithms (k-Nearest Neighbours, Logistic Regression, Naïve Bayes, artificial Neural Network, Random Forest and Support Vector Machine) to classify unknown Raman spectra. We report two classification tasks: a 4-level classification, which encompasses healthy cells, two different types of cancer cells, and one type of metastatic cells, and a 3-level classification, which includes healthy cells, non-metastatic cancer cells, and metastatic cancer cells. 
Our findings show that both Neural Network and Support Vector Machine algorithms applied to Raman spectra measured in the cytoplasm region can achieve sensitivity, precision and F1-score values larger than 90% in the 3-level classification, whereas Support Vector Machine performs better than Neural Network in the 4-level classification. These results contribute to increasing confidence in the clinical translation of ML-assisted Raman spectroscopy as a tool to support conventional cytological techniques.
(This article belongs to the Section Optics and Lasers)
15 pages, 2687 KB  
Article
Interpretable Machine Learning Insights into Adhesion and Modulus of Biomedical HA–Dopamine Hydrogels
by Yuze Zhang, Yabei Xu, Yimin Shi and Daxin Liang
Gels 2026, 12(3), 206; https://doi.org/10.3390/gels12030206 (registering DOI) - 28 Feb 2026
Abstract
Hyaluronic acid–dopamine (HA-Dopa) hydrogels have emerged as promising adhesive biomaterials for biomedical applications. However, the complex dependencies between formulation parameters and hydrogel performance pose challenges for rational material design. In this study, an interpretable machine learning framework was developed to investigate the structure–property relationships of HA-Dopa hydrogels. A dataset comprising 228 data points was collected from 37 peer-reviewed publications, representing heterogeneous experimental conditions across different research groups, and gradient boosting regression models were established to predict adhesion strength and elastic modulus, achieving test R2 of 0.99 and 0.94, respectively, with stable performance across cross-validation splits. SHAP analysis revealed that HA molecular weight and dopamine substitution degree are the dominant factors governing adhesion, while mechanical properties exhibit more distributed dependence on multiple formulation parameters. The identified synergistic interactions between key features provide potential guidance for targeted formulation optimization. This work demonstrates the utility of interpretable machine learning in elucidating structure–property relationships and accelerating the development of functional hydrogels for biomedical applications.
(This article belongs to the Special Issue Recent Research on Medical Hydrogels (2nd Edition))
31 pages, 2638 KB  
Article
Explainable AI for Predicting and Justifying Firm-Level Financial Resilience in Healthcare Services
by Lucia Morosan-Danila, Claudia-Elena Grigoras-Ichim, Otilia-Maria Bordeianu, Daniela-Mihaela Neamtu, Daniela-Tatiana Agheorghiesei, Dumitru Filipeanu and Alexandru Tugui
Electronics 2026, 15(5), 1022; https://doi.org/10.3390/electronics15051022 (registering DOI) - 28 Feb 2026
Abstract
Healthcare service providers face recurrent systemic disruptions (e.g., pandemics, reimbursement delays, supply shortages, and regulatory shocks), yet firm-level resilience monitoring remains underdeveloped due to limited explainability and weak out-of-time validation in prior work. We develop an explainable machine learning pipeline to predict firm-level financial resilience (a financial health/robustness proxy) for outpatient healthcare providers. Using annual data for 2600 Romanian firms (Nomenclature of Economic Activities, NACE 8622) over 2014–2023, resilience is operationalised as an ordered three-class label derived from a Principal Component Analysis (PCA)-based composite score built from eight capital structure and asset composition ratios, with train-only frozen thresholds and a strict anti-leakage protocol. We evaluate multinomial logistic regression (baseline), Random Forest (RF), and HistGradientBoosting (HGB) (primary) on a prospective 2023 hold-out using Accuracy, Balanced Accuracy, and Macro-F1, with bootstrap uncertainty for key contrasts. The primary model achieves Balanced Accuracy = 0.943 and Macro-F1 = 0.944 in 2023, outperforming the linear baseline and RF; errors are concentrated between adjacent classes. Model-faithful permutation importance on HGB highlights working-capital disciplines (receivables, cash, inventory, asset structure), while RF-based SHapley Additive exPlanations (SHAP) is used only for auxiliary pattern exploration and stability checks, with Individual Conditional Expectation (ICE)/Partial Dependence Plot (PDP) confirming key nonlinear regimes on HGB. Overall, the results support governance-ready, interpretable resilience monitoring while maintaining a clear separation between predictive explanations and causal claims.
(This article belongs to the Special Issue Women's Special Issue Series: Artificial Intelligence)
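The "train-only frozen thresholds" idea in the abstract above can be illustrated with a small sketch: cut points for the three ordered classes are derived from training-period scores only and then applied unchanged to the hold-out year, so no hold-out information leaks into the labels. The use of tercile cut points, the function names `fit_thresholds` and `assign_class`, and all values are assumptions made for illustration; the abstract does not say how the thresholds were chosen.

```python
def fit_thresholds(train_scores):
    """Return (low_cut, high_cut) tercile cut points computed from training scores only."""
    ordered = sorted(train_scores)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three training scores")
    return ordered[n // 3], ordered[(2 * n) // 3]


def assign_class(score, cuts):
    """Map a composite score to an ordered class: 0 (low), 1 (medium), 2 (high)."""
    low_cut, high_cut = cuts
    if score < low_cut:
        return 0
    if score < high_cut:
        return 1
    return 2


# Hypothetical composite scores; thresholds are frozen after fitting on the training years.
train_scores = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
frozen_cuts = fit_thresholds(train_scores)

# Hold-out scores are labelled with the frozen cuts and never used to refit them.
holdout_scores = [2.5, 3.0, 9.5]
holdout_labels = [assign_class(s, frozen_cuts) for s in holdout_scores]
print(frozen_cuts, holdout_labels)
```

The same principle would apply to the PCA step itself: the projection would be fitted on training years and reused as-is for the hold-out year.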
31 pages, 5098 KB  
Article
A Forecasting Model for Passenger Flows of Urban Rail Transit Based on Multi-Source Spatio-Temporal Features and Optimized Ensemble Learning
by Haochu Cui and Yan Sun
Modelling 2026, 7(2), 48; https://doi.org/10.3390/modelling7020048 (registering DOI) - 28 Feb 2026
Abstract
In this study, we propose a novel model based on multi-source spatio-temporal features and optimized ensemble learning for forecasting station- and line-level passenger flows of urban rail transit. First, we design a spatio-temporal feature engineering method to enhance the accuracy of forecasting using passenger flow features; the temporal features include periodic and lag effects and the spatial features cover spatio-temporal attention mechanisms, adjacency relationships in the network graph and station clustering features. Furthermore, an improved ensemble learning method based on Extra Randomized Trees (ExtraTrees) and Light Gradient Boosting Machine (LightGBM) is developed to forecast the station-level passenger flows using a weighted sum method in which a particle swarm optimization algorithm is adopted to determine the weights assigned to the forecasting results of the two models. Finally, ridge regression is adopted as the meta-learning model to forecast line-level passenger flows. We employed passenger flow data from three urban rail transit lines in Hangzhou to demonstrate the feasibility of the proposed model. The results indicate that it produces more accurate passenger flow forecasts at the station and line levels than benchmark models. Therefore, it can provide solid support for optimizing the operations, management, and planning of both a single urban rail transit station and the entire network.
(This article belongs to the Special Issue Machine Learning and Artificial Intelligence in Modelling)
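The weighted-sum step in the abstract above combines two model forecasts with weights that the authors tune by particle swarm optimization. As a simpler stand-in for illustration, the sketch below uses the closed-form least-squares weight for a two-model blend, clipped to the range [0, 1]. It is not the authors' optimizer; the function names and the numbers are hypothetical.

```python
def best_blend_weight(actual, pred_a, pred_b):
    """Weight w in [0, 1] minimising the squared error of w * pred_a + (1 - w) * pred_b."""
    numerator = sum((y - b) * (a - b) for y, a, b in zip(actual, pred_a, pred_b))
    denominator = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if denominator == 0:
        # Both models give identical forecasts, so every weight yields the same blend.
        return 0.5
    # The objective is a convex quadratic in w, so clipping the unconstrained
    # optimum gives the constrained optimum on [0, 1].
    return min(1.0, max(0.0, numerator / denominator))


def blend(pred_a, pred_b, weight):
    """Weighted sum of two forecast series."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical validation-period passenger counts and two model forecasts.
observed_flow = [10.0, 20.0, 30.0]
forecast_a = [12.0, 22.0, 32.0]
forecast_b = [8.0, 18.0, 28.0]
w = best_blend_weight(observed_flow, forecast_a, forecast_b)
print(w, blend(forecast_a, forecast_b, w))
```

A population-based optimizer such as particle swarm optimization becomes more useful when the objective is not a simple squared error or when more than two weights are tuned jointly.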
24 pages, 7484 KB  
Article
Estimation of Nitrogen Content in Alfalfa Plants Based on Multi-Source Feature Fusion
by Jiapeng Zhu, Haohao Dang, Demin Fu, Guangping Qi, Yanxia Kang, Yanlin Ma, Siqin Zhang, Chungang Jing, Bojie Xie, Yuanbo Jiang, Jinxi Chen, Boda Li and Jun Yu
Plants 2026, 15(5), 752; https://doi.org/10.3390/plants15050752 (registering DOI) - 28 Feb 2026
Abstract
Plant nitrogen content (PNC) is a core physiological parameter characterizing crop nitrogen nutrition status. Its precise and dynamic monitoring is crucial for crop growth diagnosis, optimizing nitrogen fertilizer management, enhancing fertilizer use efficiency, and reducing agricultural nonpoint source pollution. This study utilized multispectral imagery from unmanned aerial vehicles (UAVs) to extract vegetation indices (VIs) and texture feature values (TFVs) during critical growth stages of alfalfa. By combining TFVs to construct texture indices (TIs), variables exhibiting extremely significant correlations with alfalfa PNC (p < 0.001) were identified. We used VIs, TIs, and their combined features as model inputs. The performance of four machine learning models—random forest regression (RFR), Support Vector Regression (SVR), Backpropagation Neural Network (BPNN), and gradient boosting (XG-Boost)—was comprehensively assessed for estimating alfalfa PNC. Our results indicate the following: (1) The correlation coefficients |r| between VIs and alfalfa PNC ranged from 0.56 to 0.68; TIs constructed from TFVs significantly enhanced PNC correlation compared to raw texture values, with |r| exceeding 0.6. (2) Integrating VIs and TIs substantially improved the accuracy of PNC estimation models across growth stages. Compared to using VIs or TIs alone, the validation set R2 increased by 5.4–19.7%, 1.7–16.4%, and 5.2–17.2% for the branching, budding, and initial flowering stages, respectively. (3) The XG-Boost model demonstrated optimal performance across all growth stages and input variables. Particularly during the budding stage, the VIs + TIs model achieved the highest fitting accuracy: training set R2 = 0.81, RMSE = 0.15%; validation set R2 = 0.80, RMSE = 0.12%. 
In summary, integrating multispectral vegetation indices and texture indices effectively enhances the accuracy of PNC estimation in alfalfa, providing scientific support for precision field management and fertilization decisions in alfalfa cultivation.
(This article belongs to the Special Issue Water and Nutrient Management for Sustainable Crop Production)
32 pages, 19818 KB  
Article
An Interpretable Ensemble Machine Learning Framework for Predicting the Ultimate Flexural Capacity of BFRP-Reinforced Concrete Beams
by Sebghatullah Jueyendah and Elif Ağcakoca
Polymers 2026, 18(5), 601; https://doi.org/10.3390/polym18050601 (registering DOI) - 28 Feb 2026
Abstract
Prediction of the ultimate moment capacity (Mu) of BFRP-reinforced concrete beams is complicated by nonlinear parameter interactions and the linear-elastic response of BFRP, reducing the accuracy of conventional design models. This study develops an optimized machine learning (ML) framework incorporating random forest, extra trees, gradient boosting, AdaBoost, bagging, support vector regression, histogram-based gradient boosting, and ensemble voting and stacking strategies for reliable prediction of the Mu of BFRP-reinforced concrete beams. A comprehensive database of material, geometric, reinforcement, and BFRP mechanical parameters was analyzed, and model performance was evaluated using an 80/20 train–test split and 10-fold cross-validation based on R2, RMSE, MAE, and MAPE. The stacking regressor demonstrated superior predictive performance, achieving an R2 of 0.999 (RMSE = 0.590) in training and an R2 of 0.988 (RMSE = 2.487) in testing, indicating excellent robustness and strong generalization capability in predicting Mu. Furthermore, interpretability analyses based on SHAP, PDP, ALE, and ICE demonstrate that span length (L) and beam depth (h) constitute the governing parameters in the prediction of Mu. Unlike prior studies focused mainly on predictive accuracy, this work proposes an optimized and interpretable stacking ensemble framework that integrates explainable AI with classical flexural mechanics for physically consistent and reliable prediction of the ultimate moment capacity of BFRP-reinforced concrete beams.
(This article belongs to the Special Issue Fiber-Reinforced Polymer Composites: Progress and Prospects)
35 pages, 21097 KB  
Article
An Efficient and Sparse Kernelized Gray RVFL Network for Energy Forecasting
by Wenkang Gong and Gaofeng Zong
Systems 2026, 14(3), 257; https://doi.org/10.3390/systems14030257 (registering DOI) - 28 Feb 2026
Abstract
Reliable energy forecasting is essential for the planning and dispatch of power and fuel systems; however, energy series are often short and exhibit pronounced nonlinearity. To tackle this small sample setting, we propose a gray random vector functional link (GRVFL) framework and further derive a kernelized variant (KGRVFL). In GRVFL, an RVFL network is integrated into gray system modeling, and the parameters are learned via sparsity-regularized regression, enabling stable and reproducible training without backpropagation or evolutionary optimization. Hyperparameters are tuned using Bayesian optimization driven by a Top-k mean absolute percentage error (Top-k MAPE) criterion to improve robustness. To further promote compactness, we introduce a fractional ratio-type Fr-1 penalty and solve the resulting problem efficiently using a fractional coordinate descent (FCD) algorithm. The proposed methods are assessed on six real-world energy datasets using eight evaluation metrics. Comparisons with nine gray model baselines and six machine learning forecasters demonstrate that the sparse KGRVFL (SKGRVFL) achieves higher predictive accuracy and improved training stability under small sample conditions.
(This article belongs to the Section Systems Engineering)
34 pages, 7649 KB  
Article
SMOTE-Data-Augmented Machine Learning for Enhancing Individual Tree Biomass Estimation Using UAV LiDAR
by Sina Jarahizadeh and Bahram Salehi
Remote Sens. 2026, 18(5), 729; https://doi.org/10.3390/rs18050729 (registering DOI) - 28 Feb 2026
Abstract
Estimating individual tree Above-Ground Biomass (AGB) is essential for assessing ecological functions and carbon storage in both forest and urban environments. Traditional field-based methods, such as plot measurements, are costly and impractical for large-scale applications, while satellite- and aerial-based techniques lack the spatial resolution needed for individual-tree-level analysis. Unmanned Aerial Vehicle (UAV) Light Detection and Ranging (LiDAR) data, combined with machine learning (ML), offers a powerful alternative for detailed tree structure measurement and AGB estimation. Leveraging advances in deep-learning-based individual tree detection and geometric structure estimation, including Height (H), Surface Area (SA), Volume (V), and Crown Width (CW), this study develops ML regression models for estimating individual tree AGB. We pursue three objectives: (1) evaluating four regression models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Feed-Forward Neural Network (FFNN); (2) assessing the sensitivity of model accuracy to different geometric feature combinations; and (3) improving model robustness using Synthetic Minority Over-sampling Technique (SMOTE) data augmentation to address imbalanced data. Results show that the RF model outperforms the others, achieving the lowest RMSE and the most balanced residual distribution. CW was the strongest single predictor of AGB and, in combination with H, yielded the most accurate results. This combination improved RMSE and R² by 14.2% and 89.3%, respectively, relative to single-variable models. Integrating SMOTE with RF further improved model performance, lowering RMSE by 225.6 kg (~22.1%) and increasing R² by 0.76 (~49.0%). The improvement was particularly evident in the underrepresented low and high AGB ranges. The proposed RF-SMOTE approach is a cost-effective and scalable way to generate high-quality ground truth data, enabling large-scale satellite-based biomass estimation and supporting forest carbon accounting and planning in cities and forests.
16 pages, 2265 KB  
Article
Development and Validation of an Interpretable Model for Predicting Postoperative Hyperlactatemia in Young Children Following Congenital Heart Surgery
by Yuchan Chen, Wenxin Ge, Lixin Hu, Jiaqi Chen and Yajun Chen
J. Clin. Med. 2026, 15(5), 1846; https://doi.org/10.3390/jcm15051846 (registering DOI) - 28 Feb 2026
Abstract
Objectives: Postoperative hyperlactatemia (POHL) is a common complication after pediatric cardiac surgery, yet its perioperative risk factors remain unclear. This study developed and internally validated an interpretable machine learning (ML) model to identify young children at risk for POHL. Methods: We retrospectively analyzed 3224 children aged 0 to 36 months from 2018 to 2023. Four ML models, including logistic regression (LR), random forest (RF), support vector machine (SVM), and eXtreme Gradient Boosting (XGBoost), were trained and validated. Model performance was assessed using discrimination, calibration, and classification metrics, and decision curve analysis evaluated clinical utility. SHapley Additive exPlanation (SHAP) provided both global and local interpretability. Results: Of the 3224 children, 731 (22.7%) developed POHL, with a median age of 5 months. The RF model performed best (AUC, 0.821; 95% CI, 0.787–0.854; sensitivity, 69.7%; specificity, 84.1%; Brier score, 0.146). SHAP analysis identified 8 key predictors of POHL. Established factors included cardiopulmonary bypass duration, lowest bypass temperature, epinephrine dose, and RACHS-1 category. Novel contributors comprised low body weight, reduced left ventricular end-diastolic diameter, plasma transfusion, and continued mechanical ventilation within the first 24 postoperative hours. Conclusions: We developed and internally validated an interpretable RF model that integrates established and novel predictors to estimate POHL risk in young children after cardiac surgery. Pending external validation, it may support earlier risk recognition and more personalized perioperative management in this high-risk pediatric population.
(This article belongs to the Special Issue Management of Congenital Heart Disease (CHD))
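The abstract above reports sensitivity, specificity, and a Brier score for a binary risk model. As a rough illustration of how these quantities are conventionally defined, here is a minimal sketch. It is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for the example.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Classify as positive when probability >= threshold (assumed cut-off)."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity

# Hypothetical outcomes (1 = event) and predicted risks, illustration only
y_true = [1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1]

brier = brier_score(y_true, y_prob)
sens, spec = sensitivity_specificity(y_true, y_prob)
print(brier, sens, spec)
```

A lower Brier score indicates better-calibrated probabilities, whereas sensitivity and specificity depend on the chosen threshold, so the two kinds of metric answer different questions about the same model.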
17 pages, 2147 KB  
Article
The Use of a Smartphone to Assess the Two-Minute Step Test: Validity of Machine Learning Compared to Analytical Data Processing
by Gustavo de Oliveira Hoffmann, Guilerme Parra Martini, John G. Buckley and Andre Luiz Felix Rodacki
Sensors 2026, 26(5), 1520; https://doi.org/10.3390/s26051520 (registering DOI) - 28 Feb 2026
Abstract
The 2-Minute Step Test (2MST) is commonly scored by step count, which overlooks how the task is performed. This study tested whether a smartphone held to the thigh can be used to quantify thigh kinematics to determine 2MST outcome parameters, and whether a machine learning (ML) analysis of the smartphone signal yields better agreement with motion capture (ground truth) than a more typical analytical data analysis approach (AA). Eighty-four healthy adults completed the 2MST while holding a smartphone against the right thigh. A thigh angular velocity ‘ground truth’ reference was obtained by simultaneous recording via motion capture (Vicon). Smartphone signals were resampled and processed using analytical (i.e., adaptive Butterworth filtering) and machine-learning data processing approaches (i.e., a stacked regression model trained to identify peak angular velocities). Step cycles and cycle duration were identical across equipment modalities and data analysis pipelines (mean 143 ± 18 cycles; 0.84 ± 0.11 s). However, the mean and variability of peak thigh angular velocity differed across modalities and pipelines (motion capture: 303 ± 39°·s−1; AA: 280 ± 47°·s−1; ML: 304 ± 37°·s−1). Bland–Altman agreement with the ground truth measure showed larger bias and wider limits of agreement for AA (bias 25.5°·s−1; −49.8–100.8) than for ML (bias 1.0°·s−1; −15.4–17.5). These findings support using a smartphone held to the thigh to assess how the 2MST is performed, including the number and timing of steps completed and the average and variability in thigh angular velocity across cycles. Findings also suggest that a machine learning data analysis approach provides thigh angular velocity measures that are nearly identical to motion capture techniques, whereas a typical analytical data analysis approach has errors of around 8%.
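The Bland–Altman figures quoted in the abstract above (a bias plus lower and upper limits of agreement) follow a standard recipe: take the paired differences between two measurement methods, then report their mean and the mean ± 1.96 sample standard deviations. The sketch below illustrates that recipe only. The function name, the sign convention (reference minus method), and the sample values are assumptions, not details taken from the study.

```python
import statistics

def bland_altman(reference, method):
    """Return (bias, lower limit, upper limit) of agreement for paired data."""
    if len(reference) != len(method) or len(reference) < 2:
        raise ValueError("need at least two paired measurements")
    # Assumed sign convention: reference minus method
    diffs = [r - m for r, m in zip(reference, method)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical peak angular velocities in deg/s (illustration only)
reference = [300.0, 310.0, 290.0, 305.0]  # e.g. motion capture
method = [298.0, 311.0, 287.0, 300.0]     # e.g. smartphone pipeline

bias, lower, upper = bland_altman(reference, method)
print(bias, lower, upper)
```

A bias near zero with narrow limits means the two methods can be used interchangeably for most practical purposes; a wide interval signals large per-subject disagreement even when the average bias is small.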