The US Southwest is one of the driest and hottest regions of the United States, with a recent upsurge in land surface temperature (LST). Alongside land-use change and global warming, anthropogenic pollution also contributes significantly to the rise in surface temperatures. The impact of pollution on LST has been studied only in specific urban regions, and insights from broader, more diverse topography remain limited. This research incorporates LST with land cover parameters (NDBI, MNDWI, NDBSI, SAVI, WET), surface albedo, air pollutants (NO₂, SO₂, O₃, CO), aerosol particles, urban nighttime light, and a digital elevation model to evaluate the non-linear spatial dependence of these variables for the summer (June to August 2025) and winter (December 2024 to February 2025) seasons in the US Southwest. All multi-resolution inputs were harmonized by projecting to WGS84 and applying a ~11 km fishnet sampling grid commensurate with the coarsest-resolution dataset (Sentinel-5P), ensuring each sample captures a unique pixel value across all layers. AutoML was applied to benchmark learning algorithms, and CatBoost, Extra Trees, LightGBM, HistGradientBoosting, and Random Forest were among the optimal models for predicting LST. After tuning these models using Bayesian optimization, we achieved a mean R² of 0.86 during summer and 0.84 during winter. With the hyperparameter-optimized model in place, explainable AI methods such as SHAP were employed to understand the complex nonlinear dynamics and the top contributing features. Land cover variables had a more dominant impact on the spatial distribution of summer LST, while winter LST was more influenced by pollutant parameters. Partial dependence plots (PDP) and accumulated local effects (ALE) were further used to examine the marginal effects of the top-contributing features on spatial LST prediction.
By extending the study area to the entire US Southwest, this study effectively captures urban–rural contrasts, climate- and land-cover–dependent pollutant responses, and regional climatic influences. It presents explicit spatial dependencies among LST, pollutants, land cover, topography, and nighttime activity that will aid future researchers and policymakers in effectively developing sustainable thermal planning for urban activities. Full article

(This article belongs to the Special Issue Emulation and Surrogate Modeling in Remote Sensing: Advances, Challenges and Applications)

20 pages, 4514 KB

Open Access Article

Hybrid Physical–Machine Learning Soil Moisture Modeling at Orchard Scale in Irrigated Citrus Orchards Using Sentinel 1 and 2 and Agroclimatic Data

by Héctor Izquierdo-Sanz and Enrique Moltó

Agronomy 2026, 16(5), 541; https://doi.org/10.3390/agronomy16050541 (registering DOI) - 28 Feb 2026

Abstract

Accurate orchard-scale soil moisture information is a key requirement for efficient irrigation management in perennial crops such as citrus orchards, particularly in Mediterranean environments characterized by water scarcity and strong spatial and temporal variability in soil moisture, canopy structure, and irrigation scheduling. This study proposes a hybrid physical–machine learning methodology for soil moisture estimation that integrates in situ capacitance sensor measurements, Sentinel-1 SAR observations, Sentinel-2 optical imagery, and ERA5-Land agroclimatic variables. Physically based soil moisture estimates were first obtained through the inversion of Sentinel-1 backscatter using integral equation scattering models, a physically based soil dielectric model, and a simplified vegetation attenuation scheme. These physically derived estimates were subsequently incorporated as predictors within supervised machine learning models, together with multi-source remote sensing and meteorological variables. Several algorithms were evaluated, including regularized linear models, support vector regression, random forests, and gradient boosting methods. Model performance was assessed using a strict interannual validation strategy based on independent-year predictions to ensure robust generalization. Within this methodology, tree-based ensemble models achieved the highest and most consistent performance at the orchard scale, with coefficients of determination ranging from 0.55 to 0.76 and root mean square errors typically between 0.7 and 1.1% volumetric soil moisture in the best-performing cases. Benchmarking against a physical-only baseline demonstrated that the hybrid methodology consistently reduced prediction errors and improved temporal robustness under independent-year validation. 
Overall, the results demonstrate that hybrid physical–machine learning approaches provide a robust and scalable solution for orchard-scale soil moisture monitoring in irrigated citrus orchards using operational data streams, supporting advanced irrigation management and precision agriculture applications in Mediterranean perennial cropping systems. Full article

(This article belongs to the Special Issue Advances in Remote Sensing Agronomic Application for Mapping and Modeling Soil Properties)
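
The independent-year validation mentioned in this abstract can be illustrated with a leave-one-year-out split. The following is a minimal sketch, assuming every sample is tagged with a calendar year; the function name `leave_one_year_out` and the toy data are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices) for each distinct year.

    Hypothetical helper: all samples from one year form the test set,
    and samples from every other year form the training set.
    """
    by_year = defaultdict(list)
    for index, year in enumerate(years):
        by_year[year].append(index)
    for held_out in sorted(by_year):
        test_idx = by_year[held_out]
        train_idx = sorted(
            i for year, indices in by_year.items() if year != held_out for i in indices
        )
        yield held_out, train_idx, test_idx


# Toy example: six samples spread over three years (illustrative values only).
years = [2021, 2021, 2022, 2023, 2022, 2023]
splits = list(leave_one_year_out(years))
```

Because no year contributes to both sides of a split, a model evaluated this way is always scored on a season it has not seen during training.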

31 pages, 5098 KB

Open Access Article

A Forecasting Model for Passenger Flows of Urban Rail Transit Based on Multi-Source Spatio-Temporal Features and Optimized Ensemble Learning

by Haochu Cui and Yan Sun

Modelling 2026, 7(2), 48; https://doi.org/10.3390/modelling7020048 (registering DOI) - 28 Feb 2026

Abstract

In this study, we propose a novel model based on multi-source spatio-temporal features and optimized ensemble learning for forecasting station- and line-level passenger flows of urban rail transit. First, we design a spatio-temporal feature engineering method to enhance the accuracy of forecasting using passenger flow features; the temporal features include periodic and lag effects, and the spatial features cover spatio-temporal attention mechanisms, adjacency relationships in the network graph, and station clustering features. Furthermore, an improved ensemble learning method based on Extra Randomized Trees (ExtraTrees) and Light Gradient Boosting Machine (LightGBM) is developed to forecast the station-level passenger flows using a weighted sum method in which a particle swarm optimization algorithm is adopted to determine the weights assigned to the forecasting results of the two models. Finally, ridge regression is adopted as the meta-learning model to forecast line-level passenger flows. We employed passenger flow data from three urban rail transit lines in Hangzhou to demonstrate the feasibility of the proposed model. The results indicate that it produces more accurate passenger flow forecasts at the station and line levels than benchmark models. Therefore, it can provide solid support for optimizing the operations, management, and planning of both a single urban rail transit station and the entire network. Full article

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence in Modelling)
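
The weighted-sum combination of two forecasters described in this abstract can be sketched as follows. This is an illustration only: the article determines the weights with particle swarm optimization, whereas the sketch below swaps in a plain grid search over a single weight to stay short and deterministic. All function names and numbers are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def blend(pred_a, pred_b, w):
    """Weighted sum of two forecasts: w * A + (1 - w) * B."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


def best_weight(y_true, pred_a, pred_b, steps=100):
    """Grid search for the weight in [0, 1] that minimizes validation MSE.

    Stand-in for the particle swarm search named in the abstract.
    """
    candidates = [k / steps for k in range(steps + 1)]
    return min(candidates, key=lambda w: mse(y_true, blend(pred_a, pred_b, w)))


# Illustrative validation data: one model overshoots by 10, the other undershoots by 10.
y_val = [100.0, 120.0, 90.0, 150.0]
pred_extratrees = [110.0, 130.0, 100.0, 160.0]
pred_lightgbm = [90.0, 110.0, 80.0, 140.0]

w = best_weight(y_val, pred_extratrees, pred_lightgbm)
combined = blend(pred_extratrees, pred_lightgbm, w)
```

With only two models, the search is one-dimensional, so a grid is adequate for illustration; a population-based optimizer becomes more attractive as the number of weights grows.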

32 pages, 19818 KB

Open Access Article

An Interpretable Ensemble Machine Learning Framework for Predicting the Ultimate Flexural Capacity of BFRP-Reinforced Concrete Beams

by Sebghatullah Jueyendah and Elif Ağcakoca

Polymers 2026, 18(5), 601; https://doi.org/10.3390/polym18050601 (registering DOI) - 28 Feb 2026

Abstract

Prediction of the ultimate moment capacity (Mu) of BFRP-reinforced concrete beams is complicated by nonlinear parameter interactions and the linear-elastic response of BFRP, which reduce the accuracy of conventional design models. This study develops an optimized machine learning (ML) framework incorporating random forest, extra trees, gradient boosting, AdaBoost, bagging, support vector regression, histogram-based gradient boosting, and ensemble voting and stacking strategies for reliable prediction of the Mu of BFRP-reinforced concrete beams. A comprehensive database of material, geometric, reinforcement, and BFRP mechanical parameters was analyzed, and model performance was evaluated using an 80/20 train–test split and 10-fold cross-validation based on R², RMSE, MAE, and MAPE. The stacking regressor demonstrated superior predictive performance, achieving an R² of 0.999 (RMSE = 0.590) in training and an R² of 0.988 (RMSE = 2.487) in testing, indicating excellent robustness and strong generalization capability in predicting Mu. Furthermore, interpretability analyses based on SHAP, PDP, ALE, and ICE demonstrate that span length (L) and beam depth (h) constitute the governing parameters in the prediction of Mu. Unlike prior studies focused mainly on predictive accuracy, this work proposes an optimized and interpretable stacking ensemble framework that integrates explainable AI with classical flexural mechanics for physically consistent and reliable prediction of the ultimate moment capacity of BFRP-reinforced concrete beams. Full article

(This article belongs to the Special Issue Fiber-Reinforced Polymer Composites: Progress and Prospects)
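
The four regression metrics named in this abstract (R², RMSE, MAE, and MAPE) follow standard definitions, which can be written out in a few lines. This is a minimal sketch with made-up numbers; the function name `regression_metrics` is hypothetical, and the values are not the article's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (in percent) for paired observations.

    Assumes y_true has non-zero variance and contains no zeros (needed for MAPE).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Illustrative values only.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

RMSE and MAE carry the units of the target, whereas R² and MAPE are unitless, which is why abstracts often report one of each kind.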

34 pages, 7649 KB

Open Access Article

SMOTE-Data-Augmented Machine Learning for Enhancing Individual Tree Biomass Estimation Using UAV LiDAR

by Sina Jarahizadeh and Bahram Salehi

Remote Sens. 2026, 18(5), 729; https://doi.org/10.3390/rs18050729 (registering DOI) - 28 Feb 2026

Abstract

Estimating individual tree Above-Ground Biomass (AGB) is essential for assessing ecological functions and carbon storage in both forest and urban environments. Traditional field-based methods, such as plot measurements, are costly and impractical for large-scale applications, while satellite- and aerial-based techniques lack the spatial resolution for individual-tree-level analysis. Unmanned Aerial Vehicle (UAV) Light Detection and Ranging (LiDAR) data, combined with machine learning (ML), offers a powerful alternative for detailed tree structure measurement and AGB estimation. Leveraging advances in deep-learning-based individual tree detection and geometric structure estimation, including Height (H), Surface Area (SA), Volume (V), and Crown Width (CW), this study develops ML regression models for estimating individual tree AGB. We pursue three objectives: (1) evaluating four regression models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Feed-Forward Neural Network (FFNN); (2) assessing the sensitivity of model accuracy to different geometric feature combinations; and (3) improving model robustness using Synthetic Minority Over-sampling Technique (SMOTE) data augmentation to address imbalanced data. Results show that the RF model outperforms the others, achieving the lowest RMSE and the most balanced residual distribution. CW was the strongest single predictor of AGB and, in combination with H, yielded the most accurate results. This combination improved RMSE and R² by 14.2% and 89.3% with respect to single-variable-based models. The integration of SMOTE and RF further improved model performance, lowering RMSE by 225.6 kg (~22.1%) and increasing R² by 0.76 (~49.0%). The improvement was particularly evident in underrepresented low and high AGB ranges.
The proposed RF-SMOTE approach is a cost-effective and scalable way to generate high-quality ground truth data that enable large-scale satellite-based biomass estimation and support forest carbon accounting and planning in cities and forests. Full article

(This article belongs to the Special Issue UAV Applications for Forest Management: Wood Volume, Biomass, and Mapping (Second Edition))

24 pages, 712 KB

Open Access Article

Leveraging Machine Learning to Evaluate the ESG Performance of Listed and OTC Firms in a Small Open Economy

by Hui-Juan Xiao, Tsung-Nan Chou, Jian-Fa Li and Kuei-Kuei Lai

Appl. Syst. Innov. 2026, 9(3), 52; https://doi.org/10.3390/asi9030052 (registering DOI) - 27 Feb 2026

Abstract

This study investigates the predictability of Environmental, Social, and Governance (ESG) performance using financial fundamentals within the context of Taiwan, a prominent small open economy integrated into global value chains. As global markets transition toward mandatory sustainability reporting, identifying the financial antecedents of ESG outcomes is critical for risk management and regulatory oversight. Utilizing a decade of firm-level data (2014–2023) from the Taiwan Economic Journal (TEJ), we employ supervised machine learning (ML) architectures, including Decision Tree, Random Forest, and Extreme Gradient Boosting (XGBoost), to classify firms into ESG performance tiers based on indicators such as profitability, valuation, and scale. Our empirical results provide robust support for the Slack Resources Hypothesis, identifying Return on Assets (ROA) and Firm Size (SIZE) as the most consistent predictors of ESG excellence across the semiconductor, cement, and steel sectors. Conversely, market-based indicators (Tobin’s Q) dominate predictive models for the financial industry. Methodologically, XGBoost delivers superior predictive calibration for the financial sector, while Decision Trees offer highly interpretable threshold-based logic for risk screening. Our study contributes a transparent “early-warning” framework, enabling investors and regulators to identify sustainability risks through auditable financial benchmarks. The findings suggest that while financial latitude is a structural prerequisite for ESG engagement, it is not its sole determinant, pointing toward a “virtuous circle” of financial health and managerial quality. Full article

35 pages, 6524 KB

Open Access Article

Strictly Chronological CNN Embeddings with Gradient-Boosted Trees for Next-Day Log-Return Forecasting

by Zezhi Bao, Xiaofei Li, Menghuan Shi, Yueen Huang and Junjie Du

Symmetry 2026, 18(3), 416; https://doi.org/10.3390/sym18030416 - 27 Feb 2026

Abstract

Daily equity return forecasting is challenging due to low signal-to-noise ratios, heavy-tailed innovations, and persistent distribution drift. We study one-step-ahead log-return prediction using daily market variables and return-based transformations. We propose a CNN–LightGBM hybrid that transfers a last-step CNN embedding to a gradient-boosted tree regressor through explicit embedding standardization, which stabilizes the representation interface for tree learning. To reduce train-to-evaluation mismatch under drift, we adopt split-wise, training-only standardization with a recency-aware fit-latest-W rule. Return-related predictors are anchored on a one-sided wavelet-denoised close series, while other market channels are retained in their original form to preserve episodic extremes. Experiments on NIFTY50 with walk-forward model selection show statistically reliable accuracy gains over Naive0 and competitive performance against representative deep sequence baselines, and the supplementary evaluations on HDFC and INDA provide additional out-of-sample evidence on these two assets under the same strictly chronological protocol. A long-or-cash decision rule based on the return forecasts yields positive risk-adjusted performance under realistic transaction-cost assumptions, supporting the practical relevance of the predictive signal. Full article

(This article belongs to the Special Issue Symmetry in Artificial Intelligence and Applications)
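
The training-only standardization with a "fit-latest-W" rule mentioned in this abstract can be illustrated roughly as follows. The sketch rests on one reading of the abstract, which is an assumption: scaling statistics are estimated from only the most recent W training observations and then applied unchanged to the evaluation split, so no evaluation data leaks into the scaler. Function names and values are hypothetical.

```python
import statistics


def fit_latest_w(train_values, w):
    """Estimate mean and standard deviation from the last w training values only.

    Assumed interpretation of a recency-aware fit; not the authors' code.
    """
    window = train_values[-w:]
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    # Guard against a constant window, which would give a zero divisor.
    return mean, (std if std > 0 else 1.0)


def standardize(values, mean, std):
    """Apply previously fitted statistics; never refit on evaluation data."""
    return [(v - mean) / std for v in values]


# Illustrative series with a level shift: older values differ from recent ones.
train = [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]
evaluation = [16.0, 12.0]

mean, std = fit_latest_w(train, w=3)
eval_scaled = standardize(evaluation, mean, std)
```

Fitting on the recent window keeps the scaler close to the regime the evaluation period is likely to share, at the cost of noisier estimates when W is small.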

28 pages, 3163 KB

Open Access Article

Mineral Prospectivity Prediction in the Mayoumu Area, Tibet, Based on Multi-Source Exploration Information and Ensemble Learning Models

by Kai Qiao, Tao Luo, Shihao Ding, Cong Han, Shisong Gong, Zhiwen Ren and Yong Huang

Remote Sens. 2026, 18(5), 703; https://doi.org/10.3390/rs18050703 - 26 Feb 2026

Abstract

Plateau–orogenic belts host a substantial share of global gold resources, yet quantitative prospectivity mapping is challenged by complex mineralization and strongly heterogeneous, multi-scale datasets. Using the Mayoumu area (Tibet) as a representative orogenic gold district, we develop an integrated multi-source workflow that fuses remote-sensing alteration information with regional geochemical and structural constraints within an ensemble-learning framework. Alteration anomalies were mapped from GF-5 hyperspectral imagery using mixture-tuned matched filtering (MTMF) and from Sentinel-2 multispectral imagery using the iCrosta method to extend alteration signals across scales. Geochemical anomalies were extracted from 1:200,000 stream-sediment data through isometric log-ratio (ILR) transformation and robust principal component analysis (RPCA). At the same time, ore-controlling structures were quantified using Euclidean-distance-to-fault layers. Three Boosting-based ensemble models—gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM)—were trained to predict mineral prospectivity. Performance was evaluated using confusion matrix metrics and ROC–AUC, and key predictors were interpreted using SHAP. All three models achieved AUC values > 0.90, with LightGBM performing best (AUC = 0.94) and delineating high-prospectivity zones that coincide with known occurrences and highlight additional targets. The proposed workflow provides a practical, transferable reference for gold prospectivity mapping in complex orogenic belts worldwide. Full article

(This article belongs to the Topic Big Data and AI for Geoscience)

23 pages, 4967 KB

Open Access Article

Comparative Evaluation of Machine Learning Models Using Structured and Unstructured Clinical Data for Predicting Unplanned General Medicine Readmissions in a Tertiary Hospital in Australia

by Yogesh Sharma, Campbell Thompson, Arduino A. Mangoni, Chris Horwood and Richard Woodman

Computers 2026, 15(3), 138; https://doi.org/10.3390/computers15030138 - 26 Feb 2026

Abstract

Background: Unplanned 30-day hospital readmissions, a key healthcare quality metric, are common and costly. Prediction models built on structured data often perform modestly, and the added value of unstructured clinical notes remains unclear. Methods: This retrospective cohort study included 4135 general medicine admissions to a tertiary Australian hospital between July 2022 and June 2023. Structured predictors included demographics, comorbidities, frailty, prior healthcare utilisation, length-of-stay, inflammatory markers, socioeconomic indicators, and lifestyle factors. We developed deep learning models using structured data alone, unstructured text alone, and a combined multimodal architecture integrating both modalities. For benchmarking, multiple classical machine learning models trained on structured features were evaluated using identical data splits, including logistic regression, XGBoost, random forest, gradient boosting, extra trees, and HistGradient Boosting. Model performance was assessed on a hold-out test set using ROC-AUC, accuracy, precision, recall, and F1-score. Results: Unplanned readmissions occurred in 24.3% of admissions. Among classical machine learning models, logistic regression achieved the highest discrimination (ROC-AUC 0.64), with no substantial improvement observed from ensemble methods. Structured-only deep learning achieved ROC-AUC 0.62. Unstructured text-only and multimodal models achieved ROC-AUCs of 0.52 and 0.58, respectively. Although overall discrimination of the multimodal model was lower than structured-only performance, it demonstrated improved sensitivity and F1-score for identifying patients who were readmitted. Prior hospitalisations, emergency department visits, and comorbidity burden were the strongest predictors. Conclusions: Structured EMR variables remain the main drivers of 30-day readmission risk. 
More complex classical machine learning models did not outperform logistic regression, and incorporating unstructured clinical text provided only modest improvement in identifying high-risk patients without enhancing overall discrimination. Full article

(This article belongs to the Special Issue Artificial Intelligence (AI) in Medical Informatics)
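
The classification metrics reported in this abstract (ROC-AUC, precision, recall, and F1-score) can be computed from first principles, as sketched below. The data are invented for illustration, the 0.5 decision threshold is an assumption, and the function names are hypothetical. ROC-AUC is computed here as the fraction of positive/negative pairs ranked correctly, with ties counted as half.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = readmitted, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels and predicted risks only.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

ROC-AUC depends only on how cases are ranked, while precision, recall, and F1 depend on the chosen threshold, which is how a model can have lower discrimination yet higher sensitivity and F1.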

32 pages, 8251 KB

Open Access Article

Tracking Quarter-Century Spatio-Temporal Soil Salinization Dynamics in Semi-Arid Landscapes Using Earth Observation and Machine Learning

by Aiman Achemrk, Jamal-Eddine Ouzemou, Ahmed Laamrani, Ali El Battay, Soufiane Hajaj, Sabir Oussaoui and Abdelghani Chehbouni

Remote Sens. 2026, 18(5), 687; https://doi.org/10.3390/rs18050687 - 26 Feb 2026

Abstract

Soil salinization represents a critical constraint to sustainable agriculture in arid and semi-arid regions, where salinity threatens soil productivity, water quality, and ecosystem resilience. Soil salinity pattern prediction is complicated by tightly coupled landscape hydro-climatic processes, wherein the central Sabkha acts as a persistent salt sink, episodic inundation and intense evaporation concentrate dissolved salts, and a shallow saline groundwater table interacts with the semi-arid climate to drive surface salinization. Conventional mapping is laborious and lacks the precision needed to capture the spatio-temporal dynamics of soil salinity across landscapes. This study developed an integrated framework uniting multi-temporal Landsat imagery (2000–2025), hypsometric data, climatic indicators, and in situ soil electrical conductivity (ECe) measurements to model soil salinity dynamics using machine learning (ML) over the semi-arid Sehb El Masjoune (SEM) region, Morocco. A total of 233 soil samples were collected in the study area in 2022, 2023, 2024, and 2025 to assess spatial variability and to calibrate and validate the models. Three predictive algorithms were assessed: Gradient-Boosted Trees (GBT), Support Vector Regression (SVR), and Random Forest (RF). SVR achieved the highest predictive capability (R² = 0.76; RMSE = 32.91 dS/m), and the SVR-based salinity maps revealed a distinct spatial organization of salinization processes, characterized by extremely saline soils (≥64 dS/m) concentrated in the central study area (i.e., the SEM center) and a progressive decline toward adjacent agricultural lands (0–8 dS/m). From 2000 to 2025, moderately to highly saline areas (≥16 dS/m) expanded by nearly 10%, driven by recurrent droughts and inefficient drainage.
Hydroclimatic analysis confirmed that dry years (Standardized Precipitation Index, SPI ≤ −0.5) promoted net salinity build-up through the expansion and persistence of moderate-to-high salinity classes (≥16 dS/m), whereas wet years (SPI ≥ +0.5) favored temporary leaching and partial recovery, mainly within the low-to-moderate range. This integrative remote sensing–ML approach provides a robust and scalable framework for operational soil salinity monitoring, offering valuable insights for sustainable land-use planning in similar data-scarce Sabkha agroecosystems. Full article

(This article belongs to the Special Issue Environmental Monitoring Based on Remote Sensing, Earth Observation and Geoinformation)
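
The dry-year and wet-year thresholds quoted in this abstract (SPI ≤ −0.5 and SPI ≥ +0.5) translate directly into a small classifier. This is a minimal sketch: the "near-normal" label for values between the two thresholds is an assumption, since the abstract does not name that class, and the SPI values below are invented.

```python
def classify_year(spi):
    """Label a year from its Standardized Precipitation Index (SPI).

    Thresholds follow the abstract; the middle label is an assumed name.
    """
    if spi <= -0.5:
        return "dry"
    if spi >= 0.5:
        return "wet"
    return "near-normal"


# Illustrative SPI values only, not data from the article.
spi_by_year = {2021: -1.2, 2022: 0.1, 2023: 0.8, 2024: -0.5}
labels = {year: classify_year(spi) for year, spi in spi_by_year.items()}
```

Grouping years this way is what allows salinity change to be compared between drought and wet periods, as the abstract does.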

22 pages, 6811 KB

Open Access Article

Sound-Based Tool Wear Classification in Turning of AISI 316L Using Multidomain Acoustic Features and SHAP-Enhanced Gradient Boosting Models

by Savaş Koç, Mehmet Şükrü Adin, Ramazan İlenç, Mateusz Bronis and Serdar Ekinci

Materials 2026, 19(5), 861; https://doi.org/10.3390/ma19050861 - 25 Feb 2026

Abstract

Reliable tool-wear monitoring is essential for maintaining machining quality and preventing unscheduled downtime in manufacturing. This investigation presents a sound-based classification framework for identifying wear states in the turning of AISI 316L stainless steel using advanced gradient-boosting models. Acoustic signals were recorded under constant cutting parameters to eliminate process-induced variability, and each recording was divided into standardized 2 s segments. A total of 540 multidomain features—including RMS, ZCR, spectral descriptors, Mel-spectrogram statistics, MFCCs and their derivatives, and discrete wavelet energies—were extracted to capture both stationary and transient characteristics of tool–workpiece interactions. Feature selection was performed using a three-stage pipeline comprising Boruta, LASSO, and SHAP analysis, resulting in a compact subset of highly informative descriptors. LightGBM, XGBoost, and CatBoost classifiers were trained using stratified 10-fold cross-validation across three wear states: Unworn, Slight wear, and Severe wear. LightGBM and XGBoost achieved the best performance, with mean accuracies above 0.96 and strong PRC–AUC and ROC–AUC values (0.98–1.00). Although Slight wear remained the most difficult class due to its transitional acoustic characteristics, all models showed clear separability for Unworn and Severe wear conditions. The results confirm that boosted decision-tree methods combined with SHAP-enhanced feature selection provide an effective, low-cost, and non-contact solution for tool-wear classification in 316L turning. Full article

(This article belongs to the Special Issue Cutting Process of Advanced Materials)
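
Two of the time-domain features named in this abstract, RMS and zero-crossing rate (ZCR), together with the fixed 2 s segmentation, can be sketched in plain Python. The sampling rate, the toy signal, and the choice to drop a trailing partial segment are assumptions made for illustration; the article's full 540-feature pipeline is not reproduced here.

```python
import math


def split_segments(signal, sample_rate, seconds=2.0):
    """Cut a signal into non-overlapping fixed-length segments.

    A trailing partial segment is dropped (an assumption of this sketch).
    """
    size = int(sample_rate * seconds)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def rms(segment):
    """Root mean square amplitude of one segment."""
    return math.sqrt(sum(x * x for x in segment) / len(segment))


def zero_crossing_rate(segment):
    """Fraction of consecutive sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(segment) - 1)


# Toy signal at an unrealistically low 4 Hz so that each 2 s segment has 8 samples.
sample_rate = 4
signal = (
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # alternating sign
    + [2.0] * 8                                     # constant level
    + [0.5]                                         # leftover sample, dropped
)

segments = split_segments(signal, sample_rate)
features = [(rms(s), zero_crossing_rate(s)) for s in segments]
```

RMS tracks signal energy and ZCR tracks how rapidly the waveform oscillates, so the two segments above separate cleanly even though both are trivial signals.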

25 pages, 6010 KB

Open Access Article

Comprehensive Early Alert and Adaptive Local Response Framework for Wildfire Risk in Transmission Line Corridors Using Coupled Global Factors in Power System

by Tianliang Xue, Chengsi Xiang, Xi Chen and Lei Zhang

Processes 2026, 14(5), 752; https://doi.org/10.3390/pr14050752 - 25 Feb 2026

Abstract

Escalating global climate change has intensified the frequency and scale of wildfires in mountainous regions hosting transmission line infrastructure. These conflagrations act as extreme meteorological events, capable of generating localized heatwaves that compromise the air insulation of power lines and trigger protective relay operations, thereby posing systemic threats to regional grid stability. To enhance wildfire early-warning efficacy for grid security, this study formulates wildfire early warning for power transmission corridors as a regression-based risk prediction problem and proposes a hierarchical “global screening–local refinement” risk assessment framework. The primary contribution of this study lies in the integration of a machine-learning-based global wildfire risk screening model with tower-level spatial refinement using geographically weighted regression (GWR), enabling coordinated global–local wildfire risk characterization along power transmission corridors. The framework employs a predictive model built on a Gradient Boosting Decision Tree (GBDT) algorithm, integrating geospatial and statistical analyses. A global risk model, utilizing historical data from the Himawari-8 satellite alongside meteorological, topographic, and anthropogenic variables, produces a composite risk index. This index is spatially interpolated via Kriging to generate stratified wildfire risk maps for broad-area assessment. For precise corridor-level analysis, these globally projected risk indices, along with localized terrain features, inter-tower clearance distances, and proximity to historical ignition points, are incorporated into a GWR model. This yields a spatially calibrated wildfire risk index along critical routes. The results show that the GBDT-based model achieved the best predictive performance among the evaluated regression models, with an R² of 0.626 and a mean squared error of 0.178.
This approach offers a scientifically robust and operationally viable reference for wildfire prevention strategies in power line maintenance. Full article

(This article belongs to the Special Issue AI-Driven Innovations for Enhancing Power System Stability and Operational Efficiency)

35 pages, 1965 KB

Open Access Article

Efficient Recurrent Multi-Layer Neural Network for Multi-Scale Noise and Activity Drift Mitigation in Wideband Cognitive Radio Networks

by Sunil Jatti and Anshul Tyagi

Algorithms 2026, 19(3), 172; https://doi.org/10.3390/a19030172 - 25 Feb 2026

Abstract

Wideband spectrum sensing in Cognitive Radio Networks (CRNs) is challenging due to sparse primary user (PU) activity and noise clustering, which obscure signals and generate false alarms. Hence, a novel “Graph Discrete Wavelet Bayesian Kernel Boosted Decision Self-Attention Clustering Neural Network (GDWB-KBSC-NN)” is proposed. When sparse PU activity is masked by irregular interference bursts, traditional sensing algorithms misclassify weak transmissions as noise, leading to low detection reliability. To resolve this, the first hidden layer employs Discrete Wavelet Sparse Bayesian Kernel Analysis (DW-SBK), integrating Discrete Wavelet Packet Transform (DWPT), Sparse Bayesian Learning (SBL), and Kernel PCA. This restores the true sparse pattern of the spectrum, separates interference from actual PU signals, and enhances detection of weak channels. Additionally, PU signals are fragmented due to cross-scale activity drift, where dynamic bandwidth switching and variable burst durations disrupt temporal continuity. Therefore, the second layer incorporates Gradient Boosted Multi-Head Fuzzy Clustering (GB-MHFC), where Gradient Boosted Decision Trees (GBDT) model nonlinear spectral–temporal patterns, Multi-Head Self-Attention (MHSA) captures long- and short-range temporal dependencies, and Fuzzy C-Means Clustering (FCM) groups feature representations into stable PU activity modes, thereby reducing misclassifications and enhancing robustness under highly dynamic CRN conditions. The proposed method demonstrates superior performance with a maximum detection probability of 0.98, classification accuracy of 98%, lowest sensing error of 5.412%, and the fastest sensing time of 3.65 s. Full article

(This article belongs to the Special Issue Energy-Efficient Algorithms for Large-Scale Wireless Sensor Networks)

16 pages, 2363 KB

Open Access Article

A Data-Efficient Machine Learning Approach for Breast Ultrasound Lesion Classification Integrating Image-Derived Features and Sonographic Descriptors

by Adil Gursel Karacor and Sevim Sahin

Diagnostics 2026, 16(5), 664; https://doi.org/10.3390/diagnostics16050664 - 25 Feb 2026

Abstract

Background/Objectives: Breast ultrasound is widely used for the diagnostic evaluation of breast lesions; however, reliable lesion characterization remains challenging due to substantial image heterogeneity and the limited size of most clinically available datasets. These constraints reduce the generalizability of end-to-end deep learning approaches in routine practice. The objective of this study was to evaluate a data-efficient diagnostic framework that integrates image-derived features with clinical sonographic descriptors to improve breast ultrasound lesion classification in small cohorts. Methods: Ultrasound images from the publicly available BrEaST-Lesions dataset were processed using a pretrained convolutional neural network to extract compact image feature representations from full images, lesion masks, and cropped tumor regions. These features were combined with manually recorded sonographic descriptors after label encoding to form a unified tabular dataset. Gradient-boosted tree models were trained using descriptor-only and fused feature sets with fivefold stratified cross-validation and evaluated on an independent external hold-out test set. Results: Using sonographic descriptors alone, the best-performing model (LightGBM) achieved an external validation accuracy of 0.88, with an area under the receiver operating characteristic curve (AUC) of 0.95. Incorporation of image-derived features improved diagnostic performance on the external test set, yielding an accuracy of 0.88, an AUC of 0.96, and a sensitivity of 1.00 for malignant lesion detection. The fused framework demonstrated more stable generalization than descriptor-only models, particularly for malignant cases. Conclusions: Combining image-derived features with clinical sonographic descriptors within a tabular learning framework provides a robust and data-efficient approach for breast ultrasound-based lesion classification. 
This strategy supports diagnostic decision-making in small ultrasound datasets and represents a clinically realistic alternative when large-scale deep learning models are impractical. Full article

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

17 pages, 7402 KB

Open Access Article

Digital Mapping of Soil pH Using Tree-Based Models Coupled with Residual Kriging

by Yanyan Tian, Suyang Cao, Pei Sun, Quanguo Kang, Shaohua Liu, Xinao Zheng, Lifei Wei and Qikai Lu

Land 2026, 15(3), 365; https://doi.org/10.3390/land15030365 - 25 Feb 2026

Abstract

Soil pH is a critical soil property governing nutrient availability and ecosystem functioning. Digital mapping of its spatial distribution is essential for precision agriculture and sustainable land management. This study performs a comparative analysis of six tree-based models coupled with residual kriging (RK) for 30 m resolution mapping of soil pH in Shayang County, China. Specifically, random forest (RF), extremely randomized trees (ERT), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost) were used. Based on 1343 soil samples and 32 environmental variables, experimental results demonstrate that the integration of RK enhanced the prediction accuracy of all standalone models by taking the spatial dependence of residuals into account. Among the models, CatBoost-RK achieved the best performance with an R² of 0.7265, RMSE of 0.5072, and RPD of 1.9122, closely followed by ERT-RK and RF-RK. The analysis of variable importance identified soil type (ST) and mean annual precipitation (MAP) as the most critical factors affecting soil pH distribution. The generated 30 m resolution soil pH map reveals distinct patterns across different land use types, with croplands showing lower soil pH and grasslands exhibiting higher pH with greater variability. These findings confirm the effectiveness of the hybrid ML-RK framework and provide valuable insights for selecting optimal modeling strategies in digital soil mapping. Full article

(This article belongs to the Special Issue Digital Soil Mapping for Soil Health Monitoring in Agricultural Lands)
