MDPI - Publisher of Open Access Journals

20 pages, 2240 KB

Open Access Article

Prediction of Surface Soil Organic Carbon in Karst Cropland Based on Multi-Temporal Remote Sensing Data and Stacking Ensemble Method

by Kaiping Li, Yuan Li, Wenxian Wu and Leping Yang

Land 2026, 15(5), 884; https://doi.org/10.3390/land15050884 (registering DOI) - 20 May 2026

Abstract

Accurate prediction of soil organic carbon (SOC) in cropland is important for food production, sustainable soil management, and carbon sequestration. Although digital soil mapping (DSM) has been widely used in the prediction of SOC, most of the current DSM studies use only a single remote sensing image and a single machine learning (ML) approach, and few studies apply multi-temporal remote sensing images and ensemble methods. This study explores the accuracy of the prediction of surface SOC in cropland by comparing multi-temporal Sentinel-2A remote sensing with random forest (RF), support vector machine (SVM), gradient boosted decision trees (GBDT), extreme gradient boosted decision trees (XGBoost), and a stacking ensemble method consisting of these four ML approaches. The potential of multi-temporal remote sensing data and the stacking ensemble method for SOC prediction is discussed. To this end, 76 sampling points were selected in the study area, soil samples were collected at depths of 0–10 cm and 10–20 cm for each soil profile, and a total of 152 soil samples were obtained. Remote sensing variables extracted from topography, climate, and Sentinel-2A images on 13 January and 31 August 2023 were used as predictor variables. The results showed that the stacking ensemble method with multi-temporal predictor variables outperformed all single models and variable combinations. However, the overall predictive accuracy remained moderate, with the best performance for 0–10 cm (R² = 0.386, RMSE = 4.782, MAE = 3.36) and 10–20 cm (R² = 0.425, RMSE = 4.484, MAE = 4.031). The relatively low R² values, despite the use of advanced methods, highlight the inherent challenges of SOC prediction in highly fragmented karst croplands. This study demonstrates the incremental benefit, rather than a universal high accuracy, of combining multi-temporal Sentinel-2 imagery with a stacking ensemble to improve SOC mapping in such complex environments. Full article

19 pages, 6884 KB

Open Access Article

Data-Driven Evaluation of Bearing Capacity for In-Service Pile Foundations Using Dynamic Stiffness and Machine Learning

by Yuxuan Zeng, Jun Guo, Wangyu He, Yueying Chen and Meng Ma

Geotechnics 2026, 6(2), 50; https://doi.org/10.3390/geotechnics6020050 - 18 May 2026

Viewed by 113

Abstract

In the assessment of bearing capacity for in-service bridge pile foundations, static load tests are costly, destructive, and difficult to scale. The traditional dynamic formula approach relies heavily on an empirical dynamic–static conversion coefficient that introduces considerable uncertainty. To address these limitations, this study proposes a non-destructive evaluation method for pile foundation bearing capacity based on measured dynamic stiffness and machine learning algorithms. Using data from a highway bridge inspection project, a dataset comprising 680 piles was compiled, including measured dynamic stiffness, geometric parameters, and design load information. An end-to-end binary classification model was constructed to map multidimensional physical features to an engineering decision target, namely, whether the bearing capacity meets the design requirement. The performance of several algorithms was compared, including logistic regression, random forest, and gradient boosting decision tree (GBDT). Among the evaluated models, the GBDT model demonstrated the best capability for capturing the complex nonlinear pile–soil interactions. On an independent test set, it achieved an accuracy of 96.3% and an F1 score of 0.96, with a very low false-negative rate, satisfying the high precision required for engineering safety screening. Feature importance analysis indicates that measured dynamic stiffness contributed approximately 42% to the classification outcome, establishing it as the dominant indicator for detecting capacity deficiencies and reinforcing its physical relevance as a key health indicator for pile foundations. This study demonstrates that data-driven methods can effectively circumvent the uncertainty associated with traditional empirical coefficients, providing a promising approach to the health monitoring and rapid evaluation of in-service bridge pile foundations. Full article

(This article belongs to the Special Issue Recent Developments in the Machine Learning Modeling of Geotechnical Data)

27 pages, 36911 KB

Open Access Article

Land Use Classification in Rare Earth Mining Areas Based on Multi-Source Remote Sensing and Feature Optimization

by Xiaolong Cheng, Bingzi Li, Zihao Yuan, Weifeng He and Zhirong Wen

Land 2026, 15(5), 797; https://doi.org/10.3390/land15050797 - 8 May 2026

Viewed by 201

Abstract

Rare earth elements are vital, non-renewable strategic resources, and their exploitation has significant impacts on regional ecological security and sustainable development. To address the issue of insufficient accuracy in land use classification in rare earth mining areas, this study takes the Lingbei rare earth mining area in Dingnan County, Jiangxi Province, as a case study. Multi-source remote sensing data, including Sentinel-2 imagery, Sentinel-1 SAR data, nighttime light data, and DEM data, were integrated to construct a feature set combining spectral, textural, and topographic information. On this basis, this study developed a feature optimization framework that combines recursive feature elimination (RFE), mean decrease accuracy (MDA), and K-fold cross-validation (CV), termed RFE-MDA-CV. We designed nine feature combination schemes and compared them with the optimal feature subset. Their performance was systematically evaluated across four classifiers: RF, SVM, CART, and GBDT. The results were as follows: (1) the optimized feature set combined with the RF classifier consistently achieved the highest classification performance, with a mean OA of approximately 93.2% and a kappa coefficient of about 0.916, outperforming CART and SVM by around 4–5 percentage points; (2) land use remained generally stable between 2016 and 2023, but frequent conversions occurred between forest land, cropland, and impervious surfaces, mainly driven by urban expansion and mining activities; and (3) cross-regional experiments demonstrated that the proposed feature optimization framework has good applicability and transferability in mining areas with similar geomorphological and metallogenic conditions. Overall, the proposed RFE-MDA-CV method can be effectively implemented on the Google Earth Engine platform, significantly improving the accuracy and robustness of land use classification in rare earth mining areas, while providing reliable technical support for ecological monitoring and land resource management.

28 pages, 6191 KB

Open Access Article

Prediction of Groove Depth in Femtosecond Laser Ablation via Attention Mechanism and Monotonic Constraint

by Guangxian Li, Luyang Ding, Meng Liu, Hui Xie and Songlin Ding

Machines 2026, 14(5), 509; https://doi.org/10.3390/machines14050509 - 3 May 2026

Viewed by 211

Abstract

Femtosecond laser ablation (FLA) is efficient for the machining of micro-groove arrays on the surface of ultrahard cutting tools. The depth of the groove determines the precision and efficiency of ablation. In this study, an “Attention-based Monotonic Physics-Guided Neural Network” (AM-PGNN) algorithm is proposed to accurately predict groove depth in the FLA of tungsten carbide (WC). The algorithm takes as inputs the machining parameters that directly govern energy deposition and thermal accumulation, and therefore the depth of the micro-groove. A dedicated physical loss embeds the physics-guided monotonic relationships between these parameters and groove depth into the learning process, and an attention mechanism provides adaptive feature weighting, which together strengthen the representation of causal dependencies. Experimental data for training and testing are obtained from the FLA of WC with different machining parameters. Comparison between AM-PGNN and typical algorithms, including a Support Vector Machine (SVM), Deep Neural Network (DNN), Convolutional Neural Network (CNN), Gradient Boosting Decision Tree (GBDT), and a conventional PGNN, demonstrates that the proposed AM-PGNN achieves superior prediction accuracy. Moreover, AM-PGNN attains a physical consistency degree (PCD) of 100%, indicating strict adherence to the monotonic behavior observed in practice. AM-PGNN also exhibits enhanced robustness to input perturbations, as reflected by reduced standard deviation (Std) and normalized absolute deviation (NAD). Finally, AM-PGNN is shown to be applicable in the FLA of different materials through additional experiments on Cu and SiC, achieving R² values above 0.93 while maintaining a PCD of 100%.

(This article belongs to the Special Issue Integration of Industrial Machines into Smart Manufacturing, Digital Twin Technology for Industry 4.0 Machinery)

32 pages, 2574 KB

Open Access Article

ETGB-SEF: Entmax-TabNet Gradient Boosting Stacked Ensemble Framework for Disease Stage Prediction

by Bowen Yang and Wenying He

Symmetry 2026, 18(5), 779; https://doi.org/10.3390/sym18050779 - 1 May 2026

Viewed by 230

Abstract

Disease staging is a critical component of clinical diagnosis, treatment, and prognosis assessment. However, structured clinical data typically exhibit high-dimensional, nonlinear feature interactions; stage-specific dominant features; and threshold-based discontinuities. These characteristics make it challenging for a single model to achieve both global feature modeling capability and local discriminative power, thereby limiting further improvements in prediction accuracy. To address this limitation, we propose a novel deep ensemble learning framework, ETGB-SEF (Entmax-TabNet Gradient Boosting Stacked Ensemble Framework), for multiclass disease staging. First, at the base model level, Entmax-1.5 replaces Sparsemax in TabNet, thereby enabling an adjustable sparse feature selection mechanism that enhances the ability to model weakly correlated clinical features while preserving interpretability. Second, at the model-fusion level, a stacked ensemble architecture in the probability space is developed. This architecture integrates the modified TabNet with Gradient Boosting Decision Trees (GBDT) in a complementary way, enabling the former to capture global nonlinear semantic dependencies while the latter captures threshold-based discriminative boundaries among clinical features. Extensive experiments on real-world datasets demonstrate that the proposed method consistently outperforms existing state-of-the-art approaches. Full article

(This article belongs to the Section Computer)
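
The ETGB-SEF abstract above replaces Sparsemax with Entmax-1.5 in TabNet's feature-selection step. As a rough illustration of what that mapping does, and not the authors' implementation, the sketch below computes Entmax with alpha = 1.5 in plain Python. It relies on the standard closed form p_i = max(z_i / 2 - tau, 0)^2, with the threshold tau chosen so that the outputs sum to 1. The bisection solver, the function name `entmax15`, and the iteration count are assumptions made for this example.

```python
def entmax15(scores, iters=60):
    """Entmax with alpha = 1.5, solved by bisection on the threshold tau."""
    half = [s / 2.0 for s in scores]   # (alpha - 1) * z with alpha = 1.5
    hi = max(half)                     # at tau = hi every term is 0, so the sum is 0
    lo = hi - 1.0                      # at tau = lo the largest term alone equals 1
    for _ in range(iters):
        tau = (lo + hi) / 2.0
        total = sum(max(h - tau, 0.0) ** 2 for h in half)
        if total > 1.0:
            lo = tau                   # threshold too small, mass exceeds 1
        else:
            hi = tau                   # threshold too large (or exact)
    tau = (lo + hi) / 2.0
    probs = [max(h - tau, 0.0) ** 2 for h in half]
    norm = sum(probs)                  # renormalise away the bisection residual
    return [p / norm for p in probs]
```

Unlike softmax, the output can contain exact zeros: widely separated scores put all mass on the largest entry, while tied scores share it equally. This is the sparse but less abrupt selection behavior the abstract refers to.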

23 pages, 924 KB

Open Access Article

Vertical Federated XGBoost with Privacy Preservation via Secure Multiparty Computation

by Asma Ramay, Estrid He, Mengmeng Yang, Tabinda Sarwar, Xinqian Wang and Xun Yi

J. Cybersecur. Priv. 2026, 6(3), 79; https://doi.org/10.3390/jcp6030079 (registering DOI) - 1 May 2026

Viewed by 222

Abstract

Gradient Boosted Decision Trees (GBDTs) are popular for their strong predictive performance. However, in domains like finance and healthcare, data are often distributed across organizations, making collaborative model training challenging due to privacy concerns. Vertical federated learning (VFL) enables such collaboration when data are split by features, but many existing methods focus on protecting raw data while exposing sensitive model information, such as gradients and Hessians—especially to the label-owning party. Techniques like Homomorphic Encryption and Secret Sharing help, but often rely on trusted or privileged parties and may still leak intermediate statistics. To address this, we propose MPC-XGB, a privacy-preserving framework for training XGBoost under VFL with an honest-but-curious threat model. It uses secure three-party computation with Replicated Secret Sharing, distributing data across non-colluding servers and performing all computations on shares. This ensures that raw data, labels, and model statistics remain hidden, while supporting both secure training and prediction. Experiments show that MPC-XGB achieves strong performance (0.93 accuracy, 0.82 AUC), comparable to that of existing methods, with improved privacy guarantees. Full article

(This article belongs to the Section Privacy)
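
The MPC-XGB abstract above builds on three-party Replicated Secret Sharing, in which all computation happens on shares. The following is a minimal sketch of that primitive only, not of the MPC-XGB protocol itself. The ring size (2^64) and the helper names `share`, `add`, and `reconstruct` are assumptions made for illustration. Each value is split into three additive shares, and each party holds two of them, so no single party can recover the secret.

```python
import secrets

MOD = 2 ** 64  # ring Z_{2^64}; the ring size is an assumption for this sketch

def share(x):
    """Split x into three additive shares and give each party two of them."""
    s0 = secrets.randbelow(MOD)
    s1 = secrets.randbelow(MOD)
    s2 = (x - s0 - s1) % MOD
    # Party i holds (s_i, s_{i+1}); no single party sees all three shares.
    return [(s0, s1), (s1, s2), (s2, s0)]

def add(a, b):
    """Add two shared values locally, with no communication between parties."""
    return [((a0 + b0) % MOD, (a1 + b1) % MOD)
            for (a0, a1), (b0, b1) in zip(a, b)]

def reconstruct(shares):
    """Recombine the secret from the first component held by each party."""
    return sum(first for first, _ in shares) % MOD
```

Addition is the easy case because it is linear in the shares. Multiplication and comparison, which tree training also needs, require interaction between the parties and are not covered here.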

27 pages, 3578 KB

Open Access Article

Predicting Corporate Carbon Disclosure in China: Evidence from Interpretable Machine Learning

by He Peng Yang, Norhaiza Bt. Khairudin and Danilah Binti Salleh

Sustainability 2026, 18(8), 4022; https://doi.org/10.3390/su18084022 - 17 Apr 2026

Viewed by 283

Abstract

Corporate carbon disclosure has become increasingly important in China’s transition toward sustainability and low-carbon development, yet existing research often focuses on isolated determinants and relies mainly on linear empirical models. Using 48,187 observations of Chinese A-share firms from 2012 to 2024, this study identifies the key predictors of corporate carbon disclosure. It develops an interpretable machine learning model and compares its predictive performance with that of linear regression, LASSO, decision tree, random forest, support vector machine, GBDT, and XGBoost. The results show that ensemble methods outperform linear models in both in-sample and out-of-sample predictions. GBDT delivers the best out-of-sample performance, with an R² of 0.5191, suggesting that nonlinear relationships and interaction effects matter in predicting corporate carbon disclosure. The key factors identified are firm size, media attention, environmental policy intensity, market concentration, and executive financial background. The heterogeneity tests show that regulatory and governance factors are more important for firms in heavily polluting industries, state-owned firms, and firms in central and western China, whereas market factors are more important for firms in eastern China, private firms, and firms in less polluting industries. Overall, the paper provides new evidence on the prediction of corporate carbon disclosure and offers practical implications for regulators and firms seeking to improve their sustainability-related disclosure practices. Full article

1 page, 149 KB

Open Access Retraction

RETRACTED: Yang et al. Unraveling Spatial Nonstationary and Nonlinear Dynamics in Life Satisfaction: Integrating Geospatial Analysis of Community Built Environment and Resident Perception via MGWR, GBDT, and XGBoost. ISPRS Int. J. Geo-Inf. 2025, 14, 131

by Di Yang, Qiujie Lin, Haoran Li, Jinliu Chen, Hong Ni, Pengcheng Li, Ying Hu and Haoqi Wang

ISPRS Int. J. Geo-Inf. 2026, 15(4), 177; https://doi.org/10.3390/ijgi15040177 - 16 Apr 2026

Viewed by 403

Abstract

The journal retracts the article titled “Unraveling Spatial Nonstationary and Nonlinear Dynamics in Life Satisfaction: Integrating Geospatial Analysis of Community Built Environment and Resident Perception via MGWR, GBDT, and XGBoost” [...]

19 pages, 1079 KB

Open Access Article

Intelligent Triggering of Safety Risk Warning in Metro Tunnel Construction: A Two-Stage Framework Integrating Static and Dynamic Data

by Liang Ou, Yinghui Zhang and Yun Chen

Buildings 2026, 16(8), 1550; https://doi.org/10.3390/buildings16081550 - 15 Apr 2026

Viewed by 373

Abstract

With the rapid expansion of metro tunnel construction, safety risks such as collapse, water inrush, and gas explosion have become increasingly critical. Existing warning models often lack fine-grained disaster type identification and dynamic risk assessment capabilities. This paper proposes a two-stage intelligent warning framework based on multi-source data fusion. First, a dual-autoencoder structure (MLP-AE and LSTM-AE) extracts deep features from static geological parameters and dynamic monitoring sequences. Then, a multilayer perceptron (MLP) classifier identifies four typical states: normal, collapse, water/mud inrush, and gas explosion. Subsequently, a regression model predicts a continuous risk score, mapped to three risk levels: Safe, Moderate Risk, and Significant Risk. Experimental results demonstrate that, compared with Decision Tree (DT), Gradient Boosting Decision Tree (GBDT), and Bayesian Network (BN), the proposed framework achieves superior performance in risk level identification, with an accuracy of 91% and an F1-score of 0.87. Notably, it exhibits particularly strong recall for severe (Level III) risks, which is crucial for practical engineering applications. The proposed framework provides a practical and intelligent approach for safety warning in metro tunnel construction. Full article

(This article belongs to the Section Building Structures)

33 pages, 8917 KB

Open Access Article

An Automated Decision-Support Framework for Interior Space Quality Evaluation Using Computer Vision and Multi-Criteria Decision-Making

by Yuanan Wang, Zichen Zhao and Xuesong Guan

Buildings 2026, 16(8), 1508; https://doi.org/10.3390/buildings16081508 - 12 Apr 2026

Viewed by 642

Abstract

With the growing adoption of data-driven workflows and the need to compare numerous interior design alternatives in housing renewal, scalable and consistent assessment of interior space quality is increasingly important; however, current practice still depends on manual scoring and expert judgment. To address this gap, we propose an automation-ready framework that evaluates interior space quality from visual data. We construct the Functionality–Healthiness–Aesthetics Spatial Interior Dataset-10K (FHASID-10K) with 13,962 images for systematic validation. Three sub-models quantify functionality via space utilization and circulation smoothness, healthiness via detection of health-related visual elements, and aesthetics via semantic visual representations with regression-based prediction. Dimension scores are standardized and fused using the analytic hierarchy process (AHP) and the technique for order preference by similarity to ideal solution (TOPSIS) to produce a comprehensive score for ranking and grading. Experiments show stable score distributions and clear differentiation across space categories and style–space combinations. A gradient-boosted decision tree (GBDT) surrogate reconstructs the fused score with high accuracy (test R² = 0.9992; MSE = 1.1 × 10⁻⁵), and human-subject evaluation shows strong agreement with overall-quality ratings (r = 0.760, p < 0.001). Overall, the framework enables scalable benchmarking, scheme comparison, and decision support. Full article

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
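
The interior-quality abstract above fuses standardized dimension scores with AHP weights and TOPSIS. The sketch below illustrates the TOPSIS step only, under stated assumptions: the weights are supplied directly rather than derived by AHP, vector normalization is used, and the function name `topsis` and its arguments are hypothetical. It is not the authors' pipeline.

```python
import math

def topsis(matrix, weights, benefit):
    """Return closeness scores in [0, 1]; higher means closer to the ideal."""
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    # Ideal and anti-ideal points depend on whether a criterion is a benefit or a cost.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

Called with one row per image and one column each for functionality, healthiness, and aesthetics, the function yields a single score per image that can be ranked or binned into grades. It assumes the alternatives are not all identical and that no criterion column is all zeros, since either case would cause a division by zero.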

20 pages, 3510 KB

Open Access Article

Nondestructive Detection of Eggshell Thickness Using Near-Infrared Spectroscopy Based on GBDT Feature Selection and an Improved CatBoost Algorithm

by Ziqing Li, Ying Ji, Changheng Zhao, Dehe Wang and Rongyan Zhou

Foods 2026, 15(8), 1286; https://doi.org/10.3390/foods15081286 - 8 Apr 2026

Viewed by 355

Abstract

Eggshell thickness is a critical indicator for evaluating egg breakage resistance and hatchability, yet traditional measurement methods remain destructive and inefficient. To address this, this study proposes a robust prediction approach by integrating Gradient Boosting Decision Tree (GBDT) feature optimization with an improved CatBoost algorithm. First, a joint strategy of Standard Normal Variate (SNV) and Multiplicative Scatter Correction (MSC) was employed to eliminate spectral scattering noise and enhance organic matrix fingerprint information. Subsequently, GBDT was introduced for nonlinear feature evaluation to adaptively screen the top 50 wavelengths, effectively mitigating the “curse of dimensionality” and multicollinearity in full-spectrum data. A CatBoost regression model was then constructed using an Ordered Boosting mechanism, supported by a dual anti-overfitting strategy that merged 10-fold nested cross-validation with Bootstrap resampling. Experimental results demonstrate that this method significantly outperforms traditional algorithms in both prediction accuracy and generalization. The coefficients of determination (R²) for the calibration and prediction sets reached 0.930 and 0.918, respectively, with a root mean square error of prediction (RMSEP) of 0.008 mm. Residual analysis confirms that prediction errors follow a zero-mean Gaussian distribution, indicating that systematic bias was effectively eliminated. This research provides a reliable theoretical foundation and technical support for the intelligent grading of poultry egg quality. Full article

(This article belongs to the Section Food Analytical Methods)
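
The eggshell-thickness abstract above applies Standard Normal Variate (SNV) preprocessing before feature selection. A minimal sketch of SNV on a single spectrum follows. The function name `snv` and the use of the sample standard deviation (n - 1) are assumptions for this example, and the paper's joint SNV and MSC strategy is not reproduced here.

```python
import statistics

def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - mean) / std for v in spectrum]
```

Because each spectrum is normalized by its own statistics, a constant offset or a multiplicative gain applied to the whole spectrum cancels out. That is why SNV is commonly used to reduce scatter effects in near-infrared data.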

21 pages, 5711 KB

Open Access Article

A Study on High-Precision Dimensional Measurement of Irregularly Shaped Carbonitrided 820CrMnTi Components

by Xiaojiao Gu, Dongyang Zheng, Jinghua Li and He Lu

Materials 2026, 19(8), 1491; https://doi.org/10.3390/ma19081491 - 8 Apr 2026

Viewed by 336

Abstract

This paper addresses three challenges in the industrial online inspection of irregularly shaped 820CrMnTi carburizing and nitriding parts: overexposure induced by high reflectivity, low surface contrast, and interference from minute burrs. An innovative precision detection method integrating adaptive imaging and a dual-drive heterogeneous coupling model (RGFCN) is proposed. Because carburizing and nitriding heat treatment changes the surface photovoltaic characteristics, and because on-site lighting is complex, such parts are prone to local overexposure and to “false out-of-tolerance” measurements caused by the outlier sensitivity of traditional inspection. First, an innovative programmatic adaptive exposure control algorithm based on grayscale histogram feedback is introduced, which dynamically adjusts imaging parameters in real time to effectively suppress high-brightness overexposure under specific working conditions. Second, a novel adaptive main-axis scanning strategy is designed to construct a dynamic follow-up coordinate system, eliminating projection errors introduced by random positioning from a geometric perspective. Additionally, Gaussian gradient energy fields are combined with the Huber M-estimation robust fitting mechanism to suppress thermal noise while automatically reducing the weight of burrs and oil stains, achieving “immunity” to non-functional defects. Meanwhile, a data-driven innovative compensation approach is introduced. Based on sample training, gradient boosting decision trees (GBDTs) are integrated to explore the nonlinear mapping relationship between multidimensional feature spaces and system residuals, achieving implicit calibration of lens distortion and environmental coupling errors. In experiments simulating factory conditions with drastic 24 h day–night lighting fluctuations and strong oil stain interference, statistical analysis of over 1000 mass-produced parts shows that this method exhibits excellent robustness in complex environments. It reduces the false out-of-tolerance rate caused by burrs by over 90%, and the standard deviation of repeated measurements converges to the micrometer level. This effectively addresses the visual inspection challenges of irregular, highly reflective parts on dynamic production lines.

(This article belongs to the Special Issue Latest Developments in Advanced Machining Technologies for Materials)

24 pages, 10406 KB

Open Access Article

Evaluating the Performance of AlphaEarth Foundation Embeddings for Irrigated Cropland Mapping Across Regions and Years

by Lulu Yang, Yuan Gao, Xiangyang Zhao, Nannan Liang, Ru Ma, Shixiang Xi, Xiao Zhang and Rui Wang

Remote Sens. 2026, 18(7), 1065; https://doi.org/10.3390/rs18071065 - 2 Apr 2026

Viewed by 816

Abstract

Accurate irrigated cropland mapping is critical for agricultural water management and food security. Existing image-based irrigation mapping workflows primarily rely on vegetation indices and synthetic aperture radar (SAR) backscatter features, which have limited capacity to characterize the temporal evolution of irrigation processes and crop growth conditions. The AlphaEarth Foundation (AEF) model developed by Google DeepMind provides compact embeddings with temporal semantic information learned via self-supervision, yet their utility for irrigation mapping has not been systematically assessed. In this study, a comprehensive assessment of AEF embeddings for irrigated cropland mapping was performed in terms of feature separability, classification performance, and spatiotemporal transferability. Experiments were conducted in two representative irrigated regions: the Guanzhong Plain in China and Kansas in the USA. Class separability of the 64 embedding dimensions was quantified using the Jeffries–Matusita (JM) distance. Then, the AEF embeddings were compared with the Sentinel feature set (Sentinel-2 bands, normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), normalized difference water index (NDWI), and Sentinel-1 vertical transmit vertical receive (VV) and vertical transmit horizontal receive (VH) polarizations) using K-means clustering and supervised classifiers, including Decision Tree (DT), Random Forest (RF), Gradient Boosting Decision Trees (GBDT), Support Vector Machine (SVM), and Multi-layer Perceptron (MLP). Finally, transfer experiments across 2022 and 2024 in the Guanzhong Plain and Kansas were conducted to examine cross-year and cross-region performance. The results showed that AEF embeddings consistently provided stronger class separability in both study areas, with a maximum JM distance of 1.58 (A29). Using AEF embeddings, RF achieved overall accuracies (OA) of 0.95 in the Guanzhong Plain and 0.93 in Kansas, outperforming models based on Sentinel-1/2 bands and indices. Notably, unsupervised K-means clustering on AEF embeddings yielded OA > 0.85, indicating high intrinsic separability between irrigated and rainfed croplands. Transfer experiments further demonstrated stable temporal transfer (cross-year OA > 0.87), whereas cross-region transfer was constrained by differences in irrigation regimes, crop phenology, and management practices, resulting in limited spatial generalization (OA ≈ 0.3). Overall, this study demonstrates the potential of high-information-density representations from geospatial foundation models for irrigated cropland mapping and provides methodological and technical insights to support transfer learning and operational mapping over large areas.

(This article belongs to the Special Issue Near Real-Time (NRT) Agriculture Monitoring)
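
The AlphaEarth abstract above scores each embedding dimension with the Jeffries–Matusita (JM) distance. The sketch below shows one common way to compute it for a single dimension, assuming each class is modeled as a univariate Gaussian. Whether the authors used this exact formulation is not stated in the abstract, and the function name `jm_distance` is hypothetical. The distance is derived from the Bhattacharyya distance B as JM = 2(1 - e^{-B}), so it ranges from 0 (identical classes) to 2 (fully separable).

```python
import math

def jm_distance(mean_a, var_a, mean_b, var_b):
    """Jeffries-Matusita distance between two univariate Gaussian classes."""
    # Bhattacharyya distance for one-dimensional normal distributions.
    b = ((mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
         + 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b))))
    return 2.0 * (1.0 - math.exp(-b))
```

On this scale, the reported maximum of 1.58 lies well above the midpoint but below the saturation value of 2, which indicates good but not complete separation of irrigated and rainfed samples on that single dimension.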

30 pages, 4624 KB

Open Access Article

Distribution Characteristics and Hazard Assessment of Ground Collapse in the Mining Activity Areas of the Turpan–Hami Basin

by Tao Wang, Chao Jin, Ning Liang, Yongchao Li, Shuaihua Song, Jingjing Ying, Yiqing Zhao and Bowen Zheng

Appl. Sci. 2026, 16(7), 3354; https://doi.org/10.3390/app16073354 - 30 Mar 2026

Viewed by 497

Abstract

The Turpan–Hami Basin, a critical energy hub in northwestern China, is plagued by frequent ground collapses induced by extensive mining over karst geology, threatening ecology and safety. Current hazard assessment methods, mainly single linear or traditional machine learning models, fail to capture the complex nonlinear interactions inherent to this coupled geo-mining environment. This study addresses this gap by establishing a multi-dimensional “Geology-Mining-Hydrology-Environment” index system comprising 14 critical factors—including lithology, goaf distribution, mining intensity, and their interaction terms. A coupled gradient boosting decision tree and logistic regression (GBDT-LR) model, optimized for the multi-factor coupling characteristics of ground collapse in arid mining basins, was applied for the hazard assessment. The results reveal a distinct spatial pattern of “core agglomeration with multi-level gradient differentiation.” Extremely high-hazard areas, covering 9.21% of the area, are concentrated in the core mining areas northwest of Turpan and southwest of Hami, while high-hazard areas (4.63%) form surrounding belts. The GBDT-LR model (AUC = 0.871) demonstrated significantly superior performance over a single logistic regression model (AUC = 0.813), proving its enhanced capability to identify high-hazard areas by modeling complex factor interactions. This work provides an essential scientific foundation for implementing zonal hazard management and prioritizing disaster prevention projects in key areas of the basin. Full article
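
The abstract does not describe how the GBDT and logistic regression stages are coupled. One widely used coupling, shown here purely as an assumption, lets the boosted trees act as a feature transformer: each sample is mapped to the leaf it reaches in every tree, the leaf indices are one-hot encoded, and a logistic regression is fitted on that sparse encoding. The sketch below uses scikit-learn with synthetic data standing in for the 14 conditioning factors; the variable names and hyperparameters are illustrative and are not the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for 14 conditioning factors (not the study's data).
X, y = make_classification(
    n_samples=600, n_features=14, n_informative=6, random_state=0
)

# Fit the trees and the linear model on disjoint subsets to limit overfitting.
X_gbdt, X_rest, y_gbdt, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0
)
X_lr, X_test, y_lr, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

gbdt = GradientBoostingClassifier(n_estimators=30, max_depth=3, random_state=0)
gbdt.fit(X_gbdt, y_gbdt)


def leaf_indices(model, samples):
    # apply() returns shape (n_samples, n_estimators, 1) for binary problems.
    return model.apply(samples)[:, :, 0]


# One-hot encode leaf membership; leaves unseen at fit time are ignored.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(leaf_indices(gbdt, X_gbdt))

lr = LogisticRegression(max_iter=1000)
lr.fit(encoder.transform(leaf_indices(gbdt, X_lr)), y_lr)

# Predicted probability of the positive (collapse) class per test sample.
hazard_prob = lr.predict_proba(encoder.transform(leaf_indices(gbdt, X_test)))[:, 1]
print(hazard_prob[:5])
```

In this arrangement the trees capture nonlinear factor interactions, while the logistic regression supplies a calibrated-looking probability that can be binned into hazard classes.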

(This article belongs to the Special Issue Remote Sensing Technology in Landslide and Land Subsidence—2nd Edition)

25 pages, 4104 KB

Open AccessArticle

Prediction of Postoperative Stroke in Elderly Surgical ICU Patients Using Random Forest Model: Development on MIMIC-IV with Cross-Institutional and Temporal External Validation

by Houji Jin, Mohammadsaeed Haghi, Nausin Kudrot, Kamiar Alaei and Maryam Pishgar

BioMedInformatics 2026, 6(2), 16; https://doi.org/10.3390/biomedinformatics6020016 - 27 Mar 2026

Viewed by 877

Abstract

Postoperative stroke is a serious and fatal condition that often affects elderly surgical patients. This rare but severe complication arises from complex interactions between comorbidities, physiologic instability and demographic disturbances that traditional risk tools often fail to capture. This study aims to develop and validate a machine learning model with an improved ability to predict the risk of postoperative stroke in elderly patients utilising the comprehensive clinical and demographic ICU data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. External validation was performed on MIMIC-III and the eICU Collaborative Research Database, with eICU being the primary validation set. We identified postoperative surgical intensive care unit (SICU) patients aged 55 years or older from all databases. A strict temporal window of the first 24 h of ICU admission was applied across all three datasets while extracting features such as laboratory measurements and vital sign summaries, in order to ensure that all predictor values were derived from a fixed observation period at the beginning of the ICU stay. After preprocessing, Multivariate Imputation by Chained Equations (MICE), and initial screening of 88 candidate variables, 20 clinically meaningful predictors were selected through a multistage feature selection pipeline incorporating RFECV and permutation importance. SHAP analysis and LIME analysis were used for interpretability. We evaluated ten machine learning techniques: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), Support Vector Machine (SVM–RBF Kernel), Gradient Boosting (GBDT), Neural Network, XGBoost, CatBoost, and Naive Bayes. Among them, Random Forest demonstrated strong predictive performance, achieving an AUROC of 0.8072 (95% CI [0.7890, 0.8253]) on the internal validation set. The model also achieved AUROCs of 0.7557 (95% CI [0.7267, 0.7794]) and 0.9144 (95% CI [0.8893, 0.9378]) on the external validation sets eICU and MIMIC-III, respectively. Mean systolic blood pressure, Elixhauser score, minimum calcium, and minimum INR (PT) were consistently identified as the most influential predictors through both SHAP analysis and LIME analysis, thus strengthening model interpretability. Our findings suggest that a Random Forest-based predictive model can provide an accurate and generalisable prediction of postoperative stroke in elderly ICU patients using routinely collected physiologic and laboratory data. This also supports early risk stratification and targeted postoperative monitoring. Full article
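
The abstract reports each AUROC with a 95% confidence interval but does not say how the intervals were obtained. A percentile bootstrap is one common choice and is assumed in the sketch below, which uses only the Python standard library. The function names `auroc` and `bootstrap_ci`, the resample count, and the toy labels and scores are all illustrative and do not come from the study.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        # Skip resamples that contain only one class (AUROC undefined).
        if 0 < sum(y) < n:
            stats.append(auroc(y, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = stroke, 0 = no stroke, with hypothetical model scores.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9, 0.3, 0.15]
print(auroc(toy_labels, toy_scores), bootstrap_ci(toy_labels, toy_scores))
```

The pairwise AUROC here is quadratic in sample size, which is fine for illustration; on cohorts of ICU scale a rank-based implementation would be preferable.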

(This article belongs to the Section Applied Biomedical Data Science)

Search Results (448)
