Search Results (5,126)

Search Parameters:
Keywords = XGBoost modeling

27 pages, 11400 KB  
Article
Characterizing Short-Duration Summer Rainstorms in Nanjing, China, Using Multi-Source Remote Sensing and Explainable AI
by Yiding Wang, Ningxin Yong, Siyu Zhu and Yang Hong
Remote Sens. 2026, 18(13), 2212; https://doi.org/10.3390/rs18132212 (registering DOI) - 5 Jul 2026
Abstract
With global warming and rapid urbanization, short-duration summer rainstorms are becoming more intense and localized, posing growing challenges to urban flood resilience. However, their spatiotemporal characteristics, vertical structures, and environmental drivers remain poorly understood. Here, we combine multi-source remote sensing datasets and China’s new-generation satellite-borne dual-frequency precipitation radar observations to investigate summer rainstorms in Nanjing, China, during 2017–2024. Results reveal pronounced spatiotemporal heterogeneity, with higher rainfall intensities concentrated over urban and adjacent areas. During the study period, rainstorm intensity and duration increased by 7.44% and 38.63%, respectively, while the affected area decreased by 8.18%, indicating a transition toward more localized yet more intense rainfall events. Environmental analyses suggest that large-scale thermodynamic conditions and regional topographic forcing provide a favorable background for convection development, while local urban thermal effects may further modulate rainfall enhancement. Three-dimensional radar detection of an illustrative rainstorm event indicates an inverted-cone vertical structure, suggesting a mixed convective-stratiform precipitation structure involving both warm-rain and ice-phase processes. An Explainable Bayesian-Optimized XGBoost (EBOX) model further identifies near-surface air temperature and specific humidity as the primary environmental factors associated with rainstorm occurrence and development. Overall, this study highlights the value of integrating satellite remote sensing with explainable artificial intelligence to improve understanding of urban extreme rainfall and provide new insights into how climate change, topography, and urbanization jointly shape precipitation extremes in rapidly urbanizing monsoon regions.
27 pages, 2493 KB  
Article
Assessing the Potential of EMIT Hyperspectral Data Combined with DEM-Derived Terrain Variables for Predicting Soil As, Cu and Zn Concentrations in a Mountainous Region of Southwest China
by Guangping Qie, Minzi Wang, Ziping Pan, Zongdi Sun, Wenjin Xie, Zhiyi Liu and Guangxing Wang
Remote Sens. 2026, 18(13), 2211; https://doi.org/10.3390/rs18132211 (registering DOI) - 5 Jul 2026
Abstract
Spaceborne imaging spectroscopy has created new opportunities for monitoring soil properties at regional scales. Its use for predicting soil heavy metal concentrations in mountainous environments, however, remains insufficiently tested, especially when EMIT hyperspectral data are used. In this study, EMIT Level-2A surface reflectance data were integrated with DEM-derived terrain variables to estimate soil arsenic (As), copper (Cu), and zinc (Zn) concentrations in Renhuai, Guizhou Province, Southwest China. Only soil samples falling within valid EMIT coverage were used for element-specific modeling, resulting in 139 samples for As, 136 for Cu, and 130 for Zn. To reduce redundancy among predictors, EMIT spectral variables and terrain factors were screened before model construction. Random forest and XGBoost models were then tested using repeated spatial cross-validation. The best-performing model for As combined EMIT predictors with elevation and achieved a validation R² of 0.460. Model performance was considerably weaker for Cu, with a validation R² of 0.188. For Zn, the model failed to outperform the mean-based benchmark, producing a negative validation R² of −0.028. The spatial prediction maps and residual patterns suggested that the EMIT-based prediction showed moderate potential for As, limited predictive value for Cu, and exploratory rather than reliable mapping capability for Zn under the current sample and predictor conditions.
(This article belongs to the Special Issue Hyperspectral Data Analysis of Vegetation and Soil Monitoring)
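A negative validation R², such as the −0.028 reported for Zn, follows directly from the definition R² = 1 − SS_res / SS_tot: whenever a model's squared error exceeds that of always predicting the mean, the ratio rises above 1. The sketch below illustrates this with made-up numbers; the function name `r_squared` and the toy values are illustrative assumptions, not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (hypothetical, not taken from the paper).
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
mean_baseline = [14.0] * 5                    # always predict the sample mean
poor_model = [16.0, 11.0, 17.0, 12.0, 15.0]   # errors larger than the baseline's

r2_baseline = r_squared(observed, mean_baseline)  # 0 by definition
r2_poor = r_squared(observed, poor_model)         # negative: worse than the mean

print(r2_baseline, r2_poor)
```

In this toy case the poor model's residual sum of squares (71) exceeds the total sum of squares (40), so R² falls below zero.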
21 pages, 1281 KB  
Article
Credit Card Fraud Detection Under Extreme Class Imbalance Using Leakage-Safe Feature Selection and GA-Based Hyperparameter Optimization
by Chen Ma, Lihong Zhang, Zhi Xing and Junjing Su
Appl. Sci. 2026, 16(13), 6734; https://doi.org/10.3390/app16136734 (registering DOI) - 5 Jul 2026
Abstract
Credit card fraud detection is a typical rare-event classification problem because fraudulent transactions usually account for only a very small proportion of all transactions. Conventional evaluation on balanced or resampled test data may lead to overly optimistic performance estimates. To address this issue, this study proposes a leakage-safe credit card fraud detection framework integrating Random Forest Gini impurity-based feature selection, resampling strategy evaluation, and Genetic Algorithm (GA)-based hyperparameter optimization. The framework was evaluated on the public European credit card fraud dataset containing 284,807 transactions, of which only 492 were fraudulent. The original dataset was first divided into a stratified training set and an untouched original-distribution test set. Feature selection, standardization, resampling, GA optimization, and threshold tuning were performed only on the training data or training folds. The final test set contained 85,443 transactions, including 148 fraudulent transactions, and was used only once for final evaluation. Experimental results show that GA-XGBoost achieved the best overall balance among the optimized models, with a PR-AUC of 0.798, ROC-AUC of 0.967, MCC of 0.814, balanced accuracy of 0.865, fraud-class precision of 0.908, fraud-class recall of 0.730, and fraud-class F1-score of 0.809. Compared with baseline XGBoost, GA-XGBoost improved PR-AUC from 0.741 to 0.798, MCC from 0.766 to 0.814, and fraud-class F1-score from 0.764 to 0.809, while reducing false positives from 22 to 11 and false negatives from 43 to 40. The ablation results further indicate that resampling strategies are not universally beneficial and should be evaluated under the original test distribution. These findings suggest that leakage-safe evaluation and fraud-class-oriented metrics provide a more reliable basis for practical credit card fraud detection.
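The GA-XGBoost figures quoted in this abstract can be cross-checked from the counts the abstract itself provides (85,443 test transactions, 148 of them fraudulent, 11 false positives, 40 false negatives). The sketch below derives the confusion matrix from those counts and recomputes the threshold-based metrics using their standard definitions. The function and variable names are illustrative, and this is a consistency check rather than the authors' evaluation code.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true-positive rate on the fraud class
    specificity = tn / (tn + fp)     # true-negative rate on the legitimate class
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": mcc,
    }


# Counts stated in the abstract for GA-XGBoost on the untouched test set.
n_total, n_fraud = 85_443, 148
fp, fn = 11, 40
tp = n_fraud - fn                    # 108 fraudulent transactions detected
tn = (n_total - n_fraud) - fp        # 85,284 legitimate transactions passed

metrics = {k: round(v, 3) for k, v in binary_metrics(tp, fp, fn, tn).items()}
print(metrics)
```

Rounded to three decimals, these values match the reported fraud-class precision (0.908), recall (0.730), F1-score (0.809), balanced accuracy (0.865), and MCC (0.814). PR-AUC and ROC-AUC cannot be recovered this way because they depend on the full score ranking rather than a single threshold.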
22 pages, 5124 KB  
Article
Analysis of Spatial–Temporal Pattern and Driving Force of Heat Island in Urban Agglomeration Around Hangzhou Bay
by Hongyu Li, Liuzhu Wang, Chao Fan, Sheng Zhao and Feng Gui
Land 2026, 15(7), 1205; https://doi.org/10.3390/land15071205 (registering DOI) - 5 Jul 2026
Abstract
In the context of global warming, thermal environmental problems in coastal urban agglomerations have become increasingly prominent. This study focuses on the urban agglomeration around Hangzhou Bay, constructs annual heat island intensity classification maps based on MODIS summer land surface temperature (LST) data from 2000 to 2020, analyzes the spatiotemporal patterns of heat islands, and investigates their driving mechanisms using the Extreme Gradient Boosting and Shapley Additive exPlanations (XGBoost-SHAP) model. The results show that: (1) the high-frequency area of strong heat islands expanded by 62.10% during the study period, extending from early built-up areas to newly developed coastal zones, with the spatial pattern transitioning from point-like distribution to areal agglomeration; (2) significant differences exist between the north and south coasts, where strong heat island center migration on the north coast is consistent with impervious surface expansion, whereas the south coast is significantly influenced by coastal wetland siltation; (3) impermeable surfaces and wind speed are key factors affecting LST, with impermeable surfaces acting as the primary driver of temperature increase, while wind speed plays a significant role in moderating temperatures. This study provides a scientific basis for thermal environment regulation in coastal urban agglomerations.
18 pages, 2971 KB  
Article
AI-Driven Prediction of Surface Roughness and Cutting Force in Milling Aluminum Alloy Under Data-Scarce Conditions
by Mohammad Hossein Ebrahimi and Seyed Ali Niknam
Machines 2026, 14(7), 756; https://doi.org/10.3390/machines14070756 (registering DOI) - 5 Jul 2026
Abstract
Accurate prediction of surface roughness and cutting forces in milling aluminum alloys remains challenging under data-scarce conditions, where limited experimental data restricts the application of conventional machine learning models. This study addresses this gap by developing a systematic machine learning framework using 108 milling experiments (repeated to 216 tests) on aluminum alloys AA2024-T351 and AA6061-T6. Five primary machining inputs (material type, spindle speed, feed rate, depth of cut, and tool coating) were used. Through feature engineering, 35 interaction features were generated to capture non-linear relationships. A two-step preprocessing strategy was applied: Winsorization at the 5th and 95th percentiles to handle outliers, followed by hybrid scaling combining RobustScaler and MinMaxScaler. Eight machine learning algorithms, including XGBoost, NGBoost, LightGBM, CatBoost, Random Forest, MLP, SVR, and Least Squares Boosting, were developed and hyperparameter-optimized using the Optuna framework with Tree-structured Parzen Estimator. Models were evaluated using R², MAE, and RMSE on a 70/15/15 train–validation–test split. Results demonstrate that XGBoost achieved the highest predictive accuracy for surface roughness (Ra) (R² = 0.99829) and for resultant cutting force (FN) (R² = 0.997). Feed rate was identified as the dominant machining parameter, accounting for 87.7% of the total importance in predicting surface roughness. SHAP analysis confirmed that engineered interaction features, particularly Feed_Coating and Material_Feed, carry strong physical relevance. Additionally, NGBoost enabled probabilistic regression, providing uncertainty estimates. The proposed framework proves highly effective for multi-output prediction in machining under limited data, offering a robust, interpretable, and industry-ready solution for quality control in aluminum alloy milling operations.
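The two-step preprocessing described in this abstract can be sketched in plain Python. The abstract does not say how the two scalers are combined, so the sketch assumes they are applied in sequence (median/IQR scaling followed by min-max scaling); that ordering, the helper names, and the toy data are all assumptions made for illustration.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def winsorize(values, lower=5, upper=95):
    """Clip values to the [lower, upper] percentile range."""
    lo, hi = percentile(values, lower), percentile(values, upper)
    return [min(max(v, lo), hi) for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    median = percentile(values, 50)
    iqr = percentile(values, 75) - percentile(values, 25)
    return [(v - median) / iqr if iqr else 0.0 for v in values]


def minmax_scale(values):
    """Rescale linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


# Hypothetical feature column with one extreme outlier.
raw = [float(v) for v in range(1, 21)] + [1000.0]
clipped = winsorize(raw)                          # outlier pulled in to the 95th percentile
scaled = minmax_scale(robust_scale(clipped))      # assumed sequential "hybrid" scaling

print(max(clipped), min(scaled), max(scaled))
```

In a leakage-safe setup, the percentile bounds and scaling statistics would be computed on the training split only and then reused for the validation and test splits.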
15 pages, 1334 KB  
Article
Predictors of Cognitive Skills Underlying Global Competences of Filipino Students in PISA 2018: A Machine-Learning Approach
by Allan B. I. Bernardo, Macario O. Cordel and Justin Gerard E. Ricardo
Educ. Sci. 2026, 16(7), 1076; https://doi.org/10.3390/educsci16071076 (registering DOI) - 5 Jul 2026
Abstract
Global competences are important capacities that students should develop to effectively function in the 21st century. There is a need to understand the factors associated with the components of global competence in specific educational systems. In this study, we use data from 15-year-old Filipino students in the 2018 Programme for International Student Assessment (PISA) to explore predictors of three cognitive skills underlying global competence: perspective taking, cognitive flexibility/adaptability, and intercultural communication ability. Machine learning approaches were used to model the three cognitive skills, with XGBoost outperforming the other techniques across the target outcomes. The XGBoost models showed moderate predictive performance, with training R² values ranging from 0.4953 to 0.5394 and test R² values ranging from 0.3896 to 0.4469. The Shapley Additive Explanations approach was used to identify variables that had the most impact on the model of each cognitive skill index. The results suggest that perspective taking could be a skill that underlies all three cognitive skills, but that intercultural communication ability seems to be the cognitive skill that is more strongly associated with intercultural and global knowledge and experiences than the other two. The results are discussed in terms of how the cognitive components of global competence might be a disjoint set of factors that are beginning to show connections in the cognitive repertoire of Filipino learners.
28 pages, 8780 KB  
Article
Interpretable Machine Learning for Multi-Dimensional Visual Quality Grading Under Small-Data Conditions: A Case Study on Artisanal Flatbread
by Katiuscia Mannaro, Matteo Baire and Alessandro Fanti
Mach. Learn. Knowl. Extr. 2026, 8(7), 195; https://doi.org/10.3390/make8070195 (registering DOI) - 5 Jul 2026
Abstract
Interpretable machine learning for ordinal quality grading faces a fundamental tension between model transparency and predictive performance, particularly under small-data conditions where end-to-end deep learning is unreliable and domain knowledge must compensate for limited training samples. We present a dual-target feature engineering framework for interpretable ordinal grading validated on pane Carasau, a traditional flatbread whose extreme surface variability makes it a challenging small-data benchmark for machine learning under realistic acquisition constraints. The pipeline extracts 116 handcrafted visual descriptors organised into four families (colour, texture, spatial, and hotspot) and grades the quality along two independent axes: global toasting intensity and spatial uniformity, complemented by a continuous Toasting Index for process monitoring, on a dataset of 1512 images spanning four acquisition campaigns and three product types. On the primary within-batch evaluation set (Campaign 01, N = 1090), XGBoost achieves a macro F1 of 0.906 for toast classification and R² = 0.886 for continuous regression, substantially outperforming two fine-tuned CNN baselines on the same evaluation set (MobileNetV2: F1 = 0.523; EfficientNet-B0: F1 = 0.518). Feature importance analysis reveals that colour descriptors dominate toasting prediction (87.5%), whilst spatial and texture features are essential for uniformity assessment (47.4% combined), providing physically grounded explanations directly traceable to the underlying thermal process. Cross-batch generalisation on held-out campaigns is moderate for the same product (XGB F1 = 0.718, κ = 0.703); cross-product transfer to geometrically distinct variants requires product-specific adaptation. The framework requires no GPU, runs on standard CPU hardware at 4 s per image, and provides complete decision transparency, supporting deployment without specialised hardware.
(This article belongs to the Section Learning)
24 pages, 5749 KB  
Article
Replacing Yield Detrending with Direct Spatiotemporal Inputs Improves LSTM-Based Rice Yield Estimation
by Nuo Chen, Fumin Wang, Xiaobin Zhang, Zhen Zhao, Wenkai Wan, Junwei Liu, Zhou Shi and Songchao Chen
Remote Sens. 2026, 18(13), 2200; https://doi.org/10.3390/rs18132200 (registering DOI) - 5 Jul 2026
Abstract
Accurate rice yield estimation is essential for food security. Two key factors affecting estimation accuracy are the long-term upward trend in yield over time and regional heterogeneity across space. Current studies predominantly employ statistical detrending methods (e.g., moving averages, linear regression) to isolate temporal trends. However, such methods rely on prior assumptions about the time–yield relationship and may introduce systematic bias when these assumptions break down. Meanwhile, the individual contributions of temporal and spatial information, and their interactive effects, have not been systematically evaluated within a unified framework. We selected 112 rice-growing counties across six U.S. states (2000–2021), using a vegetation index (Normalized Difference Vegetation Index), meteorological indicators (growing degree days, killing degree days, and cumulative precipitation), and spatiotemporal variables (year, longitude, and latitude). We designed six input configurations to compare conventional detrending against direct temporal variable inclusion, testing across four model architectures (Long Short-Term Memory, Random Forest, XGBoost, and Transformer). Results showed that: (1) directly inputting year significantly outperformed detrending across all models, with the combined spatiotemporal configuration achieving the best performance (LSTM R² = 0.61 vs. 0.54 for detrending); (2) year was the most important predictor in SHAP analysis, with spatiotemporal variables ranking higher than most meteorological and remote sensing variables; (3) spatial information consistently improved accuracy and mitigated systematic bias for extreme yield regions; (4) the combined configuration performed best across different states, years (including extreme climate events), and yield levels, achieving near-end-of-season accuracy at the grain-filling stage (1.5–2 months before harvest). This study demonstrates that integrating raw spatiotemporal data directly into deep learning models is more effective than statistical detrending, offering a simpler and more robust approach for large-scale crop yield estimation.
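For context, the conventional linear detrending that the study compares against can be sketched as follows: fit yield against year by ordinary least squares and keep the residuals as the modeling target. The toy yield series and the function name `linear_detrend` are illustrative assumptions; the abstract does not give the paper's own detrending settings.

```python
def linear_detrend(years, yields):
    """Fit yields ~ years by ordinary least squares; return slope, intercept, residuals."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(years, yields)]
    return slope, intercept, residuals


# Hypothetical county yield series (arbitrary units), not data from the study.
years = [2000, 2001, 2002, 2003, 2004]
yields = [7.0, 7.3, 7.1, 7.6, 7.5]

slope, intercept, residuals = linear_detrend(years, yields)
print(slope, residuals)
```

The direct-input alternative described in the abstract would instead leave `yields` untouched and pass `year` to the model as an ordinary feature, so no linear form is imposed on the trend in advance.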
67 pages, 3288 KB  
Article
An Optimization-Driven Fuzzy Transformer–Deep Belief Network for PM2.5 Air Pollution Prediction: A Spatio-Temporal Framework Based on Aerosol Optical Depth
by Mohammad Mehdi Sharifi Nevisi, Pardis Sadatian Moghaddam, Mehrdad Kaveh, Diego Martín, Nuria Serrano and José Vicente Álvarez-Bravo
Mathematics 2026, 14(13), 2402; https://doi.org/10.3390/math14132402 (registering DOI) - 5 Jul 2026
Abstract
Forecasting fine particulate matter with a diameter of 2.5 μm (PM2.5) is critically important due to its adverse effects on human health and environmental sustainability. Although ground-based monitoring stations provide accurate measurements, their limited spatial coverage restricts large-scale PM2.5 assessment, especially in complex urban regions. Consequently, aerosol optical depth (AOD) derived from satellite imagery, combined with advanced deep learning (DL) techniques, has emerged as an effective alternative by offering wide spatial coverage and rich spatio-temporal information. This paper proposes an optimization-driven fuzzy transformer–deep belief network (ODFT-DBN) for accurate PM2.5 air pollution prediction. The proposed framework integrates a fuzzy inference module to model uncertainty and nonlinear environmental relationships, a transformer encoder to capture long-range spatio-temporal dependencies, and a DBN to extract hierarchical features and improve prediction robustness. In addition, a novel multi-objective gray wolf optimizer (NMOGWO) is employed to jointly optimize the model hyper-parameters and fuzzy membership functions. The proposed approach is implemented for the city of Tehran, Iran, using meteorological variables, topographical features, ground-based PM2.5 measurements, and satellite-derived AOD data. The ODFT-DBN model is compared with several benchmark methods, including bidirectional encoder representations from transformers (BERT), transformer, long short-term memory (LSTM), gated recurrent unit (GRU), convolutional neural network (CNN), DBN, and extreme gradient boosting (XGBoost). Experimental results demonstrate that the proposed framework achieves superior predictive performance, attaining an R² value of 0.94 and root mean square error (RMSE) of 0.8 μg/m³. Scatter plot analyses indicate a strong agreement between predicted and observed PM2.5 values, while the proposed model exhibits low variance, stable convergence behavior, and acceptable computational time. Overall, the results confirm the effectiveness, robustness, and practical applicability of the proposed ODFT-DBN framework for spatio-temporal PM2.5 forecasting.
(This article belongs to the Special Issue Applications of Optimization Algorithms and Evolutionary Computation)
41 pages, 15308 KB  
Article
Explainable Ensemble Learning for Rapid Seismic Damage Assessment: A Comprehensive Benchmark Using Real Data from the 2023 Kahramanmaraş Earthquakes
by Celal Bıçakcı, Kamil Karataş, Selim Serhan Yıldız, Süleyman Sefa Bilgilioğlu and Himmet Karaman
Buildings 2026, 16(13), 2660; https://doi.org/10.3390/buildings16132660 (registering DOI) - 4 Jul 2026
Abstract
The 6 February 2023 Kahramanmaraş earthquakes caused widespread structural damage and highlighted the need for rapid building-level decision support in post-earthquake assessment. This study presents an explainable ensemble learning framework for seismic damage prediction using 16,611 building-level field observations from Kırıkhan, Hatay, Türkiye. The original damage records were reorganized into three operational classes: No-Damage, Slight–Moderate, and Heavy–Collapse. Eight tree-based ensemble models, LightGBM, CatBoost, XGBoost, Random Forest, Extra Trees, Gradient Boosting Machine, AdaBoost, and HistGradientBoosting, were evaluated under a consistent protocol using class-weighting strategies where supported, with Balanced Accuracy as the primary metric. LightGBM and Random Forest achieved the joint-highest Balanced Accuracy value (0.650). Random Forest produced the strongest agreement-based metrics, while LightGBM remained closely competitive and was selected as the representative model for explainability because of its balanced class-wise behavior. CatBoost achieved the highest Heavy–Collapse recall (0.729), XGBoost achieved the highest Macro-AUC (0.821), and GBM produced the highest Overall Accuracy (0.658), showing that model ranking varied by evaluation criterion. SHapley Additive exPlanations identified building age, lithology, number of floors, structural system, plinth area, and proximity to faults and surface ruptures as key contributors. The remaining classification uncertainty, particularly among adjacent damage states, indicates that the framework is best interpreted as a complementary decision-support tool for preliminary screening and prioritization before final safety decisions or official damage assessment.
(This article belongs to the Section Building Structures)
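Balanced Accuracy, the primary metric in this study, is the unweighted mean of per-class recall, which is why it can rank models differently from Overall Accuracy when classes are imbalanced. The sketch below shows the calculation on made-up labels that reuse the three class names from the abstract; the label counts and the function name are hypothetical.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in support) / len(support)


# Hypothetical labels: a model that never predicts the middle class.
y_true = ["No-Damage"] * 6 + ["Slight–Moderate"] * 3 + ["Heavy–Collapse"] * 1
y_pred = ["No-Damage"] * 9 + ["Heavy–Collapse"] * 1

overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)

print(overall_accuracy, bal_acc)
```

Here the per-class recalls are 1, 0, and 1, so Balanced Accuracy is 2/3 even though 7 of the 10 predictions are correct; the missed middle class is penalized as heavily as either of the other two.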
27 pages, 14549 KB  
Article
High-Resolution Inversion, Driving Mechanisms, and Source Apportionment of Near-Surface Ozone in Arid Urban Clusters: A Case Study of the Tianshan North Slope Urban Agglomeration
by Guangrui Pan, Yunyun Xi, Tuodi Wang, Liqiang Shen, Yutian Luo, Zhijun Li, Lihong Wang, Liping Xu, Linlin Cui, Shuliang Zhang, Xiangjun Lu and Yongpeng Tong
Remote Sens. 2026, 18(13), 2191; https://doi.org/10.3390/rs18132191 (registering DOI) - 4 Jul 2026
Abstract
Ozone (O₃), as a key secondary pollutant, exhibits pronounced spatiotemporal heterogeneity, posing significant challenges to coordinated regional air pollution control. However, systematic understanding of high-resolution O₃ spatial inversion and its driving mechanisms in arid urban agglomerations remains limited. In this study, the Tianshan North Slope Urban Agglomeration (TNSUA) was selected as the study area, and a multi-model comparative framework was established to comprehensively evaluate the O₃ inversion performance of 16 machine learning and deep learning models, including Extreme Gradient Boosting (XGBoost), Random Forest (RF), Extremely Randomized Trees (ET), and Gradient Boosting Decision Tree (GBDT). Based on the optimal model performance, high-precision daily O₃ spatial reconstruction for the year 2023 was achieved across the study region. The contributions of individual driving factors and their nonlinear response relationships were quantitatively interpreted using Shapley Additive Explanations (SHAP). Furthermore, a backward trajectory model combined with the Weighted Potential Source Contribution Function (WPSCF) and Weighted Concentration Weighted Trajectory (WCWT) methods was employed to identify potential source regions and transport pathways of O₃. The results indicate that: (1) The XGBoost model exhibited the best performance (R² = 0.93, RPD > 3). The reconstructed results reveal that high O₃ concentrations in 2023 were primarily distributed in southern Urumqi, southern Changji, and southern Tacheng, with southern Urumqi identified as the most prominent hotspot. (2) The spatial variability of O₃ was predominantly driven by downward shortwave radiation (DSR) and air temperature (TEM), both of which showed significant nonlinear responses and threshold effects on O₃ formation. (3) Source apportionment analysis indicates that westerly transport serves as a major exogenous contribution pathway, with potential source regions mainly located in the surrounding areas of the northern Tianshan slope as well as Central Asia, particularly eastern Kazakhstan and northern Kyrgyzstan. This study systematically elucidates the formation mechanisms of O₃ pollution in arid urban agglomerations from three aspects (high-precision inversion, driving mechanism analysis, and cross-regional transport identification), thereby providing a scientific basis for precise air pollution control strategies.
(This article belongs to the Section Atmospheric Remote Sensing)
27 pages, 3065 KB  
Article
A Machine Learning-Based Inversion Framework for Particle Size Distribution Reconstruction Using Multi-Angle Light Scattering
by Hariyanto, Tomy Abuzairi, Ucuk Darusalam and Purnomo Sidi Priambodo
Math. Comput. Appl. 2026, 31(4), 122; https://doi.org/10.3390/mca31040122 (registering DOI) - 4 Jul 2026
Abstract
Particle size distribution (PSD) is a key determinant of aerosol optical properties and plays an important role in optical sensing and environmental monitoring. However, estimating PSD from light scattering measurements remains a challenging inverse problem due to its ill-posed nature and sensitivity to noise. To address this problem, this study proposes a physics-informed, data-driven inversion framework for PSD reconstruction using multi-angle light scattering signals generated from Mie scattering simulations. Synthetic datasets were generated using Johnson–SB, lognormal, and bimodal lognormal PSDs under various optical conditions, and the resulting scattering intensities were used to train machine learning models, including Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Support Vector Regression (SVR). The proposed framework was evaluated using both point-wise error metrics and distribution-based metrics, including Kullback–Leibler divergence and Wasserstein distance. The results showed that RF and XGBoost consistently achieved the highest reconstruction accuracy, with R² values exceeding 0.98 across different PSDs, and significantly outperformed conventional linear baseline methods, including Ridge regression (representing Tikhonov regularization) and Non-negative Least Squares (NNLS). Additional experiments using lognormal and bimodal lognormal PSDs further confirmed the distributional generalization capability of the proposed model. The reconstructed PSDs also showed strong agreement with the reference distributions and remained robust under Gaussian, lognormal, and combined noise perturbations of up to 20%. Therefore, integrating physics-based scattering simulations with machine learning provided an accurate and robust solution for the inverse Mie scattering problem in optical particle characterization.
(This article belongs to the Section Engineering)
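The distribution-based metrics named in the abstract above (Kullback–Leibler divergence and Wasserstein distance) can be illustrated with a minimal sketch. It assumes both PSDs are discretized as probability masses on one shared, sorted size grid; the function names, the epsilon floor, and the toy values are illustrative assumptions and are not taken from the article.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL(p || q). eps floors q to avoid log(0) (an assumption of this sketch)."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def wasserstein_1d(grid, p, q):
    """1-D Wasserstein-1 distance between masses p and q on a shared sorted grid."""
    dist, cum_p, cum_q = 0.0, 0.0, 0.0
    for i in range(len(grid) - 1):
        cum_p += p[i]
        cum_q += q[i]
        # Area between the two cumulative distributions over this grid interval.
        dist += abs(cum_p - cum_q) * (grid[i + 1] - grid[i])
    return dist


# Toy values (hypothetical): the reconstruction is the reference shifted by one bin.
grid = [0.0, 1.0, 2.0]
reference = [0.5, 0.5, 0.0]
reconstructed = [0.0, 0.5, 0.5]

print(wasserstein_1d(grid, reference, reconstructed))
print(kl_divergence(reference, reference))
```

The Wasserstein distance grows with how far mass has to move along the size axis, whereas KL divergence only compares masses bin by bin, which is one reason the two are often reported together.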
23 pages, 11148 KB  
Article
Impacts of Extreme Climate on NDVI in China—Response Patterns and Threshold Effects
by Mengwei Li, Rong Li, Rui Zhu, Jianan Shan, Zitong Zhao, Fulong Chen and Zhenliang Yin
Remote Sens. 2026, 18(13), 2180; https://doi.org/10.3390/rs18132180 (registering DOI) - 4 Jul 2026
Abstract
Studying the impact of extreme climate events on vegetation dynamics is crucial for maintaining ecosystem stability. Based on ERA5-Land temperature and precipitation data, as well as MOD13C2 data, this study employs Pearson correlation analysis, wavelet analysis, and the XGBoost-SHAP model to analyze the response of the Normalized Difference Vegetation Index (NDVI) in China to extreme climate variations. The findings reveal the following: (1) NDVI increased steadily at a rate of 0.021/10 yr from 2001 to 2024. Extreme temperature and precipitation indices show an increasing trend in most regions. (2) NDVI was positively correlated with most extreme temperature and precipitation indices, and showed a significant negative correlation exclusively with FDO (Frost Days) and CDD (Consecutive Dry Days). (3) R25 (Number of Heavy Precipitation Days) and SDII (Simple Daily Intensity Index) are the primary drivers of nationwide NDVI changes, with contribution rates of 31.6% and 12.5%, respectively. Extreme climate indices can significantly affect vegetation growth when surpassing certain thresholds. For instance, the thresholds for R25 and SDII are 1.85 days and 4.35 mm/day, respectively. This study provides a scientific basis for understanding and managing vegetation responses to increasing climate extremes.
29 pages, 550 KB  
Article
Multi-Stage Evaluation Framework for Identifying Deployment-Ready Prediabetes Prediction Models
by Michael Sher and Milan Toma
Mach. Learn. Knowl. Extr. 2026, 8(7), 192; https://doi.org/10.3390/make8070192 - 3 Jul 2026
Abstract
Selecting machine learning algorithms for clinical deployment demands comprehensive evaluation beyond conventional performance metrics. While automated frameworks simplify model generation, identifying algorithms suitable for real-world medical applications requires systematic assessment of learning dynamics, generalization stability, and cross-subset reliability. This study addresses prediabetes prediction through a multi-stage evaluation comparing automated machine learning frameworks, neural networks, gradient boosting implementations (XGBoost, CatBoost, LightGBM), and specialized imbalance-handling techniques. A questionnaire-based dataset with a substantial class imbalance was analyzed through progressive evaluation stages: aggregate performance metrics, learning curve analysis, minority class detection capability, and cross-subset generalization stability. Linear Discriminant Analysis achieved maximum validation metrics in automated screening but exhibited flat learning curves, indicating an exhausted learning capacity. XGBoost demonstrated optimal convergence dynamics with the highest validation performance (0.749 AUC), yet suffered substantial validation-to-test degradation (5.9 percentage points). CatBoost, despite inferior validation performance (0.696 accuracy), exhibited exceptional cross-subset stability with minimal performance decline (0.2 percentage points) while achieving a comparable test accuracy (0.694). CatBoost was selected for deployment based on its superior generalization stability, demonstrating that a multi-dimensional evaluation spanning aggregate metrics, learning dynamics, and cross-subset stability is essential for identifying clinically deployable models, as validation performance alone provides insufficient evidence for real-world applicability.
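The validation-to-test degradation quoted in the abstract above is simple arithmetic, sketched here for the CatBoost figures it reports (0.696 validation accuracy, 0.694 test accuracy). The function name is a hypothetical helper introduced for illustration, not something from the article.

```python
def degradation_pp(validation_score, test_score):
    """Validation-to-test drop in percentage points, for scores on a 0-1 scale."""
    return (validation_score - test_score) * 100.0


# CatBoost accuracies as quoted in the abstract.
catboost_drop = degradation_pp(0.696, 0.694)
print(round(catboost_drop, 1))  # 0.2
```

A small drop like this indicates that the validation score is a fair preview of held-out performance, which is the stability criterion the abstract uses to prefer CatBoost.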
32 pages, 1086 KB  
Article
A Multisource Hardware Sensing Signal Fusion Network for Robust State Prediction and Anomaly Perception
by Yufei Li, Junxian Zhao, Yi Wei, Xichen Wang, Yaqing Yang, Yang Yang and Yan Zhan
Sensors 2026, 26(13), 4234; https://doi.org/10.3390/s26134234 - 3 Jul 2026
Abstract
With the rapid development of intelligent manufacturing, edge computing, and industrial and financial–industrial digital systems, large volumes of multisource hardware sensing signals are continuously generated in complex production environments, including environmental, electrical, vibration, network communication, and device operational signals. Owing to the heterogeneity, asynchrony, noise interference, and disturbance sensitivity of these signals, conventional state prediction methods often fail to sufficiently characterize the dynamic response relationships among different sensing sources and cannot maintain stable prediction performance under non-stationary scenarios such as load surges, network congestion, and device anomalies. To address these challenges, a multisource hardware sensing signal fusion network is proposed for the edge-computing and digital production test scenario of an intelligent equipment manufacturing enterprise in Hebei Province, China, with the aim of achieving robust state prediction and anomaly perception in complex digital systems. In the proposed method, environmental sensing, device power, edge-node operation, vibration monitoring, network communication, and system output states are uniformly modeled as multisource engineering sensing signals, and an end-to-end prediction framework is constructed with cross-source sensing signal alignment to facilitate temporal coherence, disturbance-aware residual correction to substantially mitigate disturbance contamination, and context-adaptive fusion. Experimental results show that the proposed method achieves the best performance in the overall state prediction task, with MAE, RMSE, MAPE, and R² reaching 0.0968, 0.1457, 8.12%, and 0.9416, respectively, outperforming baseline methods including ARIMA, XGBoost, LightGBM, LSTM, TCN, Transformer, Attention Fusion, and Multimodal Transformer. In the disturbance robustness experiment, the Event-MAE and Event-RMSE of the proposed method are reduced to 0.1126 and 0.1694, respectively, with an Avg. Drop of only 28.98%, indicating that more stable responses can be achieved under non-stationary disturbance scenarios. In the abnormal-state recognition task, Accuracy, Precision, Recall, and F1-score values of 94.32%, 93.76%, 92.85%, and 93.30% are achieved, respectively. The results demonstrate that the proposed method can effectively improve the state prediction accuracy, disturbance robustness, and anomaly warning capability of multisource hardware sensing data in complex industrial and financial–industrial digital systems, thereby providing an effective modeling scheme for intelligent monitoring and engineering decision-making in AI-driven industrial and financial sensing scenarios.
(This article belongs to the Special Issue Intelligent Sensing and Digital Signal Processing in Smart Data)
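As a quick consistency check on the classification figures in the abstract above, the F1-score is the harmonic mean of precision and recall. The sketch below applies the standard formula to the reported 93.76% precision and 92.85% recall; the function name is a hypothetical helper, and the zero guard is a convention chosen for this sketch.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0  # convention assumed here for the degenerate case
    return 2 * precision * recall / (precision + recall)


# Precision and recall in percent, as quoted in the abstract.
print(round(f1_score(93.76, 92.85), 2))  # 93.3
```

The result agrees with the 93.30% F1-score stated in the abstract.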