The Yellow River Basin contains abundant coal resources; however, its ecological environment is inherently fragile, and vegetation degradation has been further intensified by extensive mining activities. Accurate classification of individual tree species in mining-affected areas is therefore essential for assessing ecological conditions and establishing a scientific foundation for targeted restoration and sustainable management. To address this need, a machine learning framework was developed and evaluated for individual tree species classification in a coal mining area of the Yellow River Basin using integrated unmanned aerial vehicle (UAV) data. A comprehensive feature set was constructed by extracting 278 attributes per tree. These attributes included 224 spectral bands and 29 hyperspectral indices derived from hyperspectral imagery, 24 textural metrics obtained from RGB orthophotos, and one canopy height feature generated from a LiDAR-derived model. Based on ground-truth data from 1095 individual trees, seven machine learning algorithms were trained and systematically compared: Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Gradient Boosting (GB), Logistic Regression (LR), and XGBoost. Statistical significance testing using 5 × 5 repeated cross-validation, together with the Friedman test and post hoc Nemenyi test, and additional model stability analysis consistently identified XGBoost as the optimal classifier. On an independent test set, XGBoost achieved high accuracy (Overall Accuracy = 0.897, Kappa = 0.811) with an efficient training time of 2.36 s. Further analysis demonstrated the critical and complementary roles of hyperspectral and structural features in species discrimination. The optimized model was subsequently applied to generate a detailed wall-to-wall tree species map across the entire mining area.
Overall, this study presents a statistically informed comparison of classifiers for multi-source feature-based species discrimination and delivers an evaluated and practical pipeline for effective vegetation monitoring. The proposed framework provides a scientific tool for assessing and managing ecological recovery in complex mining environments, particularly within ecologically sensitive regions such as the Yellow River Basin. Full article
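The Friedman test named in the abstract above compares classifiers by ranking them within each cross-validation fold. A minimal sketch of the test statistic follows; it is not the authors' code. The function names and the accuracy values are hypothetical, and the formula is the basic form without a tie correction. In practice a library routine such as SciPy's `friedmanchisquare` would normally be used instead.

```python
def average_ranks(row):
    """Rank the values of one fold (1 = smallest), averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction).

    scores: list of folds, each a list with one score per classifier.
    Returns (statistic, mean rank per classifier).
    """
    n = len(scores)      # number of folds (blocks)
    k = len(scores[0])   # number of classifiers (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return q, [r / n for r in rank_sums]


# Hypothetical accuracies for three classifiers over four folds.
folds = [
    [0.80, 0.85, 0.90],
    [0.78, 0.84, 0.91],
    [0.81, 0.83, 0.89],
    [0.79, 0.86, 0.92],
]
statistic, mean_ranks = friedman_statistic(folds)
```

The statistic is compared against a chi-square distribution with k − 1 degrees of freedom; a post hoc test such as Nemenyi is then applied to the mean ranks only if the Friedman test rejects the null hypothesis of equal performance.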

(This article belongs to the Special Issue Remote Sensing and Smart Forestry (Third Edition))

22 pages, 3998 KB

Open Access Article

Interspecific Habitat Suitability of Four Southeast Asian Spiny Climbing Palms (Korthalsia) Through Species Distribution Modeling

by Tushar Andriyas, Nisa Leksungnoen, Suwimon Uthairatsamee, Chatchai Ngernsaengsaruay, Nisachol Pungtambol, Pichet Chanton, Nittaya Mianmit, Wirongrong Duangjai, Buapan Puangsin and Phruet Racharak

Plants 2026, 15(9), 1348; https://doi.org/10.3390/plants15091348 - 28 Apr 2026

Abstract

Rattans of the genus Korthalsia are ecologically and economically important non-timber forest resources in Southeast Asia, yet their conservation is limited by scarce knowledge of species-specific distribution patterns and environmental constraints. We modeled the potential distributions of four Korthalsia species (K. flagellaris, K. laciniosa, K. rigida, and K. scortechinii) using species distribution models (SDMs). Models were fitted in R using the sdm package, and ensemble maps were generated by combining predictions from Random Forest (RF), Generalized Linear Models (GLMs), Generalized Additive Models (GAMs), and GLMnet. The top predictors influencing habitat distribution included soil physical structure, atmospheric moisture demand, and canopy light availability. The dominance of these factors reflects three distinct and non-interchangeable environmental axes that regulate belowground moisture dynamics, atmospheric constraints on gas exchange, and the energetic requirements for recruitment. The ensemble models for all four species significantly outperformed the null model, and spatial block cross-validation (k = 5, 200 km blocks) indicated a marginal drop in area under the curve (AUC) values, confirming a predictive signal under geographically independent evaluation. Ensemble suitability maps identified Peninsular Malaysia, Borneo, and Sumatra as centers of predicted habitat. Core habitat was estimated to be less than 0.6% of total suitable area for all species, ranging from 980 km² (K. scortechinii) to 19,256 km² (K. rigida), with anthropogenic modification exceeding 50% in the core habitat of K. flagellaris and K. rigida. These results provide the first species-specific baseline for these Korthalsia species across Southeast Asia, supporting more targeted conservation and restoration planning under varying habitat constraints. Full article

(This article belongs to the Special Issue Conservation, Ecology and Management of Rare and Endangered Plant Species)

14 pages, 1640 KB

Open Access Article

Small-Data Neural Computing Outperforms RSM: Low-Cost Smart Optimization in Injection Molding

by Ming-Lang Yeh, Wen Pei and Han-Ching Huang

Appl. Sci. 2026, 16(9), 4288; https://doi.org/10.3390/app16094288 - 28 Apr 2026

Abstract

In smart manufacturing, the injection molding industry faces a “data scarce environment” due to prohibitive physical trial costs. Processing recycled polypropylene (rPP) exacerbates this challenge, as traditional response surface methodology (RSM) fails to capture complex non-linear rheological behaviors induced by material variability. This study proposes a “domain-knowledge guided data augmentation framework,” integrating Taguchi experimental data (L₂₅) with Moldex3D digital twin simulations to construct a 300-sample hybrid dataset. A back-propagation neural network (BPNN) with L₂ regularization was employed for small-sample learning, providing a continuous differentiable physical mapping. To rigorously prevent neighborhood data leakage, the model was evaluated via a strict nested group-based 5-fold cross-validation. Particle swarm optimization (PSO) was coupled to overcome the local minima of gradient descent. Comparative analysis demonstrates that BPNN significantly outperforms both traditional RSM and a newly introduced Random Forest (RF) baseline, achieving a testing mean squared error (MSE) of 0.001 (±0.0002) and a testing R² of 0.95. PSO minimized the shrinkage rate to 3.079%, validated via Moldex3D digital twin simulation with a 0.19% relative error. Synergizing virtual–physical integration with robust neural computing enables superior process control precision in small-data regimes, offering small and medium-sized enterprises (SMEs) a cost-effective pathway for smart optimization. Full article

(This article belongs to the Section Applied Industrial Technologies)

34 pages, 3920 KB

Open Access Article

A Data-Centric Approach to Water Quality Prediction: Sample Size, Augmentation, and Model Performance with a Focus on Ammonium in a Tropical Wetland

by Doris Mejia Avila, Viviana Soto Barrera and Franklin Torres Bejarano

Water 2026, 18(9), 1043; https://doi.org/10.3390/w18091043 - 28 Apr 2026

Abstract

Framed within data-centric artificial intelligence, this study integrates statistics, geotechnologies and AI to improve water quality prediction. The primary objective was to identify the minimum sample size required to train robust and accurate machine learning models. Based on 30 sampling points in a tropical wetland in northern Colombia, ammonium concentration was selected as the target variable, and total dissolved solids, suspended solids, phosphate, dissolved oxygen, nitrate and chemical oxygen demand were chosen as predictors. Because 30 observations are insufficient to train robust models, data augmentation was performed using ordinary kriging (OK) and empirical Bayesian kriging (EBK). From the kriging-interpolated surfaces, 1000 synthetic points (randomly and spatially distributed while preserving the estimated spatial structure) were sampled; from this expanded dataset, subsamples of varying sizes were drawn to train six algorithms: multiple linear regression (MLR), random forest (RF), k-nearest neighbours (k-NN), gradient boosting machines (GBM), multilayer perceptron (MLP) and radial basis function neural network (RBF-NN). The RF, k-NN, MLP, RBF-NN and GBM models trained on the interpolated data exhibited excellent performance: in the testing phase, they achieved adjusted coefficients of determination > 0.95 and symmetric mean absolute percentage errors (SMAPEs) < 10%, and the resulting predictive surfaces showed comparable performance under external validation. According to the criteria of stability, goodness of fit, and external validation, the optimal minimum sample size for most algorithms was 104 observations. These results represent a significant advance in mitigating data scarcity in water quality modelling. The identification of effective data augmentation methods and the determination of appropriate sample sizes, as demonstrated here, support the robust application of AI techniques in water quality prediction. 
The proposed strategy is transferable to other quantitative, spatially continuous environmental variables and thus contributes to the development of the emerging subdiscipline of geospatial artificial intelligence (GeoAI). Full article
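The symmetric mean absolute percentage error (SMAPE) reported in the abstract above has several published variants. The sketch below shows one common form, with the mean of the absolute actual and predicted values in the denominator. Whether the study used this exact variant is an assumption, and the function name and zero-handling convention are illustrative only.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (range 0 to 200)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Convention assumed here: a pair of exact zeros contributes no error.
        total += 0.0 if denom == 0 else abs(p - a) / denom
    return 100.0 * total / len(actual)
```

Under this definition, a prediction of 3.0 for an observed value of 1.0 gives an error of 2.0 over a mean magnitude of 2.0, which is a SMAPE of 100%.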

(This article belongs to the Section Water Quality and Contamination)

40 pages, 4664 KB

Open Access Article

Physics-Informed Machine Learning for Property Prediction and Process Optimization in Additively Manufactured Filled Polymer Composites: A Bayesian Optimization Approach

by Kimberley Rooney, Sajib Mistry, Alokesh Pramanik and Animesh K. Basak

Industries 2026, 1(1), 2; https://doi.org/10.3390/industries1010002 - 28 Apr 2026

Abstract

The development of filled photopolymer composites for Digital Light Processing (DLP) additive manufacturing requires optimizing processing parameters to achieve the desired mechanical properties. Traditional experimental approaches are time-intensive, while physics-based models often struggle to capture the complex interactions among parameters. This study presents a physics-informed machine learning framework that combines Random Forest with Bayesian optimization (RF-BO) to predict the ultimate tensile strength in recycled thermoset resin composites manufactured via DLP. A validation dataset of 19 systematically varied formulations (each with n = 5 measurement replicates for reliability) was generated and augmented with 1500 physics-informed synthetic samples to enable robust model training. The limited experimental dataset, while insufficient for traditional statistical inference, provided critical validation of physical trends, including non-monotonic particle-size effects and optimal processing windows. Six machine learning algorithms were evaluated, with RF-BO achieving superior performance (R² = 0.9125, MSE = 1.07 MPa). The framework identified optimal processing conditions of 59–64 μm particle size, 5.0 ± 0.5 wt.% concentration, and 60 min cure time, predicting a maximum UTS of 43.84 MPa with a prediction error of less than 1.0 MPa. Feature importance analysis revealed that cure time was the dominant parameter (40%), followed by particle size (30%), validating the physical interpretability. This approach demonstrates significant potential for accelerating materials design in composite additive manufacturing while maintaining physically meaningful predictions. Full article

(This article belongs to the Special Issue Machine Learning in Manufacturing: Digital Twins, Optimization and Control)

16 pages, 919 KB

Open Access Article

A Comparative Performance Study of Host-Based Intrusion Detection Using TextRank-Based System Call Preprocessing and Deep Learning Models

by Hyunwook You, Chulgyun Park, Dongkyoo Shin and Dongil Shin

Electronics 2026, 15(9), 1856; https://doi.org/10.3390/electronics15091856 - 27 Apr 2026

Abstract

Host-based intrusion detection systems (HIDSs) can address the limitations of network-based detection by analyzing system calls and other low-level events. Many existing benchmark datasets remain inadequate for evaluating modern attacks because they were built in outdated environments and cover only a limited set of attack behaviors. To address this gap, this study builds a TextRank-based preprocessing pipeline on the LID-DS 2021 dataset and compares five end-to-end pipelines: Random Forest (RF), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) + LSTM, Bidirectional LSTM (BiLSTM), and CNN + Bidirectional Gated Recurrent Unit (BiGRU). Of the 15 scenarios in the dataset, six multi-stage attacks were excluded, and three representative scenarios were selected based on attack-category coverage and suitability for single-chunk host-level detection. Within these three selected scenarios and same-scenario file-level splits, the deep learning pipelines achieved F1-scores of 0.90–0.94, whereas RF ranged from 0.55 to 0.63. Among the evaluated pipelines, CNN + BiGRU produced the strongest overall results. These findings indicate that, under this constrained evaluation setting, sequential deep learning pipelines can be effective for scenario-specific system-call-based HIDS; however, broader generalization to unseen attacks or to the full LID-DS 2021 scenario set remains unverified. Full article

(This article belongs to the Special Issue Emerging Research Trends and Technologies in Intrusion Detection Systems (IDSs) and Artificial Intelligence (AI) Utilization)

26 pages, 8312 KB

Open Access Article

Attention-Enhanced ResUNet for Dynamic Tropopause Pressure Retrieval over the Winter Tibetan Plateau: Integrating FY-4A Multi-Channel Data with Topographic Constraints

by Junjie Wu, Liang Bai, Mingrui Lu, Xiaojing Li, Wanyin Luo and Tinglong Zhang

Remote Sens. 2026, 18(9), 1342; https://doi.org/10.3390/rs18091342 - 27 Apr 2026

Abstract

The dynamical tropopause layer pressure (DTLP) represents a key interface characterizing upper-tropospheric stratification and atmospheric dynamical structure. Its spatial morphology and gradient variations directly influence jet stream distribution as well as the intensity and location of clear-air turbulence (CAT). Over the Tibetan Plateau, complex terrain and pronounced dynamical variability result in a significantly lower tropopause height and enhanced horizontal gradients during winter. Aircraft cruising altitudes frequently approach or intersect the tropopause layer in this region, making accurate and fine-scale characterization of DTLP structures critically important for aviation safety. A deep learning-based DTLP retrieval model (Att-ResUNet_DEM) is developed by integrating terrain constraints and an attention mechanism. Using MERRA-2 reanalysis data as supervisory labels, the model incorporates a squeeze-and-excitation (SE) attention mechanism within a residual encoder–decoder framework, while a digital elevation model (DEM) is introduced as an additional input channel and fused with satellite brightness temperature data to explicitly account for terrain effects. A random forest (RF) model is implemented as a baseline for comparison. Compared with the RF model, the Att-ResUNet_DEM reduces the MAE and RMSE by 13.20% and 9.19%, respectively, while increasing the correlation coefficient to 0.76. Over the primary aviation corridors of the Tibetan Plateau, the Att-ResUNet_DEM model achieves a correlation coefficient (R) of 0.87, with markedly reduced gradient dispersion. A representative CAT case further confirms the model’s ability to capture the overall DTLP morphology and gradient enhancement zones.
Overall, by combining a regionalized modeling strategy with terrain constraints, this study systematically improves DTLP retrieval accuracy and gradient consistency over complex terrain, providing a new technical pathway for high-resolution tropopause monitoring and aviation operation support. Full article

(This article belongs to the Special Issue Satellite Observation of Middle and Upper Atmospheric Dynamics)

21 pages, 27653 KB

Open Access Article

Field Phenotyping of Triticale Overwintering Dynamics Under Varied Sowing Practices Using Spectral Indices

by Wenjun Gao, Xiaofeng Cao, Mengyu Sun, Ruyu Li, Tile Huang and Weiyue Ma

Agronomy 2026, 16(9), 880; https://doi.org/10.3390/agronomy16090880 (registering DOI) - 27 Apr 2026

Abstract

This study aims to enhance the early warning and monitoring of frost damage in triticale (×Triticosecale Wittmack), as well as to identify frost-tolerant materials. To this end, this work focused on phenotyping the dynamics of triticale under different damage intensities using spectral indices. Sixteen triticale genotypes were planted under three sowing date (SD) treatments, with three sowing rate (SR) gradients set for each SD. The multispectral data of triticale under six frost damage intensities were acquired using an unmanned aerial vehicle (UAV) platform. A total of eight spectral indices (SIs) were extracted from samples under each intensity. In general, for each combination of SD and SR, all SIs decreased monotonically with increasing damage intensity. These indices are therefore suitable for monitoring frost damage in triticale under complex sowing scenarios. Under early frost damage, the relative decline rates (RDRs) of the SRI (Simple Ratio Vegetation Index), EVI2 (Enhanced Vegetation Index 2), NIRv (Near-Infrared Reflectance of Vegetation), and GLI (Green Leaf Index) were higher than those of other indices, indicating that they are more sensitive to early frost damage and thus more suitable for frost warning. Under frost stress, the RDRs of the indices were higher in early-sown samples than in late-sown samples. SD plays a more significant role than SR in determining the response of triticale indices to frost damage. Models were developed to detect triticale under varying damage intensities with SIs and classification algorithms—XGBoost, Quadratic Discriminant Analysis (QDA), Random Forest (RF), and Support Vector Machine (SVM). The SVM classifier demonstrated the best generalization performance (overall accuracy: 98.03%; F1-score: 0.98). The detection contributions of indices within the optimal model were evaluated by their respective SHAP (Shapley Additive Explanations) values. 
The GLI, NIRv, NDVI (Normalized Difference Vegetation Index), and GNDVI (Green Normalized Difference Vegetation Index) were identified as key indices, as they exhibit higher cumulative SHAP values. Identification models for triticale with different frost tolerance levels were established based on the time-series data of these key indices and the above four algorithms. The optimal model based on the SVM algorithm achieved an identification accuracy exceeding 90%. The average overwintering dynamics and frost damage responses of the key indices were analyzed for triticale with different frost tolerance levels under all treatments. Under frost stress, these indices and their RDRs in frost-tolerant triticale were generally higher and lower, respectively, than those in frost-sensitive triticale. These four key indices can thus assist in the identification of frost tolerance in triticale. This study aids in the early warning and monitoring of frost damage in triticale under complex planting scenarios and the evaluation of overwintering performance in triticale germplasm. Full article

(This article belongs to the Special Issue Advancing Plant Phenotyping for Precision Crop Growth Monitoring and Forecast Leveraging In Situ, Remote and Proximal Sensing, and AI)

14 pages, 684 KB

Open Access Article

Comparison of a Linear Mixed Model and Tree-Based Machine Learning Models for Daily Milk Yield Prediction in Dairy Cows During Summer

by Babak Darabighane and Alberto Stanislao Atzori

Information 2026, 17(5), 415; https://doi.org/10.3390/info17050415 - 27 Apr 2026

Abstract

The expansion of digital technologies in dairy farming (precision dairy farming) has created new opportunities for the systematic use of data, which can lead to more efficient production processes. This study aimed to develop and evaluate models for predicting daily milk yield from dairy cows during summer. This yield was modeled at the individual level, with days in milk and parity group included as baseline covariates in all analyses. Three feature-set scenarios were defined and evaluated, in which the temperature–humidity index (THI) and milk yield history were added to the baseline variables either separately (Scenarios 1 and 2) or jointly (Scenario 3). Performance was evaluated using walk-forward validation, and feature selection was nested within each iteration’s training window. The performance of the linear mixed model (LMM) was then compared with two machine learning models, random forest (RF) and gradient boosting machine (GBM), within the same experimental framework. In Scenario 3, all three models showed similar fits (R² = 0.92 and concordance correlation coefficient = 0.96), although the GBM model yielded a smaller error (root mean square error [RMSE] = 2.07 ± 0.22, mean absolute error [MAE] = 1.39 ± 0.12) than the RF model (RMSE = 2.10 ± 0.23, MAE = 1.45 ± 0.13) and the LMM (RMSE = 2.15 ± 0.22, MAE = 1.41 ± 0.10). Overall, adding the THI and recent milk yield history to the baseline variables improved short-term prediction accuracy in this dataset, with the GBM model showing the smallest error. These results can support farmers and herd managers in predicting short-term milk yield under heat stress conditions and making timely management decisions. Full article
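Walk-forward validation, as used in the abstract above, always trains on past observations and tests on the period that follows. The sketch below generates such splits with an expanding training window. The abstract does not state whether the window was expanding or rolling, so that choice, the function name, and the parameters are assumptions for illustration.

```python
def walk_forward_splits(n_samples, initial_train, horizon=1):
    """Yield (train_idx, test_idx) pairs for time-ordered data.

    The training window starts with `initial_train` observations and grows
    by `horizon` after each step; the test window is the next `horizon`
    observations. Any leftover tail shorter than `horizon` is not used.
    """
    if initial_train < 1 or horizon < 1:
        raise ValueError("initial_train and horizon must be at least 1")
    start = initial_train
    while start + horizon <= n_samples:
        train_idx = list(range(0, start))
        test_idx = list(range(start, start + horizon))
        yield train_idx, test_idx
        start += horizon


# Six daily records, first three used for the initial fit.
splits = list(walk_forward_splits(6, 3))
```

Any step fitted on the data, such as the feature selection the abstract describes, would be run on `train_idx` only inside each iteration, so that information from the test period never reaches the model.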

(This article belongs to the Special Issue Advancing Smart Systems Through Deep Learning, Generative AI, and Big Data Analytics)

27 pages, 1739 KB

Open Access Article

Optimization of Soil Steam Sterilization for Panax notoginseng Based on SVR Multi-Output Prediction and Multi-Decision Mode

by Liangsheng Jia, Bohao Min, Liang Yang, Yanning Yang, Hao Zhang and Xiangxiang He

Agronomy 2026, 16(9), 877; https://doi.org/10.3390/agronomy16090877 (registering DOI) - 26 Apr 2026

Abstract

Empirical parameter settings in steam-based soil disinfestation for Panax notoginseng (a valuable medicinal plant) often hinder the simultaneous optimization of pathogen control and energy efficiency. To address this limitation, this study aims to develop a parameter regulation framework that integrates multi-output regression with scenario-oriented intelligent decision-making. Initially, a comprehensive dataset comprising critical parameters—steam pressure (P_steam), soil compaction (C_soil), and heating time (t_heat)—was established. A random search (RS) hyperparameter optimization scheme was employed to comparatively evaluate the multi-output predictive performance of Random Forest (RF), Support Vector Regression (SVR), and Multilayer Perceptron (MLP) for the joint estimation of soil temperature (T_soil) and root-rot pathogen kill rate (Kill_rate). Subsequently, by integrating total energy consumption (E_total) and operating electricity cost models, a constrained search algorithm was implemented to develop three objective-oriented decision modes: “maximize Kill_rate”, “minimize C_electricity”, and “maximize Efficiency”. Results demonstrate that the RS-optimized SVR yielded superior multi-output performance, achieving R² of 0.968 for T_soil (MAE = 2.44 °C) and 0.808 for Kill_rate (MAE = 7.85%). Compared to conventional empirical configurations, the proposed decision modes exhibited significant advantages across diverse scenarios. In the “maximize Kill_rate” mode, dynamic extensions of t_heat facilitated theoretical complete inactivation even under challenging heating conditions, effectively eliminating disinfection “blind spots” inherent in fixed-duration strategies. Under the “minimize C_electricity” mode, precise regulation of P_steam reduced operational electricity costs by 18.2% while satisfying the constraint of Kill_rate ≥ 95%. 
Furthermore, the “maximize Efficiency” mode identified an optimal operating point at C_soil = 64 kPa (P_steam = 0.4 MPa, t_heat = 13 min), thereby mitigating performance degradation associated with excessive tillage or high media rigidity and achieving an optimized cost–benefit ratio. By synthesizing high-fidelity multi-output regression with a flexible multi-mode decision-making framework, this study provides an intelligent solution for soil disinfestation in protected agriculture, facilitating the coordinated optimization of phytosanitary efficacy, energy expenditure, and economic viability. Full article

(This article belongs to the Section Soil and Plant Nutrition)

15 pages, 3268 KB

Open Access Article

Assessing Climate-Driven Range Dynamics of Hippophae tibetana Schltdl. Using an Ensemble Modeling Approach

by Tao Ma, Biyu Liu, Danping Xu and Zhihang Zhuo

Diversity 2026, 18(5), 257; https://doi.org/10.3390/d18050257 - 26 Apr 2026

Viewed by 58

Abstract

Hippophae tibetana Schltdl. is a cold-tolerant deciduous shrub endemic to the Tibetan Plateau, playing a vital ecological role in high-altitude environments. This study utilized the Biomod2 platform to model its current and future potential distribution under climate change, integrating 34 environmental variables across bioclimatic, topographic, edaphic, anthropogenic, and ultraviolet (UV) dimensions. Among ten candidate species distribution models (SDMs), the random forest (RF) algorithm exhibited the highest predictive accuracy and stability. An ensemble model (EM) combining RF, GBM, MARS, and FDA further improved predictive performance (ROC = 0.992, TSS = 0.923, and Kappa = 0.886). Key determinants of habitat suitability included altitude, temperature, UV radiation, and biodiversity, with RF response curves revealing distinct nonlinear thresholds. Optimal suitability occurred at around a 4000 m elevation, decreasing beyond this range, while temperature and UV exhibited similar unimodal responses. Under the SSP2-4.5 climate scenario, the suitable habitat is projected to expand from the 2050s to the 2090s, particularly in eastern Qinghai, southwestern Gansu, northwestern Sichuan, and central–southern Tibet. The species’ distribution centroid is anticipated to shift southwestward toward Qinghai Province, with more rapid migration projected after the 2050s. These findings underscore the complex interplay of environmental factors shaping H. tibetana distribution and offer valuable insights for conservation planning in the ecologically fragile Tibetan Plateau. Full article

(This article belongs to the Section Biodiversity Conservation)

31 pages, 7149 KB

Open Access Article

Nationwide Solar Radiation Zoning and Performance Comparison of Empirical and Deep Learning Models

by Bing Hui, Qian Zhang, Lei Hou, Yan Zhang, Qinghua Shi, Guoqing Chen and Junhui Wang

Appl. Sci. 2026, 16(9), 4229; https://doi.org/10.3390/app16094229 - 26 Apr 2026

Viewed by 59

Abstract

Accurate solar radiation estimation is critical for optimizing solar energy applications. This study divided 819 meteorological stations in China into six solar radiation zones using k-means, hierarchical, and bisecting k-means clustering based on daily relative sunshine duration. Correlation analysis and feature importance evaluation were conducted to quantify the contributions of key meteorological variables. A comparison of models considering regional heterogeneity was performed. Six sunshine-based empirical models, three machine learning models (Random Forest, Support Vector Machine, and Extreme Gradient Boosting), and two deep learning models (Long Short-Term Memory and Gated Recurrent Unit) were systematically evaluated across 98 stations with observed solar radiation data. Model performance was assessed using the coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE), and normalized RMSE (NRMSE). Results showed that k-means clustering outperformed the other two methods and was adopted for final zoning. The correlation analysis identified sunshine duration (S), extraterrestrial radiation (R_a), temperature difference (ΔT), and maximum temperature (T_max) as the dominant influencing factors, with clear regional heterogeneity. The deep learning models, particularly LSTM (R² = 0.939, RMSE = 1.702 MJ m⁻² d⁻¹, MAE = 1.319 MJ m⁻² d⁻¹, NRMSE = 0.046), achieved the highest accuracy, followed by GRU, XGB, SVM, and RF. Among the empirical models, Model 5 performed best in Zones 1, 3, 4, and 5, while Model 6 was optimal in Zones 2 and 6. The key novelty of the study is an integrated zoning–prediction framework for regional solar radiation estimation, combining clustering validation, correlation analysis, empirical model calibration, and deep learning benchmarking, with enhanced physical interpretability and prediction accuracy. Full article

20 pages, 1387 KB

Open Access Article

Multidimensional Heterogeneous Hierarchical Measurement Model for Civil Aviation Passengers’ Sensitive Data

by Shuang Wang, Fangzheng Liu, Zhiping Li, Lei Ding and Zhaojun Gu

Symmetry 2026, 18(5), 738; https://doi.org/10.3390/sym18050738 (registering DOI) - 26 Apr 2026

Viewed by 103

Abstract

To address the challenges of complex, heterogeneous, and blurred sensitivity boundaries in the sensitive data sources of civil aviation passengers, this paper proposes a hierarchical measurement method. This model integrates information entropy and random forest, achieving measurable sensitivity. Firstly, the correlation between data sensitivity level and business characteristics is established. Then, a Random Forest-based Hierarchical Measurement with Sensitivity Information Content Analysis (RF-HM-SICA) model integrating information entropy and random forest is proposed to construct a sensitivity measurable hierarchical measurement method for passenger sensitive data. The experimental results show that the RF-HM-SICA model exhibits high stability, generalization capability, and boundary sample protection ability under different data sizes and sensitivity levels, making it suitable for solving the multidimensional heterogeneity measurement problem of sensitive data of civil aviation passengers and providing support for data security sharing protection. In particular, the recognition accuracy and precision for high-sensitivity data approach 1.0 across datasets of different scales, while RF-HM-SICA exhibits the lowest misclassification rate among all compared models. Full article

(This article belongs to the Special Issue Security and Privacy Protection for Mobile Crowd Sensing)

19 pages, 5937 KB

Open Access Article

Integrating Pigeon-Inspired Optimization and Support Vector Machines for Forest Aboveground Biomass Estimation

by Xiaomeng Kang, Ling Wang, Chunyan Chang, Xicun Zhu, Xiao Liu, Chang Qiu, Xianzhang Meng and Danning Chen

Forests 2026, 17(5), 524; https://doi.org/10.3390/f17050524 (registering DOI) - 25 Apr 2026

Viewed by 137

Abstract

Estimating forest aboveground biomass (AGB) in mountainous forest ecosystems remains a significant challenge because of complex terrain and the high cost and limited applicability of traditional field-based methods. To address this issue, a remote sensing-based AGB estimation framework integrating intelligent optimization and machine learning was developed for Mount Tai in eastern China. Sentinel-2 multispectral data were used to derive 105 candidate variables, including spectral bands, vegetation indices, texture features, and topographic factors, from which 17 key variables were selected using Pearson correlation analysis for model construction. A Support Vector Machine (SVM) optimized by the Pigeon-inspired optimization (PIO) algorithm was developed to adaptively determine optimal hyperparameters, and its performance was compared with that of Random Forest (RF) and standard SVM models. Among the three models, PIO-SVM produced the highest numerical accuracy. For the training dataset, it obtained an R² of 0.85 and an RMSE of 46.12 t/hm². For the testing dataset, it achieved an R² of 0.73 and an RMSE of 62.19 t/hm², compared with 0.72 and 66.25 t/hm² for the standard SVM model and 0.70 and 65.19 t/hm² for the RF model. The spatial distribution of AGB derived from the optimal model shows higher AGB values in the central and northern regions characterized by dense forest cover, in close agreement with field observations. Overall, the results suggest that PIO-based parameter optimization can improve SVM performance for AGB estimation in mountainous forests. This study provides a reliable and efficient framework for regional-scale monitoring of forest biomass and carbon sink dynamics. Full article

(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

19 pages, 1618 KB

Open Access Article

Simulation and Correction Study of Solar Irradiance in Guangdong Based on WRF-Solar and Random Forest

by Yuanhong He, Zheng Li, Fang Zhou and Zhiqiu Gao

Energies 2026, 19(9), 2077; https://doi.org/10.3390/en19092077 - 24 Apr 2026

Viewed by 128

Abstract

To improve solar irradiance simulation accuracy for precise photovoltaic power forecasting, we developed a hybrid framework combining WRF-Solar numerical simulation and random forest (RF) machine learning for a PV plant in Guangdong, China. Weather conditions were objectively classified into clear, intermittent cloudy, and overcast using the Daily Variability Index (DVI) and Daily Clear-sky Index (DCI). We calibrated the WRF-Solar model’s microphysics and radiative transfer schemes via sensitivity tests to optimize overcast-sky performance, then applied RF correction to the simulated irradiance. Results show that RF correction significantly reduces simulation errors for intermittent and overcast conditions, while the original WRF-Solar outperforms the corrected results under clear skies due to RF overfitting. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Photovoltaic Energy Systems)
