Search Results (5,925)

Search Parameters:
Keywords = random forest regression

29 pages, 6403 KB  
Article
Integrating Machine Learning and Geospatial Analysis for Nitrate Contamination in Water Resources Management: A Case Study of Sinkholes in Winkler County, Texas
by Rapheal Udeh, Joonghyeok Heo, Jeongho Lee and Moung-Jin Lee
Water 2026, 18(6), 710; https://doi.org/10.3390/w18060710 - 18 Mar 2026
Abstract
This study used machine learning methods and spatial analysis to examine groundwater quality in Winkler County, Texas, focusing on nitrate pollution. Analyzing 85 years of groundwater data from six aquifers, the study applied four machine learning models (Random Forest, Decision Tree, Linear Regression, and XGBoost) to predict contamination levels and explore spatial and temporal trends. These models were chosen for their ability to handle large, complex datasets and to capture nonlinear relationships between water quality parameters and environmental variables; they are particularly effective at identifying patterns and interactions that traditional analytical methods may miss, yielding more reliable and accurate results. Our decadal analysis identified systematic fluctuations in nitrate levels, with a notable increase since the early 2000s driven by the synergistic effects of rising temperatures and intensified agricultural land use. Climate change, manifested in rising temperatures and reduced precipitation, along with natural factors such as sinkhole formation, was identified as a key driver of groundwater quality fluctuations. Elevated nitrate levels were mostly related to agricultural irrigation and excessive use of synthetic fertilizers. The machine learning models also highlight how land cover changes and human activities contribute to groundwater quality deterioration. This research reinforces the value of integrating machine learning and spatial analysis for groundwater management, especially in areas affected by sinkholes, and provides important information for reducing anthropogenic impacts on water quality in West Texas. Full article
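The multi-model comparison this abstract describes can be sketched in a few lines of Python with scikit-learn. The snippet below is a hypothetical illustration only: the predictors (well depth, temperature, fertilizer input) and the nitrate response are entirely synthetic, XGBoost is omitted, and nothing here reproduces the study's actual data or tuning.

```python
# Hypothetical sketch: comparing regressors for a nitrate-like target.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 400
# Invented predictors: well depth (m), temperature (C), fertilizer input (kg/ha)
X = rng.uniform([5.0, 10.0, 0.0], [150.0, 35.0, 300.0], size=(n, 3))
# Synthetic nonlinear "nitrate" response plus noise (units entirely made up)
y = (2.0 + 0.05 * X[:, 2] + 0.3 * np.maximum(X[:, 1] - 25.0, 0.0)
     - 0.01 * X[:, 0] + rng.normal(0.0, 1.0, n))

models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "decision_tree": DecisionTreeRegressor(max_depth=6, random_state=0),
    "linear": LinearRegression(),
}
# 5-fold cross-validated R^2 per model, as a simple comparison criterion
scores = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
          for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name}: mean CV R^2 = {r2:.3f}")
```

The same loop extends naturally to any additional regressor with a scikit-learn-compatible `fit`/`predict` interface.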

16 pages, 4589 KB  
Article
Estimation of PM2.5 Concentration in Yangquan City from 2015 to 2024 Based on MODIS Image and Meteorological Data and Analysis of Spatial and Temporal Variation
by Qinfeng Yao, Jinjun Liu, Shenghua Chen, Yongxiang Ning and Sunwen Du
Atmosphere 2026, 17(3), 308; https://doi.org/10.3390/atmos17030308 - 18 Mar 2026
Abstract
This study employed Moderate-Resolution Imaging Spectroradiometer (MODIS) aerosol optical depth data, meteorological data, a Digital Elevation Model (DEM), the Normalized Difference Vegetation Index (NDVI), and ground monitoring data for particulate matter (PM2.5) to construct a model for estimating the PM2.5 concentration in Yangquan City, Shanxi Province, from 2015 to 2024. The spatial and temporal changes in the PM2.5 concentration were analyzed. The results revealed the following: (1) The random forest model was more accurate than the multiple linear regression model. The spring model R2 increased by 38.7%, and the Root Mean Square Error (RMSE) decreased by 92.6%. The summer model R2 increased by 65.1%, and the RMSE decreased by 92.5%. The autumn model R2 increased by 2.7%, and the RMSE decreased by 83.4%. The winter model R2 increased by 25.4%, and the RMSE decreased by 95.5%. (2) The PM2.5 concentration in Yangquan City showed an upward trend from 2015 to 2017 and then a downward trend from 2018 to 2024, with an average decrease of 18.3 μg/m3. The highest PM2.5 concentration was 55–85 μg/m3 in winter, and the lowest was 25–40 μg/m3 in summer. In terms of spatial distribution, the PM2.5 concentration in Yangquan City was lower in the northwest and higher in the southeast, with high values primarily concentrated in the central urban areas and major industrial zones of the southeast. Full article

36 pages, 11911 KB  
Article
Soil Moisture Retrieval Using Multi-Satellite Dual-Frequency GNSS-IR Considering Environmental Factors
by Shihai Nie, Yongjun Jia, Peng Li, Xing Wu and Yuchao Tang
Remote Sens. 2026, 18(6), 917; https://doi.org/10.3390/rs18060917 - 18 Mar 2026
Abstract
Global Navigation Satellite System Interferometric Reflectometry (GNSS-IR) provides a low-cost, all-weather approach for continuous soil moisture content (SMC) retrieval. However, in single-constellation, multi-satellite applications, the optimal satellite number and the combined effects of multiple environmental factors on retrieval accuracy and stability remain insufficiently quantified. To address these issues, this study develops a dual-frequency GNSS-IR SMC retrieval framework that explicitly incorporates multiple environmental factors. Entropy-based fusion (EFM) is used to adaptively weight dual-frequency phase-delay observations, and a marginal-gain criterion is introduced to determine a suitable number of participating satellites. On this basis, univariate linear regression (ULR) and random forest (RF) models are established, and the Normalized Difference Vegetation Index (NDVI), temperature, and precipitation are incorporated into the RF model to improve retrieval robustness and quantify the relative contributions of environmental factors. The results show that multi-satellite combinations significantly improve SMC retrieval performance, while the incremental gain exhibits clearly diminishing returns and converges when the number of participating satellites reaches about 5–6 within a single constellation. Dual-frequency fusion consistently outperforms single-frequency schemes across different GNSS constellations, demonstrating the complementary value of multi-frequency information under multi-satellite conditions. In addition, the environmentally informed nonlinear model achieves higher accuracy and stability than the linear model, and the dominant environmental drivers differ across stations. Overall, this study provides quantitative support for configuring single-constellation multi-satellite GNSS-IR soil moisture monitoring schemes and for improving retrieval robustness under complex environmental conditions. Full article
(This article belongs to the Special Issue Remote Sensing in Monitoring Coastal and Inland Waters)
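The entropy-based fusion step can be illustrated with the standard entropy-weight method, in which channels whose observations carry more structure (lower entropy) receive larger weights. The paper's exact EFM formulation may differ, and the two "frequency channels" below are invented.

```python
# Minimal entropy-weight fusion sketch (assumed formulation: the common
# entropy-weight method; not necessarily the paper's EFM scheme).
import numpy as np

def entropy_weights(obs):
    """obs: (n_samples, n_channels) non-negative observations."""
    p = obs / obs.sum(axis=0, keepdims=True)             # column-wise proportions
    p = np.clip(p, 1e-12, None)                           # avoid log(0)
    e = -(p * np.log(p)).sum(axis=0) / np.log(len(obs))   # per-channel entropy in [0, 1]
    d = 1.0 - e                                           # divergence: lower entropy -> larger weight
    return d / d.sum()

rng = np.random.default_rng(1)
# Two hypothetical dual-frequency phase-delay channels (L1-like and L2-like)
obs = np.column_stack([rng.gamma(5.0, 1.0, 200), rng.gamma(2.0, 2.0, 200)])
w = entropy_weights(obs)
fused = obs @ w                                           # weighted fusion per epoch
print(w, fused.shape)
```

The weights sum to one by construction, so the fused series stays on the scale of the inputs.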

28 pages, 7628 KB  
Article
Fine-Scale and Population-Weighted PM2.5 Modeling in Melbourne: Towards Detailed Urban Exposure Mapping
by Jun Gao, Xuying Ma, Qian Chayn Sun, Wenhui Cai, Xiaoqi Wang, Yifan Wang, Zelei Tan, Danyang Li, Yuanyuan Fan, Leshu Zhang, Yixin Xu, Xueyao Liu and Yuxin Ma
ISPRS Int. J. Geo-Inf. 2026, 15(3), 134; https://doi.org/10.3390/ijgi15030134 - 17 Mar 2026
Abstract
Despite concern over air pollution, fine-scale spatial and demographic disparities in exposure remain largely unquantified in Australian cities due to sparse monitoring and coarse models. In Greater Melbourne, this gap limits neighbourhood-level assessment of PM2.5 exposure and associated environmental inequalities. To address this gap, we integrated 6-month averaged PM2.5 observations (October 2023 to March 2024) from 5 regulatory monitoring stations and 13 low-cost sensors (LCSs) to develop a land use regression (LUR) model estimating concentrations at a 100 m resolution. These estimates were used to calculate population-weighted PM2.5 exposure (PWE) at the mesh block level across Melbourne. To examine factors associated with spatial heterogeneity in PWE, we applied a hybrid modeling framework combining Spatially Explicit Random Forest (Spatial-RF) and Geographically Weighted Regression (GWR), incorporating physical, built-environment, and socio-demographic variables from the Synthesized Multi-Dimensional Environmental Exposure Database (SEED). The Spatial-RF model initially exhibited an R2 of 0.56. After multicollinearity diagnostics using the Variance Inflation Factor (VIF), three key explanatory variables were selected for GWR modeling: the Normalized Difference Vegetation Index (NDVI), the Index of Education and Occupation (IEO), and the proportion of culturally and linguistically diverse populations (CALDP). The developed GWR model achieved higher model performance (R2 = 0.65) than Spatial-RF and global Ordinary Least Squares (OLS) regression (R2 = 0.38), revealing strong spatial non-stationarity. Results show that PWE generally ranged from 5 to 7 µg/m3, exceeding the 2021 WHO air quality guideline, with hotspots in the urban core and along major transport corridors. Elevated exposure occurred both in socioeconomically disadvantaged areas and among residents of urban centers with higher socioeconomic status, reflecting complex, spatially contingent exposure inequalities. These findings support fine-scale, equity-oriented air quality management. Full article
(This article belongs to the Special Issue Spatial Data Science and Knowledge Discovery)
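A VIF screen like the one described before the GWR step follows directly from the definition VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing predictor j on the remaining predictors. The variables below are synthetic stand-ins (an NDVI-like score, an IEO-like index, and a deliberately collinear third column), not the study's data.

```python
# Variance Inflation Factor from first principles (no statsmodels needed).
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R^2_j), regressing column j on the remaining columns."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(X)), others])   # add an intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
ndvi = rng.normal(0.5, 0.1, 300)                          # hypothetical NDVI-like score
ieo = rng.normal(1000.0, 50.0, 300)                       # hypothetical IEO-like index
redundant = 2.0 * ndvi + rng.normal(0.0, 0.01, 300)       # nearly collinear with "NDVI"
vifs = vif(np.column_stack([ndvi, ieo, redundant]))
print(vifs)   # the redundant predictor should show a very large VIF
```

In practice, predictors with VIF above a chosen cutoff (often 5 or 10) are dropped before fitting the spatial regression.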

26 pages, 1011 KB  
Article
A Study on Machine Learning-Based Cost Estimation Models for AI Training Data Construction
by Yoon-Seok Ko and Bong Gyou Lee
Appl. Sci. 2026, 16(6), 2891; https://doi.org/10.3390/app16062891 - 17 Mar 2026
Abstract
This study proposes an explainable machine learning framework for estimating the total project cost (TPC) of AI training-data construction, where cost information is difficult to structure due to heterogeneous workflows and quality requirements. Using 386 public AI training-data projects conducted between 2020 and 2022, we derive 24 numerical predictors from standardized final reports and construct three input tracks: a baseline feature set, a principal component analysis (PCA)-enhanced set, and a factor analysis (FA)–enhanced set capturing latent cost structures. Four regression models (Ridge, Random Forest, XGBoost, and LightGBM) are evaluated using nested cross-validation. XGBoost achieves the best overall performance across all three tracks (Baseline, PCA-enhanced, and FA-enhanced). Among them, PCA-enhanced XGBoost attains the highest predictive accuracy (R2 = 0.868; RMSE = 1084.9; MAE = 746.9; MAPE = 0.358; pooled out-of-fold), while Baseline XGBoost yields the lowest MAE (731.4; R2 = 0.863). To support transparent decision-making, Shapley Additive exPlanations (SHAP)-based attribution and scenario-based sensitivity analyses are conducted. Results show that project scale and process-level unit costs are dominant cost-drivers, while cloud usage, expert participation, and de-identification requirements exhibit secondary effects. The proposed framework provides an interpretable, data-driven approach to cost information management and decision support for data-intensive AI projects. Full article
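Nested cross-validation, as used for the four-model evaluation above, separates hyperparameter tuning (inner loop) from performance estimation (outer loop) so the reported score is not biased by the tuning itself. A minimal sketch with Ridge as a stand-in model and synthetic data:

```python
# Nested CV sketch: GridSearchCV (inner) wrapped in cross_val_score (outer).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

inner = KFold(n_splits=3, shuffle=True, random_state=0)   # hyperparameter selection
outer = KFold(n_splits=5, shuffle=True, random_state=0)   # unbiased performance estimate

search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=inner, scoring="r2")
nested_r2 = cross_val_score(search, X, y, cv=outer, scoring="r2")
print(f"nested CV R^2: {nested_r2.mean():.3f} +/- {nested_r2.std():.3f}")
```

Swapping `Ridge()` for any other estimator (and its grid) gives a fair comparison across models under the same outer folds.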

18 pages, 2232 KB  
Article
Machine Learning-Driven Assessment of Soil Carbon Sequestration and Emission Reduction Potential in Tea Plantations
by Tinghao Wang, Yiming Si, Xiang Shen, Ming Cao, Wenxin Cheng, Huiming Zeng, Tong Li and Kun Cheng
Agronomy 2026, 16(6), 632; https://doi.org/10.3390/agronomy16060632 - 17 Mar 2026
Abstract
Robust quantification of greenhouse gas (GHG) balances in tea plantations is critical for evaluating their contribution to agricultural carbon neutrality. This study aimed to develop data-driven models to quantify soil organic carbon (SOC) sequestration and N2O emissions in Chinese tea plantations, evaluate their net GHG balance at the national scale, and assess the mitigation potential under alternative nitrogen management scenarios. Using a comprehensive national dataset, we compared multiple machine learning (ML) approaches with a conventional multiple linear regression (MLR) model to simulate N2O emissions and SOC changes in Chinese tea plantations. All ML models substantially outperformed the MLR model, with the Random Forest (RF) algorithm achieving the highest predictive accuracy. The RF models yielded R2 values of 0.68 for N2O emissions and 0.67 for SOC changes, with no significant prediction bias. Variable importance and marginal effect analyses revealed strong non-linear controls. Mineral N fertilizer input was the dominant driver of N2O emissions, followed by organic N input, soil clay content, and SOC. In contrast, SOC dynamics were primarily regulated by organic carbon inputs, tea plantation age, climate variables, and soil pH. National-scale simulations indicated an average N2O emission intensity of 9.03 kg N2O ha−1 yr−1 and a mean SOC sequestration rate of 0.88 t C ha−1 yr−1. Overall, SOC sequestration offset N2O emissions, rendering Chinese tea plantations a net GHG sink (−2525 Gg CO2-eq yr−1). Scenario analyses showed that mineral N reduction increased net GHG uptake by 1804 Gg CO2-eq, while organic fertilizer substitution achieved a substantially larger mitigation potential of 5961 Gg CO2-eq. By integrating SOC sequestration and N2O emissions within a unified modeling framework and applying machine-learning-based national-scale simulations, this study provides a more comprehensive and data-driven quantification of GHG balances in tea ecosystems, offering a scientific basis for evaluating their role in agricultural carbon neutrality strategies. Full article
(This article belongs to the Special Issue Application of Machine Learning and Modelling in Food Crops)
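Variable-importance rankings like those reported can be read directly off a fitted Random Forest. The data and driver names below are synthetic, constructed so that the mineral-N stand-in dominates the response by design; they do not reproduce the study's dataset.

```python
# Random Forest variable importance on a synthetic N2O-like response.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 500
mineral_n = rng.uniform(0.0, 600.0, n)   # kg N/ha; dominant driver by construction
organic_n = rng.uniform(0.0, 200.0, n)   # kg N/ha
clay = rng.uniform(5.0, 60.0, n)         # % clay content; pure noise here
X = np.column_stack([mineral_n, organic_n, clay])
y = 0.02 * mineral_n + 0.005 * organic_n + rng.normal(0.0, 0.5, n)  # synthetic flux

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["mineral_n", "organic_n", "clay"], rf.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

Impurity-based importances sum to one and, here, should concentrate on the dominant synthetic driver; permutation importance is a common cross-check when predictors are correlated.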

17 pages, 673 KB  
Article
An Information-Theoretic Analysis of High-Frequency Load Disaggregation
by Gabriel Arquelau Pimenta Rodrigues, André Luiz Marques Serrano, Geraldo Pereira Rocha Filho, Vinícius Pereira Gonçalves and Rodolfo Ipolito Meneguette
Entropy 2026, 28(3), 334; https://doi.org/10.3390/e28030334 - 17 Mar 2026
Abstract
High-frequency non-intrusive load monitoring provides detailed harmonic information for appliances’ power disaggregation, and machine-learning approaches have demonstrated good performance in this task. However, these methods provide little transparency regarding the information structure of the aggregate signal. To address this, this paper models NILM as a coding-decoding process and applies information-theoretic measures to quantify uncertainty, recoverability, temporal contribution, and inter-appliance masking effects in aggregate signals. In the analyzed dataset, transfer entropy suggests negligible temporal gains, which is consistent with the observed effectiveness of pointwise models such as Random Forest. Moreover, conditional mutual information emphasizes the asymmetric masking relationships between appliances, with the laptop charger acting as a dominant interferer in the considered measurements. These findings are validated through a Random Forest regression model with minimum Redundancy Maximum Relevance feature selection. The results show that the mutual information between an appliance and the aggregate is a good predictor of disaggregation performance in the examined data, as appliances with high mutual information, such as the hair dryer and electric water heater, achieve lower estimation errors, while others, such as the iron, are difficult to recover despite stable distributions. This relationship is statistically supported by a strong negative monotonic correlation between normalized mutual information and the disaggregation error (Spearman r_s = -0.81, p = 0.015). Hence, this work demonstrates how information-theoretic analysis can help characterize disaggregation difficulty prior to model training and assess the observability of appliances in high-frequency NILM. Full article
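The central idea, that an appliance's mutual information with the aggregate predicts how recoverable it is, can be sketched on a toy mixture. The MI estimator and the linear "decoder" below are deliberate simplifications of the paper's setup; the loads are synthetic.

```python
# Toy check: mutual information with the aggregate vs. recovery error.
import numpy as np
from scipy.stats import spearmanr
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(4)
n = 1000
# Hypothetical appliance loads with decreasing share of the total signal
loads = [rng.gamma(2.0, 50.0, n) * s for s in (1.0, 0.5, 0.25, 0.1, 0.05)]
aggregate = np.sum(loads, axis=0) + rng.normal(0.0, 5.0, n)   # metering noise

mi, err = [], []
for load in loads:
    mi.append(mutual_info_regression(aggregate.reshape(-1, 1), load,
                                     random_state=0)[0])       # kNN MI estimate
    r = np.corrcoef(aggregate, load)[0, 1]
    err.append(1.0 - r ** 2)     # variance a linear "decoder" fails to recover

rs, p = spearmanr(mi, err)
print(f"Spearman r_s = {rs:.2f}")   # high-MI loads should show low error
```

A negative rank correlation here mirrors the paper's qualitative finding, though on fabricated data and with a much cruder decoder.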

19 pages, 667 KB  
Article
A Machine Learning Approach to Audit Modification Risk Prediction in Financial Reporting: Methods, Data, and Human-Centered Challenges
by Gökhan Silahtaroğlu, Feyza Dereköy and Esra Baytören
J. Risk Financial Manag. 2026, 19(3), 221; https://doi.org/10.3390/jrfm19030221 - 17 Mar 2026
Abstract
Financial reporting irregularities and audit modifications represent important warning signals of elevated fraud and financial distress risk. While recent studies report high predictive accuracy in fraud detection, most approaches frame the problem as a purely algorithmic classification task and offer limited interpretability for auditors, regulators, and decision-makers. This study reframes financial statement analysis as a human-interpretable audit modification risk prediction problem. It integrates domain-informed feature engineering with machine learning models. Using firm-level financial data and audit disclosures, audit opinions are used as a proxy indicator of elevated fraud-related reporting risk rather than confirmed fraudulent behavior. Logistic Regression, Random Forest, and Gradient Boosting models are trained under class imbalance using cost-sensitive learning and evaluated with recall, ROC–AUC, precision, F1-score, and accuracy. The results demonstrate that humanized categorical representations preserve predictive performance while substantially enhancing interpretability. Permutation-based feature importance analysis further identifies financially intuitive risk patterns and threshold-like conditions associated with elevated audit modification risk. The findings suggest that interpretable, risk-oriented machine learning frameworks can support more transparent and actionable financial reporting risk monitoring systems. Beyond predictive performance, the study discusses human-centered challenges related to model interpretability, decision support, and the integration of machine-learning systems into real-world financial reporting and audit-risk assessment workflows. Full article
(This article belongs to the Section Financial Technology and Innovation)
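Cost-sensitive learning under class imbalance is commonly realized through class weighting. The sketch below assumes `class_weight="balanced"` as one possible choice (the paper does not specify its exact scheme) and uses synthetic data with a roughly 10% minority class standing in for modified audit opinions.

```python
# Cost-sensitive logistic regression on an imbalanced synthetic dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=8, n_informative=4,
                           weights=[0.9, 0.1], random_state=0)  # ~10% positives
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights errors inversely to class frequency
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(Xtr, ytr)
recall = recall_score(yte, clf.predict(Xte))           # sensitivity on the rare class
auc = roc_auc_score(yte, clf.predict_proba(Xte)[:, 1])
print(f"recall = {recall:.2f}, ROC-AUC = {auc:.2f}")
```

Reporting recall alongside ROC-AUC, as the study does, guards against models that score well overall while missing the rare positive class.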

25 pages, 2146 KB  
Article
Machine Learning-Based Predictive Modelling of Key Operating Parameters in an Industrial-Scale Wet Vertical Stirred Media Mill
by Okay Altun, Aydın Kaya, Ali Seydi Keçeli, Ece Uzun, Meltem Güler and Nurettin Alper Toprak
Minerals 2026, 16(3), 311; https://doi.org/10.3390/min16030311 - 16 Mar 2026
Abstract
To the authors’ knowledge, this is the first industrial machine learning (ML) study focused on wet vertical stirred media milling. The study develops and validates ML models to predict the key operating parameters, namely mill discharge product size, mill feed slurry flow rate, mill power draw, and the specific energy consumption of an industrial wet vertical stirred media mill operating at a copper plant. A physics-guided workflow was adopted, combining relief coefficient-based variable screening with fundamental stirred milling principles to define 20 different structured model input scenarios. Within this scope, six regression approaches, linear regression (LR), fine tree regression (FTR), support vector regression (SVR), random forest regression (RFR), artificial neural network regression (ANN), and Gaussian process regression (GPR), were trained and validated using plant sensor data and evaluated using R2 and RMSE. Overall performance was reasonable, with GPR providing the highest predictive accuracy, followed by RFR and ANN, while LR, SVR, and FTR performed worse. The potential benefit of feed size was also assessed conceptually through an upper-bound sensitivity analysis, representing a best-case scenario in which an online feed size measurement would be available. Because the feed size descriptor (F80) was not independently measured but derived from an energy–size relationship, the associated accuracy gains are reported as theoretical upper-bound indications rather than independent predictive capability. Overall, the findings support ML-based decision support in stirred milling operations and motivate future work using independently measured feed size (or reliable proxy sensing). Full article
(This article belongs to the Collection Advances in Comminution: From Crushing to Grinding Optimization)
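A Gaussian process regressor of the kind that performed best here can be sketched as follows. The RBF-plus-noise kernel and the one-dimensional synthetic "mill response" are illustrative assumptions; the paper does not state its kernel or inputs at this level of detail.

```python
# GPR sketch: fit a noisy 1-D response and predict with uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(5)
X = rng.uniform(0.0, 10.0, (80, 1))                 # e.g., a stirrer-speed-like input
y = np.sin(X).ravel() + rng.normal(0.0, 0.1, 80)    # synthetic smooth response + noise

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                               random_state=0).fit(X, y)
# Unlike most regressors, GPR returns a predictive standard deviation too
mean, std = gpr.predict(np.array([[2.5]]), return_std=True)
print(f"pred = {mean[0]:.2f} +/- {std[0]:.2f}")
```

The built-in uncertainty estimate is one practical reason GPR is attractive for plant decision support, where knowing when the model is extrapolating matters.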

41 pages, 8144 KB  
Article
Statistical Development of Rainfall IDF Curves and Machine Learning-Based Bias Assessment: A Case Study of Wadi Al-Rummah, Saudi Arabia
by Ibrahim T. Alhbib, Ibrahim H. Elsebaie and Saleh H. Alhathloul
Hydrology 2026, 13(3), 96; https://doi.org/10.3390/hydrology13030096 - 16 Mar 2026
Abstract
Reliable estimation of extreme rainfall is essential for hydraulic design and flood risk mitigation, particularly in arid regions where rainfall exhibits strong temporal and spatial variability. This study presents a statistical framework for developing rainfall intensity-duration-frequency (IDF) curves, complemented by a machine learning-based assessment of model bias and performance. The analysis was conducted using data from ten rainfall stations located within or near the Wadi Al-Rummah Basin. Annual maximum series (AMS) from 1969 to 2024 were first reconstructed to address missing years using a modified normal ratio method (NRM) combined with nearest-station selection, ensuring spatial consistency while preserving station-specific rainfall characteristics. Six probability distributions (Weibull, Gumbel, gamma, lognormal, generalized extreme value (GEV), and generalized Pareto) were fitted to each station, and the best-fit distribution was identified using multiple goodness-of-fit (GOF) criteria, including the Kolmogorov–Smirnov (K-S) test, Anderson–Darling (A-D) test, root mean square error (RMSE), chi-square (χ2) statistic, Akaike information criterion (AIC), Bayesian information criterion (BIC), and the coefficient of determination (R2). Statistical IDF curves were then developed for durations ranging from 5 to 1440 min and return periods from 2 to 1000 years. To evaluate the robustness of the statistically derived IDF curves, three machine learning (ML) models, multiple linear regression (MLR), regression random forest (RRF), and multilayer feed-forward neural network (MFFNN), were trained as surrogate models using duration, return period, and station geographic attributes as predictor variables. Model performance was evaluated using RMSE, MAE, and mean bias metrics across stations and return periods. The lognormal distribution emerged as the best-fit model for four stations, while the Gumbel and gamma distributions were selected for two stations each. Overall, no single probability distribution consistently outperformed the others, indicating station-dependent behavior. Among the machine learning models, the MFFNN achieved the closest agreement with the statistical IDF estimates (RMSE = 0.97, MAE = 0.65, bias = 0.02), followed by RRF and MLR, based on global average performance across all stations and return periods. The proposed framework offers a reliable approach for rainfall IDF development and evaluation in arid-region watersheds. Full article
(This article belongs to the Section Statistical Hydrology)
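The core of IDF construction, fitting a candidate distribution to an annual maximum series and reading off return-period quantiles, looks like this for the Gumbel case. The AMS values and units below are synthetic; a real workflow would also run the goodness-of-fit tests the abstract lists before committing to a distribution.

```python
# Gumbel fit and return-period quantiles for a synthetic annual maximum series.
import numpy as np
from scipy.stats import gumbel_r

rng = np.random.default_rng(6)
# 56 invented annual maxima (1969-2024), in made-up mm/h intensity units
ams = gumbel_r.rvs(loc=30.0, scale=8.0, size=56, random_state=rng)

loc, scale = gumbel_r.fit(ams)                     # maximum-likelihood fit
# A T-year event corresponds to non-exceedance probability 1 - 1/T
design = {T: gumbel_r.ppf(1.0 - 1.0 / T, loc, scale) for T in (2, 10, 100, 1000)}
for T, x in design.items():
    print(f"T = {T:>4} yr: intensity = {x:.1f} mm/h")
```

Repeating the fit for each duration (5 to 1440 min) and plotting intensity against duration per return period yields the IDF curves themselves.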

20 pages, 2270 KB  
Article
Predicting Anthropogenic Wildfire Occurrence Using Explainable Machine Learning Models: A Nationwide Case Study of South Korea
by Mingyun Cho and Chan Park
Fire 2026, 9(3), 126; https://doi.org/10.3390/fire9030126 - 16 Mar 2026
Abstract
Anthropogenic wildfires account for the majority of wildfire ignitions in human-dominated landscapes, yet their spatial drivers remain insufficiently understood at national scales. This study aims to identify key factors influencing anthropogenic wildfire occurrence and to develop a robust and interpretable prediction framework using nationwide data from South Korea. Wildfire occurrence records from 2011–2021 were integrated with daily meteorological, environmental, and socio-economic variables at a 1 km grid resolution. A stacking ensemble model combining Random Forest, XGBoost, LightGBM, Extra Trees, and logistic regression was implemented to improve predictive robustness under rare-event conditions. Model performance was evaluated using ROC–AUC, PR–AUC, and threshold-optimized F1-scores, and variable contributions were interpreted using feature importance and SHAP analyses. The ensemble model achieved a PR–AUC of 0.934 and an ROC–AUC of 0.941. Relative humidity and maximum temperature were identified as influential meteorological variables, while human-accessibility-related variables, particularly distance to roads and agricultural land, showed consistently high contributions to spatial ignition probability. These findings indicate that anthropogenic wildfire occurrence is shaped by interactions between fire-weather conditions and spatial patterns of human accessibility. The proposed framework provides a scalable approach for understanding anthropogenic wildfire mechanisms and supporting prevention strategies in forested landscapes. Full article
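A stacking ensemble along the lines described, with tree-based base learners feeding a logistic-regression meta-learner, can be sketched as follows. XGBoost and LightGBM are omitted here, and the imbalanced data are synthetic stand-ins for the rare-event ignition setting.

```python
# Stacking sketch: RF + ExtraTrees base learners, logistic meta-learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1500, n_features=10, weights=[0.95, 0.05],
                           random_state=0)        # rare-event setting, ~5% positives
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("et", ExtraTreesClassifier(n_estimators=100, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,   # out-of-fold base predictions train the meta-learner
)
stack.fit(Xtr, ytr)
auc = roc_auc_score(yte, stack.predict_proba(Xte)[:, 1])
print(f"ROC-AUC = {auc:.3f}")
```

Evaluating with PR-AUC as well, as the study does, is advisable when positives are this rare, since ROC-AUC alone can look optimistic.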

11 pages, 750 KB  
Article
Predicting Dental Anxiety and Cooperative Behavior in Children Using Machine Learning: A Cross-Sectional Predictive Modeling Study
by Narmin M. Helal and Heba Sabbagh
Dent. J. 2026, 14(3), 170; https://doi.org/10.3390/dj14030170 - 16 Mar 2026
Abstract
Background/Objectives: Dental anxiety and uncooperative behavior present significant challenges in pediatric dentistry and may adversely affect treatment outcomes and oral health. The main goal of this study was to evaluate the predictive performance of machine learning models in classifying dental anxiety measured using the Abeer Children Dental Anxiety Scale (ACDAS), predicting uncooperative behavior, estimating continuous dental anxiety scores, and identifying key predictors among children aged 6–11 years attending pediatric dental clinics in Jeddah, Saudi Arabia. Methods: This is an analytical cross-sectional study conducted among 952 children to evaluate whether machine learning models could predict dental anxiety and cooperative behavior based on demographic, clinical, and behavioral variables. Twenty variables captured demographic, medical, and dental history, BMI, and anxiety/behavioral measures. Data preprocessing included removing sparse variables, imputing missing values, and encoding categorical and ordinal variables appropriately. Logistic Regression models were trained to classify dental anxiety and cooperative behavior. A Random Forest Regressor was used to predict continuous anxiety scores, and a Random Forest Classifier was used for feature importance analysis. Principal Component Analysis (PCA) and K-Means clustering were applied to explore behavioral subgroups. Results: This dataset shows the Logistic Regression model with 0.92 accuracy (ROC AUC 0.98) for predicting dental anxiety and 0.91 accuracy (ROC AUC 0.95) for cooperative behavior. The Random Forest Regressor predicted anxiety scores with R2 = 0.97. Feature importance revealed that sensory and cognitive responses were key predictors of anxiety and cooperation. Unsupervised clustering identified two behavioral profiles: one with lower and another with higher anxiety and cooperation. 
Conclusions: ML models demonstrated strong prediction of dental anxiety and cooperation in this pediatric sample. While promising for early detection and personalized management of anxious or uncooperative children, further validation is essential before clinical use. Full article
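The two supervised stages described above (a Logistic Regression classifier for anxiety/cooperation labels plus a Random Forest Regressor for continuous anxiety scores) can be sketched roughly as follows. All variable names and the simulated predictors are illustrative stand-ins, not the study's actual dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, r2_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the 20 demographic/clinical/behavioral predictors.
X = rng.normal(size=(952, 20))
anxiety_score = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=952)
anxious = (anxiety_score > 0).astype(int)  # binary ACDAS-style label

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, anxious, anxiety_score, test_size=0.3, random_state=0
)

# Stage 1: Logistic Regression classifies high vs. low dental anxiety.
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

# Stage 2: Random Forest Regressor predicts the continuous anxiety score.
reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, s_tr)
r2 = r2_score(s_te, reg.predict(X_te))

print(acc, auc, r2)
```

On real data the preprocessing steps the abstract lists (sparse-variable removal, imputation, categorical/ordinal encoding) would precede the train/test split.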

33 pages, 9582 KB  
Article
Proxilience Effects on Spatial Disparities in Metropolitan Areas—A Cross-Scale Analysis of “Superbowl” Agglomerations
by Alexandru Bănică, Karima Kourtit, Cristian-Manuel Foșalău and Oliver-Valentin Dinter
Land 2026, 15(3), 468; https://doi.org/10.3390/land15030468 - 15 Mar 2026
Abstract
In the spirit of the recent debate on the 15-minute city, two concepts are central: urban proximity and resilience. They have become cornerstones of new urban planning perspectives on sustainability, livability, and inclusiveness in cities and metropolitan areas. Very recently, the notion of ‘proxilience’ has been introduced as an integration of urban planning views on the drivers of citizens’ wellbeing. The present study seeks to conceptualize and operationalize the proxilience concept for the case of metropolitan agglomerations, in which the core is termed here the ‘Superbowl Economy’. Consequently, the paper presents a data-driven analytical approach that uses detailed empirical data on spatial density patterns, demographic factors, socioeconomic indicators, environmental quality attributes, infrastructure accessibility, and access to services and amenities. The empirical part of the study is based on a blend of geostatistical and econometric models (correlation and regression analysis, AHP modelling, and a Random Forest model). The analysis framework and the underlying propositions on the proxilience impacts on spatial patterns of disparities in wellbeing are applied and tested for the greater Iași Metropolitan Area, which is one of the largest urban poles in Romania. The findings confirm proxilience as a novel, multidimensional tool that advances spatial (urban–regional) livability in a polarized yet fragmented urban system. Full article
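Of the methods listed, AHP modelling is the most compact to illustrate: indicator weights are derived as the principal eigenvector of a pairwise comparison matrix, followed by a consistency check. The 3×3 matrix below is invented for illustration and is not taken from the study:

```python
import numpy as np

# Hypothetical pairwise comparison matrix for three proxilience dimensions
# (e.g. accessibility vs. environment vs. services); values are illustrative.
A = np.array([
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
])

# AHP weights: principal eigenvector of A, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
principal = eigvecs[:, np.argmax(eigvals.real)].real
weights = principal / principal.sum()

# Consistency ratio (RI = 0.58 is Saaty's random index for n = 3);
# CR below 0.1 is conventionally taken as an acceptable comparison matrix.
lambda_max = eigvals.real.max()
ci = (lambda_max - 3) / (3 - 1)
cr = ci / 0.58

print(weights, cr)
```

The resulting weights can then feed a weighted composite index over the spatial indicators before correlation, regression, or Random Forest analysis.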
(This article belongs to the Special Issue The 15-Minute City: Land-Use Policy Impacts)

18 pages, 2774 KB  
Article
Hybrid RF–ConvLSTM Approach for Rainfall Estimation from MSG Data over Northern Algeria
by Fethi Ouallouche, Mourad Lazri, Karim Labadi, Djamal Alouache, Yacine Mohia, Mounir Sehad and Soltane Ameur
Atmosphere 2026, 17(3), 296; https://doi.org/10.3390/atmos17030296 - 15 Mar 2026
Abstract
This study introduces a novel approach to 3-hourly and daily precipitation estimation over northern Algeria. The novel approach benefits from the classification capabilities of Random Forest (RF) and the predictive power of Convolutional Long Short-Term Memory (ConvLSTM) regression, with multi-temporal observations from the SEVIRI radiometer onboard the Meteosat Second Generation (MSG) satellite. The approach is a two-stage process: A Random Forest classifier is first used to provide a probabilistic characterization of precipitation occurrence and rainfall regimes. The ConvLSTM model then applies spatio-temporal regression to estimate rainfall intensities by analyzing multi-channel temporal sequences. The hybrid model produces spatially and temporally consistent precipitation fields by taking advantage of the spatio-temporal correlations of meteorological events, with the aim of obtaining accurate 3-hourly and daily rainfall accumulations for Northern Algeria. Results show a dramatic improvement over the reference RF-based technique, with correlation coefficients reaching 0.89 for 3-hourly accumulations and 0.91 for daily rainfall. Full article
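The two-stage classify-then-regress structure can be sketched as follows. Note that a plain Random Forest regressor stands in for the paper's ConvLSTM (which operates on multi-channel temporal image sequences), and the pixel features here are simulated rather than actual SEVIRI observations:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic stand-in for per-pixel multi-channel satellite features.
X = rng.normal(size=(5000, 8))
raining = (X[:, 0] + X[:, 1] > 0.5).astype(int)
rate = np.where(raining == 1, np.exp(X[:, 2]) + 1.0, 0.0)  # mm per 3 h

# Stage 1: RF classifier flags precipitating pixels (rain / no-rain).
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, raining)

# Stage 2: regression trained on precipitating pixels only; the paper's
# ConvLSTM is replaced by an RF regressor purely to keep the sketch small.
reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X[raining == 1], rate[raining == 1])

# Inference: rainfall is set to zero wherever the classifier says "no rain".
mask = clf.predict(X).astype(bool)
estimate = np.zeros(len(X))
estimate[mask] = reg.predict(X[mask])

# In-sample correlation, analogous to the coefficients reported above
# (the paper's 0.89 / 0.91 values are on independent validation data).
corr = np.corrcoef(estimate, rate)[0, 1]
print(corr)
```

The gating step is the key design choice: zeroing non-precipitating pixels prevents the regressor's noise floor from contaminating dry areas of the rainfall field.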

31 pages, 4400 KB  
Article
Regional-Scale Mapping of Gully Network in Mediterranean Olive Landscapes Using Machine Learning Algorithms: The Guadalquivir Basin
by Paula González-Garrido, Adolfo Peña-Acevedo, Francisco-Javier Mesas-Carrascosa and Juan Julca-Torres
Agronomy 2026, 16(6), 622; https://doi.org/10.3390/agronomy16060622 - 14 Mar 2026
Abstract
Gully erosion is a significant threat to the sustainability of soil in Mediterranean basins. Despite its impact, there is a lack of research providing accurate regional-scale cartography of complete gully networks. This study aims to automatically map the gully network in the olive-growing landscapes of the Guadalquivir basin (Spain) using Machine Learning (ML) algorithms: Random Forest (RF), Support Vector Machine (SVM), Decision Tree (DT), and Logistic Regression (LR). We integrated these models with 17 predictive variables (including hydrotopographic, climatic, and edaphic factors) and the Gully Head Initiation (GHI) index. RF was the most suitable model, achieving an Area Under the Curve (AUC) of 0.91 and an F1-score of 0.83, and enabled the delineation of a gully network totalling 8439.05 km. Variable importance analysis revealed that flow accumulation (17.33%) and the GHI index (nearly 30%) were the primary predictors, with the Rainy Day Normal (RDN)-based formulation outperforming the maximum daily precipitation (Pmax)-based one. Spatially, countryside hill landscapes exhibited the highest gully densities (42.50 m/ha). The results demonstrate the effectiveness of combining ML with physically based indices to generate high-resolution gully cartography for soil conservation planning in Mediterranean olive groves. Full article
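A susceptibility-mapping workflow of this kind (RF classifier, AUC and F1 evaluation, feature importances reported as percentages) might look like the sketch below; the predictors are simulated stand-ins for variables such as flow accumulation and the GHI index, not the study's 17 real covariates:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

# Simulated predictors: flow accumulation (heavy-tailed), a GHI-like index,
# and five generic hydrotopographic/climatic/edaphic covariates.
n = 3000
flow_acc = rng.exponential(scale=1.0, size=n)
ghi = rng.normal(size=n)
others = rng.normal(size=(n, 5))
X = np.column_stack([flow_acc, ghi, others])
gully = ((0.8 * flow_acc + 1.2 * ghi
          + rng.normal(scale=0.7, size=n)) > 1.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, gully, test_size=0.3, random_state=0
)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

auc = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
f1 = f1_score(y_te, rf.predict(X_te))

# Importances sum to 1; expressing them as percentages mirrors how the
# study reports flow accumulation (17.33%) and the GHI index (~30%).
pct_importance = 100 * rf.feature_importances_
print(auc, f1, pct_importance[:2])
```

Applied per pixel over the basin, the predicted probabilities would then be thresholded and vectorized to delineate the gully network.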
(This article belongs to the Special Issue Advanced Machine Learning in Agriculture—2nd Edition)
