MDPI - Publisher of Open Access Journals

19 pages, 4385 KB

Open Access Article

Impact of Climate Warming on Cropland Water Use Efficiency in Northeast China Based on BESS Satellite Data

by Fenfen Guo, Haoran Wu, Zhan Su, Yanan Chen, Jiaoyue Wang and Xuguang Tang

Remote Sens. 2026, 18(8), 1223; https://doi.org/10.3390/rs18081223 - 17 Apr 2026

Understanding the long-term dynamics of cropland water use efficiency (WUE) and its underlying environmental drivers is essential for ensuring food and water security, particularly for regions facing intensified climate change. Here, we investigated the spatial patterns and long-term trends of gross primary productivity (GPP), evapotranspiration (ET), and WUE in cropland ecosystems across Northeast China, the nation’s primary commodity grain base, during the past two decades using the time-series Breathing Earth System Simulator (BESS) products. Subsequently, the ridge regression method was used to quantitatively disentangle the relative contributions of key climatic variables to the observed cropland WUE trends. Our results revealed a pronounced decreasing gradient in both GPP and ET along the southeast–northwest direction. A significant increase in GPP was observed over the 20-year period (p < 0.01), with 95.94% of the cropland area showing positive trends. ET showed a slight, non-significant increase (p > 0.05), though 82.77% of pixels exhibited positive trends, particularly in the northwest. Consequently, WUE showed a widespread and significant enhancement (p < 0.01), with approximately 98% of cropland pixels exhibiting increasing trends. Attribution analysis identified air temperature as the dominant environmental variable, accounting for 92.4% of the observed WUE increase, while solar radiation and precipitation contributed modestly (3.4% and 3.2%, respectively). Our findings underscore the predominant role of thermal conditions in shaping the carbon–water coupling efficiency of agroecosystems in semi-arid to semi-humid transition zones. This study provides quantitative evidence that climate warming, rather than changes in water availability or radiation, has been the primary climatic factor driving the improved cropland WUE over the past two decades. These insights have important implications for developing adaptive water management strategies to enhance agricultural climate resilience in Northeast China and similar regions worldwide.

(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

19 pages, 2664 KB

Open Access Article

Machine Learning-Based Prediction of Multi-Year Cumulative Atmospheric Corrosion Loss in Low-Alloy Steels with SHAP Analysis

by Saurabh Tiwari, Seong Jun Heo and Nokeun Park

Coatings 2026, 16(4), 488; https://doi.org/10.3390/coatings16040488 - 17 Apr 2026

Abstract

Atmospheric corrosion of carbon and low-alloy steels causes direct economic losses that are estimated at around 3.4% of global GDP, and its accurate multi-year prediction is essential for protective coating selection, service-life estimation, and infrastructure maintenance scheduling. In this study, machine learning (ML) algorithms, including gradient boosting regressor (GBR), eXtreme gradient boosting (XGBoost), random forest (RF), support vector regression (SVR), and ridge regression, were trained on a 600-sample physics-grounded dataset to predict the cumulative atmospheric corrosion loss (µm) of low-alloy steels over 1–10 years of exposure. The dataset was constructed using the exact ISO 9223:2012 dose–response function (DRF) for the first-year corrosion rate and the ISO 9224:2012 power-law multi-year kinetic model (C(t) = C₁·t^0.5), spanning ISO 9223 corrosivity categories C2–CX across 11 environmental and material input features. All models were evaluated on the original (untransformed) corrosion scale under an 80/20 train/test split and five-fold cross-validation. Gradient boosting achieved the best overall performance with test set R² = 0.968, CV-R² = 0.969, RMSE = 10.58 µm, MAE = 5.99 µm, and MAPE = 12.6%. XGBoost was a close second (R² = 0.958, CV-R² = 0.960). RF achieved an R² of 0.944. SHAP (SHapley Additive exPlanations) analysis identified SO₂ deposition rate, exposure time, relative humidity, Cl⁻ deposition rate, and temperature as the five most influential predictors. The dominance of the SO₂ deposition rate (mean |SHAP| = 26.37 µm) and the high second-place ranking of exposure time (13.67 µm) are fully consistent with the ISO 9223:2012 dose–response function and ISO 9224:2012 power-law kinetics, respectively, while among the material features, Cu and Cr contents showed the strongest negative SHAP contributions, confirming their corrosion-inhibiting roles in weathering steels. These results establish a physics-consistent, interpretable ML benchmark exceeding R² = 0.90 for multi-year cumulative corrosion loss prediction and provide a quantitative tool for alloy screening, coating selection in aggressive atmospheric environments, and service-life planning.

(This article belongs to the Special Issue Anti-Corrosion and Anti-Wear Coatings: Fundamentals, Technologies, and Applications)
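
The power-law kinetic model quoted in the abstract above, C(t) = C₁·t^0.5, can be sketched in a few lines. This is only an illustration of that formula, not the authors' dataset-generation code: the function name and the first-year loss of 25 µm are hypothetical, and the exponent is kept as a parameter so that values other than the 0.5 stated in the abstract can be tried.

```python
def cumulative_corrosion_loss(first_year_loss_um: float, years: float,
                              exponent: float = 0.5) -> float:
    """Cumulative corrosion loss C(t) = C1 * t**b in micrometres.

    first_year_loss_um: C1, the first-year corrosion loss (hypothetical input here).
    years: exposure time t in years.
    exponent: b, set to the 0.5 quoted in the abstract by default.
    """
    if years < 0:
        raise ValueError("exposure time must be non-negative")
    return first_year_loss_um * years ** exponent


if __name__ == "__main__":
    # Hypothetical first-year loss of 25 µm, tabulated over several exposure times.
    for t in (1, 4, 9):
        print(t, cumulative_corrosion_loss(25.0, t))
```

With an exponent below 1 the loss keeps growing but at a decreasing rate, which is why exposure time ranks as a strong but sublinear predictor in a dataset built this way.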

20 pages, 2493 KB

Open Access Article

Non-Destructive Determination of Moisture Content in White Tea During Withering Using VNIR Spectroscopy and Ensemble Modeling

by Qinghai He, Hongkai Shen, Zhiyuan Liu, Benxue Ma, Yong He, Zhi Lin, Weihong Liu, Pei Wang, Xiaoli Li and Peng Qi

Horticulturae 2026, 12(4), 488; https://doi.org/10.3390/horticulturae12040488 - 16 Apr 2026

Abstract

As one of the six major traditional tea types in China, white tea’s quality formation is primarily influenced by the withering process. However, traditional methods for monitoring withering fail to achieve precise and stable control of moisture content. To address this issue, a total of 650 samples were collected at 13 withering time points (0–36 h), and the dataset was split into training and test sets at a 7:3 ratio. This study proposes a PRXBoost ensemble model for quantitative detection of withered white tea, which integrates data augmentation and intelligent algorithms. The ensemble model uses a Bagging-based weighted integration technique to combine Partial Least Squares Regression (PLSR), Ridge, and Extreme Gradient Boosting (XGBoost) models, and it conducts an in-depth analysis of the decision-making process within the PRXBoost model. First, the effectiveness of the data augmentation strategy and the superiority of the gradient descent algorithm are verified through pre-modeling based on the PLSR model and hyperparameter pre-search using the XGBoost model, respectively. Additionally, the Bayes algorithm is employed to optimize the weights of the sub-models, further enhancing the overall predictive performance. The results show that the PRXBoost model achieved the best performance among the compared models on the test set, with R² = 0.854 and RMSE = 0.080, exceeding the highest R² of a single model by 6%. These results indicate that PRXBoost provided improved predictive performance for moisture estimation within the current dataset. Finally, the SHapley Additive exPlanations (SHAP) algorithm is used to analyze the influence of each input feature on the prediction results, successfully identifying the 1916 nm and 1453 nm spectral bands as significant influencers of the prediction outcomes. 
These results suggest that the proposed model can support rapid, non-destructive monitoring of moisture evolution and provide actionable information for withering endpoint decision control.

(This article belongs to the Special Issue Application of Artificial Intelligence in the Processing of Horticultural Crops)

23 pages, 1180 KB

Open Access Article

Carbon Emission Prediction Model for Railway Passenger Stations on the Qinghai–Tibet Plateau

by Guanguan Jia and Qingqin Wang

Sustainability 2026, 18(8), 3881; https://doi.org/10.3390/su18083881 - 14 Apr 2026

Abstract

Controlling operation-stage carbon emissions (CE) from transport buildings is crucial for China’s dual-carbon goals, the ecological security of the Qinghai–Tibet Plateau (QTP), and the sustainable development of plateau transport infrastructure. For plateau railway passenger stations (RPS), limited monitoring and distinctive high-altitude, cold-climate operations make daily CE prediction difficult with conventional measurement- or simulation-based methods. This study develops a machine-learning approach based on a Monte Carlo synthetic database and derives engineering-standard formulas for direct use. Building scale, meteorology and passenger flow volume (PFV) were compiled for 12 representative RPS, and a large synthetic database of daily carbon emissions was generated under multiple distribution constraints. With daily mean temperature, heating degree days, altitude, station floor area and PFV as inputs, four models were trained and assessed using mean absolute error, root mean square error, mean absolute percentage error (MAPE) and R². The results show that random forest (RF) performed best, achieving ~6% MAPE and R² > 0.99 on the test set, with markedly lower errors than multivariable linear regression. Interpretation of RF via feature importance and partial dependence shows that floor area, altitude and PFV dominate emissions and exhibit nonlinear response patterns. To improve transparency and transferability, ridge regression was used to fit a linear surrogate to RF predictions, producing engineering-standard formulas for daily and annual operation-stage CE. The formulas retain most predictive accuracy while requiring only readily obtainable variables, enabling rapid estimation and scenario analysis for cold, high-altitude RPS. The proposed workflow provides a replicable pathway for operational CE assessment in data-scarce regions and supports low-carbon planning, design and operation of RPS on the QTP, thereby contributing to more sustainable infrastructure development in high-altitude regions.

(This article belongs to the Section Green Building)

11 pages, 2705 KB

Open Access Article

Applying Self-Information-Inspired Encoding to Task-Based fMRI for Decoding Second-Language Proficiency During Naturalistic Speech Listening

by Xin Xiong, Chenyang Zhu, Chunwu Wang and Jianfeng He

Appl. Sci. 2026, 16(8), 3805; https://doi.org/10.3390/app16083805 - 14 Apr 2026

Abstract

Individual differences in second-language (L2) proficiency are expected to influence how listeners parse and represent continuous speech, yet their neural signatures under naturalistic conditions remain unclear. We investigated this question using task-based fMRI during continuous speech listening. A total of 43 healthy participants completed four listening runs synchronized with MRI acquisition via PsychoPy (Peirce 2007), with eyes open throughout scanning. To promote sustained attention and comprehension, participants provided a native-language oral recall after each run. Based on behavioral proficiency scores, participants were grouped into low- (LP, n = 14), moderate- (MP, n = 14), and high-proficiency (HP, n = 15) groups. We evaluated three temporal information-encoding frameworks derived from BOLD dynamics: direct temporal series, functional connectivity (FC), and self-information weighted inter-subject correlation (ISC-W). Using a 10 × 5-fold nested cross-validation scheme, we tested both categorical classification (Support Vector Machines) for discrete proficiency groups (LP, MP, HP) and continuous multivariate regression (Ridge/Lasso) for continuous proficiency scores. Furthermore, we applied ROI-based ANOVA and univariate Neural Correlation Analysis (NCA) to identify key brain regions, evaluating significance via nonparametric permutation testing (1000 permutations) and False Discovery Rate (FDR) correction. Results indicated that while categorical classification yielded numerical trends, with ISC-W performing best, it did not reach statistical significance under stringent permutation testing. However, multivariate continuous regression using ISC-W features successfully predicted continuous proficiency scores with statistical significance (p < 0.05). Exploratory ROI analysis highlighted the bilateral orbital inferior frontal gyrus (IFG_orb_bilat) as a highly sensitive region. These findings suggest that L2 proficiency is best represented as a distributed, continuous neural variable, and that self-information weighting effectively filters background noise to capture cognitive variance. Methodologically, this study provides a reproducible pipeline integrating information-theoretic feature construction with rigorous whole-brain nonparametric inference.

16 pages, 4604 KB

Open Access Article

Simulation and Experiment of the Interaction Process Between Seeding and Soil-Engaging for Transverse Sugarcane Planter

by Biao Zhang, Dan Pan, Qiancheng Liu, Weimin Shen and Guangyi Liu

Agriculture 2026, 16(8), 853; https://doi.org/10.3390/agriculture16080853 - 12 Apr 2026

Abstract

Uneven seed spacing, skewed stalk posture, and inconsistent planting depth remain major challenges in horizontal sugarcane planting. To address these issues, a semi-automatic transverse sugarcane planter integrating a supply–buffer–discharge seeder and multiple soil-engaging components was developed. The seed placement process and the interaction between stalk discharge and soil disturbance were investigated through Discrete Element Method (DEM) simulations and experiments. First, the working principle and key component parameters of the whole machine were determined; the machine integrates soil crushing, furrowing, seeding, and ridge covering. In addition, a dynamic analysis was conducted on the inter-particle disengagement effect during the two-step seed filling process of lifting and discharging. Second, a DEM simulation model covering the entire soil-engaging and seed arrangement operation was established, coupling soil disturbance flow with stalk-seed discharge behaviour, and was used to study the effects of forward speed and seed outlet position. Furthermore, a response surface methodology (RSM) experiment was performed on the seeding test bench to quantify the effects of guiding parameters on seed placement uniformity. The determination coefficient (R²) of the established regression model exceeded 0.9, indicating high prediction accuracy. The optimal collaborative parameter combination was determined as follows: forward speed of 1.2 m·s⁻¹, buffer inclination angle of 55°, and supply roller speed of 26 r·min⁻¹. After verification, the seed placement uniformity coefficient of the seeder reached 91.8 ± 1.4%, which met the expected accuracy requirements for horizontal planting.

(This article belongs to the Section Agricultural Technology)

17 pages, 834 KB

Open Access Article

Improved Data-Driven Shrinkage Estimators for Regression Models Under Severe Multicollinearity

by Ali Rashash R. Alzahrani and Asma Ahmad Alzahrani

Mathematics 2026, 14(8), 1245; https://doi.org/10.3390/math14081245 - 9 Apr 2026

Abstract

Multicollinearity is a critical issue in regression analysis, often resulting in inflated variances and unstable parameter estimates. Ridge regression is a widely adopted solution to address this challenge; however, existing ridge estimators are typically tailored to specific scenarios, limiting their universal applicability. Akhtar and Alharthi developed ridge estimators based on condition-adjusted ridge estimators (CAREs) to handle severe multicollinearity issues. However, their approach did not account for the error variances in the estimation process. In this study, we propose improvements to these CAREs by incorporating error variances, resulting in the development of multiscale ridge estimators (MSRE₁, MSRE₂, MSRE₃, and MSRE₄) that more effectively address the challenges posed by severe multicollinearity. We compare the performance of our newly proposed estimators with ordinary least squares (OLS) and other existing ridge estimators using both simulation studies and real-life datasets. The evaluation, based on estimated mean squared error (MSE), demonstrates that the proposed estimators consistently outperform existing methods, particularly in scenarios with significant multicollinearity, larger sample sizes, and higher predictor dimensions. Results from three real-life datasets further validate the proposed estimators’ ability to reduce estimation error and improve predictive accuracy across diverse practical applications.

(This article belongs to the Special Issue Statistical Machine Learning: Models and Its Applications)
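
To make the role of the ridge penalty under multicollinearity concrete, the sketch below computes the ordinary ridge estimator, β(k) = (XᵀX + kI)⁻¹Xᵀy, for two nearly collinear predictors. It does not implement the MSRE or CARE estimators, whose formulas are not given in the abstract; the function name, the toy data, and the penalty values k are all hypothetical and chosen only for illustration.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimate (no intercept) for two predictors.

    Solves (X'X + k*I) b = X'y with an explicit 2x2 inverse; k = 0 gives OLS.
    """
    s11 = sum(a * a for a in x1) + k              # X'X[0][0] + k
    s22 = sum(b * b for b in x2) + k              # X'X[1][1] + k
    s12 = sum(a * b for a, b in zip(x1, x2))      # off-diagonal of X'X
    r1 = sum(a * c for a, c in zip(x1, y))        # X'y[0]
    r2 = sum(b * c for b, c in zip(x2, y))        # X'y[1]
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("singular system; increase k")
    return ((s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det)


if __name__ == "__main__":
    # Hypothetical data: x2 is almost a copy of x1, so X'X is near-singular.
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    x2 = [1.01, 1.99, 3.02, 3.98, 5.01]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    for k in (0.0, 0.1, 1.0):
        print(k, ridge_two_predictors(x1, x2, y, k))
```

Increasing k shrinks the coefficient vector toward zero, trading a small bias for a large variance reduction; data-driven ridge estimators such as those compared in the abstract differ mainly in how k is chosen.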

15 pages, 1474 KB

Open Access Article

Prognostic Power of Ensemble Learning in Colorectal Cancer with Peritoneal Metastasis: A Multi-Institutional Analysis

by Yoshiko Bamba, Michio Itabashi, Hirotoshi Kobayashi, Kenjiro Kotake, Masayasu Kawasaki, Yukihide Kanemitsu, Yusuke Kinugasa, Hideki Ueno, Kotaro Maeda, Takeshi Suto, Kimihiko Funahashi, Heita Ozawa, Fumikazu Koyama, Shingo Noura, Hideyuki Ishida, Masayuki Ohue, Tomomichi Kiyomatsu, Soichiro Ishihara, Keiji Koda, Hideo Baba, Kenji Kawada, Yojiro Hashiguchi, Takanori Goi, Yuji Toiyama, Naohiro Tomita, Eiji Sunami, Yoshito Akagi, Jun Watanabe, Kenichi Hakamada, Goro Nakayama, Kenichi Sugihara and Yoichi Ajioka

Bioengineering 2026, 13(4), 434; https://doi.org/10.3390/bioengineering13040434 - 8 Apr 2026

Abstract

Background: Owing to significant clinical heterogeneity, accurate survival forecasting for individuals with colorectal cancer and peritoneal metastasis remains difficult. We aimed to transcend traditional prognostic limitations by evaluating machine learning boosting models against standard regression-based methods in terms of estimating overall survival (OS). Methods: Utilizing a multi-institutional registry of 150 patients diagnosed with synchronous peritoneal metastasis of colorectal cancer, we integrated 124 clinicopathological variables to refine our predictive models. After standard preprocessing (standardization and median imputation), we rigorously compared XGBoost and LightGBM against Ridge, Lasso, and linear regression via five-fold cross-validation. To specifically address right-censoring, an XGBoost Cox model was implemented and validated using Harrell’s C-index, with SHAP and LIME providing essential model interpretability. Results: Boosting models consistently outperformed linear alternatives, which struggled with high error rates and negative R² values. Specifically, XGBoost achieved an MAE of 475 ± 60 and an RMSE of 585 ± 88. The XGBoost Cox model reached a C-index of 0.64 ± 0.06. SHAP analysis highlighted inflammatory markers and peritoneal disease extent as the most influential prognostic drivers. Conclusions: While boosting models offer a clear accuracy advantage over linear methods, their prognostic power remains moderate. These findings underscore the potential of ensemble learning in oncology, yet mandate external validation before these tools can be integrated into clinical decision-making.

(This article belongs to the Section Biosignal Processing)
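
Harrell's C-index, the validation metric named in the abstract above, measures how often a model ranks pairs of patients in the correct order of survival. The sketch below is a minimal, quadratic-time version under stated assumptions: pairs with tied survival times are skipped, tied risk scores count as half-concordant, and the function name and toy data are hypothetical. It is not the authors' evaluation code.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: observed follow-up times.
    events: 1 if the event (death) was observed, 0 if censored.
    risk_scores: model output where a higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0      # earlier event, higher risk: correct order
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5      # tied prediction counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical follow-up in days; the third patient is censored.
    print(harrell_c_index([5, 10, 15], [1, 1, 0], [3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the reported 0.64 is described as moderate prognostic power.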

31 pages, 4302 KB

Open Access Article

A Reproducible QA/QC, Imputation and Robust-Series Workflow for Air-Quality Monitoring Time Series

by Nuria Fernández Palomares, Laura Álvarez de Prado, Luis Alfonso Menéndez García, David Fernández López, Sandra Buján and Antonio Bernardo Sánchez

Appl. Sci. 2026, 16(7), 3396; https://doi.org/10.3390/app16073396 - 31 Mar 2026

Abstract

This study develops a reproducible and auditable workflow to prepare regulatory air-quality monitoring time series for subsequent temporal analysis, including observational PRE/POST applications around coal-fired power plant closures in northwestern Spain. The dataset comprises daily concentrations from 28 monitoring stations (2006–2023) for PM₁₀, PM₂.₅, NO, NO₂, NOₓ, O₃, SO₂, and CO, affected by missingness, structural inconsistencies, and extreme values. The contribution of this study lies in integrating standardized data ingestion and QA/QC, chained-equation imputation with Bayesian Ridge regression, hold-out validation, physicochemical consistency checks, and robust extreme-value handling within a traceable processing workflow. Missing values are reconstructed per pollutant using plant-level multi-station pooling to improve stability. Performance is evaluated using a 5% masked hold-out and summarized with MAE, RMSE, R², and bias, complemented by an operational fit-quality label. Post-imputation controls enforce NO–NO₂–NOₓ consistency and the physical constraint PM₂.₅ ≤ PM₁₀, while extreme values are screened through a hierarchical robustness framework combining a Hampel filter, winsorization, and a Tukey IQR criterion. The workflow outputs documented diagnostics and robust daily series while preserving the traceability of observed values, flags, edits, and final decisions.

(This article belongs to the Section Environmental Sciences)
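
Of the three extreme-value screens listed in the abstract above, the Hampel filter is the least familiar, so a minimal sketch follows. It flags a point when it lies more than a chosen number of scaled median absolute deviations from the median of its surrounding window. The window half-width, the threshold of 3, the choice to replace flagged values with the window median, and the toy series are all assumptions made for illustration; the abstract does not state the settings the authors used.

```python
from statistics import median


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Flag and replace outliers using a rolling median and scaled MAD.

    Returns (cleaned, flags). Original values are left untouched so that
    observed data and edits stay traceable.
    """
    scale = 1.4826  # makes the MAD consistent with the std. dev. for Gaussian data
    cleaned = list(values)
    flags = [False] * len(values)
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]                       # window is truncated at the edges
        med = median(window)
        mad = scale * median([abs(v - med) for v in window])
        if abs(values[i] - med) > n_sigmas * mad:
            flags[i] = True
            cleaned[i] = med                         # assumption: replace with window median
    return cleaned, flags


if __name__ == "__main__":
    # Hypothetical daily concentrations with one spike at index 4.
    series = [10, 11, 10, 12, 95, 11, 10, 12, 11]
    cleaned, flags = hampel_filter(series)
    print(cleaned)
    print(flags)
```

Returning the flags alongside the cleaned series mirrors the traceability goal stated in the abstract: every edit can be audited against the observed value it replaced.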

21 pages, 5707 KB

Open Access Article

Data-Efficient Multi-Objective Design of Auxiliary Localization Coils for Misalignment-Robust UAV WPT

by Jiali Liu, Dechun Yuan, Linxuan Li, Zhihao Han and Nian Li

Appl. Sci. 2026, 16(7), 3393; https://doi.org/10.3390/app16073393 - 31 Mar 2026

Abstract

To address the challenges of difficult quantitative design and potential coil mismatch in auxiliary coils within wireless power transfer systems, a data-driven parameter optimization method based on multi-objective particle swarm optimization (MOPSO) was proposed. First, based on the inductor–capacitor–capacitor series (LCC-S) compensation topology, a mechanism-based analysis was conducted, establishing coil side length A and number of turns N as core optimization variables. Subsequently, a collaborative optimization framework integrating “parametric simulation–surrogate modeling–active learning” was established. An offline fingerprint database was constructed via finite element simulation, and a high-accuracy surrogate model was developed using a kernel ridge regression ensemble approach. Active learning strategies were employed to adaptively augment data points and mitigate uncertainty. Finally, the MOPSO algorithm was applied to identify the Pareto-optimal solution set. Experimental results reveal that the optimized auxiliary coil parameters achieved positioning errors below 8 mm at all test points. The maximum positioning error was reduced by approximately 80% compared to the traditional empirical approach, providing a useful parameter-selection reference for high-precision wireless charging alignment systems under the investigated static operating conditions.

(This article belongs to the Section Electrical, Electronics and Communications Engineering)

36 pages, 26341 KB

Open Access Article

Sandbody Prediction Based on Fusion of Seismic Multi-Attributes and Machine Learning Under Sedimentary Facies Constraint—A Case Study of Chenguanzhuang Area in Dongying Depression, Bohai Bay Basin

by Jinshuai Liu, Chengyan Lin, Chris Elders and Azhari Faris

Appl. Sci. 2026, 16(7), 3341; https://doi.org/10.3390/app16073341 - 30 Mar 2026

Abstract

In complex sedimentary environments, the identification of thin sandbodies and the accurate prediction of their thickness remain challenging, particularly when relying on a single analytical approach. Taking the lower sub-member of the fourth member of the Shahejie Formation (Es₄^L) in the Chenguanzhuang area of the Dongying Depression as a case study, this study proposes a quantitative prediction method that integrates sedimentary facies constraints with machine learning-based seismic multi-attribute fusion. Based on core observations, well log data, and 3D seismic datasets, the study area is subdivided into two zones: Zone I (shallow-water delta front) and Zone II (shore–shallow lake). Sensitive attributes for each zone are optimized using Pearson correlation analysis and hierarchical clustering, and five machine learning models—SVR, Random Forest, MLP, Ridge Regression, and Lasso Regression—are systematically evaluated. The MLP model is selected for Zone I, achieving R² values of 0.856 and 0.936 for the training and test sets, respectively, whereas Ridge Regression combined with leave-one-out cross-validation (LOOCV) is adopted for Zone II to mitigate overfitting caused by limited well data, yielding R² values of 0.864 and 0.779. Compared with conventional linear regression (R² = 0.45), the proposed approach significantly improves the accuracy of quantitative sandbody prediction, providing a reliable geological basis for hydrocarbon exploration and an effective technical framework for similar complex sedimentary environments.

24 pages, 4905 KB

Open Access Article

Research on Control Factors and Parameter Optimization of Surfactant Flooding in Low-Permeability Reservoirs Using Random Forest Algorithm

by Yangnan Shangguan, Chunning Gao, Junhong Jia, Jinghua Wang, Guowei Yuan, Huilin Wang, Jiangping Wu, Ke Wu, Yun Bai, Hengye Liu and Yujie Bai

Processes 2026, 14(7), 1108; https://doi.org/10.3390/pr14071108 - 29 Mar 2026

Abstract

As oil and gas development increasingly targets low and ultra-low permeability reservoirs, conventional recovery techniques often prove insufficient for mobilizing residual oil. Surfactant flooding, a key chemical enhanced oil recovery (EOR) technology, thus requires careful system optimization and mechanistic investigation. This study focuses on low-permeability reservoirs in the Changqing Oilfield and evaluates three surfactant systems: YHS-Z1 (a 7:3 mass ratio blend of hydroxypropyl sulfobetaine and cocamide), YHS-Z2 (a polyether carboxylate, a nonionic-anionic composite), and a middle-phase microemulsion system (heavy alkylbenzene sulfonate and hydroxysulfobetaine combined at a mass ratio of 7:3). Their oil displacement properties were characterized through a series of experiments including interfacial tension measurement, contact angle analysis, static and dynamic oil displacement tests, and emulsion transport/retention index assessments. Based on the experimental data, this study constructed four classical regression models, namely Ridge Regression, Random Forest (RF), Gradient Boosting Regression (GBR), and Support Vector Regression (SVR), and conducted a comparative analysis of their predictive performance. The results demonstrate that the RF model achieved the best prediction performance, with a Mean Absolute Error (MAE) of 1.8245, a Mean Absolute Percentage Error (MAPE) of 4.78%, and a coefficient of determination (R²) of 0.9428 on the training set. Further analysis using the SHapley Additive exPlanations (SHAP) algorithm revealed that the retention index is the primary global factor (accounting for 49.79% of the variance), while significant intergroup differences exist in the primary factors across different surfactant systems. Concurrently, single-factor and multi-factor sensitivity analyses were conducted to elucidate synergistic effects and threshold behaviors among parameters. The optimal parameter combination, identified via a random search method, achieved a predicted recovery factor of 45.61%, representing a 6.57% improvement over the highest experimental value. This study demonstrates that machine learning methods can effectively identify the dominant factors in oil displacement and enable synergistic parameter optimization, thereby providing a theoretical foundation for the efficient development of surfactant flooding in low-permeability reservoirs.

(This article belongs to the Topic Enhanced Oil Recovery Technologies, 4th Edition)

27 pages, 3936 KB

Open Access Article

Productivity Prediction in Tight Oil Reservoirs: A Stacking Ensemble Approach with Hybrid Feature Selection

by Zhengyang Kang, Yong Zheng, Tianyang Zhang, Haoyu Chen, Xiaoyan Zhou, Quanyu Cai and Yiran Sun

Processes 2026, 14(7), 1089; https://doi.org/10.3390/pr14071089 - 27 Mar 2026

Viewed by 328

Abstract

To address the challenges of low accuracy and complex influencing factors in predicting horizontal well fracturing productivity during the development of unconventional oil and gas resources such as tight oil, this paper proposes a productivity prediction framework based on an improved feature selection method and an ensemble learning model. For feature selection, the entropy weight method is used to fuse the results of Pearson correlation analysis and improved gray relational analysis (IGRA). Thirteen machine learning models were tested with six distinct parameter combinations to construct a Stacking-based ensemble learning model, with base models including Random Forest (RF), Ridge Regression (RR), and Artificial Neural Network (ANN). Particle Swarm Optimization (PSO) was employed to optimize hyperparameters, followed by interpretability analysis using SHapley Additive exPlanations (SHAP). The results indicate that the model with fused weights performed best. The Stacking model achieved significantly improved accuracy after PSO optimization, with the coefficient of determination increasing by 4.9%, outperforming all comparison models. The analysis also yields engineering guidance: under current geological conditions, sand ratio and displacement fluid volume require fine-tuning to prevent over-treatment, and fracturing design should implement differentiated strategies based on the target sand body thickness. This study not only delivers a high-precision production prediction tool but also offers decision support for efficient unconventional oil and gas field development through its interpretability.

(This article belongs to the Section Petroleum and Low-Carbon Energy Process Engineering)

19 pages, 2119 KB

Open Access Article

UHPC Creep Behavior and Neural Network Prediction with Calibration of fib Model Code 2020

by Shijun Wang, Mengen Yue, Wenming Zhang and Teng Tong

Buildings 2026, 16(7), 1300; https://doi.org/10.3390/buildings16071300 - 25 Mar 2026

Viewed by 229

Abstract

Ultra-High-Performance Concrete (UHPC) is increasingly used in slender and prestressed structural members due to its superior strength and durability. However, inaccurate or incomplete prediction of creep deformation may lead to excessive long-term deflection, prestress loss, cracking, and potential serviceability or safety risks in buildings and infrastructure. Therefore, reliable prediction methods for UHPC creep are essential for both structural design and long-term performance assessment. In this study, a database containing 60 literature-derived UHPC creep records was compiled to investigate the creep coefficient at approximately 100 days. Pearson correlation analysis revealed strong interdependence among predictors and weak single-variable linear relationships, indicating that creep behavior is governed by nonlinear interactions. A feedforward backpropagation neural network (BPNN) trained using the Levenberg–Marquardt algorithm was developed to predict the creep coefficient. To maintain engineering interpretability, the fib Model Code 2020 (MC2020) formulation was adopted as a code-based benchmark and further calibrated using ridge regression. Results show that the calibrated MC2020 model improves prediction consistency, while the BPNN model provides the highest predictive accuracy. The proposed framework integrates machine-learning prediction with interpretable code-based calibration, contributing to the development of creep modeling approaches for UHPC and providing practical support for the safe design of UHPC structures.

(This article belongs to the Section Building Materials, and Repair & Renovation)

15 pages, 569 KB

Open Access Article

Ecological Correlates of Differences in Mean Age at Death Across Nearly Extinct Cohorts: The Role of Dietary Habits

by Alessandro Menotti, Paolo Emilio Puddu, David R. Jacobs, Anthony Kafatos, Miodrag Ostojic and Hanna Tolonen

Nutrients 2026, 18(7), 1021; https://doi.org/10.3390/nu18071021 - 24 Mar 2026

Viewed by 300

Abstract

Objectives. Our objective was to study the ecological relationship of many risk factors and personal characteristics with mean age at death (AD) after a 50-year follow-up of nearly extinct cohorts. Material and Methods. Sixteen cohorts totaling 12,763 middle-aged men were enrolled in the Seven Countries Study (SCS), and 58 variables were measured, including traditional risk factors, dietary and nutritional variables, and anthropometric variables. A follow-up of 50 years allowed the use of AD as the end-point. Analysis included simple linear regression and correlation, and multivariate modelling using Principal Component Analysis and regression and Ridge regression. Results. Out of 58 variables, only 11 (10 nutrition-dietary items plus age) showed a significant linear correlation coefficient (R) ≥ 0.50 and a p value ≤ 0.05. A linear regression was computed using as a predictor the dietary factor score derived from a Principal Component Analysis of the 11 significant variables, which were used as independent variables and whose coefficients were significantly related to AD; the final R² was 0.52. The Principal Component regression and Ridge regression documented a direct relationship of food groups of vegetable origin (including olive oil) with AD and an inverse relationship for food groups of animal origin. Conclusions. A few variables, all related to diet and nutrition, were able to statistically explain about 50% of the differences in AD across 16 cohorts of men followed up nearly until death. Other variables, including traditional cardiovascular disease risk factors, did not contribute significantly.

(This article belongs to the Section Nutrition and Public Health)

Search Results (351)