MDPI - Publisher of Open Access Journals

24 pages, 5039 KiB

Open AccessArticle

Advanced Estimation of Winter Wheat Leaf’s Relative Chlorophyll Content Across Growth Stages Using Satellite-Derived Texture Indices in a Region with Various Sowing Dates

by Jingyun Chen, Quan Yin, Jianjun Wang, Weilong Li, Zhi Ding, Pei Sun Loh, Guisheng Zhou and Zhongyang Huo

Plants 2025, 14(15), 2297; https://doi.org/10.3390/plants14152297 - 25 Jul 2025

Viewed by 278

Abstract

Accurately estimating leaves’ relative chlorophyll contents (widely represented by Soil and Plant Analysis Development (SPAD) values) across growth stages is crucial for assessing crop health, particularly in regions characterized by varying sowing dates. Unlike previous studies focusing on high-resolution UAV imagery or specific growth stages, this research incorporates satellite-derived texture indices (TIs) into a SPAD value estimation model applicable across multiple growth stages (from tillering to grain-filling). Field experiments were conducted in Jiangsu Province, China, where winter wheat sowing dates varied significantly from field to field. Sentinel-2 imagery was employed to extract vegetation indices (VIs) and TIs. Following a two-step variable selection method, Random Forest (RF)-LassoCV, five machine learning algorithms were applied to develop estimation models. The newly developed model (SVR-RBF_VIs+TIs) exhibited robust estimation performance (R² = 0.8131, RMSE = 3.2333, RRMSE = 0.0710, and RPD = 2.3424) when validated against independent SPAD value datasets collected from fields with varying sowing dates. Moreover, this optimal model also exhibited a notable level of transferability at another location with different sowing times, wheat varieties, and soil types from the modeling area. In addition, this research revealed that despite the lower resolution of satellite imagery compared to UAV imagery, the incorporation of TIs significantly improved estimation accuracies compared to the sole use of VIs typical in previous studies. Full article

(This article belongs to the Special Issue Remote Sensing Application in Augmenting Water and Fertilizer Utilization for Sustainable Agriculture)
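
The four statistics quoted in the abstract above (R², RMSE, RRMSE, and RPD) can be reproduced with a few lines of arithmetic. The sketch below assumes the common definitions RRMSE = RMSE / mean(observed) and RPD = SD(observed) / RMSE with the sample standard deviation; the article may define them slightly differently, and the SPAD readings are made up for illustration.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE, RRMSE and RPD under the assumed definitions."""
    n = len(observed)
    mean_obs = statistics.fmean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(sse / n)
    return {
        "r2": 1.0 - sse / sst,
        "rmse": rmse,
        "rrmse": rmse / mean_obs,                    # RMSE relative to the observed mean
        "rpd": statistics.stdev(observed) / rmse,    # sample SD of observations over RMSE
    }


# Hypothetical SPAD readings, not data from the article.
observed = [40.0, 44.0, 48.0, 52.0]
predicted = [41.0, 43.0, 49.0, 51.0]
metrics = regression_metrics(observed, predicted)
```

Under these definitions RPD and R² move together: a higher R² implies a smaller RMSE relative to the spread of the observations, and therefore a larger RPD.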

34 pages, 4523 KiB

Open AccessArticle

Evaluating Prediction Performance: A Simulation Study Comparing Penalized and Classical Variable Selection Methods in Low-Dimensional Data

by Edwin Kipruto and Willi Sauerbrei

Appl. Sci. 2025, 15(13), 7443; https://doi.org/10.3390/app15137443 - 2 Jul 2025

Viewed by 402

Abstract

Variable selection is important for developing accurate and interpretable prediction models. While classical and penalized methods are widely used, few simulation studies provide meaningful comparisons. This study compares their predictive performance and model complexity in low-dimensional data. Three classical methods (best subset selection, backward elimination, and forward selection) and four penalized methods (nonnegative garrote (NNG), lasso, adaptive lasso (ALASSO), and relaxed lasso (RLASSO)) were compared. Tuning parameters were selected using cross-validation (CV), Akaike information criterion (AIC), and Bayesian information criterion (BIC). Classical methods performed similarly and produced worse predictions than penalized methods in limited-information scenarios (small samples, high correlation, and low signal-to-noise ratio (SNR)), but performed comparably or better in sufficient-information scenarios (large samples, low correlation, and high SNR). Lasso was superior under limited information but was less effective in sufficient-information scenarios. NNG, ALASSO, and RLASSO outperformed lasso in sufficient-information scenarios, with no clear winner among them. AIC and CV produced similar results and outperformed BIC, except in sufficient-information settings, where BIC performed better. Our findings suggest that no single method consistently outperforms others, as performance depends on the amount of information in the data. Lasso is preferred in limited-information settings, whereas classical methods are more suitable in sufficient-information settings, as they also tend to select simpler models. Full article

(This article belongs to the Special Issue Machine Learning in Biomedical Sciences)
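
To illustrate why BIC tends to favor simpler models than AIC, as the abstract above notes, the following sketch uses the textbook Gaussian forms AIC = n·ln(RSS/n) + 2k and BIC = n·ln(RSS/n) + k·ln(n), up to an additive constant. These formulas and the numbers are assumptions for illustration, not taken from the article.

```python
import math


def gaussian_aic(rss, n, k):
    """AIC for a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def gaussian_bic(rss, n, k):
    """BIC for a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical fits on n = 100 observations: the larger model fits slightly better.
n = 100
small = {"rss": 110.0, "k": 3}
large = {"rss": 100.0, "k": 6}

aic_prefers_large = gaussian_aic(large["rss"], n, large["k"]) < gaussian_aic(small["rss"], n, small["k"])
bic_prefers_small = gaussian_bic(small["rss"], n, small["k"]) < gaussian_bic(large["rss"], n, large["k"])
```

With these made-up values the two criteria disagree: the per-parameter penalty of ln(100) ≈ 4.6 under BIC outweighs the modest gain in fit, whereas the penalty of 2 under AIC does not.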

30 pages, 4883 KiB

Open AccessArticle

Cyber-Secure IoT and Machine Learning Framework for Optimal Emergency Ambulance Allocation

by Jonghyuk Kim and Sewoong Hwang

Appl. Sci. 2025, 15(13), 7156; https://doi.org/10.3390/app15137156 - 25 Jun 2025

Viewed by 429

Abstract

Optimizing ambulance deployment is a critical task in emergency medical services (EMS), as it directly affects patient outcomes and system efficiency. This study proposes a cyber-secure, machine learning-based framework for predicting region-specific ambulance allocation and response times across South Korea. The model integrates heterogeneous datasets—including demographic profiles, transportation indices, medical infrastructure, and dispatch records from 229 EMS centers—and incorporates real-time IoT streams such as traffic flow and geolocation data to enhance temporal responsiveness. Supervised regression algorithms—Random Forest, XGBoost, and LightGBM—were trained on 2061 center-month observations. Among these, Random Forest achieved the best balance of accuracy and interpretability (MSE = 0.05, RMSE = 0.224). Feature importance analysis revealed that monthly patient transfers, dispatch variability, and high-acuity case frequencies were the most influential predictors, underscoring the temporal and contextual complexity of EMS demand. To support policy decisions, a Lasso-based simulation tool was developed, enabling dynamic scenario testing for optimal ambulance counts and dispatch time estimates. The model also incorporates the coefficient of variation (CV) of workload intensity as a performance metric to guide long-term capacity planning and equity assessment. All components operate within a cyber-secure architecture that ensures end-to-end encryption of sensitive EMS and IoT data, maintaining compliance with privacy regulations such as GDPR and HIPAA. By integrating predictive analytics, real-time data, and operational simulation within a secure framework, this study offers a scalable and resilient solution for data-driven EMS resource planning. Full article

(This article belongs to the Special Issue Advances in Internet of Things (IoT) Security: Challenges and Applications)
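
Two quantities in the abstract above are simple to check by hand: RMSE is the square root of MSE, and the coefficient of variation is the standard deviation divided by the mean. The sketch below assumes the sample standard deviation and uses hypothetical workload figures; the article's exact workload definition is not given here.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.fmean(values)


# The reported MSE of 0.05 corresponds to the reported RMSE of 0.224.
rmse_from_mse = math.sqrt(0.05)

# Hypothetical monthly workloads for one EMS center, for illustration only.
workload_cv = coefficient_of_variation([10.0, 12.0, 8.0, 10.0])
```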

17 pages, 3428 KiB

Open AccessArticle

Machine Learning Method for Prediction of Hearing Improvement After Stapedotomy

by Vid Rebol and Janez Rebol

Appl. Sci. 2024, 14(24), 11882; https://doi.org/10.3390/app142411882 - 19 Dec 2024

Viewed by 810

Abstract

Otosclerosis is a localized disease of the bone derived from the otic capsule. Surgery is considered for patients with conductive hearing loss of at least 15 dB in frequencies 250 to 1000 Hz or higher. In some cases, the decision as to whether surgery (stapedotomy) should be performed is challenging. We developed a machine learning method that predicts a patient’s postoperative hearing quality following stapedotomy, based on their preoperative hearing quality and other features. A separate set of regressors was trained to predict each postoperative hearing intensity on selected feature sets. For feature selection, the least absolute shrinkage and selection operator (Lasso) technique was used. Four models were constructed and evaluated: Lasso, Ridge, k-nearest neighbors, and random forest. The most successful predictions were made at air conduction frequencies between 1000 and 3000 Hz, with mean absolute errors of approximately 6 dB. Utilizing the nested CV method, the Lasso predictor achieved the highest overall prediction accuracy. This study presents the first stapedotomy result prediction method for operating surgeons using machine learning. The potential of audiogram estimation in predicting hearing recovery is demonstrated, offering an alternative to existing classification based models. Full article

(This article belongs to the Special Issue Machine Learning in Vibration and Acoustics 2.0)
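
The nested CV mentioned in the abstract above separates hyperparameter tuning (inner folds) from performance estimation (outer folds). A minimal sketch of the index bookkeeping follows; the helper names `k_fold_splits` and `nested_cv_splits` are hypothetical, the folds are interleaved rather than shuffled, and the fold counts are arbitrary rather than those used in the article.

```python
def k_fold_splits(indices, k):
    """Yield (train, test) index lists for k interleaved folds."""
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    The inner splits are drawn only from the outer training indices, so the
    outer test fold never influences tuning.
    """
    all_indices = list(range(n_samples))
    for outer_train, outer_test in k_fold_splits(all_indices, outer_k):
        inner_splits = list(k_fold_splits(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


splits = list(nested_cv_splits(n_samples=12, outer_k=3, inner_k=2))
```

In practice, a model such as Lasso would be tuned on each set of `inner_splits`, refit on `outer_train`, and scored once on `outer_test`.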

18 pages, 10887 KiB

Open AccessArticle

The Cost-Optimal Control of Building Air Conditioner Loads Based on Machine Learning: A Case Study of an Office Building in Nanjing

by Zhenwei Guo, Xinyu Wang, Yao Wang, Fenglei Zhu, Haizhu Zhou, Miao Zhang and Yuxiang Wang

Buildings 2024, 14(10), 3040; https://doi.org/10.3390/buildings14103040 - 24 Sep 2024

Viewed by 1679

Abstract

Building envelopes and indoor environments exhibit thermal inertia, forming a virtual energy storage system in conjunction with the building air conditioner (AC) system. This system represents a current demand response resource for building electricity use. Thus, this study centers on the CatBoost algorithm within machine learning (ML) technology, utilizing the LASSO regression model for feature selection and applying the Optuna framework for hyperparameter optimization (HPO) to develop a cost-optimal control method for minimizing building AC loads. This method addresses the challenges associated with traditional load forecasting and control methods, which are often impacted by environmental temperature, building parameters, and user behavior uncertainties. These methods struggle to accurately capture the complex dynamics and nonlinear relationships of AC operations, making it difficult to devise AC operation and virtual energy storage scheduling strategies effectively. The proposed method was applied and validated using a case study of an office building in Nanjing, China. The prediction results showed coefficient of variation in root mean square error (CV-RMSE) values of 6.4% and 2.2%. Compared with the original operating conditions, the indoor temperature remained within a comfortable range, the AC load was reduced by 5.25%, and the operating energy costs were reduced by 24.94%. These results demonstrate that the proposed method offers improved computational efficiency, enhanced model performance, and economic benefits. Full article

(This article belongs to the Special Issue Smart and Sustainable Buildings: New Trends, Technologies, and Integration in the Energy Transition)

18 pages, 3584 KiB

Open AccessArticle

Advanced Predictive Modeling for Dam Occupancy Using Historical and Meteorological Data

by Ahmet Cemkut Badem, Recep Yılmaz, Muhammet Raşit Cesur and Elif Cesur

Sustainability 2024, 16(17), 7696; https://doi.org/10.3390/su16177696 - 4 Sep 2024

Viewed by 1639

Abstract

Dams significantly impact the environment, industries, residential areas, and agriculture. Efficient dam management can mitigate negative impacts and enhance benefits such as flood and drought reduction, energy efficiency, water access, and improved irrigation. This study tackles the critical issue of predicting dam occupancy levels precisely to contribute to sustainable water management by enabling efficient water allocation among sectors, proactive drought management, controlled flood risk mitigation, and preservation of downstream ecological integrity. Our research suggests that combining physical models of water inflow and outflow “such as evapotranspiration using the Penman–Monteith equation, along with parameters like water consumption, solar radiation, and rainfall” with data-driven models based on historical reservoir data is crucial for accurately predicting occupancy levels. We implemented various prediction models, including Random Forest, Extra Trees, Long Short-Term Memory, Orthogonal Matching Pursuit CV, and Lasso Lars CV. To strengthen our proposed model with robust evidence, we conducted statistical tests on the mean absolute percentage errors of the models. Consequently, we demonstrated the impact of physical model parameters on prediction performance and identified the best method for predicting dam occupancy levels by comparing it with findings from the scientific literature. Full article

(This article belongs to the Special Issue AI Solutions for Improving Sustainability in Water Resource Management)

20 pages, 590 KiB

Open AccessEditor’s ChoiceArticle

Metabolite Predictors of Breast and Colorectal Cancer Risk in the Women’s Health Initiative

by Sandi L. Navarro, Brian D. Williamson, Ying Huang, G. A. Nagana Gowda, Daniel Raftery, Lesley F. Tinker, Cheng Zheng, Shirley A. A. Beresford, Hayley Purcell, Danijel Djukovic, Haiwei Gu, Howard D. Strickler, Fred K. Tabung, Ross L. Prentice, Marian L. Neuhouser and Johanna W. Lampe

Metabolites 2024, 14(8), 463; https://doi.org/10.3390/metabo14080463 - 20 Aug 2024

Cited by 4 | Viewed by 2377

Abstract

Metabolomics has been used extensively to capture the exposome. We investigated whether prospectively measured metabolites provided predictive power beyond well-established risk factors among 758 women with adjudicated cancers [n = 577 breast (BC) and n = 181 colorectal (CRC)] and n = 758 controls with available specimens (collected mean 7.2 years prior to diagnosis) in the Women’s Health Initiative Bone Mineral Density subcohort. Fasting samples were analyzed by LC-MS/MS and lipidomics in serum, plus GC-MS and NMR in 24 h urine. For feature selection, we applied LASSO regression and Super Learner algorithms. Prediction models were subsequently derived using logistic regression and Super Learner procedures, with performance assessed using cross-validation (CV). For BC, metabolites did not increase predictive performance over established risk factors (CV-AUCs~0.57). For CRC, prediction increased with the addition of metabolites (median CV-AUC across platforms increased from ~0.54 to ~0.60). Metabolites related to energy metabolism: adenosine, 2-hydroxyglutarate, N-acetyl-glycine, taurine, threonine, LPC (FA20:3), acetate, and glycerate; protein metabolism: histidine, leucic acid, isoleucine, N-acetyl-glutamate, allantoin, N-acetyl-neuraminate, hydroxyproline, and uracil; and dietary/microbial metabolites: myo-inositol, trimethylamine-N-oxide, and 7-methylguanine, consistently contributed to CRC prediction. Energy metabolism may play a key role in the development of CRC and may be evident prior to disease development. Full article

(This article belongs to the Special Issue Metabolomics-Based Biomarkers for Nutrition and Health)
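
The AUC values reported above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes that pairwise form directly, counting ties as one half; the scores are invented for illustration and this is not the article's cross-validated procedure.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks.
auc = auc_from_scores([0.8, 0.6, 0.4], [0.7, 0.3, 0.2])
```

An AUC of 0.5 corresponds to random ranking, which puts the reported values of roughly 0.54 to 0.60 in context.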

21 pages, 6541 KiB

Open AccessFeature PaperEditor’s ChoiceArticle

Comparison of Machine Learning Models for Predicting Interstitial Glucose Using Smart Watch and Food Log

by Haider Ali, Imran Khan Niazi, David White, Malik Naveed Akhter and Samaneh Madanian

Electronics 2024, 13(16), 3192; https://doi.org/10.3390/electronics13163192 - 12 Aug 2024

Cited by 4 | Viewed by 2757

Abstract

This study examines the performance of various machine learning (ML) models in predicting Interstitial Glucose (IG) levels using data from wrist-worn wearable sensors. The insights from these predictions can aid in understanding metabolic syndromes and disease states. A public dataset comprising information from the Empatica E4 smart watch, the Dexcom Continuous Glucose Monitor (CGM) measuring IG, and a food log was utilized. The raw data were processed into features, which were then used to train different ML models. This study evaluates the performance of decision tree (DT), support vector machine (SVM), Random Forest (RF), Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Gaussian Naïve Bayes (GNB), lasso cross-validation (LassoCV), Ridge, Elastic Net, and XGBoost models. For classification, IG labels were categorized into high, standard, and low, and the performance of the ML models was assessed using accuracy (40–78%), precision (41–78%), recall (39–77%), F1-score (0.31–0.77), and receiver operating characteristic (ROC) curves. Regression models predicting IG values were evaluated based on R-squared values (−7.84–0.84), mean absolute error (5.54–60.84 mg/dL), root mean square error (9.04–68.07 mg/dL), and visual methods like residual and QQ plots. To assess whether the differences between models were statistically significant, the Friedman test was carried out and was interpreted using the Nemenyi post hoc test. Tree-based models, particularly RF and DT, demonstrated superior accuracy for classification tasks in comparison to other models. For regression, the RF model achieved the lowest RMSE of 9.04 mg/dL with an R-squared value of 0.84, while the GNB model performed the worst, with an RMSE of 68.07 mg/dL. A SHAP analysis identified time from midnight as the most significant predictor. Partial dependence plots revealed complex feature interactions in the RF model, contrasting with the simpler interactions captured by LDA. Full article

(This article belongs to the Special Issue Machine Learning for Biomedical Applications)

22 pages, 3817 KiB

Open AccessEditor’s ChoiceArticle

Enhancing Immunotherapy Response Prediction in Metastatic Lung Adenocarcinoma: Leveraging Shallow and Deep Learning with CT-Based Radiomics across Single and Multiple Tumor Sites

by Cécile Masson-Grehaigne, Mathilde Lafon, Jean Palussière, Laura Leroy, Benjamin Bonhomme, Eva Jambon, Antoine Italiano, Sophie Cousin and Amandine Crombé

Cancers 2024, 16(13), 2491; https://doi.org/10.3390/cancers16132491 - 8 Jul 2024

Cited by 3 | Viewed by 2070

Abstract

This study aimed to evaluate the potential of pre-treatment CT-based radiomics features (RFs) derived from single and multiple tumor sites, and state-of-the-art machine-learning survival algorithms, in predicting progression-free survival (PFS) for patients with metastatic lung adenocarcinoma (MLUAD) receiving first-line treatment including immune checkpoint inhibitors (CPIs). To do so, all adults with newly diagnosed MLUAD, a pre-treatment contrast-enhanced CT scan, and performance status ≤ 2 who were treated at our cancer center with first-line CPI between November 2016 and November 2022 were included. RFs were extracted from all measurable lesions with a volume ≥ 1 cm³ on the CT scan. To capture intra- and inter-tumor heterogeneity, RFs from the largest tumor of each patient, as well as the lowest, highest, and average RF values over all lesions per patient, were collected. Intra-patient inter-tumor heterogeneity metrics were calculated to measure the similarity between each patient's lesions. After filtering predictors with univariable Cox p < 0.100 and analyzing their correlations, five survival machine-learning algorithms (stepwise Cox regression [SCR], LASSO Cox regression, random survival forests, gradient boosted machine [GBM], and deep learning [Deepsurv]) were trained in 100-times repeated 5-fold cross-validation (rCV) to predict PFS on three inputs: (i) clinicopathological variables, (ii) all radiomics-based and clinicopathological variables (full input), and (iii) uncorrelated radiomics-based and clinicopathological variables (uncorrelated input). The models' performances were evaluated using the concordance index (c-index). Overall, 140 patients were included (median age: 62.5 years, 36.4% women). In rCV, the highest c-index was reached with Deepsurv (c-index = 0.631, 95% CI = 0.625–0.647), followed by GBM (c-index = 0.603, 95% CI = 0.557–0.646), significantly outperforming standard SCR regardless of its input (c-index range: 0.560–0.570, all p < 0.0001).
Thus, single- and multi-site pre-treatment radiomics data provide valuable prognostic information for predicting PFS in MLUAD patients undergoing first-line CPI treatment when analyzed with advanced machine-learning survival algorithms.

(This article belongs to the Special Issue Imaging and Molecular Biology as Biomarkers for Lung Cancer)
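
The concordance index used above measures how often a model assigns a higher risk to the patient who progresses first. The following is a simplified sketch of Harrell's c-index that handles right censoring but ignores tied event times; the function name and the toy data are hypothetical and unrelated to the study cohort.

```python
def concordance_index(times, events, risks):
    """Simplified Harrell's c-index.

    A pair (i, j) is comparable when subject i has an observed event and a
    shorter follow-up time than subject j. It is concordant when subject i
    also has the higher predicted risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical follow-up times (months), event flags (1 = progression), and risks.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.5, 0.1],
)
```

As with AUC, a value of 0.5 indicates random ordering and 1.0 indicates perfect ordering.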

16 pages, 750 KiB

Open AccessArticle

Evaluating Outcome Prediction via Baseline, End-of-Treatment, and Delta Radiomics on PET-CT Images of Primary Mediastinal Large B-Cell Lymphoma

by Fereshteh Yousefirizi, Claire Gowdy, Ivan S. Klyuzhin, Maziar Sabouri, Petter Tonseth, Anna R. Hayden, Donald Wilson, Laurie H. Sehn, David W. Scott, Christian Steidl, Kerry J. Savage, Carlos F. Uribe and Arman Rahmim

Cancers 2024, 16(6), 1090; https://doi.org/10.3390/cancers16061090 - 8 Mar 2024

Cited by 12 | Viewed by 2964

Abstract

Objectives: Accurate outcome prediction is important for making informed clinical decisions in cancer treatment. In this study, we assessed the feasibility of using changes in radiomic features over time (Delta radiomics: absolute and relative) following chemotherapy, to predict relapse/progression and time to progression (TTP) of primary mediastinal large B-cell lymphoma (PMBCL) patients. Material and Methods: Given the lack of standard staging PET scans until 2011, only 31 out of 103 PMBCL patients in our retrospective study had both pre-treatment and end-of-treatment (EoT) scans. Consequently, our radiomics analysis focused on these 31 patients who underwent [¹⁸F]FDG PET-CT scans before and after R-CHOP chemotherapy. Expert manual lesion segmentation was conducted on their scans for delta radiomics analysis, along with an additional 19 EoT scans, totaling 50 segmented scans for single time point analysis. Radiomics features (on PET and CT), along with maximum and mean standardized uptake values (SUVmax and SUVmean), total metabolic tumor volume (TMTV), tumor dissemination (Dmax), total lesion glycolysis (TLG), and the area under the curve of cumulative standardized uptake value-volume histogram (AUC-CSH) were calculated. We additionally applied longitudinal analysis using radial mean intensity (RIM) changes. For prediction of relapse/progression, we utilized the individual coefficient approximation for risk estimation (ICARE) and machine learning (ML) techniques (K-Nearest Neighbor (KNN), Linear Discriminant Analysis (LDA), and Random Forest (RF)) including sequential feature selection (SFS) following correlation analysis for feature selection. For TTP, ICARE and CoxNet approaches were utilized. In all models, we used nested cross-validation (CV) (with 10 outer folds and 5 repetitions, along with 5 inner folds and 20 repetitions) after balancing the dataset using Synthetic Minority Oversampling TEchnique (SMOTE). 
Results: To predict relapse/progression using Delta radiomics between the baseline (staging) and EoT scans, the best performances in terms of accuracy and F1 score (F1 score is the harmonic mean of precision and recall, where precision is the ratio of true positives to the sum of true positives and false positives, and recall is the ratio of true positives to the sum of true positives and false negatives) were achieved with ICARE (accuracy = 0.81 ± 0.15, F1 = 0.77 ± 0.18), RF (accuracy = 0.89 ± 0.04, F1 = 0.87 ± 0.04), and LDA (accuracy = 0.89 ± 0.03, F1 = 0.89 ± 0.03), which are higher than the predictive power achieved by using only EoT radiomics features. For the second category of our analysis, TTP prediction, the best performer was CoxNet (LASSO feature selection) with c-index = 0.67 ± 0.06 when using baseline + Delta features (inclusion of both baseline and Delta features). The TTP results via Delta radiomics were comparable to the use of radiomics features extracted from EoT scans for TTP analysis (c-index = 0.68 ± 0.09) using CoxNet (with SFS). The performance of Deauville Score (DS) for TTP was c-index = 0.66 ± 0.09 for n = 50 and 0.67 ± 03 for n = 31 cases when using EoT scans, with no significant differences compared to the radiomics signature from either EoT scans or baseline + Delta features (p-value > 0.05). Conclusion: This work demonstrates the potential of Delta radiomics and the importance of using EoT scans to predict progression and TTP from PMBCL [¹⁸F]FDG PET-CT scans.

(This article belongs to the Special Issue PET/CT in Cancers Outcomes Prediction)
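
The verbal definition of the F1 score given in the abstract above translates directly into code. The sketch below follows that definition; the confusion-matrix counts are hypothetical.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, as defined in the abstract."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
f1 = f1_score(true_positives=8, false_positives=2, false_negatives=4)
```

Algebraically this equals 2·TP / (2·TP + FP + FN), which gives a quick way to check results by hand.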

6 pages, 843 KiB

Open AccessProceeding Paper

Comparing Regression Techniques for Temperature Downscaling in Different Climate Classifications

by Ali Ilghami Kkhosroshahi, Mohammad Bejani, Hadi Pourali and Arman Hosseinpour Salehi

Eng. Proc. 2023, 56(1), 291; https://doi.org/10.3390/ASEC2023-15256 - 26 Oct 2023

Cited by 2 | Viewed by 741

Abstract

This study aims to identify the optimal regression techniques for downscaling among ten commonly used methods in climatology, including SVR, LinearSVR, LASSO, LASSOCV, Elastic Net, Bayesian Ridge, RandomForestRegressor, AdaBoost Regressor, KNeighbors Regressor, and XGBRegressor. For the Köppen climate classification system, including A (tropical), B (dry), C (temperate), and D (continental), synoptic station data were collected. Furthermore, for the purpose of downscaling, a general circulation model (GCM) had been utilized. Additionally, to enhance the performance of downscaling accuracy, mutual information (MI) was employed for feature selection. The downscaling performance was evaluated using the coefficient of determination (DC) and root mean square error (RMSE). Results indicate that SVR had superior performance in tropical and dry climates and LassoCV with RandomForestRegressor had better results in temperate and continental climates. Full article

(This article belongs to the Proceedings of The 4th International Electronic Conference on Applied Sciences)
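
Mutual information, which the abstract above uses for feature selection, quantifies how much knowing one variable reduces uncertainty about another. The sketch below is the plug-in estimate for discrete sequences; climate predictors are continuous, so the study presumably used a different estimator, and this version only illustrates the quantity itself.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c_xy in count_xy.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c_xy * n / (c_x * c_y)
        mi += (c_xy / n) * math.log(c_xy * n / (count_x[x] * count_y[y]))
    return mi


# A feature identical to the target carries ln(2) nats here; an unrelated one carries none.
mi_dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Ranking candidate predictors by this score and keeping the highest-scoring ones is one common way to apply it to feature selection.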

20 pages, 9515 KiB

Open AccessArticle

Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity

by Fuat Kaya, Gaurav Mishra, Rosa Francaviglia and Ali Keshavarzi

Land 2023, 12(4), 819; https://doi.org/10.3390/land12040819 - 3 Apr 2023

Cited by 10 | Viewed by 3552

Abstract

Cation exchange capacity (CEC) is a soil property that significantly determines nutrient availability and the effectiveness of fertilizer applied to lands under different management. Accurate, high-resolution spatial information on CEC is needed for sustainable agricultural management on farms in Nagaland state (northeast India), which are fragmented and intertwined with the forest ecosystem. The current study applied the digital soil mapping (DSM) methodology, based on the CEC values determined in soil samples obtained from 305 points in the region, which is mountainous and difficult to access. Firstly, digital auxiliary data were obtained from three open-access sources, including indices generated from Landsat 8 OLI satellite time series, topographic variables derived from a digital elevation model (DEM), and the WorldClim dataset. Furthermore, the CEC values and the auxiliary data were used to fit Lasso regression (LR), stochastic gradient boosting (GBM), support vector regression (SVR), random forest (RF), and K-nearest neighbors (KNN) machine learning (ML) algorithms, which were systematically compared in the R-Core Environment Program. Model performance was evaluated with the root mean square error (RMSE), coefficient of determination (R²), and mean absolute error (MAE) of 10-fold cross-validation (CV). The lowest RMSE was obtained by the RF algorithm with 4.12 cmol_c kg⁻¹, while the others were in the following order: SVR (4.27 cmol_c kg⁻¹) < KNN (4.45 cmol_c kg⁻¹) < LR (4.67 cmol_c kg⁻¹) < GBM (5.07 cmol_c kg⁻¹). In particular, WorldClim-based climate covariates such as annual mean temperature (BIO-1), annual precipitation (BIO-12), elevation, and solar radiation were the most important variables in all algorithms. High uncertainty (SD) values were found in areas with low soil sampling density, and this finding should be considered in future soil surveys.

(This article belongs to the Special Issue Machine Learning and Data Science Techniques for Remote Sensing and Social Media Data)

17 pages, 3098 KiB

Open AccessArticle

Multispectral UAV-Based Monitoring of Leek Dry-Biomass and Nitrogen Uptake across Multiple Sites and Growing Seasons

by Jérémie Haumont, Peter Lootens, Simon Cool, Jonathan Van Beek, Dries Raymaekers, Eva Ampe, Tim De Cuypere, Onno Bes, Jonas Bodyn and Wouter Saeys

Remote Sens. 2022, 14(24), 6211; https://doi.org/10.3390/rs14246211 - 8 Dec 2022

Cited by 3 | Viewed by 2632

Abstract

Leek farmers tend to apply too much nitrogen fertilizer as its cost is relatively low compared to the gross value of leek. Recently, several studies have shown that proximal sensing technologies could accurately monitor the crop nitrogen content and biomass. However, their implementation is impeded by practical limitations and the limited area they can cover. UAV-based monitoring might alleviate these issues. Studies on UAV-based vegetable crop monitoring are still limited. Because of the economic importance and environmental impact of leeks in Flanders, this study aimed to investigate the ability of UAV-based multispectral imaging to accurately monitor leek nitrogen uptake and dry biomass across multiple fields and seasons. Different modelling approaches were tested using twelve spectral VIs and the interquartile range of each of these VIs within the experimental plots as predictors. In a leave-one-flight-out cross-validation (LOF-CV), leek dry biomass (DBM) was most accurately predicted using a lasso regression model (RMSE_ct = 6.60 g plant⁻¹, R² = 0.90). Leek N-uptake was predicted most accurately by a simple linear regression model based on the red wide dynamic range vegetation index (RWDRVI) (RMSE_ct = 0.22 gN plant⁻¹, R² = 0.85). The results showed that randomized K-fold CV is an undesirable approach. It resulted in more consistent and lower RMSE values during model training and selection, but worse performance on new data. This is likely due to leakage of flight-specific information into the validation split. However, the model predictions were less accurate for data acquired in a different growing season (DBM: RMSEP = 8.50 g plant⁻¹, R² = 0.77; N-uptake: RMSEP = 0.27 gN plant⁻¹, R² = 0.68). Recalibration might solve this issue, but additional research is required to cope with this effect during image acquisition and processing. Further improvement of the model robustness could be obtained through the inclusion of phenological parameters such as crop height.
Full article
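The contrast between leave-one-flight-out and randomized K-fold splitting described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `leave_one_flight_out` and the flight labels are hypothetical. The point is that every sample from the held-out flight lands in the validation split, so flight-specific conditions cannot leak into training.

```python
from collections import defaultdict

def leave_one_flight_out(flight_ids):
    """Yield (held_out_flight, train_idx, test_idx), one fold per flight."""
    by_flight = defaultdict(list)
    for i, flight in enumerate(flight_ids):
        by_flight[flight].append(i)
    for held_out, test_idx in by_flight.items():
        # Train on every sample that does not belong to the held-out flight.
        train_idx = [i for i, f in enumerate(flight_ids) if f != held_out]
        yield held_out, train_idx, test_idx

# Hypothetical flight label for each plot-level sample.
flights = ["f1", "f1", "f2", "f2", "f3", "f3"]
folds = list(leave_one_flight_out(flights))
```

A randomized K-fold split would instead shuffle the sample indices without regard to flight, so samples from the same flight could appear on both sides of a split.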

(This article belongs to the Special Issue Agricultural Applications Using Hyperspectral Data)

15 pages, 543 KiB

Open AccessArticle

Associations between Advanced Glycation End Products, Body Composition and Mediterranean Diet Adherence in Kidney Transplant Recipients

by Josipa Radić, Marijana Vučković, Andrea Gelemanović, Ela Kolak, Dora Bučan Nenadić, Mirna Begović and Mislav Radić

Int. J. Environ. Res. Public Health 2022, 19(17), 11060; https://doi.org/10.3390/ijerph191711060 - 4 Sep 2022

Cited by 4 | Viewed by 2239

Abstract

There is limited evidence on the associations between dietary patterns, body composition, and nonclassical predictors of worse outcomes such as advanced glycation end products (AGE) in kidney transplant recipients (KTRs). The aim of this cross-sectional study was to determine the level of AGE-determined cardiovascular (CV) risk in Dalmatian KTRs and possible associations between AGE, adherence to the Mediterranean diet (MeDi), and nutritional status. Eighty-five (85) KTRs were enrolled in this study. For each study participant, data were collected on the level of AGE, as measured by skin autofluorescence (SAF), Mediterranean Diet Serving Score (MDSS), body mass composition, anthropometric parameters, and clinical and laboratory parameters. Only 11.76% of the participants were adherent to the MeDi. Sixty-nine percent (69%) of KTRs had severe CV risk based on AGE, while 31% of KTRs had mild to moderate CV risk. The LASSO regression analysis identified age, dialysis type, dialysis vintage, presence of CV and chronic kidney disease, C-reactive protein level, urate level, percentage of muscle mass, and adherence to recommendations for nuts, meat, and sweets as positive predictors of AGE. The negative predictors of AGE were calcium, phosphate, cereal adherence according to the MeDi, and trunk fat mass. These results demonstrate extremely low adherence to the MeDi and high AGE-related CV risk in Dalmatian KTRs. Lifestyle interventions in terms of CV risk management and adherence to the MeDi should be taken into consideration when caring for this patient population. Full article

(This article belongs to the Special Issue New Advances in Nutrition and Chronic Non-communicable Diseases)

22 pages, 5949 KiB

Open AccessEditor’s ChoiceArticle

Choosing Feature Selection Methods for Spatial Modeling of Soil Fertility Properties at the Field Scale

by Caner Ferhatoglu and Bradley A. Miller

Agronomy 2022, 12(8), 1786; https://doi.org/10.3390/agronomy12081786 - 29 Jul 2022

Cited by 10 | Viewed by 3012

Abstract

With the growing availability of environmental covariates, feature selection (FS) is becoming an essential task for applying machine learning (ML) in digital soil mapping (DSM). In this study, the effectiveness of six types of FS methods from four categories (filter, wrapper, embedded, and hybrid) was compared. These FS algorithms chose relevant covariates from an exhaustive set of 1049 environmental covariates for predicting five soil fertility properties in ten fields, in combination with ten different ML algorithms. The resulting model performance was compared using three different metrics (R² of 10-fold cross-validation (CV), robustness ratio (RR; developed in this study), and independent validation with Lin’s concordance correlation coefficient (IV-CCC)). FS improved CV, RR, and IV-CCC compared to the models built without FS for most fields and soil properties. Wrapper (BorutaShap) and embedded (Lasso-FS, Random forest-FS) methods usually led to the optimal models. The filter-based ANOVA-FS method mostly led to overfit models, especially for fields with smaller sample quantities. Decision-tree-based models were usually part of the optimal combination of FS and ML. Considering RR helped identify optimal combinations of FS and ML that can improve the performance of DSM compared to models produced from full covariate stacks. Full article

(This article belongs to the Special Issue Geostatistics and Machine Learning in the Mapping of Agricultural Soils: State-of-the-Art and Perspectives)

Search Results (23)