Search Results (41)

Search Parameters:
Keywords = average neighbor percentage

21 pages, 874 KiB  
Article
Explainable Use of Foundation Models for Job Hiring
by Vishnu S. Pendyala, Neha Bais Thakur and Radhika Agarwal
Electronics 2025, 14(14), 2787; https://doi.org/10.3390/electronics14142787 - 11 Jul 2025
Viewed by 1149
Abstract
Automating candidate shortlisting is a non-trivial task that stands to benefit substantially from advances in artificial intelligence. We evaluate a suite of foundation models, including Llama 2, Llama 3, Mixtral, Gemma-2b, Gemma-7b, Phi-3 Small, Phi-3 Mini, Zephyr, and Mistral-7b, for their ability to predict hiring outcomes in both zero-shot and few-shot settings. Using only features extracted from applicants’ submissions, these models, on average, achieved an AUC above 0.5 in zero-shot settings. Providing a few examples similar to the job applicants, selected via a nearest neighbor search, improved the prediction rate marginally, indicating that the models perform competently even without task-specific fine-tuning. For Phi-3 Small and Mixtral, all reported performance metrics fell within the 95% confidence interval across evaluation strategies. Model outputs were interpreted quantitatively via post hoc explainability techniques and qualitatively through prompt engineering, revealing that decisions are largely attributable to knowledge acquired during pre-training. A task-specific MLP classifier trained solely on the provided dataset outperformed the strongest foundation model (Zephyr in the 5-shot setting) by only approximately 3 percentage points in accuracy, whereas all the foundation models outperformed the baseline model by more than 15 percentage points in F1 and recall, underscoring the competitive strength of general-purpose language models in the hiring domain. Full article
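
The few-shot setup described in this abstract, where labeled examples similar to each applicant are retrieved with a nearest neighbor search and placed in the prompt, can be sketched as follows. This is an illustrative sketch only: the feature vectors, labels, and prompt format are invented stand-ins, not the paper's actual data or prompts.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_few_shot_examples(candidate_vec, train_vecs, train_labels, k=5):
    """Return the k labeled applicants closest to the new candidate in feature space."""
    nn = NearestNeighbors(n_neighbors=k).fit(train_vecs)
    _, idx = nn.kneighbors(candidate_vec.reshape(1, -1))
    return [(train_vecs[i], train_labels[i]) for i in idx[0]]

# Toy feature vectors (e.g., years of experience, skill-match score) -- illustrative only.
train_vecs = np.array([[1.0, 0.2], [5.0, 0.9], [4.5, 0.8], [0.5, 0.1]])
train_labels = ["reject", "hire", "hire", "reject"]

# Retrieve the 2 most similar past applicants and format them as prompt shots.
examples = select_few_shot_examples(np.array([4.8, 0.85]), train_vecs, train_labels, k=2)
prompt_shots = "\n".join(f"Features: {v.tolist()} -> Outcome: {y}" for v, y in examples)
```

The retrieved shots would then be prepended to the model's prompt before asking for a prediction on the new candidate.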

16 pages, 2124 KiB  
Article
Missing Data in Orthopaedic Clinical Outcomes Research: A Sensitivity Analysis of Imputation Techniques Utilizing a Large Multicenter Total Shoulder Arthroplasty Database
by Kevin A. Hao, Terrie Vasilopoulos, Josie Elwell, Christopher P. Roche, Keegan M. Hones, Jonathan O. Wright, Joseph J. King, Thomas W. Wright, Ryan W. Simovitch and Bradley S. Schoch
J. Clin. Med. 2025, 14(11), 3829; https://doi.org/10.3390/jcm14113829 - 29 May 2025
Cited by 1 | Viewed by 478
Abstract
Background: When missing data are present in clinical outcomes studies, complete-case analysis (CCA) is often performed, whereby patients with missing data are excluded. While simple, CCA may impart selection bias and reduce statistical power, leading to erroneous statistical results in some cases. However, there exist more rigorous statistical approaches, such as single and multiple imputation, which approximate the associations that would have been present in a full dataset and preserve the study’s power. The purpose of this study is to evaluate how statistical results differ when analyses are performed after CCA versus after imputation. Methods: This simulation study analyzed a sample dataset consisting of 2204 shoulders with complete datapoints from a larger multicenter total shoulder arthroplasty database. From the sampled dataset of demographics, surgical characteristics, and clinical outcomes, we created five test datasets, ranging from 100 to 2000 shoulders, and simulated 10–50% missingness in the postoperative American Shoulder and Elbow Surgeons (ASES) score and range of motion in four planes in missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR) patterns. Missingness in outcomes was remedied using CCA, three single imputation techniques, and two multiple imputation techniques. The imputation performance was evaluated relative to the native complete dataset using the root mean squared error (RMSE) and the mean absolute percentage error (MAPE). We also compared the mean and standard deviation (SD) of the postoperative ASES score and the results of multivariable linear and logistic regression to understand the effects of imputation on the study results.
Results: The average overall RMSE and MAPE were similar for MCAR (22.6 and 27.2%) and MAR (19.2 and 17.7%) missingness patterns, but were substantially poorer for NMAR (37.5 and 79.2%); the sample size and the percentage of data missingness minimally affected RMSE and MAPE. Aggregated mean postoperative ASES scores were within 5% of the true value when missing data were remedied with CCA, and all candidate imputation methods for nearly all ranges of sample size and data missingness when data were MCAR or MAR, but not when data were NMAR. When data were MAR, CCA resulted in overestimates of the SD. When data were MCAR or MAR, the accuracy of the regression estimate (β or OR) and its corresponding 95% CI varied substantially based on the sample size and proportion of missing data for multivariable linear regression, but not logistic regression. When data were MAR, the width of the 95% CI was up to 300% larger when CCA was used, whereas most imputation methods maintained the width of the 95% CI within 50% of the true value. Single imputation with the k-nearest neighbor (kNN) method and multiple imputation with predictive mean matching (MICE-PMM) best reproduced point estimates and intervariable relationships resembling the native dataset. Availability of correlated outcome scores improved the RMSE, MAPE, accuracy of the mean postoperative ASES score, and multivariable linear regression model estimates. Conclusions: Complete-case analysis can introduce selection bias when data are MAR and reduces statistical power, resulting in loss of precision (i.e., expansion of the 95% CI) and predisposition to false-negative findings. Our data demonstrate that imputation can reliably reproduce missing clinical data and generate accurate population estimates that closely resemble results derived from native primary shoulder arthroplasty datasets (i.e., prior to simulated data missingness).
Further study of the use of imputation in clinical database research is critical, as the use of CCA may lead to different conclusions in comparison to more rigorous imputation approaches. Full article
(This article belongs to the Section Orthopedics)
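
A minimal sketch of the kNN single-imputation strategy evaluated above, using scikit-learn's KNNImputer on synthetic data. The variable names, the simulated correlation between a correlated outcome score and the ASES score, and the 30% MCAR rate are assumptions for illustration; the actual shoulder arthroplasty database is not reproduced here.

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
# Synthetic stand-in for a correlated outcome score (x) and the postoperative
# ASES score (y); the linear relationship is invented for this sketch.
n = 500
x = rng.normal(70, 10, n)
y = 0.8 * x + rng.normal(0, 5, n)
data = np.column_stack([x, y])

# Simulate 30% MCAR missingness in the outcome column.
mask = rng.random(n) < 0.3
incomplete = data.copy()
incomplete[mask, 1] = np.nan

# Complete-case analysis simply drops the incomplete rows; kNN imputation
# instead fills each gap using the k most similar rows on the observed column.
cca_mean = np.nanmean(incomplete[:, 1])
imputed = KNNImputer(n_neighbors=5).fit_transform(incomplete)

# Evaluate imputation against the known true values, as the simulation does.
rmse = np.sqrt(np.mean((imputed[mask, 1] - data[mask, 1]) ** 2))
```

Because the missingness here is MCAR, both approaches recover the mean well; the study's point is that their behavior diverges under MAR and NMAR patterns.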

21 pages, 7393 KiB  
Article
Exploring Cooling Effects of Land Cover Type Local Climate Zones in Relation to Built-Up Areas: A County Perspective
by Yihui Liu, Jing Xu, Shiqiang Sun, Tianyu Li and Jianfei Cao
Atmosphere 2025, 16(2), 194; https://doi.org/10.3390/atmos16020194 - 8 Feb 2025
Viewed by 972
Abstract
Evaluating cooling effects within local climate zones (LCZs) is vital for urban planning, especially in countering urban heat islands and enhancing thermal comfort. While prior research has primarily focused on the influence of land cover type LCZs on land surface temperature in large cities, the role of LCZ characteristics across diverse urban environments in smaller regions remains insufficiently explored. This study investigates the cooling effects of land cover type LCZs within the urban built-up areas at the county level using ECOSTRESS land surface temperature data. We utilized a random forest regression model to assess the impact of LCZ factors—namely, landscape shape index, percentage of landscape, and nearest neighbor distance among six built type LCZs and three land cover type LCZs—on cooling effects. The key findings are as follows: (1) Densely constructed LCZs demonstrated higher land surface temperature due to increased heat retention, with LCZ2 showing the highest mean land surface temperature (37.64 °C), whereas land cover type LCZs, particularly LCZG, recorded significantly lower mean land surface temperature (31.86 °C). (2) Land cover type LCZs were crucial in reducing land surface temperature, with LCZG presenting the most significant cooling effects, ranging from 0.31 to 79.43. LCZA and LCZD displayed comparable cooling effects, with average values of 6.54 and 6.46, respectively. (3) The landscape shape index of the land cover type LCZs themselves contributed most significantly to cooling effects. The percentage of landscape and landscape shape index of LCZ3 and LCZ6 notably influenced cooling effects in LCZA and LCZD, while LCZ5 predominantly affected LCZG. (4) The interactions between key factors and cooling effects were found to be complex and nonlinear. 
U-shaped effects were observed (e.g., central percentage of landscape affecting LCZA and LCZD, and percentage of landscape in LCZ6 affecting LCZG), alongside inverted U-shaped effects (e.g., landscape shape index in LCZ6 affecting LCZG) and combined effects (e.g., landscape shape index of LCZ3 affecting LCZA). The results emphasize the significance of maintaining contiguous and well-structured land cover type LCZs to optimize cooling effects. Full article
(This article belongs to the Section Biometeorology and Bioclimatology)
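
The random forest regression step, assessing how landscape factors drive cooling effects, can be sketched with permutation importance on synthetic stand-ins for the LCZ metrics. The feature distributions and the U-shaped response below are invented for illustration and do not reproduce the study's fitted model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(3)
# Hypothetical landscape metrics: landscape shape index, percentage of
# landscape (PLAND), and nearest-neighbor distance (ENN).
shape_idx = rng.uniform(1, 3, 400)
pland = rng.uniform(0, 100, 400)
enn = rng.uniform(10, 200, 400)
X = np.column_stack([shape_idx, pland, enn])

# Invented response with a U-shaped PLAND term, mimicking finding (4);
# ENN is deliberately irrelevant here.
cooling = (pland - 50) ** 2 / 250 + 2 * shape_idx + rng.normal(0, 0.3, 400)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, cooling)
imp = permutation_importance(rf, X, cooling, n_repeats=5, random_state=0)
ranked = np.argsort(imp.importances_mean)[::-1]  # most influential factor first
```

Permutation importance ranks the factors; partial dependence plots would then expose the U-shaped and inverted U-shaped responses the abstract describes.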

19 pages, 7037 KiB  
Article
An Artificial Intelligence Home Monitoring System That Uses CNN and LSTM and Is Based on the Android Studio Development Platform
by Guo-Ming Sung, Sachin D. Kohale, Te-Hui Chiang and Yu-Jie Chong
Appl. Sci. 2025, 15(3), 1207; https://doi.org/10.3390/app15031207 - 24 Jan 2025
Cited by 1 | Viewed by 970
Abstract
This paper developed an artificial intelligence home environment monitoring system using the Android Studio development platform. A database was constructed within a server to store sensor data. The proposed system comprises multiple sensors, a message queueing telemetry transport (MQTT) communication protocol, cloud data storage and computation, and end device control. A mobile application was developed using MongoDB, a document-oriented NoSQL database management system written in C++ that is well suited to processing large volumes of sensor data. The k-nearest neighbor (KNN) algorithm was used to impute missing data. Node-RED development software was used within the server as a data-receiving, storage, and computing environment that is convenient to manage and maintain. Data on indoor temperature, humidity, and carbon dioxide concentrations are transmitted to a mobile phone application through the MQTT communication protocol for real-time display and monitoring. The system can control a fan or warning light through the mobile application to maintain ambient temperature inside the house and to warn users of emergencies. A long short-term memory (LSTM) model and a convolutional neural network (CNN) model were used to predict indoor temperature, humidity, and carbon dioxide concentrations. Average relative errors in the predicted values of humidity and carbon dioxide concentration were approximately 0.0415% and 0.134%, respectively, when missing data were imputed with the KNN algorithm. For indoor temperature prediction, the LSTM model had a mean absolute percentage error of 0.180% and a root-mean-squared error of 0.042 °C. The CNN–LSTM model had a mean absolute percentage error of 1.370% and a root-mean-squared error of 0.117 °C. Full article
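
The two error metrics quoted above, mean absolute percentage error and root-mean-squared error, can be computed as follows. The temperature readings in the example are illustrative, not the paper's data.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    """Root-mean-squared error, in the units of the measurement."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative indoor-temperature readings (degC) and model predictions.
actual = [24.0, 24.5, 25.0, 25.5]
predicted = [24.1, 24.4, 25.0, 25.6]
```

Note that MAPE is scale-free (useful across temperature, humidity, and CO2 channels), while RMSE stays in physical units, which is why the abstract reports it in °C.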

16 pages, 4518 KiB  
Article
Inversion of Aerosol Chemical Composition in the Beijing–Tianjin–Hebei Region Using a Machine Learning Algorithm
by Baojiang Li, Gang Cheng, Chunlin Shang, Ruirui Si, Zhenping Shao, Pu Zhang, Wenyu Zhang and Lingbin Kong
Atmosphere 2025, 16(2), 114; https://doi.org/10.3390/atmos16020114 - 21 Jan 2025
Viewed by 1069
Abstract
Aerosols and their chemical composition exert an influence on the atmospheric environment, global climate, and human health. However, obtaining the chemical composition of aerosols with high spatial and temporal resolution remains a challenging issue. In this study, using the NR-PM1 data collected in the Beijing area from 2012 to 2013, we found that the annual average concentration was 41.32 μg·m−3, with organics accounting for the largest share (49.3% of NR-PM1), followed by nitrates, sulfates, and ammonium. We then established models of aerosol chemical composition based on a machine learning algorithm. By comparing the inversion accuracies of the single models, namely the MLR (Multivariable Linear Regression), SVR (Support Vector Regression), RF (Random Forest), KNN (K-Nearest Neighbor), and LightGBM (Light Gradient Boosting Machine) models, with that of the combined model (CM) after selecting the optimal model, we found that although the KNN model was the most accurate of the single models, the CM model was more accurate still. By applying the CM model to the spatially and temporally matched AOD (aerosol optical depth) data and meteorological data of the Beijing–Tianjin–Hebei region, the spatial distribution of the annual average concentrations of the four components was obtained. The areas with higher concentrations are mainly situated in the southwest of Beijing, and the annual average concentrations of the four components in Beijing’s southwest are 28 μg·m−3, 7 μg·m−3, 8 μg·m−3, and 15 μg·m−3 for organics, sulfates, ammonium, and nitrates, respectively. This study not only provides new methodological ideas for obtaining aerosol chemical composition concentrations based on satellite remote sensing data but also provides a data foundation and theoretical support for the formulation of atmospheric pollution prevention and control policies. Full article
(This article belongs to the Special Issue Atmospheric Pollution in Highly Polluted Areas)
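
Combining single regressors into a CM by averaging their predictions can be sketched with scikit-learn's VotingRegressor. The synthetic predictors below stand in for the AOD and meteorological inputs, and simple prediction averaging stands in for the paper's combination rule, which may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
# Synthetic stand-ins for AOD + meteorology predictors and one component
# concentration; the functional form is invented for this sketch.
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] ** 2 + rng.normal(0, 0.3, 300)

# A combined model that averages KNN, RF, and MLR predictions.
cm = VotingRegressor([
    ("knn", KNeighborsRegressor(n_neighbors=7)),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("mlr", LinearRegression()),
]).fit(X[:200], y[:200])

pred = cm.predict(X[200:])
score = cm.score(X[200:], y[200:])  # R^2 on held-out data
```

Averaging tends to cancel uncorrelated errors of the single models, which is one plausible reason a CM can beat even its best member, as the abstract reports for KNN.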

22 pages, 2104 KiB  
Article
Burrowing Owls Require Mutualist Species and Ample Interior Habitat Space
by K. Shawn Smallwood and Michael L. Morrison
Diversity 2024, 16(9), 590; https://doi.org/10.3390/d16090590 - 19 Sep 2024
Viewed by 1687
Abstract
Mitigating habitat loss of western burrowing owls (Athene cunicularia hypugaea) often involves relocation from California ground squirrel (Otospermophilus beecheyi) burrows to offsite nest boxes. Naval Air Station Lemoore (NASL), Kings and Fresno counties, California, initiated this approach to displace a regionally important population from airfield grasslands. We examined monitoring data of burrowing owls and fossorial mammals at NASL to assess mitigation options. Occupied nests increased by 33 (61%), with 47 nest box installations in 1997–2001, peaked at 87 in 1999, then declined by 50 through 2013. Although ≥13 nest boxes were occupied in 2000, none were occupied in 2003–2013. Within a 43.1 ha isolated grassland monitored for 13 years, nest site reuse in ground squirrel burrows averaged only 17% between any 2 consecutive years. Compared to the average density across grassland study areas, ground squirrel burrow systems/ha numbered 43% higher within 60 m of occupied nests and non-breeding-season burrows. Vegetation clearing to restore kangaroo rat (Dipodomys n. nitratoides) habitat preceded a 7.4-fold increase in ground squirrel burrow systems and a 4-fold increase in occupied nests, but drought-induced extirpation of ground squirrels eliminated occupied nests from the 43.1 ha grassland study area. Ground cover near occupied nests averaged 58% of the mean vegetation height and 67% of the mean percentage of bare ground in the field. Both nest sites and non-breeding-season burrows occurred >60 m interior to field edges 1.4 times more than expected. Non-breeding-season burrows averaged 328 m from same-year nest sites, and only 7% of non-breeding-season burrows were also used as nest sites. 
Mitigation of habitat loss could be made more effective by fostering natural burrow construction by fossorial mammals on patches of short-stature vegetation that are sufficiently expansive to support breeding colonies of ≥12 pairs averaging ≥60 m from the field’s edge, with a separation between non-breeding-season burrows and nest burrows minimally equal to mean nearest-neighbor distances among nests. Full article
(This article belongs to the Section Biodiversity Loss & Dynamics)

21 pages, 2362 KiB  
Article
Prediction of Acceleration Amplification Ratio of Rocking Foundations Using Machine Learning and Deep Learning Models
by Sivapalan Gajan
Appl. Sci. 2023, 13(23), 12791; https://doi.org/10.3390/app132312791 - 29 Nov 2023
Cited by 1 | Viewed by 1596
Abstract
Experimental results reveal that rocking shallow foundations reduce earthquake-induced force and flexural displacement demands transmitted to structures and can be used as an effective geotechnical seismic isolation mechanism. This paper presents data-driven predictive models for maximum acceleration transmitted to structures founded on rocking shallow foundations during earthquake loading. Results from base-shaking experiments on rocking foundations have been utilized for the development of artificial neural network regression (ANN), k-nearest neighbors regression, support vector regression, random forest regression, adaptive boosting regression, and gradient boosting regression models. Acceleration amplification ratio, defined as the maximum acceleration at the center of gravity of a structure divided by the peak ground acceleration of the earthquake, is considered as the prediction parameter. For five out of six models developed in this study, the overall mean absolute percentage error in predictions in repeated k-fold cross validation tests varies between 0.128 and 0.145, with the ANN model being the most accurate and most consistent. The cross validation mean absolute error in predictions of all six models varies between 0.08 and 0.1, indicating that the maximum acceleration of structures supported by rocking foundations can be predicted within an average error limit of 8% to 10% of the peak ground acceleration of the earthquake. Full article
(This article belongs to the Special Issue The Application of Machine Learning in Geotechnical Engineering)
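
The repeated k-fold cross-validation protocol with MAPE scoring used above can be sketched as follows. A k-nearest neighbors regressor on synthetic data stands in for the six models; the features and the target relationship are invented, and scikit-learn's MAPE is a fraction (0.13 means 13%), matching the 0.128–0.145 range quoted in the abstract.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(2)
# Hypothetical rocking-system features -> amplification ratio in (0, 1].
X = rng.uniform(0, 1, size=(200, 3))
y = 0.3 + 0.5 * X[:, 0] + rng.normal(0, 0.05, 200)

# 5 folds repeated 3 times = 15 held-out evaluations per model.
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(KNeighborsRegressor(n_neighbors=5), X, y,
                         scoring="neg_mean_absolute_percentage_error", cv=cv)
mean_mape = -scores.mean()  # sklearn negates error scorers; flip the sign back
```

Repeating the k-fold split with different shuffles, as done here, is what lets the study report both the average error and the consistency of each model.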

16 pages, 7844 KiB  
Technical Note
A Modified Version of the Direct Sampling Method for Filling Gaps in Landsat 7 and Sentinel 2 Satellite Imagery in the Coastal Area of Rhone River
by Lokmen Farhat, Ioannis Manakos, Georgios Sylaios and Chariton Kalaitzidis
Remote Sens. 2023, 15(21), 5122; https://doi.org/10.3390/rs15215122 - 26 Oct 2023
Cited by 3 | Viewed by 2052
Abstract
Earth Observation (EO) data, such as Landsat 7 (L7) and Sentinel 2 (S2) imagery, are often used to monitor the state of natural resources all over the world. However, this type of data tends to suffer from high cloud cover percentages during rainfall/snow seasons. This has led researchers to focus on developing algorithms for filling gaps in optical satellite imagery. The present work proposes two modifications to an existing gap-filling approach known as the Direct Sampling (DS) method. These modifications refer to ensuring the algorithm starts filling unknown pixels (UPs) that have a specified minimum number of known neighbors (Nx) and to reducing the search area to pixels that share similar reflectance as the Nx of the selected UP. Experiments were performed on images acquired from coastal water bodies in France. The validation of the modified gap-filling approach was performed by imposing artificial gaps on originally gap-free images and comparing the simulated images with the real ones. Results indicate that satisfactory performance can be achieved for most spectral bands. Moreover, it appears that the bi-layer (BL) version of the algorithm tends to outperform the uni-layer (UL) version in terms of overall accuracy. For instance, in the case of B04 of an L7 image with a cloud percentage of 27.26%, accuracy values for UL and BL simulations are 64.05% and 79.61%, respectively. Furthermore, it has been confirmed that the introduced modifications have indeed helped in improving the overall accuracy and in reducing the processing time. In fact, the implementation of a conditional filling path (minNx = 4) and a targeted search (n2 = 200) when filling cloud gaps in L7 imagery contributed to an average increase in accuracy of around 35.06% and an average gain in processing time of around 78.18%, respectively. Full article
(This article belongs to the Special Issue Geographic Data Analysis and Modeling in Remote Sensing)
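
The first modification, starting the filling path at unknown pixels that have at least minNx known neighbors, can be sketched on a boolean mask as below. This is a simplified stand-in: the full Direct Sampling algorithm also performs the training-image pattern search, which is omitted here.

```python
import numpy as np

def known_neighbor_counts(known):
    """Count known 8-neighbors for every pixel of a boolean 'known' mask."""
    k = known.astype(int)
    padded = np.pad(k, 1)  # zero border so edge pixels see fewer neighbors
    counts = sum(np.roll(np.roll(padded, di, 0), dj, 1)
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if (di, dj) != (0, 0))[1:-1, 1:-1]
    return counts

def next_fill_candidates(known, min_nx=4):
    """Unknown pixels eligible for filling now: >= min_nx known neighbors."""
    counts = known_neighbor_counts(known)
    return np.argwhere(~known & (counts >= min_nx))
```

Filling such well-constrained pixels first, then recomputing the mask, yields the conditional filling path (minNx = 4) that the abstract credits with part of the accuracy gain.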

24 pages, 838 KiB  
Article
Sarcopenia Prediction for Elderly People Using Machine Learning: A Case Study on Physical Activity
by Minje Seok and Wooseong Kim
Healthcare 2023, 11(9), 1334; https://doi.org/10.3390/healthcare11091334 - 5 May 2023
Cited by 17 | Viewed by 18202
Abstract
Sarcopenia is a well-known age-related disease that can lead to musculoskeletal disorders and chronic metabolic syndromes, such as sarcopenic obesity. Numerous studies have researched the relationship between sarcopenia and various risk factors, leading to the development of predictive models based on these factors. In this study, we explored the impact of physical activity (PA) in daily life and obesity on sarcopenia prediction. PA is easier to measure using personal devices, such as smartphones and watches, or lifelogs, than other factors that require medical equipment and examination. To demonstrate the feasibility of sarcopenia prediction using PA, we trained various machine learning models, including gradient boosting machine (GBM), xgboost (XGB), lightgbm (LGB), catboost (CAT), logistic regression, support vector classifier, k-nearest neighbors, random forest (RF), multi-layer perceptron, and deep neural network (DNN), using data samples from the Korea National Health and Nutrition Examination Survey. Among the models, the DNN achieved the highest average accuracy, 81%, with PA features across all data combinations, and the accuracy increased up to 90% with the addition of obesity information, such as total fat mass and fat percentage. Considering the difficulty of measuring obesity features, when adding waist circumference to the PA features, the DNN recorded the highest accuracy of 84%. This model accuracy could be improved by using separate training sets according to gender. When evaluated with a range of metrics, GBM, XGB, LGB, CAT, RF, and DNN demonstrated significant predictive performance using only PA features including waist circumference, with AUC values at least around 0.85 and often approaching or exceeding 0.9. Using SHAP analysis, we also identified the key features of a high-performing model, such as the quantified PA value and metabolic equivalent score, in addition to simple obesity measures such as body mass index (BMI) and waist circumference. Full article

19 pages, 4546 KiB  
Article
Development of Chloroplast Microsatellite Markers and Evaluation of Genetic Diversity and Population Structure of Cutleaf Groundcherry (Physalis angulata L.) in China
by Shangguo Feng, Kaili Jiao, Zhenhao Zhang, Sai Yang, Yadi Gao, Yanyun Jin, Chenjia Shen, Jiangjie Lu, Xiaori Zhan and Huizhong Wang
Plants 2023, 12(9), 1755; https://doi.org/10.3390/plants12091755 - 25 Apr 2023
Cited by 8 | Viewed by 1938
Abstract
Cutleaf groundcherry (Physalis angulata L.), an annual plant containing a variety of active ingredients, has great medicinal value. However, studies on the genetic diversity and population structure of P. angulata are limited. In this study, we developed chloroplast microsatellite (cpSSR) markers and applied them to evaluate the genetic diversity and population structure of P. angulata. A total of 57 cpSSRs were identified from the chloroplast genome of P. angulata. Among all cpSSR loci, mononucleotide markers were the most abundant (68.24%), followed by tetranucleotide (12.28%), dinucleotide (10.53%), and trinucleotide (8.77%) markers. In total, 30 newly developed cpSSR markers with rich polymorphism and good stability were selected for further genetic diversity and population structure analyses. These cpSSRs amplified a total of 156 alleles, 132 (84.62%) of which were polymorphic. The percentage of polymorphic alleles and the average polymorphic information content (PIC) value of the cpSSRs were 81.29% and 0.830, respectively. Population genetic diversity analysis indicated that the average observed number of alleles (Na), effective number of alleles (Ne), Nei’s gene diversity (h), and Shannon’s information index (I) of 16 P. angulata populations were 1.3161, 1.1754, 0.1023, and 0.1538, respectively. Moreover, unweighted pair group method with arithmetic mean (UPGMA), neighbor-joining, principal coordinate, and STRUCTURE analyses indicated that 203 P. angulata individuals from 16 populations were grouped into four clusters. An analysis of molecular variance (AMOVA) illustrated considerable genetic variation among populations, while the gene flow (Nm) value (0.2324) indicated a low level of gene flow among populations. Our study not only provides a batch of efficient genetic markers for research on P. angulata but also lays an important foundation for the protection and genetic breeding of P. angulata resources. Full article
(This article belongs to the Special Issue Solanaceae Genetic Resources: Genomics, Phenomics and Breeding)
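
The diversity indices reported above follow standard population-genetics definitions: Nei's gene diversity h = 1 − Σp² and Shannon's information index I = −Σp ln p, summed over allele frequencies at a locus. A minimal sketch (the example frequencies are illustrative, not the study's data):

```python
import math

def nei_h(freqs):
    """Nei's gene diversity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def shannon_i(freqs):
    """Shannon's information index: -sum(p * ln p) over allele frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

# Illustrative locus with two alleles at frequencies 0.9 / 0.1.
freqs = [0.9, 0.1]
```

Averaging these per-locus values across loci and populations yields summary figures like the h = 0.1023 and I = 0.1538 quoted in the abstract.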

22 pages, 6414 KiB  
Article
Performance of Statistical and Intelligent Methods in Estimating Rock Compressive Strength
by Xuesong Zhang, Farag M. A. Altalbawy, Tahani A. S. Gasmalla, Ali Hussein Demin Al-Khafaji, Amin Iraji, Rahmad B. Y. Syah and Moncef L. Nehdi
Sustainability 2023, 15(7), 5642; https://doi.org/10.3390/su15075642 - 23 Mar 2023
Cited by 9 | Viewed by 2813
Abstract
This research was conducted to forecast the uniaxial compressive strength (UCS) of rocks via the random forest, artificial neural network, Gaussian process regression, support vector machine, K-nearest neighbor, adaptive neuro-fuzzy inference system, simple regression, and multiple linear regression approaches. For this purpose, geo-mechanical and petrographic characteristics of sedimentary rocks in southern Iran were measured. The effect of petrography on geo-mechanical characteristics was assessed. The carbonate and sandstone samples were classified as mudstone to grainstone and calc-litharenite, respectively. Due to the shallow depth of the studied mines and the low amount of quartz minerals in the samples, the rock bursting phenomenon does not occur in these mines. To develop UCS predictor models, porosity, point load index, water absorption, P-wave velocity, and density were considered as inputs. Using variance accounted for, mean absolute percentage error, root-mean-square error, determination coefficient (R2), and performance index (PI), the efficiency of the methods was evaluated. Analysis of model criteria using multiple linear regression allowed for the development of a user-friendly equation, which proved to have adequate accuracy. All intelligent methods (with R2 > 90%) had excellent accuracy for estimating UCS. The percentage difference between the average of all six intelligent methods and the measured value was +0.28%. Among the methods compared, the support vector machine with a radial basis function kernel performed best in predicting UCS (R2 = 0.99 and PI = 1.92), outperforming all the other methods investigated. Full article
(This article belongs to the Special Issue Advances in Rock Mechanics and Geotechnical Engineering)
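
Variance accounted for (VAF), one of the evaluation criteria above, is commonly computed as 100 × (1 − Var(error)/Var(measured)); a sketch with illustrative UCS values follows. The performance index (PI) has several competing definitions in this literature, so it is deliberately not reproduced here.

```python
import numpy as np

def vaf(y_true, y_pred):
    """Variance accounted for, in percent: 100 * (1 - Var(error) / Var(y_true))."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * (1.0 - np.var(y_true - y_pred) / np.var(y_true))

# Illustrative UCS values (MPa) and hypothetical model predictions.
measured = [50.0, 80.0, 65.0, 95.0]
predicted = [52.0, 78.0, 66.0, 93.0]
```

A VAF near 100% means the residual variance is negligible relative to the natural spread of the measured strengths, complementing R2 as a goodness-of-fit check.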

12 pages, 1614 KiB  
Article
Investigating Students’ Pre-University Admission Requirements and Their Correlation with Academic Performance for Medical Students: An Educational Data Mining Approach
by Ayman Qahmash, Naim Ahmad and Abdulmohsen Algarni
Brain Sci. 2023, 13(3), 456; https://doi.org/10.3390/brainsci13030456 - 8 Mar 2023
Cited by 6 | Viewed by 2964
Abstract
Medical education is one of the most sought-after disciplines for its prestigious and noble status. Institutions endeavor to identify admissions criteria to register bright students who can handle the complexity of medical training and become competent clinicians. This study aims to apply statistical and educational data mining approaches to study the relationship between pre-admission criteria and student performance in medical programs at a public university in Saudi Arabia. The present study is a retrospective cohort study conducted at the College of Computer Science, King Khalid University, Abha, Kingdom of Saudi Arabia between February and November 2022. The current pre-admission criterion is the admission score taken as the weighted average of high school percentage (HSP), general aptitude test (GAT) and standard achievement admission test (SAAT), with respective weights of 0.3, 0.3 and 0.4. Regression and optimization techniques have been applied to identify weights that better fit the data. Five classification techniques (Decision Tree, Neural Network, Random Forest, Naïve Bayes and K-Nearest Neighbors) are employed to develop models to predict student performance. The regression and optimization analyses show that the optimized weights of HSP, GAT and SAAT are 0.3, 0.2 and 0.5, respectively. The results show that the performance of the models improves with admission scores based on the optimized weights. Further, the Neural Network and Naïve Bayes techniques outperform the other techniques. Firstly, this study proposes to revise the weights of HSP, GAT and SAAT to 0.3, 0.2 and 0.5, respectively. Secondly, as the evaluation metrics of the models remain below 0.75, this study proposes to identify additional student features for calculating admission scores to select ideal candidates for medical programs. Full article
(This article belongs to the Special Issue Intelligent Neural Systems for Solving Real Problems)
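The admission score described in the abstract is a simple weighted average; a minimal sketch (the function name and example marks are illustrative, not taken from the study):

```python
def admission_score(hsp, gat, saat, weights=(0.3, 0.3, 0.4)):
    """Weighted average of high school percentage (HSP), general aptitude
    test (GAT) and standard achievement admission test (SAAT) scores."""
    w_hsp, w_gat, w_saat = weights
    return w_hsp * hsp + w_gat * gat + w_saat * saat

# Current weighting (0.3, 0.3, 0.4) versus the study's proposed
# optimized weighting (0.3, 0.2, 0.5) for the same hypothetical applicant.
current = admission_score(90, 80, 85)
optimized = admission_score(90, 80, 85, weights=(0.3, 0.2, 0.5))
```

Shifting weight from the GAT to the SAAT favors applicants with stronger achievement-test results, which is the study's first recommendation.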
11 pages, 958 KiB  
Article
How the Loss of Second Molars Corresponds with the Presence of Adjacent Third Molars in Chinese Adults: A Retrospective Study
by Li-Juan Sun, Yang Yang, Zhi-Bang Li, Yi Tian, Hong-Lei Qu, Ying An, Bei-Min Tian and Fa-Ming Chen
J. Clin. Med. 2022, 11(23), 7194; https://doi.org/10.3390/jcm11237194 - 3 Dec 2022
Cited by 9 | Viewed by 1767
Abstract
Third molars (M3s) can increase the pathological risks of neighboring second molars (M2s). However, whether the presence of an M3 affects M2 loss remains unknown. This retrospective study aimed to reveal the reasons for M2 loss and how M2 loss relates to neighboring M3s. The medical records and radiographic images of patients with removed M2(s) were reviewed to analyze why the teeth were extracted and whether those reasons were related to adjacent M3s. Ultimately, 800 patients with 908 removed M2s were included. Of the included quadrants, the 526 with M3s were termed the M3 (+) group, and the other 382 without M3s were termed the M3 (−) group. The average age of patients was 52.4 ± 14.8 years in the M3 (+) group and 56.7 ± 14.9 years in the M3 (−) group, a statistically significant difference (p < 0.001). Of the 908 M2s, 433 (47.7%) were removed due to caries and their sequelae and 300 (33.0%) due to periodontal diseases. Meanwhile, 14.4% of the M2s with adjacent M3s were removed due to distal caries and periodontitis closely related to the neighboring M3s; this percentage was much lower (1.8%) when M3s were absent. Additionally, 42.2% of M3s were removed simultaneously with their neighboring M2s. The presence of M3s, regardless of impaction status, was associated with an earlier loss of their neighboring M2s. Full article
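The headline proportions in the abstract follow directly from the raw counts; a quick arithmetic check (counts taken from the abstract itself):

```python
removed_m2 = 908     # total removed second molars
caries = 433         # removed due to caries and sequelae
periodontal = 300    # removed due to periodontal diseases

pct_caries = round(100 * caries / removed_m2, 1)
pct_periodontal = round(100 * periodontal / removed_m2, 1)
```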
35 pages, 43569 KiB  
Article
Deep Learning Approach for Automatic Segmentation and Functional Assessment of LV in Cardiac MRI
by Anupama Bhan, Parthasarathi Mangipudi and Ayush Goyal
Electronics 2022, 11(21), 3594; https://doi.org/10.3390/electronics11213594 - 3 Nov 2022
Cited by 6 | Viewed by 2687
Abstract
The early diagnosis of cardiovascular diseases (CVDs) can effectively prevent them from worsening. The source of the disease can be detected through analysis with cardiac magnetic resonance imaging (CMRI). The segmentation of the left ventricle (LV) in CMRI images plays an indispensable role in the diagnosis of CVDs. However, automated segmentation of the LV is a challenging task, as it can be confused with neighboring regions in the cardiac MRI. Deep learning models are effective in performing such complex segmentation because of high-performing convolutional neural networks (CNNs). However, since segmentation using a CNN involves pixel-level classification of the image, it lacks the contextual information that is highly desirable in analyzing medical images. In this research, we propose a modified U-Net model to accurately segment the LV using context-enabled segmentation. The proposed model performs automatic segmentation and quantitative assessment of the LV, achieving state-of-the-art accuracy by effectively tuning hyperparameters such as batch size, batch normalization, activation function, loss function and dropout. Our method demonstrated statistically significant performance at the endo- and epicardial walls, with Dice scores of 0.96 and 0.93, respectively, an average perpendicular distance of 1.73 and a percentage of good contours of 96.22. Furthermore, a high positive correlation of 0.98 was obtained between the clinical parameters, such as ejection fraction, end-diastolic volume (EDV) and end-systolic volume (ESV), and the gold standard. Full article
(This article belongs to the Special Issue Medical Image Processing Using AI)
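The Dice score reported above is a standard overlap measure between a predicted and a ground-truth mask, 2|A∩B| / (|A| + |B|); a minimal sketch over flat binary masks (the mask values are illustrative):

```python
def dice_score(pred, truth):
    """Dice coefficient between two flat binary masks."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * intersection / total if total else 1.0

# Two toy 6-pixel masks agreeing on 2 of the 3 foreground pixels each.
pred = [1, 1, 0, 1, 0, 0]
truth = [1, 0, 0, 1, 1, 0]
score = dice_score(pred, truth)  # 2*2 / (3+3)
```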
19 pages, 1540 KiB  
Article
Concrete Strength Prediction Using Machine Learning Methods CatBoost, k-Nearest Neighbors, Support Vector Regression
by Alexey N. Beskopylny, Sergey A. Stel’makh, Evgenii M. Shcherban’, Levon R. Mailyan, Besarion Meskhi, Irina Razveeva, Andrei Chernil’nik and Nikita Beskopylny
Appl. Sci. 2022, 12(21), 10864; https://doi.org/10.3390/app122110864 - 26 Oct 2022
Cited by 66 | Viewed by 5547
Abstract
Currently, one of the topical areas of application of machine learning methods in the construction industry is the prediction of the mechanical properties of building materials. In the future, algorithms with elements of artificial intelligence will form the basis of systems for predicting the operational properties of products, structures, buildings and facilities, depending on the characteristics of the initial components and process parameters. Concrete production can be improved using artificial intelligence methods, in particular the development, training and application of special algorithms to determine the characteristics of the resulting concrete. The aim of the study was to develop and compare three machine learning algorithms, based on CatBoost gradient boosting, k-nearest neighbors and support vector regression, to predict the compressive strength of concrete using our accumulated empirical database, and ultimately to improve production processes in the construction industry. It has been established that artificial intelligence methods can be applied to determine the compressive strength of self-compacting concrete. Of the three machine learning algorithms, the smallest errors and the highest coefficient of determination were observed for the KNN algorithm: MAE was 1.97; MSE, 6.85; RMSE, 2.62; MAPE, 6.15; and the coefficient of determination R2, 0.99. The developed models showed an average absolute percentage error in the range 6.15–7.89% and can be successfully implemented in the production process and quality control of building materials, since they do not require serious computing resources. Full article
(This article belongs to the Special Issue Advance of Reinforced Concrete)
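The best-performing KNN model averages the strengths of the k most similar mixes; a pure-Python sketch of KNN regression plus the MAPE metric quoted above (the mix features and strength values are invented for illustration, not taken from the authors' database):

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    """Predict by averaging the targets of the k nearest training
    points under Euclidean distance."""
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    return sum(yi for _, yi in dists[:k]) / k

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical mixes: (cement kg/m^3, water-cement ratio) -> strength, MPa.
train_X = [(300, 0.50), (350, 0.45), (400, 0.40), (450, 0.35)]
train_y = [30.0, 38.0, 45.0, 52.0]
pred = knn_predict(train_X, train_y, (380, 0.42), k=3)
```

Because prediction is just a distance sort over the stored database, the method needs no training phase and very little computing power, consistent with the abstract's point about deployment in quality control.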
