Article

Efficiency of the Adjusted Binary Classification (ABC) Approach in Osteometric Sex Estimation: A Comparative Study of Different Linear Machine Learning Algorithms and Training Sample Sizes

1 Forensic Medicine and Clinical Toxicology, Faculty of Medicine, Alexandria University, Alexandria 21568, Egypt
2 Forensic Medicine and Clinical Toxicology Department, Faculty of Medicine, Misr University for Science and Technology, Giza 3236101, Egypt
3 University Department of Forensic Sciences, University of Split, Ruđera Boškovića 33, 21000 Split, Croatia
4 School of Medicine, University of Split, Šoltanska 2, 21000 Split, Croatia
5 Clinical Department for Pathology, Legal Medicine and Cytology, University Hospital Center Split, Spinčićeva 1, 21000 Split, Croatia
* Author to whom correspondence should be addressed.
Academic Editors: Maria Giovanna Belcastro and Marco Milella
Biology 2022, 11(6), 917; https://doi.org/10.3390/biology11060917
Received: 10 May 2022 / Revised: 5 June 2022 / Accepted: 9 June 2022 / Published: 15 June 2022
This study adopts a dynamic methodology to explore challenges to the practical application of the adjusted binary classification (ABC) approach. These challenges relate to unmodifiable characteristics of the data used in its development, such as intersexual variation (sexual dimorphism) of the variables, and to methodological factors, such as the selected classification algorithm and the sample size. The adequacy of a training dataset's size was judged relative to the classification performance in an independent test set. The study also addressed the choice of an optimal classifier. The results demonstrate that statistical modeling and machine learning techniques perform almost equally well in the univariate models. Differences are evident in the multivariate model, however, because the feature selection process retains a different number of variables for each classifier and because the training sample can be inadequate in size relative to the test set. The ABC approach is particularly useful when quick classification or prediction is required for making real-time forensic decisions.
The adjusted binary classification (ABC) approach was proposed to ensure that a binary classification model reaches a particular accuracy level. The present study evaluated the ABC approach for osteometric sex classification using multiple machine learning (ML) techniques: linear discriminant analysis (LDA), boosted generalized linear model (GLMB), support vector machine (SVM), and logistic regression (LR). We used 13 femoral measurements of 300 individuals from a modern Turkish population sample and split the data into two sets: training (n = 240) and testing (n = 60). Then, the five best-performing measurements were selected for training univariate models, while pools of these variables were used for the multivariate models. ML classifier type did not affect the performance of unadjusted models. The accuracy of univariate models was 82–87%, while that of multivariate models was 89–90%. After applying ABC to the cross-validation set, the accuracy and the positive and negative predictive values for uni- and multivariate models were ≥95%. Sex could be estimated for 28–75% of individuals using univariate models but with an obvious sexing bias, likely caused by different degrees of sexual dimorphism and between-group overlap. However, using multivariate models, we minimized the bias and properly classified 81–87% of individuals. A similar performance was also noted in the testing sample (except for FEB), with accuracies of 96–100%, and a proportion of classified individuals between 30% and 82% in univariate models, and between 90% and 91% in multivariate models. When considering different training sample sizes, we demonstrated that LR was the classifier most sensitive to limited sample sizes (n < 150), while GLMB was the most stable classifier.
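The trade-off described in the abstract, higher accuracy in exchange for leaving some individuals unclassified, can be illustrated with a minimal sketch. It assumes that the adjustment works by raising a posterior-probability cutoff until the accuracy among the classified individuals reaches the target; the abstract does not spell out the mechanism, so this is an assumption. The function name `abc_cutoff`, the single symmetric cutoff, and the toy posteriors are all hypothetical and are not taken from the study.

```python
from typing import List, Optional, Tuple


def abc_cutoff(
    posteriors: List[float],  # hypothetical P(male) from any fitted classifier
    labels: List[int],        # true sex, 1 = male, 0 = female
    target: float = 0.95,     # accuracy level the adjusted model should reach
) -> Optional[Tuple[float, float, float]]:
    """Find the lowest confidence cutoff that reaches the target accuracy.

    Returns (cutoff, accuracy among classified, proportion classified),
    or None if no cutoff reaches the target.
    Assumption: one symmetric cutoff for both sexes, for simplicity.
    """
    # Confidence of each prediction is the larger of P(male) and P(female).
    confidences = [max(p, 1 - p) for p in posteriors]
    for cutoff in sorted(set(confidences)):
        # Individuals below the cutoff are left unclassified.
        kept = [
            (p, y)
            for p, y, c in zip(posteriors, labels, confidences)
            if c >= cutoff
        ]
        correct = sum((p >= 0.5) == (y == 1) for p, y in kept)
        accuracy = correct / len(kept)
        if accuracy >= target:
            return cutoff, accuracy, len(kept) / len(posteriors)
    return None


# Toy data (invented for illustration): two low-confidence cases are wrong.
toy_posteriors = [0.96875, 0.9375, 0.875, 0.75, 0.625,
                  0.375, 0.25, 0.125, 0.0625, 0.03125]
toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

result = abc_cutoff(toy_posteriors, toy_labels)
print(result)
```

In this toy case the unadjusted classifier is right for 8 of 10 individuals. Raising the cutoff excludes the two least confident (and misclassified) cases, so the remaining 80% are all classified correctly. A fuller version would presumably set separate cutoffs per sex so that both the positive and negative predictive values meet the target, as the abstract reports for both.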
Keywords: machine learning algorithms; adjusted binary classification; osteometric sex estimation; optimal training sample size
MDPI and ACS Style

Attia, M.H.; Kholief, M.A.; Zaghloul, N.M.; Kružić, I.; Anđelinović, Š.; Bašić, Ž.; Jerković, I. Efficiency of the Adjusted Binary Classification (ABC) Approach in Osteometric Sex Estimation: A Comparative Study of Different Linear Machine Learning Algorithms and Training Sample Sizes. Biology 2022, 11, 917. https://doi.org/10.3390/biology11060917

