Open Access Article

On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn’t

by Eran Elhaik 1,* and Dan Graur 2
1 Department of Biology, Lund University, Sölvegatan 35, 22362 Lund, Sweden
2 Department of Biology & Biochemistry, University of Houston, Science & Research Building 2, Suite #342, 3455 Cullen Blvd., Houston, TX 77204-5001, USA
* Author to whom correspondence should be addressed.
Academic Editors: Linley Jesson and Robert Kofler
Genes 2021, 12(4), 527; https://doi.org/10.3390/genes12040527
Received: 22 December 2020 / Revised: 22 March 2021 / Accepted: 29 March 2021 / Published: 5 April 2021
(This article belongs to the Section Population and Evolutionary Genetics and Genomics)
In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a paper entitled “Soft sweeps are the dominant mode of adaptation in the human genome” (Schrider and Kern, Mol. Biol. Evolut. 2017, 34(8), 1863–1877) attracted a great deal of attention, in particular in conjunction with another paper (Kern and Hahn, Mol. Biol. Evolut. 2018, 35(6), 1366–1371), for purporting to discredit the Neutral Theory of Molecular Evolution (Kimura 1968). Here, we address an alleged novelty in Schrider and Kern’s paper, i.e., the claim that their study involved an artificial intelligence technique called supervised machine learning (SML). SML is predicated upon the existence of a training dataset in which the correspondence between the input and output is known empirically to be true. Curiously, Schrider and Kern did not possess a training dataset of genomic segments known a priori to have evolved either neutrally or through soft or hard selective sweeps. Thus, their claim of using SML is thoroughly and utterly misleading. In the absence of legitimate training datasets, Schrider and Kern used: (1) simulations that employ many manipulatable variables and (2) a system of data cherry-picking rivaling the worst excesses in the literature. These two factors, in addition to the lack of negative controls and the irreproducibility of their results due to incomplete methodological detail, lead us to conclude that all evolutionary inferences derived from so-called SML algorithms (e.g., S/HIC) should be taken with a huge shovel of salt.
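The definition of SML given above (a training dataset in which the input-to-output correspondence is known) can be illustrated with a minimal sketch. This is not S/HIC or any method from the papers discussed; it is a toy nearest-centroid classifier, and the function names, the two-number feature vectors, and the labels are all hypothetical values made up for illustration. The point it shows is only that the fitting step cannot run without a label for every training example.

```python
from statistics import mean


def fit_centroids(features, labels):
    """Compute one centroid per class from labeled training examples."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    # Average each feature column within a class to get that class's centroid.
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in grouped.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)),
    )


# Hypothetical feature vectors (invented numbers, not real genomic statistics).
# The labels are ASSUMED to be known truth; supervised learning depends on that.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
train_y = ["neutral", "neutral", "sweep", "sweep"]

centroids = fit_centroids(train_x, train_y)
prediction = predict(centroids, (0.85, 0.75))
```

If the entries of `train_y` are themselves produced by a simulation or by a selection procedure rather than observed, the classifier can only reproduce the assumptions built into those labels, which is the concern the abstract raises.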
Keywords: artificial intelligence (AI); supervised machine learning (SML); evolutionary biology; molecular and genome evolution; selective sweeps; population size

MDPI and ACS Style

Elhaik, E.; Graur, D. On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn’t. Genes 2021, 12, 527. https://doi.org/10.3390/genes12040527

AMA Style

Elhaik E, Graur D. On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn’t. Genes. 2021; 12(4):527. https://doi.org/10.3390/genes12040527

Chicago/Turabian Style

Elhaik, Eran; Graur, Dan. 2021. "On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn’t" Genes 12, no. 4: 527. https://doi.org/10.3390/genes12040527
