Are Scientific Models of Life Testable? A Lesson from Simpson’s Paradox (Version 1, Original)
|Reviewer 1: Sarah Maurer, Central Connecticut State University|Reviewer 2: Terence Kee, University of Leeds|
|Approved with revisions|Approved with revisions|
Bandyopadhyay, P.S.; Grunska, N.; Dcruz, D.; Greenwood, M.C. Are Scientific Models of Life Testable? A Lesson from Simpson’s Paradox. Sci 2019, 1(2), 54.
Central Connecticut State University
This paper presents two separate mechanisms for studying the origins of life, along with a brief description of some of the evidence supporting each hypothesis. The authors then point out that each group discounts the other's hypothesis as chemically implausible, and they give the reasons offered by each. Given how succinct these descriptions are, the overview is a reasonable representation of the proponents of both fields, although a more modern point of view may be that these processes are not mutually exclusive and likely both needed to occur simultaneously (i.e. neither need be first for the origin of life).
The authors then describe Simpson’s paradox, in which two groups can each show the same trend but show the opposite trend when pooled. This statistical anomaly can lead to the misinterpretation of real-world data when “hidden” groups within the data set are ignored. The authors then show, by choosing seemingly random values for the variables, that this could occur by chance in either origins-of-life scenario (MFT or RWT) when many reactions are occurring in different locations or at different times.
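To make the reversal described above concrete, here is a minimal sketch in Python. The counts, the environment names and the helper functions are all hypothetical and chosen only for illustration; they are not the values or the model used in the paper. Growth is taken, as an assumption, to be the ratio of final to initial amount.

```python
from fractions import Fraction

# Hypothetical (initial, final) amounts per environment; NOT the paper's data.
functional = {"env1": (90, 360), "env2": (10, 10)}
non_functional = {"env1": (10, 50), "env2": (90, 180)}


def growth(initial, final):
    """Growth ratio within a single environment (assumed definition)."""
    return Fraction(final, initial)


def pooled_growth(groups):
    """Growth ratio after pooling all environments together."""
    total_initial = sum(initial for initial, _ in groups.values())
    total_final = sum(final for _, final in groups.values())
    return Fraction(total_final, total_initial)


def is_simpson_reversal(func, non_func):
    """True if functional molecules grow slower in every environment
    but faster once the environments are pooled."""
    slower_everywhere = all(
        growth(*func[env]) < growth(*non_func[env]) for env in func
    )
    return slower_everywhere and pooled_growth(func) > pooled_growth(non_func)


for env in functional:
    print(env, float(growth(*functional[env])), float(growth(*non_functional[env])))
print("pooled", float(pooled_growth(functional)), float(pooled_growth(non_functional)))
print("reversal:", is_simpson_reversal(functional, non_functional))
```

In this made-up case the functional molecules grow more slowly in each environment (4.0 vs 5.0 and 1.0 vs 2.0), yet faster once pooled (3.7 vs 2.3), because most of the functional molecules start in the fast-growing environment.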
The mathematical conditions under which the paradox occurs are presumably known, yet the authors present only one data set in which it arises and do not explain the underlying conditions. They then suggest that this should be testable in a real sense, although it may be technologically limited.
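On the question of underlying conditions, the standard arithmetic of the paradox for two subgroups can be written down as follows. The notation is assumed here for illustration and is not taken from the paper: $b_i$ is the size of subgroup $i$ for functional molecules (for example, the starting amount) and $a_i$ the corresponding outcome, with $c_i$ and $d_i$ playing the same roles for non-functional molecules.

```latex
% Assumed notation (not from the paper):
%   r_i = a_i / b_i : rate for functional molecules in subgroup i
%   s_i = c_i / d_i : rate for non-functional molecules in subgroup i
\begin{gather*}
r_1 < s_1, \qquad r_2 < s_2, \qquad \text{but} \qquad
R = \frac{a_1 + a_2}{b_1 + b_2} \;>\; \frac{c_1 + c_2}{d_1 + d_2} = S, \\[4pt]
R = w\,r_1 + (1 - w)\,r_2, \quad w = \frac{b_1}{b_1 + b_2}, \qquad
S = v\,s_1 + (1 - v)\,s_2, \quad v = \frac{d_1}{d_1 + d_2}.
\end{gather*}
```

Since $R$ and $S$ are weighted averages of the subgroup rates, a reversal requires at least that the weights differ ($w \neq v$) and that the ranges overlap ($\max(r_1, r_2) > \min(s_1, s_2)$). If $w = v$, then $R - S = w(r_1 - s_1) + (1 - w)(r_2 - s_2) < 0$ and no reversal is possible.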
If I imagine what this looks like as an experimentalist: I have a 96-well plate and add some of the molecules that can react (in either scenario) under different environmental conditions (such as pH, salt, temperature, minerals, etc.). Each well has a different amount of functional and non-functional molecules. I measure the amount of functional and non-functional molecules in each well at the end of the experiment and, lo and behold, in every well the functional molecules grew at a slower pace than the non-functional molecules. If I combinatorially mix one well with every other well, then according to Simpson’s paradox, 1/60 of the resulting wells would have a higher rate of formation of functional molecules.
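The thought experiment above can be sketched as a small simulation. Everything in it is an assumption made for illustration: the distributions of starting amounts and growth ratios, the helper names, and the definition of growth as final amount divided by initial amount. The fraction it reports depends entirely on those choices, so it neither confirms nor refutes the 1/60 figure.

```python
import itertools
import random


def make_wells(n_wells, rng):
    """Build hypothetical wells as (f_init, f_final, n_init, n_final).

    In every well the functional molecules (f) are given a strictly
    lower growth ratio than the non-functional molecules (n).
    """
    wells = []
    for _ in range(n_wells):
        f_init = rng.randint(1, 100)
        n_init = rng.randint(1, 100)
        n_rate = rng.uniform(1.0, 10.0)            # assumed range
        f_rate = n_rate * rng.uniform(0.5, 0.99)   # always slower than n_rate
        wells.append((f_init, f_init * f_rate, n_init, n_init * n_rate))
    return wells


def pooled_reversal(well_a, well_b):
    """True if, after mixing two wells, functional molecules show the
    higher pooled growth ratio."""
    f_init = well_a[0] + well_b[0]
    f_final = well_a[1] + well_b[1]
    n_init = well_a[2] + well_b[2]
    n_final = well_a[3] + well_b[3]
    return f_final / f_init > n_final / n_init


def reversal_fraction(wells):
    """Fraction of all pairwise mixtures that show a reversal."""
    pairs = list(itertools.combinations(wells, 2))
    hits = sum(pooled_reversal(a, b) for a, b in pairs)
    return hits / len(pairs)


rng = random.Random(0)  # fixed seed so the run is repeatable
wells = make_wells(96, rng)
print(f"fraction of pairwise mixtures with a reversal: {reversal_fraction(wells):.3f}")
```

Changing the assumed ranges (for example, how diverse the growth ratios are across wells) changes the reported fraction, which is one way to explore how much environmental diversity the effect needs.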
While this very simple experiment is likely testable, I cannot tell whether it is meaningful. With “growth rate” as the variable for both functional and non-functional molecules, life may simply have preferred the fastest growth rate for functional molecules, not necessarily a growth rate higher than that of non-functional molecules.
Also, if the environments needed to create this effect must be very diverse, then a gradient of environments, such as would be found in a geological setting, may not produce the “two sets of reaction rates mix” effect proposed by the authors. Moreover, this needs to happen not once but hundreds (thousands? millions?) of times to generate a complex set of molecules for life. While it seems possible that this could have overcome a single instance of chemical implausibility, leading to abiogenesis, I feel it is unlikely to be the solution to chemical implausibility as a whole.
I do not know enough about the limitations of Simpson’s paradox to say for certain, but it seems the authors could predict what the limits on the reaction rates would be for this to take place. I also have questions:
How does degradation fit into the model? What if the values are both negative and positive for G?
Some of the values (90% functional molecules compared to only 10% non-functional molecules) seem a bit ambitious from a chemical plausibility standpoint. If your model requires using an implausible chemical situation to explain away chemical implausibility, is it solving the problem?
University of Leeds
This is an interesting paper, as it uses the mathematical principles of Simpson's paradox (SP) to argue against the "inefficiency problem", a problem levelled equally against the RNA World and Metabolism First hypotheses in origins of life (OOL) studies.
The paper itself is logically well-constrained, and the results that the authors draw from their main argument based on the SP are internally consistent.
There are some points, however, that I'd like to flag up in terms of how best to interpret the results of this paper:
(i) One of the key take-home points of this paper, which I feel could be given a little more prominence, is that rather than focusing on individual reaction sequences, one would do better in OOL to consider networks of interconnected chemical processes. In other words, one should consider the chemical environment as being intimately connected to those reactions that lead to molecules of function. This is a positive of the SP approach in that it allows us to consider the effects on a larger global whole rather than a part of that whole.
(ii) Overall, the thrust of this paper appears to be that the "inefficiency principle" should not be considered a significant objection to OOL studies. Indeed, this is well worth supporting, as it is well recognised (in catalysis, for example) that it is not so much the thermodynamic stability of key molecules that is important in their being selected for, but what other reactions result from them in a dynamic, interconnected chemical network.
(iii) Thus, whether or not the SP is in itself a valuable consideration for OOL studies depends on what the R1, R2, etc. systems actually are, how they interconnect in a dynamic manner, and how certain key chemicals in these combined networks have functional value.
(iv) Going back to point (i), the overall reason that biological life on Earth exists is ultimately linked to the benefit that such life provides to the universe in its entirety. That is life's true environment, and it is to that set of processes that we should aim to apply SP if we can!