# Required Levels of Catalysis for Emergence of Autocatalytic Sets in Models of Chemical Reaction Systems


## Abstract


## 1. Introduction

## 2. RAF Sets

- Reflexively autocatalytic (RA) if every reaction in $\mathcal{R}$ is catalyzed by at least one molecule involved in any of the reactions in $\mathcal{R}$;
- F-generated (F) if every reactant in $\mathcal{R}$ can be constructed from a small “food set” F by successive applications of reactions from $\mathcal{R}$;
- Reflexively autocatalytic and F-generated (RAF) if it is both RA and F.

#### A Model of Catalytic Reaction Systems

## 3. Modification of the Original Definition and Algorithm

$\mathrm{cl}_{\mathcal{R}}(F)$ of the food set F relative to the set of reactions $\mathcal{R}$, which is defined as F together with all molecules that can be constructed from F by repeated applications of reactions in $\mathcal{R}$. For a set of reactions $\mathcal{R}$ to be F-generated (F), it is required that the reactants of every reaction in $\mathcal{R}$ are in $\mathrm{cl}_{\mathcal{R}}(F)$.
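The closure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of a reaction as a `(reactants, products)` pair of frozensets, the function name `closure`, and the toy system are all hypothetical and not taken from the paper.

```python
def closure(food, reactions):
    """Return cl_R(F): the food set plus every molecule reachable from it.

    `reactions` is an iterable of (reactants, products) pairs of sets.
    This data layout is an assumption made for illustration only.
    """
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            # A reaction can be applied once all of its reactants are reachable.
            if reactants <= reachable and not products <= reachable:
                reachable |= products
                changed = True
    return reachable


# Hypothetical toy system: a + b -> c and c + d -> e, with food set {a, b}.
toy_reactions = [
    (frozenset("ab"), frozenset("c")),
    (frozenset("cd"), frozenset("e")),
]
toy_closure = closure({"a", "b"}, toy_reactions)
print(sorted(toy_closure))
```

In this toy system the second reaction never fires, because d is neither food nor a product of any applicable reaction, so e stays outside the closure.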

$x \in \mathrm{cl}_{\mathcal{R}}(F)$, but $x \notin \mathrm{supp}(\mathcal{R})$. In words, the molecule x is in the closure of the food set (by default, as it is part of the food set), but it is not in the support of the reaction set. So, according to the original definition, this set $\mathcal{R}$ is not RAF (it does not conform to the RA part of the definition), even though logically one would consider this case to be a proper autocatalytic set.

#### 3.1. RAF Definition

- Reflexively autocatalytic (RA) if for all reactions r ∈ $\mathcal{R}$′ there exists a molecule $x \in \mathrm{cl}_{\mathcal{R}'}(F)$ such that (x, r) ∈ C;
- F-generated (F) if $\rho(\mathcal{R}') \subseteq \mathrm{cl}_{\mathcal{R}'}(F)$, where ρ($\mathcal{R}$′) is the set of all reactants in $\mathcal{R}$′;
- Reflexively autocatalytic and F-generated (RAF) if $\mathcal{R}$′ is both RA and F.
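The three conditions above translate almost directly into a membership test. The sketch below is a hedged illustration, not the authors' code: the dictionary layout, the names `closure` and `is_raf`, the toy system, and the choice to treat the empty set as not an RAF are all assumptions. The toy case has a reaction catalyzed by a food molecule x that is neither a reactant nor a product.

```python
def closure(food, reactions):
    """cl_R'(F) for a dict mapping reaction name -> (reactants, products)."""
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= reachable and not products <= reachable:
                reachable |= products
                changed = True
    return reachable


def is_raf(subset, food, catalysis):
    """Check the modified RA and F conditions for a reaction subset R'.

    subset:    dict name -> (reactants, products)   (assumed layout)
    catalysis: set of (molecule, reaction name) pairs, playing the role of C
    """
    if not subset:
        return False  # assumption: the empty set is not counted as an RAF
    cl = closure(food, subset)
    # RA: every reaction has at least one catalyst inside the closure.
    ra = all(any((x, name) in catalysis for x in cl) for name in subset)
    # F: every reactant of every reaction lies inside the closure.
    f_generated = all(reactants <= cl for reactants, _ in subset.values())
    return ra and f_generated


# Hypothetical example: r1 is catalyzed by the food molecule x only.
toy = {"r1": (frozenset("ab"), frozenset("c"))}
catalyzed_by_food = is_raf(toy, {"a", "b", "x"}, {("x", "r1")})
catalyzed_by_absent = is_raf(toy, {"a", "b"}, {("z", "r1")})
print(catalyzed_by_food, catalyzed_by_absent)
```

Because the RA test ranges over the closure rather than the support, the food-catalyzed case is accepted, while a catalyst that can never be produced is rejected.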

$\mathrm{cl}_{\mathcal{R}}(F)$ in the F part, as in the original definition, the modified definition uses $\mathrm{cl}_{\mathcal{R}}(F)$ in both the RA and the F part, thus simplifying it slightly. Note that an RAF in the original setting of [2] is still an RAF under this new definition by virtue of the following result, a proof of which is provided in the Appendix.

**Lemma 3.1** Any set of reactions that forms an RAF under the earlier definition of [2] is an RAF in the modified definition above.

- Remove all reactions that do not conform to the RA requirement;
- Remove all reactions that do not conform to the F requirement.

#### 3.2. RAF Algorithm

1. Start with the complete set of reactions $\mathcal{R}$ and the food set F;
2. Compute the closure of the food set $\mathrm{cl}_{\mathcal{R}}(F)$ relative to the current set of reactions $\mathcal{R}$;
3. For each reaction r ∈ $\mathcal{R}$ for which (1) all catalysts, or (2) one or more reactants, are not in $\mathrm{cl}_{\mathcal{R}}(F)$, remove r from $\mathcal{R}$;
4. Repeat steps 2 and 3 until no more reactions can be removed.
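The steps above can be sketched as a short reduction loop. This is a minimal illustration under assumptions, not the implementation used for the paper's results: the dictionary layout, the names `closure` and `max_raf`, and the four-reaction toy system are hypothetical.

```python
def closure(food, reactions):
    """Closure of the food set for a dict name -> (reactants, products)."""
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= reachable and not products <= reachable:
                reachable |= products
                changed = True
    return reachable


def max_raf(reactions, food, catalysis):
    """Repeatedly prune reactions until none can be removed (steps 1-4)."""
    current = dict(reactions)                      # step 1
    while True:
        cl = closure(food, current)                # step 2
        keep = {                                   # step 3
            name: (reactants, products)
            for name, (reactants, products) in current.items()
            if reactants <= cl and any((x, name) in catalysis for x in cl)
        }
        if len(keep) == len(current):              # step 4: fixed point
            return keep
        current = keep


# Hypothetical toy system with food set {a, b}.
toy = {
    "r1": (frozenset("ab"), frozenset("c")),   # catalyzed by d
    "r2": (frozenset("bc"), frozenset("d")),   # catalyzed by c
    "r3": (frozenset("cd"), frozenset("e")),   # catalyzed by g (never made)
    "r4": (frozenset("ae"), frozenset("f")),   # catalyzed by c
}
toy_catalysis = {("d", "r1"), ("c", "r2"), ("g", "r3"), ("c", "r4")}
result = max_raf(toy, {"a", "b"}, toy_catalysis)
print(sorted(result))
```

In this toy run r3 is removed in the first pass (its only catalyst is unreachable), which in turn removes e from the closure so that r4 is removed in the second pass. This is why steps 2 and 3 have to be repeated.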

($O(|\mathcal{R}|^{2} \log |\mathcal{R}|)$ worst-case, but shown to be sub-quadratic on average in practice [2]), as it is dominated by step 2 (computing the closure of the food set). All results presented in this paper are generated with this new version of the RAF algorithm.

## 4. Required Levels of Catalysis

$P_{n} = 0.50$ (or close to 0.50) of finding an RAF set in a number of instances of the random catalytic reaction model [31]. From these statistics, we then estimated a linear function $f(n) = a + bn$ using an ordinary least squares regression. We compare these results with the theoretical linear relation, which can be calculated from Theorem 4.1 (ii) in [3] (using $P_{n} = 0.50$ and k = t = 2).
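For readers who want to see the regression step spelled out, here is a minimal ordinary least squares sketch in plain Python. The simulation statistics themselves are not reproduced in this text, so the input points below are stand-ins placed exactly on the line reported for the computational any-RAF case, $f_{A}(n) = 1.0970 + 0.0189n$. The range n = 5..20 and the name `ols_line` are assumptions made only for illustration.

```python
def ols_line(ns, ys):
    """Ordinary least squares fit of y = a + b*n; returns (a, b)."""
    count = len(ns)
    mean_n = sum(ns) / count
    mean_y = sum(ys) / count
    # Slope = covariance(n, y) / variance(n); intercept from the means.
    sxy = sum((n - mean_n) * (y - mean_y) for n, y in zip(ns, ys))
    sxx = sum((n - mean_n) ** 2 for n in ns)
    b = sxy / sxx
    a = mean_y - b * mean_n
    return a, b


# Stand-in data (NOT simulation output): points on the reported case A line.
ns = list(range(5, 21))
ys = [1.0970 + 0.0189 * n for n in ns]
a, b = ols_line(ns, ys)
print(round(a, 4), round(b, 4))
```

With real simulation output the points would scatter around the line, and the same two formulas would return the best-fitting intercept and slope.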

- (A) Computational case for any RAF;
- (B) Computational case for all-molecule RAFs;
- (C) Theoretical case for all-molecule RAFs.

**A**) and the theoretical one (case **C**). However, as case **B** shows, this difference cannot be fully explained by the fact that the theoretical analysis assumes RAFs involving all molecules. Even with this stronger assumption, the theoretical slope (case **C**) is still more than twice as large as the one from the simulations (case **B**).

$P_{n} = 0.50$ is $f_{C}(20) = 32.678$, which seems unrealistically high. However, the actual level of catalysis required is only $f_{A}(20) = 1.475$, which is chemically much more plausible.
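The two values quoted above follow directly from the fitted lines for cases A, B and C. The short check below evaluates all three at n = 20 and also computes the ratio of the theoretical slope to the simulated all-molecule slope. The coefficients are the ones reported in the paper; the names `FITS` and `required_catalysis` are illustrative only.

```python
# (intercept, slope) of f(n) = a + b*n for each case, as reported.
FITS = {
    "A": (1.0970, 0.0189),   # computational, any RAF
    "B": (-0.4736, 0.7012),  # computational, all-molecule RAFs
    "C": (0.0, 1.6339),      # theoretical, all-molecule RAFs
}


def required_catalysis(case, n):
    """Evaluate the fitted linear function for the given case at size n."""
    intercept, slope = FITS[case]
    return intercept + slope * n


levels_at_20 = {case: round(required_catalysis(case, 20), 3) for case in FITS}
slope_ratio_c_over_b = FITS["C"][1] / FITS["B"][1]
print(levels_at_20, round(slope_ratio_c_over_b, 2))
```

The slope ratio comes out above 2, consistent with the statement that the theoretical slope is more than twice the simulated one.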

## 5. Required Size of the Molecule Set

$P_{n} \geq 0.50$ of RAF sets occurring?

$P_{n} = 0$ for n ≤ 12, but $P_{13} = 0.982$, and $P_{n} = 1$ for n ≥ 14. So, in this case a value of at least n = 13 is required to get RAF sets with high probability. For only a slightly higher probability of catalysis (p = 0.00002), a value of n = 12 would be sufficient (results not shown).

$P_{n} = 0$ for n ≤ 15, $P_{16} = 0.939$, and $P_{n} = 1$ for n ≥ 17. So, in this case a value of at least n = 16 is required to get RAF sets with high probability. Again, for only a slightly higher probability of catalysis (p = 0.000002), a value of n = 15 would be sufficient (results not shown). The size of the molecule set in this case would be |X| = 65534.
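The quoted molecule set size can be checked by counting strings. The sketch assumes that X consists of all binary polymers of length 1 up to n, which is consistent with |X| = 65534 for n = 15. The function name `molecule_count` and the `alphabet_size` parameter are illustrative.

```python
def molecule_count(n, alphabet_size=2):
    """Number of distinct polymers of length 1..n over the given alphabet.

    Assumes every string up to length n is a molecule type (an assumption
    that matches the quoted |X| for n = 15 in the binary case).
    """
    return sum(alphabet_size ** length for length in range(1, n + 1))


sizes = {n: molecule_count(n) for n in (12, 13, 15, 16)}
print(sizes)
```

For the binary case this sum equals 2^(n+1) - 2, so each increase of n by one roughly doubles the number of molecule types.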

## 6. Template-Based Catalysis

#### 6.1. Theoretical Results

$\log_{e}(\kappa)$). Notice that $P(n)$ can be chosen as close to 1 as we wish by selecting λ large enough (and independently of n!). Thus, this result justifies the statement that the average number of reactions each molecule catalyzes needs to grow only linearly with n in order for there to be a given (high) probability of generating an RAF.

$s_{1} + s_{2}$ that is complementary to the end-segment (of length $s_{1}$) and the initial segment (of length $s_{2}$) of the two molecules involved in the cleavage or ligation. Thus, for the above set-up we have $s_{1} = s_{2} = 2$ and so s = 4. We also assume that the probability that a molecule x catalyzes a reaction r with a complementary template depends only on x and not on r (this is the analogue of the template-free model assumption (R$_{2}$) in [3]).
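As a hedged illustration of the template constraint, the sketch below tests whether a binary molecule contains the complement of a reaction's 4-site template. Several details are assumptions rather than statements from the paper: complementarity is taken to be the site-wise swap 0↔1 with no reversal, molecules shorter than the required segment are treated as offering no template, and the names `complement` and `template_match` are hypothetical.

```python
def complement(segment):
    """Site-wise complement of a binary string (assumed rule: 0 <-> 1)."""
    return segment.translate(str.maketrans("01", "10"))


def template_match(catalyst, left, right, s1=2, s2=2):
    """True if `catalyst` contains the complement of the reaction template.

    The template is the last s1 sites of `left` followed by the first s2
    sites of `right`, where `left` and `right` are the two molecules that
    are joined by the ligation (or produced by the cleavage).
    """
    if len(left) < s1 or len(right) < s2:
        return False  # assumption: too-short molecules offer no full template
    template = left[-s1:] + right[:s2]
    return complement(template) in catalyst


# Hypothetical example: template "10" + "10" = "1010", complement "0101".
matches = template_match("110101", "0110", "1001")
no_match = template_match("1111", "0110", "1001")
print(matches, no_match)
```

A molecule passing this test is only a candidate catalyst; whether it actually catalyzes the reaction is still decided with a probability that depends on the molecule alone.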

$\kappa^{s}$ (this factor would be 16 for the binary model with template size four). More precisely, we have the following result, whose proof is provided in the Appendix.

**Theorem 6.1** Let $P_{s}(n)$ be the probability that there exists an all-molecule RAF under this template matching model. Suppose that each molecule catalyzes (on average) at least $\lambda_{s} n$ reactions, where $\lambda_{s} = \kappa^{s}\lambda$ and $\lambda > \log_{e}(\kappa)$. Then $P_{s}(n)$ satisfies the same inequality as $P(n)$, namely

#### 6.2. Computational Results

$2^{n}$, we can only go up to about n = 20 to get computational results in a reasonable amount of time (even on a parallel cluster) for the original model (Section 4). But for the template-based catalysis as described above, it is even worse. We now have to check for each pair of molecule x ∈ X and reaction r ∈ $\mathcal{R}$ whether there is a template match between any part of the molecule and the complement of the (4-site) reaction template. Since both |X| and |$\mathcal{R}$| are ∝ $2^{n}$, this means that |X × $\mathcal{R}$| ∝ $2^{2n}$, and in practice this means that we can only go up to about n = 16 with our template-based catalysis simulations.
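To get a feel for this scaling, the following back-of-the-envelope sketch takes the stated proportionality at face value and ignores constant and polynomial factors. The function name `relative_pair_count` is illustrative.

```python
def relative_pair_count(n_from, n_to):
    """Growth factor of |X x R| when both |X| and |R| scale as 2**n."""
    return 2 ** (2 * (n_to - n_from))


# How many times more (molecule, reaction) pairs at n = 20 than at n = 16?
growth_16_to_20 = relative_pair_count(16, 20)
print(growth_16_to_20)
```

Under this rough model, pushing the template-based simulations from n = 16 to n = 20 would multiply the number of pairs to check by 256, which helps explain the lower practical limit.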

**A** in Section 4). As can be expected, for smaller values of n, a higher level of catalysis is needed to find RAF sets with high probability (again, $P_{n} = 0.50$ is taken as the transition point) in the case of template-based catalysis compared to the purely random model. Since each molecule type x ∈ X is now restricted, to some extent, in terms of which reactions it can catalyze, the system as a whole is more constrained, and it will be harder to get RAF sets.

## 7. Conclusions

## Acknowledgments

## Appendix

#### A1. Proof of Lemma 3.1

[RA$_{1}$] the new one as given here in Section 3. We need to show that [RA] + [F-gen] implies [RA$_{1}$], so suppose that r ∈ $\mathcal{R}$′. By [RA] there exists x ∈ supp($\mathcal{R}$′) such that (x, r) ∈ C. Either x ∈ ρ($\mathcal{R}$′) or x ∈ π($\mathcal{R}$′) (the molecules that are products of at least one reaction in $\mathcal{R}$′). In the first case, by [F-gen], we have $x \in \mathrm{cl}_{\mathcal{R}'}(F)$, so [RA$_{1}$] holds. In the second case, where x ∈ π($\mathcal{R}$′), there exists (A, B) ∈ $\mathcal{R}$′ for which x ∈ B. By [F-gen], $A \subseteq \mathrm{cl}_{\mathcal{R}'}(F)$ and so $B \subseteq \mathrm{cl}_{\mathcal{R}'}(F)$, and thus $x \in \mathrm{cl}_{\mathcal{R}'}(F)$. So once again [RA$_{1}$] holds.

#### A2. Proof of RAF Algorithm Correctness

$\mathcal{R}_{1}$) that is contained within an arbitrary subset (say $\mathcal{R}_{2}$) of $\mathcal{R}$, then r is not eliminated if step 3 of the RAF algorithm is applied with $\mathcal{R}_{2}$ being taken as the “current set of reactions”. To see this, note that the fact that $\mathcal{R}_{1}$ is an RAF implies that there exists a molecule $x \in \mathrm{cl}_{\mathcal{R}_{1}}(F)$ that catalyzes r, and that the reactants of r are contained within $\mathrm{cl}_{\mathcal{R}_{1}}(F)$. Now, $\mathcal{R}_{1} \subseteq \mathcal{R}_{2}$ and so $\mathrm{cl}_{\mathcal{R}_{1}}(F) \subseteq \mathrm{cl}_{\mathcal{R}_{2}}(F)$; thus step 3 will not eliminate r from $\mathcal{R}_{2}$. This establishes claim (iii), which implies, by induction, the further claim (iv): if a reaction r lies within any RAF that is contained within $\mathcal{R}$, then r is never eliminated at step 3 of the RAF algorithm at any stage starting with $\mathcal{R}$. Claim (i) now follows immediately.
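The key step in this argument is that the closure is monotone in the reaction set: adding reactions can only enlarge the closure of the food set. The brute-force check below illustrates this on a small hypothetical system. It is a sanity check on a toy example, not a proof, and the names `closure` and `closure_is_monotone` as well as the three reactions are assumptions.

```python
from itertools import combinations


def closure(food, reactions):
    """Closure of the food set for an iterable of (reactants, products)."""
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            if reactants <= reachable and not products <= reachable:
                reachable |= products
                changed = True
    return reachable


def closure_is_monotone(food, reactions):
    """Check cl_R1(F) is a subset of cl_R2(F) for all subsets R1 of R2."""
    subsets = [
        set(combo)
        for size in range(len(reactions) + 1)
        for combo in combinations(reactions, size)
    ]
    return all(
        closure(food, r1) <= closure(food, r2)
        for r1 in subsets
        for r2 in subsets
        if r1 <= r2
    )


# Hypothetical toy system: a + b -> c, b + c -> d, c + d -> e.
toy = [
    (frozenset("ab"), frozenset("c")),
    (frozenset("bc"), frozenset("d")),
    (frozenset("cd"), frozenset("e")),
]
monotone = closure_is_monotone({"a", "b"}, toy)
print(monotone)
```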

$\mathrm{cl}_{\mathcal{R}'}(F)$ and (2) all reactants of r are in $\mathrm{cl}_{\mathcal{R}'}(F)$. Condition (1) implies that $\mathcal{R}$′ satisfies the (RA) condition, and condition (2) shows that $\mathcal{R}$′ satisfies the F-generated condition. Thus $\mathcal{R}$′ is an RAF; moreover, it is the unique maximal RAF by virtue of claim (iv) established in the previous paragraph.

#### A3. Proof of Theorem 6.1

_{m}, $r_{m}$) from [3]. Let _{m} denote the number of forward (ligation) reactions that produce polymers of length at most m. We have:

$T_{s}(x)$ denote the set of polymers of length s that occur within x, and for any polymer t of length s let $X_{n}(t)$ denote the set of polymers of length at most n that contain at least one copy of t. Thus,

$t_{r}$ denote the complement of the template polymer of length $s = s_{1} + s_{2}$ (which is also a polymer of length s). If we let $R_{n}(t)$ denote the set of forward reactions r with $t_{r} = t$, then:

$t_{r}$) (or, equivalently, if $r \in R_{n}(t)$ for some $t \in T_{s}(x)$). Let p(x, r) be the probability that molecule x catalyzes reaction r. We have assumed that each molecule that could catalyze reaction r (by virtue of template matching) has the same probability of doing so, that is:

_{x} just depends on x and not r (the analogue of requirement (R2) in [3]).

_{x} be the expected number of (forward) reactions ($r \in R_{+}(n)$) that a molecule x catalyzes, then:

$R_{+}(n) : x \in X(t_{r})\}$. Note that |S(x)| is bounded above by the number of pairs (t, r) where $t \in T_{s}(x)$ and $r \in R_{+}(n)$ with $t_{r} = t$. By virtue of Inequality (3) and Equation (5) this set of pairs has size at most $\kappa^{s} r_{n-s}$ and so

_{x} ≥ $\lambda_{s} n$ and applying (6), we obtain:

$\lambda_{s} = \kappa^{s}\lambda$ this last inequality can be written as:

$R_{+}(n)$ let $q_{r}^{*}$ be the probability that r is not catalyzed by any molecule. Then

$(1 - x)^{y} \leq \exp(-xy)$ for x, y > 0 we obtain:

_{n–s} by the expression given by Equation (1) for m = n – s and checking that the ratio that post-multiplies λ is greater than 1. The remainder of the proof now follows the argument in [3] for the proof of Theorem 4.1 (ii).

## References and Notes

1. Steel, M. The emergence of a self-catalysing structure in abstract origin-of-life models. Appl. Math. Lett. **2000**, 3, 91–95.
2. Hordijk, W; Steel, M. Detecting autocatalytic, self-sustaining sets in chemical reaction systems. J. Theor. Biol. **2004**, 227, 451–461.
3. Mossel, E; Steel, M. Random biochemical networks: The probability of self-sustaining autocatalysis. J. Theor. Biol. **2005**, 233, 327–336.
4. Kauffman, SA. Autocatalytic sets of proteins. J. Theor. Biol. **1986**, 119, 1–24.
5. Kauffman, SA. The Origins of Order; Oxford University Press: New York, NY, USA, 1993.
6. Sharov, A. Self-reproducing systems: Structure, niche relations and evolution. BioSystems **1991**, 25, 237–249.
7. Letelier, JC; Soto-Andrade, J; Abarzúa, FG; Cornish-Bowden, A; Cárdenas, ML. Organizational invariance and metabolic closure: Analysis in terms of (M,R) systems. J. Theor. Biol. **2006**, 238, 949–961.
8. Cornish-Bowden, A; Cárdenas, ML; Letelier, JC; Soto-Andrade, J. Beyond reductionism: Metabolic circularity as a guiding vision for a real biology of systems. Proteomics **2007**, 7, 839–845.
9. Jaramillo, S; Honorato-Zimmer, R; Pereira, U; Contreras, D; Reynaert, B; Hernández, V; Soto-Andrade, J; Cárdenas, M; Cornish-Bowden, A; Letelier, J. (M,R) Systems and RAF Sets: Common Ideas, Tools and Projections. Proceedings of the Alife XII Conference, Odense, Denmark, 19–23 August 2010; pp. 94–100.
10. Flamm, C; Ullrich, A; Ekker, H; Mann, M; Hoegerl, D; Rohrschneider, M; Sauer, S; Scheuermann, G; Klemm, K; Hofacker, IL; et al. Evolution of metabolic networks: A computational framework. J. Syst. Chem. **2010**, 1.
11. Kun, A; Papp, B; Szathmáry, E. Computational identification of obligatorily autocatalytic replicators embedded in metabolic networks. Genome Biol. **2008**, 9.
12. Awazu, A; Kaneko, K. Discreteness-induced transition in catalytic reaction networks. Phys. Rev. E **2007**, 76, 041915:1–041915:8.
13. Brogioli, D. Marginally stable chemical systems as precursors of life. Phys. Rev. Lett. **2010**, 105, 058102:1–058102:4.
14. Bartsev, SI; Mezhevikin, VV. On initial steps of chemical prebiotic evolution: Triggering autocatalytic reaction of oligomerization. Adv. Space Res. **2008**, 42, 2008–2013.
15. Dyson, FJ. A model for the origin of life. J. Mol. Evol. **1982**, 18, 344–350.
16. Bollobas, B; Rasmussen, S. First cycles in random directed graph processes. Discret. Math. **1989**, 75, 55–68.
17. Lifson, S. On the crucial stages in the origin of animate matter. J. Mol. Evol. **1997**, 44, 1–8.
18. Szathmary, E. The evolution of replicators. Philos. Trans. R. Soc. Lond. B **2000**, 355, 1669–1676.
19. Orgel, LE. The implausibility of metabolic cycles on the prebiotic earth. PLoS Biol. **2008**, 6, 5–13.
20. Sievers, D; von Kiedrowski, G. Self-replication of complementary nucleotide-based oligomers. Nature **1994**, 369, 221–224.
21. Lee, DH; Severin, K; Ghadiri, MR. Autocatalytic networks: The transition from molecular self-replication to molecular ecosystems. Curr. Opin. Chem. Biol. **1997**, 1, 491–496.
22. Ashkenasy, G; Jegasia, R; Yadav, M; Ghadiri, MR. Design of a directed molecular network. Proc. Nat. Acad. Sci. USA **2004**, 101, 10872–10877.
23. Hayden, EJ; von Kiedrowski, G; Lehman, N. Systems chemistry on ribozyme self-construction: Evidence for anabolic autocatalysis in a recombination network. Angew. Chem. Int. Ed. **2008**, 120, 8552–8556.
24. Lincoln, TA; Joyce, GF. Self-Sustained Replication of an RNA Enzyme. Science **2009**, 323, 1229–1232.
25. Penny, D. An interpretive review of the origin of life research. Biol. Philos. **2005**, 20, 633–671.
26. Hordijk, W; Hein, J; Steel, M. Autocatalytic sets and the origin of life. Entropy **2010**, 12, 1733–1742.
27. Dyson, FJ. Origins of Life; Cambridge University Press: Cambridge, UK, 1985.
28. Morowitz, HJ; Kostelnik, JD; Yang, J; Cody, GD. The origin of intermediary metabolism. Proc. Nat. Acad. Sci. USA **2000**, 97, 7704–7708.
29. Smith, E; Morowitz, HJ. Universality in intermediary metabolism. Proc. Nat. Acad. Sci. USA **2004**, 101, 13168–13173.
30. Morowitz, HJ; Srinivasan, V; Smith, E. Ligand field theory and the origin of life as an emergent feature of the periodic table of elements. Biol. Bull. **2010**, 219, 1–6.
31. It was shown in [2] that $P_{n} = 0.50$ can be taken as the “transition point” between RAFs not existing at all and RAFs occurring with high probability.
32. Scott, JK; Smith, GP. Searching for peptide ligands with an epitope library. Science **1990**, 249, 286–390.
33. Yamauchi, A; Nakashima, T; Tokuriki, N; Hosokawa, M; Nogami, H; Arioka, S; Urabe, I; Yomo, T. Evolvability of random polypeptides through functional selection within a small library. Protein Eng. **2002**, 15, 619–626.
34. Bagley, RJ. A Model of Functional Self Organization. PhD Thesis, University of California, San Diego, CA, USA, 1991.
35. Bagley, RJ; Farmer, JD. Spontaneous Emergence of a Metabolism. In Artificial Life II; Langton, CG, Taylor, C, Farmer, JD, Rasmussen, S, Eds.; Addison-Wesley: Upper Saddle River, NJ, USA, 1991; pp. 93–140.
36. Bagley, RJ; Farmer, JD; Fontana, W. Evolution of a metabolism. In Artificial Life II; Langton, CG, Taylor, C, Farmer, JD, Rasmussen, S, Eds.; Addison-Wesley: Upper Saddle River, NJ, USA, 1991; pp. 141–158.
37. Andersen, IT; Nan, L; Kjaersgaard, MIS. Search for life in catalytic reaction systems. Available online: http://www.stats.ox.ac.uk/research/genome/projects/pastprojects (accessed on 5 May 2011).

**Figure 1.** A simple example of a catalytic reaction system (CRS) with seven molecule types {a, b, c, d, e, f, g} (solid nodes) and four reactions {$r_{1}$, $r_{2}$, $r_{3}$, $r_{4}$} (open nodes). The food set is F = {a, b}. Solid arrows indicate reactants going into and products coming out of a reaction; dashed arrows indicate catalysis. The subset $\mathcal{R}$ = {$r_{1}$, $r_{2}$} (shown with bold arrows) is RAF.

**Figure 3.** The probability $P_{n}$ of finding RAF sets for different (fixed) catalysis probabilities p and values of n.

**Figure 4.** The level of catalysis $f(n)$ required for the template-based catalysis case compared to the original (purely random) case, for different values of n.

| Case | Linear function |
|------|-----------------|
| A | $f_{A}(n) = 1.0970 + 0.0189n$ |
| B | $f_{B}(n) = -0.4736 + 0.7012n$ |
| C | $f_{C}(n) = 1.6339n$ |

© 2011 by the authors; licensee MDPI, Basel, Switzerland. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Hordijk, W.; Kauffman, S.A.; Steel, M. Required Levels of Catalysis for Emergence of Autocatalytic Sets in Models of Chemical Reaction Systems. *Int. J. Mol. Sci.* **2011**, *12*, 3085-3101.
https://doi.org/10.3390/ijms12053085
