Article

Determination of Mutation Rates with Two Symmetric and Asymmetric Mutation Types

1
Department of Mathematical Sciences, Florida Institute of Technology, College of Engineering and Science, Melbourne, FL 32901, USA
2
Department of Chemistry, Biology, and Health Sciences, South Dakota School of Mines and Technology, Rapid City, SD 57701, USA
*
Author to whom correspondence should be addressed.
Symmetry 2022, 14(8), 1701; https://doi.org/10.3390/sym14081701
Submission received: 6 July 2022 / Revised: 29 July 2022 / Accepted: 1 August 2022 / Published: 16 August 2022
(This article belongs to the Special Issue Stochastic Analysis with Applications and Symmetry)

Abstract

We revisit our earlier paper, written with two of the coauthors, in which we proposed an unbiased and consistent estimator $\hat{\mu}_n$ for an unknown mutation rate $\mu$ of microorganisms. Previously, we proved that the associated sequence of estimators $\hat{\mu}_n$ converges to $\mu$ almost surely pointwise on a nonextinction set $\Omega_0$. Here, we show that this sequence also converges in the mean square with respect to the conditional probability measure $P_0(\cdot) = P(\cdot \cap \Omega_0)/P(\Omega_0)$ and that, with respect to $P_0$, the estimator is asymptotically unbiased. We further assume that a microorganism can mutate or turn into a different variant of one of two types. In particular, this can mean that bacteria under attack by a virus or chemical agent either perish or survive, with the survivors turning into a stronger variant. We propose estimators for the respective types and show that they are a.s. pointwise and $L^2$-consistent and asymptotically unbiased with respect to the measure $P_0$.

1. Introduction

In numerous biological applications related to the proliferation of microorganisms, such as bacteria, viruses, and even cancer cells, the organisms start mutating at some point of their division. New mutants are typically stronger and more virulent. In the most common situations, bacteria divide in two, forming a deterministic branching process. If a colony starts from a single organism, its population size in the $n$th generation is $2^n$ (unless some progeny perish). Suppose that, at some point, one of its offspring mutates, say, with probability $\mu$ (which is typically small). Then, the new mutant divides like a nonmutant and engenders a deterministic branching process. As the rate of reverse mutation on the same phenotype, which produces a so-called revertant, is extremely small [1,2], we do not consider this option in our paper. The mutation probability $\mu$ is called the mutation rate.

1.1. Our Approach Compared to Other Methods

The mutation rate is a common target in biomedical research and its applications. Ever since the seminal paper by Luria and Delbrück [3] in 1943, numerous works on the determination of the mutation rate have emerged (cf. [4,5,6,7,8,9,10,11,12,13,14]). They have all dealt with various estimates for $\mu$.
The deterministic branching process (a rooted tree) contained a branching subprocess (random subtree) of nonmutants that turned out to be a Galton–Watson process, although the target process was that of new mutants. The new mutants and pre-existing mutants form non-Markov processes.
Why do we deem the branching process approach more rigorous? In Luria and Delbrück [3], the authors assumed that the number of mutations $M_\tau$ that occur in the interval $(t, t+\tau)$ equals $\mu \tau N_t$, where $N_t$ is the total number of bacteria at time $t$. This assumption is inaccurate because, to determine the mutation rate $\mu$, they had to exclude from $N_t$ the number of pre-existing mutants, even if this number is small. We actually prove that even if the mutation rate $\mu$ is small, the number of pre-existing mutants can tend to infinity and thus cannot be neglected. Moreover, $M_\tau$ would more appropriately equal some $\mu \tau \cdot 2X_t$, where $X_t$ is the number of nonmutants at time $t$, which doubles in the interval $(t, t+\tau)$ unless $\tau$ is too small. Stewart [15] also states that the Luria and Delbrück method is inaccurate.
Furthermore, even if that were not the case and $N_t$ were approximately equal to the number of nonmutants at time $t$, the Luria and Delbrück method would still be problematic. Here is why: suppose $X_n$ is the number of nonmutants in generation $n$ and suppose that each nonmutant can turn into a mutant with equal probability $\mu$. Thus, each nonmutant is a Bernoulli r.v. with parameter $\mu$. The number of newly born mutants $NM_n$ in generation $n$ is thus binomial with parameters $(X_n, \mu)$, that is, $NM_n \sim B(X_n, \mu)$. Because $X_n$ is large and $\mu$ is small, the number $NM_n$ is approximately Poisson with parameter $\lambda_n = NM_n\,\mu$, which carries the unknown $NM_n$ and $\mu$. The probability that there is no mutant at generation $n$ would have been $p_0 = e^{-\lambda_n}$, which the authors claimed could be determined experimentally and is fairly simple. However, there is still the problem of finding $NM_n$, which equals neither the total number of cells $N_n$ nor the number of all mutants $M_n$ at generation $n$. This is the reason why we think this method is unpredictably inaccurate (although Zheng [16,17] claims that the Luria/Delbrück method is the best). Precisely, $\mu \approx NM_n/(2X_{n-1})$ ($\neq M_n/N_n = M_n/2^n$, as the authors allude), provided that the latter is consistent in some sense and that $n$ is sufficiently large. None of these requirements has ever been met in their work, nor in the work of their followers. This problem was also pointed out by Stewart [15].
Another problem in the statistical evaluation of the popular $p_0$-method is that, in order to maintain a “nonmutant” plate (which is required to reasonably estimate $p_0$), it applies only to counts rendered at early times or to cultures with extremely low mutation rates, so as to warrant at least one plate with no mutants. However, this has an adverse effect on statistical significance.
In various other alternatives, many formulas contain the ratio $M_n/N_n$ as a major component of their estimators (or rather $M_{n+k}/N_{n+k} - M_n/N_n$), which has a similar shortcoming and cannot be salvaged by the logarithmic or other multipliers often occurring in the literature. We noticed that no work offers a general formula for the determination of $\mu$ (whether practical or not). All heuristic methods may work in some special cases, but none are claimed to be general. In the literature on mutation rates, we see that some methods are claimed to be efficient when following certain heuristic instructions imposed on $\mu$ (which was to be determined) and on the mean number of mutants. It is not clear how the results obtained for $\mu$ were validated. We compared several tables of results produced by different methods, and they all vary.
In Niccum et al. [18], we offered the first rigorous method of determining $\mu$ by a pointwise rapidly convergent sequence $\{\hat{\mu}_n\}$ of estimators. At first, the original formula for $\hat{\mu}_n$ looked impractical because it contained newly formed mutants, which are difficult to tell from pre-existing mutants, but it was modified by replacing it with an identical formula requiring just the knowledge of the mutants in any two consecutive generations, which turned out to be practical in real labs. That “it was still difficult to observe”, as stated in Zheng [17], was not the situation in a series of our lab experiments [18] performed by us and our students (who coauthored the paper) and beyond. It also needs to be noted that the proposed estimator has a rapid speed of convergence, which has been validated without a single exception in numerous subsequent simulations carried out by our students in senior undergraduate and graduate projects.

1.2. Synchronized Cultures

In our original mathematical model [18], we assumed that bacteria replicate synchronously. We retain this mathematical assumption in the present work as well. As explained in [18], the real count in the lab is a little different. This method has been rendered for decades and the count of mutants (that is compatible with our theoretical assumption) agrees with quite a few experiments made by different researchers, which obviously renders coincidences (and thus inaccuracies) probabilistically impossible.
Yet for the skeptics of our mathematical methodology, we would like to mention the common practice of synchronization of bacterial cultures known in the biological literature. First, a synchronized culture is a culture that contains cells in the same growth stage when they replicate, as opposed to asynchronous growth, in which the replication cycles are random. Most known cell divisions in a population are asynchronous. It is an observed, although unproven, assumption that random deviations of the cycle lengths are symmetric around the mean time of division; this still needs to be rigorously validated. If it holds true, it is easy to deduce that mere observation of a bacterial culture would give a fairly accurate identification of the underlying generation.
Alternatively, there are widely used synchronization approaches from 1957 (cf. Campbell [19]) and earlier, and these continue to be described to the present time (cf. Chang et al. [20] of 2019). A popular technique, known as Helmstetter–Cummings [21,22] of 1968 and 2015, deals with an unsynchronized culture filtered through a cellulose nitrate membrane filter. There are numerous other synchronization techniques by Anderson and Pettijohn [23], Kepes and Kepes [24,25], Kubitcheck [26], Ling and Chang [27], Noack, Klöden, and Bley [28], Shehata and Marr [29], and Wallden et al. [30], to name a few. A rigorous description of all those methods lies outside the scope of this paper.
Note that, experimentally, cells may not remain synchronized during growth, which complicates the analysis. As mentioned above, it would be interesting to conduct lab experiments and learn about the statistical impact of unsynchronized replication. It seems that even if we assume the replication to be synchronous while in fact it is not, the deviations will be symmetric and not large and, as a result, the incorrect assumption will not corrupt the counts, because, so far, the observed numbers have not conflicted with our experiments and have agreed with experiments conducted in other labs. This conjecture is by no means intended to replace a rigorous investigation, which we plan to undertake.

1.3. Mutation or No Mutation?

There are many anecdotal stories about antibiotics causing antibiotic resistance and breeding new forms of virulent bacteria, in quite a few cases offering little to no remedy. One such notorious form is called MRSA, which stands for methicillin-resistant Staphylococcus aureus (S. aureus), a type of bacteria that has become resistant to potent antibiotics used to treat ordinary staph infections. It thus means that it is tougher to treat compared to other strains of Staphylococcus.
MRSA most commonly causes skin infections and can sometimes trigger pneumonia and other conditions. The symptoms largely depend on which part of the body is affected. Untreated, MRSA can become severe, leading to sepsis. MRSA infections are usually seen in individuals who were previously or are currently in hospitals or other health care facilities such as dialysis centers or nursing homes. Many professionals are alarmed by the spread of MRSA; due to its resistance to many types of drugs, it is becoming known as a “super bug”.
On the other hand, there is a general consensus that while bacteria mutate, this process is neither impacted nor exacerbated by the use of antibiotics or other chemical agents. Proponents of this view claim that mutations occur periodically without any external force and that natural selection then allows the resistant mutants to thrive over other bugs that do not have resistance. (It must be noted that antibiotics are natural products of other organisms that also try to keep bacteria away.)
Before we elaborate on this, we recall what a mutation refers to in cellular/molecular biology. A mutation is formally defined as an alteration of the gene sequence of an organism. A “natural mutation” occurs spontaneously and can happen during bacterial DNA replication, when enzymes choose a wrong base to pair with the original DNA, or due to environmental UV light, which modifies the structure of the original DNA strand and allows it to pair with bases that are not normally compatible.
While the entire bacterial genome is susceptible to mutation, certain mutations are more easily detected than others because of selection. An example is the trpE gene in the Escherichia coli (E. coli) WP2 strain [31]. This high mutation region allows researchers to detect mutation in the strain using the so-called reverse mutation assay, in which the bacteria are grown in an amino-acid-deficient culture to test whether the mutation occurred ($trp^{+}$) or not ($trp^{-}$) [32].
The natural mutation rate for E. coli in a rich medium is $10^{-3}$ per genome per generation [33]. The mutation rate of the methicillin-resistant Staphylococcus aureus (MRSA) strain is similar to the reported mutation rate of wild-type E. coli [34]. The elevated mutation rate is suspected to cause the increased rate of vancomycin-resistance genes in bacteria, especially for the vancomycin-intermediate-resistant S. aureus strains (VRSA) [35].

1.4. Other Means of Genetic Change

Other means of genetic change exist in addition to classical spontaneous or genotoxicant-induced mutation and, in some cases, the resulting genetic changes may also be covered by our model inasmuch as they represent stochastic events. While the mechanisms responsible for these changes are well understood and these processes are not considered classical mutations, they do result in genetic change in the affected organism. In some cases, these events are rare and, as such, the resulting genetic change may be described by probabilistic models. For the purposes of a more comprehensive discussion, these alternative modes of genetic change are briefly described below.
Transformation. Transformation is a process in which a bacterium takes up exogenous (free) DNA from its surrounding environment. This DNA can come from dead (lysed) bacteria. Transformation can happen artificially (such as by heat shock or electroporation) or naturally. A natural transformation occurs when there is a cooperative expression of multiple genes under specific conditions called competence [36]. This state is induced by limited nutrition (usually an amino-acid-deficient environment) associated with the stationary phase. For some other bacteria, transformation occurs most efficiently at the end of exponential growth, approaching the stationary phase.
Transformation is one process that can lead to horizontal gene transfer (HGT). There is a hypothesis that HGT is facilitated by the transformation of antibiotic-induced cell-wall-deficient bacteria [37]. This hypothesis states that, before the bacteria die, they generate a cell-wall-deficient form, and that the uptake rate of exogenous DNA increases because dying bacteria release DNA. The dying bacteria themselves do not perform the uptake, as they could not then return to life to propagate the acquired DNA. Combined with the fact that patients treated with diverse antibiotics are more prone to MRSA infection, this makes the hypothesis more credible.
Conjugation. Conjugation is another process that can lead to HGT. In this process, two bacterial cells attach, and plasmids and free intracellular DNA fragments travel from the donor cell to the target cell. In the target cell, plasmids can begin to replicate, whereas DNA fragments may become incorporated into the target cell’s genome. A study on MRSA conjugation stated that this mechanism could be one of the pathways by which methicillin resistance is transmitted among S. aureus strains [38].
Transduction. Another mechanism of bacterial genome variation is called transduction. When a bacterium is infected with a virus (usually a bacteriophage), the virus uses the cell’s materials to produce new viruses and kills the bacterium when it reaches some threshold. During the production of new viruses, some fragments of the original bacterium’s DNA are taken up into the virus’s DNA. When this virus infects other cells, it also introduces this DNA into the new cell. If the infection enters the latent phase and the fragments contain antibiotic-resistance genes, the bacteria become antibiotic-resistant [39]. Virus latency (or viral latency) is the ability of a pathogenic virus to lie dormant (latent) within a cell, denoted as the lysogenic part of the viral life cycle. If a virus becomes latent, it neither reproduces nor kills the host cell.
So, as we see it, there is a variety of genetic alterations in bacteria that are technically not mutations but often produce virulent species. While it is a common conjecture that bacteria do not mutate under chemical agents such as antibiotics, the latter do cause formidable DNA changes, turning some strains into dangerous forms that can even be life-threatening. Most of these transformations are detectable, and they can be observed and analyzed mathematically to determine the odds of their changes, whether or not we call them mutations. Therefore, it stands to reason to consider those phenomena, place them in the same category as mutations, and apply the same mathematical tools to establish rigorous estimators of their genetic transformations. It is also possible that an epigenetic change can occur that may be detectable by selection.

1.5. The Results

In the present paper, we show that the estimator proposed in [18] is not only almost surely pointwise consistent but also consistent in the $L^2$-norm, giving us yet another mathematical confirmation of the goodness of the proposed estimator. We also prove the convergence with respect to other probability measures. We further assume that a microorganism can mutate into one of two types, which, in particular, can model the spread of virulent antibiotic-resistant bacteria (referred to in the media as “superbugs”) that evolve under attacks by chemical agents so that they either die or mutate to survive the attacks. Consequently, mutated or altered bacterial variants that survive are referred to as type 1 mutants and the other mutants are called type 2 mutants. We also propose estimators for either type of mutant. These estimators turn out to be unbiased and consistent in every common sense, that is, almost surely pointwise and in the $L^2$-norm. (Some other forms of consistency are discussed throughout the forthcoming sections.)
We note that while the death of bacteria is present in most bacterial cultures, it is rarely used in mathematical modeling. We address this matter in the present article and even generalize our model, allowing replicated offspring to turn into one of the two mutant types with probabilities $p$ and $1-p$, respectively. It makes sense to call such division symmetric when $p = \tfrac{1}{2}$. One of the two types may represent bacterial death, which, as mentioned, takes place under the use of a chemical agent (such as an antibiotic) or under bacteriophage attack.

2. Types of Stochastic Convergence

To make our paper self-contained, we present a short background on types of convergence of random variables (r.v.’s) related to the consistency of our estimators. (Cf. Dshalalow [40].)
Definition 1
(Types of convergence for a sequence of r.v.’s). Let $Z, Z_1, Z_2, \ldots$ be a sequence of r.v.’s on a probability space $(\Omega, \mathcal{F}, P)$. We say that the sequence $\{Z_n\}$ converges to the r.v. $Z$
(i) in probability, in notation $Z_n \xrightarrow{P} Z$, if
$$\lim_{n\to\infty} P\{|Z_n - Z| > \varepsilon\} = 0 \quad \text{for each } \varepsilon > 0,$$
(ii) in the $p$-norm (or in the $L^p$-norm or in the $p$-th mean), in notation $Z_n \xrightarrow{L^p} Z$, if
$$\lim_{n\to\infty} E\left[|Z_n - Z|^p\right] = 0, \quad p \ge 1,$$
(iii) in the mean (or in the $L^1$-norm), in notation $Z_n \xrightarrow{L^1} Z$, if
$$\lim_{n\to\infty} E\left[|Z_n - Z|\right] = 0,$$
(iv) almost surely (a.s.) (or $P$-almost surely), in notation $Z_n \xrightarrow{a.s.} Z$, if
$$P\left\{\omega \in \Omega : \lim_{n\to\infty} |Z_n(\omega) - Z(\omega)| = 0\right\} = 1.$$
The latter means that $Z_n$ converges to $Z$ pointwise for almost all $\omega \in \Omega$.
Theorem 1.
The following relations hold true:
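$$Z_n \xrightarrow{a.s.} Z \;\Rightarrow\; Z_n \xrightarrow{P} Z \quad \text{and} \quad Z_n \xrightarrow{L^2} Z \;\Rightarrow\; Z_n \xrightarrow{L^1} Z \;\Rightarrow\; Z_n \xrightarrow{P} Z.$$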
Definition 2.
If $\hat{\theta}_n$ is an estimator of some parameter $\theta$, then $\hat{\theta}_n$ is a statistic (Borel measurable function) of a random sample $X_1, \ldots, X_n$ from $X$. In general, suppose $X_1, X_2, \ldots$ is a sequence of r.v.’s on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_n), P)$ adapted to $(\mathcal{F}_n)$. Notice that the r.v.’s $X_1, X_2, \ldots$ need not be independent nor identically distributed. Let
$$\hat{\theta}_n := \delta(X_1, \ldots, X_n)$$
be a statistic of the sample of the first $n$ r.v.’s. Then, $\{\hat{\theta}_n\}$ is a sequence of estimators associated with $\{X_n\}$ and induced by a Borel measurable function $\delta$. Let $\theta$ be a real number referred to as a parameter of $\{X_n\}$.
(i) The sequence $\{\hat{\theta}_n\}$ is unbiased if $E\hat{\theta}_n = \theta$ holds for all $n$.
(ii) The sequence $\{\hat{\theta}_n\}$ is asymptotically unbiased if $\lim_{n\to\infty} E\hat{\theta}_n = \theta$.
(iii) The sequence $\{\hat{\theta}_n\}$ is consistent
(a) in probability if $\hat{\theta}_n \xrightarrow{P} \theta$,
(b) in the $p$th mean (or in the $L^p$-norm) if $\hat{\theta}_n \xrightarrow{L^p} \theta$ for some $p \ge 1$,
(c) almost surely if $\hat{\theta}_n \xrightarrow{a.s.} \theta$.
We say that $\hat{\theta}_n$ is a consistent estimator in probability, in the $p$th mean, or almost surely, respectively, if so is the sequence $\{\hat{\theta}_n\}$.

3. A Background on Stochastic Estimators of Mutation Rate

Consider a population of microorganisms, such as bacteria, stemming from a single parent. Assume that the bacteria replicate by division into exactly two progeny. At generation $n$, there are exactly $2^n$ bacteria in the population, forming a deterministic branching process, provided that the bacteria do not die. Suppose that the population began with a single parent bacterium (at generation zero) and that, beginning from generation 1, each bacterium, independently of the others, mutates with probability $\mu \in (0, \tfrac{1}{2})$. The parent bacterium is thus assumed to be a nonmutant organism. Let $X_n$ be the total number of nonmutants at generation $n$, $NM_n$ be the total number of new mutants at generation $n$, and $M_n$ be the total number of all (pre-existing and new) mutants at generation $n$. We assume that $\mu$ (referred to as the mutation rate) is constant at every generation. To determine $\mu$ experimentally, we proposed in [18] the estimator
$$\hat{\mu}_{n+1} = \frac{NM_{n+1}}{2X_n} = \frac{2X_n - X_{n+1}}{2X_n} = 1 - \frac{X_{n+1}}{2X_n}, \tag{1}$$
which turned out to be unbiased and consistent a.s. on the nonextinction set $\Omega_0$ (defined below) such that $P(\Omega_0) = 1 - \mu^2/\nu^2$, where $\nu = 1 - \mu$. (For various bacteria, such as E. coli, this probability is extremely close to 1.)
Denote
$$C_n = \{\omega : X_n(\omega) > 0\} \quad \text{and} \quad \Omega_0 = \bigcap_{n=1}^{\infty} C_n. \tag{2}$$
Since the sequence $\{C_n\}$ is monotone decreasing, the notation $\Omega_0$ in (2) makes sense, and $\Omega_0$ is referred to as the nonextinction set. As mentioned above, $P(\Omega_0) = 1 - \mu^2/\nu^2 = 1 - \phi$.
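As a quick numerical sanity check of the nonextinction probability, the sketch below assumes (consistently with the model above) that each nonmutant leaves two offspring, each of which stays a nonmutant with probability ν = 1 − μ, so that the offspring generating function is f(s) = (μ + νs)². The extinction probability ϕ of such a Galton–Watson process is the smallest fixed point of f; the code approximates it by iterating f from 0 and compares it with μ²/ν². The function names are illustrative and are not taken from [18].

```python
def extinction_probability(mu: float, iterations: int = 200) -> float:
    """Approximate the smallest fixed point of f(s) = (mu + nu*s)**2 by iterating from 0."""
    nu = 1.0 - mu
    q = 0.0
    for _ in range(iterations):
        q = (mu + nu * q) ** 2  # probability that the nonmutant line dies out within one more generation
    return q


def extinction_closed_form(mu: float) -> float:
    """Closed form phi = mu**2 / nu**2 quoted in the text (valid for mu < 1/2)."""
    nu = 1.0 - mu
    return (mu / nu) ** 2


for mu in (0.01, 0.1, 0.3):
    print(mu, extinction_probability(mu), extinction_closed_form(mu))
```

For small μ the two values agree to machine precision after a handful of iterations, which illustrates why P(Ω₀) is so close to 1 for realistic mutation rates.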
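In light of our next applications, it is more convenient to turn to the confined space $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$, where
$$P_0(\cdot) = \frac{P(\cdot \cap \Omega_0)}{P(\Omega_0)} = P(\cdot \mid \Omega_0). \tag{3}$$
If $X : \Omega_0 \to \bar{\mathbb{R}}$ is a r.v. on the confined space $(\Omega_0, \mathcal{F} \cap \Omega_0, P_0)$, then (cf. Dshalalow [41]) the expected value $E_0$ (with respect to the measure $P_0$) is
$$E_0 X = \int X \, dP_0 = \frac{1}{P(\Omega_0)} \int_{\Omega_0} X \, dP = \frac{1}{P(\Omega_0)} E\left[\mathbf{1}_{\Omega_0} X\right]. \tag{4}$$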
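The formula for the confined expectation can be illustrated on a toy finite sample space. The sketch below is only an illustration under stated assumptions: it conditions on the event {X₁ > 0} of surviving the first generation (a stand-in for Ω₀, which is a tail event and cannot be enumerated), with X₁ distributed as Binomial(2, ν). The helper name `conditional_expectation` and the chosen value μ = 1/10 are hypothetical.

```python
from fractions import Fraction


def conditional_expectation(prob, values, event):
    """Return E[1_event * X] / P(event) on a finite sample space."""
    p_event = sum(prob[w] for w in event)
    if p_event == 0:
        raise ValueError("conditioning event has probability zero")
    return sum(prob[w] * values[w] for w in event) / p_event


mu = Fraction(1, 10)
nu = 1 - mu
# Outcomes are the possible values of X_1 (number of nonmutants in generation 1).
prob = {0: mu**2, 1: 2 * mu * nu, 2: nu**2}
values = {0: 0, 1: 1, 2: 2}
survive = {1, 2}  # the event {X_1 > 0}

e0 = conditional_expectation(prob, values, survive)
print(e0)  # exact rational value of E[X_1 | X_1 > 0]
```

Exact rational arithmetic is used so that the result can be compared with the hand computation 2ν/(1 − μ²) = 2/(1 + μ).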

4. Preliminaries

Let $W_n = \dfrac{X_n}{(2\nu)^n}$, $n = 0, 1, \ldots$, be the normalized sequence of the respective nonmutant populations. This sequence is a martingale that converges almost surely on $\Omega_0$ to a r.v. $W$ (cf. [42]). Furthermore, Theorem 2 [Section 6, Chapter I] of [42] reveals the nature of this r.v., also stating that the convergence of $W_n$ to $W$ holds in the mean square as well.
Note that Theorem 2 below is borrowed from [42], where it was originally formulated and proved as Theorem 2. It could be more rigorously crafted under the specification of the confined probability space $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$ introduced in (3) and (4). We proceed with those key assumptions throughout the rest of the paper.
Theorem 2
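([42]). If $\nu > \tfrac{1}{2}$, then
(i) $\|W_n - W\|_{L^2} \to 0$.
(ii) $E W = 1$, $\operatorname{Var} W = \dfrac{1}{2\nu} \cdot \dfrac{\sigma^2}{2\nu - 1} = \dfrac{\mu}{\nu - \mu}$.
(iii) $P\{W = 0\} = \phi = 1 - P(\Omega_0)$.
From Theorem 2 (iii), we have that $P_0\{W > 0\} = 1$.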
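The variance in Theorem 2 (ii) can be cross-checked with a short exact recursion. The sketch assumes the offspring law Binomial(2, ν) for a nonmutant, hence mean m = 2ν and variance σ² = 2νμ (the value of σ² is inferred from the identity in (ii), not quoted from [42]). By the law of total variance, Var X_{k+1} = m² Var X_k + σ² m^k, so that Var W_{k+1} = Var W_k + σ²/m^{k+2}; the function name is illustrative.

```python
def normalized_variance(mu: float, generations: int) -> float:
    """Var(W_n) for W_n = X_n / (2*nu)**n, accumulated generation by generation."""
    nu = 1.0 - mu
    m = 2.0 * nu            # mean number of nonmutant offspring per nonmutant
    sigma2 = 2.0 * nu * mu  # variance of a Binomial(2, nu) offspring count
    var = 0.0               # W_0 = 1 is deterministic
    for k in range(generations):
        var += sigma2 / m ** (k + 2)
    return var


mu = 0.01
limit = mu / ((1.0 - mu) - mu)  # mu / (nu - mu) from Theorem 2 (ii)
for n in (1, 5, 10, 60):
    print(n, normalized_variance(mu, n), limit)
```

The partial sums form a geometric series with ratio 1/(2ν), so they approach μ/(ν − μ) quickly whenever ν is well above 1/2.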
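Assume that $\nu > \tfrac{1}{2}$. Since $W_n \to W : \Omega \to \{0\} \cup (0, \infty]$ with the distribution $\phi$ and $1 - \phi$, respectively, the sequence $\{X_n\}$ converges to
$$X_\infty = \lim_{n\to\infty} W_n (2\nu)^n,$$
where $X_\infty$ is defined on $\Omega$ and valued in $\{0, \infty\}$ with the distribution $\phi$ and $1 - \phi = P(\Omega_0)$, respectively. Therefore, on the confined probability space $(\Omega_0, \mathcal{F} \cap \Omega_0, P_0)$, $X_\infty$ equals $\infty$ $P_0$-a.s. Hence, the r.v. $\frac{1}{X_\infty}$ on $(\Omega_0, \mathcal{F} \cap \Omega_0, P_0)$ equals $0$ $P_0$-a.s., which implies that
$$E_0 \frac{1}{X_\infty} = 0.$$
Now, because $X_n \ge 1$ on $C_n = \{X_n > 0\}$, we have $\frac{1}{X_n} \le 1$ on $C_n$. Therefore,
$$\frac{1}{X_n} \mathbf{1}_{C_n} \le \mathbf{1}_{C_n}.$$
Because $\mathbf{1}_{C_n}$ is $L^1(P)$-integrable, using the Lebesgue dominated convergence theorem (LDCT),
$$\lim_{n\to\infty} E\left[\frac{1}{X_n} \mathbf{1}_{C_n}\right] = E\left[\lim_{n\to\infty} \frac{1}{X_n} \mathbf{1}_{C_n}\right] = E\left[\lim_{n\to\infty} \frac{1}{X_n} \cdot \lim_{n\to\infty} \mathbf{1}_{C_n}\right] = E\left[\frac{1}{X_\infty} \mathbf{1}_{\Omega_0}\right] = 0.$$
Thus, we proved the following lemma.
Lemma 1.
$E_0 \frac{1}{X_\infty} = E\left[\frac{1}{X_\infty} \mathbf{1}_{\Omega_0}\right] = 0$, where $E_0$ and $E$ are the integrals on the respective spaces $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$ and $(\Omega, \mathcal{F}, P)$.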
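Next, recall [18] that $NM_{n+1}$ is the number of new mutants at generation $n+1$, equal to
$$NM_{n+1} = 2X_n - X_{n+1}.$$
Thus,
$$\frac{NM_{n+1}}{(2\nu)^{n+1}} = \frac{1}{\nu} W_n - W_{n+1} \to \frac{\mu}{\nu} W \quad \text{a.s. pointwise}.$$
Similarly, because $NM_{n+1} \ge 1$ a.s. on $A_{n+1} = \{NM_{n+1} > 0\}$, it follows that $\frac{1}{NM_{n+1}} \le 1$ on $A_{n+1}$, which implies
$$\frac{1}{NM_{n+1}} \mathbf{1}_{A_{n+1}} \le \mathbf{1}_{A_{n+1}}.$$
By the LDCT and because $A_{n+1} \subseteq C_n$,
$$\lim_{n\to\infty} E\left[\frac{1}{NM_{n+1}} \mathbf{1}_{A_{n+1}}\right] = E\left[\lim_{n\to\infty} \frac{1}{NM_{n+1}} \mathbf{1}_{A_{n+1}}\right] \le E\left[\lim_{n\to\infty} \frac{1}{NM_{n+1}} \mathbf{1}_{C_n}\right] = E\left[\frac{1}{NM_\infty} \mathbf{1}_{\Omega_0}\right] = 0.$$
Using the same arguments as in the proof of Lemma 1, we easily conclude that
$$E_0 \frac{1}{NM_\infty} = 0.$$
In a nutshell,
Lemma 2.
$E_0 \frac{1}{NM_\infty} = E\left[\frac{1}{NM_\infty} \mathbf{1}_{\Omega_0}\right] = 0$, where $E_0$ and $E$ are the integrals on the respective spaces $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$ and $(\Omega, \mathcal{F}, P)$.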

5. The $L^2(P_0)$-Convergence of $\hat{\mu}_{n+1}$

Theorem 3.
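Let $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$ be the confined space introduced in Section 3 and $(\Omega, \mathcal{F}, P)$ be the original space. The estimator $\hat{\mu}_{n+1}$ is $P_0$-asymptotically unbiased and it converges to $\mu$ in the $L^2(P_0)$-norm.
Proof. 
We use the formula
$$\operatorname{Var}_0 \hat{\mu}_{n+1} = E_0 \hat{\mu}_{n+1}^2 - \left(E_0 \hat{\mu}_{n+1}\right)^2 = \frac{1}{P(\Omega_0)} E\left[\hat{\mu}_{n+1}^2 \mathbf{1}_{\Omega_0}\right] - \left(\frac{1}{P(\Omega_0)} E\left[\hat{\mu}_{n+1} \mathbf{1}_{\Omega_0}\right]\right)^2.$$
(i) $\hat{\mu}_{n+1} \mathbf{1}_{\Omega_0} \le \hat{\mu}_{n+1}$ (a dominating sequence that is $P$-integrable with $\int \hat{\mu}_{n+1} \, dP = E\hat{\mu}_{n+1} = \mu$). Thus, by the LDCT,
$$\lim_{n\to\infty} E\left[\hat{\mu}_{n+1} \mathbf{1}_{\Omega_0}\right] = E\left[\lim_{n\to\infty} \hat{\mu}_{n+1} \mathbf{1}_{\Omega_0}\right] = \mu P(\Omega_0),$$
implying that
$$E_0 \hat{\mu}_{n+1} \to \mu, \tag{9}$$
which means that the estimator $\hat{\mu}_{n+1}$ is $P_0$-asymptotically unbiased.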
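(ii) Next, we show that $\hat{\mu}_{n+1}$ is $L^2(P_0)$-integrable. First, from (1),
$$E\left[\hat{\mu}_{n+1}^2 \mid X_n\right] = E\left[\left(\frac{NM_{n+1}}{2X_n}\right)^2 \,\Big|\, X_n\right] = \frac{1}{4X_n^2} E\left[NM_{n+1}^2 \mid X_n\right].$$
Given the $\sigma$-algebra $\sigma(X_n)$, the r.v. $NM_{n+1} \sim B(2X_n, \mu)$, i.e., $NM_{n+1}$ is conditionally binomial with parameters $2X_n$ and $\mu$. Thus,
$$E\left[\hat{\mu}_{n+1}^2 \mid X_n\right] = \frac{1}{4X_n^2}\left(2X_n \mu\nu + 4X_n^2 \mu^2\right) = \frac{1}{X_n} \cdot \frac{1}{2}\mu\nu + \mu^2,$$
implying that
$$E\left[\hat{\mu}_{n+1}^2 \mathbf{1}_{C_n} \mid X_n\right] = \frac{1}{X_n} \cdot \frac{1}{2}\mu\nu \mathbf{1}_{C_n} + \mu^2 \mathbf{1}_{C_n}$$
and that
$$E\left[\hat{\mu}_{n+1}^2 \mathbf{1}_{C_n}\right] = \frac{1}{2}\mu\nu \, E\left[\frac{1}{X_n} \mathbf{1}_{C_n}\right] + \mu^2 P(C_n). \tag{10}$$
From (6) and (10), we have
$$E\left[\hat{\mu}_{n+1}^2 \mathbf{1}_{C_n}\right] \to \mu^2 P(\Omega_0).$$
Since $\hat{\mu}_{n+1}^2 \mathbf{1}_{\Omega_0} \le \hat{\mu}_{n+1}^2 \mathbf{1}_{C_n}$ and since $\hat{\mu}_{n+1}^2 \mathbf{1}_{C_n}$ is $P$-integrable, so is $\hat{\mu}_{n+1}^2 \mathbf{1}_{\Omega_0}$. Furthermore, because $\hat{\mu}_{n+1}^2 \le 1$ and because $\hat{\mu}_{n+1}^2 \mathbf{1}_{\Omega_0} \to \mu^2 \mathbf{1}_{\Omega_0}$ a.s., by the LDCT,
$$E\left[\hat{\mu}_{n+1}^2 \mathbf{1}_{\Omega_0}\right] \to \mu^2 P(\Omega_0),$$
implying that
$$E_0 \hat{\mu}_{n+1}^2 = \frac{1}{P(\Omega_0)} E\left[\hat{\mu}_{n+1}^2 \mathbf{1}_{\Omega_0}\right] \to \mu^2. \tag{11}$$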
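(iii) Using (9) and (11),
$$\lim_{n\to\infty} \left\|\hat{\mu}_{n+1} - E_0 \hat{\mu}_{n+1}\right\|_{L^2(P_0)}^2 = \lim_{n\to\infty} \operatorname{Var}_0 \hat{\mu}_{n+1} = \lim_{n\to\infty} \left(E_0 \hat{\mu}_{n+1}^2 - \left(E_0 \hat{\mu}_{n+1}\right)^2\right) = \mu^2 - \mu^2 = 0 \tag{12}$$
or
$$= \frac{1}{P(\Omega_0)} \mu^2 P(\Omega_0) - \frac{1}{P^2(\Omega_0)} \mu^2 P^2(\Omega_0) = 0.$$
(iv) Notice that while $\hat{\mu}_{n+1}$ is an unbiased estimator of $\mu$ with respect to the measure $P$, it need not be unbiased with respect to the measure $P_0$. As mentioned, however, $\hat{\mu}_{n+1}$ is asymptotically unbiased with respect to the measure $P_0$. To establish the $L^2(P_0)$-convergence of $\hat{\mu}_{n+1}$ to $\mu$, we use (9) and (12):
$$\left\|\hat{\mu}_{n+1} - \mu\right\|_{L^2(P_0)} \le \left\|\hat{\mu}_{n+1} - E_0 \hat{\mu}_{n+1}\right\|_{L^2(P_0)} + \left\|E_0 \hat{\mu}_{n+1} - \mu\right\|_{L^2(P_0)} = \sqrt{\operatorname{Var}_0 \hat{\mu}_{n+1}} + \left|E_0 \hat{\mu}_{n+1} - \mu\right| \to 0 \quad \text{as } n \to \infty.$$
Therefore, we proved that $\hat{\mu}_{n+1} \xrightarrow{L^2(P_0)} \mu$. □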
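The conditional second moment used in part (ii) of the proof can be verified numerically. The sketch below fixes a hypothetical value x for the nonmutant count X_n, sums the squared estimator over the exact Binomial(2x, μ) probability mass function, and compares the result with the closed form μν/(2x) + μ². The function names and the chosen values of x and μ are illustrative only.

```python
from math import comb


def second_moment_of_estimator(x: int, mu: float) -> float:
    """E[(NM / (2x))**2] for NM ~ Binomial(2x, mu), by direct summation over the pmf."""
    n = 2 * x
    nu = 1.0 - mu
    total = 0.0
    for k in range(n + 1):
        pmf = comb(n, k) * mu**k * nu ** (n - k)
        total += pmf * (k / n) ** 2
    return total


def second_moment_closed_form(x: int, mu: float) -> float:
    """Closed form (1/x) * (1/2) * mu * nu + mu**2 from the proof of Theorem 3."""
    return mu * (1.0 - mu) / (2 * x) + mu**2


for x in (1, 10, 50):
    print(x, second_moment_of_estimator(x, 0.1), second_moment_closed_form(x, 0.1))
```

As x grows, the first term vanishes and the second moment approaches μ², which is the mechanism behind the limit of the confined second moment in the proof.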

6. The Speed of Convergence

In [18], we validated the pointwise convergence of the estimator $\hat{\mu}_{n+1}$ to $\mu$ by simulation and, most importantly, noticed that the speed of convergence was fast. Considering the replication time of various bacteria (20–40 min), it generally takes no more than 12 to 15 h of lab experiments (or 25 to 30 generations) to attain a high-accuracy approximation of $\mu$. We wondered if similar qualities of $\hat{\mu}_{n+1}$ apply to the $L^2$-convergence (mean square) in some form. The considerations below are carried out in the confined space $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$.
Let
$$\xi_i^{[n]} = \begin{cases} 1, & i\text{th nonmutant in generation } n \text{ does not mutate}, \\ 0, & \text{otherwise}. \end{cases}$$
Then, $\eta_i^{[n]} = 1 - \xi_i^{[n]}$ is 1 if the $i$th nonmutant mutates and 0 otherwise.
We drop the superscript $[n]$ in the $\xi$’s and $\eta$’s, assuming that they are iid Bernoulli r.v.’s with parameters $\nu$ and $\mu$, respectively. Then,
$$X_{n+1} = \sum_{i=1}^{2X_n} \xi_i, \quad n = 0, 1, \ldots, \quad X_0 = 1,$$
and
$$NM_{n+1} = 2X_n - X_{n+1} = 2X_n - \sum_{i=1}^{2X_n} \xi_i = \sum_{i=1}^{2X_n} \eta_i,$$
implying that
$$\hat{\mu}_{n+1} = \frac{\sum_{i=1}^{2X_n} \eta_i}{2X_n}, \quad \text{where } \eta_i \sim B(1, \mu).$$
Setting $X_n = X$ (constant), $\hat{\mu}_{n+1}$ is the sample mean of $2X$ Bernoulli r.v.’s and thus, with $X$ large, by the CLT (central limit theorem), $\hat{\mu}_{n+1}$ has an approximately Gaussian distribution with parameters $\mu$ (mean) and $\mu\nu/(2X)$ (variance). Now, with $n$ being just 10 or larger and $\mu$ small, $X_n = X$ must be a large number. Thus, in a simulation, because $X_n$ is obtained empirically, it is a constant and, as noticed, $\hat{\mu}_{n+1} \sim N\left(\mu, \mu\nu/(2X)\right)$ approximately, which is good for all $n \ge 10$. The variance $\mu\nu/(2X)$ of $\hat{\mu}_{n+1}$ is small and must approach 0.
Suppose we “observe” bacterial replication on $M$ plates, where each colony starts with a single parent. Denote by $\hat{\mu}_{n+1}^{[k]}$ the estimator associated with the $k$th plate, $k = 1, \ldots, M$. For notational brevity, we will use the same symbol $\hat{\mu}_{n+1}^{[k]}$ for its estimate (i.e., observed value). Denote by $\bar{\mu}_{n+1}$ the sample mean of the $M$ estimators (estimates) $\hat{\mu}_{n+1}^{[k]}$, $k = 1, \ldots, M$. Thus,
$$\hat{\sigma}_{n+1}^2 = \frac{1}{M-1} \sum_{k=1}^{M} \left(\hat{\mu}_{n+1}^{[k]} - \bar{\mu}_{n+1}\right)^2$$
is the unbiased sample variance that is known to be consistent and weakly convergent to $\mu\nu/(2X)$ as $M \to \infty$. However, we expect that $\hat{\sigma}_{n+1}$ rapidly converges to 0 with just reasonable $n$ and $M$. This would give a heuristic feel of the true convergence speed of the variance of $\hat{\mu}_{n+1}$ (cf. Equation (12)).
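The plate statistics just defined can be computed in a few lines. The sketch below is a minimal illustration using Python's standard `statistics` module, whose `variance` function uses the divisor M − 1; the helper names and the sample numbers are hypothetical and do not come from the experiments in [18]. The second function evaluates the CLT variance μν/(2X) for a fixed nonmutant count X.

```python
from statistics import mean, variance


def plate_summary(estimates):
    """Return (sample mean, unbiased sample variance) of the per-plate estimates."""
    if len(estimates) < 2:
        raise ValueError("the unbiased sample variance needs at least two plates")
    return mean(estimates), variance(estimates)


def clt_variance(mu: float, x: int) -> float:
    """Approximate variance mu * nu / (2x) of the estimator for a fixed nonmutant count x."""
    return mu * (1.0 - mu) / (2 * x)


# Hypothetical per-plate estimates at one generation (M = 4 plates).
plates = [0.010, 0.012, 0.008, 0.010]
mu_bar, sigma2_hat = plate_summary(plates)
print(mu_bar, sigma2_hat, clt_variance(0.01, 1000))
```

Rejecting a single plate mirrors the fact that the divisor M − 1 vanishes for M = 1.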
Example 1.
To validate the results above, we will simulate the process for several fixed values of $\mu$ and compute the sample estimates of the mean and variance of μ, i.e., $\hat{\mu}_n$ and $\hat{\sigma}_n^2$, for generations $n = 1, 2, \ldots, 25$ and for different numbers of plates ranging from $M = 1$ to $M = 20$. The goals are twofold: (1) to validate the accuracy of the estimators empirically and (2) to experimentally examine the convergence of the estimators for various choices of generations $n$ and plates $M$. The estimators are more promising for practical use if convergence occurs for relatively small $n$ and $M$.
To compute estimates, we must simulate $M$ paths of the process for each experiment. For each sample path, we start with $X_0 = 1$ nonmutant at the 0th generation and, for each generation $n \ge 1$, we assume the number of nonmutants is a binomial random variable $X_n \sim B(2X_{n-1}, 1 - \mu)$ for a preselected mutation rate $\mu$. We generated $M$ paths from $n = 1$ to $n = 30$. The estimated mutation rate is computed at each generation $n = 0$ to $n = 29$ as $\hat{\mu}_{n+1} = NM_{n+1}/(2X_n)$ (rewritten as $1 - X_{n+1}/(2X_n)$ for convenience), and the mean $\bar{\mu}_n$ is taken for each experiment with $M$ plates. The sample variance will also be computed. Ideally, the estimate $\hat{\mu}_n$ will be near $\mu$ and the estimate $\hat{\sigma}_n^2$ will approach 0 quickly for relatively small $M$.
We follow this approach in several examples. In Figure 1, we let $\mu = 0.01$ and simulated the experiment for 25 generations for each selection of plates $M \in \{2, 5, 10, 20\}$. (We did not use $M = 1$, as the unbiased sample variance, with its divisor $M - 1$, is undefined for a single plate.) There are several intuitive occurrences here.
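The simulation procedure described above can be sketched as follows. This is not the code used for the figures; it is a minimal standard-library version under stated assumptions: the binomial draw is implemented as a sum of Bernoulli trials (adequate only for a modest number of generations; a library sampler such as NumPy's `Generator.binomial` could be substituted for 25 to 30 generations), plates whose nonmutant line goes extinct are excluded from the mean from that point on, and the names `simulate_plate` and `plate_means` are hypothetical.

```python
import random


def binomial(rng: random.Random, n: int, p: float) -> int:
    """Draw Binomial(n, p) as a sum of n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_plate(mu: float, generations: int, rng: random.Random):
    """Return the estimates mu_hat_1, ..., mu_hat_generations for one plate.

    An entry is None once the nonmutant population is extinct (X_n = 0),
    because the estimator 1 - X_{n+1} / (2 X_n) is then undefined.
    """
    x = 1  # X_0 = 1 nonmutant
    estimates = []
    for _ in range(generations):
        if x == 0:
            estimates.append(None)
            continue
        x_next = binomial(rng, 2 * x, 1.0 - mu)  # X_{n+1} ~ B(2 X_n, 1 - mu)
        estimates.append(1.0 - x_next / (2 * x))
        x = x_next
    return estimates


def plate_means(mu: float, generations: int, plates: int, seed: int = 0):
    """Average the per-plate estimates over the surviving plates, generation by generation."""
    rng = random.Random(seed)
    paths = [simulate_plate(mu, generations, rng) for _ in range(plates)]
    means = []
    for n in range(generations):
        alive = [path[n] for path in paths if path[n] is not None]
        means.append(sum(alive) / len(alive) if alive else None)
    return means


means = plate_means(mu=0.05, generations=12, plates=5, seed=1)
print(means)
```

Averaging only over surviving plates corresponds to working on the nonextinction set; with realistic mutation rates, extinct plates are rare.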
Since μ ^ n + 1 = N M n + 1 / 2 X n , the estimate is 0 if there are no new mutants in generation n + 1 . As we see in Figure 1, the estimates are indeed 0 before the first mutant appears in each experiment. An interesting pattern emerges here: for M = 20 and M = 10 plates, the first mutant appears in generation 1 and 2, respectively; for M = 5 plates, the first mutant appears at generation 3; and for M = 2 , the first mutant appears at generation 4. The time of the first mutation is inversely related to the number of plates, which makes sense because there are more opportunities for mutations with more plates and, therefore, the estimate is nonzero sooner with more plates.
Second, the estimates behave erratically between the initial escape from 0 until roughly generation 7–8. During the early generations, there are smaller nonmutant populations in the plates, meaning intuitively that the random mutations have larger impacts on the estimates.
As such, the variance is higher when the prior generation’s nonmutant population is smaller, and this nonmutant population grows over time in expectation since the mutation rate $\mu \le 0.05$, so we should expect high variances in early generations. After 7 or 8 generations, the variance seems to dissipate enough to allow the estimates to approach the true mutation rate $\mu = 0.01$.
Continuing beyond generation 8, estimates in all experiments approach $\mu = 0.01$, with the experiments with fewer plates, especially $M = 2$, taking longer to converge. Again, this is expected, as the conditional variance of $\bar\mu_{n+1}$ is inversely proportional to $M^2$, and a shrinking variance should result in convergence of the unbiased estimator.
Clearly, and unsurprisingly, the estimates have higher quality for larger generations $n$ and larger numbers of plates $M$, but more interestingly, what does this mean for experimental use of the estimator? First, we define a modified relative error for the estimate
$$e_{0.01} = \max\left\{ \frac{|\bar\mu_n - \mu|}{\mu},\ 0.01 \right\},$$
which is the relative error or 0.01, whichever is larger. Figure 2 is a heatmap of this modified relative error for each pair $n$ and $M$ from our simulations.
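As a small illustration of the thresholded error defined above, the following Python sketch computes $e_{0.01}$ for a single mean estimate. The function name `modified_relative_error` and the `floor` parameter are hypothetical names introduced here for illustration.

```python
def modified_relative_error(mean_estimate, mu, floor=0.01):
    """Relative error of the mean estimate, floored at `floor` (0.01 in the text)."""
    relative_error = abs(mean_estimate - mu) / mu
    return max(relative_error, floor)
```

Flooring at 0.01 means that every cell of a heatmap with relative error below 1% takes the same value, so the region where the estimate is already accurate is easy to see.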
Thresholding the relative error as $e_{0.01}$ allows us to see clearly in Figure 2 that relative errors of less than 0.01 occurred for all generations $n \ge 20$ regardless of the number of plates, but for more plates, near $M = 20$, these small errors occur reliably earlier, at generations $n < 15$. Thus, even low numbers of plates can be used if the experiment can continue for more generations. This is a promising finding for biology lab protocols, as growing more than 20–25 dishes per experiment is rarely feasible. After 15–20 generations (starting from a single cell), the mean has converged.
Beyond the behavior of the mean mutation rate estimates themselves, we can consider the estimated variance of the mutation rate, $\hat\sigma_{n+1}^2$. Figure 3 shows the estimated sample variance for several experiments with $M = 2$, 5, 10, and 20 plates.
As we see in Figure 3, the variance converges to 0 within 10–12 generations regardless of the number of plates, as prior analysis suggested should happen.

7. Estimator $\hat p_n$

We now assume that once a cell or microbe mutates, it turns into one of two mutant types, with probabilities $p$ and $q = 1 - p$, respectively. The type $i$ mutant then divides in accordance with a deterministic branching process throughout the generations and it does not mutate any further, nor does it alter its type. In particular, we can also interpret the mutation types as continued existence upon mutation (type 1) with probability $p$ or death (type 2) with probability $q = 1 - p$. Note that our analysis in this section is interchangeably carried out in the original and confined probability spaces $(\Omega, \mathcal{F}, P)$ and $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0)$, respectively. In all cases, the underlying functionals can be distinguished by the absence or presence of subscript 0.
Denote by
$$\hat p_n = \frac{NM_n^1}{NM_n} \quad \text{and} \quad \hat q_n = \frac{NM_n^2}{NM_n}$$
the proportions of type 1 and type 2 new mutants at the $n$th generation from among the $NM_n$ new mutants, respectively. We will show that the estimator $\hat p_n$ is unbiased and consistent (and so is $\hat q_n$).
Recall that $A_n = \{NM_n > 0\}$. Denote $g_n(z) = E z^{X_n}$. Below, we observe an interesting and unexpected result concerning the sequence $\{A_n\}$. Whereas it obviously is not monotone, as the following two assertions show, the sequence $\{P(A_n)\}$ of their measures is, under a weak and natural assumption, strictly monotone increasing (we recall that the sequence $\{P(C_n)\}$ with respect to nonmutants is monotone nonincreasing) and furthermore, like $P(C_n)$, $P(A_n)$ converges to $P(\Omega_0) = 1 - \phi$.
Theorem 4.
The following two assertions are valid.
(i)
$P(A_n^c) = g_{n-1}(\nu^2)$.
(ii)
If $\mu < \nu^2 < 1$, the sequence $\{P(A_n)\}$ is strictly monotone increasing.
Proof. 
(i)
Let $h_n(z) = E z^{NM_n}$. Then, because $NM_{n+1} \sim B(2X_n, \mu)$ given $\sigma(X_n)$,
$$h_{n+1}(z) = E\, E\left[ z^{NM_{n+1}} \mid X_n \right] = E (\mu z + \nu)^{2X_n} = g_n\left( (\mu z + \nu)^2 \right), \qquad P(A_{n+1}^c) = h_{n+1}(0) = g_n(\nu^2).$$
(ii)
We show that $g_n(\nu^2)$ is strictly monotone decreasing.
First off, $p(z) = (\nu z + \mu)^2$ is strictly monotone increasing for $z \in [0, 1]$. Now since $\nu^2 < 1$ and $p(1) = \max\{ p(z) : z \in [0, 1] \} = 1$,
$$p(\nu^2) = g_1(\nu^2) = (\nu^3 + \mu)^2 < 1.$$
Now, we show that
$$g_{n+1}(\nu^2) < g_n(\nu^2)$$
or that
$$g_{n+1}(\nu^2) = p\left( g_n(\nu^2) \right) < g_n(\nu^2).$$
Let $g_n(\nu^2) = z$. Consider $p(z) \diamond z$, where $\diamond$ stands for one of the three relations, $<$, $>$, or $=$. We have
$$(\nu z + \mu)^2 - z \diamond 0$$
or
$$\nu^2 z^2 - (1 - 2\nu\mu) z + \mu^2 \diamond 0.$$
The discriminant of the equation $\nu^2 z^2 - (1 - 2\nu\mu) z + \mu^2 = 0$ is
$$D = (1 - 2\nu\mu)^2 - 4\nu^2\mu^2 = 1 - 4\mu\nu = 1 + 4\nu^2 - 4\nu = (2\nu - 1)^2 \begin{cases} = 0, & \mu = \nu = \tfrac{1}{2}, \\ > 0, & \nu \ne \tfrac{1}{2}. \end{cases}$$
Furthermore, after some algebra,
$$\nu^2 z^2 - (1 - 2\nu\mu) z + \mu^2 = \nu^2 (z - z_1)(z - z_2),$$
where $z_1 = \mu^2 / \nu^2$ (which is the extinction probability $\phi$) and $z_2 = 1$. Therefore, $\nu^2 z^2 - (1 - 2\nu\mu) z + \mu^2 < 0$ (and $\diamond$ is $<$) for all $z \in (\mu^2 / \nu^2, 1)$. In conclusion,
if $z = \nu^2$, then $\nu^2 > \mu^2 / \nu^2$ whenever $\nu^2 > \mu$.
Therefore, if $\nu^2 > \mu$, $p(\nu^2) = g_1(\nu^2) = (\nu^3 + \mu)^2 < g_0(\nu^2) = \nu^2$. Next, since $p(z)$ is strictly monotone increasing in $z \in [0, 1]$ and because $p(\nu^2) < \nu^2$, it follows that
$$g_2(\nu^2) = p\left( p(\nu^2) \right) < g_1(\nu^2) = p(\nu^2).$$
Finally, since $p(z)$ is strictly monotone increasing, by the principle of mathematical induction,
$$g_{n+1}(\nu^2) = \underbrace{p(p(\cdots p(\nu^2) \cdots))}_{(n+1)\ p\text{'s}} < g_n(\nu^2) = \underbrace{p(\cdots p(\nu^2) \cdots)}_{n\ p\text{'s}},$$
which proves the assertion that $g_n(\nu^2)$ is a strictly monotone decreasing sequence, and so is the sequence $P\{NM_n = 0\} = P(A_n^c)$. □
Corollary 1.
$$\lim_{n \to \infty} P(A_n) = \lim_{n \to \infty} P\{NM_n > 0\} = 1 - \phi = P(\Omega_0) = 1 - \frac{\mu^2}{\nu^2}.$$
Proof. 
Let $\alpha_n = g_n(\nu^2)$. Then, from the equation
$$g_{n+1}(\nu^2) = p\left( g_n(\nu^2) \right) \quad \text{holding for } n = 0, 1, \ldots,$$
we have
$$\alpha_{n+1} = p(\alpha_n), \quad n = 0, 1, \ldots$$
Furthermore, since $\{\alpha_n\}$ is monotone decreasing, $\alpha := \lim_{n \to \infty} \alpha_n$ exists and it is unique. Because $p(z)$ is continuous, this limit from (15) can be found from the equation
$$\alpha = p(\alpha).$$
From the theory of branching processes, we know (cf. Kimmel and Axelrod [43]) that the extinction probability $\phi$ is the smallest positive root of the equation $z = p(z)$. In our case, $p(z) = (\nu z + \mu)^2$ and the extinction probability $\phi < 1$ if the mean $2\nu > 1$, which is routinely met when $\mu < \tfrac{1}{2}$. The equation $z = (\nu z + \mu)^2$ has two roots: 1 and $\phi = \mu^2 / \nu^2$, the latter being the smaller of the two. Since $\alpha_n \le \alpha_0 = \nu^2 < 1$, the limit $\alpha$ cannot be the root 1. Consequently,
$$\alpha = \lim_{n \to \infty} P\{NM_n = 0\} = \phi = \frac{\mu^2}{\nu^2}$$
ends up being equal to the extinction probability. Therefore,
$$\lim_{n \to \infty} P\{NM_n > 0\} = P(\Omega_0) = 1 - \frac{\mu^2}{\nu^2}.$$
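The recursion $\alpha_{n+1} = p(\alpha_n)$ with $\alpha_0 = \nu^2$ is easy to check numerically. The following Python sketch (the function names `offspring_pgf` and `prob_no_new_mutants` are hypothetical and introduced only for illustration) iterates the map $p(z) = (\nu z + \mu)^2$ so that the monotone decrease of Theorem 4 and the limit $\mu^2 / \nu^2$ of Corollary 1 can be observed for a chosen $\mu$.

```python
def offspring_pgf(z, mu):
    """p(z) = (nu z + mu)^2, the generating function of nonmutant offspring."""
    nu = 1.0 - mu
    return (nu * z + mu) ** 2


def prob_no_new_mutants(mu, n_max):
    """Return [g_0(nu^2), ..., g_{n_max}(nu^2)], where g_n(nu^2) = P(NM_{n+1} = 0)."""
    nu = 1.0 - mu
    alpha = nu ** 2  # g_0(nu^2) = nu^2 because X_0 = 1
    values = [alpha]
    for _ in range(n_max):
        alpha = offspring_pgf(alpha, mu)  # alpha_{n+1} = p(alpha_n)
        values.append(alpha)
    return values
```

With $\mu = 0.1$, for instance, the condition $\mu < \nu^2$ holds ($0.1 < 0.81$), and the iterates should decrease toward $(0.1 / 0.9)^2 \approx 0.0123$.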
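Since $\hat p_n = NM_n^1 / NM_n$, given $NM_n$, $NM_n^1 \sim B(NM_n, p)$ and therefore,
$$E\left[ \hat p_n \mid NM_n \right] = \frac{1}{NM_n} E\left[ NM_n^1 \mid NM_n \right] = \frac{1}{NM_n} NM_n\, p = p \implies E \hat p_n = p.$$
Furthermore,
$$E\left[ \hat p_n 1_{A_n} \mid NM_n \right] = p\, 1_{A_n} \implies E\left[ \hat p_n 1_{A_n} \right] = p\, P(A_n) \implies E_n \hat p_n = p.$$
Consequently, $\hat p_n$ is $P$- and $P_n$-unbiased, where $P_n(\cdot) = P(\cdot \cap A_n) / P(A_n)$.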
Theorem 5.
If $\nu > \tfrac{1}{2}$, $\hat p_n$ converges to $p$ in the $L^2(P_n)$-norm, the $L^2(P_0)$-norm, and a.s. pointwise. Furthermore, $\hat p_n$ is asymptotically $P_0$-unbiased.
Proof. 
(i)
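We start off with
$$E\left[ \hat p_n^2 1_{A_n} \mid NM_n \right] = \frac{1}{(NM_n)^2} 1_{A_n} \left( NM_n\, p q + (NM_n)^2 p^2 \right) = \left( p^2 + \frac{1}{NM_n} p q \right) 1_{A_n},$$
implying
$$E\left[ \hat p_n^2 1_{A_n} \right] = p q\, E\left[ \frac{1}{NM_n} 1_{A_n} \right] + p^2 P(A_n).$$
Because $\frac{1}{NM_n} 1_{A_n} \le 1$, $\hat p_n^2 1_{A_n}$ is $P$-integrable. By Lemma 2 (cf. Equation (7)),
$$\operatorname{Var}_n \hat p_n = p q \frac{1}{P(A_n)} E\left[ \frac{1}{NM_n} 1_{A_n} \right] + p^2 - p^2 = p q \frac{1}{P(A_n)} E\left[ \frac{1}{NM_n} 1_{A_n} \right] \to 0.$$
Thus, because $\hat p_n$ is $P_n$-unbiased,
$$\left\| \hat p_n - p \right\|_{L^2(P_n)} = \left\| \hat p_n - E_n \hat p_n \right\|_{L^2(P_n)} = \sqrt{\operatorname{Var}_n \hat p_n} \to 0.$$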
(ii)
Now we show that
$$\hat\mu_{n+1}^1 = NM_{n+1}^1 / (2X_n) \to \mu p \quad \text{a.s.}$$
The latter is true due to the following. As per our assumption, every new mutation takes place with a constant probability $\mu$. The new mutants form the process $\{NM_n\}$. The estimator $\hat\mu_n$ of $\mu$ is consistent in the sense that $\hat\mu_n \to \mu$ a.s. pointwise. If a new mutant is formed from a nonmutant, in our present model, it becomes a type 1 new mutant with a constant probability $p$, i.e., a type 1 new mutant emerges from a nonmutant with probability $p\mu$. We then focus on the numbers $X_n$ and $NM_{n+1}^1$ of nonmutants in generation $n$ and type 1 new mutants in generation $n + 1$, respectively. We can then reduce this model to the original one with $p\mu$ being a new mutation rate. Thus, we define the estimator $\hat\mu_{n+1}^1$ of $p\mu$ as $NM_{n+1}^1 / (2X_n)$, and the latter must converge to $p\mu$ a.s. pointwise [18].
Consequently, from Theorem 3, $\hat\mu_{n+1}^1$ converges to $p\mu$ in the $L^2(P_0)$-norm.
(iii)
Now, we turn to $\hat p_{n+1}$ defined as $NM_{n+1}^1 / NM_{n+1}$. Dividing the numerator and denominator by $2X_n$, we get
$$\hat p_{n+1} = \frac{NM_{n+1}^1 / (2X_n)}{NM_{n+1} / (2X_n)} = \frac{\hat\mu_{n+1}^1}{\hat\mu_{n+1}} \to \frac{p\mu}{\mu} = p, \quad \text{a.s. on } \Omega_0.$$
More specifically, if the numerator converges to $p\mu$ pointwise on $\Omega_1$ and the denominator converges pointwise to $\mu$ on $\Omega_2$ such that $P(\Omega_1) = P(\Omega_2) = P(\Omega_0)$, then $\hat p_{n+1} \to p$ pointwise on the set $\Omega_1 \cap \Omega_2$ and clearly $P(\Omega_1 \cap \Omega_2) = P(\Omega_0)$. Thus, the convergence of $\hat p_n$ to $p$ is a.s. pointwise on $\Omega_0$.
Consequently,
$$\hat p_n^2 \to p^2 \quad \text{a.s. pointwise on } \Omega_0$$
and
$$\hat p_n^2 1_{\Omega_0} \to p^2 1_{\Omega_0} \quad \text{a.s.}$$
Because $\hat p_n^2 \le 1$ for all $\omega$’s, by the LDCT and Equation (18),
$$\lim_{n \to \infty} E\left[ \hat p_n^2 1_{\Omega_0} \right] = E\left[ \lim_{n \to \infty} \hat p_n^2 1_{\Omega_0} \right] = p^2 P(\Omega_0).$$
(iv)
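$$\operatorname{Var}_0 \hat p_n = E_0 \hat p_n^2 - \left( E_0 \hat p_n \right)^2 = \frac{1}{P(\Omega_0)} E\left[ \hat p_n^2 1_{\Omega_0} \right] - \frac{1}{P^2(\Omega_0)} E^2\left[ \hat p_n 1_{\Omega_0} \right].$$
Note that $\hat p_n 1_{\Omega_0} \le \hat p_n$ and $\hat p_n$ is $P$-integrable ($\int \hat p_n \, dP = p$).
Thus, by the LDCT,
$$\lim_{n \to \infty} E\left[ \hat p_n 1_{\Omega_0} \right] = p\, P(\Omega_0)$$
and
$$E_0 \hat p_n \to p,$$
so that $\hat p_n$ is asymptotically $P_0$-unbiased. Consequently,
$$\operatorname{Var}_0 \hat p_n \to p^2 - p^2 = 0.$$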
(v)
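$$\left\| \hat p_n - p \right\|_{L^2(P_0)} \le \left\| \hat p_n - E_0 \hat p_n \right\|_{L^2(P_0)} + \left\| E_0 \hat p_n - p \right\|_{L^2(P_0)} = \sqrt{\operatorname{Var}_0 \hat p_n} + \left| E_0 \hat p_n - p \right| \to 0.$$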
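The two conditional moments used above, $E[\hat p_n \mid NM_n = m] = p$ and $E[\hat p_n^2 \mid NM_n = m] = p^2 + pq/m$ for $m > 0$, can be verified exactly by summing over the binomial distribution. The Python sketch below does this with the standard library only; the function name `conditional_moments` is hypothetical and introduced here for illustration.

```python
from math import comb


def conditional_moments(m, p):
    """Exact first and second moments of p_hat = K / m when K ~ B(m, p), m > 0."""
    q = 1.0 - p
    first = 0.0
    second = 0.0
    for k in range(m + 1):
        weight = comb(m, k) * p ** k * q ** (m - k)  # P(K = k)
        first += (k / m) * weight
        second += (k / m) ** 2 * weight
    return first, second
```

For any $m \ge 1$, the first value should equal $p$ and the second $p^2 + pq/m$ up to rounding, which makes visible why the variance term $pq\,E[1_{A_n} / NM_n]$ shrinks as the number of new mutants grows.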
Remark 1.
(i)
If the bacteria of the second mutation type are dead, the formula $\hat p_{n+1} = NM_{n+1}^1 / NM_{n+1}$ is impractical for laboratory use, since the dead cells are not counted. Here is a way out.
First, notice that in this case, $NM_{n+1}^{[1]} = M_{n+1} - 2M_n$ (i.e., the same formula as for $NM_{n+1}$ with only one mutation type). This is exactly because the dead bacteria cannot be counted. Next, we have
$$2X_n = X_{n+1} + NM_{n+1}^1 + NM_{n+1}^2, \tag{19}$$
so that the “deficit” of the count is
$$NM_{n+1}^2 = 2X_n - X_{n+1} - NM_{n+1}^1 = 2X_n - X_{n+1} - M_{n+1} + 2M_n. \tag{20}$$
Therefore,
$$\hat p_{n+1} = \frac{NM_{n+1}^1}{NM_{n+1}} = \frac{NM_{n+1}^1}{NM_{n+1}^1 + NM_{n+1}^2} = \frac{M_{n+1} - 2M_n}{2X_n - X_{n+1}}. \tag{21}$$
(ii)
If the bacteria of the second mutation type are not dead, then $NM_{n+1}^{[1]} \ne M_{n+1} - 2M_n$, but
$$NM_{n+1}^{[1]} = M_{n+1}^1 - 2M_n^1. \tag{22}$$
However, Equation (19) still applies, whereas Equation (20) can be modified as
$$NM_{n+1}^2 = 2X_n - X_{n+1} - NM_{n+1}^1 = 2X_n - X_{n+1} - M_{n+1}^1 + 2M_n^1. \tag{23}$$
Obviously, Equation (21) can be modified as follows:
$$\hat p_{n+1} = \frac{M_{n+1}^1 - 2M_n^1}{2X_n - X_{n+1}}, \tag{24}$$
where superscripts 1 and 2 stand for the first and second type mutants, respectively.
In a nutshell, if type 2 bacteria die upon mutation, the formula for the estimator $\hat p_n$ of $p$ is simple and it does not require one to distinguish the bacteria types. This is not the case with two surviving types, as per Equation (24).
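The count-based formulas of this remark translate directly into a small helper. The Python sketch below is illustrative only; the function name `type1_proportion_estimate` and its argument names are hypothetical, and returning `None` when there are no new mutants is a convention chosen here because the ratio is then undefined.

```python
def type1_proportion_estimate(m1_next, m1_curr, x_next, x_curr):
    """Estimate p from cell counts in generations n and n + 1.

    m1_curr, m1_next: type 1 mutant counts M_n^1 and M_{n+1}^1 (equal to the total
    mutant counts M_n and M_{n+1} when type 2 mutants are dead and not counted).
    x_curr, x_next: nonmutant counts X_n and X_{n+1}.
    """
    new_type1 = m1_next - 2 * m1_curr    # NM_{n+1}^1 = M_{n+1}^1 - 2 M_n^1
    new_mutants = 2 * x_curr - x_next    # NM_{n+1} = 2 X_n - X_{n+1}
    if new_mutants <= 0:
        return None                      # no new mutants: estimate undefined
    return new_type1 / new_mutants
```

For example, with $X_n = 100$, $X_{n+1} = 190$, $M_n^1 = 4$, and $M_{n+1}^1 = 11$, there are 10 new mutants, of which 3 are of type 1, so the estimate is 0.3.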

8. Simulation of Estimators for Types 1 and 2

To validate the convergence and its speed, we run a simulation analogous to that in Section 6. After generating nonmutant paths as in Section 6, we evaluate the number of new mutants (all types) at each generation using $NM_{n+1} = 2X_n - X_{n+1}$. For each generation of new mutants, we run a binomial random number generator with distribution $B(NM_{n+1}, p)$ for mutation type 1; type 2 makes up the rest of the new mutant population, with distribution $B(NM_{n+1}, q = 1 - p)$, because $NM_{n+1} = NM_{n+1}^{[1]} + NM_{n+1}^{[2]}$. The new type $i$ mutants are added to the current population of type $i$ mutants (the previous generation’s mutants, doubled): $M_{n+1}^{[i]} = NM_{n+1}^{[i]} + 2 \cdot M_n^{[i]}$, with the assumption that $M_0^{[i]} = 0$ (initially, we have 0 mutants). The algorithm is looped for $n$ generations and we store the result in a new $(n + 1) \times M$ matrix whose first row is a zero row. The type 1 mutation rate is estimated by Equation (24), $\hat p_{n+1} = \frac{M_{n+1}^1 - 2M_n^1}{2X_n - X_{n+1}}$, and the output is an $n \times M$ matrix. The mean and variance of the type 1 mutation rate estimates were evaluated using the sample mean and the sample variance equation in Section 6, with convergence guaranteed by Theorem 5.
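The algorithm just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code behind the figures: it assumes NumPy, the function name `simulate_type1_estimates` is hypothetical, and generations with no new mutants are recorded as NaN here (the plotted estimates in the text are 0 in that case).

```python
import numpy as np


def simulate_type1_estimates(mu, p, n_generations, n_plates, seed=0):
    """Simulate two mutant types and return p-hat per generation and plate.

    Row n holds p_hat_{n+1} = (M_{n+1}^[1] - 2 M_n^[1]) / (2 X_n - X_{n+1}).
    """
    rng = np.random.default_rng(seed)
    x = np.ones(n_plates, dtype=np.int64)    # nonmutants, X_0 = 1
    m1 = np.zeros(n_plates, dtype=np.int64)  # type 1 mutants, M_0^[1] = 0
    p_hat = np.full((n_generations, n_plates), np.nan)
    for n in range(n_generations):
        x_next = rng.binomial(2 * x, 1.0 - mu)    # X_{n+1} ~ B(2 X_n, 1 - mu)
        new_mutants = 2 * x - x_next              # NM_{n+1}
        new_type1 = rng.binomial(new_mutants, p)  # NM_{n+1}^[1] ~ B(NM_{n+1}, p)
        m1_next = new_type1 + 2 * m1              # old type 1 mutants double
        has_new = new_mutants > 0
        p_hat[n, has_new] = (
            m1_next[has_new] - 2 * m1[has_new]
        ) / new_mutants[has_new]
        x, m1 = x_next, m1_next
    return p_hat
```

Taking `np.nanmean(p_hat, axis=1)` and `np.nanvar(p_hat, axis=1, ddof=1)` then gives the per-generation mean and sample variance across plates.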
For this simulation, we let the probability that a newly mutated bacterium is of type 1 be $p = 0.25$. Figure 4 shows the simulation results, specifically the mean of the type 1 mutation rate estimate.
As in the simulation in Section 6, we see a pattern here: the estimate is 0 if there are no new mutants. In this simulation, the first mutation does not occur until generation 3 for $M = 20$, generation 4 for both $M = 10$ and $M = 5$, and generation 7 for $M = 2$. The time of the first type 1 mutant is also inversely related to the number of plates, because the appearance of a type 1 mutant depends on the time of the first mutation, and, as discussed previously, more plates give more opportunities for mutations to occur.
The estimates stabilize after 9–10 generations, a couple of generations later than in Section 6, following the lag in the first mutation noted above. This can be explained by the probability of a type 2 mutation being larger than that of a type 1 mutation, so that type 1 mutants take more generations to appear. The following modified relative error heatmap (Figure 5) also shows this phenomenon.
Figure 5 shows a pattern similar to that of Figure 2: for all generations $n \ge 20$, regardless of the number of plates, the relative error is less than 0.01; the more plates there are, the fewer large relative errors occur in earlier generations. For all numbers of plates, the threshold for small relative error is $n \ge 16$, which is 1 to 2 generations behind the observed threshold from Figure 2 ($n \ge 14$). As in Section 6, this result is a promising finding for biology researchers, as the required number of dishes is small enough to be feasible in laboratory settings.
As shown in Figure 6, and similar to the simulation in Section 6, the estimated variance of $\hat p_n$ quickly converges to 0 after 13–17 generations (which is also a couple of generations behind the convergence of the variance discussed in Section 6), regardless of the number of plates.

9. Summary

In 2012, Niccum et al. [18] offered a simple stochastic estimator $\hat\mu_n$ of $\mu$ and rigorously proved that it was unbiased and consistent. The latter meant that the sequence $\{\hat\mu_n\}$ was almost surely pointwise convergent to $\mu$ on a nonextinct set. Moreover, the pointwise convergence proved to be fast, as validated by numerous simulation runs for various values of $\mu$.
In this paper, we studied the stochastic process describing bacterial mutation that begins at some point of their division. In our earlier paper [18] (by two of the present coauthors), we proposed a stochastic estimator $\hat\mu_n$ ($n = 1, 2, \ldots$) of the mutation rate $\mu$ (or more precisely, mutation probability) and showed that the sequence $\{\hat\mu_n\}$ converges to $\mu$ a.s. pointwise on the nonextinct set $\Omega_0 \subseteq \Omega$, where $(\Omega, \mathcal{F}, P)$ is the probability space on which all processes were considered. Here $P(\Omega_0) = 1 - \mu^2 / \nu^2$, where $\nu = 1 - \mu$, and thus $P(\Omega_0) \approx 1$ if $\mu$ is small, which is the case in many practical situations. The stochastic estimator $\hat\mu_n$ obeys Equation (1),
$$\hat\mu_{n+1} = \frac{NM_{n+1}}{2X_n},$$
where $X_n$ is the number of nonmutants in generation $n$ and $NM_{n+1}$ is the number of newly formed mutants in generation $n + 1$.
One of the subjects of interest was to establish a different type of convergence, namely $L^2$. We formed the confined space $(\Omega_0, \mathcal{F}_0 = \mathcal{F} \cap \Omega_0, P_0 = P / P(\Omega_0))$ and established the following results.
1.
The sequence of estimators $\{\hat\mu_n\}$ is $P_0$-asymptotically unbiased (that is, $E_0 \hat\mu_{n+1} \to \mu$) and it converges to $\mu$ in the $L^2(P_0)$-norm.
2.
In our earlier paper [18], even though we did not rigorously establish the speed of pointwise convergence of $\hat\mu_n$ to $\mu$, we conducted simulations then, and numerous times thereafter, showing that the convergence was rapid. With the assumed $\mu = 7.6 \times 10^{-6}$ (the number produced in our lab results), good accuracy was observed in the 30th generation. We also wanted to test the $L^2$-speed of convergence in the context of our paper. We used $\hat\mu_{n+1}$ in the form
$$\hat\mu_{n+1} = \frac{1}{2X_n} \sum_{i=1}^{2X_n} \eta_i, \quad \text{where } \eta_i \sim B(1, \mu),$$
setting $X_n$ as a constant, due to the step-by-step simulation procedures. Even with the generation $n$ relatively small, like 10 or greater, because $X_n$ was large, $\hat\mu_{n+1}$ is approximately $N\left( \mu, \mu(1 - \mu) / (2X_n) \right)$ (Gaussian) by the CLT argument. The variance of $\hat\mu_{n+1}$ is thus $\mu(1 - \mu) / (2X_n)$, which is virtually zero. Yet, we formed the unbiased sample variance
$$\hat\sigma_{n+1}^2 = \frac{1}{M-1} \sum_{k=1}^{M} \left( \hat\mu_{n+1}^{[k]} - \bar\mu_{n+1} \right)^2,$$
where $M$ is the number of plates, with bacterial colonies on each and with $\bar\mu_{n+1}$ being the sample mean of $M$ estimators (estimates) $\hat\mu_{n+1}^{[k]}$, $k = 1, \ldots, M$. The conclusion thus was that $\hat\sigma_{n+1}^2$ will converge to zero with $M$ not very large. Indeed, in Example 1, we validated it with $\mu = 0.01$, $M = 2, 5, 10, 20$ and with $n \ge 10$. Granted, the convergence would be slower for a smaller $\mu$, but the pointwise convergence of $\hat\mu_n$ to $\mu = 7.6 \times 10^{-6}$ was reached around $n = 30$ with $M = 10$ or 20.
3.
We further generalized our model by allowing bacteria to mutate into one of two mutant types, with constant probabilities $p$ and $q$. One interpretation of this model is that the second type is a dead bacterium (one that could have been attacked by viruses or a chemical agent). As mentioned in the introduction, it can deposit its DNA to be picked up by other bacteria. We proposed two respective estimators
$$\hat p_n = \frac{NM_n^1}{NM_n} \quad \text{and} \quad \hat q_n = \frac{NM_n^2}{NM_n}$$
and proved that they were unbiased and consistent. Here $NM_n^i$ is the number of newly formed type $i$ mutants. Namely, we proved in Theorem 5 that $\hat p_n$ converged to $p$ in the $L^2(P_n)$-norm, the $L^2(P_0)$-norm, and a.s. pointwise, and that $\hat p_n$ was asymptotically $P_0$-unbiased. To prove this result, we had to show that the sequence $\{P(A_n)\}$ was strictly monotone increasing and converged to $P(\Omega_0) = 1 - \frac{\mu^2}{(1 - \mu)^2}$. As in the above case, to test the speed of the $L^2$ convergence, we ran a simulation with $\mu = 0.01$, $p = 0.25$, and the number of plates $M = 2, 5, 10, 20$, showing a good speed for the mean and variance for $n \ge 20$ and $n \ge 15$, respectively.

Author Contributions

Conceptualization, J.H.D. and R.R.S.; methodology, J.H.D. and R.R.S.; software, V.M.N. and R.T.W.; validation, J.H.D., V.M.N. and R.T.W.; formal analysis, J.H.D. and V.M.N.; data curation, V.M.N. and R.T.W.; writing—original draft preparation, J.H.D. and V.M.N.; writing—review and editing, J.H.D., V.M.N. and R.T.W.; visualization, V.M.N. and R.T.W.; supervision, J.H.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Simulation source code and data will be made available upon request to the corresponding author.

Acknowledgments

The authors are very thankful to the anonymous referees, whose many comments were insightful and helped improve our paper.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
a.s.    almost surely
DNA     Deoxyribonucleic Acid
HGT     Horizontal Gene Transfer
LDCT    Lebesgue Dominated Convergence Theorem
MRSA    Methicillin-Resistant Staphylococcus aureus
VRSA    Vancomycin-Resistant Staphylococcus aureus

References

  1. Foster, P.L. Mechanisms of Stationary Phase Mutation: A Decade of Adaptive Mutation. Annu. Rev. Genet. 1999, 33, 57–88.
  2. Goodson-Gregg, N.; De Stasio, E.A. Reinventing the Ames Test as a Quantitative Lab That Connects Classical and Molecular Genetics. Genetics 2009, 181, 23–31.
  3. Luria, S.E.; Delbrück, M. Mutations of Bacteria from Virus Sensitivity to Virus Resistance. Genetics 1943, 28, 491–511.
  4. Drake, J.W. A Constant Rate of Spontaneous Mutation in DNA-based Microbes. Proc. Natl. Acad. Sci. USA 1991, 88, 7160–7164.
  5. Foster, P.L. Methods for determining spontaneous mutation rates. In Methods in Enzymology; Elsevier: Amsterdam, The Netherlands, 2006; Volume 409, pp. 195–213.
  6. Jones, M.E.; Thomas, S.M.; Rogers, A. Luria-Delbruck Fluctuation Experiments: Design and Analysis. Genetics 1994, 136, 1209–1216.
  7. Ma, W.T.; Sandri, G.V.; Sarkar, S. Analysis of the Luria-Delbrück Distribution Using Discrete Convolution Powers. J. Appl. Probab. 1992, 29, 255–267.
  8. Oprea, M.; Kepler, T.B. Improved Inference of Mutation Rates: II. Generalization of the Luria-Delbrück Distribution for Realistic Cell-Cycle Time Distributions. Theor. Popul. Biol. 2001, 59, 49–59.
  9. Rosche, W.A.; Foster, P.L. Determining Mutation Rates in Bacterial Populations. Methods 2000, 20, 4–17.
  10. Sarkar, S.; Ma, W.T.; Sandri, G.v.H. On Fluctuation Analysis: A New, Simple and Efficient Method for Computing the Expected Number of Mutants. Genetics 1992, 85, 173–179.
  11. Wu, X.; Strome, E.D.; Meng, Q.; Hastings, P.J.; Plon, S.E.; Kimmel, M. A Robust Estimator of Mutation Rates. Mutat. Res. 2009, 661, 101–109.
  12. Wu, X.; Zhu, H. Fast Maximum Likelihood Estimation of Mutation Rates Using a Birth-Death Process. J. Theor. Biol. 2015, 366, 1–7.
  13. Xiong, X.; Boyett, J.M.; Webster, R.G.; Stech, J. A Stochastic Model for Estimation of Mutation Rates in Multiple-replication Proliferation Processes. J. Math. Biol. 2009, 59, 175–191.
  14. Drake, J.W. The Molecular Basis of Mutation; Holden-Day: San Francisco, CA, USA, 1970.
  15. Stewart, F.M. Fluctuation Tests: How Reliable Are the Estimates of Mutation Rates? Genetics 1994, 137, 1139–1146.
  16. Zheng, Q. Progress of a Half Century in the Study of the Luria–Delbrück Distribution. Math. Biosci. 1999, 162, 1–32.
  17. Zheng, Q. The Luria-Delbrück Protocol Is Still the Most Practical. J. Theor. Biol. 2015, 386, 188–190.
  18. Niccum, B.A.; Poteau, R.; Hamman, G.E.; Varada, J.C.; Dshalalow, J.H.; Sinden, R.R. On an Unbiased and Consistent Estimator for Mutation Rates. J. Theor. Biol. 2012, 300, 360–367.
  19. Campbell, A. Synchronization of Cell Division. Bacteriol. Rev. 1957, 21, 263–272.
  20. Chang, Z.; Shen, Y.; Lang, Q.; Zheng, H.; Tokuyasu, T.A.; Huang, S.; Liu, C. Microfluidic Synchronizer Using a Synthetic Nanoparticle-Capped Bacterium. ACS Synth. Biol. 2019, 8, 962–967.
  21. Helmstetter, C.E.; Cummings, D.J. Bacterial Synchronization by Selection of Cells at Division. Proc. Natl. Acad. Sci. USA 1963, 50, 767–774.
  22. Helmstetter, C.E. A Ten-Year Search for Synchronous Cells: Obstacles, Solutions, and Practical Applications. Front. Microbiol. 2015, 6, 238.
  23. Anderson, P.A.; Pettijohn, D.E. Synchronization of Division in Escherichia Coli. Science 1960, 131, 1098.
  24. Kepes, F.; Kepes, A. Automatic synchronization of growth of “Escherichia coli” (author’s transl). Ann. Microbiol. 1980, 131, 3–16.
  25. Kepes, F.; Kepes, A. Freeze Preservation of Synchrony in Cultures of Enterobacteriaceae Synchronized by Continuous Phasing in Phosphate-Limited Media. Biotechnol. Bioeng. 1984, 26, 1288–1293.
  26. Kubitschek, H.E. Linear Cell Growth in Escherichia Coli. Biophys. J. 1968, 8, 792–804.
  27. Ling, H.; Chang, M.W. A Novel Synchronization Approach Using Synthetic Magnetic Escherichia Coli. Synth. Syst. Biotechnol. 2019, 4, 130–131.
  28. Noack, S.; Klöden, W.; Bley, T. Modeling Synchronous Growth of Bacterial Populations in Phased Cultivation. Bioprocess Biosyst. Eng. 2008, 31, 435–443.
  29. Shehata, T.E.; Marr, A.G. Synchronous Growth of Enteric Bacteria. J. Bacteriol. 1970, 103, 789–792.
  30. Wallden, M.; Fange, D.; Lundius, E.G.; Baltekin, Ö.; Elf, J. The Synchronization of Replication and Division Cycles in Individual E. Coli Cells. Cell 2016, 166, 729–739.
  31. Mortelmans, K.; Riccio, E.S. The Bacterial Tryptophan Reverse Mutation Assay with Escherichia Coli WP2. Mutat. Res. 2000, 455, 61–69.
  32. Hamel, A.; Roy, M.; Proudlock, R. The bacterial reverse mutation test. In Genetic Toxicology Testing; Elsevier: Amsterdam, The Netherlands, 2016; pp. 79–138.
  33. Lee, H.; Popodi, E.; Tang, H.; Foster, P.L. Rate and Molecular Spectrum of Spontaneous Mutations in the Bacterium Escherichia Coli as Determined by Whole-Genome Sequencing. Proc. Natl. Acad. Sci. USA 2012, 109, E2774–E2783.
  34. Szafrańska, A.K.; Junker, V.; Steglich, M.; Nübel, U. Rapid Cell Division of Staphylococcus Aureus during Colonization of the Human Nose. BMC Genom. 2019, 20, 229.
  35. Schaaff, F.; Reipert, A.; Bierbaum, G. An Elevated Mutation Frequency Favors Development of Vancomycin Resistance in Staphylococcus Aureus. Antimicrob. Agents Chemother. 2002, 46, 3540–3548.
  36. Juan, P.A.; Attaiech, L.; Charpentier, X. Natural Transformation Occurs Independently of the Essential Actin-like MreB Cytoskeleton in Legionella Pneumophila. Sci. Rep. 2015, 5, 16033.
  37. Woo, P.C.Y.; To, A.P.C.; Lau, S.K.P.; Yuen, K.Y. Facilitation of Horizontal Transfer of Antimicrobial Resistance by Transformation of Antibiotic-Induced Cell-Wall-Deficient Bacteria. Med. Hypotheses 2003, 61, 503–508.
  38. Cafini, F.; Nguyen, L.T.T.; Higashide, M.; Román, F.; Prieto, J.; Morikawa, K. Horizontal Gene Transmission of the cfr Gene to MRSA and Enterococcus: Role of Staphylococcus epidermidis as a reservoir and alternative pathway for the spread of linezolid resistance. J. Antimicrob. Chemother. 2016, 71, 587–592.
  39. Stanczak-Mrozek, K.I.; Laing, K.G.; Lindsay, J.A. Resistance Gene Transfer: Induction of Transducing Phage by Sub-Inhibitory Concentrations of Antimicrobials Is Not Correlated to Induction of Lytic Phage. J. Antimicrob. Chemother. 2017, 72, 1624–1631.
  40. Dshalalow, J.H. Real Analysis: An Introduction to the Theory of Real Functions and Integration; Studies in Advanced Mathematics; Chapman & Hall: Boca Raton, FL, USA, 2001.
  41. Dshalalow, J.H. Foundations of Abstract Analysis, 2nd ed.; Springer-Verlag: New York, NY, USA, 2014.
  42. Athreya, K.B.; Ney, P.E. Branching Processes; Die Grundlehren der mathematischen Wissenschaften 196; Springer-Verlag: Berlin, Germany, 1972.
  43. Kimmel, M.; Axelrod, D. Branching Processes in Biology, 2nd ed.; Interdisciplinary Applied Mathematics; Springer-Verlag: New York, NY, USA, 2015.
Figure 1. The mean mutation rate estimates of $\mu = 0.01$ for $M = 2, 5, 10, 20$ plates for generations $n = 1$ to 25.
Figure 2. Heatmap of modified relative errors $e_{0.01}$ of mean mutation rate estimates.
Figure 3. The estimated variance of mutation rate for $\mu = 0.01$, $M = 2, 5, 10, 20$ plates for generations $n = 1$ to 25.
Figure 4. The mean of type 1 mutation rate estimates of $p = 0.25$ for $M = 2, 5, 10, 20$ plates for generations $n = 1$ to 25.
Figure 5. Heatmap of modified relative errors $e_{0.01}$ of mean type 1 mutation rate estimates.
Figure 6. The estimated variance of type 1 mutation rate for $p = 0.25$, $M = 2, 5, 10, 20$ plates for generations $n = 1$ to 25.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Dshalalow, J.H.; Nguyen, V.M.; Sinden, R.R.; White, R.T. Determination of Mutation Rates with Two Symmetric and Asymmetric Mutation Types. Symmetry 2022, 14, 1701. https://doi.org/10.3390/sym14081701

AMA Style

Dshalalow JH, Nguyen VM, Sinden RR, White RT. Determination of Mutation Rates with Two Symmetric and Asymmetric Mutation Types. Symmetry. 2022; 14(8):1701. https://doi.org/10.3390/sym14081701

Chicago/Turabian Style

Dshalalow, Jewgeni H., Van Minh Nguyen, Richard R. Sinden, and Ryan T. White. 2022. "Determination of Mutation Rates with Two Symmetric and Asymmetric Mutation Types" Symmetry 14, no. 8: 1701. https://doi.org/10.3390/sym14081701

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
