Probabilistic Inference with Polymerizing Biochemical Circuits

Probabilistic inference—the process of estimating the values of unobserved variables in probabilistic models—has been used to describe various cognitive phenomena related to learning and memory. While the study of biological realizations of inference has focused on animal nervous systems, single-celled organisms also show complex and potentially “predictive” behaviors in changing environments. Yet, it is unclear how the biochemical machinery found in cells might perform inference. Here, we show how inference in a simple Markov model can be approximately realized, in real-time, using polymerizing biochemical circuits. Our approach relies on assembling linear polymers that record the history of environmental changes, where the polymerization process produces molecular complexes that reflect posterior probabilities. We discuss the implications of realizing inference using biochemistry, and the potential of polymerization as a form of biological information-processing.


Introduction
Probabilistic inference-a procedure for estimating unobserved variables in probabilistic models-has been used to describe various aspects of cognition [1,2]. In this line of work, organisms are thought to build probabilistic models of their world and use these to guide action and perception via inference. Although the work on biological instantiations of probabilistic inference has focused on animal nervous systems [3,4], single-celled organisms also show complex, history-dependent behaviors in changing environments. This history-dependent character has been described in cognitive terms, such as "learning," "memory," and "decision-making" [5][6][7][8][9][10][11][12][13][14][15].
The internal state of microbes, for instance, is shaped by past experiences [16][17][18]. B. subtilis populations exposed to distinct environmental perturbations and then grown in the same environment can be distinguished (based on gene expression) for as long as 24 h after the perturbation, suggesting that cells retain a "memory" of past environments [16]. E. coli cells grown in an environment that switches periodically from glucose to lactose eventually adapt to the switches and maintain a consistent growth rate through the switches [17]. Yeast strains anticipate their environment by producing a metabolic program needed to metabolize an alternative nutrient, even before the preferred nutrient is depleted [19]. All these behaviors occur on an ontogenetic timescale, much faster than that of mutation and natural selection.
These observations raise the question of whether probabilistic inference can be realized in cellular biochemistry. In principle, Chemical Reaction Networks (CRNs), commonly used to formalize biochemical systems, can approximate any computable function [20,21]-and therefore implement inference. Several studies have shown how CRNs can implement probabilistic inference [22][23][24][25][26]. For instance, [22] showed it is possible to construct a CRN that, at steady-state, encodes the joint probability distributions of probabilistic models known as factor graphs, and suggested enzyme-free DNA strand displacement [27] as a physical realization. However, strand displacement is an unlikely cellular mechanism. Moreover, cells act as the environment changes [13,24], and while the abundances of cellular components change in a stochastic manner [28,29], which raises an important question: which biologically plausible, cellular mechanisms can support inference under such stochastic conditions? Some studies have explored phosphorylation and transcriptional regulation as vehicles for forming memories and simple associations. For example, [10] designed biochemical circuits, which use either phosphorylation or transcriptional control, that can be conditioned through a unicellular analog of Hebb's rule, while [30] evolved chemical circuits in silico for associative learning. Such mechanisms could potentially be used to build inference-performing circuits.
Yet biochemistry also includes generative and combinatorial mechanisms, such as polymerization, which are not part of the typical repertoire of molecular mechanisms (e.g., protein phosphorylation or transcriptional regulation) that are thought to "compute" or "process information". Through polymerization, the cell produces structures such as microtubules and actin cables, which are highly regulated [31] and crucial to a variety of cellular functions. While these polymers are often studied for their mechanical properties [32], it has also been recognized since the early days of molecular biology that polymerization can be viewed as a computational process, capable of implementing logical automata [33] (more recent studies have also explored the computational power and properties of polymers; see [34] and references therein).
In this paper, we follow a similar line of thought, by exploring how the process of polymer assembly can be used to realize probabilistic inference. We show how the process of linear polymer assembly can be used to perform inference in a Markov model. Our circuit uses the polarity of polymers and their constituents to create a molecular record of the environment's history (its past sequence of changes), which we show can be used to anticipate the environment's dynamics and regulate other biochemical programs. This suggests that polymerization-due to its ability to produce macromolecules out of many combinations of parts-can be a useful motif of biological computation. The paper is organized as follows. We first introduce the use of probabilistic inference to anticipate an environment that changes according to a Markov model, and derive the representations and operations needed to approximately implement inference in real-time in such an environment. We then describe a circuit that can perform these operations by assembling linear polymers. We describe several properties of the circuit, and show how the probabilistic information it records can be used to regulate a specific response to environmental change. We close with a discussion of the properties of this circuit and potential future directions.

Real-Time Inference in Markov Chemical Environments
Probabilistic models have been used to represent a variety of dynamic, uncertain environments. One of the simplest probabilistic models is the discrete-time, finite-state Markov model, in which the state of the environment at time t is assumed to depend only on the prior states going back to the t − k time point (where k is the order of the model; when k = 1, we have a first-order Markov model). While this Markov assumption is violated by many natural processes [35], it will serve as a useful idealization for understanding how biochemical circuits that perform inference can be constructed.
We consider a changing environment that can be in one of two states, A or B, and where switches between states are driven by the Markov model. Such models are parameterized by two transition probabilities: the probability of switching from A to B, π AB , and the probability of switching from B to A, π BA -as shown in Figure 1A. The transition probabilities are unobserved. Through probabilistic inference, one could estimate the values of these probabilities, and use this information to anticipate the environment's switches-potentially in a way that would be useful to an organism adapting to a changing environment.  (A) A discrete-time Markov model that switches between two states, A and B, parameterized by two transition probabilities, π AB and π BA . Note that the probabilities of staying in the same state (self-transitions, not shown) are derived from these two parameters (π AA = 1 − π AB , π BB = 1 − π BA ). (B) An environment generated using the model shown in (A). At fixed 20 time unit intervals, 100 units of A or B are pulsed in (after removing any prior A or B) and allowed to degrade (and/or be consumed by the circuit; hence the diminishing abundances of A and B). Pulses denoted by arrow heads. (C) Transition matrix T that represents the sufficient statistics for the model shown in (A). Given the length of the sequence n drawn from the model, the sufficient statistics are the number of switches from A to B, and the number of switches from B to A. (D) Transition operation on T, which increments the appropriate counter when the environment switches from state i to j (e.g., A to B). This operation is needed for implementing inference in the model. (E) Normalization-sampling operation, which normalizes a row in T (converting counts to probabilities) and samples an entry in proportion to its probability.
To guide our search for biological mechanisms that could instantiate inference in this model, we analyze the task of inference in this model through David Marr's three levels scheme [36]. Marr's scheme is a general "top-down" approach to understanding biological computation that begins by describing the computational level (level 1): what are the "inputs" and "outputs" to the task, and constitutes a "correct" solution? Given this computational formulation of the task, the algorithmic level (level 2) asks for a suitable representation and procedure for carrying out the task. Finally, the hardware level (level 3) asks for biologically plausible mechanisms that can realize the representation and algorithm from level 2. The idea is that each level guides and constrains the next level. We will apply this approach to derive the computational representations and operations required for inference in our probabilistic model (levels 1 and 2), and then construct a biological circuit guided by these requirements (level 3).
We first constructed a Markov chemical environment whose dynamics are driven by the Markov model shown in Figure 1A. We can think of this chemical environment as a bioreactor in which organisms grow, and where A and B are analogous to nutrients which are flowed into the reactor by an experimenter and consumed by circuit. The Markov chemical environment is created as follows. We used sequences sampled from the discretetime Markov model to add/remove A and B at fixed time intervals, as follows. Initially, a state X 0 ∈ {A, B} is sampled from the Markov model and a fixed amount of X 0 is added to a reactor which contains only our circuit (no A or B). After a fixed time interval, we sample X t+1 from the model given the previous state X t . If X t+1 = X t , no perturbation is performed; otherwise, we remove all A and B present in the reactor and add X t+1 . An environment generated by this procedure using a Markov model where π AB = π BA = 0.95 is shown in Figure 1B.
Assuming the environment's perturbations are generated by a Markov model, probabilistic inference can be used to anticipate the environment's states. In particular, it is useful to compute the probability of encountering A or B next given past observations of the environment's states, history = X t , X t−1 , X t−2 , . . . , where X t corresponds to the state of the environment at time t. In Bayesian terms, anticipation of the environment means computing the posterior predictive distribution, P(X t+1 | history). This computation is complicated by the fact that the transition probabilities π AB and π BA are unknown. One option in such cases is to place prior probabilities on these parameters and then integrate them out to calculate, P(X t+1 | history, π AB , π BA )dπ AB dπ BA . We take an alternative and sometimes simpler strategy of estimating the unobserved transition probabilities, π AB and π BA , through Bayesian inference, and use these estimates to anticipate the next state: If a mathematically convenient prior distribution is chosen for P(π AB , π BA ), then Equation (1) can be solved exactly (see Appendix A). Using the posterior distribution, we can then obtain estimates of the transition probabilities, π AB , π BA , and use these to anticipate the next state by sampling X t+1 ∼ P(X t+1 | X t , π AB , π BA ): An important complication in our setting is that cells act while the environment is changing, so the posterior distribution (Equation (1)) must be estimated in real-time. However, this computation simplifies considerably when we consider the representation and algorithm needed to estimate the posterior distribution (Marr level 2). Crucially, the sequence of observations about the environment, history, does not need to be stored in full; it can be compressed into a matrix T of transition counts ( Figure 1C). Assuming n possible states of the environment, T is an n × n matrix whose ith row indicates the number of times the environment switched from the ith state to each of the states. Since the sum of the ith row must equal the total number of transitions from state i that have been observed, the matrix can be summarized by n 2 − n entries. In an environment with two states, the sufficient statistics are simply two counts: the number of times the environment switched from A to B, and from B to A (along with the total number of transitions; see Appendix A for details).
Using these counts as the representation, the following elegant algorithm for computing the posterior distribution in real-time emerges. As the environment changes, update the relevant counts in T. Then anticipate the next state given the current state i by: (1) normalizing the ith row of T to convert counts into probabilities, and (2) sampling a state from that row. This algorithm relies on a representation of counts (as stored in T), and two computational operations. First, it is necessary to update the counts by a transition update operation ( Figure 1D) that distinguishes A to B transitions from B to A transitions. Second, in order to use these counts in downstream computations, we need to be able to convert the counts into probabilities by normalization and sample from the resulting distribution; this is the normalization-sampling operation shown in Figure 1E. With these three ingredients-the transition counts from T, an operation that updates the transition counts, and an operation that normalizes and samples from a row of T-the posterior distribution can be estimated in real-time.

Using Polymer Assembly to Represent a Changing Environment
To realize inference biochemically based on the above analysis, we need a biochemical circuit that can perform the operations shown in Figure 1C-E. Such a circuit would need a representation of the transition matrix T, which encodes the number of switches between relevant states of the environment. Two operations on T would then need to be realized molecularly: (1) the ability to count directionally, i.e., to record when the environment has switched from state A to B (as opposed to from B to A) and store this information in T, and (2) the ability to access a row of counts in T, normalize it, and sample from the resulting probability distribution. This is challenging since directional counting requires recording a potentially unbounded number of switches in the environment. Some digital counters proposed in the synthetic biology literature, such as [37], have a fixed capacity-determined by the number of genetically components-and are thus not suitable.
We instead constructed a biological realization of the counts matrix and its operations using polymerization. We make use of the idea that the process of assembling a macromolecule such as a polymer can be seen as a computational process. The rules that say how to add a new part (monomer), X n+1 , to an existing polymer X 1 -X 2 -· · · -X n , depending on the state of the new part X n+1 and the configuration of the existing polymer's end, X n , describe a logical flow [33]. A simple example would be the rule: if X n+1 = X n , where X n+1 is a monomer, then add X n+1 after X n . If two types of monomers are available, e.g., X i = A or X i = B, then this process would generate alternating polymers such as A-B-A or B-A-B-A. The polymerization process, then, is akin to running a kind of logical automata, and the resulting polymer's sequence is a record of the computation, similar to the output tape of a Turing machine. In our case, however, the assembly process is not sequential and deterministic (as it is in a Turing machine), but rather driven by concurrent, stochastic biochemical reactions.
We use polymerization to represent directional changes in the environment by building a set of linear polymers, which we call transition history polymers, made up of A and B molecules. Each polymer represents the sequence of transitions that have occurred in the environment. The process of assembling the polymer will allow us to count the number of environmental transitions.
The transition history polymers are assembled as follows. We assume that the A and B molecules have a distinguishable "head" site and "tail" site, and that each polymer is built starting from a T monomer (which also has a head and tail site). Polymerization then proceeds according to two sets of rules: (i) an unbound T can bind the head of a free A or B molecule (i.e., those molecules whose head and tail sites are unbound), forming a TA or TB dimer (Figure 2A, top); and (ii) the unbound tail of a TA or TB dimer can bind the head of a free A or B (Figure 2A, bottom). This second set of rules records the direction of change by producing memory molecules corresponding to each of the four possible transitions: A-To-A memory molecule upon A to A binding, A-To-B upon A to B binding, B-To-A upon B to A binding, and B-To-B upon B to B binding. For simplicity, we assume all these polymerization rules are irreversible and proceed according to the polymerization rate constant k. The process of polymer assembly thus implicitly "senses" the environment, and uses the polarity of the polymer's constituents to record the direction of change. (Note that if A and B are viewed as nutrients, e.g., two different kinds of sugars consumed by microbial cells a bioreactor, we cannot simply assume that these molecules could polymerize into arbitrary linear polymers. However, we could always posit a third "scaffold" protein, which binds A and B, and can differentially polymerize based on its binding state, using the rules in Figure 2A. For simplicity, we assume A and B have those polymerizing capacities directly.) As a consequence of polymerization, the abundance of A-To-B and B-To-A will encode the number of transitions from A to B and B to A, respectively, while the abundance of A-To-A and B-To-B will correspond to the duration of each state (more on this point below). We will refer to the A-To-B and B-To-A memory molecules as transition-encoding memory molecules, because their abundance tracks switches, and the A-To-A and B-To-B memory molecules as duration-encoding memory molecules, because their abundance tracks the environmental state's duration (a point we will return to below). We further assume that the memory molecules are degraded at a constant rate. Thus, the polymer assembly process will produce memory molecules that, if relatively stable, reflect the environment's history of transitions, as we will see below. Note that the number of polymers that the system will construct will depend on the abundance of T monomers (which serve as the starting monomers for polymers). The number of polymers being assembled will in turn affect the abundance of memory molecules A-To-B and B-To-A, a point that we will return to below.
The additional operation needed for realizing inference is normalization-sampling ( Figure 2B). Normalization-sampling means normalizing the transition counts, and then sampling from the resulting probability distribution. On a traditional digital computer, this is straightforward to implement sequentially. A standard algorithm for sampling from a probability vector θ (in effect, sampling from a multinomial distribution) is: calculate a cumulative sum up to each element θ i , then iterate through θ, draw a uniform random number w ∈ [0, 1] for each entry θ i , and return the first θ i such that ∑ i j=1 θ j ≥ w. Within a cell, however, there is no readily accessible notion of sequential iteration to support this procedure; a concurrent mechanism is needed instead.
We can make use of the fact that concentrations lend themselves to encoding probabilities (also discussed in [22]) to derive a non-sequential version of the same computation. Consider two proteins, an activator and a repressor, that have expression levels c 1 and c 2 , respectively, ( Figure 2B). We can normalize and sample from the vector θ = [c 1 c 2 ] by having the activator and repressor bind, in mutually exclusive fashion, a third molecule which we call an integrator. If the integrator is not in excess of the activator and repressor, then the fraction of activator-bound integrator will be proportional to the concentration of activator normalized by the sum of concentrations of activator and repressor ( Figure 2B). This means that a molecular interaction that depends on the concentration of activatorbound integrator will occur in proportion to the probability c 1 c 1 +c 2 . This mechanism does not depend on sequential order: the interactions between activator, repressor, and integrator can all take place concurrently (similar integrator schemes have also used by other circuits that sense environmental changes in a way that is sensitive to order of exposure, e.g., [38]).
To encode the posterior probability of encountering A (or B) given the environment's history, we can use the activator-repressor-integrator scheme to "normalize" the transition counts encoded by memory molecules ( Figure 2C). When the environment is in state A, the A-To-B and A-To-B transition molecules will bind, in proportion to their abundance, to the A-To-B-integrator. The fraction of integrator molecules bound by A-To-B (relative to those bound by A-To-A) will represent P(A | history) (likewise for A-To-B and P(B | history). One complication is that the duration-encoding memory molecules, A-To-A and B-To-B, whose abundance is determined by the environmental state's duration, will inevitably be more abundant than A-To-B and B-To-A. Thus, the rate constants governing the interactions between memory molecules and integrators have to be adjusted to account for this bias (see Appendix A).
We have implemented this circuit using Kappa, a rule-based language for describing and simulating stochastic biochemical systems (such simulations correspond formally to continuous-time Markov chains as described in [39,40]). Rule-based representations are especially suited for modeling polymerizing circuits such as this one, which may correspond to an intractably large number of ODEs (due to the combinatorial explosion of species [41]). We next explore several features of this circuit when exposed to changing, probabilistic environments.

Circuit Behavior Reflects the Environment's History
In order to simulate our model, we had to choose the number of polymers T that the circuit has to work with. The resulting circuit is sensitive to the environment's history. When simulated (using Kappa) in a periodic environment (π AB = π BA = 0.95), the circuit accumulates transition-encoding memory molecules at each switch ( Figure 3A,B), as expected. The fraction of active integrators qualitatively matches the posterior estimates of the idealized model ( Figure 3C,D). As the circuit accumulates A-To-B memory molecules, for example, the fraction of active A-To-B integrators-i.e., those integrators bound by the A-To-B memory molecule-increases, following the analytic estimates of the transition probabilities ( Figure 3C,D). Moreover, the linear polymers at the end of the simulation reflect the history of periodic switches ( Figure 3E).
When exposed to an environment with different statistics, the circuit behaves accordingly. We considered an environment where A frequently switches to B, but switches from B to A are rare, meaning B is a "sticky" state (π AB = 0.85, π BA = 0.25) ( Figure 4A). In this environment, the transition-encoding memory molecules track the switches, but the duration-encoding memory molecule B-To-B reflects that the environment tends to dwell in state B ( Figure 4B). As a result, the fractions of active A-To-B and B-To-A integrators qualitatively track the analytic posteriors ( Figure 4C,D). For instance, after the onset of B (time 50), the fraction of active B-To-A decreases as the environment stays in B ( Figure 4C, time 50-750). The linear polymers assembled in this environment reflect this environment's history ( Figure 4E). Thus, the circuit is not limited to one particular Markov environment, but rather uses the process of polymer assembly to produce memory molecules that reflect the environment's transition history.

C) Fraction of active A-To-B and B-To-A integrators. (D) Posterior medians for transition probabilities
π AB (black, π AB ) and π BA (grey, π BA ), calculated using a discrete-time Markov model with a uniform prior. (E) Polymers at the end of the simulation.

Activity of Integrator Complexes Reflects Posterior Probabilities
We next compared our circuit's behavior to the idealized probabilistic model more systematically, by exposing the circuit to many different assignments of the transition probabilities π AB , π BA ( Figure 5). For each environment, we first estimated the median of the posterior over π AB at each time point and then took π AB to be the median of those estimates (and similarly for π BA to obtain π BA ). We then computed a similar quantity for our circuit by taking the median of the fraction of bound A-To-B integrators through time ( I AB ) and likewise for B-To-A integrators to obtain I BA . Comparisons are shown for 9 different settings of π AB , π BA that were used to produce changing environments consisting of 50 perturbations (each with duration 50 time units). Each parameter assignment is plotted in a different color (100 simulations per assignment). On x-axis, ratio of posterior estimates π AB and π BA for π AB and π BA , respectively. π AB is the median of the posterior distribution over π AB , and same for π BA (x-axis values were jittered with random, small values for visualization purposes). On y-axis, ratio of the of active A-To-B integrator to that of active B-To-A integrator. I AB is the fraction of active A-To-B integrators at the end of the environment (same for I BA ).
Since these quantities are not directly, quantitatively comparable, we compared the ratio of posteriors from the idealized model ( π AB π BA ) with the ratio of the relevant active integrators ( I AB I BA ). The comparison shows broad agreement (rank correlation ρ = 0.94) across different assignments of transition probabilities (each assignment was simulated 100 times, Figure 5). It is important to note that this can only be a qualitative comparison, since the biological realization of the posterior and the analytic model begin from different starting points (zero active in the circuit's case, versus a 50%-50% prior in the analytic model). Another caveat is these ratios being compared only reflect end-state results, not differences in the temporal trajectory of posterior ratios. We also stress again that the analytic model which operates in discrete-time is, by definition, an idealization that could never perfectly match the biological circuit which operates in continuous time and obviously lacks synchronous switches from one time point to another. In the next section, we describe some properties of our polymerizing circuit and its difference from the idealized discrete-time model.

Properties and Limitations of the Polymerizing Mechanism
While our circuit was guided by an analysis of inference in an idealized Markov model, these results indicate an important difference between that abstract model and our circuit as realized using polymerizing biochemical parts. Unlike in the abstract model we had started with, the circuit consumes the very "signal" that it is responding to, as As and Bs get incorporated into the circuit's linear polymers. Thus, the circuit shapes the environment it inhabits, rather than passively observing its dynamics.
The magnitude of this effect will depend on the rate at which A and B are incorporated into polymers (k in Figure 2A) and the number of polymers being assembled, which is bounded by the abundance of T monomers. The rate of polymer assembly will in turn influence the rate of production of memory molecules. If A and B are abundant and T is small, as in Figures 3 and 4 where there are 20 Ts, then a single switch from, say, A to B can produce at most 20 A-To-B memory molecules. The concentration of T, therefore, influences the magnitude of the circuit's memory (measured by copies of memory molecules), and this in turn affects the sensitivity and noise in the binding of integrators (in all of the above simulations, integrators are assumed to be fixed at 400 copies). The parameters governing polymer assembly, therefore, shape how responsive the circuit can be to environmental change.
The abundance of the circuit's components relative to those of A and B also matters. In all of our simulations, A and B were far more abundant (5000 copies) than the number of polymers being assembled (10 Ts), and we chose an appropriately small polymerization rate constant (k = 0.0001; recall that k is the rate constant governing the addition of monomers to the end of a polymer). If A and B were in low abundance and/or the polymerization rate was really slow, then environmental switches would go undetected. On the other hand, if the polymerization rate was too high and/or the concentration of T too high, the circuit could in theory consume all the "nutrients" (using these to construct very long polymers), and therefore become insensitive to different durations (i.e., be unable to produce duration-encoding memory molecules). Naturally, the stability of the memory molecules produced through polymerization will determine how far back the circuit will "remember" environmental switches.

Environments with Changing Dynamics
While the periodic and "sticky" environments we have explored produce very different sequences of switches, both fall within our definition of Markov environments. Moreover, we have assumed that each environment follows the same set of transition probabilities throughout. We next asked how the circuit would behave in environment's where there is a sudden change in the way that the environment changes (similar to the "meta-changing" environments described in [13]).
In the environment shown in Figure 6, for example, the initial sequence of switches (time 0-500) is drawn from a periodic environment, π AB = π BA = 0.95, and the remaining sequence from a "sticky" B environment, π AB = 0.85, π BA = 0.05 ( Figure 6, time 500 through end). The circuit adapts to the new set of statistics, as seen in the changing curve of fraction of active integrators ( Figure 6C, starting at time 500). The resulting polymers also correctly reflect the environment's two distinct segments ( Figure 6E). Naturally, the rate at which the circuit will adapt to such changes in the environment's underlying dynamics will depend on how quickly the memory molecules are degraded.

Using Probabilistic Information to Regulate Other Processes
The probabilistic information obtained from the history of switches may be used to regulate other processes within the cell, such as a response to a specific environment. For this to be possible, as some cognitive scientists have argued, the information has to be "accessible" [42]. For accessibility, it is not sufficient to only have the system "contain" the information in its states in an information-theoretic sense (as was shown for memory in B. subtilis [16]). Rather, the information has to be instantiated in a form that is usable by other molecular processes within the cell.
We next show how the probabilistic information about the environment's history can be used to regulate an environmental state-specific program, as shown in Figure 7A. In this scenario, the organism is probabilistically presented with a toxin, and (1) an anti-toxin program is required in order to increase chances of survival or growth, (2) the anti-toxin program takes time and substantial energy to produce (so it cannot be kept on at all times). Depending on the frequency of the toxin, the timescale of environmental change, and the associated metabolic costs, there are scenarios where it makes sense for organisms to anticipate the onset of the toxin and produce the program in advance ( [13]; see also [43] for a seminal discussion of such trade-offs in changing environments). We consider an abstract version of this problem in which a protein associated with the B state, B-Program (akin to the anti-toxin program), has to be produced prior to the onset of B in a periodic environment. We assume that the ideal expression profile of B-Program would one where B-Program is produced in advance, reaches its peak at the onset of B, and then degraded (as shown in Figure 7A). Moreover, as the circuit experiences more periodic switches, we expect the regulation of B-Program to become more pronounced, reflecting the fact that more "data" about the environment's transition probabilities has been obtained.

A Regulating a state-specific program
To create this anticipatory behavior, we can utilize the posterior probability of encountering B, encoded in the integrator-memory molecule complexes. We designed a circuit that regulates B-Program in proportion to its posterior probability as encoded by these complexes ( Figure 7B). This posterior probability is encoded in the fraction of A-To-B integrators that are active, i.e., bound by the A-To-B memory molecule. When A is present, it can bind these active integrators, forming a complex that produces B-Program ( Figure 7B). However, when A binds an A-To-B integrator that is repressed (i.e., bound by A-To-A memory molecule), the resulting complex will degrade B-Program. When B is present, a symmetric circuit that uses the B-To-A integrator to enact the same logic with B-To-A and B-To-B memory molecules.
Crucially, the rate constants governing this circuit must match the timescales of change in the environment. If the desired behavior is to have the A-specific program (for example) reach its maximal expression prior to the onset of A, the circuit must be tuned to the average duration of A and B states. This "tuning" is reflected in the choice of rate constants. The rate constants for these interactions-which determine the binding of A and B to integrators, and the production/degradation of B-Program-have to be assigned in a way that produces the desired "sawtooth" expression profile shown in Figure 7A. We used an optimization procedure to set these rates (see Appendix A for details). With the resulting rate constants, the circuit indeed regulates B-Program in sawtooth-like fashion, where the regulation becomes more pronounced with increasing exposure to switches ( Figure 7C). This behavior is consistent with the idea that more experiences of the environment lead to more confident estimates of transition probabilities (by accumulation of memory molecules), and that more confident estimates produce sharper responses to switches. Note that due to the regulation of B-Program, some of the A to B molecules remain bound to integrators during a switch, and are thus not removed, resulting in a mixture of A and B molecules in the next state. The number of molecules that can be carried over in this way is bounded by the number of integrator molecules.

Discussion
We have described a polymerizing circuit that performs probabilistic inference in a Markov model. Our analysis of the requirements of real-time inference in the Markov model led us to look for ways to biochemically realize two somewhat counter-intuitive operations: directional counting and normalization-sampling. The first was performed through polymer assembly, relying on the polarity of polymers and their constituents. The second was performed using a scheme where a pair of proteins bind to a third "integrator" protein in mutually exclusive fashion. With these two operations, the circuit can be used to regulate another biochemical process of interest, such as an environmental state-specific program.
The resulting circuit has several interesting properties. First, this circuit exhibits predictive behavior in a dynamic environment without relying on mutation and natural selection across generations. While mutation and selection can, from a mathematical perspective, implement solutions to probabilistic inference problems [44], and Darwinian evolution and inference algorithms have been argued to be conceptually similar strategies for solving the same basic problem [45], mutation and selection are nonetheless biologically distinct processes that operate on longer timescales than ontogenetic ones. The former operate on longer timescales; the latter can occur, theoretically, during the "lifetime" of a single cell experiencing environmental change. The ontogenetic behavior we explored is therefore distinct from laboratory evolution experiments with microbes or in silicoevolved circuits that were selected to fit a particular environment (although several previous studies have presented such experimental results as evidence of "predictive" cellular behavior [46,47], as argued in [30]).
Second, our circuit consumes the "signal" (A and B, in our case) that it anticipates. These molecules become part of the circuit (organism) through the process of polymer assembly. While we have referred to "the environment" as if separate from the circuit, this circuit actually alters the concentration of A and B (the magnitude of the effect depends on the relative abundances of nutrients and the rate of polymer assembly). This demonstrates the blurry line between metabolism and signal transduction [48].
Since the circuit records the history of change in polymers, this opens up possibilities for molecular mechanisms that utilize these sequences by interacting with the polymers. A variety of molecular mechanisms, such as kinesin motors that walk along microtubuleswhich have been intensively studied experimentally [49] and theoretically [31]-can be a source of inspiration. Microtubules can also be regulated through myriad post-translational modifications to tubulin proteins, which have been viewed as a kind of combinatorial "tubulin code" [50][51][52]. These modifications can affect the localization and size of microtubules by tuning the binding of microtubule-regulating proteins. Such mechanisms can also be explored as forms of biological information-processing.
A theoretical study of these mechanisms would benefit from further development of formal languages for representing and simulating combinatorial biochemistry, such as Kappa [40]. Kappa, for example, currently does not explicitly model physical space, which raises a challenge: how to capture more of the ways in which physical space constrains and shapes polymerization-in a dividing cell, for instance-while keeping the formal language sufficiently abstract so as to be interpretable, and the simulations tractable? Computational tools for visualizing polymer assembly, through time and space, could also help the study of polymerization as a form of information-processing.
Several open questions about polymerization and inference also remain. It would be interesting to explore how the circuit design we have proposed can be adapted to more complex environments. A recent study has investigated how inference in hidden Markov Models can be performed using CRNs [53], and it would be worth exploring whether a polymerization scheme for real-time inference of the sort we have developed could be used for these more complex Markov models that have latent states. Similarly, it may be fruitful to explore how circuits could cope with "meta-changing" environments whose dynamics change according to yet another unobserved stochastic process [13]. Another avenue would be to explore the energetic costs of such inference-performing circuits, and how they might be integrated into a larger cellular model that accounts for cell division, growth, and the cell's self-producing ("autopoietic") capacity [54,55].

Appendix A.1. Real-Time Inference in a Discrete-Time Markov Model
We briefly describe an algorithm for real-time inference in a Markov model (which was used to generate the analytic estimate of the posterior in all figures). We assume a firstorder, discrete-time Markov model over two discrete states, A and B, which is characterized by transition probabilities π AB and π BA (where π AA = 1 − π AB and π BB = 1 − π BA ). A sequence of observations is generated from the model by first sampling an initial state X 0 (from a uniform prior over A and B) and sampling subsequent observations X 1 , X 2 , X 3 , . . . using the transition probabilities: The quantity of interest here is the posterior distribution over transition probabilities: where history stands for the sequence of observations of past states, X t , X t−1 , . . . . A standard approach is to use independent priors over the transition probabilities, which lets us treat the posterior for each parameter separately: P(π AB | history) ∝ P(history | π AB )P(π AB ) P(π BA | history) ∝ P(history | π BA )P(π BA ) A convenient choice of prior over transition probabilities is the Beta distribution. A Beta distribution, Beta(θ; α 0 , α 1 ), over θ ∈ [0, 1] is determined by a pair of shape parameters α 0 , α 1 . Different settings of the alphas can be used to encode distinct beliefs about θ. As examples: α 0 = α 1 = 1 is equivalent to a uniform distribution over [0, 1], α 0 = α 1 = 100 is a unimodal distribution whose mode is at θ = 0.5, and α 0 < 1, α 1 < 1 (e.g., α 0 = α 1 = 0.5) is a U-shaped distribution corresponding to the belief that θ is likely to be close to 0 or 1. The Beta distribution is conjugate to the binomial distribution [56], meaning that a posterior that is the product of a Beta prior and a binomial likelihood equals another Beta distribution whose parameters can be calculated exactly.
The observations of the environment, history, can be compressed into a set of counts (as described in main text) that are binomially distributed, which allows us to use the Beta-Binomial conjugacy to calculate our posteriors of interest analytically. For a two-state Markov model, the relevant counts are: (1) c AB , the number of A → B transitions, (2) c BA , the number of B → A transitions, and (3) the number of total switches s in history. (Note that the other entries of the transition counts matrix T can be computed from this pair of summary statistics and s: c AA = s − c AB and c BB = s − c BA .) We can then expand the posterior over π AB as follows: P(π AB | history) ∝ P(history | π AB )P(π AB ) ∝ P(c AB , s | π AB )P(π AB ) ∝ Bin(C AB ; π AB , s)Beta(π AB ; α 0 , α 1 ) = Beta(π AB ; C AB + α 0 , s − C AB + α 1 ) (and similarly for π BA .) This posterior can be computed in real-time up through the kth observation (a task which is called "filtering" [57]) using a recursive procedure that proceeds as follows. Let C k AB be the number of A → B transitions up to the kth observation, and let b k π AB be the posterior up to the kth observation. The base case of the procedure is the first observation, which is the switch from X 0 → X 1 (where k = 1): We can now define the b k+1 π AB in terms of b k π AB by noting that the posterior as of the kth observation, b k π AB , serves as the prior at k + 1. Let α k 0 , α k 1 be the parameters of b k π AB . Now let I AB be an indicator variable that is 1 if X k → X k+1 is an A → B transition (and 0 otherwise), and I AA an indicator variable that is 1 if X k → X k+1 is an A → A transition (and 0 otherwise). Then the posterior after the k + 1 observation is: b k+1 π AB = Beta(π AB ; α k 0 + I AB , α k 1 + I AA ) Note that if X k = B, then I AB = I AA = 0, which means the estimate of the posterior over π AB does not change (i.e., b k+1 AB = b k AB ). The same derivation as above applies to the posterior over π BA .
To estimate the posterior over the transition probabilities in the above figures (e.g., Figure 3D), we used the median of the posterior distribution at each time step.

Appendix A.2. Adjustment of Rate Constants for Normalization-Sampling Circuit
As noted in the main text, the rate constants for the normalization-sampling circuit, which control the binding of integrators to memory molecules, have to be adjusted to account for the fact that memory molecules encoding transitions (A-To-B and B-To-A) will inevitably be less abundant than the memory molecules encoding durations (A-To-B and B-To-A). The four rate constants to be adjusted for the A-To-B integrator are shown in Figure A1 (the analogous set of rate constants for B-To-A integrator are not shown). The normalization-sampling circuit for the A-To-B integrator is modeled by the following reactions: where M AB and M AA are the species of unbound A-To-B and A-To-A memory molecules (respectively), I the unbound integrator, C the integrator-bound A-To-B memory molecule, and D the integrator-bound A-To-A memory molecule. In order to accurately reflect the environment's statistics, the transition-encoding memory molecules for the integrators should be higher than those of duration-encoding memory molecules. But how much higher?
We selected rate constants that, in a specific environment that switches evenly between A and B for N pulses, would result in half of the A-To-B integrator molecules bound by A-To-B and the other half by A-To-A (and likewise for B-To-A integrators). Such an assignment of rate constants means that the circuit accurately reflects the environment's statistics of equal switching, and that after N pulses, half of the integrators are occupied. (The latter is a reasonable but somewhat arbitrary choice; depending on the abundance of integrators and the nutrients in the environment, one could choose rate constants that result in a different fraction of the integrators being used.) To assign the rate constants, we used the following combination of simulation and analytic calculation. We first simulated our circuit, initialized with 20 T monomers and 400 integrators, on a perfectly periodic environment (π AB = 1.0, π BA = 1. (B) Procedure for adjusting the rate constants by first simulating a specific periodic environment to obtain the level of memory molecules, and then analytically solving for the desired rate constants as described in Appendix A.
We then expressed the normalization-sampling circuit described above as the following system of equations, which we represented in the computer algebra system Sympy [58]: where M AB0 , M AA0 , I 0 are the initial amounts of (unbound) memory molecules and integrators, respectively. We plugged in I 0 = 400, and the abundances of the memory molecules from our simulation on a periodic environment for M AB0 , M AA0 . Since we aim to have half of the integrators bound by M AB and the other half by M AA , we set C = D = 1 2 I 0 . We then assumed that the backward rate constants are equal and set them to a reasonable value (k 2 = k 4 = 0.005) and solved for k 1 , k 3 using Sympy. For the specific periodic environment we used, this calculation produces k 1 /k 3 ≈ 45.