Computation with finite stochastic chemical reaction networks


Abstract

A highly desired part of the synthetic biology toolbox is an embedded chemical microcontroller, capable of autonomously following a logic program specified by a set of instructions, and interacting with its cellular environment. Strategies for incorporating logic in aqueous chemistry have focused primarily on implementing components, such as logic gates, that are composed into larger circuits, with each logic gate in the circuit corresponding to one or more molecular species. With this paradigm, designing and producing new molecular species is necessary to perform larger computations. An alternative approach begins by noticing that chemical systems on the small scale are fundamentally discrete and stochastic. In particular, the exact molecular counts of each molecular species present are an intrinsically available form of information. This might appear to be a very weak form of information, perhaps quite difficult for computations to utilize. Indeed, it has been shown that error-free Turing universal computation is impossible in this setting. Nevertheless, we show a design of a chemical computer that achieves fast and reliable Turing-universal computation using molecular counts. Our scheme uses only a small number of different molecular species to do computation of arbitrary complexity. The total probability of error of the computation can be made arbitrarily small (but not zero) by adjusting the initial molecular counts of certain species. While physical implementations would be difficult, these results demonstrate that molecular counts can be a useful form of information for small molecular systems such as those operating within cellular environments.


Notes

  1. Stochastic simulation implementations. Systems Biology Workbench: http://sbw.sourceforge.net; BioSpice: http://biospice.lbl.gov; Stochastirator: http://opnsrcbio.molsci.org; STOCKS: http://www.sysbio.pl/stocks; BioNetS: http://x.amath.unc.edu:16080/BioNetS; SimBiology package for MATLAB: http://www.mathworks.com/products/simbiology/index.html

  2. All unimolecular reactions can be turned into bimolecular reactions by adding a dummy catalyst.

  3. The asymptotic notation we use throughout this paper can be understood as follows. We write f(x, y,…) = O(g(x, y,…)) if \(\exists c > 0\) such that \(f(x,y,\ldots) \leq c \cdot g(x,y,\ldots)\) for all allowed values of x, y, …. The allowed range of the parameters will be given either explicitly, or implicitly (e.g. probabilities must be in the range [0,1]). Similarly, we write \(f(x,y,\ldots)=\Upomega(g(x,y,\ldots))\) if \(\exists c > 0\) such that \(f(x,y,\ldots) \geq c \cdot g(x,y,\ldots)\) for all allowed values of x, y, …. We say \(f(x,y,\ldots)=\Uptheta(g(x,y,\ldots))\) if both f(x, y,…) = O(g(x, y,…)) and \(f(x,y,\ldots)=\Upomega(g(x,y,\ldots)).\)

  4. By the (extended) Church–Turing thesis, a TM, unlike a RM, is the best we can do, if we care only about super-polynomial distinctions in computing speed.

  5. For example, the naive approach of dividing #M by 2 by doing \(M + M \rightarrow M^{\prime}\) takes \(\Uptheta(1)\) time (independent of #M) as a function of the initial amount of #M. Note that the expected time for the last two remaining M’s to react is a constant. Thus, if this were a step of our TM simulation we would not attain the desired speed-up with increasing molecular count.

  6. A slight modification of the clock module is necessary to maintain the desired behavior. Because of the need for intermediate species (e.g. \(A^\dagger\)) for tripling #A and #A*, the clock reactions need to be catalyzed by the appropriate intermediate species in addition to A and A*.

  7. Since a reaction might simply not be chosen for an arbitrarily long time (although the odds of this happening decrease exponentially), we can’t insist on a zero probability of error at any fixed time.

  8. As \(m \rightarrow \infty,\) the difference between \(\sum_{q=1}^{m}(1/q)\) and log m approaches the Euler-Mascheroni constant.

  9. Chebyshev’s inequality states that for a random variable X with expected value μ and finite variance \(\sigma^2\), for any d > 0, \(\Pr[\left\vert{X - \mu}\right\vert \geq d \sigma] \leq 1/d^2.\)

  10. If \(i_0 > 1/\delta + 1\), then \(\delta > \int_{i_0-1}^\infty \frac{1}{x^2} dx > \sum_{x = i_0}^\infty \frac{1}{x^2}.\)

  11. This follows from the fact that \(\mathbb{E}[X|A]\leq (1/\Pr[A])\,\mathbb{E}[X]\) for random variable X and event A, and that the expected microstep duration conditional on the previous and current microsteps being correct is the same as the expected microstep duration conditional on the entire simulation being correct.

References

  • Adalsteinsson D, McMillen D, Elston TC (2004) Biochemical network stochastic simulator (BioNetS): software for stochastic modeling of biochemical networks. BMC Bioinformatics 5:24

  • Angluin D, Aspnes J, Eisenstat D (2006) Fast computation by population protocols with a leader. Technical Report YALEU/DCS/TR-1358, Yale University Department of Computer Science. Extended abstract to appear, DISC

  • Arkin AP, Ross J, McAdams HH (1998) Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics 149:1633–1648

  • Barak B (2002) A probabilistic-time hierarchy theorem for ‘slightly non-uniform’ algorithms. In: Proceedings of RANDOM

  • Bennett CH (1982) The thermodynamics of computation – a review. Int J Theor Phys 21(12):905–939

  • Berry G, Boudol G (1990) The chemical abstract machine. In: Proceedings of the 17th ACM SIGPLAN-SIGACT annual symposium on principles of programming languages, pp 81–94

  • Cook M (2005) Networks of relations. PhD thesis, California Institute of Technology

  • Elowitz MB, Leibler S (2000) A synthetic oscillatory network of transcriptional regulators. Nature 403:335–338

  • Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) Stochastic gene expression in a single cell. Science 297:1183–1185

  • Érdi P, Tóth J (1989) Mathematical models of chemical reactions: theory and applications of deterministic and stochastic models. Manchester University Press

  • Ethier SN, Kurtz TG (1986) Markov processes: characterization and convergence. Wiley

  • Gibson M, Bruck J (2000) Efficient exact stochastic simulation of chemical systems with many species and many channels. J Phys Chem A 104:1876–1889

  • Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81:2340–2361

  • Gillespie DT (1992) A rigorous derivation of the chemical master equation. Physica A 188:404–425

  • Gillespie DT (2007) Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 58:35–55

  • Guptasarma P (1995) Does replication-induced transcription regulate synthesis of the myriad low copy number proteins of Escherichia coli? Bioessays 17:987–997

  • Karp RM, Miller RE (1969) Parallel program schemata. J Comput Syst Sci 3(4):147–195

  • Kierzek AM (2002) STOCKS: STOChastic kinetic simulations of biochemical systems with Gillespie algorithm. Bioinformatics 18:470–481

  • Kurtz TG (1972) The relationship between stochastic and deterministic models for chemical reactions. J Chem Phys 57:2976–2978

  • Liekens AML, Fernando CT (2006) Turing complete catalytic particle computers. In: Proceedings of the Unconventional Computing Conference, York

  • Lewin B (1999) Genes VII. Oxford University Press

  • Macdonald J, Li Y, Sutovic M, Lederman H, Pendri K, Lu W, Andrews BL, Stefanovic D, Stojanovic MN (2006) Medium scale integration of molecular logic gates in an automaton. Nano Lett 6:2598–2603

  • Magnasco MO (1997) Chemical kinetics is Turing universal. Phys Rev Lett 78:1190–1193

  • McAdams HH, Arkin AP (1997) Stochastic mechanisms in gene expression. Proc Natl Acad Sci 94:814–819

  • McQuarrie DA (1967) Stochastic approach to chemical kinetics. J Appl Probab 4:413–478

  • Minsky ML (1961) Recursive unsolvability of Post’s Problem of ‘tag’ and other topics in theory of Turing machines. Annals of Math 74:437–455

  • Neary T, Woods D (2005) A small fast universal Turing machine. Technical Report NUIM-CS-2005-TR-12, Department of Computer Science, NUI Maynooth

  • Paun G, Rozenberg G (2002) A guide to membrane computing. Theor Comput Sci 287:73–100

  • Rothemund PWK (1996) A DNA and restriction enzyme implementation of Turing machines. In: Proceedings of DNA Computers, pp 75–120

  • Rothemund PWK, Papadakis N, Winfree E (2004) Algorithmic self-assembly of DNA Sierpinski triangles. PLoS Biol 2:e424

  • Seelig G, Soloveichik D, Zhang DY, Winfree E (2006) Enzyme-free nucleic acid logic circuits. Science 314:1585–1588

  • de Silva AP, McClenaghan ND (2004) Molecular-scale logic gates. Chem Eur J 10(3):574–586

  • Sipser M (1997) Introduction to the theory of computation. PWS Publishing

  • Sprinzak D, Elowitz MB (2005) Reconstruction of genetic circuits. Nature 438:443–448

  • Stojanovic MN, Mitchell TE, Stefanovic D (2002) Deoxyribozyme-based logic gates. J Am Chem Soc 124:3555–3561

  • Suel GM, Garcia-Ojalvo J, Liberman LM, Elowitz MB (2006) An excitable gene regulatory circuit induces transient cellular differentiation. Nature 440:545–550

  • van Kampen NG (1997) Stochastic processes in physics and chemistry, revised edn. Elsevier

  • Vasudeva K, Bhalla US (2004) Adaptive stochastic-deterministic chemical kinetic simulations. Bioinformatics 20:78–84


Acknowledgments

We thank G. Zavattaro for pointing out an error in an earlier version of this manuscript. This work is supported in part by the “Alpha Project” at the Center for Genomic Experimentation and Computation, an NIH Center of Excellence (Grant No. P50 HG02370), as well as NSF Grant No. 0523761 and NIMH Training Grant MH19138-15.

Author information


Correspondence to David Soloveichik.

A. Proof details

A.1 Clock analysis

The following three lemmas refer to the Markov chain in Fig. 3. We use \(p_i(t)\) to indicate the probability of being in state i at time t. CDF stands for cumulative distribution function.

Fig. 3 Continuous-time Markov chain for Lemmas 5–7. States i = 1,…,l indicate the identity of the currently present clock species \(C_1,\ldots,C_l\). Transition to state 0 represents reaction \(dec_2\) for the RM simulation or the state transition initiation reaction of the CTM simulation

Lemma 5

Suppose the process starts in state l. Then \(\forall t,\,p_1(t) \leq (1-p_0(t))\mu\) where \(\mu=1/(1+\frac{r}{f}+(\frac{r}{f})^2+\cdots+(\frac{r}{f})^{l-1}).\)

Proof

Consider the Markov chain restricted to states 1,…,l. We can prove that the invariant \(p_{i+1}(t)/p_i(t) \geq r/f\) (for i = 1,…,l−1) is maintained at all times through the following argument. Letting \(\phi_i(t) = p_{i+1}(t)/p_i(t)\), we can show that \(d\phi_i(t)/dt \geq 0\) when \(\phi_i(t) = r/f\) and \(\forall i^{\prime}, \phi_{i^{\prime}}(t) \geq r/f,\) which implies that for no i can \(\phi_i(t)\) fall below r/f if it starts above. This is done by showing that \(dp_i(t)/dt = p_{i+1}(t)\,f + p_{i-1}(t)\,r - (r+f)\,p_i(t) \leq 0\) since \(\phi_i(t) = r/f\) and \(\phi_{i-1}(t) \geq r/f\), and \(dp_{i+1}(t)/dt = p_{i+2}(t)\,f + p_i(t)\,r - (r+f)\,p_{i+1}(t) \geq 0\) since \(\phi_i(t) = r/f\) and \(\phi_{i+1}(t) \geq r/f\) (the \(p_{i-1}\) or the \(p_{i+2}\) terms are zero for the boundary cases).

Now \(p_i(t) = \phi_{i-1}(t)\,\phi_{i-2}(t)\cdots\phi_1(t)\,p_1(t).\) Thus \(\sum_i p_i(t)=1\) implies \(p_1(t)=1/(1+\phi_1+\phi_2 \phi_1+\cdots+\phi_{l-1}\phi_{l-2}\cdots \phi_1)\leq 1/(1+\frac{r}{f}+(\frac{r}{f})^2+\cdots+(\frac{r}{f})^{l-1}).\) This is a bound on the probability of being in state 1 given that we haven’t reached state 0 in the full chain of Fig. 3. Thus multiplying by \(1-p_0(t)\) gives us the desired result. \(\square\)

Lemma 6

Suppose for some μ we have \(\forall t, \,p_1(t) \leq (1-p_0(t))\mu.\) Let T be a random variable describing the time until absorption at state 0. Then \(\Pr[T < t]\leq 1-e^{-\lambda t}\) for λ = f μ (i.e. our CDF is bounded by the CDF for an exponential random variable with rate λ = f μ).

Proof

The result follows from the fact that \(dp_0(t)/dt = p_1(t)\,f \leq (1-p_0(t))\,\mu f.\) \(\square\)

Lemma 7

Starting at state l, the expected time to absorb at state 0 is \(O((\frac{r}{f})^{l-1}/f)\) assuming sufficiently large r/f.

Proof

The expected number of transitions to reach state 0 starting in state i is \(d_i = \frac{2 p q \left(\left(q/p\right)^l-\left(q/p\right)^{l-i}\right)}{(1-2 p)^2}-\frac{i}{q-p},\) where \(p = \frac{f}{f+r}\) is the probability of transitioning to the state to the left and q = 1−p is the probability of transitioning to the state to the right. This expression is obtained by solving the recurrence relation \(d_i = p\,d_{i-1} + q\,d_{i+1} + 1\) (0 < i < l) with boundary conditions \(d_0 = 0\), \(d_l = d_{l-1} + 1\). Thus \(d_l < \frac{2 p q \left(q/p\right)^{l}}{(1-2 p)^2} = \frac{2 (r/f)^{l+1}}{(r/f - 1)^2}.\) This implies that for r/f larger than some constant, \(d_l = O((\frac{r}{f})^{l-1}).\) Since the expected duration of any transition is no more than 1/f, the desired bound is obtained. \(\square\)
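
The closed form can be checked numerically. The following sketch is ours, not from the paper (function names and test parameters are arbitrary); it computes \(d_l\) from the recurrence by back-substituting the increments \(u_i = d_i - d_{i-1}\), which satisfy \(p\,u_i = q\,u_{i+1} + 1\) with \(u_l = 1\), and compares the result with the closed form above.

```python
# Numerical check (ours, not from the paper) of the closed form for d_l.

def d_l_recurrence(l, f, r):
    """Expected number of transitions to absorb at state 0 starting from l,
    computed from d_i = p*d_{i-1} + q*d_{i+1} + 1 via the increments
    u_i = d_i - d_{i-1}, which satisfy p*u_i = q*u_{i+1} + 1 with u_l = 1."""
    p = f / (f + r)          # probability of stepping left (toward 0)
    q = 1.0 - p              # probability of stepping right
    u, total = 1.0, 1.0      # u_l = 1 (boundary condition d_l = d_{l-1} + 1)
    for _ in range(l - 1):   # back-substitute u_{l-1}, ..., u_1
        u = (q * u + 1.0) / p
        total += u
    return total             # d_l = u_1 + ... + u_l, since d_0 = 0

def d_l_closed_form(l, f, r):
    """Closed form from the proof, evaluated at i = l."""
    p = f / (f + r)
    q = 1.0 - p
    return 2 * p * q * ((q / p) ** l - 1) / (1 - 2 * p) ** 2 - l / (q - p)

if __name__ == "__main__":
    for l, f, r in [(3, 1.0, 10.0), (5, 1.0, 20.0)]:
        print(l, f, r, d_l_recurrence(l, f, r), d_l_closed_form(l, f, r))
```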

By the above lemmas, the time for the clock to “tick” can be effectively thought of as an exponential random variable with rate \(\lambda = f/(1+\frac{r}{f} + (\frac{r}{f})^2 + \cdots + (\frac{r}{f})^{l-1}) = \Uptheta(\frac{f}{(r/f)^{l-1}}).\) Lemma 6 shows that the CDF of the tick is bounded by the CDF of this exponential random variable. Further, Lemma 7 shows that the expected time for the tick is bounded by (the order of) the expected time of this exponential random variable. Note that Lemma 6 is true no matter how long the clock has already been running (a “memoryless” property). For our clock construction (Fig. 1b), we set λ by changing #A and #A*, which define the forward and reverse rates f and r. Specifically, we have \(\lambda=\Uptheta \left(\frac{k {\#A^{\ast}}^{l}}{v {\#A}^{l-1}}\right).\)
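
For intuition, the clock chain of Fig. 3 is also easy to simulate directly. The Monte Carlo sketch below is ours (the parameters l, f, r and the sample size are arbitrary); by Lemmas 6 and 7 the empirical mean tick time should agree with 1/λ up to a constant factor.

```python
import random

def clock_tick_time(l, f, r, rng):
    """One sample of the absorption ("tick") time of the chain in Fig. 3.
    From state i the chain moves toward 0 at rate f and, for i < l, back
    toward l at rate r; state 0 is absorbing."""
    state, t = l, 0.0
    while state > 0:
        left = f
        right = r if state < l else 0.0
        total = left + right
        t += rng.expovariate(total)                       # holding time
        state += -1 if rng.random() < left / total else 1  # choose direction
    return t

if __name__ == "__main__":
    rng = random.Random(0)
    l, f, r, n = 3, 1.0, 10.0, 20000
    mean_t = sum(clock_tick_time(l, f, r, rng) for _ in range(n)) / n
    lam = f / sum((r / f) ** j for j in range(l))          # lambda = f * mu
    print("empirical mean tick time:", mean_t)   # same order as 1/lambda
    print("1/lambda:               ", 1 / lam)
```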

A.2 Time/space-bounded RM simulation

Lemma 8

For the finite RM simulation, the probability of error per step is \(O((1/\#A)^{l-1})\). Further, the expected time per step is bounded by \(O(\#A^{l-1}\, v/k)\).

Proof

Consider the point in time when the RM simulation enters a state in which it should decrement a non-empty register. If the time until \(dec_2\) occurs were an exponential random variable with rate λ then the probability of error per step would be bounded by λ/(k/v + λ). (We are making the worst case assumption that there is exactly one register molecule; otherwise, the error is even smaller.) The time until \(dec_2\) is not exponentially distributed, but by Sect. A.1, it can be bounded by an exponential random variable with rate \(\lambda=O(\frac{k}{v\,{\#A}^{l-1}})\) (#A* = 1 for the RM construction). Note that the clock may have been running for a while since the last dec operation (while the RM performs inc operations for example); however, this amount of time is irrelevant by the memoryless property established in Sect. A.1. Thus the probability of error per step is bounded by \(\lambda/(k/v + \lambda) = O((1/\#A)^{l-1})\). The expected time per RM step is bounded by the expected time for \(dec_2\), which is \(O(\#A^{l-1}\, v/k)\) by Sect. A.1. \(\square\)

The above lemma implies that we can use \({\#A}=\Uptheta((t/\delta)^{1/(l-1)})\), resulting in an expected time for the whole computation of \(O(\frac{vt^2}{k \delta})\) and a total probability of error bounded by δ.
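
As a concrete illustration of this parameter choice (with every constant hidden in the O(·)/Θ(·) bounds set to 1, which the paper does not fix), one can compute the required #A for a given step count t and error budget δ:

```python
import math

# Illustrative only: constants hidden in the O()/Theta() bounds are set to 1.
def rm_clock_size(t_steps, delta, l):
    """#A such that t_steps * (1/#A)**(l-1) <= delta (cf. Lemma 8)."""
    num_A = math.ceil((t_steps / delta) ** (1.0 / (l - 1)))
    per_step_error = (1.0 / num_A) ** (l - 1)
    return num_A, per_step_error, t_steps * per_step_error

# e.g. a million RM steps with total error budget 1%, clock length l = 3:
print(rm_clock_size(t_steps=10**6, delta=0.01, l=3))
# -> (10000, 1e-08, 0.01): ten thousand A's suffice, up to constants
```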

A.3 Time/space-bounded CTM simulation

In the following lemmas, we say a reaction completely finishes when it happens enough times that one of the reactants is used up.

Lemma 9

Starting with \(\Uptheta(m)\) molecules of X and \(\Uptheta(m)\) molecules of Y, the expected time for the reaction \(X + Y \rightarrow Y\) to completely finish is \(O(\frac{v}{k m} \hbox{log}\,{m}).\) The variance of the completion time is \(O((\frac{v}{k m})^2).\)

Proof

When there are q molecules of X remaining, the waiting time until the next reaction is an exponential random variable with rate \(\Uptheta(kqm/v)\) and therefore mean \(O(\frac{v}{kqm}).\) Each waiting time is independent. Thus the total expected time is \(\sum_{q=1}^{\Uptheta(m)}O(\frac{v}{kqm}) = O(\frac{v}{km}\hbox{log}\,{m})\) (footnote 8). The variance of each waiting time is \(O((\frac{v}{kqm})^2).\) Thus the total variance is \(\sum_{q=1}^{\Uptheta(m)}O((\frac{v}{kqm})^2) = O((\frac{v}{k m})^2).\) \(\square\)
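
Since Y is unchanged by \(X + Y \rightarrow Y\), the completion time is just a sum of independent exponential waiting times, which makes the \(O(\frac{v}{km}\hbox{log}\,{m})\) scaling easy to check by simulation. A minimal sketch (ours; it assumes the bimolecular propensity \(k\,\#X\,\#Y/v\) used in the proof):

```python
import math
import random

def completion_time(x0, y0, k, v, rng):
    """Time for X + Y -> Y to use up all X, starting from x0 X's and y0 Y's.
    With q copies of X left the propensity is k*q*y0/v (Y is unchanged),
    so the waiting times are independent exponentials."""
    return sum(rng.expovariate(k * q * y0 / v) for q in range(x0, 0, -1))

if __name__ == "__main__":
    rng = random.Random(1)
    k, v, runs = 1.0, 1.0, 200
    for m in (100, 1000, 10000):
        mean_t = sum(completion_time(m, m, k, v, rng) for _ in range(runs)) / runs
        print(m, round(mean_t, 6), "vs (v/km)*log m =",
              round(v / (k * m) * math.log(m), 6))
```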

Lemma 10

Starting with \(\Uptheta(m)\) molecules of X and \(\Uptheta(m)\) molecules of Y such that \({\Updelta}={\#Y}-{\#X}=\Upomega(m)\), the expected time for the reaction \(X + Y \rightarrow \emptyset\) to completely finish is \(O(\frac{v}{k m} \hbox{log}\,{m}).\) The variance of the completion time is \(O((\frac{v}{k m})^2).\)

Proof

This case can be proven by reducing to Lemma 9 with initial amounts \(\#Y^{\prime} =\Updelta\) and #X′ = #X. \(\square\)

Lemma 11

Starting with \(\Uptheta(m)\) molecules of X and 1 molecule of Y, the expected time for the reaction \(X + Y \rightarrow 2Y\) to completely finish is \(O(\frac{v}{k m} \hbox{log}\,{m}).\) The variance of the completion time is \(O((\frac{v}{k m})^2).\)

Proof

Consider splitting the process into two halves, with the first part bringing the amount of X to half its initial value and the second part using up the remainder. The time-reverse of the first part, as well as the second part, can both be bounded by processes covered by Lemma 9. (Assume that #X is fixed at its minimal value for part one, and assume #Y is fixed at its minimal value for part two. The variance can only decrease.)\(\square\)

Lemma 12

Some \(\lambda = \Uptheta(\frac{k \varepsilon^{3/2} 3^{s_{ct}}}{v s_{ct}})\) attains error at most ɛ per microstep of the CTM simulation.

Proof

Using the above lemmas with \(m = 3^{s_{ct}-1},\) by Chebyshev’s inequality (footnote 9), with probability at least 1−ɛ/2 all reactions finish before some time \(t_{f} = \Uptheta(\frac{v}{km} (\hbox{log}(m) + 1/\sqrt{\varepsilon})) = O\left(\frac{v\,{\hbox{log}}\,{m}}{k m \varepsilon^{1/2}}\right).\) Now we set λ such that the probability that the clock ticks before time \(t_f\) is smaller than ɛ/2 (for a total probability of error ɛ). Since the time until the clock ticks is bounded by the CDF of an exponential random variable with rate λ (Sect. A.1), it is enough that \(\lambda < \frac{\varepsilon}{2 t_f}\), and so we can choose some \(\lambda = \Uptheta \left(\frac{\varepsilon^{3/2} k m}{v\,{\hbox{log}}\,{m}}\right).\) \(\square\)

Lemma 13

Any TM with a two-way infinite tape using at most \(s_{tm}\) space and \(t_{tm}\) time can be converted to a CTM using \(s_{ct} = 2s_{tm}\) space and \(t_{ct} = O(t_{tm} s_{tm})\) time. If \(\Updelta\) extra bits of padding on the CTM tape are used, then \(t_{ct} = O(t_{tm}(s_{tm}+\Updelta))\) time is required.

Proof

(sketch, see Neary and Woods 2005) Two bits of the CTM tape are used to represent each bit of the TM tape. The extra bit is used to store a TM head position marker. Moving in the direction corresponding to moving the CTM head clockwise (the easy direction) is trivial. To move in the opposite direction, we use the temporary marker to record the current head position and then move each tape symbol clockwise by one position. Thus, a single TM operation in the worst case corresponds to \(O(s_{ct})\) CTM operations. \(\square\)

Simulating \(t_{tm}\) steps of a TM that uses \(s_{tm}\) bits of space on a CTM with \(\Updelta\) bits of padding requires \(t_{ct} = O(t_{tm} (s_{tm}+\Updelta))\) CTM steps and a circular tape of size \(s_{ct} = 2s_{tm}+\Updelta\) (Lemma 13). Recall that in our CTM simulation there are four microsteps per CTM operation, which is asymptotically still \(O(t_{ct})\). Thus, in order for the total error to be at most δ, we need the error per CTM microstep to be \(\varepsilon = O(\frac{\delta}{t_{tm} (s_{tm}+\Updelta)}).\) Setting the parameters of the clock module (#A, #A*) to attain the largest λ satisfying Lemma 12, the expected time per microstep is \(O\left(\frac{v s_{ct}}{k 3^{s_{ct}} \varepsilon^{3/2}}\right) = O\left(\frac{v (s_{tm}+\Updelta)^{5/2} t_{tm}^{3/2}}{k 3^{2s_{tm}+\Updelta} \delta^{3/2}}\right).\) This can be done, for example, by setting \({\#A^{\ast}}^l = \Uptheta \left(\frac{3^{s_{ct}}}{s_{ct}}\right)\) and \({\#A}^{l-1} = \Uptheta \left(\frac{1}{\varepsilon^{3/2}}\right).\) Since there are \(O(t_{tm}(s_{tm}+\Updelta))\) CTM microsteps in total, the total expected time is \(O\left(\frac{v (s_{tm}+\Updelta)^{7/2} t_{tm}^{5/2}}{k 3^{(2s_{tm}+\Updelta)} \delta^{3/2}}\right).\)
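
To make this bookkeeping concrete, the sketch below (ours; every hidden constant is set to 1, so only the scaling is meaningful, and the parameter values in the example are arbitrary) computes ε, the clock settings, and the resulting time bound from \(t_{tm}\), \(s_{tm}\), \(\Updelta\) and δ:

```python
# Illustrative bookkeeping for the bounded CTM simulation; all O()/Theta()
# constants are set to 1, so only the scaling is meaningful.
def ctm_parameters(t_tm, s_tm, pad, delta, l, k=1.0, v=1.0):
    s_ct = 2 * s_tm + pad                       # circular tape size (Lemma 13)
    microsteps = 4 * t_tm * (s_tm + pad)        # four microsteps per CTM step
    eps = delta / microsteps                    # error budget per microstep
    num_A_star = (3 ** s_ct / s_ct) ** (1 / l)      # #A*^l = 3^s_ct / s_ct
    num_A = (1 / eps ** 1.5) ** (1 / (l - 1))       # #A^(l-1) = 1/eps^(3/2)
    t_microstep = v * s_ct / (k * 3 ** s_ct * eps ** 1.5)  # ~ 1/lambda
    return {"s_ct": s_ct, "eps": eps, "#A": num_A, "#A*": num_A_star,
            "expected total time": microsteps * t_microstep}

print(ctm_parameters(t_tm=10**4, s_tm=20, pad=0, delta=0.01, l=3))
```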

How large is the total molecular count? If we keep δ constant while increasing the complexity of the computation being performed, and set #A* and #A as suggested above, the total molecular count is \(\Uptheta(m+\#A)\), where \(m = 3^{2 s_{tm}+\Updelta}.\) Now m increases at least exponentially with \(s_{tm}+\Updelta,\) while #A increases at most polynomially. Further, m increases at least quadratically with \(t_{tm}\) (for any reasonable algorithm \(2^{s_{tm}} \geq t_{tm}\)), while #A increases at most as a polynomial of degree \((3/2)\frac{1}{l-1} < 2.\) Thus m will dominate.

A.4 Unbounded RM simulation

After i \(dec_2\) steps, we have \(\#A = i_0 + i\), where \(i_0\) is the initial number of A’s. The error probability for the next step is \(O(1/\#A^2) = O(1/(i_0+i)^2)\) by Lemma 8 when l = 3. The total probability of error over an unbounded number of steps is \(O(\sum_{i=0}^\infty 1/(i_0+i)^2).\) To make sure this is smaller than δ, we start out with \(i_0 =\Uptheta(1/\delta)\) molecules of A (footnote 10).

Now what is the total expected time for t steps? By Lemma 8, the expected time for the next step after i \(dec_2\) steps is \(O(\#A^2 v/k)=O((i_0+i)^2 v/k).\) Since each step increases the total molecular count by at most 1, after t total steps v is not larger than \(O(i_0 + t + s_0)\), where \(s_0\) is the sum of the initial values of all the registers. Thus the expected time for the ith step (i ≤ t) is bounded by \(O((i_0+i)^2 (i_0+t+s_0)/k) = O((1/\delta+t)^2 (1/\delta+t+ s_0)/k)\), and so the expected total time for t steps is \(O(t (1/\delta+t)^2 (1/\delta+t+s_0)/k).\)
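
A quick numeric check of the error-budget part of this argument (ours; constants hidden in the O(·) bounds are ignored, and footnote 10 gives the actual inequality):

```python
# Check that i0 = 1/delta + 2 keeps the series sum_{i>=0} 1/(i0+i)^2 below
# delta (cf. footnote 10); constants from the O() bound are ignored.
def error_series(i0, terms=10**6):
    return sum(1.0 / (i0 + i) ** 2 for i in range(terms))

delta = 0.01
i0 = int(1 / delta) + 2
print(error_series(i0), "<", delta)   # ~0.00985 < 0.01
```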

A.5 Unbounded CTM simulation

We want to follow a similar strategy as in the RM simulation (Sect. A.4) and want the error probability on the ith CTM step to be bounded by \({\varepsilon}=1/(\Uptheta(1/\delta)+i)^2\) such that the total error probability after arbitrarily many steps is bounded by δ. By Lemma 12, we can attain per step error probability (taking the union bound over the 4 microsteps in a step) bounded by this ɛ when we choose a small enough \(\lambda = \Uptheta(\frac{k \varepsilon^{3/2} 3^{s_{ct}}}{v s_{ct}}) = \Uptheta(\frac{k 3^{s_{ct}}}{v (1/\delta + i)^3 s_{ct}}),\) where \(s_{ct}\) is the current CTM tape size. Recall that λ is set by #A and #A* such that \(\lambda = \Uptheta(\frac{k {\#A^{\ast}}^l}{v \#A^{l-1}})\) (Sect. A.1). It is not hard to see that we can achieve the desired λ using clock Markov chain length l = 5, and appropriate \(\#A = \Uptheta((i_0+i) 3^{s_{ct}})\) and \(\#A^{\ast} = \Uptheta(3^{s_{ct}}),\) for appropriate \(i_0 = \Uptheta(1/\delta + {s_{ct}}_0),\) where \({s_{ct}}_0\) is the initial size of the tape. These values of #A and #A* can be attained if the SCRN triples the amount of A and A* whenever extending the tape and increases #A by an appropriate amount \(\Uptheta(3^{s_{ct}})\) on every step.

How fast is the simulation with these parameters? From Sect. A.1 we know that the expected time per microstep is \(O(1/\lambda) = O(\frac{v (1/\delta+{s_{ct}}_0+i)^4}{k 3^{s_{ct}}}).\) Since the total molecular count is asymptotically \(O(\#A) = O((1/\delta + {s_{ct}}_0+i) 3^{s_{ct}}),\) this expected time is \(O((1/\delta + {s_{ct}}_0+ i)^5/k).\) However, unlike in the bounded time/space simulations and the unbounded RM simulation, this expected time is conditional on all the previous microsteps being correct, because if a microstep is incorrect, A and A* may increase by an incorrect amount (for example, reactions tripling #A akin to \(A \rightarrow A^\dagger\) and \(A^\dagger \rightarrow 3A\) can drive A arbitrarily high if the catalyst state species for both reactions are erroneously present simultaneously). Nonetheless, the expected duration of a microstep conditional on the entire simulation being correct is at most a factor of 1/(1−δ) larger than this (footnote 11). Since we can assume δ will always be bounded above by a constant less than one, the expected duration of a microstep conditional on the entire simulation being correct is still \(O((1/{\delta}+{s_{ct}}_0+i)^5/k).\) By Lemma 13, the total expected time to simulate \(t_{tm}\) steps of a TM using at most \(s_{tm}\) space, with initial input of size \({s_{tm}}_0\), is then \(O((1/\delta + {s_{tm}}_0 + t_{tm} s_{tm})^5 t_{tm} s_{tm}/k),\) assuming the entire simulation is correct.

A.6 Decidability of reachability

We reduce the reachability question in SCRNs to the reachability question in Vector Addition Systems (VAS), a model of asynchronous parallel processes developed by Karp and Miller (1969). In the VAS model, we consider walks through the p-dimensional integer lattice \({\mathbb{N}}^p\): each step must be one of a finite set of vectors in \({\mathbb{Z}}^p,\) and each point in the walk must have no negative coordinates. It is known that the following reachability question is decidable: given points \({\mathbf x}\) and \({\mathbf y}\), is there a walk from \({\mathbf x}\) that reaches some point \({\mathbf y}^{\prime} \geq {\mathbf y}\) (Karp and Miller 1969)? The correspondence between VASs and SCRNs is straightforward (Cook 2005). First consider chemical reactions in which no species occurs both as a reactant and as a product (i.e. reactions that have no catalysts). When such a reaction \(\alpha=\langle {\mathbf l},{\mathbf r},k \rangle\) occurs, the state of the SCRN changes by addition of the vector \(-{\mathbf l}+{\mathbf r}.\) Thus the trajectory of states is a walk through \({\mathbb{N}}^p\) wherein each step is any of a finite number of reactions, subject to the constraint requiring that the number of molecules of each species remain non-negative. Karp and Miller’s decidability results for VASs then directly imply that our reachability question of whether we ever enter a state greater than or equal to some target state is decidable for catalyst-free SCRNs. The restriction to catalyst-free reactions is easily lifted: each catalytic reaction can be replaced by two new reactions involving a new molecular species, after which all reachability questions (not involving the new species) are identical for the catalyst-free and the catalyst-containing networks.
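
To illustrate the last step, the sketch below (ours; the multiset representation of reactions and the intermediate species names D0, D1, … are just for the example) replaces each catalytic reaction with the two catalyst-free reactions described above:

```python
from collections import Counter

def remove_catalysts(reactions):
    """Replace each reaction that shares a species between its reactant and
    product multisets (i.e. has a catalyst) with two catalyst-free reactions
    going through a fresh intermediate species, as described in the text."""
    out, fresh = [], 0
    for reactants, products in reactions:
        if reactants & products:                 # some species is a catalyst
            inter = Counter({"D%d" % fresh: 1})  # new molecular species
            fresh += 1
            out.append((reactants, inter))       # reactants -> intermediate
            out.append((inter, products))        # intermediate -> products
        else:
            out.append((reactants, products))
    return out

# Example: the catalytic reaction C + X -> C + Y becomes C + X -> D0, D0 -> C + Y.
rxns = [(Counter({"C": 1, "X": 1}), Counter({"C": 1, "Y": 1}))]
for react, prod in remove_catalysts(rxns):
    print(dict(react), "->", dict(prod))
```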


Cite this article

Soloveichik, D., Cook, M., Winfree, E. et al. Computation with finite stochastic chemical reaction networks. Nat Comput 7, 615–633 (2008). https://doi.org/10.1007/s11047-008-9067-y
